Tuesday, 10 April 2018

AWK Programming Tutorial - Awk built-in variables FS, OFS, RS, ORS, NF, NR

Awk built-in variables: This is the fourth article in this tutorial series on awk, and in this one we will be learning about awk's built-in variables. In case you have missed any of the previous articles, you can find them here.


Awk comes with a number of built-in variables. Some of these have a default value that can be changed, e.g. FS ( field separator, with a default value of whitespace ) and RS ( record separator, with a default value of \n ), while others are quite useful when doing analysis or creating reports, e.g. NF ( number of fields ) and NR ( number of records ). Let's take a look at them one by one.

FS (Field Separator) and OFS (Output Field Separator)

  • With FS, we tell awk which character separates the fields in a particular input file.
  • The default value of this variable is whitespace, telling awk that fields are separated by one or more whitespace characters (including tabs).
  • This default can be overridden with a character or a regular expression. For example, we can use a colon ( : ) as the separator while working on the /etc/passwd file.
  • With OFS, we ask awk to use a particular character to separate the fields in the output.
  • For this variable too, the default value is a single space.
Let's take a look at an example now. For this, we will use a demo CSV file ( input.csv ) with the contents shown below:

1,"Eldon Base for stackable storage shelf, platinum",Muhammed MacIntyre,3,-213.25,38.94,35,Nunavut,Storage & Organization,0.8
2,"1.7 Cubic Foot Compact ""Cube"" Office Refrigerators",Barry French,293,457.81,208.16,68.02,Nunavut,Appliances,0.58
3,"Cardinal Slant-D® Ring Binder, Heavy Gauge Vinyl",Barry French,293,46.71,8.69,2.99,Nunavut,Binders and Binder Accessories,0.39
4,R380,Clay Rozendal,483,1198.97,195.99,3.99,Nunavut,Telephones and Communication,0.58
5,Holmes HEPA Air Purifier,Carlos Soltero,515,30.94,21.78,5.94,Nunavut,Appliances,0.5
6,G.E. Longer-Life Indoor Recessed Floodlight Bulbs,Carlos Soltero,515,4.43,6.64,4.95,Nunavut,Office Furnishings,0.37
7,"Angle-D Binders with Locking Rings, Label Holders",Carl Jackson,613,-54.04,7.3,7.72,Nunavut,Binders and Binder Accessories,0.38
8,"SAFCO Mobile Desk Side File, Wire Frame",Carl Jackson,613,127.70,42.76,6.22,Nunavut,Storage & Organization,
9,"SAFCO Commercial Wire Shelving, Black",Monica Federle,643,-695.26,138.14,35,Nunavut,Storage & Organization,
10,Xerox 198,Dorothy Badders,678,-226.36,4.98,8.33,Nunavut,Paper,0.38

As mentioned, the default value of FS is whitespace. Let's try extracting the 1st and 3rd columns without changing the default value of FS.

$ awk '{ print $1, $3 }' input.csv 
1,"Eldon for
2,"1.7 Foot
3,"Cardinal Ring
4,R380,Clay and
5,Holmes Air
6,G.E. Indoor
7,"Angle-D with
8,"SAFCO Desk
9,"SAFCO Wire
10,Xerox Badders,678,-226.36,4.98,8.33,Nunavut,Paper,0.38

And now, using a comma ( , ) as the field separator:

$ awk 'BEGIN { FS = ","; } { print $1, $3 }' input.csv
1  platinum"
2 Barry French
3  Heavy Gauge Vinyl"
4 Clay Rozendal
5 Carlos Soltero
6 Carlos Soltero
7  Label Holders"
8  Wire Frame"
9  Black"
10 Dorothy Badders

As we can see in the above output, awk uses the default value of OFS, which is a single space. We can override this value with, say, a pipe ( | ), as shown in the example below:

$ awk 'BEGIN { FS = ","; OFS = "|" } { print $1, $3 }' input.csv
1| platinum"
2|Barry French
3| Heavy Gauge Vinyl"
4|Clay Rozendal
5|Carlos Soltero
6|Carlos Soltero
7| Label Holders"
8| Wire Frame"
9| Black"
10|Dorothy Badders
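
The same result can be obtained without a BEGIN block by passing the separators on the command line, using the standard -F and -v options (an equivalent one-liner):

$ awk -F ',' -v OFS='|' '{ print $1, $3 }' input.csv

The output is identical to the one above.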

RS (Record Separator) and ORS (Output Record Separator)

  • RS and ORS are useful when dealing with multi-line records, where each field is on its own line.
  • The default value of both these variables is a newline character ( \n ).
  • By overriding the value of ORS, we can tell awk to separate output records with some character other than the newline.
Let's take a look at our demo file, in which each record is separated by two newlines ( \n\n ) and each field within a record is separated by a single newline character ( \n ).

$ cat address.txt 
Cecilia Chapman
711-2880 Nulla St.
Mankato Mississippi 96522
(257) 563-7401

Iris Watson
P.O. Box 283 8562 Fusce Rd.
Frederick Nebraska 20620
(372) 587-2335

Celeste Slater
606-3727 Ullamcorper. Street
Roseville NH 11523
(786) 713-8616

Theodore Lowe
Ap #867-859 Sit Rd.
Azusa New York 39531
(793) 151-6230

Now, to display each person's name ( $1 ) and phone number ( $4 ) on a separate line ( ORS stays \n, while RS is set to \n\n ), we can use the command below:

$ awk ' BEGIN { FS = "\n"; RS = "\n\n"; ORS = "\n" } { print $1, $4 } ' address.txt 
Cecilia Chapman (257) 563-7401
Iris Watson (372) 587-2335
Celeste Slater (786) 713-8616
Theodore Lowe (793) 151-6230
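
To see ORS ( and OFS ) actually take effect, we can separate the two output fields with a comma and put a blank line between output records; a small variation on the command above, using the same address.txt:

$ awk ' BEGIN { FS = "\n"; RS = "\n\n"; OFS = ", "; ORS = "\n\n" } { print $1, $4 } ' address.txt
Cecilia Chapman, (257) 563-7401

Iris Watson, (372) 587-2335

Celeste Slater, (786) 713-8616

Theodore Lowe, (793) 151-6230

Note: a multi-character RS such as \n\n works in GNU awk; the strictly portable way to read blank-line-separated records is to set RS to the empty string ( RS = "" ).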

NF (Number of Fields) and NR (Number of Records)

  • The awk variable NF holds the number of fields in the current record ( $0 ).
  • If we increase the value of NF, awk adds additional empty fields, separated by the value of OFS.
  • When we decrease the value of NF, all fields with an index greater than the new value are discarded ( an example of this is shown a little further below ).
  • NR is the variable that stores the current record number being processed by awk.
  • There is another variable, FNR, which is useful when dealing with multiple files. It stores the position of the record relative to the current file only.
Let's take a look at the demo file below to illustrate this. Notice that each record has a different number of fields.

$ cat cities.txt 
Washington 18 23 21 19
London 10 7 13 5 -1
Moscow 2 0 -3
Mumbai 24 27

Now, we print the number of fields in each record before printing the record itself, using the command below:

$ awk '{print NF, $0}' cities.txt 
5 Washington 18 23 21 19
6 London 10 7 13 5 -1
4 Moscow 2 0 -3
3 Mumbai 24 27
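
As mentioned in the bullet points above, assigning a smaller value to NF discards the trailing fields and rebuilds the record. A small sketch using the same cities.txt ( decreasing NF works in GNU awk and most modern implementations, but very old awks may behave differently ):

$ awk '{ NF = 3; print $0 }' cities.txt
Washington 18 23
London 10 7
Moscow 2 0
Mumbai 24 27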

To illustrate the use of NR, we use the same file again. It's pretty straightforward.

$ awk '{print NR, $0}' cities.txt 
1 Washington 18 23 21 19
2 London 10 7 13 5 -1
3 Moscow 2 0 -3
4 Mumbai 24 27

In case there are multiple files, we can print the record number relative to the current input file being processed using the variable FNR.

$ awk '{print FNR, $0}' cities.txt address.txt 
1 Washington 18 23 21 19
2 London 10 7 13 5 -1
3 Moscow 2 0 -3
4 Mumbai 24 27
1 Cecilia Chapman
2 711-2880 Nulla St.
3 Mankato Mississippi 96522
4 (257) 563-7401
5 
6 Iris Watson
7 P.O. Box 283 8562 Fusce Rd.
8 Frederick Nebraska 20620
9 (372) 587-2335
10 
11 Celeste Slater
12 606-3727 Ullamcorper. Street
13 Roseville NH 11523
14 (786) 713-8616
15 
16 Theodore Lowe
17 Ap #867-859 Sit Rd.
18 Azusa New York 39531
19 (793) 151-6230

Observe the line right after line #4: awk has numbered it #1 because we used FNR. Had we used NR here, it would have been numbered #5. You can verify this yourself.
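
To see the difference between NR and FNR side by side, we can print both, together with the built-in FILENAME variable, which holds the name of the file currently being read:

$ awk '{ print FILENAME, NR, FNR, $0 }' cities.txt address.txt 
cities.txt 1 1 Washington 18 23 21 19
cities.txt 2 2 London 10 7 13 5 -1
cities.txt 3 3 Moscow 2 0 -3
cities.txt 4 4 Mumbai 24 27
address.txt 5 1 Cecilia Chapman
address.txt 6 2 711-2880 Nulla St.
...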

That's it for the scope of this article. Please share your feedback and suggestions in the comments section below and stay tuned for more articles. Thanks for reading.

Friday, 6 April 2018

How To: Install or Upgrade to Linux Kernel 4.12 in Ubuntu/Linux Mint

Linux Kernel 4.12 is now available. This kernel version comes with plenty of fixes and improvements. This article will guide you through installing or upgrading to Linux Kernel 4.12 on your Ubuntu or Linux Mint system.

Installation

For 32-Bit Systems

Download the .deb packages.

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/linux-headers-4.12.0-041200_4.12.0-041200.201707022031_all.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/linux-headers-4.12.0-041200-generic_4.12.0-041200.201707022031_i386.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/linux-image-4.12.0-041200-generic_4.12.0-041200.201707022031_i386.deb

Install them.

$ sudo dpkg -i linux-headers-4.12.0*.deb linux-image-4.12.0*.deb

Reboot the system.

$ sudo reboot

For 64-Bit Systems

Download the .deb packages.

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/linux-headers-4.12.0-041200_4.12.0-041200.201707022031_all.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/linux-headers-4.12.0-041200-generic_4.12.0-041200.201707022031_amd64.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.12/linux-image-4.12.0-041200-generic_4.12.0-041200.201707022031_amd64.deb

Install them.

$ sudo dpkg -i linux-headers-4.12.0*.deb linux-image-4.12.0*.deb

Reboot the system.

$ sudo reboot
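
Once the system is back up, you can confirm the running kernel version with the standard uname command; the output should report the newly installed 4.12 kernel ( e.g. 4.12.0-041200-generic ).

$ uname -r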

To uninstall,

$ sudo apt-get remove 'linux-headers-4.12.0*' 'linux-image-4.12.0*'

Friday, 30 March 2018

AWK Programming Tutorial - Constants, Variables and Arithmetic Operators

Constants, Variables and Operators in Awk: So far in this tutorial series on awk programming, we have learned to print output, and we have learned about fields, records and delimiters and how they are referenced. These are very basic operations, as we are just extracting data from input lines and printing it. Now we move a step ahead and manipulate the extracted fields by performing some common arithmetic operations on them. Before we proceed, I recommend going through the third article in the series - Field separator and Field references.


Constants

There are only two types of constants:
A string constant

  • A string constant is always surrounded within quotes, e.g. "Pineapple", "5_days", etc.
  • Strings can contain escape sequences like \n ( newline ), \t ( horizontal tab ), \b ( backspace ), \v ( vertical tab ), \" ( double quote ), etc.
A numeric constant

  • A numeric constant is simply a number written without quotes.
  • A number enclosed within quotes is treated as a string.
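
To see both kinds of constants in action, here is a minimal sketch that needs no input file ( the escape sequences \t and \n are interpreted inside the double quotes ):

$ awk 'BEGIN { print "Fruit:\tPineapple\nDays:\t5"; print 5 + 2 }'
Fruit:	Pineapple
Days:	5
7
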
Variables

  • A variable is an identifier that refers to a memory location storing a value.
  • We can initialize a variable by assigning a value to it using the = operator, e.g., age = 20, firstName = "Eric", etc. Here, age and firstName are variables.
  • A variable name consists of letters, digits and underscores, and it must start with a letter or an underscore.
  • Variables are case-sensitive. This means Age, age and AGE are three different variables, each holding its own value.
  • Variable initialization is optional. If we do not initialize a variable, awk defaults it to the numeric value 0 or an empty string ( "" ), depending on the context.
  • When we assign two or more strings separated by a space to a variable, the variable stores the concatenated value.
  • We can assign a field value to a variable using field reference variables $1, $2, etc.
Example:

# Assign a numeric value to variable 'myNum'
myNum = 10

# Assign a string value to variable 'myStr'
myStr = "awk!"

# Space concatenates the strings, so 'myVar' stores the value "AwesomeAwk!"
myVar = "Awesome" "Awk!"

# Assign a field value to a variable using field reference variable
marks = $1
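
The assignments above are shown in isolation; wrapped in a BEGIN block ( and leaving out the field assignment, which needs an input line ) they can be run directly as a one-liner:

$ awk 'BEGIN { myNum = 10; myStr = "awk!"; myVar = "Awesome" "Awk!"; print myNum, myStr, myVar }'
10 awk! AwesomeAwk!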

Arithmetic Operators

awk supports basic arithmetic operators for use in expressions, listed below:

Operator Description
+ Addition
- Subtraction
* Multiplication
/ Division
% Modulus
^ or ** Exponentiation

The example below shows how we can define a variable and perform arithmetic operations on it.

Example:

# Initialize the variable 'salary'
salary = 300000

# Add 25000 to variable 'salary' and store the result in another variable 'newSalary'
newSalary = salary + 25000

# Print the updated salary
print newSalary
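
These statements are not a complete awk program by themselves; placed inside a BEGIN block they can be run directly as a one-liner:

$ awk 'BEGIN { salary = 300000; newSalary = salary + 25000; print newSalary }'
325000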

Alternatively, you can directly print the sum of salary and 25000 to shrink the code further:

salary = 300000
print salary + 25000

This way, we get the same result from the print statement, but the value stored in the variable salary remains unchanged. If we want to update the variable salary with the added value, we can use the assignment operator +=, which combines two operations: addition and assignment. So the number 25000 is added to the value stored in salary, and the result of the addition is stored back in the variable salary ( salary += 25000 ).

Below is the list of assignment operators:

Operator Description
+= Add and assign
-= Subtract and assign
*= Multiply and assign
/= Divide and assign
%= Perform modulo and assign the result
^= or **= Perform exponentiation and assign the result

To demonstrate this, we can use /etc/passwd and count the number of lines in it. For this, we use a variable x and increment it as every line is read. After the last line, we print the variable, which gives us the total number of records read by awk.

# We can also use { x++ } or { x = x + 1 } instead of { x += 1 } in below command
$ awk ' { x += 1 } END { print x } ' /etc/passwd
31

We can also include a pattern here, to count only the lines that contain the string bash.

$ awk '/bash/ { x += 1 } END { print x } ' /etc/passwd
3

Another example: this time, we use a demo file which has 3 fields - the student's name, the subject name and the marks. Below is a snippet.

Student Subject Marks
James Biology 31
Velma Biology 43
Kibo Biology 81
Louis Biology 11
Phyllis Biology 18
Zenaida Biology 55
Gillian Biology 38
Constance Biology 16
Giselle Biology 73
...
...

We can calculate the average marks obtained by students in Chemistry as follows:

$ awk ' /Chemistry/ { total += $3; count += 1 } END { print total/count } ' result.txt 
52.4074

We can also consider a data set of cities and their temperatures as below:

Washington 18 23 21 19 16
London 10 7 13 5 -1
Moscow 2 0 -3 -7 1
Mumbai 24 27 29 29 28

We can find the average temperature for every city as shown below:

$ awk ' {total = $2 + $3 + $4 + $5 + $6; print $1 " : " total/5 }' cities.txt 
Washington : 19.4
London : 6.8
Moscow : -1.4
Mumbai : 27.4
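
Since the number of readings may vary from record to record, we can also let awk count the columns for us using the built-in NF variable ( covered in the article on built-in variables ) together with a for loop; a more general sketch for the same cities.txt:

$ awk '{ total = 0; for (i = 2; i <= NF; i++) total += $i; print $1 " : " total/(NF-1) }' cities.txt 
Washington : 19.4
London : 6.8
Moscow : -1.4
Mumbai : 27.4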

That's all for the scope of this article. We did not cover every operator here, as they are pretty straightforward and most of you will already have an idea about them. Let me know your views and feedback in the comments section below and stay tuned for more articles.

Advance Your Career with Linux Foundation Training

Linux is the largest open-source technology, powering computers, mobile devices and various other products and services across the world. It is the OS behind more than 95% of the top 1 million domains and the top 500 supercomputers, and it accounts for around 80% of smartphones through Android, which is based on the Linux kernel.

Developers and engineers with Linux skills are in high demand in the job market. If you have a qualification or certification in the Linux ecosystem, you can command a higher salary.
The Linux Foundation, which counts the creator of Linux among its fellows, provides several courses and certifications in different Linux technologies. You can easily apply for the course you want and increase your value as a potential employee or independent developer.

Popular Linux Foundation Training Courses

Which Linux Foundation training course should you go for? We will help you decide!

Certification

You can apply for the Linux Foundation Certified Engineer (LFCE) and Linux Foundation Certified System Administrator (LFCS) certifications, both of which are carried out online.

LFCS imparts the skills and knowledge of a sysadmin, which you need to help prove yourself to employers. LFCE gives you in-depth skills that enable you to design and implement system architecture. You can also provide guidance as a Subject Matter Expert using your newfound expertise.

Both certifications cost $499 and can be completed in 12 months.

Linux Courses

Linux Foundation offers both introductory and advanced Linux courses which can help you land a great job. The courses cover a wide range of aspects and are divided into the following categories-

  • Linux Programming & Development Training
  • Enterprise IT & Linux System Administration Training
  • Open Source Compliance Courses
Some of the courses can be completed in 1 year while others run for four days. You can even take your own time in the case of some courses, as they have no time limit.

The cost of the courses also varies according to their content: some courses are completely free, while others cost anything from $179 to $3,150.

You can also apply for e-Learning courses, which can be completed at your own pace. The cost ranges between $149 and $299 while some are totally free! Use a Linux Foundation coupon and you can receive your certifications at a reduced price!

Who can Benefit from Linux Foundation Training?

The training can be taken either by individuals looking to advance their careers or by employees of organizations that depend on Linux solutions.

Top companies like AMD, HP, Intel, and Nokia depend on Linux Foundation training to help their employees get skilled in open-source technologies. This also increases the demand for skilled Linux professionals who can successfully contribute towards the development of better products and services.

There is currently a shortage of talent when it comes to open-source professionals all over the world. If you have a certificate from the creators of Linux guaranteeing your skills, it becomes much easier to increase your demand in the job market.

You can get better and higher paying jobs without settling for anything less. Clients will also be happy to pay higher fees to a Linux freelancer with an approved qualification.

So whether you are a corporate employee or hoping one day to be a Linux pro, the courses at Linux Foundation are ideal to make you skilled and knowledgeable.

AWK Programming Tutorial - Field separator and Field references

Awk field separator and field references: This is the third article in our tutorial series on awk. In the first article, we introduced awk, and in the second one, we created a Hello World program in awk. In this article, we will be learning about separating fields and referencing them in awk.


Referencing Fields and Records

In the first article from this tutorial series, Introduction to awk, we covered the following points:
  • awk presumes that the input is structured data
  • It interprets each line from the input file(s) as a Record
  • Each line has strings/words separated (or delimited) by whitespace or some other character. These separators are referred to as delimiters.
  • Each of those strings/words separated by a delimiter is called a Field.

Let's consider a familiar example to understand records, fields and delimiters: the /etc/passwd file:

messagebus:x:107:111::/var/run/dbus:/bin/false
uuidd:x:108:112::/run/uuidd:/bin/false
sshd:x:110:65534::/var/run/sshd:/usr/sbin/nologin
foouser:x:1001:1001:,,,:/home/foouser:/bin/bash

In the above file, each line is interpreted as a record. The colon ( : ) separating the words/strings is the delimiter, and each word separated by the delimiter, i.e. foouser, 1001, /bin/bash, etc., is a field.

In awk, we reference each field using the $ operator, followed by a number or an awk variable. ( To keep things simple here, we will learn more about awk variables in later articles. ) Thus, we can reference the first field of a record using $1, the second field with $2, the third field with $3 and so on. $0 references the entire record (i.e. the input line).

Let's take a look at the following example. We have an input file result.txt with contents as below [snipped]:

Student Subject Marks
James Biology 31
Velma Biology 43
Kibo Biology 81
Louis Biology 11
Phyllis Biology 18
Zenaida Biology 55
Gillian Biology 38
Constance Biology 16
Giselle Biology 73

In the snippet above, there are 10 records and each record has 3 fields. Now we will refer to the records and fields by their respective identifiers.

# Referencing first field
$ awk '{ print $1 }' result.txt
Student
James
Velma
Kibo
...
...

# Referencing second field
$ awk '{ print $2 }' result.txt
Subject
Biology
Biology
Biology
...
...

# Referencing third field
$ awk '{ print $3 }' result.txt 
Marks
31
43
81
...
...

# Referencing all fields (in a custom order)
$ awk '{ print $3, $1, $2 }' result.txt
Marks Student Subject
31 James Biology
43 Velma Biology
81 Kibo Biology
...
...

# Referencing a record
$ awk '{ print $0 }' result.txt
Student Subject Marks
James Biology 31
Velma Biology 43
Kibo Biology 81
...
...

Field Separator

In the above examples, we have not specified any field separator or delimiter in the awk command, so it can be concluded that awk uses whitespace as the default field separator. awk allows us to set a field separator of our own choice with the -F option followed by the delimiter. Let's check this with the /etc/passwd file, which has fields delimited by a colon.

# /etc/passwd file contents (snipped)
$ cat /etc/passwd
...
messagebus:x:107:111::/var/run/dbus:/bin/false
uuidd:x:108:112::/run/uuidd:/bin/false
sshd:x:110:65534::/var/run/sshd:/usr/sbin/nologin
foouser:x:1001:1001:,,,:/home/foouser:/bin/bash
...

$ awk -F ':' '{ print $3, $1, $7 }' /etc/passwd
...
107 messagebus /bin/false
108 uuidd /bin/false
110 sshd /usr/sbin/nologin
1001 foouser /bin/bash
...

While writing an awk script, we can change the field separator by using the awk variable FS. We need to instruct awk to use the custom delimiter before it starts reading lines from the input file. Here, the BEGIN block comes in handy: it is executed before any input lines are read. Similarly, we have the END block, which is executed once all lines from the input file have been read. Both BEGIN and END blocks are optional.

So, we can write an awk script passwd.awk as:

BEGIN { FS = ":" }
{
    print $3, $1, $7
}

As covered in our first tutorial (link), we can run the instructions from this script using the -f option as below:

$ awk -f passwd.awk /etc/passwd
...
107 messagebus /bin/false
108 uuidd /bin/false
110 sshd /usr/sbin/nologin
1001 foouser /bin/bash
...

To make the output more readable, we can introduce a tab ( \t ) character between the output fields.

$ cat passwd.awk
BEGIN { FS = ":" }
{
    print $3 "\t" $1 "\t" $7
}

$ awk -f passwd.awk /etc/passwd
107	messagebus	/bin/false
108	uuidd	/bin/false
110	sshd	/usr/sbin/nologin
1001	foouser	/bin/bash
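
The same tab-separated output can also be produced with awk's printf statement, which gives finer control over the output format; an equivalent one-liner:

$ awk -F ':' '{ printf "%s\t%s\t%s\n", $3, $1, $7 }' /etc/passwd

The output is identical to the one above.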

By default, all the instructions in the script are executed on every single line of the input file. To execute these instructions on selected lines only, we can introduce pattern matching by enclosing a regular expression within slashes ( /regex/ ). This executes the instructions from the awk script only on the lines matching the regex.

To verify this, we use our result.txt file again. From the entire list of students and their marks in various subjects, we can filter only those records of students who scored exactly 50 marks, regardless of the subject. So we can use 50 as the pattern to match, as shown below:

$ awk ' /50/ {print $1"\t"$2"\t"$3} ' result.txt 
Ori	Chemistry	50
Hyatt	Mathematics	50

Or we can filter only those records of students whose names start with the string Jo. For this, we can use the regex ^Jo with the tilde ( ~ ) operator to match it against the first field ( $1 ), which is the student's name.

$ awk ' $1 ~ /^Jo/ { print $1"\t"$2"\t"$3 }' result.txt 
John	Biology	55
Jonas	Mathematics	40

Or we can negate the match using the bang, or logical NOT, operator ( ! ) as shown below (the result is too long, hence not shown):

$ awk ' $1 !~ /^Jo/ { print $1"\t"$2"\t"$3 }' result.txt

That's all for the scope of this article. Please share your feedback and suggestions in the comments section below and stay tuned for more articles on this topic.

Thursday, 29 March 2018

How To: Install or Upgrade to Linux Kernel 4.11 in Ubuntu/Linux Mint

Linux Kernel 4.11 is now available. This kernel version comes with plenty of fixes and improvements. This article will guide you through installing or upgrading to Linux Kernel 4.11 on your Ubuntu or Linux Mint system.

Installation

For 32-Bit Systems

Download the .deb packages.

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11/linux-headers-4.11.0-041100_4.11.0-041100.201705041534_all.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11/linux-headers-4.11.0-041100-generic_4.11.0-041100.201705041534_i386.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11/linux-image-4.11.0-041100-generic_4.11.0-041100.201705041534_i386.deb

Install them.

$ sudo dpkg -i linux-headers-4.11.0*.deb linux-image-4.11.0*.deb

Reboot the system.

$ sudo reboot

For 64-Bit Systems

Download the .deb packages.

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11/linux-headers-4.11.0-041100_4.11.0-041100.201705041534_all.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11/linux-headers-4.11.0-041100-generic_4.11.0-041100.201705041534_amd64.deb

$ wget http://kernel.ubuntu.com/~kernel-ppa/mainline/v4.11/linux-image-4.11.0-041100-generic_4.11.0-041100.201705041534_amd64.deb

Install them.

$ sudo dpkg -i linux-headers-4.11.0*.deb linux-image-4.11.0*.deb

Reboot the system.

$ sudo reboot
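
Once the system is back up, you can confirm the running kernel version with the standard uname command; the output should report the newly installed 4.11 kernel ( e.g. 4.11.0-041100-generic ).

$ uname -r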

To uninstall,

$ sudo apt-get remove 'linux-headers-4.11.0*' 'linux-image-4.11.0*'