Friday, 30 March 2018

AWK Programming Tutorial - Constants, Variables and Arithmetic Operators

Constants, Variables and Operators in Awk: So far in this tutorial series on awk programming, we have learned how to print output, and how fields, records and delimiters work and are referenced. Those are very basic operations, as we only extract data from input lines and print it. Now we move a step ahead and manipulate the extracted fields by performing some common arithmetic operations on them. Before proceeding, I recommend going through the third article in this series, Field separator and Field references.


Constants

There are only two types of constants:
A string constant

  • A string constant is always enclosed in double quotes, e.g. "Pineapple", "5_days", etc.
  • Strings can use escape sequences like \n ( newline ), \t ( horizontal tab ), \b ( backspace ), \v ( vertical tab ), \" ( double quote ), etc.
A numeric constant

  • A numeric constant is a number written without quotes, e.g. 42 or 3.14.
  • A number enclosed within quotes is treated as a string constant.
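
As a quick, minimal sketch of the difference between the two kinds of constants, the command below prints one string constant containing an escape sequence, then adds two numeric constants, then places two quoted numbers side by side. The values are only illustrative, and everything runs inside a BEGIN block so that no input file is needed.

```shell
awk 'BEGIN {
    print "Name:\tPineapple"    # \t inside a string constant becomes a tab
    print 5 + 5                 # numeric constants are added: prints 10
    print "5" "5"               # quoted numbers are strings, so they are joined: prints 55
}'
```
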
Variables

  • A variable is an identifier that refers to a memory location storing a value.
  • We can initialize a variable by assigning a value to it with the = operator, e.g., age = 20, firstName = "Eric", etc. Here, age and firstName are variables.
  • A variable name consists of letters, digits and underscores, and it must start with a letter or an underscore.
  • Variables are case-sensitive: Age, age and AGE are three different variables, and each holds its own value without overwriting the others.
  • Variable initialization is optional. If we do not initialize a variable, awk treats it as numeric 0 or an empty string ( "" ), depending on the context in which it is used.
  • When we assign two or more strings separated by spaces to a variable, it stores their concatenation.
  • We can assign a field value to a variable using field reference variables $1, $2, etc.
Example:

# Assign a numeric value to variable 'myNum'
myNum = 10

# Assign a string value to variable 'myStr'
myStr = "awk!"

# Space concatenates the strings, so 'myVar' stores the value "AwesomeAwk!"
myVar = "Awesome" "Awk!"

# Assign a field value to a variable using field reference variable
marks = $1
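
The statements above are fragments of an awk program. As a rough, runnable sketch of the same rules, the commands below show case sensitivity, the default values of unset variables, and a field assigned to a variable. The variable names count and label and the sample input line are made up for illustration.

```shell
awk 'BEGIN {
    age = 20; Age = 35       # two distinct variables, because names are case-sensitive
    print age, Age           # prints: 20 35
    print count + 1          # count was never set, so it acts as 0: prints 1
    print "[" label "]"      # label was never set, so it acts as "": prints []
}'

# Assign the first field of an input line to a variable, then use it in arithmetic
echo "72 Kibo" | awk '{ marks = $1; print marks + 5 }'    # prints 77
```
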





Arithmetic Operators

awk supports the basic arithmetic operators listed below for use in expressions:

Operator    Description
+           Addition
-           Subtraction
*           Multiplication
/           Division
%           Modulus
^ or **     Exponentiation
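
To see each operator in action, here is a small sketch with two arbitrary values, a and b. It uses ^ for exponentiation because, as far as I know, ** is accepted by gawk and some other implementations but is not required by POSIX, so ^ is the safer choice.

```shell
awk 'BEGIN {
    a = 17; b = 5
    print a + b    # 22
    print a - b    # 12
    print a * b    # 85
    print a / b    # 3.4  (division is not truncated to an integer)
    print a % b    # 2    (remainder of 17 divided by 5)
    print a ^ 2    # 289
}'
```
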

The example below shows how we can define a variable and perform arithmetic operations on it.

Example:

# Initialize the variable 'salary'
salary = 300000

# Add 25000 to variable 'salary' and store the result in another variable 'newSalary'
newSalary = salary + 25000

# Print the updated salary
print newSalary

Alternatively, you can print the sum of salary and 25000 directly, which shortens the code:

salary = 300000
print salary + 25000

This way, the print statement gives the same result, but the value stored in the variable salary remains unchanged. To update salary itself with the new value, we can use the assignment operator +=, which combines two operations, addition and assignment: 25000 is added to the value stored in salary, and the result is stored back in salary.
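
The difference between printing an expression and updating the variable can be sketched with the same salary figures, wrapped in a BEGIN block so the command runs without an input file:

```shell
awk 'BEGIN {
    salary = 300000
    print salary + 25000    # 325000, but salary itself is untouched
    print salary            # 300000
    salary += 25000         # same as: salary = salary + 25000
    print salary            # 325000
}'
```
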

Below is the list of assignment operators:

Operator      Description
+=            Add and assign
-=            Subtract and assign
*=            Multiply and assign
/=            Divide and assign
%=            Perform modulo and assign the result
^= or **=     Perform exponentiation and assign the result
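
As a short sketch, the command below applies each assignment operator in turn to a single variable x, so we can follow how its value changes step by step. The starting value and operands are arbitrary, and ^= is used instead of **= for portability.

```shell
awk 'BEGIN {
    x = 10
    x += 5;  print x    # 15
    x -= 3;  print x    # 12
    x *= 2;  print x    # 24
    x /= 4;  print x    # 6
    x %= 4;  print x    # 2
    x ^= 3;  print x    # 8
}'
```
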

To demonstrate this, we can count the number of lines in /etc/passwd. For this, we use a variable x and increment it after every line is read; x needs no explicit initialization, because awk treats an unset variable as 0. After the last line, the END block prints the variable, which gives us the total number of records read by awk.

# We can also use { x++ } or { x = x + 1 } instead of { x += 1 } in the command below
$ awk ' { x += 1 } END { print x } ' /etc/passwd
31

We can also add a condition here, to count only the lines that contain the string bash.

$ awk '/bash/ { x += 1 } END { print x } ' /etc/passwd
3
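
One detail worth keeping in mind with a conditional counter: if no line matches, x is never assigned, so print x outputs an empty line rather than 0. A possible workaround, sketched below, is to print x + 0, which forces the unset variable into a numeric context. The two input lines are made-up samples in /etc/passwd format, and zsh is just a pattern that matches neither of them.

```shell
# Two sample lines in /etc/passwd format; neither of them mentions zsh
printf '%s\n' 'root:x:0:0:root:/root:/bin/bash' 'daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin' |
    awk '/zsh/ { x += 1 } END { print x + 0 }'    # prints 0 instead of an empty line
```
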

Here is another example. This time, we use a demo file, result.txt, which has three fields: the name of the student, the subject name and the marks. Below is a snippet of the file.

Student Subject Marks
James Biology 31
Velma Biology 43
Kibo Biology 81
Louis Biology 11
Phyllis Biology 18
Zenaida Biology 55
Gillian Biology 38
Constance Biology 16
Giselle Biology 73
...
...

We can calculate the average marks obtained by students in Chemistry as follows:

$ awk ' /Chemistry/ { total += $3; count += 1 } END { print total/count } ' result.txt
52.4074
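
To try the same idea without the full result.txt, here is a sketch that feeds only the nine Biology rows from the snippet above to awk through a here-document, so its result applies to that snippet and not to the complete file. It also checks count before dividing, which is my own addition: if no line matched the pattern, total/count would be a division by zero and awk would stop with an error.

```shell
awk '/Biology/ { total += $3; count += 1 }
     END { if (count > 0) print total / count }' <<'EOF'
Student Subject Marks
James Biology 31
Velma Biology 43
Kibo Biology 81
Louis Biology 11
Phyllis Biology 18
Zenaida Biology 55
Gillian Biology 38
Constance Biology 16
Giselle Biology 73
EOF
# prints 40.6667 (366 marks across 9 students)
```
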

We can also consider a data set of cities and their temperatures as below:

Washington 18 23 21 19 16
London 10 7 13 5 -1
Moscow 2 0 -3 -7 1
Mumbai 24 27 29 29 28

We can find the average temperature for every city as shown below:

$ awk ' {total = $2 + $3 + $4 + $5 + $6; print $1 " : " total/5 }' cities.txt 
Washington : 19.4
London : 6.8
Moscow : -1.4
Mumbai : 27.4
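
The command above hard-codes five readings per city. As a possible variation, assuming the built-in variable NF (the number of fields in the current line) and a for loop, neither of which has been covered in this article, the sketch below sums however many readings a line contains. total is reset for every line so that one city's sum does not leak into the next.

```shell
awk '{
    total = 0                              # start fresh for each city
    for (i = 2; i <= NF; i++)              # fields 2..NF hold the temperatures
        total += $i
    print $1 " : " total / (NF - 1)        # NF - 1 readings per line
}' <<'EOF'
Washington 18 23 21 19 16
London 10 7 13 5 -1
Moscow 2 0 -3 -7 1
Mumbai 24 27 29 29 28
EOF
```

With the four lines shown, this prints the same averages as the fixed-field version: 19.4, 6.8, -1.4 and 27.4.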

That's all for this article. We did not cover all of the operators here, as they are pretty straightforward and most of you probably already know them. Let me know your views and feedback in the comments section below, and stay tuned for more articles.
