


Saturday, 22 October 2016

Python String Operations - Concatenation, Repetition, Index and Slice

Hello readers! This is the 6th article in the series 'Python on Terminal', and it introduces Python strings. In this article, we will learn about string properties and the basic operations associated with them.



A string is a sequence of characters (or simply, some text) enclosed in single, double or triple quotes. It can hold names, numbers, symbols, a website's URL, this article's contents, ASCII and non-ASCII characters and so on. This makes somestring, Some Other String, Only 1 String and What a String! all valid strings once they are placed inside quotes. Just like everything else in Python, a string is an 'object', and it belongs to the built-in str class, as we will see shortly.

Creating Strings

Creating strings is child's play! Just open your Python interpreter, type some text enclosed in single or double quotes and hit Enter.

Let's begin by creating an empty string, first with double quotes and then with single quotes:

>>> ""
''

>>> ''
''

Now, we create a non-empty string with some text enclosed in single quotes or double quotes:

>>> 'I<3Python!'
'I<3Python!'

>>> "I<3Python!"
'I<3Python!'

Congrats, you've just created a string!

One thing to note here is that even though we used double quotes, the output is the same as for the string in single quotes. Then why should we use double quotes?
Just try creating the string I'm a big fool, first using single quotes and then using double quotes.

>>> 'I'm a big fool'
  File "<stdin>", line 1
    'I'm a big fool'
       ^
SyntaxError: invalid syntax

>>> "I'm a big fool"
"I'm a big fool"

This is because, when we write 'I'm a big fool', Python sees three 's and treats the second ' as the closing quote, so the rest of the line no longer makes sense to it. We have to tell Python that the second ' is part of the text and not the closing quote, by escaping it with a \ as shown below:

>>> 'I\'m a big fool'
"I'm a big fool"

This way, Python knows that the escaped ' isn't the closing quote. The same applies to double quotes: when a string enclosed in double quotes contains "s, you have to escape them. Have a look at the example below:

>>> ""You are a big fool", he said."
  File "<stdin>", line 1
    ""You are a big fool", he said."
        ^
SyntaxError: invalid syntax

>>> "\"You are a big fool\", he said."
'"You are a big fool", he said.'

One more point to note: you need not escape ' inside double quotes, or " inside single quotes.

>>> "I'm a big fool"
"I'm a big fool"

>>> '"You are a big fool", he said.'
'"You are a big fool", he said.'

In the first example above, we used ' inside double quotes without escaping it, while in the second, we used " inside single quotes. Another way is to use triple quotes, with which -

1. You need not escape single or double quotes inside the string. Just put the starting """ or ''', keep writing, and put the matching ending """ or '''. With """,
>>> """I'm a big fool"""
"I'm a big fool"

>>> """"You are a big fool", he said."""
'"You are a big fool", he said.'

and with ''',

>>> '''I'm a big fool'''
"I'm a big fool"

>>> '''"You are a big fool", he said.'''
'"You are a big fool", he said.'

2. You can write a string that spans multiple lines

>>> '''She: "What's your name?"
... He: "Bond, James Bond."
... She: "Nice name"
... '''
'She: "What\'s your name?"\nHe: "Bond, James Bond."\nShe: "Nice name"\n'

Observe the \n characters in the output string: Python recorded one for every line break, so you never need to type them yourself when using triple quotes. Also, if you assign this triple-quoted string to a variable and then print that variable, you will get the expected output on the screen, with each line of text on its own line.

>>> myString = '''She: "What's your name?"
... He: "Bond, James Bond."
... She: "Nice name"
... '''
>>> print myString
She: "What's your name?"
He: "Bond, James Bond."
She: "Nice name"

In the above example, we created a multiline string using triple quotes and saved it in a variable named myString.
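The quoting styles covered so far can be compared side by side. The following is only a small sketch: the variable names are made up for illustration, and print is called as a function so that the script also runs on Python 3 (the transcripts in this article come from a Python 2 interpreter).

```python
# Four ways of writing the same text; only the quoting differs.
single = 'I\'m a big fool'
double = "I'm a big fool"
triple_single = '''I'm a big fool'''
triple_double = """I'm a big fool"""

# All four literals produce equal str objects.
all_equal = single == double == triple_single == triple_double
print(all_equal)  # True

# A triple-quoted literal keeps every line break as a '\n' character.
dialogue = '''She: "What's your name?"
He: "Bond, James Bond."
'''
print(dialogue.count('\n'))  # 2
```

Whichever style you pick, Python stores the same text; the quotes are only a way of writing it down.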

String Operations

Now that we have learned to create single-line and multi-line strings, it's time to look at the operations associated with them. These operations either give information about an object (a string, in our case) or do something with it. Let's begin with the built-in type() function.

The Python built-in type() function, when given an object (a string, in the context of this article) as an argument, returns the type of that object. To verify this, let us create a string variable myNewString and pass it to the type() function.

>>> myNewString = "Some stupid text here..."
>>> type(myNewString)
<type 'str'>

or, simply pass the string directly to the type() function as a parameter.

>>> type('Old MacDonald Had a Farm...')
<type 'str'>

The output <type 'str'> makes it clear that the object you passed to type() is of type str, that is, it belongs to the built-in class str (more on this when we learn about Python classes).
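The same check can be written inside a script. This is a minimal sketch with an illustrative variable name; it uses the built-in isinstance() function, which is not covered in this article, as a yes/no alternative to comparing types by hand.

```python
my_new_string = "Some stupid text here..."

# type() returns the class of the object; for any string that class is str.
print(type(my_new_string) is str)       # True
print(type('') is type(my_new_string))  # True

# isinstance() asks the same question and answers with True or False.
print(isinstance(my_new_string, str))   # True
print(isinstance(60, str))              # False
```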

String Concatenation and String Repetition

Python strings can be concatenated with the + sign. For those who do not know what concatenation is, it joins (or links, or places side by side) two or more strings together. So, concatenating the words Hello and World gives us a new string object, HelloWorld. Let's check this in the terminal:

>>> str1 = "Code"
>>> str2 = "Ninja"
>>> str3 = ".in"
>>> str1 + str2 + str3
'CodeNinja.in'

Cool! We've just concatenated three strings. What if we have to concatenate two strings and one integer: 'He is ', 60 and ' years old.'? Will the above trick work?

>>> str1 = 'He is '
>>> str2 = 60
>>> str3 = ' years old.'
>>> str1 + str2 + str3
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: cannot concatenate 'str' and 'int' objects

That's an exception (or simply, an error), and it says: TypeError: cannot concatenate 'str' and 'int' objects. It means that the + operator will not join a str (a string object) and an int (an integer) directly; it only concatenates strings with strings. So we first need to convert the integer to a str, which can be done with the Python built-in str(). It takes an object and returns its string version. When we pass the integer 60 to str(), it returns '60', a str object (look at those quotes there!). Now we can concatenate the two strings with the string version of the integer, as below:

>>> str1 = 'He is '
>>> str2 = 60
>>> str3 = ' years old.'
>>> str1 + str(str2) + str3
'He is 60 years old.'

Note: + concatenates a string only with other strings. An object of any other type has to be converted to str first.
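If you mix strings and numbers often, the conversion can be wrapped in a small function. The sketch below is one possible way to do it; join_as_text is a hypothetical helper written for this article, not something built into Python.

```python
def join_as_text(*parts):
    """Concatenate the parts with +, converting each one with str() first.

    Hypothetical helper for illustration; not a Python built-in.
    """
    result = ''
    for part in parts:
        result = result + str(part)
    return result


sentence = join_as_text('He is ', 60, ' years old.')
print(sentence)  # He is 60 years old.
```

Every part goes through str() before it reaches the + operator, so the TypeError from the earlier example cannot occur here.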

We have another operator, *, with which we can repeat a string object. To repeat the string 'Zero' N times, we use 'Zero' * N, which produces a single string ZeroZeroZero... containing Zero N times. Let's do it on the terminal:

>>> 'CodeNinja' * 6
'CodeNinjaCodeNinjaCodeNinjaCodeNinjaCodeNinjaCodeNinja'

>>> myString = 'CodeNinja'
>>> myString * 6
'CodeNinjaCodeNinjaCodeNinjaCodeNinjaCodeNinjaCodeNinja'
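A few properties of repetition are easy to confirm in a script. This is a short sketch with illustrative variable names:

```python
word = 'Zero'
n = 3

repeated = word * n
print(repeated)                        # ZeroZeroZero

# Repeating is the same as concatenating the string with itself n times.
print(repeated == word + word + word)  # True

# The length grows by a factor of n.
print(len(repeated) == len(word) * n)  # True

# The number may also come first, and repeating 0 times gives an empty string.
print(n * word == repeated)            # True
print(word * 0 == '')                  # True
```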

String Indexing and String Slicing

Every character in a string, the string being a sequence of characters, has a position called its 'index' ('indices' in plural). Using these indices, we can access a single item of the string, walk (or 'iterate') through the string, and take out a sub-string (or 'slice' the string). The number of items in a string is called its 'length'. As you might have guessed, indices start from zero at the left end, and string items can be accessed using a very popular syntax: stringName[index]. Thus, stringName[0] is the first item of the string and stringName[3] is the fourth one.

String objects also support negative indexing, with which we can access items from the end of the string (counting backward) instead of from the beginning. This is useful when you have a long string (say, 41 characters long) and need its third-last item. If you do not want to use a negative index, you must know the length of the string, which you can find by passing the string to the len() function. With a length of 41, the last index is 40, so the third-last item is stringName[38]. As an alternative, you can simply use -3 as the index to access the same item. In short, stringName[-3] is easier to read and write than stringName[38].

Consider the example below:

>>> myString = "Here is a long, useless and boring stuff!"

# Determine the length of the string
>>> len(myString)
41

# Accessing 6th element
>>> myString[5]
'i'

# Accessing 4th element from last
>>> myString[-4]
'u'
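The relation between a negative index and the string length can be checked directly. The following is a small sketch, with variable names chosen only for illustration:

```python
my_string = "Here is a long, useless and boring stuff!"
length = len(my_string)
print(length)  # 41

# The third-last item, reached from the front and from the back.
from_front = my_string[length - 3]  # same as my_string[38]
from_back = my_string[-3]
print(from_front == from_back)  # True

# In general, my_string[-k] is the same item as my_string[length - k].
same_everywhere = all(
    my_string[-k] == my_string[length - k] for k in range(1, length + 1)
)
print(same_everywhere)  # True
```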

String slicing allows us to extract a portion of a string (a sub-string) from the original string. We just need to say where to start and where to stop, using the syntax stringName[START:END], where START is inclusive and END is non-inclusive. So, myString[1:6] starts at the item at index '1' and runs up to, but not including, the item at index '6'. Effectively, you get a sub-string of the items from index '1' through index '5'.

If we leave out the END index, we get a sub-string starting at index START and running to the end of the string. Similarly, if we leave out the START index, we get a sub-string starting from the item at index '0' and running up to, but not including, the one at index END. Apart from START and END, there is an optional third value, STEP, which is added to the index after each item is picked. So, stringName[START:END:STEP] gives every STEPth item of stringName, starting with the item at index START and stopping before the item at index END. With a negative STEP, the slice runs backward through the string.

Have a look at the examples below for more clarity on the above description.

>>> myString = "CodeNinjaDotIn"

# Checking length of the string
>>> len(myString)
14

# Slice from index '2' up to but not including index '7'
>>> myString[2:7]
'deNin'

# Slice from index '5' onwards
>>> myString[5:]
'injaDotIn'

# Slice up to but not including index '9' 
>>> myString[:9]
'CodeNinja'

# Full slice
>>> myString[:]
'CodeNinjaDotIn'

# Slice including every other item starting from '0'th
>>> myString[0::2]
'CdNnaoI'

# Slice of every third element beginning from index at '0'
>>> myString[0:8:3]
'Cen'

# String reversal with a step of -1
>>> myString[::-1]
'nItoDajniNedoC'

# Slice counting backwards starting at index '7' 
# up to but not including index '2' and
# catching every other item
>>> myString[7:2:-2]
'jie'
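To see what START, END and STEP do, a slice can be rebuilt by hand, one index at a time. The sketch below is only an illustration: manual_slice is a hypothetical helper, and it assumes that START and END are non-negative and lie within the string, which real slices do not require.

```python
def manual_slice(text, start, end, step=1):
    """Rebuild text[start:end:step] by visiting one index at a time.

    Hypothetical helper for illustration. Assumes start and end are
    non-negative and within the string.
    """
    picked = ''
    # range() yields start, start + step, ... and stops before end.
    for index in range(start, end, step):
        picked = picked + text[index]
    return picked


my_string = "CodeNinjaDotIn"

print(manual_slice(my_string, 2, 7))      # deNin
print(manual_slice(my_string, 0, 8, 3))   # Cen
print(manual_slice(my_string, 7, 2, -2))  # jie

# Each result matches the corresponding built-in slice.
print(manual_slice(my_string, 2, 7) == my_string[2:7])  # True
```

The loop makes the rule visible: STEP is added to the index after every item, and the item at END itself is never picked.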

Strings are 'Immutable'!

Python objects are said to be 'mutable' if their value can be changed in place; otherwise they are called 'immutable'. Before we conclude (the section title has already given it away) whether a string object is mutable or immutable, let us try to change its value from CodeNinjaDotIn to Cod3NinjaDotIn as below:
>>> myString = "CodeNinjaDotIn"

>>> myString[3] = '3'
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'str' object does not support item assignment
You cannot change a string object in place, which means that strings are immutable.
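Immutability does not stop us from building a new string out of pieces of an old one. As a minimal sketch using only the slicing and concatenation described in this article (the variable names are illustrative):

```python
my_string = "CodeNinjaDotIn"

# Take everything before index 3, add the new character,
# then add everything after index 3.
changed = my_string[:3] + '3' + my_string[4:]

print(changed)    # Cod3NinjaDotIn
print(my_string)  # CodeNinjaDotIn
```

The original string is left untouched; changed is a separate, new string object.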

So, in this article, we have learned about Python string objects: how they are created, how they are concatenated and repeated, and how they are indexed and sliced. In the next article on Python strings, we will learn about string methods. Please post your feedback in the comment section below.

This article is originally published at www.codeninja.in - Python Strings - Creation, Concatenation, Repetition, Indexing and Slicing
