User:Milanand/Python 3 Programming/Strings

Overview
Strings in Python at a glance:
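As a quick orientation, the sketch below shows several common ways of writing string literals. The variable names are illustrative only.

```python
# Several ways to write string literals (names are illustrative).
single = 'Hello'              # single quotes
double = "Hello"              # double quotes create the same kind of object
escaped = "Hello\tWorld\n"    # \t is a tab, \n is a newline
raw = r"C:\temp\new"          # raw string: backslashes are kept literally
multiline = """Line one
Line two"""                   # triple quotes may span several lines

print(single == double)        # True
print(len(escaped))            # 12
print(len(raw))                # 11
print(multiline.splitlines())  # ['Line one', 'Line two']
```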

Equality
Two strings are equal if they have exactly the same contents, meaning that they are the same length and every character matches the character at the same position in the other string. Many other languages compare strings by identity instead; that is, two strings are considered equal only if they occupy the same space in memory. Python uses the == operator to compare contents and the is operator to test the identity of strings, and of any two objects in general.

Examples:
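A minimal sketch contrasting equality with identity. The second string is built at run time so that it is a separate object; whether two equal strings share one object is an implementation detail, so the identity result for a and b is described rather than guaranteed.

```python
a = "hello"
b = "".join(["hel", "lo"])   # same contents, built at run time

print(a == b)         # True: the contents match
print(a is b)         # usually False: two distinct objects
print(a == "Hello")   # False: comparison is case-sensitive

c = a                 # c is another name for the same object
print(a is c)         # True
```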

Numerical
There are two quasi-numerical operations which can be done on strings: addition and multiplication. String addition is just another name for concatenation. String multiplication is repeated addition, that is, repeated concatenation.
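Both operations can be sketched in a few lines (the variable names are illustrative):

```python
greeting = "Hello, " + "world"   # concatenation
repeated = "ab" * 3              # repetition
also_repeated = 3 * "ab"         # the operands may appear in either order

print(greeting)        # Hello, world
print(repeated)        # ababab
print(also_repeated)   # ababab
print("ab" * 0 == "")  # True: zero repetitions give the empty string
```

Note that a string cannot be added to a number directly; convert the number with str() first.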

Containment
The in operator returns True if the first operand is contained in the second. This also works on substrings, not just single characters.
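A short sketch of containment tests, using two illustrative strings:

```python
word = "hello"
part = "ell"

print(part in word)     # True: 'ell' is a substring of 'hello'
print(word in part)     # False: the order of the operands matters
print("L" in word)      # False: the test is case-sensitive
print("z" not in word)  # True: 'not in' is the negated form
```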

Indexing and Slicing
Much like arrays in other languages, the individual characters in a string can be accessed by an integer representing their position in the string. The first character of a string s is s[0], and the nth character is s[n-1].

Unlike arrays in many other languages, Python strings can also be indexed backwards, using negative numbers. The last character has index -1, the second-to-last character has index -2, and so on.

We can also use "slices" to access a substring of s: s[a:b] gives a string starting with s[a] and ending with s[b-1]. None of these are assignable: strings are immutable, so assigning to an index or a slice raises a TypeError.

Another feature of slices is that if the beginning or end is left empty, it defaults to the start or the end of the string, depending on context. You can also use negative numbers in slices.

To understand slices, it is easiest to count the positions between the elements rather than the elements themselves:

Element:     1     2     3     4
Index:    0     1     2     3     4
         -4    -3    -2    -1

So, when we ask for the [1:3] slice, we start at position 1, stop at position 3, and take everything in between: the characters at indexes 1 and 2. If you are used to indexes in C or Java, this can be a bit disconcerting until you get used to it.
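The indexing and slicing rules can be sketched on a four-character string (the string and variable names are illustrative):

```python
s = "spam"

first = s[0]      # 's'  first character
last = s[-1]      # 'm'  last character
middle = s[1:3]   # 'pa' characters at indexes 1 and 2
head = s[:2]      # 'sp' an empty start defaults to the beginning
tail = s[2:]      # 'am' an empty end defaults to the end
inner = s[-3:-1]  # 'pa' negative numbers work in slices too

# Strings are immutable, so assignment to an index fails.
try:
    s[0] = "S"
    assignable = True
except TypeError:
    assignable = False

print(first, last, middle, head, tail, inner, assignable)
```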

String constants
String constants can be found in the standard string module. An example is string.digits, which equals '0123456789'.
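A few of the constants defined by the string module, shown as a brief sketch:

```python
import string

print(string.digits)           # 0123456789
print(string.ascii_lowercase)  # abcdefghijklmnopqrstuvwxyz
print(string.ascii_uppercase)  # ABCDEFGHIJKLMNOPQRSTUVWXYZ
print(string.ascii_letters == string.ascii_lowercase + string.ascii_uppercase)  # True
```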

Links for further reference:


 * Python Documentation on String module

String methods
Strings have a number of built-in methods. The most commonly used ones are covered in the sections below.

is*
isalnum(), isalpha(), isdigit(), isspace(), islower(), isupper(), and istitle() fit into this category.

The string being tested must contain at least one character, or the is* methods will return False. In other words, a string of length 0 is considered "empty", and every is* method returns False for it.


 * isalnum() returns True if the string is entirely composed of alphabetic and/or numeric characters (i.e. no punctuation).
 * isalpha() and isdigit() work similarly for alphabetic characters or numeric characters only.
 * isspace() returns True if the string is composed entirely of whitespace.
 * islower(), isupper(), and istitle() return True if the string is in lowercase, uppercase, or titlecase respectively. Uncased characters, such as digits, are "allowed", but there must be at least one cased character in the string in order to return True. Titlecase means the first cased character of each word is uppercase, and any immediately following cased characters are lowercase. Curiously, 'Y2K'.istitle() returns True. That is because, in a titlecased string, uppercase characters can only follow uncased characters (such as the digit 2), and lowercase characters can only follow uppercase or lowercase characters. Hint: whitespace is uncased.

Example:
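A minimal sketch of the is* methods on a handful of illustrative strings; the results are collected in a dictionary (a hypothetical helper name, results) so they can be printed together.

```python
results = {
    "abc123 isalnum": "abc123".isalnum(),        # True
    "abc 123 isalnum": "abc 123".isalnum(),      # False: contains a space
    "abc isalpha": "abc".isalpha(),              # True
    "2000 isdigit": "2000".isdigit(),            # True
    "whitespace isspace": " \t\n".isspace(),     # True
    "hello 1 islower": "hello 1".islower(),      # True: uncased characters allowed
    "1234 islower": "1234".islower(),            # False: no cased character
    "HELLO isupper": "HELLO".isupper(),          # True
    "Hello World istitle": "Hello World".istitle(),  # True
    "Y2K istitle": "Y2K".istitle(),              # True: K follows an uncased digit
    "empty isalnum": "".isalnum(),               # False: empty string
}

for label, value in results.items():
    print(label, value)
```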

title, upper, lower, swapcase, capitalize
These methods return a copy of the string converted to title case, upper case, or lower case, with the case of every letter inverted, or capitalized, respectively.

The title() method capitalizes the first letter of each word in the string (and makes the rest lower case). Words are identified as substrings of alphabetic characters that are separated by non-alphabetic characters, such as digits or whitespace. This can lead to some unexpected behavior. For example, the string "x1x" will be converted to "X1X" instead of "X1x".

The swapcase() method makes all uppercase letters lowercase and vice versa.

The capitalize() method is like title() except that it considers the entire string to be one word (i.e. it makes the first character upper case and the rest lower case).

Example:
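A short sketch applying each case method to one illustrative mixed-case string:

```python
s = "hello wOrLD"

titled = s.title()            # 'Hello World'
uppered = s.upper()           # 'HELLO WORLD'
lowered = s.lower()           # 'hello world'
swapped = s.swapcase()        # 'HELLO WoRld'
capitalized = s.capitalize()  # 'Hello world'
digit_case = "x1x".title()    # 'X1X': the digit splits the string into two words

print(titled, uppered, lowered, swapped, capitalized, digit_case, sep="\n")
```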

count
Returns the number of non-overlapping occurrences of the specified substring in the string. Hint: count() is case-sensitive, so searching for a lowercase letter counts only its lowercase occurrences.
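A minimal sketch of count(), including the case-sensitive behavior; the sample strings are illustrative.

```python
s = "Hello, world"

lower_o = s.count("o")               # 2
lower_l = s.count("l")               # 3
upper_text = "HELLO, WORLD".count("o")  # 0: no lowercase 'o' present
non_overlap = "aaaa".count("aa")     # 2: matches do not overlap

print(lower_o, lower_l, upper_text, non_overlap)
```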

strip, rstrip, lstrip
Returns a copy of the string with the leading (lstrip) or trailing (rstrip) whitespace removed; strip removes both. Whitespace includes tabs and newlines as well as spaces.
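A brief sketch with an illustrative string that has tabs, spaces and newlines at both ends:

```python
s = "\t Hello, world\n\t "

both = s.strip()    # 'Hello, world'
left = s.lstrip()   # 'Hello, world\n\t '
right = s.rstrip()  # '\t Hello, world'

print(repr(both))
print(repr(left))
print(repr(right))
```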

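Strip methods can also be used to remove other types of characters: the argument is treated as a set of characters to remove from the ends, not as a prefix or suffix. Note that constants such as string.ascii_lowercase and string.ascii_uppercase require an import string statement.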
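The character-set behavior can be sketched as follows; the sample domain name is illustrative.

```python
import string

s = "www.wikibooks.org"

no_w = s.strip("w")           # '.wikibooks.org'
no_gor = s.strip("gor")       # 'www.wikibooks.'
no_mixed = s.strip(".gorw")   # 'ikibooks': every listed character is removed from both ends
no_lead = s.lstrip(string.ascii_lowercase)  # '.wikibooks.org'

print(no_w, no_gor, no_mixed, no_lead, sep="\n")
```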

ljust, rjust, center
These methods left-justify, right-justify, or center a string in a field of the given width (the rest is padded with spaces).
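A minimal sketch of the three justification methods on an illustrative three-character string. The optional second argument, a fill character, is part of the standard methods.

```python
s = "foo"

left = s.ljust(7)          # 'foo    '
right = s.rjust(7)         # '    foo'
middle = s.center(7)       # '  foo  '
starred = s.center(7, "*") # '**foo**'
too_small = s.ljust(2)     # 'foo': never truncated

print(repr(left), repr(right), repr(middle), repr(starred), repr(too_small))
```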

join
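Joins together the items of the given sequence, using the string as the separator. Every item must itself be a string. map() may be helpful here: map(str, seq) converts the items of a sequence seq (numbers, for example) into strings, so that arbitrary objects may be in seq instead of just strings.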
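A short sketch of join(), first with strings and then with numbers converted through map(); the list names are illustrative.

```python
seq = ["1", "2", "3", "4", "5"]
spaced = " ".join(seq)   # '1 2 3 4 5'
plussed = "+".join(seq)  # '1+2+3+4+5'

numbers = [1, 2, 3, 4, 5]
converted = " ".join(map(str, numbers))  # '1 2 3 4 5'

print(spaced, plussed, converted, sep="\n")
```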

find, index, rfind, rindex
The find() and index() methods return the index of the first occurrence of the given substring. If it is not found, find() returns -1 but index() raises a ValueError. rfind() and rindex() are the same as find() and index() except that they search through the string from right to left (i.e. they find the last occurrence). Because Python strings accept negative subscripts, index() is probably the better choice when the result is used directly as a subscript: the -1 returned by find() on failure would silently select the last character, an unintended value.
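The difference between the four methods, and the pitfall with -1, can be sketched like this (the sample string is illustrative):

```python
s = "Hello, world"

first_l = s.find("l")    # 2
same_l = s.index("l")    # 2
last_l = s.rfind("l")    # 10
same_last = s.rindex("l")  # 10
missing = s.find("z")    # -1

# index() signals a missing substring with an exception.
try:
    s.index("z")
    raised = False
except ValueError:
    raised = True

# Using the -1 from find() as a subscript silently picks the last character.
unintended = s[s.find("z")]  # 'd'

print(first_l, same_l, last_l, same_last, missing, raised, unintended)
```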

replace
replace() works just like it sounds. It returns a copy of the string with all occurrences of the first parameter replaced with the second parameter. The result can be assigned to a variable; note that the original variable remains unchanged after the call to replace().
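A minimal sketch of replace(), including the optional third argument that limits the number of replacements; the names are illustrative.

```python
s = "Hello, world"

new = s.replace("o", "X")        # 'HellX, wXrld'
limited = s.replace("l", "L", 1) # 'HeLlo, world': only the first occurrence

print(new)
print(limited)
print(s)   # Hello, world (the original is unchanged)
```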

expandtabs
Replaces tabs with the appropriate number of spaces (default number of spaces per tab = 8; this can be changed by passing the tab size as an argument). The original and the expanded string may look the same when printed, but the expanded string has a different length because each tab is represented by spaces rather than a tab character.
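A short sketch comparing lengths before and after expansion; s and t are illustrative names.

```python
s = "abcdefg\tabc\ta"
t = s.expandtabs()   # default tab size of 8

print(len(s))        # 13: each tab counts as one character
print(len(t))        # 17: tabs replaced by 1 and 5 spaces
print("\t" in t)     # False
```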

To use a tab size of 4 instead of 8, pass 4 as the argument. Please note that a tab is not always counted as eight spaces. Rather, a tab "pushes" the count to the next multiple of the tab size (eight by default).
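The "push to the next multiple" behavior can be sketched with an illustrative string whose pieces have different lengths:

```python
s = "a\tbc\tdef"

four = s.expandtabs(4)   # 'a' + 3 spaces, 'bc' + 2 spaces, 'def'
eight = s.expandtabs()   # 'a' + 7 spaces, 'bc' + 6 spaces, 'def'

print(repr(four))
print(repr(eight))
print(len("1234567\t".expandtabs()))  # 8: the tab adds a single space here
```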

split, splitlines
The split() method returns a list of the words in the string. It can take a separator argument to use instead of whitespace. Note that in neither case is the separator included in the split strings, but when an explicit separator is given, empty strings can appear in the result.
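A brief sketch of split() with and without a separator; the sample strings are illustrative.

```python
s = "Now is the winter of our discontent"

words = s.split()                  # split on runs of whitespace
fields = "a,b,,c".split(",")       # ['a', 'b', '', 'c']
collapsed = "  a  b ".split()      # ['a', 'b']
explicit = "  a  b ".split(" ")    # ['', '', 'a', '', 'b', '']

print(words)
print(fields)
print(collapsed)
print(explicit)
```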

The splitlines() method breaks a multiline string into many single-line strings. It is analogous to split('\n') (but accepts '\r' and '\r\n' as delimiters as well) except that if the string ends in a newline character, splitlines() ignores that final character. The split() method also accepts multi-character string literals as separators.
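The following sketch contrasts splitlines() with split('\n') on an illustrative string that mixes line endings, and shows split() with a two-character separator.

```python
s = "One line\nTwo lines\r\nThree lines\rFour lines\n"

lines = s.splitlines()
# ['One line', 'Two lines', 'Three lines', 'Four lines']

naive = s.split("\n")
# ['One line', 'Two lines\r', 'Three lines\rFour lines', '']

parts = "a, b, c".split(", ")   # ['a', 'b', 'c']

print(lines)
print(naive)
print(parts)
```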