05 Mar 2021 - fubar - Sreekar Guddeti
String literals appear at various places in a plot, such as the axis labels, legend descriptions, and text annotations. They may also contain non-ASCII Unicode characters. Sometimes we need to typeset mathematical formulae. Matplotlib, in conjunction with Python's string literal handling, lets us do these tasks seamlessly.
While presenting scientific data in the form of graphs, we need to present the important parameters characterizing the data. These parameters can be statistical in nature or controls of the experiment. We need to truncate these parameter values to a reasonable precision (usually 6 digits after the decimal point is sufficient). Frequently, we need to parameterize the text so that changing the parameters changes the data and the graphs track these changes. f-strings and r-strings ease this task.
A Python literal is a notation for a constant value of a built-in type. Python has string and bytes literals and numeric literals (integer, floating point, and imaginary). For example, a simple decimal number is written with the digits 0-9 and an optional dot. In this post, let us focus on the rules for string literals.
The lexical definition for a string literal is
stringliteral ::= [stringprefix](shortstring | longstring)
stringprefix ::= "r" | "u" | "R" | "U" | "f" | "F"
| "fr" | "Fr" | "fR" | "FR" | "rf" | "rF" | "Rf" | "RF"
shortstring ::= "'" shortstringitem* "'" | '"' shortstringitem* '"'
longstring ::= "'''" longstringitem* "'''" | '"""' longstringitem* '"""'
The syntax is to be read as follows: an item within [] is optional, an item within () is mandatory (the parentheses group the alternatives), and | separates the available options.
Some valid string literals are
In [152]: 'a'
Out[152]: 'a'
In [153]: "a"
Out[153]: 'a'
with shortstring syntax, and
In [156]: """a"""
Out[156]: 'a'
In [157]: '''a'''
Out[157]: 'a'
with longstring syntax.
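As a small illustration of the two forms above, the sketch below checks that the quote style does not change the value, and that a long string can span lines. The variable names are made up for this example.

```python
# Single and double quotes denote the same string value.
single = 'a'
double = "a"
print(single == double)  # True

# A triple-quoted (long) string may span several lines;
# the line break becomes part of the string.
multi = """first
second"""
print(repr(multi))  # 'first\nsecond'

# Using the other quote character avoids escaping the inner quote.
mixed = "it's"
print(mixed)  # it's
```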
Now coming to the optional prefix stringprefix. There are two broad categories: raw and formatted string literals.
Usually the above notations are sufficient for typical situations. However, some sequences of characters, usually starting with a backslash \, have special meaning. These sequences are called escape sequences. Some escape sequences and their actions are
escape sequence | action |
---|---|
\n | end of line |
\t | tab |
If our string contains such a sequence and we do not want to trigger the corresponding action, we need to prefix the literal with r or R. Such a string literal defines a raw string.
In [158]: '\n'
Out[158]: '\n'
In [159]: r'\n'
Out[159]: '\\n'
It is to be borne in mind that NEWLINE is a single character, and the string literal '\n' is the one-character string representing it. In the raw string literal r'\n', the backslash is not treated as an escape character, so it is a two-character string, as shown below.
In [98]: newline = '\n'
In [99]: len(newline)
Out[99]: 1
In [100]: raw_newline = r'\n'
In [101]: len(raw_newline)
Out[101]: 2
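The same point matters for TeX commands, many of which begin with a letter that forms an escape sequence. A minimal sketch, with illustrative variable names:

```python
# A raw string is equivalent to escaping every backslash by hand.
escaped = '\\n'
raw = r'\n'
print(escaped == raw)  # True
print(list(raw))       # ['\\', 'n']

# Without the r prefix, '\t' in '\tau' is read as a TAB character.
tex = r'\tau'
broken = '\tau'
print(len(tex), len(broken))  # 4 3
```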
Formatted string literals (f-strings) are a relatively new addition to Python (version 3.6 onwards). They were proposed in PEP 498, titled Literal String Interpolation, to improve upon previous string formatting methods.
f-strings share the same syntax as regular string literals.
While other string literals always have a constant value, formatted strings are really expressions evaluated at run time.
To make this possible, f-strings have replacement fields that hold expressions, which can be interspersed with ordinary characters. The replacement fields are enclosed within {}, as in
In [27]: t = 300
In [28]: print(f'T = {t} K')
T = 300 K
In [29]: print(f'{t}')
300
In [30]: print(f'{t=}')
t=300
The third variant ends the expression with = and prints the expression text along with its value (version 3.8 onwards). This is useful while debugging.
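To see that an f-string is an expression evaluated at run time rather than a constant, consider the sketch below. The helper name temperature_label is hypothetical and only serves the illustration.

```python
def temperature_label(t):
    # The replacement field {t} is evaluated every time this line runs.
    return f'T = {t} K'

print(temperature_label(300))  # T = 300 K
print(temperature_label(77))   # T = 77 K

# A replacement field can hold any expression, not only a variable name.
t = 300
print(f'2T = {2 * t} K')  # 2T = 600 K
```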
Format Specification Mini language
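A replacement field can carry a format specification after a colon, which is how a parameter value is cut to a chosen precision. The following is a short sketch of a few common specifications; the value and variable names are chosen only for illustration.

```python
import math

x = math.pi * 1e3  # roughly 3141.5927

print(f'{x:.2f}')     # fixed point, 2 digits after the dot: 3141.59
print(f'{x:.3e}')     # scientific notation: 3.142e+03
print(f'{x:.6g}')     # 6 significant digits: 3141.59
print(f'{x:10.1f}|')  # minimum width 10, right aligned:     3141.6|

# The precision itself can come from a variable via a nested field.
precision = 4
print(f'{x:.{precision}f}')  # 3141.5927
```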
The motivation for writing this post came after reading this SO post
Frequently, we use Greek characters to denote physical quantities. We can use Unicode escape sequences for these characters.
plt.figure()
plt.xlabel('\u03bb (in nm)')
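The escape sequence can be checked without drawing anything. The sketch below uses only the standard library; note that the Unicode standard spells the character name as LAMDA.

```python
import unicodedata

lam = '\u03bb'
print(lam == 'λ')             # True
print(unicodedata.name(lam))  # GREEK SMALL LETTER LAMDA

# The same character can be written by name.
print('\N{GREEK SMALL LETTER LAMDA}' == lam)  # True

# In a raw string the escape is NOT interpreted: six separate characters remain.
print(len(r'\u03bb'))  # 6

# The character can be combined with an f-string to build an axis label.
xlabel = f'{lam} (in nm)'
print(xlabel)  # λ (in nm)
```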
Additionally, we need to add mathematical expressions. To write mathematical expressions in Matplotlib, there are two ways: mathtext, an inbuilt TeX parser, and full LaTeX rendering. The mathtext way is easier. It works best, however, with raw strings, so that backslashes in TeX commands are not read as escape sequences, with the math text enclosed within $
plt.figure()
plt.ylabel(r'$\omega$ (in THz)')
The LaTeX way adds flexibility and aesthetics at the expense of the additional setup of a working LaTeX installation.
import matplotlib as mpl
import matplotlib.pyplot as plt
# use custom style sheet
mpl.style.use('myMatplotlibStylesheet.mplstyle')
# use LaTeX rendering
mpl.rcParams['text.usetex']=True
plt.figure()
plt.xlabel(r'$\lambda$ (in nm)')
plt.ylabel(r'$\omega$ (in THz)')
From the Matplotlib documentation on text rendering with LaTeX:
Certain characters require special escaping in TeX, such as
$ % & ~ _ ^ \ { } ( ) [ ]
So in LaTeX mode, Unicode characters may not render as expected, and writing the TeX command (for example \lambda) is the safer choice.
Formatted raw 'fr'-strings, available from Python version 3.6 onwards (with the = specifier added in 3.8), are a pleasant improvement in the string formatting capability of Python when dealing with scientific data.
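As a closing sketch, an fr-string combines both prefixes: backslashes stay literal for TeX, while replacement fields are still evaluated. The value of omega is made up for this example. One thing to watch is that literal braces, which TeX uses for grouping, must be doubled inside an f-string.

```python
omega = 1.234567  # illustrative value, in THz

# Raw keeps \omega intact; the replacement field formats the number.
label = fr'$\omega = {omega:.3f}$ THz'
print(label)  # $\omega = 1.235$ THz

# Literal TeX braces are written as {{ and }} inside an f-string.
label_sub = fr'$\omega_{{0}} = {omega:.3f}$ THz'
print(label_sub)  # $\omega_{0} = 1.235$ THz
```

A string built this way can be passed to plt.xlabel or plt.ylabel like any other label.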