What do the letters placed before strings mean?

1

In Python, I noticed these "indications" being used in two cases. The first is on strings before they are passed to a hash algorithm, e.g.:

import hashlib

m = hashlib.sha256()

m.update(b"Nobody inspects")

Note: The indication I am referring to is the 'b' before "Nobody inspects".

The second is on strings before they are encoded:

plaintext = u'algor\u00edtimo'

encodedtext = plaintext.encode('utf-8')

Note: The indication this time is the 'u' before 'algor\u00edtimo'.

I would like to know what they mean, because as far as I know these "indications" are not needed in Python for a print or something similar, only in cases such as using the hashlib library.

2 answers

2

They indicate which kind of string literal is being written. Each prefix has a different meaning, and the interpreter treats the literal accordingly. The documentation on lexical analysis describes all of them.

In this case, u indicates that the literal is a Unicode string (text, not a particular encoding such as UTF-8), and b indicates a bytes literal, which may only contain ASCII characters directly and, in Python 3, is not treated as type str. You can also use r, alone or together with b, to indicate raw text in which backslashes are not treated in a special way.
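As a rough illustration of the prefixes described above, here is a minimal sketch assuming Python 3.3 or later (where the u prefix is accepted again). The variable names are only for illustration.

```python
# Python 3: compare what each literal prefix produces.
text = 'algor\u00edtimo'        # no prefix: str (Unicode text)
also_text = u'algor\u00edtimo'  # u prefix: accepted, but changes nothing in Python 3
data = b'Nobody inspects'       # b prefix: bytes, only ASCII characters in the literal
raw_data = rb'\d+'              # r and b combined: raw bytes, the backslash is kept

print(type(text).__name__, type(also_text).__name__)  # both are str
print(text == also_text)                              # same value either way
print(type(data).__name__)                            # bytes
print(len(raw_data))                                  # backslash, 'd' and '+'
```

In Python 2 the same literals behave differently: only the u-prefixed one is Unicode text, and b'...' is just an ordinary str.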

1


Open a Python 2.7 terminal and run:

str1 = 'teste'
str2 = u'teste'

Now look at the types of the two:

type(str1)
str

type(str2)
unicode

Python 2 needs the u' prefix to indicate that the string is Unicode (to work with accented characters, for example). If you run the same test in Python 3, you will notice that u' has become unnecessary, because every str is already Unicode.

Now go back to the terminal (use Python 3 here; in Python 2, b'...' is just a str) and run:

str1 = 'string1'
str2 = b'string1'

Now let’s look at the types of these strings:

print (type(str1))
<class 'str'>

print (type(str2))
<class 'bytes'> 

See this note from the documentation:

Note: While strings are sequences of characters (represented by strings of length 1), bytes and bytearray objects are sequences of integers (from 0 to 255), representing the ASCII value of each byte. This means that for a bytes or bytearray object b, b[0] will return an integer. See the full text in the documentation.

The b indicates that the object is of type bytes.
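To tie this back to the hashlib example in the question, here is a small sketch (Python 3 assumed) of how a str becomes bytes, what indexing those bytes gives, and why the hash functions want bytes. The sample word is the one from the question.

```python
import hashlib

text = 'algor\u00edtimo'          # str: 10 characters
encoded = text.encode('utf-8')    # bytes: the accented letter takes 2 bytes

print(len(text), len(encoded))
print(encoded[0])                 # indexing bytes gives an int (97 is ASCII 'a')
print(encoded[:1])                # slicing gives bytes again

# hashlib works on bytes-like input, so hash the encoded form.
digest = hashlib.sha256(encoded).hexdigest()
print(len(digest))                # SHA-256 digest as hex characters

# Passing the str directly is rejected in Python 3.
try:
    hashlib.sha256(text)
    rejected = False
except TypeError:
    rejected = True
print(rejected)
```

That is why the question's example writes b"Nobody inspects": the literal is already bytes, so no encode step is needed before update().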

Now go back to the terminal and run:

str1 = 'Linha 1\nLinha 2'
str2 = r'Linha 1\nLinha 2'

Now see the difference between the two when you print them:

print (str1)
Linha 1
Linha 2

print (str2)
Linha 1\nLinha 2

The r (raw) indicates that it is a "raw" string: escape sequences such as \n are not interpreted, and the backslash stays in the string as an ordinary character.
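A quick way to confirm this is to compare the lengths of the two strings. The sketch below also shows one common place where raw strings are handy, regular expressions; the pattern is just an example I picked.

```python
import re

normal = 'Linha 1\nLinha 2'   # \n is a single newline character
raw = r'Linha 1\nLinha 2'     # \n stays as two characters: backslash and 'n'

print(len(normal), len(raw))  # the raw string is one character longer
print('\\' in raw)            # the backslash really is part of the raw string

# In a regex, r'\d+' reaches the re module with its backslash intact.
numbers = re.findall(r'\d+', normal)
print(numbers)
```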
