According to the official documentation:
class int([x])
[...] If x is not a number or if base is given, then x must be a string, bytes, or bytearray instance representing an integer literal in radix base. Optionally, the literal can be preceded by + or - (with no space in between) and surrounded by whitespace.
As the excerpt highlights, the value of the parameter can be surrounded by whitespace. In practical terms, the string is trimmed before being converted to an integer, so whitespace at the beginning and at the end is ignored.
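As a quick check of the quoted passage, the sketch below tries a few literals. The helper name try_int is made up for this illustration; it only wraps the built-in int.

```python
def try_int(text):
    """Return int(text), or a short message if the conversion fails."""
    try:
        return int(text)
    except ValueError as exc:
        return f"ValueError: {exc}"

# Whitespace around the literal is ignored, sign included.
print(try_int("  42  "))
print(try_int(" -7\n"))

# Whitespace between the sign and the digits is not allowed.
print(try_int("- 7"))
```

The first two calls return integers, while the third returns the error message, which matches the "with no space in between" remark in the documentation.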
TL;DR
The information below is based on the official Python implementation, known as CPython.
To confirm this information, you can analyze the Python implementation in C:
/* Parses an int from a bytestring. Leading and trailing whitespace will be
* ignored.
*
* If successful, a PyLong object will be returned and 'pend' will be pointing
* to the first unused byte unless it's NULL.
*
* If unsuccessful, NULL will be returned.
*/
PyObject *
PyLong_FromString(const char *str, char **pend, int base);
The value you pass to int as a parameter becomes the pointer str. Looking at the body of the function, you will see that early on (line 2226) there is:
while (*str != '\0' && Py_ISSPACE(Py_CHARMASK(*str))) {
str++;
}
That is, it walks the string and, while the current character is whitespace, increments the pointer, so that character is ignored in the later steps. This applies to any character for which Py_ISSPACE returns true.
#define Py_ISSPACE(c) (_Py_ctype_table[Py_CHARMASK(c)] & PY_CTF_SPACE)
// pyctype.c
PY_CTF_SPACE, /* 0x9 '\t' */
PY_CTF_SPACE, /* 0xa '\n' */
PY_CTF_SPACE, /* 0xb '\v' */
PY_CTF_SPACE, /* 0xc '\f' */
PY_CTF_SPACE, /* 0xd '\r' */
That is, the characters \t, \n, \v, \f and \r, as well as the ordinary space, will be disregarded in the string.
>>> int('\t1')
1
>>> int('\n2')
2
>>> int('\v3')
3
>>> int('\f4')
4
>>> int('\r5')
5
Continuing the analysis of the function body, we see this excerpt (line 2399):
scan = str;
/* ... */
while (_PyLong_DigitValue[Py_CHARMASK(*scan)] < base || *scan == '_') {
/* ... */
}
It assigns the input pointer str to scan and traverses it as long as the character is a valid digit, that is, has a value lower than the given base, or is the character _. Any character that does not meet these conditions ends the scan and, unless only trailing whitespace follows, leads to goto onError, ending the function with an error. Therefore, within the number, the only non-digit character allowed is _; any other character, including whitespace, results in an error.
>>> int('1_000')
1000
>>> int('1\n000')
...
ValueError: invalid literal for int() with base 10: '1\n000'
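The underscore is accepted only as a single separator between digits, which a short experiment can confirm. The helper is_valid_int_literal below is a hypothetical name used just for this sketch; it relies on nothing beyond the built-in int.

```python
def is_valid_int_literal(text, base=10):
    """Report whether int() accepts the given text in the given base."""
    try:
        int(text, base)
    except ValueError:
        return False
    return True

# One underscore between digits is fine; doubled, leading or trailing
# underscores are rejected, and so is whitespace inside the number.
for literal in ["1_000", "1__000", "_1000", "1000_", "1 000"]:
    print(repr(literal), is_valid_int_literal(literal))
```

Only the first literal is reported as valid.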
Finally, continuing the analysis of the function body, we see once more (line 2535):
while (*str && Py_ISSPACE(Py_CHARMASK(*str))) {
str++;
}
if (*str != '\0') {
goto onError;
}
Similar to the earlier loop, which skips whitespace at the beginning of the string, this one advances the pointer past the whitespace at the end. The check for the terminating \0 ensures that the string ends with whitespace only and not with other characters.
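To see the leading and trailing handling side by side, the sketch below tests each of the six ASCII whitespace characters before, after and inside a number. The names ASCII_WHITESPACE and accepts are assumptions made for this example.

```python
ASCII_WHITESPACE = " \t\n\v\f\r"

def accepts(text):
    """Report whether int() converts the text without raising ValueError."""
    try:
        int(text)
        return True
    except ValueError:
        return False

# For each character: (leading, trailing, in the middle of the digits).
results = {
    repr(ch): (accepts(ch + "8"), accepts("8" + ch), accepts("8" + ch + "8"))
    for ch in ASCII_WHITESPACE
}

for ch, outcome in results.items():
    print(ch, outcome)  # every tuple is (True, True, False)
```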
In short:
- Any whitespace at the start will be ignored (' ', '\t', '\n', '\v', '\f', '\r');
- Within the string, any character other than a digit or _ causes an error;
- Any whitespace at the end will be ignored;
- Any character that is not a digit or _ will cause an error, except in the cases above (leading and trailing whitespace).
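The three phases can be modelled in pure Python. This is a simplified sketch, not the CPython code: the name parse_int_sketch is hypothetical, it assumes a base between 2 and 36, it handles only ASCII whitespace and ASCII digits, and it ignores prefixes such as 0x.

```python
import string

ASCII_SPACE = " \t\n\v\f\r"  # characters flagged PY_CTF_SPACE in the excerpt

def parse_int_sketch(text, base=10):
    """Simplified model of the parsing phases; not the real implementation."""
    digits = (string.digits + string.ascii_lowercase)[:base]
    i, n = 0, len(text)

    # Phase 1: skip leading whitespace.
    while i < n and text[i] in ASCII_SPACE:
        i += 1

    # Optional sign, which must sit directly before the digits.
    sign = 1
    if i < n and text[i] in "+-":
        sign = -1 if text[i] == "-" else 1
        i += 1

    # Phase 2: consume digits valid for the base, plus '_'.
    start = i
    while i < n and (text[i].lower() in digits or text[i] == "_"):
        i += 1
    body = text[start:i]

    # Phase 3: skip trailing whitespace; anything left over is an error.
    while i < n and text[i] in ASCII_SPACE:
        i += 1

    # Underscores are only valid as single separators between digits.
    bad_underscores = body.startswith("_") or body.endswith("_") or "__" in body
    if i != n or not body or bad_underscores:
        raise ValueError(f"invalid literal: {text!r}")

    value = 0
    for ch in body.replace("_", ""):
        value = value * base + digits.index(ch.lower())
    return sign * value

print(parse_int_sketch("\t-1_000\n"))
```

For the ASCII inputs discussed here it agrees with int, for example on '\t-1_000\n', and it raises ValueError for '1\n000'.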
When you have doubts like this, I suggest using Python's interactive mode, experimenting freely and looking at the results. In fact, I always suggest keeping an interactive prompt open and testing almost everything there - IDE autocomplete reduces the need to exercise some things "for real" and "live" in the code - but it is not even remotely as didactic.
– jsbueno