module documentation

FIXME: https://github.com/twisted/twisted/issues/3843 This can be removed once t.persisted.aot is removed. New code should not make use of this.

Tokenization help for Python programs.

vendored from https://github.com/python/cpython/blob/6b825c1b8a14460641ca6f1647d83005c68199aa/Lib/tokenize.py

Licence: https://docs.python.org/3/license.html

tokenize(readline) is a generator that breaks a stream of bytes into Python tokens. It decodes the bytes according to PEP-0263 for determining source file encoding. It accepts a readline-like method which is called repeatedly to get the next line of input (or b"" for EOF). It generates 5-tuples with these members:

    the token type (see token.py)
    the token (a string)
    the starting (row, column) indices of the token (a 2-tuple of ints)
    the ending (row, column) indices of the token (a 2-tuple of ints)
    the original line (string)

It is designed to match the working of the Python tokenizer exactly, except that it produces COMMENT tokens for comments and gives type OP for all operators. Additionally, all token lists start with an ENCODING token which tells you which encoding was used to decode the bytes stream.
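Because this module is vendored from the standard library's tokenize, the readline protocol described above can be sketched against the stdlib copy (which new code should prefer anyway). A minimal sketch:

    # Tokenize a small bytes stream and print each 5-tuple's fields.
    import io
    import tokenize

    source = b"x = 1  # a comment\n"
    readline = io.BytesIO(source).readline

    for tok in tokenize.tokenize(readline):
        # tok is a TokenInfo 5-tuple: type, string, start, end, line.
        print(tokenize.tok_name[tok.type], repr(tok.string), tok.start, tok.end)

The first tuple produced is the ENCODING token ('utf-8' here), followed by NAME, OP, NUMBER, COMMENT, NEWLINE, and ENDMARKER.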

Class TokenInfo Undocumented
Class Untokenizer Undocumented
Exception StopTokenizing Undocumented
Exception TokenError Undocumented
Function any Undocumented
Function detect_encoding The detect_encoding() function is used to detect the encoding that should be used to decode a Python source file. It requires one argument, readline, in the same way as the tokenize() generator.
Function generate_tokens Tokenize a source reading Python code as unicode strings.
Function group Undocumented
Function main Undocumented
Function maybe Undocumented
Function open Open a file in read-only mode using the encoding detected by detect_encoding().
Function tokenize The tokenize() generator requires one argument, readline, which must be a callable object which provides the same interface as the readline() method of built-in file objects. Each call to the function should return one line of input as bytes...
Function untokenize Transform tokens back into Python source code. It returns a bytes object, encoded using the ENCODING token, which is the first token sequence output by tokenize.
Constant AMPER Undocumented
Constant AMPEREQUAL Undocumented
Constant ASYNC Undocumented
Constant AT Undocumented
Constant ATEQUAL Undocumented
Constant AWAIT Undocumented
Constant CIRCUMFLEX Undocumented
Constant CIRCUMFLEXEQUAL Undocumented
Constant COLON Undocumented
Constant COLONEQUAL Undocumented
Constant COMMA Undocumented
Constant COMMENT Undocumented
Constant DEDENT Undocumented
Constant DOT Undocumented
Constant DOUBLESLASH Undocumented
Constant DOUBLESLASHEQUAL Undocumented
Constant DOUBLESTAR Undocumented
Constant DOUBLESTAREQUAL Undocumented
Constant ELLIPSIS Undocumented
Constant ENCODING Undocumented
Constant ENDMARKER Undocumented
Constant EQEQUAL Undocumented
Constant EQUAL Undocumented
Constant ERRORTOKEN Undocumented
Constant GREATER Undocumented
Constant GREATEREQUAL Undocumented
Constant INDENT Undocumented
Constant LBRACE Undocumented
Constant LEFTSHIFT Undocumented
Constant LEFTSHIFTEQUAL Undocumented
Constant LESS Undocumented
Constant LESSEQUAL Undocumented
Constant LPAR Undocumented
Constant LSQB Undocumented
Constant MINEQUAL Undocumented
Constant MINUS Undocumented
Constant N_TOKENS Undocumented
Constant NAME Undocumented
Constant NEWLINE Undocumented
Constant NL Undocumented
Constant NOTEQUAL Undocumented
Constant NT_OFFSET Undocumented
Constant NUMBER Undocumented
Constant OP Undocumented
Constant PERCENT Undocumented
Constant PERCENTEQUAL Undocumented
Constant PLUS Undocumented
Constant PLUSEQUAL Undocumented
Constant RARROW Undocumented
Constant RBRACE Undocumented
Constant RIGHTSHIFT Undocumented
Constant RIGHTSHIFTEQUAL Undocumented
Constant RPAR Undocumented
Constant RSQB Undocumented
Constant SEMI Undocumented
Constant SLASH Undocumented
Constant SLASHEQUAL Undocumented
Constant SOFT_KEYWORD Undocumented
Constant STAR Undocumented
Constant STAREQUAL Undocumented
Constant STRING Undocumented
Constant TILDE Undocumented
Constant TYPE_COMMENT Undocumented
Constant TYPE_IGNORE Undocumented
Constant VBAR Undocumented
Constant VBAREQUAL Undocumented
Variable __author__ Undocumented
Variable __credits__ Undocumented
Variable Binnumber Undocumented
Variable blank_re Undocumented
Variable Comment Undocumented
Variable ContStr Undocumented
Variable cookie_re Undocumented
Variable Decnumber Undocumented
Variable Double Undocumented
Variable Double3 Undocumented
Variable endpats Undocumented
Variable Expfloat Undocumented
Variable Exponent Undocumented
Variable Floatnumber Undocumented
Variable Funny Undocumented
Variable Hexnumber Undocumented
Variable Ignore Undocumented
Variable Imagnumber Undocumented
Variable Intnumber Undocumented
Variable Name Undocumented
Variable Number Undocumented
Variable Octnumber Undocumented
Variable PlainToken Undocumented
Variable Pointfloat Undocumented
Variable PseudoExtras Undocumented
Variable PseudoToken Undocumented
Variable Single Undocumented
Variable Single3 Undocumented
Variable single_quoted Undocumented
Variable Special Undocumented
Variable String Undocumented
Variable StringPrefix Undocumented
Variable tabsize Undocumented
Variable Token Undocumented
Variable Triple Undocumented
Variable triple_quoted Undocumented
Variable Whitespace Undocumented
Function _all_string_prefixes Undocumented
Function _compile Undocumented
Function _get_normal_name Imitates get_normal_name in tokenizer.c.
Function _tokenize Undocumented
def any(*choices): (source)

Undocumented

def detect_encoding(readline): (source)

The detect_encoding() function is used to detect the encoding that should be used to decode a Python source file. It requires one argument, readline, in the same way as the tokenize() generator.

It will call readline a maximum of twice, and return the encoding used (as a string) and a list of any lines (left as bytes) it has read in.

It detects the encoding from the presence of a UTF-8 BOM or an encoding cookie as specified in PEP 263. If both a BOM and a cookie are present but disagree, a SyntaxError will be raised. If the encoding cookie is an invalid charset, a SyntaxError is likewise raised. Note that if a UTF-8 BOM is found, 'utf-8-sig' is returned.

If no encoding is specified, then the default of 'utf-8' will be returned.
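A minimal sketch of a call, using the interchangeable stdlib copy; the latin-1 cookie below is normalized to its canonical name:

    import io
    from tokenize import detect_encoding

    src = b"# -*- coding: latin-1 -*-\nx = 1\n"
    encoding, lines = detect_encoding(io.BytesIO(src).readline)
    print(encoding)  # 'iso-8859-1' (the normalized name for latin-1)
    print(lines)     # the raw lines (at most two) consumed while sniffing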

def generate_tokens(readline): (source)

Tokenize a source reading Python code as unicode strings.

This has the same API as tokenize(), except that it expects the *readline* callable to return str objects instead of bytes.
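A short sketch, again via the stdlib copy: readline yields str, so no byte decoding happens and no ENCODING token is emitted:

    import io
    from tokenize import generate_tokens

    readline = io.StringIO("x = 1\n").readline
    for tok in generate_tokens(readline):
        print(tok.type, repr(tok.string))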

def group(*choices): (source)

Undocumented

def main(): (source)

Undocumented

def maybe(*choices): (source)

Undocumented

def open(filename): (source)

Open a file in read-only mode using the encoding detected by detect_encoding().
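A sketch of typical use via the stdlib copy, assuming "example.py" is some Python source file on disk; unlike builtins.open(), the encoding comes from the file's BOM or coding cookie rather than the locale:

    import tokenize

    with tokenize.open("example.py") as f:  # "example.py" is hypothetical
        print(f.encoding)  # whatever detect_encoding() reported
        text = f.read()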

def tokenize(readline): (source)

The tokenize() generator requires one argument, readline, which must be a callable object which provides the same interface as the readline() method of built-in file objects. Each call to the function should return one line of input as bytes. Alternatively, readline can be a callable function terminating with StopIteration:

    readline = open(myfile, 'rb').__next__  # Example of alternate readline

The generator produces 5-tuples with these members: the token type; the token string; a 2-tuple (srow, scol) of ints specifying the row and column where the token begins in the source; a 2-tuple (erow, ecol) of ints specifying the row and column where the token ends in the source; and the line on which the token was found. The line passed is the physical line.

The first token sequence will always be an ENCODING token which tells you which encoding was used to decode the bytes stream.
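A sketch of the alternate readline mentioned above, using the stdlib copy: any bytes-yielding callable that eventually raises StopIteration (here a binary file's __next__) is accepted:

    import tokenize

    # "example.py" is a hypothetical input file.
    with open("example.py", "rb") as f:
        tokens = list(tokenize.tokenize(f.__next__))
    print(tokens[0])  # the ENCODING token, e.g. string='utf-8'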

def untokenize(iterable): (source)

Transform tokens back into Python source code. It returns a bytes object, encoded using the ENCODING token, which is the first token sequence output by tokenize.

Each element returned by the iterable must be a token sequence with at least two elements, a token number and token value. If only two tokens are passed, the resulting output is poor.

Round-trip invariant for full input:
    Untokenized source will match input source exactly

Round-trip invariant for limited input:
    # Output bytes will tokenize back to the input
    t1 = [tok[:2] for tok in tokenize(f.readline)]
    newcode = untokenize(t1)
    readline = BytesIO(newcode).readline
    t2 = [tok[:2] for tok in tokenize(readline)]
    assert t1 == t2
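The limited-input invariant can be exercised end to end; a self-contained sketch using the stdlib copy of this module, with io.BytesIO standing in for the file f above:

    import io
    import tokenize

    source = b"def f(a, b):\n    return a + b\n"
    t1 = [tok[:2] for tok in tokenize.tokenize(io.BytesIO(source).readline)]

    newcode = tokenize.untokenize(t1)  # bytes, encoded per the ENCODING token
    t2 = [tok[:2] for tok in tokenize.tokenize(io.BytesIO(newcode).readline)]
    assert t1 == t2  # (type, string) pairs survive the round trip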

AMPER: int = (source)

Undocumented

Value
19
AMPEREQUAL: int = (source)

Undocumented

Value
41

ASYNC: int = (source)

Undocumented

Value
56

AT: int = (source)

Undocumented

Value
49

ATEQUAL: int = (source)

Undocumented

Value
50

AWAIT: int = (source)

Undocumented

Value
55
CIRCUMFLEX: int = (source)

Undocumented

Value
32
CIRCUMFLEXEQUAL: int = (source)

Undocumented

Value
43

COLON: int = (source)

Undocumented

Value
11
COLONEQUAL: int = (source)

Undocumented

Value
53

COMMA: int = (source)

Undocumented

Value
12

COMMENT: int = (source)

Undocumented

Value
61

DEDENT: int = (source)

Undocumented

Value
6

DOT: int = (source)

Undocumented

Value
23
DOUBLESLASH: int = (source)

Undocumented

Value
47
DOUBLESLASHEQUAL: int = (source)

Undocumented

Value
48
DOUBLESTAR: int = (source)

Undocumented

Value
35
DOUBLESTAREQUAL: int = (source)

Undocumented

Value
46
ELLIPSIS: int = (source)

Undocumented

Value
52
ENCODING: int = (source)

Undocumented

Value
63
ENDMARKER: int = (source)

Undocumented

Value
0

EQEQUAL: int = (source)

Undocumented

Value
27

EQUAL: int = (source)

Undocumented

Value
22
ERRORTOKEN: int = (source)

Undocumented

Value
60

GREATER: int = (source)

Undocumented

Value
21
GREATEREQUAL: int = (source)

Undocumented

Value
30

INDENT: int = (source)

Undocumented

Value
5

LBRACE: int = (source)

Undocumented

Value
25
LEFTSHIFT: int = (source)

Undocumented

Value
33
LEFTSHIFTEQUAL: int = (source)

Undocumented

Value
44

LESS: int = (source)

Undocumented

Value
20
LESSEQUAL: int = (source)

Undocumented

Value
29

LPAR: int = (source)

Undocumented

Value
7

LSQB: int = (source)

Undocumented

Value
9
MINEQUAL: int = (source)

Undocumented

Value
37

MINUS: int = (source)

Undocumented

Value
15
N_TOKENS: int = (source)

Undocumented

Value
64

NAME: int = (source)

Undocumented

Value
1

NEWLINE: int = (source)

Undocumented

Value
4

NL: int = (source)

Undocumented

Value
62
NOTEQUAL: int = (source)

Undocumented

Value
28
NT_OFFSET: int = (source)

Undocumented

Value
256

NUMBER: int = (source)

Undocumented

Value
2

OP: int = (source)

Undocumented

Value
54

PERCENT: int = (source)

Undocumented

Value
24
PERCENTEQUAL: int = (source)

Undocumented

Value
40

PLUS: int = (source)

Undocumented

Value
14
PLUSEQUAL: int = (source)

Undocumented

Value
36

RARROW: int = (source)

Undocumented

Value
51

RBRACE: int = (source)

Undocumented

Value
26
RIGHTSHIFT: int = (source)

Undocumented

Value
34
RIGHTSHIFTEQUAL: int = (source)

Undocumented

Value
45

RPAR: int = (source)

Undocumented

Value
8

RSQB: int = (source)

Undocumented

Value
10

SEMI: int = (source)

Undocumented

Value
13

SLASH: int = (source)

Undocumented

Value
17
SLASHEQUAL: int = (source)

Undocumented

Value
39
SOFT_KEYWORD: int = (source)

Undocumented

Value
59

STAR: int = (source)

Undocumented

Value
16
STAREQUAL: int = (source)

Undocumented

Value
38

STRING: int = (source)

Undocumented

Value
3

TILDE: int = (source)

Undocumented

Value
31
TYPE_COMMENT: int = (source)

Undocumented

Value
58
TYPE_IGNORE: int = (source)

Undocumented

Value
57

VBAR: int = (source)

Undocumented

Value
18
VBAREQUAL: int = (source)

Undocumented

Value
42
__author__: str = (source)

Undocumented

__credits__: str = (source)

Undocumented

Binnumber: str = (source)

Undocumented

blank_re = (source)

Undocumented

Comment: str = (source)

Undocumented

ContStr = (source)

Undocumented

cookie_re = (source)

Undocumented

Decnumber: str = (source)

Undocumented

Double: str = (source)

Undocumented

Double3: str = (source)

Undocumented

endpats: dict = (source)

Undocumented

Expfloat = (source)

Undocumented

Exponent: str = (source)

Undocumented

Floatnumber = (source)

Undocumented

Funny = (source)

Undocumented

Hexnumber: str = (source)

Undocumented

Ignore = (source)

Undocumented

Imagnumber = (source)

Undocumented

Intnumber = (source)

Undocumented

Name: str = (source)

Undocumented

Number = (source)

Undocumented

Octnumber: str = (source)

Undocumented

PlainToken = (source)

Undocumented

Pointfloat = (source)

Undocumented

PseudoExtras = (source)

Undocumented

PseudoToken = (source)

Undocumented

Single: str = (source)

Undocumented

Single3: str = (source)

Undocumented

single_quoted: set = (source)

Undocumented

Special = (source)

Undocumented

String = (source)

Undocumented

StringPrefix = (source)

Undocumented

tabsize: int = (source)

Undocumented

Token = (source)

Undocumented

Triple = (source)

Undocumented

triple_quoted: set = (source)

Undocumented

Whitespace: str = (source)

Undocumented

def _all_string_prefixes(): (source)

Undocumented

def _compile(expr): (source)

Undocumented

def _get_normal_name(orig_enc): (source)

Imitates get_normal_name in tokenizer.c.
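Assumed behavior, based on the CPython source this module was vendored from: the name is truncated to 12 characters, lowercased, "_" becomes "-", and common aliases collapse to canonical names. A small sketch using the private stdlib counterpart:

    from tokenize import _get_normal_name  # private stdlib counterpart

    print(_get_normal_name("UTF-8"))    # 'utf-8'
    print(_get_normal_name("Latin-1"))  # 'iso-8859-1'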

def _tokenize(readline, encoding): (source)

Undocumented