nltk.toolbox.ToolboxData
- class nltk.toolbox.ToolboxData
Bases: StandardFormat
- fields(strip=True, unwrap=True, encoding=None, errors='strict', unicode_fields=None)
Return an iterator that yields each field as a (marker, value) tuple, where marker and value are unicode strings if an encoding was specified in the fields() method. Otherwise they are non-unicode strings.
- Parameters
strip (bool) – Strip trailing whitespace from the last line of each field.
unwrap (bool) – Convert newlines in a field to spaces.
encoding (str or None) – Name of an encoding to use. If it is specified, the fields() method returns unicode strings rather than non-unicode strings.
errors (str) – Error handling scheme for the codec. Same as for the decode() builtin string method.
unicode_fields (sequence) – Set of marker names whose values are UTF-8 encoded. Ignored if encoding is None. If the whole file is UTF-8 encoded, set encoding='utf8' and leave unicode_fields with its default value of None.
- Return type
iter(tuple(str, str))
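The effect of the strip and unwrap options on a single field value can be illustrated with a small standalone sketch. The helper normalize_value below is hypothetical and is not part of NLTK; it only mirrors the parameter descriptions above. Collapsing a run of consecutive newlines into one space is an assumption of this sketch.

```python
import re


def normalize_value(value, strip=True, unwrap=True):
    """Hypothetical helper mirroring the documented strip/unwrap options."""
    if unwrap:
        # Replace each run of newlines with a single space (assumption).
        value = re.sub(r"\n+", " ", value)
    if strip:
        # Remove whitespace left at the end of the field.
        value = value.rstrip()
    return value


# A field value that was wrapped over two lines in the source file.
raw = "nek i\npas  \n"

both = normalize_value(raw)
wrapped = normalize_value(raw, unwrap=False)
unstripped = normalize_value(raw, strip=False)
print(repr(both), repr(wrapped), repr(unstripped))
```

With both options enabled the wrapped value comes back as one line without trailing whitespace; disabling either option leaves the corresponding characters in place.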
- open(sfm_file)
Open a standard format marker file for sequential reading.
- Parameters
sfm_file (str) – name of the standard format marker input file
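A minimal usage sketch, assuming NLTK is installed: it writes a small standard format marker file to a temporary path, opens it with open(), and collects the (marker, value) tuples from fields(). The marker names and sample values are invented for illustration, and close() comes from the StandardFormat base class rather than from the entries documented here.

```python
import os
import tempfile

from nltk.toolbox import ToolboxData

# Invented sample data: four fields, the last one wrapped over two lines.
sfm_text = "\\lx kaa\n\\ps N\n\\ge gag\n\\gp nek i\npas\n"

# Write the sample to a temporary file so open() has a real path to read.
with tempfile.NamedTemporaryFile(
    "w", suffix=".sfm", delete=False, encoding="utf-8"
) as handle:
    handle.write(sfm_text)
    sfm_path = handle.name

try:
    reader = ToolboxData()
    reader.open(sfm_path)
    # Defaults: strip=True, unwrap=True.
    records = list(reader.fields())
    reader.close()
finally:
    os.remove(sfm_path)

for marker, value in records:
    print(marker, value)
```

Because unwrap defaults to True, the wrapped field is expected to come back as a single line.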