Basic
Implements a few basic transformers.
class convokit.basic.tokenizer.Tokenizer(verbosity: int = 0)
Tokenizes utterances and stores the tokens as a space-separated string.
- Parameters
verbosity – how often status messages are printed while tokenizing (default 0).
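The storage format described above (tokens joined into one space-separated string) can be illustrated with a minimal standalone sketch. This is not ConvoKit's implementation: the regex is a hypothetical stand-in for the real tokenizer, the names `tokenize_to_string` and `tokenize_all` are invented for illustration, and reading `verbosity` as "print a status line every N utterances, 0 meaning silent" is an assumption.

```python
import re

# Hypothetical stand-in for the library's tokenizer:
# a token is a run of word characters or a single punctuation mark.
_TOKEN_RE = re.compile(r"\w+|[^\w\s]")


def tokenize_to_string(text: str) -> str:
    """Return the tokens of `text` joined by single spaces."""
    return " ".join(_TOKEN_RE.findall(text))


def tokenize_all(texts, verbosity: int = 0):
    """Tokenize each text in order.

    Assumption: a positive `verbosity` prints a status line every
    `verbosity` items; 0 prints nothing.
    """
    results = []
    for i, text in enumerate(texts, start=1):
        if verbosity > 0 and i % verbosity == 0:
            print(f"tokenized {i}/{len(texts)} utterances")
        results.append(tokenize_to_string(text))
    return results
```

Storing tokens as a single string keeps the per-utterance metadata simple; a consumer recovers the token list with `str.split()`.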
class convokit.basic.wordcount.WordCount(use_tokenized=True)
Computes the word count of each utterance.
- Parameters
use_tokenized – whether to use the NLTK-tokenized output (requires Tokenizer to have been run first).
transform(corpus: convokit.model.corpus.Corpus)
Computes the word count of each utterance.
- Parameters
corpus (Corpus) – the Corpus to compute word counts for.
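The effect of `use_tokenized` can be sketched without the library. The function below is a hypothetical illustration, not ConvoKit's code: the name `word_count` and its `tokens` argument are invented here, and counting by plain whitespace split when `use_tokenized` is false is an assumption about the fallback behaviour.

```python
from typing import Optional


def word_count(text: str, tokens: Optional[str] = None,
               use_tokenized: bool = True) -> int:
    """Count the words in one utterance.

    `tokens` is the space-separated token string a tokenizer would
    have stored earlier. Assumption: without tokenized output, the
    raw text is split on whitespace instead.
    """
    if use_tokenized:
        if tokens is None:
            # Mirrors the documented requirement that the tokenizer runs first.
            raise ValueError("use_tokenized=True requires tokenizer output")
        return len(tokens.split())
    return len(text.split())
```

Under these assumptions the two modes can disagree: punctuation becomes separate tokens in the tokenized string, so `"Hello, world!"` counts as 4 with tokens `"Hello , world !"` but as 2 on the raw text.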