Transformer

class convokit.transformer.Transformer

Abstract base class for modules that take a Corpus and modify it and/or extend it with additional information, modeled on the scikit-learn Transformer API. It exposes fit() and transform() methods: fit() performs any necessary precomputation (“training” in machine-learning parlance), while transform() computes the modification and applies it to the Corpus.

All subclasses must implement transform(); subclasses that require precomputation should also override fit(), which by default does nothing. The interface also exposes a fit_transform() method that performs both steps on the same Corpus in a single call. By default it simply calls fit() followed by transform(), but designers of Transformer subclasses may override it when the combined operation can be implemented more efficiently than running the two steps separately.
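The contract described above can be sketched without ConvoKit installed. The following is a minimal stand-in, not ConvoKit's actual source: ToyCorpus, ToyTransformer, and WordCounter are hypothetical names, and the corpus is reduced to a list of utterance dicts purely for illustration.

```python
from abc import ABC, abstractmethod


class ToyCorpus:
    """Hypothetical stand-in for a Corpus: just a list of utterance dicts."""

    def __init__(self, texts):
        self.utterances = [{"text": t, "meta": {}} for t in texts]


class ToyTransformer(ABC):
    """Hypothetical base class mirroring the fit/transform/fit_transform contract."""

    def fit(self, corpus, y=None, **kwargs):
        # Default: no precomputation is needed, so just return the instance.
        return self

    @abstractmethod
    def transform(self, corpus, **kwargs):
        """Every subclass must provide this."""

    def fit_transform(self, corpus, y=None, **kwargs):
        # Default: fit, then transform, on the same corpus.
        self.fit(corpus, y=y, **kwargs)
        return self.transform(corpus, **kwargs)


class WordCounter(ToyTransformer):
    """Needs no precomputation, so only transform() is implemented."""

    def transform(self, corpus, **kwargs):
        for utt in corpus.utterances:
            utt["meta"]["n_words"] = len(utt["text"].split())
        return corpus


corpus = ToyCorpus(["hello there", "how are you today"])
result = WordCounter().fit_transform(corpus)
word_counts = [u["meta"]["n_words"] for u in result.utterances]
print(word_counts)
```

Because transform() is declared abstract in this sketch, trying to instantiate ToyTransformer directly raises a TypeError, which is one way to enforce the "all subclasses must implement transform()" rule.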

fit(corpus: convokit.model.corpus.Corpus, y=None, **kwargs)

Use the provided Corpus to perform any precomputations necessary to later perform the actual transformation step.

Parameters

corpus – the Corpus to use for fitting

Returns

the fitted Transformer
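Since fit() returns the fitted Transformer, calls can be chained, and a Transformer fitted on one corpus can be applied to another. A rough sketch of that pattern follows, assuming a hypothetical LengthFlagger class and using plain lists of dicts in place of real Corpus objects.

```python
class LengthFlagger:
    """Hypothetical transformer-style class; a list of dicts stands in for a Corpus."""

    def __init__(self):
        self.mean_len = None

    def fit(self, corpus, y=None, **kwargs):
        # Precompute the mean utterance length (in words) on the fitting data.
        lengths = [len(u["text"].split()) for u in corpus]
        self.mean_len = sum(lengths) / len(lengths)
        return self  # returning the instance is what enables chaining

    def transform(self, corpus, **kwargs):
        # Flag each utterance that is longer than the precomputed mean.
        for u in corpus:
            u["is_long"] = len(u["text"].split()) > self.mean_len
        return corpus


train = [{"text": "hi"}, {"text": "a b c"}]  # lengths 1 and 3, so the mean is 2.0
test = [{"text": "one two three four"}, {"text": "ok"}]

flagger = LengthFlagger()
fitted = flagger.fit(train)
flags = [u["is_long"] for u in fitted.transform(test)]
print(flags)
```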

fit_transform(corpus: convokit.model.corpus.Corpus, y=None, **kwargs) → convokit.model.corpus.Corpus

Fit and run the Transformer on a single Corpus.

Parameters

corpus – the Corpus to use

Returns

same as transform
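As an illustration of when overriding fit_transform() might pay off, the sketch below uses a hypothetical NormalizedLength class (again with lists of dicts standing in for Corpus objects). Its fit() and transform() each split every utterance into words; the overridden fit_transform() splits each utterance once and reuses the counts, while producing the same result as the two separate steps.

```python
class NormalizedLength:
    """Hypothetical example: annotate each utterance with length / max length."""

    def __init__(self):
        self.max_len = None

    def fit(self, corpus, y=None, **kwargs):
        self.max_len = max(len(u["text"].split()) for u in corpus)
        return self

    def transform(self, corpus, **kwargs):
        for u in corpus:
            u["rel_len"] = len(u["text"].split()) / self.max_len
        return corpus

    def fit_transform(self, corpus, y=None, **kwargs):
        # Combined path: count words once per utterance and reuse the counts
        # for both the precomputation and the annotation.
        lengths = [len(u["text"].split()) for u in corpus]
        self.max_len = max(lengths)
        for u, n in zip(corpus, lengths):
            u["rel_len"] = n / self.max_len
        return corpus


a = [{"text": "a b"}, {"text": "a b c d"}]
b = [{"text": "a b"}, {"text": "a b c d"}]

combined = NormalizedLength().fit_transform(a)
two_step = NormalizedLength()
separate = two_step.fit(b).transform(b)
print(combined == separate)
```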

abstract transform(corpus: convokit.model.corpus.Corpus, **kwargs) → convokit.model.corpus.Corpus

Modify the provided Corpus. This is an abstract method that must be implemented by every Transformer subclass.

Parameters

corpus – the Corpus to transform

Returns

the modified input Corpus. Note that, unlike the scikit-learn equivalent, transform() operates in place on the Corpus (though for convenience and compatibility with scikit-learn, it also returns the modified Corpus).
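The in-place behaviour has a practical consequence: the returned object and the input are the same object, so the pre-transform state is gone unless it was saved beforehand. The sketch below shows this with a hypothetical add_upper function standing in for transform() and a list of dicts standing in for a Corpus. Using copy.deepcopy here is an assumption that suits this toy data; whether it is an appropriate way to snapshot a real Corpus is not something the source states.

```python
import copy


def add_upper(corpus):
    """Stand-in for transform(): annotate in place, then return the same object."""
    for u in corpus:
        u["upper"] = u["text"].upper()
    return corpus


corpus = [{"text": "hello"}]
snapshot = copy.deepcopy(corpus)  # taken before the transformation

returned = add_upper(corpus)

same_object = returned is corpus           # no copy was made
original_mutated = "upper" in corpus[0]    # the input itself now carries the annotation
snapshot_untouched = "upper" not in snapshot[0]
print(same_object, original_mutated, snapshot_untouched)
```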