Forecaster

Conversational Forecasting Transformers.

Refer to CRAFT Model and Cumulative Bag-of-Words for the models that Forecaster can be loaded with.

Example usage: CRAFT forecasting of conversational derailment.

Example usage: Forecasting of conversational derailment using a cumulative bag-of-words model.

class convokit.forecaster.forecaster.Forecaster(forecaster_model: convokit.forecaster.forecasterModel.ForecasterModel = None, forecast_mode: str = 'future', convo_structure: str = 'branched', text_func=<function Forecaster.<lambda>>, label_func: Callable[[convokit.model.utterance.Utterance], bool] = <function Forecaster.<lambda>>, use_last_only: bool = False, skip_broken_convos: bool = True, forecast_attribute_name: str = 'forecast', forecast_prob_attribute_name: str = 'forecast_prob')

Implements basic Forecaster behavior.

Parameters:
  • forecaster_model – ForecasterModel to use, e.g. cumulativeBoW or CRAFT
  • forecast_mode – ‘future’ or ‘past’. ‘future’ (the default behavior) annotates each utterance with a forecast score using all context up to and including that utterance (i.e., a prediction of the future state of the conversation after this utterance). ‘past’ annotates each utterance with a forecast score using all context prior to that utterance (i.e., what the model believed this utterance would look like prior to actually seeing it)
  • convo_structure – whether conversations in the expected corpus are ‘branched’ or ‘linear’, default: ‘branched’
  • text_func – optional function for extracting the text of the utterance, default: uses utterance’s text attribute
  • label_func – callable function for getting the utterance’s forecast label (True or False); only used in training
  • use_last_only – if forecast_mode is ‘past’ and use_last_only is True, for each dialog, use only the context-reply pair where the reply is the last utterance in the dialog
  • skip_broken_convos – if True and convo_structure is ‘branched’, exclude all conversations that have broken reply-to structures, default: True
  • forecast_attribute_name – metadata feature name to use in annotation for forecast result, default: “forecast”
  • forecast_prob_attribute_name – metadata feature name to use in annotation for forecast result probability, default: “forecast_prob”
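
The difference between the two forecast_mode settings, and the effect of use_last_only, can be pictured with a small standalone sketch. It does not use ConvoKit: context_for and past_mode_pairs are hypothetical helper names, the conversation is assumed to be linear, and starting the context-reply pairs at the second utterance is an assumption made for illustration.

```python
from typing import List, Tuple


def context_for(utterances: List[str], index: int, forecast_mode: str = "future") -> List[str]:
    """Return the context a forecast for utterances[index] would be based on."""
    if forecast_mode == "future":
        # 'future': all context up to and including the utterance itself.
        return utterances[: index + 1]
    if forecast_mode == "past":
        # 'past': only the context that came before the utterance.
        return utterances[:index]
    raise ValueError(f"unknown forecast_mode: {forecast_mode!r}")


def past_mode_pairs(utterances: List[str], use_last_only: bool = False) -> List[Tuple[List[str], str]]:
    """Build (context, reply) pairs for 'past' mode (assumed to start at the second utterance)."""
    pairs = [(utterances[:i], utterances[i]) for i in range(1, len(utterances))]
    # With use_last_only, keep only the pair whose reply is the final utterance.
    return pairs[-1:] if use_last_only else pairs


convo = ["A: hi", "B: hello", "A: that is wrong"]
future_context = context_for(convo, 1, "future")
past_context = context_for(convo, 1, "past")
last_only = past_mode_pairs(convo, use_last_only=True)
```

Here the forecast attached to the second utterance sees both utterances in ‘future’ mode but only the first one in ‘past’ mode.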
fit(corpus: convokit.model.corpus.Corpus, y=None, selector: Callable[[convokit.model.conversation.Conversation], bool] = <function Forecaster.<lambda>>, ignore_utterances: Callable[[convokit.model.utterance.Utterance], bool] = <function Forecaster.<lambda>>)

Train the ForecasterModel on the given corpus.

Parameters:
  • corpus – target Corpus
  • selector – a (lambda) function that takes a Conversation and returns a bool: True if the Conversation is to be included in the fitting step. By default, includes all Conversations.
  • ignore_utterances – a (lambda) function that takes an Utterance and returns a bool: True if the Utterance should be excluded from the Conversation in the fitting step. By default, all Utterances are included.
Returns:

fitted Forecaster Transformer
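
As a rough illustration of how selector and ignore_utterances interact, the sketch below applies both callables to stand-in objects. SketchUtterance, SketchConversation, filter_for_fit, and the "split" and "is_header" metadata keys are all hypothetical; the real ConvoKit classes and the internal filtering logic may differ.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class SketchUtterance:
    # Hypothetical stand-in for an utterance object.
    id: str
    text: str
    meta: Dict[str, object] = field(default_factory=dict)


@dataclass
class SketchConversation:
    # Hypothetical stand-in for a conversation object.
    id: str
    utterances: List[SketchUtterance]
    meta: Dict[str, object] = field(default_factory=dict)


def filter_for_fit(
    convos: List[SketchConversation],
    selector: Callable[[SketchConversation], bool] = lambda convo: True,
    ignore_utterances: Callable[[SketchUtterance], bool] = lambda utt: False,
) -> Dict[str, List[str]]:
    """Map each selected conversation id to the ids of its non-ignored utterances."""
    kept: Dict[str, List[str]] = {}
    for convo in convos:
        if not selector(convo):
            continue  # selector returned False: skip the whole conversation
        kept[convo.id] = [u.id for u in convo.utterances if not ignore_utterances(u)]
    return kept


convos = [
    SketchConversation(
        "c1",
        [SketchUtterance("u1", "[section header]", {"is_header": True}),
         SketchUtterance("u2", "hello")],
        {"split": "train"},
    ),
    SketchConversation("c2", [SketchUtterance("u3", "hi")], {"split": "test"}),
]

kept = filter_for_fit(
    convos,
    selector=lambda convo: convo.meta.get("split") == "train",
    ignore_utterances=lambda utt: bool(utt.meta.get("is_header", False)),
)
```

Note the opposite polarity of the two callables: selector returns True to include a Conversation, while ignore_utterances returns True to exclude an Utterance.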

fit_transform(corpus: convokit.model.corpus.Corpus, y=None, selector: Callable[[convokit.model.conversation.Conversation], bool] = <function Forecaster.<lambda>>, ignore_utterances: Callable[[convokit.model.utterance.Utterance], bool] = <function Forecaster.<lambda>>) → convokit.model.corpus.Corpus

Fit and run the Transformer on a single Corpus.

Parameters:
  • corpus – the Corpus to use
Returns:

same as transform

get_model()

Get the forecaster model object

set_model(forecaster_model)

Set the forecaster model object

summarize(corpus: convokit.model.corpus.Corpus, selector: Callable[[convokit.model.conversation.Conversation], bool] = <function Forecaster.<lambda>>, ignore_utterances: Callable[[convokit.model.utterance.Utterance], bool] = <function Forecaster.<lambda>>, exclude_na=True)

Returns a DataFrame of utterances and their forecasts (and forecast probabilities)

Parameters:
  • corpus – target Corpus
  • exclude_na – whether to drop NaN results
  • selector – a (lambda) function that takes a Conversation and returns a bool: True if the Conversation is to be included in the summary step. By default, includes all Conversations.
  • ignore_utterances – a (lambda) function that takes an Utterance and returns a bool: True if the Utterance should be excluded from the Conversation in the summary step. By default, all Utterances are included.
Returns:

a pandas DataFrame

transform(corpus: convokit.model.corpus.Corpus, selector: Callable[[convokit.model.conversation.Conversation], bool] = <function Forecaster.<lambda>>, ignore_utterances: Callable[[convokit.model.utterance.Utterance], bool] = <function Forecaster.<lambda>>) → convokit.model.corpus.Corpus

Annotate the utterances in the corpus with a forecast and a forecast probability, stored under forecast_attribute_name and forecast_prob_attribute_name

Parameters:
  • corpus – target Corpus
  • selector – a (lambda) function that takes a Conversation and returns a bool: True if the Conversation is to be included in the transformation step. By default, includes all Conversations.
  • ignore_utterances – a (lambda) function that takes an Utterance and returns a bool: True if the Utterance should be excluded from the Conversation in the transformation step. By default, all Utterances are included.
Returns:

annotated Corpus

class convokit.forecaster.forecasterModel.ForecasterModel(forecast_attribute_name: str = 'prediction', forecast_feat_name=None, forecast_prob_attribute_name: str = 'score', forecast_prob_feat_name=None)
forecast(id_to_context_reply_label)

Use the Forecaster Model to compute forecasts and scores for given context-reply pairs and return a results dataframe

train(id_to_context_reply_label)

Train the Forecaster Model with the context-reply-label tuples
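
To make the train/forecast contract more concrete, here is a minimal standalone baseline that mirrors the two methods. It is a sketch under stated assumptions, not a ConvoKit implementation: MajorityLabelModel is a hypothetical class that does not subclass ForecasterModel, id_to_context_reply_label is assumed to be a dict mapping an utterance id to a (context, reply, label) tuple, and forecast returns a list of row dicts rather than the results dataframe described above so that the example has no dependencies.

```python
from typing import Dict, List, Optional, Tuple

# Assumed shape: utterance id -> (context utterance texts, reply text, label or None)
ContextReplyLabel = Tuple[List[str], str, Optional[bool]]


class MajorityLabelModel:
    """Toy baseline: predicts the majority training label for every pair."""

    def __init__(self, forecast_attribute_name: str = "prediction",
                 forecast_prob_attribute_name: str = "score"):
        self.forecast_attribute_name = forecast_attribute_name
        self.forecast_prob_attribute_name = forecast_prob_attribute_name
        self.positive_rate = 0.0

    def train(self, id_to_context_reply_label: Dict[str, ContextReplyLabel]) -> None:
        # Only the labels are used; context and reply are ignored by this baseline.
        labels = [label for _, _, label in id_to_context_reply_label.values()
                  if label is not None]
        self.positive_rate = sum(labels) / len(labels) if labels else 0.0

    def forecast(self, id_to_context_reply_label: Dict[str, ContextReplyLabel]) -> List[dict]:
        # One row per id: a boolean forecast plus the score it was derived from.
        return [
            {
                "id": utt_id,
                self.forecast_attribute_name: self.positive_rate > 0.5,
                self.forecast_prob_attribute_name: self.positive_rate,
            }
            for utt_id in id_to_context_reply_label
        ]


train_pairs = {
    "u2": (["hi"], "hello", True),
    "u3": (["hi", "hello"], "that is wrong", True),
    "u4": (["hey"], "thanks", False),
}
model = MajorityLabelModel()
model.train(train_pairs)
rows = model.forecast({"u9": (["hi"], "ok", None)})
```

A real model would score each context-reply pair individually; the baseline only shows which inputs each method receives and what kind of per-utterance result it produces.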