CRAFT Model

A backend for Forecaster that implements the CRAFT algorithm from the EMNLP 2019 paper “Trouble on the Horizon: Forecasting the Derailment of Conversations as they Develop”.

CRAFT is a neural model based on a pre-train-then-fine-tune paradigm. As the purpose of this class is to enable CRAFT to be used as a backend for Forecaster, it uses the author-provided already-trained CRAFT instance. Training a new CRAFT instance from scratch is considered outside the scope of ConvoKit. Users interested in creating their own custom CRAFT models can instead consult the authors’ official implementation.

IMPORTANT NOTE: This implementation directly uses the author-provided CRAFT model that was used in the paper’s experiments. This model was developed separately from ConvoKit and uses its own tokenization scheme, which differs from ConvoKit’s default. Using ConvoKit’s tokenization could therefore result in tokens that are inconsistent with what the CRAFT model expects, leading to errors. ConvoKit ships with a workaround in the form of a special tokenizer, craft_tokenize, which implements the tokenization scheme used in the CRAFT model. Users of this class should therefore always use craft_tokenize in place of ConvoKit’s default tokenization. See the CRAFT demo notebook for an example of how to do this.
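To see why the tokenization scheme matters, consider the following sketch. The two tokenizers below are simplified stand-ins for illustration only (they are neither ConvoKit’s default nor the actual craft_tokenize implementation); the point is that a vocabulary index built under one scheme treats the other scheme’s tokens as out-of-vocabulary.

```python
import re

def whitespace_tokenize(text):
    # Naive tokenizer: split on whitespace only, preserve case.
    return text.split()

def craft_style_tokenize(text):
    # Simplified stand-in for a CRAFT-style scheme:
    # lowercase, and split punctuation into separate tokens.
    return re.findall(r"\w+|[^\w\s]", text.lower())

text = "You're wrong, and you know it!"
default_tokens = whitespace_tokenize(text)   # ["You're", "wrong,", ...]
craft_tokens = craft_style_tokenize(text)    # ["you", "'", "re", "wrong", ...]

# A vocabulary built under the CRAFT-style scheme does not contain
# tokens produced by the other scheme, so lookups would fail or map
# to unknown-token ids.
vocab = set(craft_tokens)
oov = [tok for tok in default_tokens if tok not in vocab]
```

This is the failure mode the note above warns about, and why craft_tokenize should always be used to prepare text for this model.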

class convokit.forecaster.CRAFTModel.CRAFTModel(device_type: str = 'cpu', model_path: str = None, options: Dict = None, forecast_attribute_name: str = 'prediction', forecast_feat_name=None, forecast_prob_attribute_name: str = 'pred_score', forecast_prob_feat_name=None)

CRAFTModel is one of the Forecaster models that can be used with the Forecaster Transformer.

If no options dict is provided, CRAFTModel will be initialized with the following default options:

  • hidden_size: 500

  • encoder_n_layers: 2

  • context_encoder_n_layers: 2

  • decoder_n_layers: 2

  • dropout: 0.1

  • batch_size (batch size for computation, i.e. how many (context, reply, id) tuples to use per batch of evaluation): 64

  • clip: 50.0

  • learning_rate: 1e-5

  • print_every: 10

  • train_epochs (number of epochs for training): 30

  • validation_size (percentage of training input data to use as validation): 0.2

  • max_length (maximum utterance length in the dataset): 80
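A dict with these keys can be passed via the options parameter. The values below are the documented defaults; the merge shown is an illustrative sketch of overriding a subset of keys, not CRAFTModel’s internal code, so if in doubt pass a complete options dict rather than a partial one.

```python
# Documented CRAFTModel defaults (from the list above).
DEFAULT_OPTIONS = {
    "hidden_size": 500,
    "encoder_n_layers": 2,
    "context_encoder_n_layers": 2,
    "decoder_n_layers": 2,
    "dropout": 0.1,
    "batch_size": 64,
    "clip": 50.0,
    "learning_rate": 1e-5,
    "print_every": 10,
    "train_epochs": 30,
    "validation_size": 0.2,
    "max_length": 80,
}

# Override only the keys you want to change; the rest keep their defaults.
user_options = {"batch_size": 32, "train_epochs": 10}
options = {**DEFAULT_OPTIONS, **user_options}
# options could then be passed to CRAFTModel(options=options).
```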

Parameters
  • device_type – ‘cpu’ or ‘cuda’, default: ‘cpu’

  • model_path – filepath to a custom CRAFT model checkpoint; if not provided, the author-provided pre-trained model is used

  • options – configuration options for the neural network; if not provided, the default options listed above are used

  • forecast_attribute_name – name of DataFrame column containing predictions, default: “prediction”

  • forecast_prob_attribute_name – name of DataFrame column containing prediction scores, default: “pred_score”

forecast(id_to_context_reply_label)

Compute forecasts and forecast scores for the given dictionary mapping utterance ids to (context, reply, label) tuples. Return the values in a DataFrame.

Parameters

id_to_context_reply_label – dict mapping utterance id to (context, reply, label)

Returns

a pandas DataFrame
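The shapes involved can be sketched as follows. The toy values are illustrative assumptions (not real CRAFT output); the column names are the documented defaults, “prediction” and “pred_score”.

```python
import pandas as pd

# Input: utterance id -> (context, reply, label), where context is the
# sequence of preceding tokenized utterances, reply is the tokenized
# reply, and label is the ground-truth outcome.
id_to_context_reply_label = {
    "utt_42": ([["hello", "there"], ["hi", "!"]], ["you", "again", "?"], 0),
}

# Output: one row per utterance id, holding the forecast and its score
# under the default column names. (Toy values for illustration.)
forecasts = pd.DataFrame(
    {"prediction": [0], "pred_score": [0.12]},
    index=["utt_42"],
)
```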

train(id_to_context_reply_label)

Train the Forecaster model with the given (context, reply, label) tuples.