Decision Policy

A decision policy converts the continuous score produced by a ForecasterModel’s belief estimator into a discrete intervention decision. Separating belief estimation (the score) from the action (intervene or not) lets you swap decision logic, ranging from simple thresholding to look-ahead deferral and simulation-based voting, without retraining or modifying the underlying forecaster.

Every policy implements two methods:

  • decide(context, score_fn): returns a (score, action, metadata) tuple, where action is the binary intervention decision and metadata is an optional dict of extra per-utterance information (e.g. the simulated replies used by deferral policies).

  • fit(contexts, val_contexts, score_fn): tunes policy-specific parameters (such as the decision threshold) on a held-out validation set.
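The two-method contract above can be illustrated with a minimal, self-contained sketch. It does not import ConvoKit: PolicySketch, FixedThresholdSketch, AlwaysWaitSketch and toy_score_fn are hypothetical stand-ins, and a context is assumed to be a plain list of utterance strings rather than the context tuple supplied by Forecaster.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Optional, Tuple

# (score, action, metadata), mirroring the tuple described above.
Decision = Tuple[float, int, Optional[Dict[str, Any]]]


class PolicySketch(ABC):
    """Hypothetical stand-in for the policy interface (not the ConvoKit class)."""

    @abstractmethod
    def decide(self, context: Any, score_fn: Callable[[Any], float]) -> Decision:
        ...

    def fit(self, contexts, val_contexts=None, score_fn=None) -> None:
        # Nothing to tune in this sketch.
        return None


class FixedThresholdSketch(PolicySketch):
    """Intervene (action 1) when the score is strictly above the threshold."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def decide(self, context, score_fn):
        score = float(score_fn(context))
        return score, int(score > self.threshold), None


class AlwaysWaitSketch(PolicySketch):
    """Never intervenes; attaches a small metadata dict for illustration."""

    def decide(self, context, score_fn):
        return float(score_fn(context)), 0, {"reason": "always wait"}


def toy_score_fn(context) -> float:
    # Hypothetical scorer: fraction of utterances containing "!".
    return sum("!" in utt for utt in context) / len(context)


context = ["hello", "what?!", "calm down!"]
threshold_result = FixedThresholdSketch(0.5).decide(context, toy_score_fn)
wait_result = AlwaysWaitSketch().decide(context, toy_score_fn)
```

Both policies receive the same toy_score_fn and report the same score; only the action differs, which is the sense in which decision logic can be swapped without touching the scorer.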

A ForecasterModel owns a single decision policy, defaulting to ThresholdDecisionPolicy, and exposes it via its decision_policy property. The policy receives the model’s score function as score_fn and shares the model’s labeler and forecast_prob cache key so it can reuse already-computed forecast probabilities.

This mechanism is introduced in Wait! There’s a Way Out.

Base Class

class convokit.decisionpolicy.decisionPolicy.DecisionPolicy(forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)

Abstract interface for converting a conversational context into an action.

abstract decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]

Decide whether to intervene for a context.

Parameters
  • context – context tuple supplied by Forecaster

  • score_fn – callable that maps a context tuple to a scalar score

Returns

tuple containing the score, the integer action label (currently 0/1), and any additional metadata

abstract fit(contexts, val_contexts=None, score_fn: Callable = None)

Fit policy-specific parameters if needed.

Parameters
  • contexts – training contexts for policy fitting

  • val_contexts – optional validation contexts

  • score_fn – optional scorer callable exposed by ForecasterModel

Threshold Decision Policy

class convokit.decisionpolicy.thresholdDecisionPolicy.ThresholdDecisionPolicy(threshold: float = 0.5, forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)

A simple decision policy that predicts 1 when score > threshold.

decide(context, score_fn: Callable) → Tuple[float, int]

Decide whether to intervene for a context.

Parameters
  • context – context tuple supplied by Forecaster

  • score_fn – callable that maps a context tuple to a scalar score

Returns

tuple containing the score, the integer action label (currently 0/1), and any additional metadata

fit(contexts, val_contexts=None, score_fn: Callable = None)

Fit policy-specific parameters if needed.

Parameters
  • contexts – training contexts for policy fitting

  • val_contexts – optional validation contexts

  • score_fn – optional scorer callable exposed by ForecasterModel
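This reference does not say which criterion fit uses to choose the threshold. As a rough sketch only, the function below picks the candidate threshold that maximizes accuracy on the validation contexts. The helper tune_threshold, the label_fn argument, the candidate grid and the accuracy criterion are all assumptions made for illustration, not ConvoKit behavior.

```python
from typing import Any, Callable, Optional, Sequence


def tune_threshold(
    val_contexts: Sequence[Any],
    score_fn: Callable[[Any], float],
    label_fn: Callable[[Any], int],
    candidates: Optional[Sequence[float]] = None,
) -> float:
    """Return the candidate threshold with the highest validation accuracy."""
    if not val_contexts:
        raise ValueError("val_contexts must not be empty")
    if candidates is None:
        # Assumed grid: 0.05, 0.10, ..., 0.95.
        candidates = [i / 20 for i in range(1, 20)]

    # Score each validation context once, then sweep the thresholds.
    scored = [(float(score_fn(c)), int(label_fn(c))) for c in val_contexts]

    best_threshold, best_accuracy = candidates[0], -1.0
    for t in candidates:
        correct = sum(int(score > t) == label for score, label in scored)
        accuracy = correct / len(scored)
        if accuracy > best_accuracy:  # ties keep the lowest threshold
            best_threshold, best_accuracy = t, accuracy
    return best_threshold


# Toy validation set: each "context" is a (score, label) pair.
val = [(0.2, 0), (0.3, 0), (0.6, 1), (0.9, 1)]
best = tune_threshold(val, score_fn=lambda c: c[0], label_fn=lambda c: c[1])
```

Any threshold from 0.3 up to (but not including) 0.6 separates the toy data perfectly, so the returned value falls in that range.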

Deferral Decision Policy

class convokit.decisionpolicy.deferralDecisionPolicy.DeferralDecisionPolicy(simulator, threshold, tau: int = 5, num_simulations: int = 10, store_simulations: bool = False, simulated_reply_attribute_name: str = 'sim_replies', sim_replies_forecast_probs_attribute_name: str = 'sim_replies_forecast_probs', reuse_cached_simulations: bool = True, forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)

Decision policy that defers intervention by looking ahead at simulated next utterances.

Parameters
  • simulator – utterance simulator model (must have a transform(contexts) method returning a DataFrame indexed by utterance id). If the simulator exposes get_num_simulations(), num_simulations is capped to that value.

  • threshold – probability threshold above which a context is flagged.

  • tau – minimum number of simulated branches that must exceed the threshold before an intervention is issued.

  • num_simulations – how many simulated branches to use per context (capped to simulator’s get_num_simulations() if available).

  • store_simulations – if True, simulated reply strings are cached during decide() and written to corpus utterance metadata by post_transform().

  • simulated_reply_attribute_name – metadata field name used when storing simulations on corpus utterances (only relevant when store_simulations=True).

  • reuse_cached_simulations – if True (default), simulations already present on the current utterance’s metadata under simulated_reply_attribute_name are reused instead of re-invoking the simulator. Similarly, cached simulation scores under sim_replies_forecast_probs_attribute_name are reused when they align with the reused simulations, skipping re-scoring. Set to False to force regeneration.
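The interaction between threshold, tau and num_simulations can be sketched as a pure function over already-computed scores. This is an illustrative reading of the parameter descriptions, not the library's code: the function name deferral_decide, the metadata keys, and the choice to return the current-utterance score are assumptions. An intervention is issued only if the current score is above the threshold and at least tau simulated branches are also above it.

```python
from typing import Any, Dict, Sequence, Tuple


def deferral_decide(
    current_score: float,
    sim_scores: Sequence[float],
    threshold: float,
    tau: int = 5,
    num_simulations: int = 10,
) -> Tuple[float, int, Dict[str, Any]]:
    """Intervene only if the current score and >= tau simulated branches exceed threshold."""
    used = list(sim_scores)[:num_simulations]  # use at most num_simulations branches
    votes = sum(score > threshold for score in used)
    flagged_now = current_score > threshold
    action = int(flagged_now and votes >= tau)
    # Metadata keys are hypothetical names chosen for this sketch.
    return float(current_score), action, {"votes": votes, "sim_scores": used}


# Six of ten simulated continuations stay above 0.5: intervene.
escalating = [0.9, 0.7, 0.6, 0.4, 0.3, 0.8, 0.2, 0.1, 0.65, 0.55]
# Only two of ten stay above 0.5: the intervention is deferred.
recovering = [0.9, 0.1, 0.2, 0.3, 0.4, 0.1, 0.2, 0.3, 0.4, 0.6]

intervene = deferral_decide(0.8, escalating, threshold=0.5)
defer = deferral_decide(0.8, recovering, threshold=0.5)
not_flagged = deferral_decide(0.4, escalating, threshold=0.5)
```

In the second call the current utterance is flagged, but most simulated replies fall back below the threshold, so the sketch withholds the intervention.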

decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]

Decide whether to intervene for a context.

Parameters
  • context – context tuple supplied by Forecaster

  • score_fn – callable that maps a context tuple to a scalar score

Returns

tuple containing the score, the integer action label (currently 0/1), and any additional metadata

fit(contexts, val_contexts=None, score_fn: Callable = None)

Fit policy-specific parameters if needed.

Parameters
  • contexts – training contexts for policy fitting

  • val_contexts – optional validation contexts

  • score_fn – optional scorer callable exposed by ForecasterModel

Random Deferral Decision Policy

class convokit.decisionpolicy.randomDeferralDecisionPolicy.RandomDeferralDecisionPolicy(simulator, threshold, deferral_probability: float = 0.1515, reuse_cached_forecast_probs: bool = True, forecast_prob_attribute_name: str = 'forecast_prob')

Decision policy that defers intervention by looking ahead at simulated next utterances.

Parameters
  • simulator – utterance simulator model (must have a transform(contexts) method returning a DataFrame indexed by utterance id). If the simulator exposes get_num_simulations(), num_simulations is capped to that value.

  • threshold – probability threshold above which a context is flagged.

  • tau – minimum number of simulated branches that must exceed the threshold before an intervention is issued.

  • num_simulations – how many simulated branches to use per context (capped to simulator’s get_num_simulations() if available).

  • store_simulations – if True, simulated reply strings are cached during decide() and written to corpus utterance metadata by post_transform().

  • simulated_reply_attribute_name – metadata field name used when storing simulations on corpus utterances (only relevant when store_simulations=True).

decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]

Decide whether to intervene for a context.

Parameters
  • context – context tuple supplied by Forecaster

  • score_fn – callable that maps a context tuple to a scalar score

Returns

tuple containing the score, the integer action label (currently 0/1), and any additional metadata

fit(contexts, val_contexts=None, score_fn: Callable = None)

Fit policy-specific parameters if needed.

Parameters
  • contexts – training contexts for policy fitting

  • val_contexts – optional validation contexts

  • score_fn – optional scorer callable exposed by ForecasterModel

Simulation Average Decision Policy

This policy is based on the forecasting approach described in Simulation-based Decision Making for Dialogue Intervention for the static forecasting task, and is adapted to the non-static forecasting task in Wait! There’s a Way Out.

class convokit.decisionpolicy.simulationAverageDecisionPolicy.SimulationAverageDecisionPolicy(simulator, threshold, num_simulations: int = 10, store_simulations: bool = False, simulated_reply_attribute_name: str = 'sim_replies', sim_replies_forecast_probs_attribute_name: str = 'sim_replies_forecast_probs', reuse_cached_simulations: bool = True)

Decision policy that intervenes if the mean of the simulated next-utterance scores is at or above the threshold.

This subclass inherits all simulation fetching, per-utterance metadata caching, sim-score caching, and threshold fitting from DeferralDecisionPolicy. The only differences are:

  • no tau parameter (unused; forwarded as 0 to super)

  • decide predicts based on mean(simulation_scores) >= threshold
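The mean-based rule in the list above amounts to the following sketch. The function name simulation_average_decide, the metadata key, and the choice to return the mean as the score are assumptions made for illustration.

```python
from statistics import fmean
from typing import Any, Dict, Sequence, Tuple


def simulation_average_decide(
    sim_scores: Sequence[float], threshold: float
) -> Tuple[float, int, Dict[str, Any]]:
    """Intervene when the mean simulated score is at or above the threshold."""
    if not sim_scores:
        raise ValueError("at least one simulated score is required")
    mean_score = fmean(sim_scores)
    action = int(mean_score >= threshold)  # note: >=, not >
    return mean_score, action, {"sim_scores": list(sim_scores)}


above = simulation_average_decide([0.9, 0.2, 0.7], threshold=0.5)
boundary = simulation_average_decide([0.5, 0.5], threshold=0.5)
below = simulation_average_decide([0.1, 0.2, 0.3], threshold=0.5)
```

The boundary case shows the effect of the inclusive comparison: a mean exactly equal to the threshold still triggers an intervention in this sketch.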

decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]

Decide whether to intervene for a context.

Parameters
  • context – context tuple supplied by Forecaster

  • score_fn – callable that maps a context tuple to a scalar score

Returns

tuple containing the score, the integer action label (currently 0/1), and any additional metadata

Simulation Majority Decision Policy

class convokit.decisionpolicy.simulationMajorityDecisionPolicy.SimulationMajorityDecisionPolicy(simulator, threshold, tau: int = 5, num_simulations: int = 10, store_simulations: bool = False, simulated_reply_attribute_name: str = 'sim_replies', sim_replies_forecast_probs_attribute_name: str = 'sim_replies_forecast_probs', reuse_cached_simulations: bool = True, forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)

Decision policy that intervenes if at least tau of the simulated next utterances score above the threshold, ignoring the current utterance score.

This subclass inherits all simulation fetching, per-utterance metadata caching, sim-score caching, and threshold fitting from DeferralDecisionPolicy. The only difference is in decide: the gate decision_score > threshold is dropped so that only the simulated-branch vote count drives the prediction.
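To make the dropped gate concrete, here is a minimal sketch of the vote-only rule. The function name simulation_majority_decide, the metadata key, and returning the current score unchanged are assumptions; the current score is passed through but plays no part in the action.

```python
from typing import Any, Dict, Sequence, Tuple


def simulation_majority_decide(
    current_score: float,
    sim_scores: Sequence[float],
    threshold: float,
    tau: int = 5,
) -> Tuple[float, int, Dict[str, Any]]:
    """Intervene when at least tau simulated scores exceed the threshold."""
    votes = sum(score > threshold for score in sim_scores)
    action = int(votes >= tau)  # current_score is deliberately not consulted
    return float(current_score), action, {"votes": votes}


sims = [0.9, 0.7, 0.6, 0.4, 0.3, 0.8, 0.2, 0.1, 0.65, 0.55]  # six above 0.5

# The current score (0.1) is below the threshold, yet the vote alone triggers action 1.
low_current = simulation_majority_decide(0.1, sims, threshold=0.5, tau=5)
# Raising tau above the number of votes suppresses the intervention.
strict = simulation_majority_decide(0.1, sims, threshold=0.5, tau=7)
```

With a gated rule the first call would not intervene, because the current score of 0.1 is below the threshold; here the six votes are sufficient on their own.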

decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]

Decide whether to intervene for a context.

Parameters
  • context – context tuple supplied by Forecaster

  • score_fn – callable that maps a context tuple to a scalar score

Returns

tuple containing the score, the integer action label (currently 0/1), and any additional metadata