Decision Policy
A decision policy converts the continuous score produced by a ForecasterModel’s belief estimator into a discrete intervention decision. Separating belief estimation (the score) from the action (intervene or not) lets you swap decision logic, ranging from simple thresholding to look-ahead deferral and simulation-based voting, without retraining or modifying the underlying forecaster.
Every policy implements two methods:
decide(context, score_fn): returns a (score, action, metadata) tuple, where action is the binary intervention decision and metadata is an optional dict of extra per-utterance information (e.g. the simulated replies used by deferral policies).
fit(contexts, val_contexts, score_fn): tunes policy-specific parameters (such as the decision threshold) on a held-out validation set.
A ForecasterModel owns a single decision policy, defaulting to ThresholdDecisionPolicy, and exposes it via
its decision_policy property. The policy receives the model’s score function as score_fn and shares
the model’s labeler and forecast_prob cache key so it can reuse already-computed forecast probabilities.
This mechanism is introduced in Wait! There’s a Way Out.
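The two-method contract described above can be sketched without ConvoKit installed. The classes below (`SketchPolicy`, `FixedCutoffPolicy`) and the helper `toy_score_fn` are hypothetical names invented for illustration; they mimic the documented `decide`/`fit` signatures but are not the library's own classes.

```python
from abc import ABC, abstractmethod
from typing import Any, Callable, Dict, Optional, Tuple


class SketchPolicy(ABC):
    """Hypothetical stand-in for the two-method policy interface."""

    @abstractmethod
    def decide(
        self, context, score_fn: Callable
    ) -> Tuple[float, int, Optional[Dict[str, Any]]]:
        ...

    @abstractmethod
    def fit(self, contexts, val_contexts=None, score_fn: Callable = None):
        ...


class FixedCutoffPolicy(SketchPolicy):
    """Toy policy: act when the score is strictly above a fixed cut-off."""

    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def decide(self, context, score_fn):
        score = float(score_fn(context))
        action = int(score > self.threshold)
        return score, action, None  # this toy policy has no extra metadata

    def fit(self, contexts, val_contexts=None, score_fn=None):
        return self  # nothing to tune in this toy policy


def toy_score_fn(context):
    # Assumption: a "context" here is just a list of numbers in [0, 1].
    return sum(context) / len(context)


policy = FixedCutoffPolicy(threshold=0.5)
result_high = policy.decide([0.9, 0.7], toy_score_fn)
result_low = policy.decide([0.1, 0.3], toy_score_fn)
```

Because the score function is passed in rather than owned by the policy, a different policy object could be substituted here without touching `toy_score_fn`, which is the separation the section describes.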
Base Class
-
class
convokit.decisionpolicy.decisionPolicy.DecisionPolicy(forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)¶ Abstract interface for converting a conversational context into an action.
-
abstract
decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]¶ Decide whether to intervene for a context.
- Parameters
context – context tuple supplied by Forecaster
score_fn – callable that maps a context tuple to a scalar score
- Returns
tuple containing the score, the integer action label (currently 0/1), and any additional metadata
-
abstract
fit(contexts, val_contexts=None, score_fn: Callable = None)¶ Fit policy-specific parameters if needed.
- Parameters
contexts – training contexts for policy fitting
val_contexts – optional validation contexts
score_fn – optional scorer callable exposed by ForecasterModel
-
abstract
Threshold Decision Policy
-
class
convokit.decisionpolicy.thresholdDecisionPolicy.ThresholdDecisionPolicy(threshold: float = 0.5, forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)¶ A simple decision policy that predicts 1 when score > threshold.
-
decide(context, score_fn: Callable) → Tuple[float, int]¶ Decide whether to intervene for a context.
- Parameters
context – context tuple supplied by Forecaster
score_fn – callable that maps a context tuple to a scalar score
- Returns
tuple containing the score, the integer action label (currently 0/1), and any additional metadata
-
fit(contexts, val_contexts=None, score_fn: Callable = None)¶ Fit policy-specific parameters if needed.
- Parameters
contexts – training contexts for policy fitting
val_contexts – optional validation contexts
score_fn – optional scorer callable exposed by ForecasterModel
-
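One plausible way a threshold could be tuned on validation data is a simple sweep over candidate values. The sketch below assumes accuracy as the selection criterion and a fixed candidate grid; both are assumptions for illustration, and `pick_threshold` is a hypothetical helper, not the library's `fit` implementation.

```python
from typing import Sequence, Tuple


def pick_threshold(
    scores: Sequence[float],
    labels: Sequence[int],
    candidates: Sequence[float],
) -> Tuple[float, float]:
    """Return the candidate threshold with the highest validation accuracy."""
    best_threshold, best_accuracy = candidates[0], -1.0
    for t in candidates:
        # Same rule as the policy: predict 1 when score > threshold.
        preds = [int(s > t) for s in scores]
        accuracy = sum(p == y for p, y in zip(preds, labels)) / len(labels)
        if accuracy > best_accuracy:  # ties keep the earliest candidate
            best_threshold, best_accuracy = t, accuracy
    return best_threshold, best_accuracy


# Toy validation set: scores from some forecaster and their true labels.
val_scores = [0.10, 0.35, 0.40, 0.65, 0.80, 0.90]
val_labels = [0, 0, 0, 1, 1, 1]
candidates = [i / 10 for i in range(1, 10)]  # 0.1, 0.2, ..., 0.9

best_t, best_acc = pick_threshold(val_scores, val_labels, candidates)
```

On this toy data the sweep settles on the first candidate that separates the two classes perfectly.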
Deferral Decision Policy
-
class
convokit.decisionpolicy.deferralDecisionPolicy.DeferralDecisionPolicy(simulator, threshold, tau: int = 5, num_simulations: int = 10, store_simulations: bool = False, simulated_reply_attribute_name: str = 'sim_replies', sim_replies_forecast_probs_attribute_name: str = 'sim_replies_forecast_probs', reuse_cached_simulations: bool = True, forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)¶ Decision policy that defers intervention by looking ahead at simulated next utterances.
- Parameters
simulator – utterance simulator model (must have a transform(contexts) method returning a DataFrame indexed by utterance id). If the simulator exposes get_num_simulations(), num_simulations is capped to that value.
threshold – probability threshold above which a context is flagged.
tau – minimum number of simulated branches that must exceed the threshold before an intervention is issued.
num_simulations – how many simulated branches to use per context (capped to simulator’s get_num_simulations() if available).
store_simulations – if True, simulated reply strings are cached during decide() and written to corpus utterance metadata by post_transform().
simulated_reply_attribute_name – metadata field name used when storing simulations on corpus utterances (only relevant when store_simulations=True).
reuse_cached_simulations – if True (default), simulations already present on the current utterance’s metadata under simulated_reply_attribute_name are reused instead of re-invoking the simulator. Similarly, cached simulation scores under sim_replies_forecast_probs_attribute_name are reused when they align with the reused simulations, skipping re-scoring. Set to False to force regeneration.
-
decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]¶ Decide whether to intervene for a context.
- Parameters
context – context tuple supplied by Forecaster
score_fn – callable that maps a context tuple to a scalar score
- Returns
tuple containing the score, the integer action label (currently 0/1), and any additional metadata
-
fit(contexts, val_contexts=None, score_fn: Callable = None)¶ Fit policy-specific parameters if needed.
- Parameters
contexts – training contexts for policy fitting
val_contexts – optional validation contexts
score_fn – optional scorer callable exposed by ForecasterModel
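The decision rule implied by the threshold and tau parameters can be sketched as a pure function: the current score must exceed the threshold, and at least tau simulated branches must exceed it too. `deferral_action` is a hypothetical helper written for illustration; it takes precomputed scores and leaves out the simulator, caching, and metadata handling that the real class performs.

```python
from typing import Sequence


def deferral_action(
    current_score: float,
    sim_scores: Sequence[float],
    threshold: float,
    tau: int,
) -> int:
    """Return 1 (intervene) or 0 (do not intervene, or defer)."""
    # Gate: the current context itself must be flagged.
    if current_score <= threshold:
        return 0
    # Vote: count simulated next utterances that are also flagged.
    votes = sum(1 for s in sim_scores if s > threshold)
    return int(votes >= tau)


# Flagged context, 3 of 5 simulated branches above 0.5, tau = 3: intervene.
act_now = deferral_action(0.8, [0.9, 0.7, 0.2, 0.6, 0.1], threshold=0.5, tau=3)
# Flagged context, but only 1 branch above 0.5 with tau = 2: defer.
deferred = deferral_action(0.8, [0.9, 0.2, 0.1], threshold=0.5, tau=2)
# Context not flagged: the simulated branches are never consulted.
not_flagged = deferral_action(0.4, [0.9, 0.9], threshold=0.5, tau=1)
```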
Random Deferral Decision Policy
-
class
convokit.decisionpolicy.randomDeferralDecisionPolicy.RandomDeferralDecisionPolicy(simulator, threshold, deferral_probability: float = 0.1515, reuse_cached_forecast_probs: bool = True, forecast_prob_attribute_name: str = 'forecast_prob')¶ Decision policy that defers intervention by looking ahead at simulated next utterances.
- Parameters
simulator – utterance simulator model (must have a transform(contexts) method returning a DataFrame indexed by utterance id). If the simulator exposes get_num_simulations(), num_simulations is capped to that value.
threshold – probability threshold above which a context is flagged.
tau – minimum number of simulated branches that must exceed the threshold before an intervention is issued.
num_simulations – how many simulated branches to use per context (capped to simulator’s get_num_simulations() if available).
store_simulations – if True, simulated reply strings are cached during decide() and written to corpus utterance metadata by post_transform().
simulated_reply_attribute_name – metadata field name used when storing simulations on corpus utterances (only relevant when store_simulations=True).
-
decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]¶ Decide whether to intervene for a context.
- Parameters
context – context tuple supplied by Forecaster
score_fn – callable that maps a context tuple to a scalar score
- Returns
tuple containing the score, the integer action label (currently 0/1), and any additional metadata
-
fit(contexts, val_contexts=None, score_fn: Callable = None)¶ Fit policy-specific parameters if needed.
- Parameters
contexts – training contexts for policy fitting
val_contexts – optional validation contexts
score_fn – optional scorer callable exposed by ForecasterModel
Simulation Average Decision Policy
This policy is based on the forecasting approach described in Simulation-based Decision Making for Dialogue Intervention in the static forecasting task, and is adapted for the non-static forecasting task in Wait! There’s a Way Out.
-
class
convokit.decisionpolicy.simulationAverageDecisionPolicy.SimulationAverageDecisionPolicy(simulator, threshold, num_simulations: int = 10, store_simulations: bool = False, simulated_reply_attribute_name: str = 'sim_replies', sim_replies_forecast_probs_attribute_name: str = 'sim_replies_forecast_probs', reuse_cached_simulations: bool = True)¶ Decision policy that intervenes if the mean of the simulated next-utterance scores is at or above the threshold.
This subclass inherits all simulation fetching, per-utterance metadata caching, sim-score caching, and threshold fitting from DeferralDecisionPolicy. The only differences are:
no tau parameter (unused; forwarded as 0 to super)
decide predicts based on mean(simulation_scores) >= threshold
-
decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]¶ Decide whether to intervene for a context.
- Parameters
context – context tuple supplied by Forecaster
score_fn – callable that maps a context tuple to a scalar score
- Returns
tuple containing the score, the integer action label (currently 0/1), and any additional metadata
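The averaging rule, mean(simulation_scores) >= threshold, is small enough to sketch directly. `average_action` is a hypothetical helper operating on precomputed simulation scores. Returning the mean as the reported score is an assumption made for this illustration; the source does not say which score the real decide returns.

```python
from typing import Sequence, Tuple


def average_action(sim_scores: Sequence[float], threshold: float) -> Tuple[float, int]:
    """Return (mean simulated score, action) under the averaging rule."""
    mean_score = sum(sim_scores) / len(sim_scores)
    # Note the inclusive comparison: a mean exactly at the threshold acts.
    return mean_score, int(mean_score >= threshold)


at_threshold = average_action([0.25, 0.75], threshold=0.5)  # mean is exactly 0.5
below = average_action([0.25, 0.5], threshold=0.5)          # mean is 0.375
```

Unlike the vote-counting policies, one very high simulated score can offset several low ones here, since only the mean matters.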
Simulation Majority Decision Policy
-
class
convokit.decisionpolicy.simulationMajorityDecisionPolicy.SimulationMajorityDecisionPolicy(simulator, threshold, tau: int = 5, num_simulations: int = 10, store_simulations: bool = False, simulated_reply_attribute_name: str = 'sim_replies', sim_replies_forecast_probs_attribute_name: str = 'sim_replies_forecast_probs', reuse_cached_simulations: bool = True, forecast_prob_attribute_name: str = 'forecast_prob', reuse_cached_forecast_probs: bool = True)¶ Decision policy that intervenes if at least tau of the simulated next utterances score above the threshold, ignoring the current utterance score.
This subclass inherits all simulation fetching, per-utterance metadata caching, sim-score caching, and threshold fitting from DeferralDecisionPolicy. The only difference is in decide: the gate decision_score > threshold is dropped so that only the simulated-branch vote count drives the prediction.
-
decide(context, score_fn: Callable) → Tuple[float, int, Optional[Dict[str, Any]]]¶ Decide whether to intervene for a context.
- Parameters
context – context tuple supplied by Forecaster
score_fn – callable that maps a context tuple to a scalar score
- Returns
tuple containing the score, the integer action label (currently 0/1), and any additional metadata
-
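The effect of dropping the current-score gate can be illustrated with a small pure function. `majority_action` is a hypothetical helper over precomputed simulation scores, not the library's decide; it accepts the current score only to show that the score plays no part in the action.

```python
from typing import Sequence


def majority_action(
    current_score: float,
    sim_scores: Sequence[float],
    threshold: float,
    tau: int,
) -> int:
    """Return 1 if at least tau simulated scores exceed the threshold."""
    # current_score is deliberately unused: only the vote count matters.
    votes = sum(1 for s in sim_scores if s > threshold)
    return int(votes >= tau)


# The current context is not flagged (0.3 <= 0.5), yet the policy acts
# because 3 simulated branches exceed the threshold and tau is 2.
acts_on_votes = majority_action(0.3, [0.9, 0.8, 0.7], threshold=0.5, tau=2)
# A flagged current context does not help when too few branches agree.
too_few_votes = majority_action(0.9, [0.9, 0.1, 0.2], threshold=0.5, tau=2)
```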