politenessStrategies
Computes Politeness Strategies features.
Three strategy collections are currently offered, covering two languages:
politeness_api: English politeness strategies described in A computational approach to politeness with application to social factors
politeness_local: English politeness strategies realized through local markers as used in Facilitating the Communication of Politeness through Fine-Grained Paraphrasing
politeness_cscw_zh: Chinese politeness strategies adapted from Studying Politeness across Cultures using English Twitter and Mandarin Weibo
Example usage:
understanding the (mis)use of politeness strategies in conversations gone awry on Wikipedia,
assessing the permeability of politeness markers in machine-translated communication
class convokit.politenessStrategies.politenessStrategies.PolitenessStrategies(parse_attribute_name: str = 'parsed', strategy_attribute_name: str = 'politeness_strategies', marker_attribute_name: str = 'politeness_markers', strategy_collection: str = 'politeness_api', verbose: int = 0)
Encapsulates extraction of politeness strategies from utterances in a Corpus.
- Parameters
parse_attribute_name – metadata attribute name to read parses from. Default is ‘parsed’.
strategy_attribute_name – metadata attribute name to store politeness strategies features under during the transform() step. Default is ‘politeness_strategies’.
marker_attribute_name – metadata attribute name to store politeness markers under during the transform() step. Default is ‘politeness_markers’.
strategy_collection – collection of politeness strategies to extract. Default is “politeness_api”. Options include:
“politeness_api”: English politeness strategies proposed in A computational approach to politeness with application to social factors (https://www.cs.cornell.edu/~cristian/Politeness.html)
“politeness_local”: English politeness strategies realized through local markers as used in Facilitating the Communication of Politeness through Fine-Grained Paraphrasing (https://www.cs.cornell.edu/~cristian/Politeness_Paraphrasing.html)
“politeness_cscw_zh”: Chinese politeness strategies adapted from Studying Politeness across Cultures using English Twitter and Mandarin Weibo (https://dl.acm.org/doi/abs/10.1145/3415190)
verbose – whether and how often to print status messages while computing features.
transform(corpus: convokit.model.corpus.Corpus, selector: Optional[Callable[[convokit.model.utterance.Utterance], bool]] = <function PolitenessStrategies.<lambda>>, markers: bool = False)
Extracts politeness strategies from each utterance in the corpus and annotates the utterances with the extracted strategies. Requires that the corpus has previously been transformed by a Parser, such that each utterance has dependency parse info in its metadata table.
- Parameters
corpus – the corpus to compute features for.
selector – a (lambda) function that takes an Utterance and returns a bool indicating whether the utterance should be included in this annotation step.
markers – whether or not to also record the politeness markers (the occurrences of each strategy) under marker_attribute_name.
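The parse-then-annotate order described above can be sketched as follows. This is a minimal sketch, not an official recipe. It assumes ConvoKit's TextParser is the Parser in question and that it writes to the default 'parsed' attribute. The function name annotate_politeness and the has_text selector are hypothetical names introduced for illustration.

```python
def annotate_politeness(corpus):
    """Parse a ConvoKit Corpus, then annotate it with politeness strategies."""
    # Imported lazily so that defining this function does not require ConvoKit.
    from convokit import PolitenessStrategies, TextParser

    # Assumption: TextParser stores dependency parses under 'parsed',
    # which matches the default parse_attribute_name documented above.
    corpus = TextParser().transform(corpus)

    ps = PolitenessStrategies(strategy_collection="politeness_api", verbose=0)

    # Hypothetical selector: only annotate utterances with non-empty text.
    has_text = lambda utt: bool(utt.text and utt.text.strip())

    # markers=True additionally stores markers under 'politeness_markers'.
    return ps.transform(corpus, selector=has_text, markers=True)
```

After the call, each selected utterance should carry its features under the default attribute names, for example utt.meta['politeness_strategies'].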
transform_utterance(utt: convokit.model.utterance.Utterance, spacy_nlp: Callable[[str], spacy.tokens.Doc] = None, markers: bool = False)
Extract politeness strategies for raw string inputs (or individual utterances).
- Parameters
utt – the utterance to be annotated with politeness strategies.
spacy_nlp – if provided, will use this SpaCy object to do parsing; otherwise will initialize an object via load(‘en’).
- Returns
the utterance with politeness annotations.
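A single utterance can be annotated without building a full corpus. The sketch below assumes the default attribute names ('politeness_strategies' and 'politeness_markers') and that the returned utterance exposes them through meta. The helper name annotate_utterance is hypothetical.

```python
def annotate_utterance(utt, spacy_nlp=None):
    """Annotate one utterance; return its (strategies, markers) metadata."""
    # Imported lazily so that defining this function does not require ConvoKit.
    from convokit import PolitenessStrategies

    ps = PolitenessStrategies()

    # Passing an already-loaded spaCy pipeline avoids re-initializing one
    # on every call; with spacy_nlp=None the transformer loads its own.
    annotated = ps.transform_utterance(utt, spacy_nlp=spacy_nlp, markers=True)

    # Assumption: results are stored under the default attribute names.
    return (
        annotated.meta["politeness_strategies"],
        annotated.meta["politeness_markers"],
    )
```

When annotating many utterances one by one, loading the spaCy object once and passing it in on each call is likely the cheaper choice.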
summarize(corpus: convokit.model.corpus.Corpus, selector: Callable[[convokit.model.utterance.Utterance], bool] = <function PolitenessStrategies.<lambda>>, plot: bool = False, y_lim=None)
Calculates strategy prevalence and plots a graph if plot == True, with an optional selector that specifies which utterances to include in the analysis.
- Parameters
corpus – the target Corpus
selector – a function (typically, a lambda function) that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Corpus.
plot – whether or not to output graph.
- Returns
a pandas DataFrame of scores with graph optionally outputted
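To make the notion of prevalence under a selector concrete, here is a library-free sketch. It is not ConvoKit's implementation. It assumes each utterance stores a mapping from strategy name to a 0/1 indicator, and that prevalence is the mean of that indicator over the selected utterances. The utterances are plain dicts standing in for Utterance objects, and the strategy names are made up.

```python
def strategy_prevalence(utterances, selector=lambda utt: True,
                        attribute="politeness_strategies"):
    """Return {strategy: fraction of selected utterances that use it}."""
    selected = [utt for utt in utterances if selector(utt)]
    if not selected:
        return {}
    names = sorted({name for utt in selected for name in utt["meta"][attribute]})
    return {
        name: sum(utt["meta"][attribute].get(name, 0) for utt in selected) / len(selected)
        for name in names
    }


# Stand-in data: hypothetical strategy names and speakers.
utterances = [
    {"speaker": "a", "meta": {"politeness_strategies": {"Please": 1, "Gratitude": 0}}},
    {"speaker": "b", "meta": {"politeness_strategies": {"Please": 0, "Gratitude": 1}}},
    {"speaker": "a", "meta": {"politeness_strategies": {"Please": 1, "Gratitude": 1}}},
    {"speaker": "b", "meta": {"politeness_strategies": {"Please": 0, "Gratitude": 1}}},
]

overall = strategy_prevalence(utterances)
only_a = strategy_prevalence(utterances, selector=lambda utt: utt["speaker"] == "a")
print(overall)  # {'Gratitude': 0.75, 'Please': 0.5}
print(only_a)   # {'Gratitude': 0.5, 'Please': 1.0}
```

The selector plays the same role as in summarize: it narrows the set of utterances before any averaging happens, so the same corpus can yield different prevalence tables for different subgroups.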