Conversation¶
-
class
convokit.model.conversation.
Conversation
(owner, id: Optional[str] = None, utterances: Optional[List[str]] = None, meta: Optional[Dict] = None)¶ Represents a discrete subset of utterances in the dataset, connected by a reply-to chain.
- Parameters
owner – The Corpus that this Conversation belongs to
id – The unique ID of this Conversation
utterances – A list of the IDs of the Utterances in this Conversation
meta – Table of initial values for conversation-level metadata
- Variables
id – the ID of the Conversation
meta – A dictionary-like view object providing read-write access to conversation-level metadata.
-
add_meta
(key: str, value) → None¶ Adds a key-value pair to the metadata of the Corpus component object.
- Parameters
key – name of metadata attribute
value – value of metadata attribute
- Returns
None
-
add_vector
(vector_name: str)¶ Logs in the Corpus component object’s internal vectors list that the component object has a vector row associated with it in the vector matrix named vector_name. Transformers that add vectors to the Corpus should use this to update the relevant component objects during the transform() step.
- Parameters
vector_name – name of vector matrix
- Returns
None
-
check_integrity
(verbose: bool = True) → bool¶ Check the integrity of this Conversation; i.e. do the constituent utterances form a complete reply-to chain?
- Parameters
verbose – whether to print errors indicating the problems with the Conversation
- Returns
True if the conversation structure is complete else False
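The idea of a "complete reply-to chain" can be illustrated without ConvoKit. The sketch below is not the library's implementation; it assumes a hypothetical `reply_to` mapping from each utterance ID to the ID it replies to (`None` for the root), and treats a conversation as complete when there is exactly one root, every reply points at an utterance inside the conversation, and there are no cycles.

```python
from typing import Dict, Optional


def is_complete_reply_chain(reply_to: Dict[str, Optional[str]]) -> bool:
    """Return True if the utterances form a single tree under one root.

    `reply_to` is a hypothetical mapping: utterance ID -> ID it replies to
    (None for the root utterance).
    """
    roots = [utt_id for utt_id, parent in reply_to.items() if parent is None]
    if len(roots) != 1:
        return False  # no root at all, or several disconnected threads

    # Every reply must point at an utterance that is part of this conversation.
    for parent in reply_to.values():
        if parent is not None and parent not in reply_to:
            return False

    # Walk upwards from each utterance; revisiting a node means there is a cycle.
    for utt_id in reply_to:
        seen = set()
        current: Optional[str] = utt_id
        while current is not None:
            if current in seen:
                return False
            seen.add(current)
            current = reply_to[current]
    return True
```

For example, `{"a": None, "b": "a", "c": "b"}` passes this check, while `{"a": None, "b": "x"}` fails because `"x"` is missing from the conversation.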
-
delete_vector
(vector_name: str)¶ Delete a vector associated with this Corpus component object.
- Parameters
vector_name – name of the vector to delete
- Returns
None
-
get_chronological_speaker_list
(selector: Callable[[convokit.model.speaker.Speaker], bool] = <function Conversation.<lambda>>)¶ Get the speakers in the conversation sorted in chronological order (speakers may appear more than once)
- Parameters
selector – (lambda) function that takes a Speaker and returns True for speakers that should be included; all speakers are included by default
- Returns
list of speakers for each chronological utterance
-
get_chronological_utterance_list
(selector: Callable[[convokit.model.utterance.Utterance], bool] = <function Conversation.<lambda>>)¶ Get the utterances in the conversation sorted in increasing order of timestamp
- Parameters
selector – function that takes an Utterance and returns True for utterances that should be included; all utterances are included by default
- Returns
list of utterances, sorted by timestamp
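As a rough illustration of the selector-then-sort behaviour described for the chronological list methods, here is a minimal sketch that does not use ConvoKit. The `Utt` dataclass and the `chronological` helper are hypothetical stand-ins, not library names.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Utt:
    """Hypothetical stand-in for an Utterance (not the ConvoKit class)."""
    id: str
    speaker: str
    timestamp: int


def chronological(
    utts: List[Utt],
    selector: Callable[[Utt], bool] = lambda utt: True,
) -> List[Utt]:
    """Keep the utterances accepted by `selector`, sorted by increasing timestamp."""
    return sorted((u for u in utts if selector(u)), key=lambda u: u.timestamp)


utts = [Utt("u2", "bob", 20), Utt("u1", "alice", 10), Utt("u3", "alice", 30)]

ordered_ids = [u.id for u in chronological(utts)]
alice_ids = [u.id for u in chronological(utts, lambda u: u.speaker == "alice")]
# Taking the speaker of each utterance in order gives a list with repeats.
speakers_in_order = [u.speaker for u in chronological(utts)]
```

Here `speakers_in_order` contains `"alice"` twice, which mirrors the note that speakers may appear more than once in the chronological speaker list.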
-
get_longest_paths
() → List[List[convokit.model.utterance.Utterance]]¶ Finds the Utterances that form the longest path (i.e. root to leaf) in the Conversation tree. If there are multiple paths with tied lengths, returns all of them as a list of lists. If only one such path exists, a list containing a single list of Utterances is returned.
- Returns
a list of lists of Utterances
-
get_root_to_leaf_paths
() → List[List[convokit.model.utterance.Utterance]]¶ Get the paths (stored as a list of lists of utterances) from the root to each of the leaves in the conversational tree
- Returns
List of lists of Utterances
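The relationship between root-to-leaf paths and longest paths can be sketched on plain IDs. This is an illustrative sketch, not ConvoKit's code; it assumes a hypothetical `reply_to` mapping (utterance ID to the ID it replies to, `None` for the root) with a single root.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def root_to_leaf_paths(reply_to: Dict[str, Optional[str]]) -> List[List[str]]:
    """Return every path of utterance IDs from the root to a leaf."""
    children: Dict[str, List[str]] = defaultdict(list)
    root = None
    for utt_id, parent in reply_to.items():
        if parent is None:
            root = utt_id
        else:
            children[parent].append(utt_id)
    if root is None:
        return []

    paths: List[List[str]] = []
    stack = [[root]]  # each stack entry is a partial path starting at the root
    while stack:
        path = stack.pop()
        kids = children.get(path[-1], [])
        if not kids:
            paths.append(path)  # reached a leaf, so the path is complete
        for kid in kids:
            stack.append(path + [kid])
    return paths


def longest_paths(paths: List[List[str]]) -> List[List[str]]:
    """Return all paths tied for maximum length (always a list of lists)."""
    if not paths:
        return []
    longest = max(len(p) for p in paths)
    return [p for p in paths if len(p) == longest]
```

With `{"a": None, "b": "a", "c": "a", "d": "b"}` the root-to-leaf paths are `a, b, d` and `a, c`, and only the first is a longest path. When two paths tie, both are returned.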
-
get_speaker
(speaker_id: str) → convokit.model.speaker.Speaker¶ Looks up the Speaker with the given ID. Raises a KeyError if no speaker with that ID exists.
- Returns
the Speaker with the given speaker_id
-
get_speaker_ids
() → List[str]¶ Produces a list of ids of all speakers in the Conversation, which can be used in calls to get_speaker() to retrieve specific speakers. Provides no ordering guarantees for the list.
- Returns
a list of speaker ids
-
get_speakers_dataframe
(selector: Optional[Callable[[convokit.model.speaker.Speaker], bool]] = <function Conversation.<lambda>>, exclude_meta: bool = False)¶ Get a DataFrame of the Speakers that have participated in the Conversation with fields and metadata attributes, with an optional selector that filters Speakers that should be included. Edits to the DataFrame do not change the corpus in any way.
- Parameters
exclude_meta – whether to exclude metadata
selector – a (lambda) function that takes a Speaker and returns True or False (i.e. include / exclude). By default, the selector includes all Speakers in the Conversation.
- Returns
a pandas DataFrame
-
get_subtree
(root_utt_id)¶ Get the utterance node of the specified input id
- Parameters
root_utt_id – id of the root node that the subtree starts from
- Returns
UtteranceNode object
-
get_utterance
(ut_id: str) → convokit.model.utterance.Utterance¶ Looks up the Utterance associated with the given ID. Raises a KeyError if no utterance by that ID exists.
- Returns
the Utterance with the given ID
-
get_utterance_ids
() → List[str]¶ Produces a list of the unique IDs of all utterances in the Conversation, which can be used in calls to get_utterance() to retrieve specific utterances. Provides no ordering guarantees for the list.
- Returns
a list of IDs of Utterances in the Conversation
-
get_utterances_dataframe
(selector=<function Conversation.<lambda>>, exclude_meta: bool = False)¶ Get a DataFrame of the Utterances in the Conversation with fields and metadata attributes. Set an optional selector that filters Utterances that should be included. Edits to the DataFrame do not change the corpus in any way.
- Parameters
exclude_meta – whether to exclude metadata
selector – a (lambda) function that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Conversation.
- Returns
a pandas DataFrame
-
get_vector
(vector_name: str, as_dataframe: bool = False, columns: Optional[List[str]] = None)¶ Get the vector stored as vector_name for this object.
- Parameters
vector_name – name of vector
as_dataframe – whether to return the vector as a dataframe (True) or in its raw array form (False). False by default.
columns – optional list of named columns of the vector to include. All columns are returned otherwise. This parameter is only used if as_dataframe is set to True.
- Returns
a numpy / scipy array (or a dataframe if as_dataframe is True)
-
iter_speakers
(selector: Callable[[convokit.model.speaker.Speaker], bool] = <function Conversation.<lambda>>) → Generator[convokit.model.speaker.Speaker, None, None]¶ Get Speakers that have participated in the Conversation, with an optional selector that filters for Speakers that should be included.
- Parameters
selector – a (lambda) function that takes a Speaker and returns True or False (i.e. include / exclude). By default, the selector includes all Speakers in the Conversation.
- Returns
a generator of Speakers
-
iter_utterances
(selector: Callable[[convokit.model.utterance.Utterance], bool] = <function Conversation.<lambda>>) → Generator[convokit.model.utterance.Utterance, None, None]¶ Get Utterances in the Conversation, with an optional selector that filters for Utterances that should be included.
- Parameters
selector – a (lambda) function that takes an Utterance and returns True or False (i.e. include / exclude). By default, the selector includes all Utterances in the Conversation.
- Returns
a generator of Utterances
-
print_conversation_stats
()¶ Helper function for printing the number of Utterances and Speakers in the Conversation.
- Returns
None (prints output)
-
print_conversation_structure
(utt_info_func: Callable[[convokit.model.utterance.Utterance], str] = <function Conversation.<lambda>>, limit: int = None) → None¶ Prints an indented representation of utterances in the Conversation with conversation reply-to structure determining the indented level. The details of each utterance to be printed can be configured.
If limit is set to a value other than None, this will annotate utterances with an ‘order’ metadata attribute indicating their temporal order in the conversation, where the first utterance in the conversation is annotated with 1.
- Parameters
utt_info_func – callable function taking an utterance as input and returning a string of the desired utterance information. By default, this is a lambda function returning the utterance’s speaker’s id
limit – maximum number of utterances to print out. If set to k, the first k utterances are included.
- Returns
None. Prints to stdout.
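The indented layout can be pictured with a small sketch that is independent of ConvoKit. It assumes a hypothetical `children` mapping (utterance ID to the IDs of its direct replies) and a `utt_info` callable playing the role of utt_info_func. The `limit` behaviour is left out, and the indentation width of four spaces is an arbitrary choice, not the library's.

```python
from typing import Callable, Dict, List


def format_structure(
    children: Dict[str, List[str]],
    root: str,
    utt_info: Callable[[str], str] = lambda utt_id: utt_id,
    indent: int = 4,
) -> List[str]:
    """Return one line per utterance, indented by its depth in the reply tree."""
    lines: List[str] = []

    def visit(node: str, depth: int) -> None:
        lines.append(" " * (indent * depth) + utt_info(node))
        for child in children.get(node, []):
            visit(child, depth + 1)  # replies sit one level deeper than their parent

    visit(root, 0)
    return lines


# Hypothetical data: "b" and "c" reply to "a", and "d" replies to "b".
children = {"a": ["b", "c"], "b": ["d"]}
speakers = {"a": "alice", "b": "bob", "c": "carol", "d": "alice"}

lines = format_structure(children, "a", utt_info=lambda utt_id: speakers[utt_id])
print("\n".join(lines))
```

Returning the lines instead of printing them directly keeps the helper easy to test; the final `print` reproduces the stdout behaviour.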
-
retrieve_meta
(key: str)¶ Retrieves the value stored under the given key in the metadata of the Corpus component object.
- Parameters
key – name of metadata attribute
- Returns
the value stored under key
-
traverse
(traversal_type: str, as_utterance: bool = True)¶ Traverse through the Conversation tree structure in a breadth-first search (‘bfs’), depth-first search (‘dfs’), pre-order (‘preorder’), or post-order (‘postorder’) way.
- Parameters
traversal_type – dfs, bfs, preorder, or postorder
as_utterance – whether the iterator should yield the utterance (True) or the utterance node (False)
- Returns
an iterator of the utterances or utterance nodes
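To make the traversal orders concrete, here is a minimal sketch on a toy reply tree. It is not ConvoKit's implementation. It assumes a hypothetical `children` mapping from each utterance ID to the IDs of its direct replies, and covers breadth-first, pre-order and post-order only, since how ‘dfs’ orders siblings relative to ‘preorder’ is not stated here.

```python
from collections import deque
from typing import Dict, Iterator, List


def bfs(children: Dict[str, List[str]], root: str) -> Iterator[str]:
    """Visit level by level: the root, then its replies, then their replies."""
    queue = deque([root])
    while queue:
        node = queue.popleft()
        yield node
        queue.extend(children.get(node, []))


def preorder(children: Dict[str, List[str]], root: str) -> Iterator[str]:
    """Visit a node before any of its replies."""
    yield root
    for child in children.get(root, []):
        yield from preorder(children, child)


def postorder(children: Dict[str, List[str]], root: str) -> Iterator[str]:
    """Visit a node only after all of its replies."""
    for child in children.get(root, []):
        yield from postorder(children, child)
    yield root


# Hypothetical tree: "b" and "c" reply to "a", and "d" replies to "b".
children = {"a": ["b", "c"], "b": ["d"]}

bfs_order = list(bfs(children, "a"))
pre_order = list(preorder(children, "a"))
post_order = list(postorder(children, "a"))
```

Each function is a generator, matching the documented return value of an iterator rather than a list.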