ConvoKitMatrix¶
-
class
convokit.model.convoKitMatrix.
ConvoKitMatrix
(name, matrix, ids: Optional[List[str]] = None, columns: Optional[List[str]] = None)¶ A ConvoKitMatrix stores the vector representations of some set of Corpus components (i.e. Utterances, Conversations, Speakers).
- Parameters
name – descriptive name for the matrix
matrix – the vector matrix, as a numpy array or scipy sparse matrix
ids – optional list of Corpus component object ids, where the i-th id corresponds to the i-th row of the matrix
columns – optional list of names for the columns of the matrix
- Variables
name – name of the matrix
matrix – the matrix data
ids – ids corresponding to rows
columns – names corresponding to columns
ids_to_idx – a mapping from id to the row index
cols_to_idx – a mapping from column name to the column index
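The two index mappings can be pictured with a minimal sketch in plain Python. This is not ConvoKit's implementation: the ids, column names, and nested-list matrix are hypothetical stand-ins for real Corpus data and a numpy or scipy matrix.

```python
# Hypothetical ids and column names; a nested list stands in for the real matrix.
ids = ["utt_0", "utt_1", "utt_2"]
columns = ["length", "politeness"]
matrix = [
    [12.0, 0.3],
    [7.0, 0.9],
    [20.0, 0.1],
]

# Each id maps to its row index, each column name to its column index.
ids_to_idx = {id_: i for i, id_ in enumerate(ids)}
cols_to_idx = {col: j for j, col in enumerate(columns)}

# Look up a single cell by id and column name instead of by position.
value = matrix[ids_to_idx["utt_1"]][cols_to_idx["politeness"]]
```

Keeping both mappings alongside the matrix lets callers address rows and columns by name without scanning the ids or columns lists.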
-
dump
(dirpath)¶ Dumps the ConvoKitMatrix as a pickle file.
- Parameters
dirpath – directory path to Corpus
- Returns
None
-
static
from_dir
(dirpath, matrix_name)¶ Initialize a ConvoKitMatrix of the specified matrix_name from a specified directory dirpath.
- Parameters
dirpath – path to Corpus directory
matrix_name – name of vector matrix
- Returns
the initialized ConvoKitMatrix
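A pickle round trip of this kind might look roughly like the sketch below. It uses only the standard library and makes two assumptions: that the file inside dirpath is named "vector.[name].p" (the form from_file expects), and that a plain dict is an adequate stand-in for the pickled object. The helpers dump_matrix and load_matrix are hypothetical and are not part of ConvoKit.

```python
import os
import pickle
import tempfile


def dump_matrix(dirpath, name, payload):
    """Pickle payload to <dirpath>/vector.<name>.p (assumed naming scheme)."""
    path = os.path.join(dirpath, f"vector.{name}.p")
    with open(path, "wb") as f:
        pickle.dump(payload, f)
    return path


def load_matrix(dirpath, name):
    """Load the payload previously written by dump_matrix."""
    path = os.path.join(dirpath, f"vector.{name}.p")
    with open(path, "rb") as f:
        return pickle.load(f)


# A temporary directory stands in for a Corpus directory.
with tempfile.TemporaryDirectory() as corpus_dir:
    payload = {
        "name": "bow",
        "ids": ["utt_0"],
        "columns": ["a", "b"],
        "matrix": [[1, 0]],
    }
    saved_path = dump_matrix(corpus_dir, "bow", payload)
    restored = load_matrix(corpus_dir, "bow")
```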
-
static
from_file
(filepath)¶ Initialize a ConvoKitMatrix from a file of form “vector.[name].p”.
- Parameters
filepath – path to the pickled matrix file
- Returns
the initialized ConvoKitMatrix
-
get_vectors
(ids: Optional[List[str]] = None, columns: Optional[List[str]] = None, as_dataframe: bool = False)¶ Get the vectors for the specified ids and columns.
- Parameters
ids – optional list of object ids to get vectors for; all by default
columns – optional list of named columns of the vector to include; all by default
as_dataframe – whether to return the vector as a dataframe (True) or in its raw array form (False). False by default.
- Returns
a vector matrix (either np.ndarray or csr_matrix) or a pandas dataframe
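The selection semantics (all rows or columns when an argument is None, otherwise only the named ones) can be sketched in plain Python as follows. The function select_vectors and the sample data are hypothetical, and returning rows and columns in the order requested is a choice made for this sketch, not a documented ConvoKit guarantee.

```python
from typing import List, Optional

# Hypothetical data; a nested list stands in for the numpy/scipy matrix.
ids = ["utt_0", "utt_1", "utt_2"]
columns = ["a", "b", "c"]
matrix = [
    [1, 2, 3],
    [4, 5, 6],
    [7, 8, 9],
]
ids_to_idx = {id_: i for i, id_ in enumerate(ids)}
cols_to_idx = {col: j for j, col in enumerate(columns)}


def select_vectors(
    row_ids: Optional[List[str]] = None,
    col_names: Optional[List[str]] = None,
) -> List[List[int]]:
    """Return the requested rows and columns; None selects everything."""
    rows = range(len(ids)) if row_ids is None else [ids_to_idx[i] for i in row_ids]
    cols = range(len(columns)) if col_names is None else [cols_to_idx[c] for c in col_names]
    return [[matrix[r][c] for c in cols] for r in rows]


picked = select_vectors(["utt_2", "utt_0"], ["c", "a"])
everything = select_vectors()
```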
-
static
hstack
(name: str, matrices: List[ConvoKitMatrix])¶ Combines multiple ConvoKitMatrices into a single ConvoKitMatrix by stacking them horizontally (i.e. each constituent matrix must have the same ids).
- Parameters
name – name of new matrix
matrices – constituent ConvoKitMatrices
- Returns
a new ConvoKitMatrix
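Horizontal stacking can be illustrated with a small pure-Python sketch: the rows (ids) stay the same and the column sets are concatenated. The dict layout and the helper hstack_rows are hypothetical and only mirror the idea, not ConvoKit's implementation.

```python
def hstack_rows(left, right):
    """Concatenate the columns of two matrices that share the same ids."""
    if left["ids"] != right["ids"]:
        raise ValueError("horizontal stacking requires identical ids")
    return {
        "ids": list(left["ids"]),
        "columns": left["columns"] + right["columns"],
        # Join each pair of corresponding rows side by side.
        "matrix": [l + r for l, r in zip(left["matrix"], right["matrix"])],
    }


# Hypothetical feature matrices over the same two utterances.
lengths = {"ids": ["utt_0", "utt_1"], "columns": ["length"], "matrix": [[12], [7]]}
scores = {"ids": ["utt_0", "utt_1"], "columns": ["score"], "matrix": [[0.3], [0.9]]}

wide = hstack_rows(lengths, scores)
```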
-
subset
(ids: Optional[List[str]] = None, columns: Optional[List[str]] = None)¶ Get a (subset) copy of the ConvoKitMatrix object according to the specified subset of ids and columns.
- Parameters
ids – list of ids to be included in the subset; all by default
columns – list of columns to be included in the subset; all by default
- Returns
a new ConvoKitMatrix object with the subset of ids and columns
-
to_dataframe
() → pandas.DataFrame¶ Converts the matrix of vectors into a pandas DataFrame.
- Returns
a pandas DataFrame
-
static
vstack
(name: str, matrices: List[ConvoKitMatrix])¶ Combines multiple ConvoKitMatrices into a single ConvoKitMatrix by stacking them vertically (i.e. each constituent matrix must have the same columns).
- Parameters
name – name of new matrix
matrices – constituent ConvoKitMatrices
- Returns
a new ConvoKitMatrix
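When the constituent matrices share the same columns, combining them appends rows: the column names are kept and the ids are concatenated. The sketch below shows this in plain Python; the dict layout and the helper vstack_rows are hypothetical and do not reflect ConvoKit's internals.

```python
def vstack_rows(top, bottom):
    """Append the rows of two matrices that share the same columns."""
    if top["columns"] != bottom["columns"]:
        raise ValueError("stacking rows requires identical columns")
    return {
        "ids": top["ids"] + bottom["ids"],
        "columns": list(top["columns"]),
        # Rows of the second matrix follow the rows of the first.
        "matrix": top["matrix"] + bottom["matrix"],
    }


# Hypothetical matrices with the same features over different utterances.
first = {"ids": ["utt_0"], "columns": ["length", "score"], "matrix": [[12, 0.3]]}
second = {"ids": ["utt_1"], "columns": ["length", "score"], "matrix": [[7, 0.9]]}

tall = vstack_rows(first, second)
```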