LLMPromptTransformer

The LLMPromptTransformer is a flexible ConvoKit transformer that allows you to apply custom LLM prompts to corpus objects at different levels (utterances, conversations, speakers, or the entire corpus). It provides fine-grained control over how objects are formatted for LLM processing and where the results are stored as metadata.

This transformer is part of the GenAI module (see GenAI) and integrates seamlessly with the GenAI client infrastructure to support multiple LLM providers (OpenAI GPT, Google Gemini, and local models).

Example usage: GenAI module demo.

class convokit.genai.llmprompttransformer.LLMPromptTransformer(provider: str, model: str, object_level: str, prompt: str, formatter: Callable[[Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]], str], metadata_name: str, selector: Optional[Callable[[Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]], bool]] = None, config_manager: Optional[convokit.genai.genai_config.GenAIConfigManager] = None, llm_kwargs: Optional[Dict[str, Any]] = None)

A ConvoKit Transformer that uses GenAI clients to process objects and store outputs as metadata.

This transformer applies LLM prompts to different levels of the corpus (conversation, speaker, utterance, corpus) using a formatter function to prepare the object data for the prompt, and stores the LLM responses as metadata.

Parameters
  • provider – LLM provider name ("gpt", "gemini", "local", etc.)

  • model – LLM model name

  • object_level – Object level at which to apply the transformer ("conversation", "speaker", "utterance", "corpus")

  • prompt – Template string for the prompt. Must contain '{formatted_object}' as a placeholder where the formatted object data will be inserted

  • formatter – Function that takes an object and returns a string representation that will replace the '{formatted_object}' placeholder in the prompt

  • metadata_name – Name of the metadata field to store the LLM response

  • selector – Optional function to filter which objects to process. Defaults to processing all objects

  • config_manager – GenAIConfigManager instance for LLM API key management

  • llm_kwargs – Additional keyword arguments to pass to the LLM client
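The prompt, formatter, and selector parameters fit together as follows: the selector decides whether an object is processed, the formatter turns the object into a string, and that string replaces the '{formatted_object}' placeholder in the prompt template. The sketch below illustrates this interaction with plain Python; `FakeUtterance`, `format_utterance`, and `keep_nonempty` are illustrative stand-ins, not part of the ConvoKit API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeUtterance:
    """Illustrative stand-in for convokit.model.utterance.Utterance."""
    id: str
    text: str
    meta: dict = field(default_factory=dict)


# formatter: takes an object, returns the string that replaces
# the {formatted_object} placeholder in the prompt template
def format_utterance(utt):
    return f"[{utt.id}] {utt.text}"


# selector: returns True for objects that should be processed
def keep_nonempty(utt):
    return len(utt.text) > 0


# prompt template: must contain the {formatted_object} placeholder
prompt = "Classify the sentiment of this utterance:\n{formatted_object}"

utt = FakeUtterance(id="u0", text="What a great talk!")
if keep_nonempty(utt):
    final_prompt = prompt.format(formatted_object=format_utterance(utt))
print(final_prompt)
```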

transform(corpus: convokit.model.corpus.Corpus) → convokit.model.corpus.Corpus

Apply the GenAI transformer to the corpus.

Parameters

corpus – The corpus to transform

Returns

The transformed corpus with LLM responses added as metadata
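Conceptually, transform iterates over the objects at the configured object_level, skips those the selector rejects, formats each one into the prompt, sends the prompt to the LLM, and writes the response into the object's metadata under metadata_name. The following is a simplified sketch of that loop at the utterance level; `MiniUtterance`, `stub_llm`, and `transform_sketch` are illustrative stand-ins (the real transformer operates on ConvoKit corpus objects and calls a configured GenAI client).

```python
def stub_llm(prompt_text):
    """Stand-in for a call to the configured LLM provider."""
    return f"response to {len(prompt_text)} chars"


class MiniUtterance:
    """Illustrative stand-in for a ConvoKit utterance with metadata."""
    def __init__(self, text):
        self.text = text
        self.meta = {}


def transform_sketch(utterances, prompt, formatter, metadata_name,
                     selector=lambda obj: True):
    for utt in utterances:
        if not selector(utt):
            continue  # skip objects the selector filters out
        formatted = formatter(utt)
        response = stub_llm(prompt.format(formatted_object=formatted))
        utt.meta[metadata_name] = response  # store LLM output as metadata
    return utterances


utts = [MiniUtterance("hello"), MiniUtterance("")]
transform_sketch(utts, "Summarize: {formatted_object}",
                 formatter=lambda u: u.text,
                 metadata_name="llm_summary",
                 selector=lambda u: u.text != "")
print(utts[0].meta, utts[1].meta)
```

Note how the second (empty) utterance is left untouched: objects rejected by the selector get no metadata entry at all.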

transform_single(obj: Union[str, convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]) → Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]

Transform a single object (utterance, conversation, speaker, or corpus) with the LLM prompt. This method allows users to easily test their prompt on a single unit without processing an entire corpus.

Parameters

obj – The object to transform. Can be a string (which will be converted to an Utterance with a default speaker), or an Utterance, Conversation, or Speaker object

Returns

The transformed object with LLM response stored in metadata
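The convenience of transform_single is that a bare string can be passed directly when prototyping a prompt: the string is wrapped in an utterance with a default speaker before prompting. The sketch below illustrates that string-handling path with a stubbed LLM call; `MiniUtterance`, `transform_single_sketch`, and the "default_speaker" name are illustrative assumptions, not the actual ConvoKit internals.

```python
class MiniUtterance:
    """Illustrative stand-in for a ConvoKit utterance."""
    def __init__(self, text, speaker="default_speaker"):
        self.text = text
        self.speaker = speaker
        self.meta = {}


def transform_single_sketch(obj, prompt, formatter, metadata_name):
    if isinstance(obj, str):
        # a bare string is wrapped in an utterance with a default speaker
        obj = MiniUtterance(obj)
    final_prompt = prompt.format(formatted_object=formatter(obj))
    # stands in for the provider call; real output is the LLM response
    obj.meta[metadata_name] = f"stub reply ({len(final_prompt)} chars)"
    return obj


result = transform_single_sketch("Is this rude?",
                                 prompt="Judge: {formatted_object}",
                                 formatter=lambda u: u.text,
                                 metadata_name="politeness")
print(result.speaker, result.meta)
```

This makes it cheap to iterate on a prompt and formatter against a handful of examples before committing to a full-corpus transform.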