LLMPromptTransformer
The LLMPromptTransformer is a flexible ConvoKit transformer that allows you to apply custom LLM prompts to corpus objects at different levels (utterances, conversations, speakers, or the entire corpus). It provides fine-grained control over how objects are formatted for LLM processing and where the results are stored as metadata.
This transformer is part of the GenAI module (see GenAI) and integrates seamlessly with the GenAI client infrastructure to support multiple LLM providers (OpenAI GPT, Google Gemini, and local models).
Example usage: GenAI module demo.
class convokit.genai.llmprompttransformer.LLMPromptTransformer(provider: str, model: str, object_level: str, prompt: str, formatter: Callable[[Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]], str], metadata_name: str, selector: Optional[Callable[[Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]], bool]] = None, config_manager: Optional[convokit.genai.genai_config.GenAIConfigManager] = None, llm_kwargs: Optional[Dict[str, Any]] = None)

A ConvoKit Transformer that uses GenAI clients to process objects and store their outputs as metadata.

This transformer applies LLM prompts at a chosen level of the corpus (conversation, speaker, utterance, or corpus), using a formatter function to prepare each object's data for the prompt, and stores the LLM responses as metadata; a usage sketch follows the parameter list below.
- Parameters
provider – LLM provider name (“gpt”, “gemini”, “local”, etc.)
model – LLM model name
object_level – Object level at which to apply the transformer (“conversation”, “speaker”, “utterance”, “corpus”)
prompt – Template string for the prompt. Must contain ‘{formatted_object}’ as a placeholder where the formatted object data will be inserted
formatter – Function that takes an object and returns a string representation that will replace the ‘{formatted_object}’ placeholder in the prompt
metadata_name – Name of the metadata field to store the LLM response
selector – Optional function to filter which objects to process. Defaults to processing all objects
config_manager – GenAIConfigManager instance for LLM API key management
llm_kwargs – Additional keyword arguments to pass to the LLM client
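For orientation, here is a minimal sketch of constructing the transformer at the utterance level. The model name ("gpt-4o"), the prompt, and the metadata field name are illustrative assumptions, not part of ConvoKit; substitute whatever model your provider supports.

    from convokit.genai.llmprompttransformer import LLMPromptTransformer

    # Formatter: produce the string that replaces '{formatted_object}'
    # in the prompt template for each utterance.
    def format_utterance(utt):
        return utt.text

    sentiment_labeler = LLMPromptTransformer(
        provider="gpt",
        model="gpt-4o",  # assumed model name; use any model your provider supports
        object_level="utterance",
        prompt=(
            "Label the sentiment of the following message as "
            "positive, negative, or neutral:\n{formatted_object}"
        ),
        formatter=format_utterance,
        metadata_name="sentiment",
        selector=lambda utt: len(utt.text) > 0,  # skip empty utterances
    )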
transform(corpus: convokit.model.corpus.Corpus) → convokit.model.corpus.Corpus

Apply the GenAI transformer to the corpus.
- Parameters
corpus – The corpus to transform
- Returns
The transformed corpus with LLM responses added as metadata
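Continuing the sketch above, transform runs the prompt over every selected object and writes each response under the configured metadata_name. The corpus name below is an arbitrary example:

    from convokit import Corpus, download

    corpus = Corpus(filename=download("subreddit-Cornell"))  # example dataset
    corpus = sentiment_labeler.transform(corpus)

    # Each processed utterance now carries the raw LLM response.
    for utt in corpus.iter_utterances():
        print(utt.id, utt.meta.get("sentiment"))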
transform_single(obj: Union[str, convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]) → Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]

Transform a single object (utterance, conversation, speaker, or corpus) with the LLM prompt. This method allows users to easily test their prompt on a single unit without processing an entire corpus.
- Parameters
obj – The object to transform: either a string (which will be converted to an Utterance with a default speaker) or an Utterance, Conversation, or Speaker object
- Returns
The transformed object with LLM response stored in metadata
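Since a bare string is accepted and wrapped in an Utterance, transform_single is a convenient way to check a prompt before committing to a full corpus run. Continuing the sketch above:

    # Test the prompt on a single string; the response is stored in the
    # metadata field configured as metadata_name ("sentiment" here).
    utt = sentiment_labeler.transform_single("I really enjoyed this lecture!")
    print(utt.meta["sentiment"])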