LLMPromptTransformer

The LLMPromptTransformer is a flexible ConvoKit transformer that allows you to apply custom LLM prompts to corpus objects at different levels (utterances, conversations, speakers, or the entire corpus). It provides fine-grained control over how objects are formatted for LLM processing and where the results are stored as metadata.

This transformer is part of the GenAI module (see GenAI) and integrates seamlessly with the GenAI client infrastructure to support multiple LLM providers (OpenAI GPT, Google Gemini, and local models).

Example usage: GenAI module demo.

class convokit.genai.llmprompttransformer.LLMPromptTransformer(provider: str, model: str, object_level: str, prompt: str, formatter: Callable[[Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]], str], metadata_name: str, selector: Optional[Callable[[Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]], bool]] = None, config_manager: Optional[convokit.genai.genai_config.GenAIConfigManager] = None, llm_kwargs: Optional[Dict[str, Any]] = None)

A ConvoKit Transformer that uses GenAI clients to process objects and store outputs as metadata.

This transformer applies LLM prompts to different levels of the corpus (conversation, speaker, utterance, corpus) using a formatter function to prepare the object data for the prompt, and stores the LLM responses as metadata.

Parameters
  • provider – LLM provider name ("gpt", "gemini", "local", etc.)

  • model – LLM model name

  • object_level – Object level at which to apply the transformer ("conversation", "speaker", "utterance", "corpus")

  • prompt – Template string for the prompt. Must contain '{formatted_object}' as a placeholder where the formatted object data will be inserted

  • formatter – Function that takes an object and returns a string representation that will replace the '{formatted_object}' placeholder in the prompt

  • metadata_name – Name of the metadata field to store the LLM response

  • selector – Optional function to filter which objects to process. Defaults to processing all objects

  • config_manager – GenAIConfigManager instance for LLM API key management

  • llm_kwargs – Additional keyword arguments to pass to the LLM client
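The prompt, formatter, and selector parameters fit together as follows: the selector decides whether an object is processed, the formatter turns the object into a string, and that string replaces the '{formatted_object}' placeholder in the prompt template. The sketch below illustrates this interaction with plain Python; `FakeUtterance`, `format_utterance`, and `keep_nonempty` are illustrative stand-ins, not part of the ConvoKit API.

```python
from dataclasses import dataclass, field


@dataclass
class FakeUtterance:
    """Illustrative stand-in for convokit.model.utterance.Utterance."""
    id: str
    text: str
    meta: dict = field(default_factory=dict)


# formatter: takes an object, returns the string that replaces
# the {formatted_object} placeholder in the prompt template
def format_utterance(utt):
    return f"[{utt.id}] {utt.text}"


# selector: returns True for objects that should be processed
def keep_nonempty(utt):
    return len(utt.text) > 0


# prompt template: must contain the {formatted_object} placeholder
prompt = "Classify the sentiment of this utterance:\n{formatted_object}"

utt = FakeUtterance(id="u0", text="What a great talk!")
if keep_nonempty(utt):
    final_prompt = prompt.format(formatted_object=format_utterance(utt))
print(final_prompt)
```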

transform(corpus: convokit.model.corpus.Corpus) → convokit.model.corpus.Corpus

Apply the GenAI transformer to the corpus.

Parameters

corpus – The corpus to transform

Returns

The transformed corpus with LLM responses added as metadata
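Conceptually, transform iterates over the objects at the configured object_level, skips those the selector rejects, formats each one into the prompt, sends the prompt to the LLM, and writes the response into the object's metadata under metadata_name. The following is a simplified sketch of that loop at the utterance level; `MiniUtterance`, `stub_llm`, and `transform_sketch` are illustrative stand-ins (the real transformer operates on ConvoKit corpus objects and calls a configured GenAI client).

```python
def stub_llm(prompt_text):
    """Stand-in for a call to the configured LLM provider."""
    return f"response to {len(prompt_text)} chars"


class MiniUtterance:
    """Illustrative stand-in for a ConvoKit utterance with metadata."""
    def __init__(self, text):
        self.text = text
        self.meta = {}


def transform_sketch(utterances, prompt, formatter, metadata_name,
                     selector=lambda obj: True):
    for utt in utterances:
        if not selector(utt):
            continue  # skip objects the selector filters out
        formatted = formatter(utt)
        response = stub_llm(prompt.format(formatted_object=formatted))
        utt.meta[metadata_name] = response  # store LLM output as metadata
    return utterances


utts = [MiniUtterance("hello"), MiniUtterance("")]
transform_sketch(utts, "Summarize: {formatted_object}",
                 formatter=lambda u: u.text,
                 metadata_name="llm_summary",
                 selector=lambda u: u.text != "")
print(utts[0].meta, utts[1].meta)
```

Note how the second (empty) utterance is left untouched: objects rejected by the selector get no metadata entry at all.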

transform_single(obj: Union[str, convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]) → Union[convokit.model.corpus.Corpus, convokit.model.conversation.Conversation, convokit.model.speaker.Speaker, convokit.model.utterance.Utterance]

Transform a single object (utterance, conversation, speaker, or corpus) with the LLM prompt. This method allows users to easily test their prompt on a single unit without processing an entire corpus.

Parameters

obj – The object to transform. Can be a string (which will be converted to an Utterance with a default speaker), or an Utterance, Conversation, or Speaker object

Returns

The transformed object with LLM response stored in metadata
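The convenience of transform_single is that a bare string can be passed directly when prototyping a prompt: the string is wrapped in an utterance with a default speaker before prompting. The sketch below illustrates that string-handling path with a stubbed LLM call; `MiniUtterance`, `transform_single_sketch`, and the "default_speaker" name are illustrative assumptions, not the actual ConvoKit internals.

```python
class MiniUtterance:
    """Illustrative stand-in for a ConvoKit utterance."""
    def __init__(self, text, speaker="default_speaker"):
        self.text = text
        self.speaker = speaker
        self.meta = {}


def transform_single_sketch(obj, prompt, formatter, metadata_name):
    if isinstance(obj, str):
        # a bare string is wrapped in an utterance with a default speaker
        obj = MiniUtterance(obj)
    final_prompt = prompt.format(formatted_object=formatter(obj))
    # stands in for the provider call; real output is the LLM response
    obj.meta[metadata_name] = f"stub reply ({len(final_prompt)} chars)"
    return obj


result = transform_single_sketch("Is this rude?",
                                 prompt="Judge: {formatted_object}",
                                 formatter=lambda u: u.text,
                                 metadata_name="politeness")
print(result.speaker, result.meta)
```

This makes it cheap to iterate on a prompt and formatter against a handful of examples before committing to a full-corpus transform.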