asyncapi: 3.0.0 info: title: Deepgram API Specification version: 1.0.0 description: | APIs for speech-to-text transcription, text-to-speech synthesis, language understanding, and account management. termsOfService: https://deepgram.com/terms/ contact: email: devrel@deepgram.com url: https://community.deepgram.com name: Deepgram Developer Relations license: name: Privacy Notice url: https://deepgram.com/privacy/ externalDocs: description: Visit our docs to learn more about using Deepgram APIs url: https://developers.deepgram.com tags: - name: Listen description: Speech-to-text transcription - name: Read description: Text analysis - name: Speak description: Text-to-speech generation - name: Agent description: Conversational voice agent - name: experimental description: Experimental features servers: production: host: api.deepgram.com protocol: wss x-fern-server-name: Production security: - $ref: '#/components/securitySchemes/ApiKeyAuth' - $ref: '#/components/securitySchemes/JwtAuth' agent: host: agent.deepgram.com protocol: wss x-fern-server-name: Agent security: - $ref: '#/components/securitySchemes/ApiKeyAuth' - $ref: '#/components/securitySchemes/JwtAuth' components: operationTraits: V1AuthTrait: security: - $ref: '#/components/securitySchemes/ApiKeyAuth' - $ref: '#/components/securitySchemes/JwtAuth' V2AuthTrait: security: - $ref: '#/components/securitySchemes/ApiKeyAuth' - $ref: '#/components/securitySchemes/JwtAuth' securitySchemes: ApiKeyAuth: type: httpApiKey in: header name: Authorization x-fern-header: prefix: Token env: DEEPGRAM_API_KEY JwtAuth: type: http scheme: bearer description: Use a temporary JWT token for authentication. Generate a temporary token using your API key and pass it as a Bearer token. x-fern-token: name: authToken env: DEEPGRAM_TOKEN schemas: SpeakV1Encoding: description: Encoding allows you to specify the expected encoding of your audio output for streaming TTS. Only streaming-compatible encodings are supported. 
default: linear16 enum: - linear16 - mulaw - alaw examples: - linear16 SpeakV1MipOptOut: description: Opts out requests from the Deepgram Model Improvement Program. Refer to our Docs for pricing impacts before setting this to true. https://dpgr.am/deepgram-mip default: 'false' examples: - 'true' - 'false' SpeakV1Model: description: AI model used to process submitted text default: aura-asteria-en enum: - aura-asteria-en - aura-luna-en - aura-stella-en - aura-athena-en - aura-hera-en - aura-orion-en - aura-arcas-en - aura-perseus-en - aura-angus-en - aura-orpheus-en - aura-helios-en - aura-zeus-en - aura-2-amalthea-en - aura-2-andromeda-en - aura-2-apollo-en - aura-2-arcas-en - aura-2-aries-en - aura-2-asteria-en - aura-2-athena-en - aura-2-atlas-en - aura-2-aurora-en - aura-2-callista-en - aura-2-cordelia-en - aura-2-cora-en - aura-2-delia-en - aura-2-draco-en - aura-2-electra-en - aura-2-harmonia-en - aura-2-helena-en - aura-2-hera-en - aura-2-hermes-en - aura-2-hyperion-en - aura-2-iris-en - aura-2-janus-en - aura-2-juno-en - aura-2-jupiter-en - aura-2-luna-en - aura-2-mars-en - aura-2-minerva-en - aura-2-neptune-en - aura-2-odysseus-en - aura-2-ophelia-en - aura-2-orion-en - aura-2-orpheus-en - aura-2-pandora-en - aura-2-phoebe-en - aura-2-pluto-en - aura-2-saturn-en - aura-2-selene-en - aura-2-thalia-en - aura-2-theia-en - aura-2-vesta-en - aura-2-zeus-en - aura-2-sirio-es - aura-2-nestor-es - aura-2-carina-es - aura-2-celeste-es - aura-2-alvaro-es - aura-2-diana-es - aura-2-aquila-es - aura-2-selena-es - aura-2-estrella-es - aura-2-javier-es examples: - aura-2-thalia-en SpeakV1SampleRate: description: Sample Rate specifies the sample rate for the output audio. Based on encoding 8000 or 24000 are possible defaults. For some encodings sample rate is not configurable. 
default: '24000' enum: - '8000' - '16000' - '24000' - '32000' - '48000' x-fern-enum: '8000': name: EightThousand '16000': name: SixteenThousand '24000': name: TwentyFourThousand '32000': name: ThirtyTwoThousand '48000': name: FortyEightThousand examples: - '24000' ListenV1Callback: description: URL to which we'll make the callback request examples: - https://example.com ListenV1CallbackMethod: description: HTTP method by which the callback request will be made default: POST enum: - POST - GET - PUT - DELETE examples: - POST - GET - PUT - DELETE ListenV1Channels: description: The number of channels in the submitted audio default: '1' examples: - '1' ListenV1DetectEntities: description: Identifies and extracts key entities from content in submitted audio. Entities appear in final results. When enabled, Punctuation will also be enabled by default default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Diarize: description: Defaults to `false`. Recognize speaker changes. Each word in the transcript will be assigned a speaker number starting at 0 default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Dictation: description: Identify and extract key entities from content in submitted audio default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Encoding: description: Specify the expected encoding of your submitted audio enum: - linear16 - linear32 - flac - alaw - mulaw - amr-nb - amr-wb - opus - ogg-opus - speex - g729 examples: - linear16 ListenV1Endpointing: description: Indicates how long Deepgram will wait to detect whether a speaker has finished speaking or pauses for a significant period of time. When set to a value, the streaming endpoint immediately finalizes the transcription for the processed time range and returns the transcript with a speech_final parameter set to true. 
Can also be set to false to disable endpointing default: '10' examples: - '300' - 'false' ListenV1Extra: description: Arbitrary key-value pairs that are attached to the API response for usage in downstream processing examples: - key:value ListenV1InterimResults: description: Specifies whether the streaming endpoint should provide ongoing transcription updates as more audio is received. When set to true, the endpoint sends continuous updates, meaning transcription results may evolve over time default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Keyterm: description: Key term prompting can boost specialized terminology and brands. Only compatible with Nova-3 examples: - Snuffleupagus ListenV1Keywords: description: Keywords can boost or suppress specialized terminology and brands examples: - Twilio:2 ListenV1Language: description: The [BCP-47 language tag](https://tools.ietf.org/html/bcp47) that hints at the primary spoken language. Depending on the Model you choose only certain languages are available default: en examples: - en ListenV1MipOptOut: description: Opts out requests from the Deepgram Model Improvement Program. Refer to our Docs for pricing impacts before setting this to true. 
https://dpgr.am/deepgram-mip default: 'false' examples: - 'true' - 'false' ListenV1Model: description: AI model to use for the transcription enum: - nova-3 - nova-3-general - nova-3-medical - nova-2 - nova-2-general - nova-2-meeting - nova-2-finance - nova-2-conversationalai - nova-2-voicemail - nova-2-video - nova-2-medical - nova-2-drivethru - nova-2-automotive - nova - nova-general - nova-phonecall - nova-medical - enhanced - enhanced-general - enhanced-meeting - enhanced-phonecall - enhanced-finance - base - meeting - phonecall - finance - conversationalai - voicemail - video - custom examples: - nova-2 - custom model name ListenV1Multichannel: description: Transcribe each audio channel independently default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Numerals: description: Convert numbers from written format to numerical format default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1ProfanityFilter: description: Profanity Filter looks for recognized profanity and converts it to the nearest recognized non-profane word or removes it from the transcript completely default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Punctuate: description: Add punctuation and capitalization to the transcript default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Redact: description: Redaction removes sensitive information from your transcripts default: 'false' enum: - 'true' - 'false' - pci - numbers - aggressive_numbers - ssn examples: - 'true' ListenV1Replace: description: Search for terms or phrases in submitted audio and replaces them examples: - monika:Monica ListenV1SampleRate: description: Sample rate of submitted audio. Required (and only read) when a value is provided for encoding examples: - '8000' ListenV1Search: description: Search for terms or phrases in submitted audio examples: - Deepgram - Text to Speech ListenV1SmartFormat: description: Apply formatting to transcript output. 
When set to true, additional formatting will be applied to transcripts to improve readability default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Tag: description: Label your requests for the purpose of identification during usage reporting examples: - my-team - marketing%20team ListenV1UtteranceEndMs: description: Indicates how long Deepgram will wait to send an UtteranceEnd message after a word has been transcribed. Use with interim_results examples: - '1000' ListenV1VadEvents: description: Indicates that speech has started. You'll begin receiving Speech Started messages upon speech starting default: 'false' enum: - 'true' - 'false' examples: - 'true' ListenV1Version: description: Version of an AI model to use default: latest examples: - MODEL_VERSION ListenV2Model: description: Defines the AI model used to process submitted audio. enum: - flux-general-en examples: - flux-general-en ListenV2Encoding: description: Encoding of the audio stream. Required if sending non-containerized/raw audio. If sending containerized audio, this parameter should be omitted. enum: - linear16 - linear32 - mulaw - alaw - opus - ogg-opus examples: - linear16 ListenV2SampleRate: description: Sample rate of the audio stream in Hz. Required if sending non-containerized/raw audio. If sending containerized audio, this parameter should be omitted. examples: - '16000' ListenV2EagerEotThreshold: description: | End-of-turn confidence required to fire an eager end-of-turn event. When set, enables `EagerEndOfTurn` and `TurnResumed` events. Valid Values 0.3 - 0.9. examples: - '0.4' - '0.5' - '0.6' ListenV2EotThreshold: description: | End-of-turn confidence required to finish a turn. Valid Values 0.5 - 0.9. default: '0.7' examples: - '0.5' - '0.7' - '0.8' - '0.9' ListenV2EotTimeoutMs: description: | A turn will be finished when this much time has passed after speech, regardless of EOT confidence. 
default: '5000' ListenV2Keyterm: oneOf: - type: string - type: array items: type: string description: | Keyterm prompting can improve recognition of specialized terminology. Pass multiple keyterm query parameters to boost multiple keyterms. examples: - Snuffleupagus - - Snuffleupagus - BigBird - OscarTheGrouch ListenV2MipOptOut: description: | Opts out requests from the Deepgram Model Improvement Program. Refer to our Docs for pricing impacts before setting this to true. https://dpgr.am/deepgram-mip examples: - 'true' ListenV2Tag: description: | Label your requests for the purpose of identification during usage reporting examples: - production - staging - client-xyz OpenAiThinkProvider: type: object required: - type - model properties: type: type: string const: open_ai version: type: string enum: - v1 description: The REST API version for the OpenAI chat completions API model: type: string description: OpenAI model to use enum: - gpt-5 - gpt-5-mini - gpt-5-nano - gpt-4.1 - gpt-4.1-mini - gpt-4.1-nano - gpt-4o - gpt-4o-mini temperature: type: number description: OpenAI temperature (0-2) minimum: 0 maximum: 2 AwsBedrockThinkProvider: type: object required: - type - model properties: type: type: string const: aws_bedrock model: type: string description: AWS Bedrock model to use enum: - anthropic/claude-3-5-sonnet-20240620-v1:0 - anthropic/claude-3-5-haiku-20240307-v1:0 temperature: type: number description: AWS Bedrock temperature (0-2) minimum: 0 maximum: 2 credentials: type: object description: AWS credentials type (STS short-lived or IAM long-lived) properties: type: type: string description: AWS credentials type (STS short-lived or IAM long-lived) enum: - sts - iam region: type: string description: AWS region access_key_id: type: string description: AWS access key secret_access_key: type: string description: AWS secret access key session_token: type: string description: AWS session token (required for STS only) AnthropicThinkProvider: type: object required: - type 
- model properties: type: type: string const: anthropic version: type: string enum: - v1 description: The REST API version for the Anthropic Messages API model: type: string description: Anthropic model to use enum: - claude-3-5-haiku-latest - claude-sonnet-4-20250514 temperature: type: number description: Anthropic temperature (0-1) minimum: 0 maximum: 1 GoogleThinkProvider: type: object required: - type - model properties: type: type: string const: google version: type: string enum: - v1beta description: The REST API version for the Google generative language API model: type: string description: Google model to use enum: - gemini-2.0-flash - gemini-2.0-flash-lite - gemini-2.5-flash temperature: type: number description: Google temperature (0-2) minimum: 0 maximum: 2 GroqThinkProvider: type: object required: - type - model properties: type: type: string const: groq version: type: string enum: - v1 description: The REST API version for the Groq's chat completions API (mostly OpenAI-compatible) model: type: string description: Groq model to use enum: - openai/gpt-oss-20b temperature: type: number description: Groq temperature (0-2) minimum: 0 maximum: 2 ThinkSettingsV1: type: object required: - provider properties: provider: type: object oneOf: - $ref: '#/components/schemas/OpenAiThinkProvider' - $ref: '#/components/schemas/AwsBedrockThinkProvider' - $ref: '#/components/schemas/AnthropicThinkProvider' - $ref: '#/components/schemas/GoogleThinkProvider' - $ref: '#/components/schemas/GroqThinkProvider' endpoint: type: object description: | Optional for non-Deepgram LLM providers. 
When present, must include url field and headers object properties: url: type: string description: Custom LLM endpoint URL headers: type: object description: Custom headers for the endpoint additionalProperties: type: string functions: type: array items: type: object properties: name: type: string description: Function name description: type: string description: Function description parameters: type: object description: Function parameters endpoint: type: object description: The Function endpoint to call. if not passed, function is called client-side properties: url: type: string description: Endpoint URL method: type: string description: HTTP method headers: type: object additionalProperties: type: string prompt: type: string context_length: description: | Specifies the number of characters retained in context between user messages, agent responses, and function calls. This setting is only configurable when a custom think endpoint is used oneOf: - title: max description: Agent will not discard context regardless of length type: string enum: - max - type: number minimum: 2 DeepgramSpeakProvider: type: object required: - type - model properties: type: type: string const: deepgram version: type: string enum: - v1 description: The REST API version for the Deepgram text-to-speech API model: type: string enum: - aura-asteria-en - aura-luna-en - aura-stella-en - aura-athena-en - aura-hera-en - aura-orion-en - aura-arcas-en - aura-perseus-en - aura-angus-en - aura-orpheus-en - aura-helios-en - aura-zeus-en - aura-2-amalthea-en - aura-2-andromeda-en - aura-2-apollo-en - aura-2-arcas-en - aura-2-aries-en - aura-2-asteria-en - aura-2-athena-en - aura-2-atlas-en - aura-2-aurora-en - aura-2-callista-en - aura-2-cora-en - aura-2-cordelia-en - aura-2-delia-en - aura-2-draco-en - aura-2-electra-en - aura-2-harmonia-en - aura-2-helena-en - aura-2-hera-en - aura-2-hermes-en - aura-2-hyperion-en - aura-2-iris-en - aura-2-janus-en - aura-2-juno-en - aura-2-jupiter-en - aura-2-luna-en 
- aura-2-mars-en - aura-2-minerva-en - aura-2-neptune-en - aura-2-odysseus-en - aura-2-ophelia-en - aura-2-orion-en - aura-2-orpheus-en - aura-2-pandora-en - aura-2-phoebe-en - aura-2-pluto-en - aura-2-saturn-en - aura-2-selene-en - aura-2-thalia-en - aura-2-theia-en - aura-2-vesta-en - aura-2-zeus-en - aura-2-sirio-es - aura-2-nestor-es - aura-2-carina-es - aura-2-celeste-es - aura-2-alvaro-es - aura-2-diana-es - aura-2-aquila-es - aura-2-selena-es - aura-2-estrella-es - aura-2-javier-es description: Deepgram TTS model ElevenLabsSpeakProvider: type: object required: - type - model_id properties: type: type: string const: eleven_labs version: type: string enum: - v1 description: The REST API version for the ElevenLabs text-to-speech API model_id: type: string enum: - eleven_turbo_v2_5 - eleven_monolingual_v1 - eleven_multilingual_v2 description: Eleven Labs model ID language: type: string description: Optional language to use, e.g. 'en-US'. Corresponds to the `language_code` parameter in the ElevenLabs API language_code: type: string deprecated: true description: Use the `language` field instead. 
CartesiaSpeakProvider: type: object required: - type - model_id - voice properties: type: type: string const: cartesia version: type: string enum: - '2025-03-17' description: The API version header for the Cartesia text-to-speech API model_id: type: string enum: - sonic-2 - sonic-multilingual description: Cartesia model ID voice: type: object required: - mode - id properties: mode: type: string description: Cartesia voice mode id: type: string description: Cartesia voice ID language: type: string description: Cartesia language code OpenAiSpeakProvider: type: object required: - type - model - voice properties: type: type: string const: open_ai version: type: string enum: - v1 description: The REST API version for the OpenAI text-to-speech API model: type: string enum: - tts-1 - tts-1-hd description: OpenAI TTS model voice: type: string enum: - alloy - echo - fable - onyx - nova - shimmer description: OpenAI voice AwsPollySpeakProvider: type: object required: - type - voice - language - engine - credentials properties: type: type: string const: aws_polly voice: type: string enum: - Matthew - Joanna - Amy - Emma - Brian - Arthur - Aria - Ayanda description: AWS Polly voice name language: type: string description: Language code to use, e.g. 'en-US'. Corresponds to the `language_code` parameter in the AWS Polly API language_code: type: string deprecated: true description: Use the `language` field instead. 
engine: type: string enum: - generative - long-form - standard - neural credentials: type: object required: - type - region - access_key_id - secret_access_key properties: type: type: string enum: - sts - iam region: type: string access_key_id: type: string secret_access_key: type: string session_token: type: string description: Required for STS only SpeakSettingsV1: type: object required: - provider properties: provider: type: object oneOf: - $ref: '#/components/schemas/DeepgramSpeakProvider' - $ref: '#/components/schemas/ElevenLabsSpeakProvider' - $ref: '#/components/schemas/CartesiaSpeakProvider' - $ref: '#/components/schemas/OpenAiSpeakProvider' - $ref: '#/components/schemas/AwsPollySpeakProvider' endpoint: type: object description: | Optional if provider is Deepgram. Required for non-Deepgram TTS providers. When present, must include url field and headers object. Valid schemes are https and wss with wss only supported for Eleven Labs. properties: url: type: string description: | Custom TTS endpoint URL. Cannot contain `output_format` or `model_id` query parameters when the provider is Eleven Labs. headers: type: object additionalProperties: type: string messages: SpeakV1TextMessage: name: SpeakV1TextMessage description: Request to convert text to speech payload: type: object properties: type: type: string enum: - Speak description: Message type identifier example: Speak text: type: string description: The input text to be converted to speech example: Hello, world! 
required: - type - text SpeakV1ControlMessage: name: SpeakV1ControlMessage description: Control messages for managing the Text to Speech WebSocket connection payload: type: object properties: type: type: string enum: - Flush - Clear - Close description: Message type identifier required: - type SpeakV1AudioChunkEvent: name: SpeakV1AudioChunkEvent description: Audio data in the format specified by the request parameters contentType: application/octet-stream payload: type: string format: binary SpeakV1MetadataEvent: name: SpeakV1MetadataEvent description: Metadata sent after the WebSocket handshake payload: type: object properties: type: type: string enum: - Metadata description: Message type identifier request_id: type: string format: uuid description: Unique identifier for the request model_name: type: string description: Name of the model being used model_version: type: string description: Version of the primary model being used model_uuid: type: string format: uuid description: Unique identifier for the primary model used additional_model_uuids: type: array description: List of unique identifiers for any additional models used to serve the request items: type: string format: uuid required: - type - request_id - model_name - model_version - model_uuid SpeakV1ControlEvent: name: SpeakV1ControlEvent payload: type: object properties: type: type: string enum: - Flushed - Cleared description: Message type identifier sequence_id: type: number description: The sequence ID of the response required: - type - sequence_id SpeakV1WarningEvent: name: SpeakV1WarningEvent payload: type: object properties: type: type: string enum: - Warning description: Message type identifier description: type: string description: A description of what went wrong code: type: string description: Error code identifying the type of error required: - type - description - code ListenV1MediaMessage: name: ListenV1MediaMessage description: Audio data is transmitted as raw binary WebSocket messages 
contentType: application/octet-stream payload: type: string format: binary ListenV1ControlMessage: name: ListenV1ControlMessage description: Control messages for managing the Speech to Text WebSocket connection payload: type: object properties: type: type: string enum: - Finalize - CloseStream - KeepAlive description: Message type identifier required: - type ListenV1ResultsEvent: name: ListenV1ResultsEvent description: Deepgram has responded with a transcription payload: type: object properties: type: type: string enum: - Results description: Message type identifier channel_index: type: array items: type: number description: The index of the channel duration: type: number description: The duration of the transcription start: type: number description: The start time of the transcription is_final: type: boolean description: Whether the transcription is final speech_final: type: boolean description: Whether the transcription is speech final channel: type: object properties: alternatives: type: array items: type: object properties: transcript: type: string description: The transcript of the transcription confidence: type: number description: The confidence of the transcription languages: type: array items: type: string description: The languages of the transcription words: type: array items: type: object properties: word: type: string description: The word of the transcription start: type: number description: The start time of the word end: type: number description: The end time of the word confidence: type: number description: The confidence of the word language: type: string description: The language of the word punctuated_word: type: string description: The punctuated word of the word speaker: type: number description: The speaker of the word required: - word - start - end - confidence required: - transcript - confidence - words required: - alternatives metadata: type: object properties: request_id: type: string description: The request ID model_info: type: object 
properties: name: type: string description: The name of the model version: type: string description: The version of the model arch: type: string description: The arch of the model required: - name - version - arch model_uuid: type: string description: The model UUID required: - request_id - model_info - model_uuid from_finalize: type: boolean description: Whether the transcription is from a finalize message entities: type: array description: Extracted entities from the audio when detect_entities is enabled. Only present in is_final messages. Returns an empty array if no entities are detected items: type: object properties: label: type: string description: The type/category of the entity (e.g., NAME, PHONE_NUMBER, EMAIL_ADDRESS, ORGANIZATION, CARDINAL) value: type: string description: The formatted text representation of the entity raw_value: type: string description: The original spoken text of the entity (present when formatting is enabled) confidence: type: number description: The confidence score of the entity detection start_word: type: integer description: The index of the first word of the entity in the transcript (inclusive) end_word: type: integer description: The index of the last word of the entity in the transcript (exclusive) required: - label - value - raw_value - confidence - start_word - end_word required: - type - channel_index - duration - start - channel - metadata ListenV1MetadataEvent: name: ListenV1MetadataEvent description: Metadata event - these are usually information describing the connection payload: type: object properties: type: type: string enum: - Metadata description: Message type identifier transaction_key: type: string description: The transaction key deprecated: true request_id: type: string description: The request ID format: uuid sha256: type: string description: The sha256 created: type: string description: The created duration: type: number description: The duration channels: type: number description: The channels required: - 
type - transaction_key - request_id - sha256 - created - duration - channels ListenV1UtteranceEndEvent: name: ListenV1UtteranceEndEvent description: An utterance has ended payload: type: object properties: type: type: string enum: - UtteranceEnd description: Message type identifier channel: type: array items: type: number description: The channel last_word_end: type: number description: The last word end required: - type - channel - last_word_end ListenV1SpeechStartedEvent: name: ListenV1SpeechStartedEvent description: | `vad_events` is true and speech has been detected payload: type: object properties: type: type: string enum: - SpeechStarted description: Message type identifier channel: type: array items: type: number description: The channel timestamp: type: number description: The timestamp required: - type - channel - timestamp ListenV2MediaMessage: name: ListenV2MediaMessage description: Audio data is transmitted as raw binary WebSocket messages contentType: application/octet-stream payload: type: string format: binary ListenV2ControlMessage: name: ListenV2ControlMessage description: Control messages for managing the Speech to Text WebSocket connection payload: type: object properties: type: type: string enum: - Finalize - CloseStream - KeepAlive description: Message type identifier required: - type ListenV2ConnectedEvent: name: ListenV2ConnectedEvent description: This message is sent at the start of each connection, indicating the connection is active. payload: type: object required: - type - request_id - sequence_id properties: type: type: string enum: - Connected description: Message type identifier request_id: type: string format: uuid description: The unique identifier of the request sequence_id: type: number minimum: 0 description: | Starts at `0` and increments for each message the server sends to the client. This includes messages of other types, like `TurnInfo` messages. 
ListenV2TurnInfoEvent: contentType: application/json payload: type: object description: Describes the current turn and latest state of the turn required: - type - request_id - sequence_id - event - turn_index - audio_window_start - audio_window_end - transcript - words - end_of_turn_confidence properties: type: type: string const: TurnInfo request_id: type: string format: uuid description: The unique identifier of the request sequence_id: type: number minimum: 0 description: | Starts at `0` and increments for each message the server sends to the client. This includes messages of other types, like `Connected` messages. event: type: string enum: - Update - StartOfTurn - EagerEndOfTurn - TurnResumed - EndOfTurn description: | The type of event being reported. - **Update** - Additional audio has been transcribed, but the turn state hasn't changed - **StartOfTurn** - The user has begun speaking for the first time in the turn - **EagerEndOfTurn** - The system has moderate confidence that the user has finished speaking for the turn. 
This is an opportunity to begin preparing an agent reply - **TurnResumed** - The system detected that speech had ended and therefore sent an **EagerEndOfTurn** event, but speech is actually continuing for this turn - **EndOfTurn** - The user has finished speaking for the turn turn_index: type: number minimum: 0 description: The index of the current turn audio_window_start: type: number format: float minimum: 0 description: Start time in seconds of the audio range that was transcribed audio_window_end: type: number format: float minimum: 0 description: End time in seconds of the audio range that was transcribed transcript: type: string description: Text that was said over the course of the current turn words: type: array description: The words in the `transcript` items: type: object required: - word - confidence properties: word: type: string description: The individual punctuated, properly-cased word from the transcript confidence: type: number format: float minimum: 0 maximum: 1 description: Confidence that this word was transcribed correctly end_of_turn_confidence: type: number format: float minimum: 0 maximum: 1 description: Confidence that no more speech is coming in this turn examples: - payload: type: TurnInfo request_id: ad12514a-0d38-4f7e-8fba-cce10d8f174c sequence_id: 1 event: StartOfTurn turn_index: 0 audio_window_start: 0 audio_window_end: 0.1 transcript: Hello words: - word: Hello confidence: 0.8 end_of_turn_confidence: 0.12 - payload: type: TurnInfo request_id: ad12514a-0d38-4f7e-8fba-cce10d8f174c sequence_id: 8 event: Update turn_index: 0 audio_window_start: 0 audio_window_end: 0.6 transcript: Hello, how are words: - word: Hello, confidence: 0.96 - word: how confidence: 0.94 - word: are confidence: 0.97 end_of_turn_confidence: 0.05 - payload: type: TurnInfo request_id: ad12514a-0d38-4f7e-8fba-cce10d8f174c sequence_id: 11 event: EndOfTurn turn_index: 0 audio_window_start: 0 audio_window_end: 1.3 transcript: Hello, how are you? 
words: - word: Hello, confidence: 0.96 - word: how confidence: 0.94 - word: are confidence: 0.97 - word: you? confidence: 0.92 end_of_turn_confidence: 0.86 ListenV2ConfigureSuccessEvent: name: ListenV2ConfigureSuccessEvent description: Sent when a `Configure` message was successfully applied. Returns the current, up-to-date values that were applied. payload: type: object required: - type - request_id - thresholds - keyterms - sequence_id properties: type: type: string const: ConfigureSuccess description: Message type identifier request_id: type: string format: uuid description: The unique identifier of the request thresholds: type: object properties: eager_eot_threshold: $ref: '#/components/schemas/ListenV2EagerEotThreshold' eot_threshold: $ref: '#/components/schemas/ListenV2EotThreshold' eot_timeout_ms: $ref: '#/components/schemas/ListenV2EotTimeoutMs' description: | Updates each parameter, if it is supplied. If a particular threshold parameter is not supplied, the configuration continues using the currently configured value. keyterms: $ref: '#/components/schemas/ListenV2Keyterm' sequence_id: type: number minimum: 0 description: | Starts at `0` and increments for each message the server sends to the client. This includes messages of other types, like `TurnInfo` messages. ListenV2ConfigureFailureEvent: name: ListenV2ConfigureFailureEvent description: Indicates that a Configure message was rejected payload: type: object required: - type - request_id - sequence_id properties: type: type: string const: ConfigureFailure description: Message type identifier request_id: type: string format: uuid description: The unique identifier of the request sequence_id: type: number minimum: 0 description: | Starts at `0` and increments for each message the server sends to the client. This includes messages of other types, like `TurnInfo` messages. 
ListenV2FatalErrorEvent: name: ListenV2FatalErrorEvent description: Receive an error message from the server when a fatal error occurs payload: type: object required: - type - sequence_id - description - code properties: type: type: string enum: - Error description: Message type identifier sequence_id: type: number minimum: 0 description: | Starts at `0` and increments for each message the server sends to the client. This includes messages of other types, like `Connected` messages. code: type: string description: A string code describing the error, e.g. `INTERNAL_SERVER_ERROR` description: type: string description: Prose description of the error AgentV1SettingsMessage: name: AgentV1SettingsMessage description: Configures the voice agent and sets the input and output audio formats payload: type: object properties: type: type: string const: Settings tags: type: array description: Tags to associate with the request items: type: string description: A tag associated with the request; can be used for filtered searching experimental: type: boolean default: false description: Enables experimental features flags: type: object properties: history: type: boolean default: true description: Enable or disable history message reporting mip_opt_out: type: boolean default: false description: Opts out of the Deepgram Model Improvement Program audio: type: object properties: input: type: object description: Audio input configuration settings. If omitted, defaults to encoding=linear16 and sample_rate=24000. Higher sample rates like 44100 Hz provide better audio quality. required: - encoding - sample_rate properties: encoding: type: string default: linear16 enum: - linear16 - linear32 - flac - alaw - mulaw - amr-nb - amr-wb - opus - ogg-opus - speex - g729 description: Audio encoding format sample_rate: type: number default: 24000 description: Sample rate in Hz. 
Common values are 16000, 24000, 44100, 48000 output: type: object description: Audio output configuration settings properties: encoding: type: string default: linear16 enum: - linear16 - mulaw - alaw description: Audio encoding format for streaming TTS output sample_rate: type: number description: Sample rate in Hz bitrate: type: number description: Audio bitrate in bits per second container: type: string description: Audio container format. If omitted, defaults to 'none' agent: oneOf: - type: object properties: language: type: string default: en description: Deprecated. Use `listen.provider.language` and `speak.provider.language` fields instead. deprecated: true context: type: object description: Conversation context including the history of messages and function calls properties: messages: type: array description: Conversation history as a list of messages and function calls items: type: object description: A message here is either a conversational message or a function call oneOf: - type: object description: Conversation text as part of the conversation history required: - type - role - content properties: type: type: string const: History description: Message type identifier for conversation text role: type: string enum: - user - assistant description: Identifies who spoke the statement content: type: string description: The actual statement that was spoken - type: object description: Client-side or server-side function call request and response as part of the conversation history required: - type - function_calls properties: type: type: string const: History function_calls: type: array description: List of function call objects items: type: object required: - id - name - client_side - arguments - response properties: id: type: string description: Unique identifier for the function call name: type: string description: Name of the function called client_side: type: boolean description: Indicates if the call was client-side or server-side arguments: type: string 
description: Arguments passed to the function response: type: string description: Response from the function call listen: type: object properties: provider: type: object oneOf: - type: object required: - type properties: type: type: string const: deepgram description: Provider type for speech-to-text version: type: string const: v1 description: Specifies usage of the V1 Deepgram speech-to-text API model: type: string description: Model to use for speech to text using the V1 API (e.g. Nova-3, Nova-2) language: type: string default: en-US description: Language code to use for speech-to-text. Can be a BCP-47 language tag (e.g. `en`), or `multi` for code-switching transcription keyterms: type: array items: type: string description: Prompt keyterm recognition to improve Keyword Recall Rate smart_format: type: boolean default: false description: Applies smart formatting to improve transcript readability - type: object required: - type - model properties: type: type: string const: deepgram description: Provider type for speech-to-text version: type: string const: v2 description: Specifies usage of the V2 Deepgram speech-to-text API (e.g. Flux) model: type: string description: Model to use for speech to text using the V2 API (e.g. 
flux-general-en) keyterms: type: array items: type: string description: Prompt keyterm recognition to improve Keyword Recall Rate think: oneOf: - $ref: '#/components/schemas/ThinkSettingsV1' - type: array items: $ref: '#/components/schemas/ThinkSettingsV1' speak: oneOf: - $ref: '#/components/schemas/SpeakSettingsV1' - type: array items: $ref: '#/components/schemas/SpeakSettingsV1' greeting: type: string description: Optional message that agent will speak at the start - type: string format: uuid description: The ID of an agent created using the agent builder required: - type - audio - agent AgentV1UpdateSpeakMessage: description: Send a message to change the Speak model in the middle of a conversation payload: type: object properties: type: type: string const: UpdateSpeak description: Message type identifier for updating the speak model speak: oneOf: - $ref: '#/components/schemas/SpeakSettingsV1' - type: array items: $ref: '#/components/schemas/SpeakSettingsV1' required: - type - speak AgentV1UpdateThinkMessage: description: Send a message to change the Think model in the middle of a conversation payload: type: object properties: type: type: string const: UpdateThink description: Message type identifier for updating the think model think: oneOf: - $ref: '#/components/schemas/ThinkSettingsV1' - type: array items: $ref: '#/components/schemas/ThinkSettingsV1' required: - type - think AgentV1UpdatePromptMessage: description: Send a message to update the system prompt of the agent payload: type: object properties: type: type: string const: UpdatePrompt description: Message type identifier for prompt update request prompt: type: string description: The new system prompt to be used by the agent required: - type - prompt AgentV1InjectUserMessageMessage: description: Send a text based message to the agent payload: type: object properties: type: type: string const: InjectUserMessage description: Message type identifier for injecting a user message content: type: string 
description: The specific phrase or statement the agent should respond to required: - type - content AgentV1InjectAgentMessageMessage: description: Immediately trigger an agent response during a conversation payload: type: object properties: type: type: string const: InjectAgentMessage description: Message type identifier for injecting an agent message message: type: string description: The statement that the agent should say required: - type - message AgentV1InjectionRefusedEvent: description: Receive injection refused message payload: type: object properties: type: type: string const: InjectionRefused description: Message type identifier for injection refused message: type: string description: Details about why the injection was refused required: - type - message AgentV1FunctionCallResponseMessage: payload: type: object description: | Function call response message used bidirectionally: • **Client → Server**: Response after client executes a function marked as client_side: true • **Server → Client**: Response after server executes a function marked as client_side: false The same message structure serves both directions, enabling a unified interface for function call responses regardless of execution location. properties: type: type: string const: FunctionCallResponse description: Message type identifier for function call responses id: type: string description: | The unique identifier for the function call. 
• **Required for client responses**: Should match the id from the corresponding `FunctionCallRequest` • **Optional for server responses**: Server may omit when responding to internal function executions name: type: string description: The name of the function being called content: type: string description: The content or result of the function call required: - type - name - content examples: - name: Client-side function response summary: Client responding after executing a weather lookup function payload: type: FunctionCallResponse id: func_12345 name: get_weather content: '{"temperature": 72, "condition": "sunny"}' - name: Server-side function response summary: Server responding after executing an internal database query payload: type: FunctionCallResponse name: query_database content: '{"results": [{"id": 1, "name": "example"}]}' AgentV1ControlMessage: description: Send a control message to the agent payload: type: object description: Send a control message to the agent properties: type: type: string enum: - KeepAlive description: Message type identifier required: - type AgentV1WelcomeMessage: description: Confirms that the WebSocket connection has been successfully opened payload: type: object properties: type: type: string const: Welcome description: Message type identifier for welcome message request_id: type: string description: Unique identifier for the request required: - type - request_id AgentV1SettingsAppliedEvent: description: Confirm the server has successfully received and applied the Settings message payload: type: object properties: type: type: string const: SettingsApplied description: Message type identifier for settings applied confirmation required: - type AgentV1ConversationTextEvent: description: Facilitate real-time communication by relaying spoken statements from both the user and the assistant payload: type: object properties: type: type: string const: ConversationText description: Message type identifier for conversation text role: type: 
string enum: - user - assistant description: Identifies who spoke the statement content: type: string description: The actual statement that was spoken required: - type - role - content AgentV1UserStartedSpeakingEvent: description: Notify the client that the user has begun speaking payload: type: object properties: type: type: string const: UserStartedSpeaking description: Message type identifier indicating that the user has begun speaking required: - type AgentV1AgentThinkingEvent: description: Inform the client when the agent is processing information payload: type: object properties: type: type: string const: AgentThinking description: Message type identifier for agent thinking content: type: string description: The text of the agent's thought process required: - type - content AgentV1FunctionCallRequestEvent: name: AgentV1FunctionCallRequestEvent description: Client-side or server-side function call request sent by the server payload: type: object properties: type: type: string const: FunctionCallRequest description: Message type identifier for function call requests functions: type: array description: Array of functions to be called items: type: object required: - id - name - arguments - client_side properties: id: type: string description: Unique identifier for the function call name: type: string description: The name of the function to call arguments: type: string description: JSON string containing the function arguments client_side: type: boolean description: Whether the function should be executed client-side required: - type - functions AgentV1AgentStartedSpeakingEvent: name: AgentV1AgentStartedSpeakingEvent description: Get notified when the server begins streaming an agent's audio response for playback. 
This message is only sent when the experimental flag is enabled payload: type: object properties: type: type: string const: AgentStartedSpeaking description: Message type identifier for agent started speaking total_latency: type: number format: float description: Seconds from receiving the user's utterance to producing the agent's reply tts_latency: type: number format: float description: The portion of total latency attributable to text-to-speech ttt_latency: type: number format: float description: The portion of total latency attributable to text-to-text (usually an LLM) required: - type - total_latency - tts_latency - ttt_latency AgentV1AgentAudioDoneEvent: name: AgentV1AgentAudioDoneEvent description: Signals that the server has finished sending the final audio segment to the client payload: type: object properties: type: type: string const: AgentAudioDone description: Message type identifier indicating the agent has finished sending audio required: - type AgentV1ErrorEvent: name: AgentV1ErrorEvent description: Receive an error message from the server when an error occurs payload: type: object properties: type: type: string enum: - Error description: Message type identifier for error responses description: type: string description: A description of what went wrong code: type: string description: Error code identifying the type of error required: - type - description - code AgentV1PromptUpdatedEvent: name: AgentV1PromptUpdatedEvent description: Confirms that an UpdatePrompt message from the client has been applied payload: type: object properties: type: type: string const: PromptUpdated description: Message type identifier for prompt update confirmation required: - type AgentV1SpeakUpdatedEvent: name: AgentV1SpeakUpdatedEvent description: Confirms that an UpdateSpeak message from the client has been applied payload: type: object properties: type: type: string const: SpeakUpdated description: Message type identifier for speak update confirmation required: - 
type AgentV1ThinkUpdatedEvent: name: AgentV1ThinkUpdatedEvent description: Confirms that an UpdateThink message from the client has been applied payload: type: object properties: type: type: string const: ThinkUpdated description: Message type identifier for think update confirmation required: - type AgentV1WarningEvent: name: AgentV1WarningEvent description: Notifies the client of non-fatal errors or warnings payload: type: object description: Notifies the client of non-fatal errors or warnings properties: type: type: string enum: - Warning description: Message type identifier for warnings description: type: string description: Description of the warning code: type: string description: Warning code identifier required: - type - description - code AgentV1MediaMessage: name: AgentV1MediaMessage description: Raw binary audio data sent to the Voice Agent for processing contentType: application/octet-stream payload: type: string format: binary AgentV1AudioChunkEvent: name: AgentV1AudioChunkEvent description: Raw binary audio data generated by the Voice Agent and sent to the client contentType: application/octet-stream payload: type: string format: binary channels: SpeakV1: x-dg-public: true x-fern-sdk-group-name: - speak - v1 address: /v1/speak description: Convert text into natural-sounding speech using Deepgram's TTS WebSocket servers: - $ref: '#/servers/production' bindings: ws: query: type: object properties: encoding: $ref: '#/components/schemas/SpeakV1Encoding' mip_opt_out: $ref: '#/components/schemas/SpeakV1MipOptOut' model: $ref: '#/components/schemas/SpeakV1Model' sample_rate: $ref: '#/components/schemas/SpeakV1SampleRate' messages: SpeakV1Text: $ref: '#/components/messages/SpeakV1TextMessage' SpeakV1Flush: $ref: '#/components/messages/SpeakV1ControlMessage' SpeakV1Clear: $ref: '#/components/messages/SpeakV1ControlMessage' SpeakV1Close: $ref: '#/components/messages/SpeakV1ControlMessage' SpeakV1Audio: $ref: '#/components/messages/SpeakV1AudioChunkEvent' 
SpeakV1Metadata: $ref: '#/components/messages/SpeakV1MetadataEvent' SpeakV1Flushed: $ref: '#/components/messages/SpeakV1ControlEvent' SpeakV1Cleared: $ref: '#/components/messages/SpeakV1ControlEvent' SpeakV1Warning: $ref: '#/components/messages/SpeakV1WarningEvent' ListenV1: x-dg-public: true x-fern-sdk-group-name: - listen - v1 address: /v1/listen description: Transcribe audio and video using Deepgram's speech-to-text WebSocket servers: - $ref: '#/servers/production' bindings: ws: query: type: object properties: callback: $ref: '#/components/schemas/ListenV1Callback' callback_method: $ref: '#/components/schemas/ListenV1CallbackMethod' channels: $ref: '#/components/schemas/ListenV1Channels' detect_entities: $ref: '#/components/schemas/ListenV1DetectEntities' diarize: $ref: '#/components/schemas/ListenV1Diarize' dictation: $ref: '#/components/schemas/ListenV1Dictation' encoding: $ref: '#/components/schemas/ListenV1Encoding' endpointing: $ref: '#/components/schemas/ListenV1Endpointing' extra: $ref: '#/components/schemas/ListenV1Extra' interim_results: $ref: '#/components/schemas/ListenV1InterimResults' keyterm: $ref: '#/components/schemas/ListenV1Keyterm' keywords: $ref: '#/components/schemas/ListenV1Keywords' language: $ref: '#/components/schemas/ListenV1Language' mip_opt_out: $ref: '#/components/schemas/ListenV1MipOptOut' model: $ref: '#/components/schemas/ListenV1Model' multichannel: $ref: '#/components/schemas/ListenV1Multichannel' numerals: $ref: '#/components/schemas/ListenV1Numerals' profanity_filter: $ref: '#/components/schemas/ListenV1ProfanityFilter' punctuate: $ref: '#/components/schemas/ListenV1Punctuate' redact: $ref: '#/components/schemas/ListenV1Redact' replace: $ref: '#/components/schemas/ListenV1Replace' sample_rate: $ref: '#/components/schemas/ListenV1SampleRate' search: $ref: '#/components/schemas/ListenV1Search' smart_format: $ref: '#/components/schemas/ListenV1SmartFormat' tag: $ref: '#/components/schemas/ListenV1Tag' utterance_end_ms: $ref: 
'#/components/schemas/ListenV1UtteranceEndMs' vad_events: $ref: '#/components/schemas/ListenV1VadEvents' version: $ref: '#/components/schemas/ListenV1Version' messages: ListenV1Media: $ref: '#/components/messages/ListenV1MediaMessage' ListenV1Finalize: $ref: '#/components/messages/ListenV1ControlMessage' ListenV1CloseStream: $ref: '#/components/messages/ListenV1ControlMessage' ListenV1KeepAlive: $ref: '#/components/messages/ListenV1ControlMessage' ListenV1Results: $ref: '#/components/messages/ListenV1ResultsEvent' ListenV1Metadata: $ref: '#/components/messages/ListenV1MetadataEvent' ListenV1UtteranceEnd: $ref: '#/components/messages/ListenV1UtteranceEndEvent' ListenV1SpeechStarted: $ref: '#/components/messages/ListenV1SpeechStartedEvent' ListenV2: x-fern-sdk-group-name: - listen - v2 address: /v2/listen description: | Real-time conversational speech recognition with contextual turn detection for natural voice conversations servers: - $ref: '#/servers/production' bindings: ws: query: type: object properties: model: $ref: '#/components/schemas/ListenV2Model' encoding: $ref: '#/components/schemas/ListenV2Encoding' sample_rate: $ref: '#/components/schemas/ListenV2SampleRate' eager_eot_threshold: $ref: '#/components/schemas/ListenV2EagerEotThreshold' eot_threshold: $ref: '#/components/schemas/ListenV2EotThreshold' eot_timeout_ms: $ref: '#/components/schemas/ListenV2EotTimeoutMs' keyterm: $ref: '#/components/schemas/ListenV2Keyterm' mip_opt_out: $ref: '#/components/schemas/ListenV2MipOptOut' tag: $ref: '#/components/schemas/ListenV2Tag' required: - model messages: ListenV2Media: $ref: '#/components/messages/ListenV2MediaMessage' ListenV2CloseStream: $ref: '#/components/messages/ListenV2ControlMessage' ListenV2Configure: name: ListenV2Configure description: Send a Configure message to update Flux settings payload: type: object required: - type properties: type: type: string const: Configure description: Message type identifier thresholds: type: object properties: 
eager_eot_threshold: $ref: '#/components/schemas/ListenV2EagerEotThreshold' eot_threshold: $ref: '#/components/schemas/ListenV2EotThreshold' eot_timeout_ms: $ref: '#/components/schemas/ListenV2EotTimeoutMs' description: | Updates each parameter, if it is supplied. If a particular threshold parameter is not supplied, the configuration continues using the currently configured value. keyterms: $ref: '#/components/schemas/ListenV2Keyterm' ListenV2Connected: $ref: '#/components/messages/ListenV2ConnectedEvent' ListenV2TurnInfo: $ref: '#/components/messages/ListenV2TurnInfoEvent' ListenV2ConfigureSuccess: $ref: '#/components/messages/ListenV2ConfigureSuccessEvent' ListenV2ConfigureFailure: $ref: '#/components/messages/ListenV2ConfigureFailureEvent' ListenV2FatalError: $ref: '#/components/messages/ListenV2FatalErrorEvent' AgentV1: x-dg-public: true x-fern-sdk-group-name: - agent - v1 address: /v1/agent/converse servers: - $ref: '#/servers/agent' description: Build a conversational voice agent using Deepgram's Voice Agent WebSocket messages: AgentV1Media: $ref: '#/components/messages/AgentV1MediaMessage' AgentV1Audio: $ref: '#/components/messages/AgentV1AudioChunkEvent' AgentV1Welcome: $ref: '#/components/messages/AgentV1WelcomeMessage' AgentV1SettingsApplied: $ref: '#/components/messages/AgentV1SettingsAppliedEvent' AgentV1ConversationText: $ref: '#/components/messages/AgentV1ConversationTextEvent' AgentV1UserStartedSpeaking: $ref: '#/components/messages/AgentV1UserStartedSpeakingEvent' AgentV1AgentThinking: $ref: '#/components/messages/AgentV1AgentThinkingEvent' AgentV1FunctionCallRequest: $ref: '#/components/messages/AgentV1FunctionCallRequestEvent' AgentV1AgentStartedSpeaking: $ref: '#/components/messages/AgentV1AgentStartedSpeakingEvent' AgentV1AgentAudioDone: $ref: '#/components/messages/AgentV1AgentAudioDoneEvent' AgentV1Error: $ref: '#/components/messages/AgentV1ErrorEvent' AgentV1Warning: $ref: '#/components/messages/AgentV1WarningEvent' AgentV1PromptUpdated: 
$ref: '#/components/messages/AgentV1PromptUpdatedEvent' AgentV1SpeakUpdated: $ref: '#/components/messages/AgentV1SpeakUpdatedEvent' AgentV1ThinkUpdated: $ref: '#/components/messages/AgentV1ThinkUpdatedEvent' AgentV1InjectionRefused: $ref: '#/components/messages/AgentV1InjectionRefusedEvent' AgentV1Settings: $ref: '#/components/messages/AgentV1SettingsMessage' AgentV1UpdateSpeak: $ref: '#/components/messages/AgentV1UpdateSpeakMessage' AgentV1UpdateThink: $ref: '#/components/messages/AgentV1UpdateThinkMessage' AgentV1InjectUserMessage: $ref: '#/components/messages/AgentV1InjectUserMessageMessage' AgentV1InjectAgentMessage: $ref: '#/components/messages/AgentV1InjectAgentMessageMessage' AgentV1SendFunctionCallResponse: $ref: '#/components/messages/AgentV1FunctionCallResponseMessage' AgentV1ReceiveFunctionCallResponse: $ref: '#/components/messages/AgentV1FunctionCallResponseMessage' AgentV1KeepAlive: $ref: '#/components/messages/AgentV1ControlMessage' AgentV1UpdatePrompt: $ref: '#/components/messages/AgentV1UpdatePromptMessage' operations: SpeakV1Text: x-fern-sdk-method-name: sendText description: Text to convert to audio action: send channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Text' SpeakV1Flush: x-fern-sdk-method-name: sendFlush description: Flush the buffer and receive the final audio for text sent so far action: send channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Flush' SpeakV1Clear: x-fern-sdk-method-name: sendClear description: Clear the buffer and start a new audio generation. 
This is a potentially destructive operation: any text remaining in the buffer is discarded action: send channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Clear' SpeakV1Close: x-fern-sdk-method-name: sendClose description: Flush the buffer and close the connection gracefully after all audio is generated action: send channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Close' SpeakV1Audio: description: Receive audio chunks as they are generated action: receive channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Audio' SpeakV1Metadata: description: Receive metadata about the audio generation action: receive channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Metadata' SpeakV1Flushed: description: Receive confirmation that the audio buffer was flushed action: receive channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Flushed' SpeakV1Cleared: description: Receive confirmation that the audio buffer was cleared action: receive channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Cleared' SpeakV1Warning: description: Receive a warning about the audio generation action: receive channel: $ref: '#/channels/SpeakV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/SpeakV1/messages/SpeakV1Warning' ListenV1Media: x-fern-sdk-method-name: sendMedia description: Send audio or video data to be transcribed action: send channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: 
- $ref: '#/channels/ListenV1/messages/ListenV1Media' ListenV1Finalize: x-fern-sdk-method-name: sendFinalize description: Send a Finalize message to flush the WebSocket stream action: send channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/ListenV1/messages/ListenV1Finalize' ListenV1CloseStream: x-fern-sdk-method-name: sendCloseStream description: Send a CloseStream message to close the WebSocket stream action: send channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/ListenV1/messages/ListenV1CloseStream' ListenV1KeepAlive: x-fern-sdk-method-name: sendKeepAlive description: Send a KeepAlive message to keep the WebSocket stream alive action: send channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/ListenV1/messages/ListenV1KeepAlive' ListenV1Results: description: Receive transcription results action: receive channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/ListenV1/messages/ListenV1Results' ListenV1Metadata: description: Receive metadata about the transcription action: receive channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/ListenV1/messages/ListenV1Metadata' ListenV1UtteranceEnd: description: Receive an utterance end event action: receive channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/ListenV1/messages/ListenV1UtteranceEnd' ListenV1SpeechStarted: description: Receive a speech started event action: receive channel: $ref: '#/channels/ListenV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/ListenV1/messages/ListenV1SpeechStarted' ListenV2Media: x-fern-sdk-method-name: sendMedia 
description: Send audio or video data to be transcribed action: send channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2Media' ListenV2CloseStream: x-fern-sdk-method-name: sendCloseStream description: Send a CloseStream message to close the WebSocket stream action: send channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2CloseStream' ListenV2Configure: description: Send a Configure message to update Flux settings action: send channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2Configure' ListenV2Connected: description: Receive a connected message action: receive channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2Connected' ListenV2TurnInfo: description: Receive a turn info message action: receive channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2TurnInfo' ListenV2ConfigureSuccess: description: Sent when a `Configure` message was successfully applied. Returns the current, up-to-date values that were applied. 
action: receive channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2ConfigureSuccess' ListenV2ConfigureFailure: description: Indicates that a Configure message was rejected action: receive channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2ConfigureFailure' ListenV2FatalError: description: Receive a fatal error message action: receive channel: $ref: '#/channels/ListenV2' traits: - $ref: '#/components/operationTraits/V2AuthTrait' messages: - $ref: '#/channels/ListenV2/messages/ListenV2FatalError' AgentV1Settings: x-fern-sdk-method-name: sendSettings description: Send settings configuration to Deepgram's Voice Agent API action: send channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1Settings' AgentV1UpdateSpeak: x-fern-sdk-method-name: sendUpdateSpeak description: Send update speak to Deepgram's Voice Agent API action: send channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1UpdateSpeak' AgentV1UpdateThink: x-fern-sdk-method-name: sendUpdateThink description: Send update think to Deepgram's Voice Agent API action: send channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1UpdateThink' AgentV1InjectUserMessage: x-fern-sdk-method-name: sendInjectUserMessage action: send description: Send inject user message to Deepgram's Voice Agent API channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1InjectUserMessage' AgentV1InjectAgentMessage: x-fern-sdk-method-name: sendInjectAgentMessage action: send 
description: Send inject agent message to Deepgram's Voice Agent API channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1InjectAgentMessage' AgentV1SendFunctionCallResponse: x-fern-sdk-method-name: sendFunctionCallResponse description: | Send a function call response from the client to the server after executing a client-side function call. This is used when the server requests execution of a function marked with `client_side: true`. action: send channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1SendFunctionCallResponse' AgentV1ReceiveFunctionCallResponse: x-fern-sdk-method-name: onFunctionCallResponse description: | Receive a function call response from the server after the server has executed a server-side function call internally. This occurs when functions are marked with `client_side: false`. 
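# Illustrative client-side function call round trip (a sketch: the arguments
# value is an assumption; the id, name, and content mirror the
# FunctionCallResponse examples earlier in this spec). The server requests
# execution with client_side: true; the client runs the function locally and
# replies with a FunctionCallResponse whose id matches the request:
#
#   server -> client:
#     {"type": "FunctionCallRequest",
#      "functions": [{"id": "func_12345", "name": "get_weather",
#                     "arguments": "{\"location\": \"San Francisco\"}",
#                     "client_side": true}]}
#
#   client -> server:
#     {"type": "FunctionCallResponse", "id": "func_12345",
#      "name": "get_weather",
#      "content": "{\"temperature\": 72, \"condition\": \"sunny\"}"}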
action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1ReceiveFunctionCallResponse' AgentV1KeepAlive: x-fern-sdk-method-name: sendKeepAlive description: Send keep alive to Deepgram's Voice Agent API action: send channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1KeepAlive' AgentV1UpdatePrompt: x-fern-sdk-method-name: sendUpdatePrompt description: Send a prompt update to Deepgram's Voice Agent API action: send channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1UpdatePrompt' AgentV1PromptUpdated: description: Receive prompt update from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1PromptUpdated' AgentV1SpeakUpdated: description: Receive speak update from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1SpeakUpdated' AgentV1ThinkUpdated: description: Receive think update from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1ThinkUpdated' AgentV1InjectionRefused: description: Receive injection refused message from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1InjectionRefused' AgentV1Welcome: description: Receive welcome message from Deepgram's Voice Agent API action: receive channel: $ref: 
'#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1Welcome' AgentV1SettingsApplied: description: Receive settings applied message from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1SettingsApplied' AgentV1ConversationText: description: Receive conversation text from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1ConversationText' AgentV1UserStartedSpeaking: description: Receive user started speaking message from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1UserStartedSpeaking' AgentV1AgentThinking: description: Receive agent thinking message from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1AgentThinking' AgentV1FunctionCallRequest: description: Receive function call request from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1FunctionCallRequest' AgentV1AgentStartedSpeaking: description: Receive agent started speaking message from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1AgentStartedSpeaking' AgentV1AgentAudioDone: description: Receive agent audio done message from Deepgram's Voice Agent API action: receive channel: 
$ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1AgentAudioDone' AgentV1Error: description: Receive error response from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1Error' AgentV1Warning: description: Receive warning messages from Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1Warning' AgentV1Media: x-fern-sdk-method-name: sendMedia description: Send raw binary audio data to Deepgram's Voice Agent API for processing action: send channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1Media' AgentV1Audio: description: Receive raw binary audio data generated by Deepgram's Voice Agent API action: receive channel: $ref: '#/channels/AgentV1' traits: - $ref: '#/components/operationTraits/V1AuthTrait' messages: - $ref: '#/channels/AgentV1/messages/AgentV1Audio'