Overview
KrispVivaTurn is a turn analyzer that uses Krisp’s VIVA SDK turn detection (Tt) API to determine when a user has finished speaking. Unlike the Smart Turn model, which analyzes audio in batches when VAD detects a pause, KrispVivaTurn processes audio frame by frame in real time using Krisp’s streaming model.
Installation
KrispVivaTurn requires the Krisp Python SDK. See the Krisp VIVA guide for installation instructions.
Environment Variables
You need to provide the path to the Krisp turn detection model file (.kef extension). You can do this either by setting the KRISP_VIVA_TURN_MODEL_PATH environment variable or by passing model_path to the constructor.
For SDK v1.6.1+, you also need to provide a Krisp API key via the api_key constructor parameter or the KRISP_VIVA_API_KEY environment variable.
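The fallback order described above (constructor argument first, then environment variable) can be sketched as follows. This is an illustration only: `resolve_krisp_settings` is a hypothetical helper, not part of the Krisp SDK or of KrispVivaTurn, and the `.kef` extension check is an extra safeguard added for this sketch.

```python
import os
from typing import Mapping, Optional, Tuple


def resolve_krisp_settings(
    model_path: Optional[str] = None,
    api_key: str = "",
    env: Optional[Mapping[str, str]] = None,
) -> Tuple[str, str]:
    """Return (model_path, api_key), preferring explicit arguments over env vars.

    Hypothetical helper for illustration; the real constructor may differ.
    """
    env = os.environ if env is None else env

    # An explicit argument wins; otherwise fall back to the environment variable.
    path = model_path or env.get("KRISP_VIVA_TURN_MODEL_PATH")
    if not path:
        raise ValueError("Set KRISP_VIVA_TURN_MODEL_PATH or pass model_path")
    # Assumption of this sketch: reject files without the .kef extension early.
    if not path.endswith(".kef"):
        raise ValueError(f"Expected a .kef model file, got: {path}")

    # The API key is only needed for SDK v1.6.1+, so an empty key is allowed here.
    key = api_key or env.get("KRISP_VIVA_API_KEY", "")
    return path, key
```

Passing `env` explicitly keeps the helper easy to test without touching the real process environment.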
Configuration
The KrispTurnParams class configures turn detection behavior:
Probability threshold for turn completion (0.0 to 1.0). Higher values require more confidence before marking a turn as complete.
Frame duration in milliseconds for turn detection. Supported values: 10, 15, 20, 30, 32.
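To make the two constraints concrete, here is a minimal stand-in that validates them. It is not the real KrispTurnParams class: the class name `TurnParamsSketch` and the field name `frame_duration_ms` are assumptions for illustration, and no defaults are given because the text above does not state them.

```python
from dataclasses import dataclass

# Supported frame durations, taken from the description above.
SUPPORTED_FRAME_DURATIONS_MS = (10, 15, 20, 30, 32)


@dataclass(frozen=True)
class TurnParamsSketch:
    """Hypothetical stand-in for the turn detection parameters."""

    threshold: float
    frame_duration_ms: int  # assumed field name

    def __post_init__(self) -> None:
        if not 0.0 <= self.threshold <= 1.0:
            raise ValueError(f"threshold must be in [0.0, 1.0], got {self.threshold}")
        if self.frame_duration_ms not in SUPPORTED_FRAME_DURATIONS_MS:
            raise ValueError(
                f"frame_duration_ms must be one of {SUPPORTED_FRAME_DURATIONS_MS}, "
                f"got {self.frame_duration_ms}"
            )

    def samples_per_frame(self, sample_rate: int) -> int:
        """Number of audio samples in one frame at the given sample rate."""
        return sample_rate * self.frame_duration_ms // 1000
```

For example, a 20 ms frame at a 16 kHz sample rate holds 320 samples, which is the amount of audio the model would see per step in this sketch.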
Constructor Parameters
Path to the Krisp turn detection model file (.kef extension). If not provided, falls back to the KRISP_VIVA_TURN_MODEL_PATH environment variable.
Audio sample rate (will be set by the transport if not provided).
Configuration parameters for turn detection.
Krisp SDK API key for licensing (required for SDK v1.6.1+). If empty, falls back to the KRISP_VIVA_API_KEY environment variable.
Example
How It Works
KrispVivaTurn processes audio as a streaming model, analyzing each audio frame in real time:
- Frame-by-frame processing: Each incoming audio frame is processed by the Krisp turn detection model, which outputs a probability that the user’s turn is complete.
- Speech tracking: VAD signals are used to track when speech starts and stops.
- Threshold crossing: When the model’s probability exceeds the configured threshold after speech has been detected, the turn is marked as complete.
Because KrispVivaTurn makes its decision continuously as audio flows through, rather than waiting for a VAD pause to trigger a batch analysis, it can detect the end of a turn sooner.
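The three steps above can be sketched as a small state machine. This is a simplified model of the logic, not the actual implementation: the per-frame probabilities are supplied by hand instead of coming from the Krisp model, and `StreamingTurnSketch` and `first_complete_frame` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple


@dataclass
class StreamingTurnSketch:
    """Marks a turn complete once the probability crosses the threshold after speech."""

    threshold: float
    speech_detected: bool = False

    def process_frame(self, is_speech: bool, probability: float) -> bool:
        """Consume one frame; return True if this frame completes the turn."""
        # Speech tracking: remember that the user has started speaking.
        if is_speech:
            self.speech_detected = True
        # Threshold crossing: only counts once speech has been seen.
        if self.speech_detected and probability > self.threshold:
            self.speech_detected = False  # reset for the next turn
            return True
        return False


def first_complete_frame(
    frames: Iterable[Tuple[bool, float]], threshold: float
) -> Optional[int]:
    """Return the index of the first frame that completes a turn, or None."""
    analyzer = StreamingTurnSketch(threshold)
    for index, (is_speech, probability) in enumerate(frames):
        if analyzer.process_frame(is_speech, probability):
            return index
    return None
```

Note that a high probability before any speech has been detected is ignored in this sketch, which mirrors the "after speech has been detected" condition in the list above.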
Notes
- Requires a valid Krisp SDK license and turn detection model file
- Works with any VAD analyzer (Silero is recommended)
- Emits TurnMetricsData with end-to-end processing time, measuring the interval from VAD speech-to-silence transition to the model crossing the probability threshold
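As a rough illustration of the end-to-end timing described in the last note, the sketch below measures the interval between the speech-to-silence transition and the frame on which the turn is marked complete. `TurnTimingSketch` is a hypothetical class; the fields of the real TurnMetricsData are not shown here and may differ.

```python
from typing import Optional


class TurnTimingSketch:
    """Tracks the time from the VAD speech-to-silence transition to turn completion."""

    def __init__(self) -> None:
        self._was_speaking = False
        self._silence_start_ms: Optional[int] = None

    def on_frame(self, timestamp_ms: int, is_speech: bool, turn_complete: bool) -> Optional[int]:
        """Return the elapsed milliseconds when a turn completes, otherwise None."""
        if self._was_speaking and not is_speech:
            # Speech-to-silence transition: start the clock.
            self._silence_start_ms = timestamp_ms
        elif is_speech:
            # The user resumed speaking: discard any pending measurement.
            self._silence_start_ms = None
        self._was_speaking = is_speech

        if turn_complete and self._silence_start_ms is not None:
            elapsed_ms = timestamp_ms - self._silence_start_ms
            self._silence_start_ms = None
            return elapsed_ms
        return None
```

Integer millisecond timestamps are used here only to keep the arithmetic exact in the example.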