Grouping Speakers with ECAPA-TDNN and ONNX Runtime

By Meridian48 News Desk · Summarised from DEV Community · July 5, 2026

A developer tested ECAPA-TDNN speaker embeddings via ONNX Runtime to group utterances from a 14-second conversation. The model consistently produced 192-dimensional embeddings, and a sequential threshold algorithm successfully grouped utterances without knowing speaker count. ONNX Runtime processed audio faster than real time on CPU.
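The summary does not give the developer's code, so the following is only a minimal sketch of what a sequential threshold grouping could look like. It assumes cosine similarity against a running per-group sum, and a threshold of 0.5 chosen purely for illustration. The function names and the tiny 3-dimensional toy vectors are hypothetical stand-ins for the 192-dimensional ECAPA-TDNN embeddings.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def group_sequentially(embeddings, threshold=0.5):
    """Assign each embedding, in order, to the most similar existing group,
    or open a new group when no similarity reaches the threshold.
    The threshold value is an assumption, not taken from the article."""
    sums = []    # running sum of embeddings per group (cosine ignores scale)
    labels = []  # group index assigned to each utterance
    for emb in embeddings:
        best, best_sim = -1, -1.0
        for i, group_sum in enumerate(sums):
            sim = cosine(emb, group_sum)
            if sim > best_sim:
                best, best_sim = i, sim
        if best >= 0 and best_sim >= threshold:
            # Close enough: join the group and fold the embedding into its sum.
            sums[best] = [p + q for p, q in zip(sums[best], emb)]
            labels.append(best)
        else:
            # No group is close enough: treat this as a new speaker.
            sums.append(list(emb))
            labels.append(len(sums) - 1)
    return labels

# Toy stand-ins for real embeddings: two "speakers" alternating.
toy = [[1.0, 0.1, 0.0], [0.0, 1.0, 0.1], [0.9, 0.2, 0.0], [0.1, 0.9, 0.0]]
labels = group_sequentially(toy)
```

Because groups are opened on demand, the speaker count never has to be supplied. The trade-off is that the result depends on utterance order and on the chosen threshold.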

Meridian48 take

This is a practical demonstration of speaker diarization using open-source models, but the simple threshold approach may struggle with more speakers or noisy audio.

Read the full reporting

Grouping Utterances by Speaker with ECAPA-TDNN and ONNX Runtime →

DEV Community

Tags: speaker-embedding, onnx-runtime
