diarisation-benchmark

First steps in setting up a diarisation benchmark for Slovenian and related languages

Dataset

We will start with the open dataset ROG-Dialog (http://hdl.handle.net/11356/2073). The audio should be downloaded from the dataset's own repository, while the reference annotations in RTTM format are provided in this repository for convenience. The original repository contains EXMARaLDA XML files, which can be inspected if needed with the EXMARaLDA editor (https://exmaralda.org/en/).
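As a rough illustration of what the RTTM reference contains, the sketch below reads the standard `SPEAKER` records (file id, channel, onset, duration, speaker label) using only the standard library. The `Turn` type, the `parse_rttm` name and the sample file id `rec1` are made up for this example and are not part of the repository's scripts.

```python
from typing import List, NamedTuple


class Turn(NamedTuple):
    file_id: str
    start: float     # onset in seconds
    duration: float  # length in seconds
    speaker: str


def parse_rttm(text: str) -> List[Turn]:
    """Parse SPEAKER records from RTTM text, skipping blank and non-SPEAKER lines."""
    turns = []
    for line in text.splitlines():
        fields = line.split()
        # Layout: SPEAKER <file> <chan> <onset> <dur> <NA> <NA> <speaker> <NA> <NA>
        if len(fields) < 8 or fields[0] != "SPEAKER":
            continue
        turns.append(Turn(fields[1], float(fields[3]), float(fields[4]), fields[7]))
    return turns


# Illustrative input only; real records would come from the RTTM file.
sample = (
    "SPEAKER rec1 1 0.50 2.25 <NA> <NA> spk1 <NA> <NA>\n"
    "SPEAKER rec1 1 3.00 1.00 <NA> <NA> spk2 <NA> <NA>\n"
)
for turn in parse_rttm(sample):
    print(turn.file_id, turn.speaker, turn.start, turn.start + turn.duration)
```

The same function could be applied to the contents of the RTTM file shipped in this repository after reading it with `open(...).read()`.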

Models

Models to be evaluated in the first iteration are:

pyannote (legacy 3.1, community-1, precision-2, or any others that look promising) https://huggingface.co/pyannote
NVIDIA Sortformer https://huggingface.co/nvidia/diar_sortformer_4spk-v1
NVIDIA NeMo models?
SpeechBrain models?
any other models identified as promising
API-based diarisers (precision-2 etc.): feel free to spend a few EUR on these and bill us for the cost. If they perform significantly better, we are happy to use them for some of the data as well.
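Since the models above come from different toolkits, it may help to store every model's output in the same RTTM form as the reference. The following is a minimal sketch under the assumption that each model's output can be reduced to `(start, end, speaker)` turns in seconds; the `to_rttm` helper and the channel value `1` are illustrative choices, not something prescribed by this repository.

```python
from typing import Iterable, Tuple


def to_rttm(file_id: str, turns: Iterable[Tuple[float, float, str]]) -> str:
    """Format (start, end, speaker) turns, given in seconds, as RTTM SPEAKER lines."""
    lines = []
    for start, end, speaker in sorted(turns):
        duration = end - start
        if duration <= 0:
            continue  # skip empty or inverted turns instead of writing invalid records
        lines.append(
            f"SPEAKER {file_id} 1 {start:.3f} {duration:.3f} <NA> <NA> {speaker} <NA> <NA>"
        )
    return "\n".join(lines) + ("\n" if lines else "")


# Hypothetical output of some model for a recording called "rec1".
print(to_rttm("rec1", [(1.0, 2.5, "B"), (0.0, 1.0, "A")]), end="")
```

Writing one such file per model and recording keeps the raw outputs available for later evaluation runs with other metrics.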

Evaluation

While all model outputs are to be logged for future evaluation runs, the first iteration should report:

diarisation error rate (DER), computed with pyannote.metrics.diarization.DiarizationErrorRate
processing speed
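The two measurements above could be computed roughly as follows. This is a sketch, not the benchmark's actual evaluation code: it assumes that processing speed is reported as a real-time factor (processing time divided by audio duration), that turns are `(start, end, speaker)` tuples in seconds, and that `pyannote.metrics` is installed when `der` is called. The helper names `real_time_factor`, `timed` and `der` are hypothetical.

```python
import time
from typing import Callable, Iterable, Tuple

Turn = Tuple[float, float, str]  # (start, end, speaker), in seconds


def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Processing time divided by audio duration; below 1.0 is faster than real time."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def timed(fn: Callable, *args, **kwargs):
    """Run fn and return (result, elapsed wall-clock seconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start


def der(reference: Iterable[Turn], hypothesis: Iterable[Turn]) -> float:
    """DER for one file via pyannote.metrics (imported lazily, so it must be installed)."""
    from pyannote.core import Annotation, Segment
    from pyannote.metrics.diarization import DiarizationErrorRate

    def to_annotation(turns):
        annotation = Annotation()
        for start, end, speaker in turns:
            annotation[Segment(start, end)] = speaker
        return annotation

    metric = DiarizationErrorRate()  # library defaults: no collar, overlapped speech is scored
    return metric(to_annotation(reference), to_annotation(hypothesis))


# Timing any callable; `sum` stands in for a model's inference call here.
result, elapsed = timed(sum, [1, 2, 3])
print(result, real_time_factor(elapsed, audio_seconds=60.0))
```

Whether to apply a collar or to skip overlapped speech changes the DER noticeably, so the chosen settings are worth recording next to each reported number.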

We are very open to additional metrics as well.

