This project analyzes temporal dependencies in dolphin whistle sequences by predicting the next whistle label based on context.
The analysis evaluates whether longer temporal context improves prediction of the next whistle type in a bout:
- Input: Concatenated embeddings of k previous whistles
- Output: Label of the next whistle
- Goal: Determine if cross-entropy decreases as context length k increases
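As a minimal illustration of this input/output format (the embedding dimension and label below are placeholders, not values from the real data):

```python
import numpy as np

k = 3                                                       # context length
prev_embeddings = [np.random.rand(64) for _ in range(k)]    # stand-ins for the k previous whistle embeddings
next_label = "whistle_type_A"                               # stand-in for the next whistle's label

x = np.concatenate(prev_embeddings)   # model input: one vector of length k * embedding_dim
y = next_label                        # model target: the class to predict
print(x.shape, y)                     # (192,) whistle_type_A
```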
The dataset (`dataset.json`, produced by `dataset_creation.py`) has the following nested structure:

```
dataset = {
    "Recording_1": {
        "id": "Recording_1",
        "bouts": {
            "bout_1": {
                "whistle_1": {
                    "label": "...",
                    "embedding": [...],
                    "start_time": ...,
                    "end_time": ...,
                    ...
                },
                "whistle_2": { ... },
            },
            "bout_2": { ... },
        }
    },
    "Recording_2": { ... },
}
```

Training data:
- For each context length k ∈ {2, 3, 4, 5, 6, 7}:
  - Use all bouts with length ≥ k+1
  - Slide a window of size k through each bout (see the sliding-window sketch after this list)
  - This creates thousands of training samples per k
Test data:
- Use only bouts with length ≥ 8
- Construct shared prediction positions across all k values
- This ensures a fair comparison: all models are evaluated on exactly the same positions
Evaluation:
- Plot cross-entropy vs k
- Plot accuracy vs k
- A flat line → weak temporal dependence
- A downward slope → increasing context helps prediction
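A minimal sketch of the sliding-window sample construction under the nested `dataset.json` layout shown above; `build_samples` and the sorting by `start_time` are illustrative choices, not necessarily what `prepare_training_data.py` does:

```python
import json
import numpy as np

def build_samples(dataset, k):
    """Build (context, target) pairs with a sliding window of size k.

    Each input is the concatenation of the embeddings of k consecutive
    whistles; the target is the label of the whistle that follows them.
    Only bouts with at least k+1 whistles contribute samples.
    """
    X, y = [], []
    for recording in dataset.values():
        for bout in recording["bouts"].values():
            # Whistles in temporal order within the bout.
            whistles = sorted(bout.values(), key=lambda w: w["start_time"])
            if len(whistles) < k + 1:
                continue
            for i in range(len(whistles) - k):
                context = whistles[i:i + k]
                target = whistles[i + k]
                X.append(np.concatenate([np.asarray(w["embedding"]) for w in context]))
                y.append(target["label"])
    return np.stack(X), np.array(y)

with open("dataset.json") as f:
    dataset = json.load(f)

for k in range(2, 8):
    X_k, y_k = build_samples(dataset, k)
    print(k, X_k.shape, len(y_k))
```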
Scripts in `src/`:
- `dataset_creation.py` - Creates the dataset from raw recordings
- `prepare_training_data.py` - Prepares sliding-window samples
- `train_and_evaluate.py` - Trains models and evaluates them
- `run_experiment.py` - Runs the complete pipeline
- `analyze_bout_lengths.py` - Statistics on bout lengths
- `visualize_embeddings.py` - UMAP visualization of the embeddings
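For the embedding visualization, a minimal UMAP sketch looks like the following; it uses random stand-in data and illustrative names, and is not necessarily what `visualize_embeddings.py` does:

```python
import numpy as np
import matplotlib.pyplot as plt
import umap

embeddings = np.random.rand(200, 64)                   # stand-in for the real whistle embeddings
labels = np.random.choice(["A", "B", "C"], size=200)   # stand-in whistle labels

# Project embeddings to 2D and color points by label.
coords = umap.UMAP(n_components=2, random_state=42).fit_transform(embeddings)
for lab in np.unique(labels):
    mask = labels == lab
    plt.scatter(coords[mask, 0], coords[mask, 1], s=10, label=lab)
plt.legend()
plt.savefig("embedding_umap.png")
```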
Create the dataset:

```bash
conda activate seq-dolphins
cd src
python dataset_creation.py
```

This creates `dataset.json` from the recordings in the specified folder.
Run the full experiment:

```bash
conda activate seq-dolphins
cd src
python run_experiment.py
```

This will:
- Prepare training/test data with sliding windows
- Train logistic regression for each k
- Evaluate on shared test positions
- Generate plots showing cross-entropy and accuracy vs k
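A condensed sketch of what the per-k training and evaluation loop looks like, using scikit-learn's `LogisticRegression` and `log_loss` on toy stand-in data; the real pipeline uses the sliding-window samples and shared test positions described above:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import log_loss, accuracy_score

rng = np.random.default_rng(0)
emb_dim = 64                 # hypothetical embedding dimension
labels = ["A", "B", "C"]     # hypothetical whistle types

results = {}
for k in range(2, 8):
    # Toy stand-ins for the real sliding-window samples (inputs are k concatenated embeddings).
    X_train = rng.normal(size=(500, k * emb_dim))
    y_train = rng.choice(labels, size=500)
    X_test = rng.normal(size=(100, k * emb_dim))
    y_test = rng.choice(labels, size=100)

    model = LogisticRegression(max_iter=1000)
    model.fit(X_train, y_train)

    probs = model.predict_proba(X_test)
    results[k] = (log_loss(y_test, probs, labels=model.classes_),
                  accuracy_score(y_test, model.predict(X_test)))

for k, (ce, acc) in results.items():
    print(f"k={k}: cross-entropy={ce:.3f}, accuracy={acc:.3f}")
```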
Output files in `data/`:
- `train_data.npz` - Training samples for all k values
- `test_data.npz` - Test samples with shared positions
- `evaluation_results.npz` - Cross-entropy and accuracy per k
- `context_length_evaluation.png` - Visualization of the results
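The `.npz` archives can be inspected with NumPy; the snippet below simply lists whatever arrays are stored, since the exact key names depend on the scripts:

```python
import numpy as np

results = np.load("data/evaluation_results.npz")
print(results.files)              # names of the arrays stored in the archive
for name in results.files:
    print(name, results[name])
```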
The key metric is cross-entropy loss vs context length:
- Decreasing cross-entropy: Longer context provides useful information about the next whistle type → temporal structure exists
- Flat cross-entropy: Context length doesn't help → weak or no temporal dependencies
- Increasing cross-entropy: Longer context hurts (unusual, may indicate overfitting)
The secondary metric is accuracy, which is easier to interpret but less sensitive than cross-entropy.
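A minimal matplotlib sketch of this kind of plot (cross-entropy and accuracy vs k); the values below are placeholders illustrating a decreasing-cross-entropy case, not results from the data:

```python
import matplotlib.pyplot as plt

ks = [2, 3, 4, 5, 6, 7]
cross_entropy = [1.20, 1.15, 1.12, 1.11, 1.10, 1.10]   # placeholder values
accuracy = [0.42, 0.45, 0.47, 0.47, 0.48, 0.48]        # placeholder values

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
ax1.plot(ks, cross_entropy, marker="o")
ax1.set_xlabel("context length k")
ax1.set_ylabel("cross-entropy")
ax2.plot(ks, accuracy, marker="o")
ax2.set_xlabel("context length k")
ax2.set_ylabel("accuracy")
fig.tight_layout()
fig.savefig("context_length_evaluation.png")
```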
Requirements:

```
numpy
scikit-learn
matplotlib
umap-learn
```
Install with:
```bash
conda activate seq-dolphins
pip install numpy scikit-learn matplotlib umap-learn
```