test: add unit tests for DPO and NPO loss functions by stanley1208 · Pull Request #571 · google-deepmind/gemma

stanley1208 · 2026-02-18T07:05:48Z

Summary

The DPO and NPO loss functions in gemma/gm/losses/ had no test coverage. This PR adds unit tests for both (9 tests in total).

Tests added

DPO loss (_dpo_test.py — 5 tests):

test_get_logprobs_for_target: verifies log-probability computation against manual calculation
test_get_logprobs_masked_positions_ignored: masked positions don't contribute to logprob sum
test_dpo_loss_output_shape: correct output shape [B, 1]
test_dpo_loss_prefers_chosen: loss is lower when policy increases chosen response probability
test_dpo_loss_label_smoothing: with label_smoothing=0.5, loss is symmetric
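
The properties checked by these tests can be illustrated with a small standalone sketch. It does not call the functions in gemma/gm/losses/: `masked_logprob_sum` and `dpo_loss` are hypothetical helpers written in plain Python, and the loss assumes the commonly used label-smoothed DPO form, which may differ in detail (shapes, reductions, argument names) from the implementation under test.

```python
import math


def log_sigmoid(x: float) -> float:
    """Numerically stable log(sigmoid(x))."""
    return min(x, 0.0) - math.log1p(math.exp(-abs(x)))


def masked_logprob_sum(logits, targets, mask):
    """Sum of log p(target) over positions where mask is 1.

    Hypothetical helper: logits is a list of per-position score lists,
    targets a list of token ids, mask a list of 0/1 flags.
    """
    total = 0.0
    for scores, target, keep in zip(logits, targets, mask):
        if not keep:
            continue  # masked positions contribute nothing
        top = max(scores)
        log_z = top + math.log(sum(math.exp(s - top) for s in scores))
        total += scores[target] - log_z
    return total


def dpo_loss(policy_chosen, policy_rejected, ref_chosen, ref_rejected,
             beta=0.1, label_smoothing=0.0):
    """Assumed label-smoothed DPO loss on summed sequence log-probabilities."""
    margin = beta * ((policy_chosen - policy_rejected)
                     - (ref_chosen - ref_rejected))
    return (-(1.0 - label_smoothing) * log_sigmoid(margin)
            - label_smoothing * log_sigmoid(-margin))


# Raising the chosen response's log-probability lowers the loss.
better = dpo_loss(-1.0, -2.0, -1.5, -1.5)
worse = dpo_loss(-2.0, -1.0, -1.5, -1.5)

# With label_smoothing=0.5, swapping chosen and rejected leaves the loss unchanged.
sym_a = dpo_loss(-1.0, -2.0, -1.2, -1.7, label_smoothing=0.5)
sym_b = dpo_loss(-2.0, -1.0, -1.7, -1.2, label_smoothing=0.5)
```

Under this assumed form, the symmetry at label_smoothing=0.5 follows directly: both log-sigmoid terms get weight 0.5, so flipping the sign of the margin only swaps them.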

NPO loss (_npo_test.py — 4 tests):

test_npo_get_logprobs_for_target: verifies log-probability computation against manual calculation
test_npo_loss_output_shape: correct output shape [B, 1]
test_npo_loss_penalizes_high_policy_prob: loss is higher when policy assigns more probability than anchor to undesired content
test_npo_loss_zero_when_policy_matches_anchor: when policy equals anchor, loss equals log(2) (the theoretical baseline)
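
The log(2) baseline can be reproduced with a short sketch. It assumes the loss has the form -log(sigmoid(-beta * (policy_logprob - anchor_logprob))), which is consistent with the baseline stated above; `npo_loss` here is a hypothetical scalar helper, and the actual implementation may scale or reduce the value differently.

```python
import math


def softplus(x: float) -> float:
    """Numerically stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def npo_loss(policy_logprob: float, anchor_logprob: float,
             beta: float = 0.1) -> float:
    """Assumed NPO-style loss: -log(sigmoid(-beta * log_ratio))."""
    log_ratio = policy_logprob - anchor_logprob
    return softplus(beta * log_ratio)


# Policy equals anchor: log_ratio is 0, so the loss is log(1 + 1) = log(2).
baseline = npo_loss(-3.0, -3.0)

# Policy assigns more probability than the anchor to undesired content.
penalized = npo_loss(-1.0, -3.0)
```

In this form the loss grows monotonically with the log-ratio, so any policy that is more confident than the anchor on undesired content sits above the log(2) baseline.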

Test plan

All 9 new tests pass (pytest -vv gemma/gm/losses/_dpo_test.py gemma/gm/losses/_npo_test.py)
Full test suite unaffected (new files only, no changes to existing code)

The DPO and NPO loss functions in gemma/gm/losses/ had no test coverage. This adds 9 tests covering:

DPO (5 tests): logprob computation, mask handling, output shape, preference ordering, label smoothing symmetry
NPO (4 tests): logprob computation, output shape, undesired content penalization, policy-matches-anchor baseline

stanley1208 wants to merge 1 commit into google-deepmind:main from stanley1208:test/dpo-npo-loss-tests
