test: add unit tests for DPO and NPO loss functions #571

Open
stanley1208 wants to merge 1 commit into google-deepmind:main from stanley1208:test/dpo-npo-loss-tests

@stanley1208

Summary

The DPO and NPO loss functions in gemma/gm/losses/ had no test coverage. This PR adds nine unit tests covering both.

Tests added

DPO loss (_dpo_test.py — 5 tests):

  • test_get_logprobs_for_target: verifies log-probability computation against manual calculation
  • test_get_logprobs_masked_positions_ignored: verifies that masked positions do not contribute to the log-probability sum
  • test_dpo_loss_output_shape: verifies that the output has shape [B, 1]
  • test_dpo_loss_prefers_chosen: verifies that the loss is lower when the policy raises the probability of the chosen response
  • test_dpo_loss_label_smoothing: verifies that with label_smoothing=0.5 the loss is symmetric in the chosen and rejected responses
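
For readers who want to see why these properties should hold, here is a minimal pure-Python sketch of the textbook (label-smoothed) DPO loss and a masked log-probability sum. It is not the code in gemma/gm/losses/: the names `log_softmax`, `sequence_logprob`, `log_sigmoid`, and `dpo_loss`, the `beta` default, and the per-example scalar interface are all assumptions made for illustration.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over one vocabulary row."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def sequence_logprob(logits, targets, mask):
    """Sum of target-token log-probs over unmasked positions.

    logits: [T][V] floats, targets: [T] ints, mask: [T] of 0/1.
    (Hypothetical helper, not the library's function.)
    """
    total = 0.0
    for row, tok, keep in zip(logits, targets, mask):
        if keep:
            total += log_softmax(row)[tok]
    return total


def log_sigmoid(x):
    """Stable log(sigmoid(x))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def dpo_loss(pol_chosen, pol_rejected, ref_chosen, ref_rejected,
             beta=0.1, label_smoothing=0.0):
    """Textbook label-smoothed DPO loss for a single example (assumed form)."""
    margin = beta * ((pol_chosen - ref_chosen) - (pol_rejected - ref_rejected))
    return (-(1.0 - label_smoothing) * log_sigmoid(margin)
            - label_smoothing * log_sigmoid(-margin))


# Masked positions drop out of the sum.
logits = [[2.0, 0.0, -1.0], [0.5, 1.5, 0.0]]
targets = [0, 1]
masked_sum = sequence_logprob(logits, targets, [1, 0])

# Raising the chosen log-prob (and lowering the rejected one) lowers the loss.
loss_preferring_chosen = dpo_loss(-1.0, -3.0, -2.0, -2.0)
loss_neutral = dpo_loss(-2.0, -2.0, -2.0, -2.0)

# With label_smoothing=0.5, swapping chosen and rejected leaves the loss unchanged.
loss_ab = dpo_loss(-1.0, -3.0, -2.0, -2.0, label_smoothing=0.5)
loss_ba = dpo_loss(-3.0, -1.0, -2.0, -2.0, label_smoothing=0.5)
```

Under this assumed form, `label_smoothing=0.5` weights `log_sigmoid(margin)` and `log_sigmoid(-margin)` equally, so negating the margin (swapping chosen and rejected) cannot change the result.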

NPO loss (_npo_test.py — 4 tests):

  • test_npo_get_logprobs_for_target: verifies log-probability computation against manual calculation
  • test_npo_loss_output_shape: verifies that the output has shape [B, 1]
  • test_npo_loss_penalizes_high_policy_prob: verifies that the loss is higher when the policy assigns more probability than the anchor to undesired content
  • test_npo_loss_zero_when_policy_matches_anchor: verifies that when the policy equals the anchor, the loss equals log(2) (the theoretical baseline)
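
The log(2) baseline can be illustrated with a small sketch. It assumes the NPO loss takes the form log(1 + (π_policy / π_anchor)^β) without any extra scaling; some formulations multiply by 2/β, in which case the baseline would be (2/β)·log(2). The function names and the `beta` default below are hypothetical and do not come from gemma/gm/losses/.

```python
import math


def softplus(x):
    """Stable log(1 + exp(x))."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))


def npo_loss(policy_logprob, anchor_logprob, beta=1.0):
    """Assumed NPO form: log(1 + (pi_policy / pi_anchor) ** beta)."""
    return softplus(beta * (policy_logprob - anchor_logprob))


# Policy equals anchor: the log-ratio is 0, so the loss is log(1 + 1) = log(2).
baseline = npo_loss(-2.0, -2.0)

# Policy assigns more probability than the anchor to the undesired sequence.
penalized = npo_loss(-1.0, -2.0)

# Policy assigns less probability than the anchor.
rewarded = npo_loss(-3.0, -2.0)
```

Because softplus is strictly increasing, the loss under this form grows with the policy-to-anchor log-ratio, which is the ordering the penalization test relies on.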

Test plan

  • All 9 new tests pass (pytest -vv gemma/gm/losses/_dpo_test.py gemma/gm/losses/_npo_test.py)
  • Full test suite unaffected (new files only, no changes to existing code)

The DPO and NPO loss functions in gemma/gm/losses/ had no test coverage. This adds 9 tests covering:

DPO (5 tests): logprob computation, mask handling, output shape, preference ordering, label smoothing symmetry

NPO (4 tests): logprob computation, output shape, undesired content penalization, policy-matches-anchor baseline