Add metrics #31

Open
jopetty opened this issue Nov 24, 2020 · 4 comments

jopetty commented Nov 24, 2020

Training, evaluation, and testing all require a robust internal metrics framework. This needs to have several components:

  • A Metric superclass which represents an abstract measurement of model performance on a particular set of data. This should define a template which takes an (input, target) pair and returns some numerical representation of how accurate the model is (see the sketch after this list).
  • A collection of specific metrics which inherit from the Metric superclass and provide concrete implementations of model accuracy. Examples might be FullSequenceAccuracy, TokenAccuracy, ClauseAccuracy, and so on.
  • A logging framework which handles (1) the computation of these metrics when called, (2) the saving of these logs to disk, and (3) the reporting of these values to other parts of the code to handle things like early stopping, TQDM post-fixing, etc.
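
A minimal sketch of what the first two pieces could look like (the method names and signatures here are assumptions for illustration, not the actual implementation):

from abc import ABC, abstractmethod

class Metric(ABC):
    """Abstract measurement of model performance on a set of data."""

    @abstractmethod
    def compute(self, prediction, target) -> float:
        """Return a numerical score for how well `prediction` matches `target`."""

class TokenAccuracy(Metric):
    """Fraction of output positions where the predicted token matches the target."""

    def compute(self, prediction, target) -> float:
        if not target:
            return 0.0
        correct = sum(p == t for p, t in zip(prediction, target))
        return correct / len(target)

class FullSequenceAccuracy(Metric):
    """1.0 if the entire predicted sequence matches the target, else 0.0."""

    def compute(self, prediction, target) -> float:
        return float(list(prediction) == list(target))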
@jopetty jopetty self-assigned this Nov 24, 2020
@jopetty jopetty added the v2 (Version 2 (with Hydra)) and enhancement (New feature or request) labels Nov 24, 2020

jopetty commented Jan 1, 2021

This is partially done. The basic infrastructure is there, but it seems like metrics are being calculated incorrectly on some iterators. For example, consider this training output:

[2020-12-31 18:10:10,907][core.trainer][INFO] - EPOCH 100 / 100
[2020-12-31 18:10:10,907][core.trainer][INFO] - Computing metrics for 'train' dataset
100%|███████████████████████████████████████████████████████| 821/821 [00:16<00:00, 49.63it/s, trn_loss=0.855]
[2020-12-31 18:10:27,451][core.metrics.meter][INFO] - TokenAccuracy:	0.761
Average Loss:	0.867
[2020-12-31 18:10:27,612][core.trainer][INFO] - Computing metrics for 'val' dataset
100%|████████████████████████████████████████████████████████████████████████| 99/99 [00:00<00:00, 183.42it/s]
[2020-12-31 18:10:28,153][core.metrics.meter][INFO] - TokenAccuracy:	0.755
Average Loss:	0.873
[2020-12-31 18:10:28,153][core.trainer][INFO] - Computing metrics for 'test' dataset
100%|██████████████████████████████████████████████████████████████████████| 104/104 [00:00<00:00, 181.37it/s]
[2020-12-31 18:10:28,727][core.metrics.meter][INFO] - TokenAccuracy:	0.000
Average Loss:	0.000
[2020-12-31 18:10:28,727][core.trainer][INFO] - Computing metrics for 'gen' dataset
100%|██████████████████████████████████████████████████████████████████████████| 2/2 [00:00<00:00, 208.29it/s]
[2020-12-31 18:10:28,737][core.metrics.meter][INFO] - TokenAccuracy:	0.000
Average Loss:	0.000
[2020-12-31 18:10:28,737][core.trainer][INFO] - Computing metrics for 'alice' dataset
100%|████████████████████████████████████████████████████████████████████████| 40/40 [00:00<00:00, 186.79it/s]
[2020-12-31 18:10:28,952][core.metrics.meter][INFO] - TokenAccuracy:	0.000
Average Loss:	0.000

The alice, gen, and test sets all score 0.000 on both token-level accuracy and loss. A loss of 0.000 is clearly wrong, and the model's output on the alice set shows that, while it is still not scoring well, it should at least be getting credit for the transitive verbs and for the (, ,, and ) tokens (see the rough check after the table):

source	target	prediction
alice sees grace	see ( alice , grace )	see ( oswald , grace )
alice knows zelda	know ( alice , zelda )	know ( bob , grace )
alice sees zelda	see ( alice , zelda )	see ( winnifred , grace )
alice notices henry	notice ( alice , henry )	notice ( oswald , grace )
alice likes daniel	like ( alice , daniel )	like ( samuel , grace )
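
As a rough sanity check (assuming TokenAccuracy is the fraction of position-wise token matches, which is an assumption about its exact definition), the first row alone should already score well above zero:

# Hypothetical check; splits on whitespace as in the examples above.
target     = "see ( alice , grace )".split()
prediction = "see ( oswald , grace )".split()
matches = sum(p == t for p, t in zip(prediction, target))
print(matches / len(target))  # 5 of 6 tokens match -> ~0.833, so 0.000 cannot be right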


jopetty commented Jan 1, 2021

Another thing that needs to be done: the metrics should be configurable in the YAML conf files; right now they are hard-coded into the trainer.
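
One possible shape for this, using Hydra's instantiate utility (the config keys, class paths, and helper name below are assumptions, not the current layout):

# Sketch only. Assumes a conf entry along the lines of:
#
#   metrics:
#     - _target_: core.metrics.TokenAccuracy
#     - _target_: core.metrics.FullSequenceAccuracy
#
# The trainer could then build its metric objects from the config
# instead of hard-coding them:
from hydra.utils import instantiate

def build_metrics(cfg):
    return [instantiate(metric_cfg) for metric_cfg in cfg.metrics]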


jopetty commented Jan 1, 2021

Okay, the first issue (metrics being incorrectly calculated on the non-train/val sets) has been solved: I just forgot to compute them 🙄. Fixed in 2dccc75.

@jopetty jopetty added this to the Stable 1.0 milestone Jan 4, 2021

jopetty commented Jan 4, 2021

Okay, the main issue is solved. Still need to make the metrics configurable in the YAML files.
