Replies: 3 comments 2 replies
-
@maximsch2 I like the idea. The only thing I'm a bit afraid of is that if we don't check on each update, we will silently calculate wrong values. Therefore I like having the method explicitly, but I'd probably do it on an opt-out basis rather than opt-in (i.e. having a flag for that which defaults to true and can be set to false). What do you think? @SkafteNicki @Borda thoughts?
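As a rough sketch of that opt-out idea: a toy metric whose constructor takes a hypothetical `validate_args` flag that defaults to true and can be switched off (the flag name and the checks are illustrative, not an existing torchmetrics API):

```python
import torch
from torchmetrics import Metric


class MyAccuracy(Metric):
    """Toy metric: input validation is on by default and can be opted out of."""

    def __init__(self, validate_args: bool = True, **kwargs):
        super().__init__(**kwargs)
        self.validate_args = validate_args  # opt-out: defaults to True

        self.add_state("correct", default=torch.tensor(0), dist_reduce_fx="sum")
        self.add_state("total", default=torch.tensor(0), dist_reduce_fx="sum")

    def update(self, preds: torch.Tensor, target: torch.Tensor) -> None:
        if self.validate_args:  # skipped entirely when the user opts out
            if preds.shape != target.shape:
                raise ValueError("preds and target must have the same shape")
        self.correct += (preds == target).sum()
        self.total += target.numel()

    def compute(self) -> torch.Tensor:
        return self.correct.float() / self.total
```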
-
I would argue that the first step should be trying to lower the computational time of our implementations (they may not be optimal).
I think it is important to remember that torchmetrics is not intended to be used only with Lightning but also with native PyTorch. Also, the concept of a task is more related to Flash than to Lightning, right? If this really boils down to our implementations being too slow because we make sure that the user input is correct, I would argue that we should have some kind of flag:

```python
import torchmetrics
torchmetrics.performance_mode = True
```

that turns off all checking (meant for users who know what they are doing).
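A minimal sketch of how such a package-level switch could gate input checks, in the spirit of the flag proposed above (the names here are hypothetical, not an existing torchmetrics setting):

```python
import torch

# Hypothetical module-level switch, as proposed above.
PERFORMANCE_MODE = False


def _check_same_shape(preds: torch.Tensor, target: torch.Tensor) -> None:
    """Input validation that becomes a no-op when PERFORMANCE_MODE is enabled."""
    if PERFORMANCE_MODE:
        return
    if preds.shape != target.shape:
        raise ValueError(f"shape mismatch: {preds.shape} vs {target.shape}")


def accuracy(preds: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
    _check_same_shape(preds, target)  # skipped in performance mode
    return (preds == target).float().mean()
```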
-
Performance optimization is not the main goal (as that can be addressed by other implementations anyway). The connection between the training task and metrics is the key. Assume you are building a framework that allows people to train various tasks of different shapes, and you want to add metric configurability to it. For simplicity, you can say that task == LightningModule, but of course this doesn't have to be the case. Now you need a way to know how to pipe the output from an arbitrary model to a set of metrics. There are two ways:
In Lightning terms:

```python
def training_step(self, batch):
    loss, outputs = self.model.get_loss_and_outputs()
    # outputs is Dict[TaskType, TTaskTypeOutput]
    for task_type, output in outputs.items():
        for metric_name, metric in self.metrics_collection.items():
            if metric.supports(task_type):
                self.log(metric_name, metric(*output))
    return loss
```

Why would the same model output different types? This can happen in various ways:
Right, Flash is a more appropriate analogy.
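For illustration, a minimal sketch of the `metric.supports(task_type)` dispatch used in the snippet above; the `TaskType` enum and `MetricWrapper` are hypothetical, not an existing torchmetrics or Flash API:

```python
from dataclasses import dataclass, field
from enum import Enum, auto
from typing import Callable, Set

import torch


class TaskType(Enum):
    BINARY = auto()
    MULTICLASS = auto()
    MULTILABEL = auto()


@dataclass
class MetricWrapper:
    """Pairs a metric function with the task types it knows how to consume."""

    fn: Callable[[torch.Tensor, torch.Tensor], torch.Tensor]
    supported_tasks: Set[TaskType] = field(default_factory=set)

    def supports(self, task_type: TaskType) -> bool:
        return task_type in self.supported_tasks

    def __call__(self, preds: torch.Tensor, target: torch.Tensor) -> torch.Tensor:
        return self.fn(preds, target)


# Example: an accuracy wrapper that only accepts binary outputs.
accuracy = MetricWrapper(
    fn=lambda preds, target: (preds.round() == target).float().mean(),
    supported_tasks={TaskType.BINARY},
)
assert accuracy.supports(TaskType.BINARY)
assert not accuracy.supports(TaskType.MULTILABEL)
```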
-
🚀 Feature
Let's have a formal system of task types. Things like `BinaryClassificationTask`, `MultiClassClassificationTask`, `MultilabelClassificationTask`, etc.

Motivation

Pitch
Add a type hierarchy of possible task types. Each task is defined by the type signature of the (predictions, labels) tuple and the semantics inside it (e.g. multiclass and multilabel have the same shape but different semantics).
Then, each metric takes a task_type and can assume that predictions/labels conform to it. If we want to add checking at run time, each type can provide a class method (e.g. `BinaryClassificationTask.validate_input`) that can be enabled for checking on an opt-in basis.
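For illustration, a minimal sketch of the proposed hierarchy; only the class names and `validate_input` come from the pitch above, the concrete checks and the toy `accuracy` function are assumptions:

```python
import torch


class TaskType:
    """Base task type: fixes the expected (predictions, labels) signature."""

    @classmethod
    def validate_input(cls, preds: torch.Tensor, target: torch.Tensor) -> None:
        raise NotImplementedError


class BinaryClassificationTask(TaskType):
    @classmethod
    def validate_input(cls, preds: torch.Tensor, target: torch.Tensor) -> None:
        if preds.shape != target.shape:
            raise ValueError("binary task: preds and target must have the same shape")
        if not torch.all((target == 0) | (target == 1)):
            raise ValueError("binary task: targets must be 0 or 1")


class MultilabelClassificationTask(TaskType):
    @classmethod
    def validate_input(cls, preds: torch.Tensor, target: torch.Tensor) -> None:
        # same (N, C) shape as a one-hot multiclass target, different semantics:
        # each of the C labels is an independent binary decision
        if preds.ndim != 2 or preds.shape != target.shape:
            raise ValueError("multilabel task: expected (N, C) preds and target")


def accuracy(preds, target, task_type=BinaryClassificationTask, validate=False):
    """Toy metric: validation against the declared task type is opt-in."""
    if validate:
        task_type.validate_input(preds, target)
    return (preds.round() == target).float().mean()
```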
Alternatives