Adding multimodal and nominal domains
We are happy to announce that Torchmetrics v0.11 is now publicly available. In Torchmetrics v0.11 we have primarily focused on cleaning up after the large classification refactor from v0.10 and on adding new metrics. With v0.11 we are crossing 90 metrics in Torchmetrics, nearing the milestone of 100 metrics.
New domains
In Torchmetrics we are not only looking to add new metrics to already established metric domains such as classification or regression, but also to expand into new domains. We are therefore happy to report that v0.11 includes two new domains: multimodal and nominal.
Multimodal
If there is one topic within machine learning that is hot right now, it is generative models, and in particular text-to-image generative models. Just recently Stable Diffusion v2 was released, able to create even more photorealistic images from a single text prompt than before.
In Torchmetrics v0.11 we are adding a new domain called multimodal to support the evaluation of such models. For now, we are starting out with a single metric, the CLIPScore from this paper, which can be used to evaluate how well an image and a text caption match. CLIPScore is reported to achieve the highest correlation with human judgment, and thus a high CLIPScore for an image-text pair means that it is highly plausible that the caption and the image are related to each other.
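To make the idea behind the score concrete, here is a minimal sketch of the underlying computation: a scaled cosine similarity between an image embedding and a caption embedding, clipped at zero. It is not the Torchmetrics implementation. The helper names, the tiny 3-dimensional embeddings, and the scale factor of 100 are all assumptions made for illustration; a real CLIP model produces the embeddings and they are much larger.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def clip_score_from_embeddings(image_emb, text_emb, scale=100.0):
    """Scaled cosine similarity clipped at zero: max(scale * cos, 0).

    The scale factor is an assumption of this sketch.
    """
    return max(scale * cosine_similarity(image_emb, text_emb), 0.0)


# Hypothetical embeddings, invented purely for illustration.
image_emb = [0.2, 0.9, 0.1]
matching_caption_emb = [0.25, 0.85, 0.05]
unrelated_caption_emb = [-0.7, -0.1, 0.6]

print(clip_score_from_embeddings(image_emb, matching_caption_emb))
print(clip_score_from_embeddings(image_emb, unrelated_caption_emb))
```

A caption whose embedding points in nearly the same direction as the image embedding scores close to the scale factor, while one pointing away from it is clipped to zero.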
Nominal
If you have ever taken a course in statistics or an introduction to machine learning, you have hopefully heard that data can have different types of attributes: nominal, ordinal, interval, and ratio. This essentially refers to how data can be compared. For example, nominal data cannot be ordered and cannot be measured. An example would be data that describes the color of your car: blue, red, or green. It does not make sense to compare the different values. Ordinal data can be compared but does not have a relative meaning. An example would be the safety rating of a car: 1, 2, or 3. We can say that 3 is better than 1, but the actual numerical value does not mean anything.
In v0.11 of Torchmetrics, we are adding support for classic metrics on nominal data. Four new metrics have already been added to this domain:
CramersV
PearsonsContingencyCoefficient
TschuprowsT
TheilsU
All metrics are measures of association between two nominal variables, giving a value between 0 and 1, with 1 meaning that there is a perfect association between the variables.
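As an illustration of what such an association measure computes, the following is a minimal standard-library sketch of plain (uncorrected) Cramér's V built from a chi-squared statistic. It is not the Torchmetrics implementation; the function name and the example data are hypothetical, and a library version may apply a bias correction, in which case values can differ.

```python
import math
from collections import Counter


def cramers_v(x, y):
    """Uncorrected Cramér's V between two equal-length sequences of labels."""
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("x and y must be non-empty and of equal length")
    n = len(x)
    joint = Counter(zip(x, y))  # observed count per (x, y) cell
    row_totals = Counter(x)
    col_totals = Counter(y)

    # Chi-squared statistic: compare observed counts with those expected
    # if the two variables were independent.
    chi2 = 0.0
    for r, row_n in row_totals.items():
        for c, col_n in col_totals.items():
            expected = row_n * col_n / n
            chi2 += (joint[(r, c)] - expected) ** 2 / expected

    min_dim = min(len(row_totals), len(col_totals)) - 1
    if min_dim == 0:
        # Convention chosen for this sketch: a constant variable has no association.
        return 0.0
    return math.sqrt(chi2 / (n * min_dim))


# Hypothetical data: car color fully determines body type here.
color = ["red", "blue", "green", "red", "blue", "green"]
body = ["suv", "sedan", "coupe", "suv", "sedan", "coupe"]
print(cramers_v(color, body))

# Two variables whose values co-occur evenly show no association.
print(cramers_v([0, 0, 1, 1], [0, 1, 0, 1]))
```

The first pair of variables is perfectly associated and the second is independent, which matches the two ends of the 0 to 1 range described above.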
Small improvements
In addition to metrics within the two new domains, v0.11 of Torchmetrics contains other smaller changes and fixes:
- TotalVariation metric has been added to the image package, which measures the complexity of an image with respect to its spatial variation.
- MulticlassExactMatch metric has been added to the classification package, which for example can be used to measure sentence-level accuracy where all tokens need to match for a sentence to be counted as correct.
- KendallRankCorrCoef has been added to the regression package for measuring the overall correlation between two variables.
- LogCoshError has been added to the regression package for measuring the residual error between two variables. It is similar to the mean squared error close to 0 but similar to the mean absolute error away from 0.
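The two regimes of the log-cosh error can be checked numerically. The sketch below is a standard-library illustration, not the Torchmetrics implementation, and the function names are hypothetical. It uses the identity log(cosh(r)) = |r| + log(1 + exp(-2|r|)) - log(2), which avoids overflow for large residuals.

```python
import math


def log_cosh(residual):
    """Numerically stable log(cosh(r)) for a single residual r."""
    a = abs(residual)
    return a + math.log1p(math.exp(-2.0 * a)) - math.log(2.0)


def log_cosh_error(preds, target):
    """Mean log-cosh of the residuals between predictions and targets."""
    return sum(log_cosh(p - t) for p, t in zip(preds, target)) / len(preds)


# Compare against the two approximations:
#   near zero:      r**2 / 2        (squared-error-like)
#   far from zero:  |r| - log(2)    (absolute-error-like)
for r in (0.01, 0.1, 5.0, 50.0):
    print(r, log_cosh(r), r * r / 2.0, abs(r) - math.log(2.0))

print(log_cosh_error([2.5, 0.0, 2.0], [3.0, -0.5, 2.0]))
```

For small residuals the value tracks half the squared residual, and for large residuals it tracks the absolute residual minus a constant, so large outliers are penalized linearly rather than quadratically.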
Finally, Torchmetrics now only supports v1.8 and higher of PyTorch. It was necessary to raise the minimum from v1.3 because we were running into compatibility issues with older versions of PyTorch. We strive to support as many versions of PyTorch as possible, but for the best experience, we always recommend keeping PyTorch and Torchmetrics up to date.
[0.11.0] - 2022-11-30
Added
- Added MulticlassExactMatch to classification metrics (#1343)
- Added TotalVariation to image package (#978)
- Added CLIPScore to new multimodal package (#1314)
- Added regression metrics: KendallRankCorrCoef, LogCoshError
- Added new nominal metrics: CramersV, PearsonsContingencyCoefficient, TschuprowsT, TheilsU
- Added option to pass distributed_available_fn to metrics to allow checks for a custom communication backend, making dist_sync_fn actually useful (#1301)
- Added normalize argument to Inception, FID, KID metrics (#1246)
Changed
- Changed minimum PyTorch version to be 1.8 (#1263)
- Changed interface for all functional and modular classification metrics after refactor (#1252)
Removed
- Removed deprecated BinnedAveragePrecision, BinnedPrecisionRecallCurve, RecallAtFixedPrecision (#1251)
- Removed deprecated LabelRankingAveragePrecision, LabelRankingLoss and CoverageError (#1251)
- Removed deprecated KLDivergence and AUC (#1251)
Fixed
- Fixed precision bug in pairwise_euclidean_distance (#1352)
Contributors
@Borda, @justusschock, @ragavvenkatesan, @shenoynikhil, @SkafteNicki, @stancld
If we forgot someone due to not matching commit email with GitHub account, let us know :]