The calculation in metrics.regex_rows() is not consistent with the documentation #161

https://github.com/georgianpartners/foreshadow/blob/c2c213e0009cfdcf0aa9df75f0a6cf4c983d7090/foreshadow/metrics.py#L184

According to the documentation, each row should contribute a value of 0 or 1 before the sum is taken. Instead, each row contributes the length of its match, so the final score can be larger than 1. Here is the code to reproduce the issue:

```python
import pandas as pd
from foreshadow.concrete import DollarFinancialCleaner

x = pd.DataFrame({'price': ['$3', '$5.0', '$5,000.00']})
financial_cleaner = DollarFinancialCleaner()
metric = financial_cleaner.metric_score(x)
print(metric)
```
The expected value is 1, but the result is 4.2 instead.
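
To make the difference concrete, here is a minimal standalone sketch that contrasts the two scoring approaches. It does not use foreshadow at all: `DOLLAR_PATTERN`, `length_based_score`, and `binary_score` are hypothetical names, and the regex is my own assumption rather than the one `DollarFinancialCleaner` uses, so the length-based number will not necessarily equal the 4.2 reported above.

```python
import re

# Hypothetical dollar-amount pattern; NOT the regex used inside foreshadow.
DOLLAR_PATTERN = re.compile(r"\$\d[\d,]*(?:\.\d+)?")


def length_based_score(values):
    """Average matched length per row (the behaviour described in this issue)."""
    if not values:
        return 0.0
    total = 0
    for value in values:
        match = DOLLAR_PATTERN.search(value)
        # Each row contributes the number of matched characters.
        total += len(match.group(0)) if match else 0
    return total / len(values)


def binary_score(values):
    """Fraction of rows that fully match, so each row contributes 0 or 1."""
    if not values:
        return 0.0
    matched = sum(1 for value in values if DOLLAR_PATTERN.fullmatch(value))
    return matched / len(values)


prices = ["$3", "$5.0", "$5,000.00"]
print(length_based_score(prices))  # greater than 1: lengths 2, 4 and 9 are averaged
print(binary_score(prices))        # 1.0: every row matches
```

With the binary version the score always stays in the range [0, 1], which is what the documentation seems to promise.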
