Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(matrix): allow odd matrix format when importing #2305

Merged
merged 2 commits into from
Jan 22, 2025

Conversation

MartinBelthle
Copy link
Contributor

@MartinBelthle MartinBelthle commented Jan 20, 2025

The csv sniffer is not reliable and can detect the wrong delimiter. So we can't trust it.
We have to check ourselves. It's probably a bit longer but at least we don't have failures anymore

@MartinBelthle MartinBelthle self-assigned this Jan 20, 2025
@MartinBelthle MartinBelthle changed the title fix(matrix): allow weird matrix format at the import fix(matrix): allow odd matrix format when importing Jan 20, 2025
Copy link
Contributor

@TheoPascoli TheoPascoli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, just two questions

df = df.dropna(axis=1, how="all") # We want to remove columns full of NaN at the import
matrix = df.to_numpy(dtype=np.float64)
return matrix
raise ValueError("Could not import the matrix")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we have more informations to make the error message more explicit ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately, inside the calling method, I don't even know on which study I'm applied

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I can perhaps catch it higher in the code to raise a different error. For the moment we have a HTTP 500 saying Unexpected server error: Could not import the matrix

Copy link
Contributor Author

@MartinBelthle MartinBelthle Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added a home-made exception to raise a 422 with a clearer exception name (+ a test). So it's slightly better

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok thanks !

return np.zeros(shape=(0, 0))
for delimiter in [",", ";", "\t"]:
with contextlib.suppress(Exception):
df = pd.read_csv(io.BytesIO(data), delimiter=delimiter, header=None).replace(",", ".", regex=True)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

is there a way to load the matrix only once ? Maybe we can do something to "detect" the delimiter to prevent three openings

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just realised that it is exactly what was done before ...

Copy link
Contributor Author

@MartinBelthle MartinBelthle Jan 22, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep indeed. And it's because it wasn't reliable that I had to do this for loop

@MartinBelthle MartinBelthle force-pushed the fix/add-test-case-for-matrix-import branch from d374515 to c4499d6 Compare January 22, 2025 15:09
@MartinBelthle MartinBelthle merged commit 5ca279b into dev Jan 22, 2025
11 of 13 checks passed
@MartinBelthle MartinBelthle deleted the fix/add-test-case-for-matrix-import branch January 22, 2025 15:35
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants