Support 'number-columns-repeated' attribute of ODS cells #616

istride · 2025-03-15T17:25:31Z

Repeat a cell's value if the 'number-columns-repeated' attribute is set.
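
For readers unfamiliar with the attribute, here is a minimal sketch of what "repeating a cell's value" means. It is not the code from this PR: it parses a hand-written row with the standard library's `xml.etree.ElementTree` instead of going through tablib's ODS reader, and the names `ROW_XML` and `expand_row` are hypothetical. Only the two OpenDocument namespaces and the `table:number-columns-repeated` attribute come from the ODS format itself.

```python
import xml.etree.ElementTree as ET

# OpenDocument namespaces for table and text elements.
TABLE_NS = "urn:oasis:names:tc:opendocument:xmlns:table:1.0"
TEXT_NS = "urn:oasis:names:tc:opendocument:xmlns:text:1.0"

# Hand-written example row (an assumption for illustration):
# the middle cell is stored once but stands for three columns.
ROW_XML = f"""
<table:table-row xmlns:table="{TABLE_NS}" xmlns:text="{TEXT_NS}">
  <table:table-cell><text:p>a</text:p></table:table-cell>
  <table:table-cell table:number-columns-repeated="3"><text:p>b</text:p></table:table-cell>
  <table:table-cell><text:p>c</text:p></table:table-cell>
</table:table-row>
""".strip()


def expand_row(row):
    """Return the row's cell values, repeating each cell as often as it declares."""
    values = []
    for cell in row.findall(f"{{{TABLE_NS}}}table-cell"):
        # Join the text of all paragraphs directly inside the cell.
        text = "".join(p.text or "" for p in cell.findall(f"{{{TEXT_NS}}}p"))
        raw = cell.get(f"{{{TABLE_NS}}}number-columns-repeated", "1")
        try:
            repeat = int(raw)
        except ValueError:
            # Fall back to a single cell if the attribute is not an integer.
            repeat = 1
        values.extend([text] * repeat)
    return values


row_vals = expand_row(ET.fromstring(ROW_XML))
print(row_vals)  # ['a', 'b', 'b', 'b', 'c']
```

The same mechanism explains the long runs of empty cells discussed below: a single empty cell with a large repeat count expands into thousands of empty strings.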

src/tablib/formats/_ods.py

claudep · 2025-03-15T18:24:35Z

src/tablib/formats/_ods.py

+                    end = row_vals.index("")
+                except ValueError:
+                    end = len(row_vals)
+                dset.headers = row_vals[:end]


I'm a bit worried here that a column with an empty header cell will cut off data. I suppose having an empty header in the middle of headers is not well tested currently, probably something to improve.
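
To make the concern concrete, the following sketch wraps the logic from the hunk above in a standalone function (the name `headers_until_first_empty` is hypothetical, and the `try:` line is inferred from the visible `except`). It shows how an empty header in the middle of the row would drop every header after it.

```python
def headers_until_first_empty(row_vals):
    """Keep header values up to, but not including, the first empty string."""
    try:
        end = row_vals.index("")
    except ValueError:
        # No empty cell at all: keep the whole row.
        end = len(row_vals)
    return row_vals[:end]


# An empty header in second position hides the two headers after it.
print(headers_until_first_empty(["id", "", "name", "email"]))  # ['id']
```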

istride · 2025-03-16

If I revert this, then test_ods_import_set_ragged will fail because the first row of 'ragged.ods' contains 16,380 trailing empty cells. If this constitutes a valid header row, then the assertion in the test would need to be modified to accommodate it. Should I do this instead?

claudep · 2025-03-16

Should we use some heuristic to guess where the headers end, for example treating 5 successive empty headers as the end of the header row?
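
One possible reading of this heuristic, as a sketch only: cut the header row at the start of the first run of `max_empty` consecutive empty values and keep shorter gaps untouched. The function name `trim_header_row`, the parameter name `max_empty`, and the exact cut-off rule are assumptions made for illustration; the thread does not show how the final commit implements the limit.

```python
def trim_header_row(row_vals, max_empty=5):
    """Cut the row where a run of `max_empty` consecutive empty strings begins."""
    run_start = None  # index where the current run of empty values started
    for i, value in enumerate(row_vals):
        if value == "":
            if run_start is None:
                run_start = i
            if i - run_start + 1 >= max_empty:
                # The run is long enough: treat its start as the end of the headers.
                return row_vals[:run_start]
        else:
            run_start = None
    # No sufficiently long run of empty values: keep the row unchanged.
    return list(row_vals)


# A single empty header in the middle survives; the long empty tail is dropped.
ragged = ["id", "", "name"] + [""] * 16380
print(trim_header_row(ragged))  # ['id', '', 'name']
```

With this rule an isolated empty header no longer cuts off data, while a row padded with thousands of empty cells is still reduced to its meaningful prefix.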

istride · 2025-03-19

For other formats (csv, xlsx, html) the header row is just accepted as it is, so I am now hesitant about changing this convention just for ods. The 'ragged.ods' file seems like a very extreme case that is unlikely to occur naturally. I'm more in favour of reverting my change and fixing the test.

istride · 2025-04-22

@claudep I've decided to follow the convention used in other formats, of accepting the header row as-is. Would you let me know if this is ok?

claudep · 2025-04-22

The problem is that in my experience, most .ods files resulting from an xlsx import (a common use case) will almost always have >16000 rows and also a large number of repeated (empty) rows at the end. This will result in tablib building big data structures mostly filled with empty strings. I would really try to avoid that, even if one could consider this an ods import bug.

hugovk reviewed Mar 15, 2025

claudep reviewed Mar 15, 2025

istride added 4 commits July 15, 2025 16:07

Support 'number-columns-repeated' attribute of ODS cells

d66ec21

Catch specific errors when getting 'number-columns-repeated attribute

bff484e

Accept the header row as it is when reading ODS format

5dc164d

Limit the number of empty header row values

5d74fb4

Default is 5.

istride force-pushed the ods-support-number-cols-repeated branch from be85a28 to 5d74fb4 Compare July 15, 2025 16:56

istride requested a review from claudep July 16, 2025 09:11
