Open
Labels: bug, good first issue, xlsx
Description
Bug
Docling splits a single table into two tables. The rightmost column is detected as a separate table instead of being part of the main table.
Steps to reproduce
Convert a document with a table where:
- Table has multiple columns
- One header cell is empty (no content)
- Last column has different visual spacing
- All rows span the full width
Example table structure:
| Product/Integration | Sub-category | ID | Question | Answer | |
|---|---|---|---|---|---|
| Overview | Purpose and Use Cases | AI 1 | What is the main objective? | The main objective is... | Additional details here |
| Overview | Purpose and Use Cases | AI 2 | What are the specific use cases? | The AI is applied for... | More information |
| Overview | Purpose and Use Cases | AI 3 | What types of data will the AI process? | The AI processes code data | With Privacy Mode disabled, no code data is stored |
Result:
- Table 1: Contains columns 1-5 (Product/Integration through Answer)
- Table 2: Contains only column 6 (the column with empty header)
Expected: a single table with all 6 columns.
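One plausible explanation, which is an assumption and not confirmed against Docling's source, is that the table boundary is derived from the header row alone, so the empty header cell ends the table one column early. The sketch below is a simplified, hypothetical boundary scan (`find_tables` is not a Docling function) that reproduces the same split on a grid shaped like the example above:

```python
from typing import List, Tuple

Grid = List[List[str]]
Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def find_tables(grid: Grid) -> List[Box]:
    """Naive scan (hypothetical): a table starts at the first unvisited
    non-empty cell, extends right while the starting ROW has content,
    and extends down while the starting COLUMN has content."""
    visited = set()
    boxes: List[Box] = []
    for r, row in enumerate(grid):
        for c, cell in enumerate(row):
            if cell == "" or (r, c) in visited:
                continue
            right = c
            while right + 1 < len(row) and grid[r][right + 1] != "":
                right += 1
            bottom = r
            while bottom + 1 < len(grid) and grid[bottom + 1][c] != "":
                bottom += 1
            for rr in range(r, bottom + 1):
                for cc in range(c, right + 1):
                    visited.add((rr, cc))
            boxes.append((r, c, bottom, right))
    return boxes


# Same shape as the example: 6 columns, last header cell empty.
GRID: Grid = [
    ["Product/Integration", "Sub-category", "ID", "Question", "Answer", ""],
    ["Overview", "Purpose and Use Cases", "AI 1", "q1", "a1", "Additional details here"],
    ["Overview", "Purpose and Use Cases", "AI 2", "q2", "a2", "More information"],
    ["Overview", "Purpose and Use Cases", "AI 3", "q3", "a3", "With Privacy Mode disabled..."],
]

boxes = find_tables(GRID)
print(boxes)
```

With this heuristic the first box covers columns 1-5 including the header row, and the second box covers only column 6 starting at the first data row, which matches the observed result.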
Docling version
docling 2.57.0
Question: Is there a way to configure table boundary detection to prevent this splitting?
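Independently of whether such a setting exists, the two fragments could be rejoined in post-processing. This is only a rough sketch: it assumes both fragments have already been exported as lists of rows and that the right-hand fragment is missing only its (empty) header row. `merge_side_by_side` is a hypothetical helper, not part of Docling.

```python
from typing import List

Row = List[str]


def merge_side_by_side(left: List[Row], right: List[Row]) -> List[Row]:
    """Append the columns of `right` to `left`, bottom-aligning the rows.

    Rows missing at the top of `right` (for example an empty header row)
    are filled with empty strings.
    """
    if len(right) > len(left):
        raise ValueError("right fragment has more rows than left fragment")
    width = len(right[0]) if right else 0
    padding = [[""] * width for _ in range(len(left) - len(right))]
    return [l + r for l, r in zip(left, padding + right)]


# Fragments shaped like the reported result (columns shortened for brevity).
table_1 = [
    ["Question", "Answer"],
    ["What is the main objective?", "The main objective is..."],
    ["What are the specific use cases?", "The AI is applied for..."],
]
table_2 = [
    ["Additional details here"],
    ["More information"],
]

merged = merge_side_by_side(table_1, table_2)
for row in merged:
    print(row)
```

Bottom-aligning is a design choice that fits this case, where the split-off column starts one row below the header; it would be wrong for fragments that differ for other reasons.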
teowave