Single table incorrectly detected as two separate tables #2626

@KulovacNedim

Bug

Docling splits a single table into two tables. The rightmost column is detected as a separate table instead of being part of the main table.

Steps to reproduce

Convert a document with a table where:

  • Table has multiple columns
  • One header cell is empty (no content)
  • Last column has different visual spacing
  • All rows span the full width

Example table structure:

| Product/Integration | Sub-category | ID | Question | Answer | |
|---|---|---|---|---|---|
| Overview | Purpose and Use Cases | AI 1 | What is the main objective? | The main objective is... | Additional details here |
| Overview | Purpose and Use Cases | AI 2 | What are the specific use cases? | The AI is applied for... | More information |
| Overview | Purpose and Use Cases | AI 3 | What types of data will the AI process? | The AI processes code data | With Privacy Mode disabled, no code data is stored |

Result:

  • Table 1: Contains columns 1-5 (Product/Integration through Answer)
  • Table 2: Contains only column 6 (the column with empty header)

Expected: a single table with all 6 columns.
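
As a possible stopgap, the two detected tables can be stitched back together after conversion. The following is a minimal sketch in plain Python, assuming each table has already been exported to a list of rows (for example from a DataFrame). The names `merge_side_by_side` and `right_row_offset` are hypothetical and not part of Docling, and whether the split-off column includes a header row is an assumption that may differ in practice.

```python
from typing import List

Row = List[str]
Table = List[Row]


def merge_side_by_side(left: Table, right: Table, right_row_offset: int = 0) -> Table:
    """Append the columns of `right` to the rows of `left`.

    `right_row_offset` is the number of leading rows in `left` that have no
    counterpart in `right` (assumption: 1 if the split-off column lost its
    empty header row, 0 if it kept it).
    """
    if not right:
        return [list(row) for row in left]
    if right_row_offset + len(right) > len(left):
        raise ValueError("right table does not fit next to left table")

    width = max(len(row) for row in right)
    merged: Table = []
    for i, row in enumerate(left):
        j = i - right_row_offset
        if 0 <= j < len(right):
            # Pad short rows so every merged row has the same width.
            extra = list(right[j]) + [""] * (width - len(right[j]))
        else:
            # No matching row in `right`: fill with empty cells.
            extra = [""] * width
        merged.append(list(row) + extra)
    return merged


# Example data taken from the table in this issue.
left = [
    ["Product/Integration", "Sub-category", "ID", "Question", "Answer"],
    ["Overview", "Purpose and Use Cases", "AI 1",
     "What is the main objective?", "The main objective is..."],
    ["Overview", "Purpose and Use Cases", "AI 2",
     "What are the specific use cases?", "The AI is applied for..."],
    ["Overview", "Purpose and Use Cases", "AI 3",
     "What types of data will the AI process?", "The AI processes code data"],
]
right = [
    ["Additional details here"],
    ["More information"],
    ["With Privacy Mode disabled, no code data is stored"],
]

# Assumes the split-off column has no header row, hence an offset of 1.
merged = merge_side_by_side(left, right, right_row_offset=1)
```

This only helps when the two tables are known to belong together. It does not change how table boundaries are detected.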

Docling version

docling 2.57.0

Question: Is there a way to configure table boundary detection to prevent this splitting?

Labels

bug, good first issue, xlsx