
Prototype contribution: Embedded clause analysis for Grafite evaluation use-case #38

@nane100503-gif

Is your feature request related to a problem? Please describe.

Yes. Current LLM evaluation systems struggle to detect deep structural and syntactic errors in complex sentences, especially in morphologically rich languages like Turkish, where embedded clauses can change meaning without visible surface-level errors. This leads to incomplete evaluation of model performance in real-world complex language scenarios.

Describe the solution you'd like

I propose a structural evaluation layer that transforms sentences into hierarchical representations and recursively detects embedded clause structures across different syntactic roles. This allows the system to identify where LLM outputs preserve or distort deep grammatical relationships rather than only evaluating surface-level correctness.
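A minimal sketch of the recursive detection step, assuming a hypothetical node schema with `type`, `role`, `text`, and `children` keys (the prototype's actual schema is not shown in this issue, so these names are illustrative only). The example tree is a hand-annotated analysis of one Turkish sentence, not output from a parser.

```python
from typing import Any

# Hypothetical schema: {"type": "clause" | "phrase", "role", "text", "children"}
Node = dict[str, Any]


def find_embedded_clauses(node: Node, depth: int = 0) -> list[dict[str, Any]]:
    """Return every clause nested below `node`, in document order.

    `depth` counts enclosing clauses: 1 means directly embedded in the root.
    """
    found: list[dict[str, Any]] = []
    for child in node.get("children", []):
        if child.get("type") == "clause":
            found.append(
                {"depth": depth + 1, "role": child["role"], "text": child["text"]}
            )
            found.extend(find_embedded_clauses(child, depth + 1))
        else:
            # Phrases do not add clause depth, but may still contain clauses.
            found.extend(find_embedded_clauses(child, depth))
    return found


# Hand-annotated example: "I heard that Ayşe said that Ali came."
sentence: Node = {
    "type": "clause",
    "role": "root",
    "text": "Ayşe'nin Ali'nin geldiğini söylediğini duydum",
    "children": [
        {
            "type": "clause",
            "role": "object",
            "text": "Ayşe'nin Ali'nin geldiğini söylediğini",
            "children": [
                {"type": "phrase", "role": "subject", "text": "Ayşe'nin", "children": []},
                {
                    "type": "clause",
                    "role": "object",
                    "text": "Ali'nin geldiğini",
                    "children": [
                        {"type": "phrase", "role": "subject", "text": "Ali'nin", "children": []},
                    ],
                },
            ],
        },
    ],
}

for clause in find_embedded_clauses(sentence):
    print(clause["depth"], clause["role"], clause["text"])
```

Tracking depth and syntactic role for each match is one possible design; it keeps the output flat and easy to compare across sentences while still recording how deeply each clause is nested.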

Describe alternatives you've considered

I considered traditional evaluation methods such as BLEU scores, semantic similarity metrics, and general-purpose parsing approaches. However, these methods do not capture recursive embedded clause structures or fine-grained syntactic dependencies, especially in complex sentence constructions.

Additional context

I have implemented a working prototype in Google Colab that demonstrates this approach using structured JSON representations and recursive extraction of embedded clauses in Turkish sentences, including medical-style texts. A demo video is attached showing the full pipeline from sentence input to structured output extraction.
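To illustrate how JSON representations could feed an evaluation check, here is a rough sketch that compares the embedded clause structure of a reference sentence against that of a model output. The JSON layout, the `clause_signature` and `compare_structures` helpers, and the role labels are all assumptions made for this example; they are not taken from the Colab prototype.

```python
import json
from collections import Counter


def clause_signature(node: dict, depth: int = 0) -> Counter:
    """Count (depth, role) pairs for every embedded clause under `node`."""
    counts: Counter = Counter()
    for child in node.get("children", []):
        if child.get("type") == "clause":
            counts[(depth + 1, child.get("role"))] += 1
            counts.update(clause_signature(child, depth + 1))
        else:
            counts.update(clause_signature(child, depth))
    return counts


def compare_structures(reference_json: str, output_json: str) -> dict:
    """Report clause slots present in only one of the two structures."""
    ref = clause_signature(json.loads(reference_json))
    out = clause_signature(json.loads(output_json))
    # Counter subtraction keeps only positive counts.
    return {"missing": dict(ref - out), "extra": dict(out - ref)}


# Reference: an object clause that itself contains an object clause.
reference_json = json.dumps({
    "type": "clause", "role": "root", "children": [
        {"type": "clause", "role": "object", "children": [
            {"type": "clause", "role": "object", "children": []},
        ]},
    ],
})

# Hypothetical model output: the inner clause was flattened into an adverbial.
output_json = json.dumps({
    "type": "clause", "role": "root", "children": [
        {"type": "clause", "role": "object", "children": []},
        {"type": "clause", "role": "adverbial", "children": []},
    ],
})

print(compare_structures(reference_json, output_json))
```

Here the report would flag the depth-2 object clause as missing and the depth-1 adverbial clause as extra, which is the kind of structural distortion a surface-level metric can overlook.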

Demo 🔗 👉 https://screenrec.com/share/4dIDcZSfbF
