Metadata correction for 2026.loresmt-1.7 #7881

@DavidSamuell

Description

JSON data block

{
  "anthology_id": "2026.loresmt-1.7",
  "abstract": "Neural Machine Translation (NMT) models for low-resource languages suffer significant performance degradation under domain shift. We quantify this challenge using <b>Dhao</b>, an indigenous language of Eastern Indonesia with no digital footprint beyond the New Testament (NT). When applied to the unseen Old Testament (OT), a standard NMT model fine-tuned on the NT drops from an in-domain score of 36.17 chrF++ to 27.11 chrF++. To recover this loss, we introduce a <b>hybrid framework</b> where a fine-tuned NMT model generates an initial draft, which is then refined by a Large Language Model (LLM) using Retrieval-Augmented Generation (RAG). The final system achieves 35.21 chrF++ (<tex-math>+8.10</tex-math> recovery), effectively matching the original in-domain quality. Our analysis reveals that this performance is driven primarily by the <b>number of retrieved examples</b> rather than the choice of retrieval algorithm. Qualitative analysis confirms the LLM acts as a robust \"safety net,\" repairing severe failures in zero-shot domains.",
  "authors": [
    {
      "first": "David Samuel",
      "last": "Setiawan",
      "id": "david-samuel-setiawan/unverified"
    },
    {
      "first": "Raphaël",
      "last": "Merx",
      "id": "raphael-merx"
    },
    {
      "first": "Jey Han",
      "last": "Lau",
      "id": "jey-han-lau"
    }
  ],
  "authors_old": "David Samuel  Setiawan | Raphael  Merx | Jey Han  Lau",
  "authors_new": "David Samuel  Setiawan | Raphaël  Merx | Jey Han  Lau"
}

Labels: approved, correction, metadata
