- 
          
- 
                Notifications
    You must be signed in to change notification settings 
- Fork 307
Added PO handling for new lines + tests #3282
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
| WalkthroughPo parsing was changed to split header meta lines by raw newline characters and to interpret common escape sequences inside quoted strings (e.g.,  Changes
 Sequence Diagram(s)sequenceDiagram
  participant FS as FileSystem
  participant Parser as PoParser
  participant Test as TestRunner
  FS->>Parser: load example.po (header + entries)
  note right of Parser #DDEEFF: Header processing
  Parser->>Parser: processHeader(split by raw "\n")
  note right of Parser #F6F2DD: Entry parsing
  Parser->>Parser: parse msgid/msgstr chars
  alt escaped char inside quotes
    Parser->>Parser: translate '\n','\r','\t','"','\\' → actual char
  else unknown escape
    Parser->>Parser: keep backslash + char
  end
  Parser->>Test: produce translations (11)
  Test->>FS: compare against updated example.po expectations
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
 Pre-merge checks and finishing touches❌ Failed checks (1 warning)
 ✅ Passed checks (2 passed)
 ✨ Finishing touches
 🧪 Generate unit tests (beta)
 📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
 🚧 Files skipped from review as they are similar to previous changes (1)
 Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment  | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (4)
backend/data/src/main/kotlin/io/tolgee/formats/po/in/PoParser.kt (2)
54-68: Header split on LF aligns with new decoding; add safety and locale fixes.Good change. Two follow-ups:
- Guard malformed header lines (no colon) to avoid IndexOutOfBounds.
- Use Locale.ROOT for case-insensitive keys; avoid default-locale surprises (e.g., Turkish i).
- Minor: use lineSequence()/forEach instead of map-for-side-effects.
Apply:
- it.msgstr.split("\n").map { metaLine -> - val trimmed = metaLine.trim() - if (trimmed.isBlank()) { - return@map - } - val colonPosition = trimmed.indexOf(":") - val key = trimmed.substring(0 until colonPosition) - val value = trimmed.substring(colonPosition + 1).trim() - when (key.lowercase(Locale.getDefault())) { + it.msgstr.lineSequence().forEach { metaLine -> + val trimmed = metaLine.trim() + if (trimmed.isEmpty()) return@forEach + val colonPosition = trimmed.indexOf(':') + if (colonPosition <= 0) return@forEach + val key = trimmed.substring(0, colonPosition).lowercase(Locale.ROOT) + val value = trimmed.substring(colonPosition + 1).trim() + when (key) { "project-id-version" -> result.projectIdVersion = value "language" -> result.language = value "plural-forms" -> result.pluralForms = value else -> result.other[key] = value } - } + }Please add a small test asserting header meta (e.g., language/plural-forms) still parses when a line lacks a colon or has trailing spaces.
129-137: Escape decoding inside quotes looks right; consider completing the table.Current set covers n/r/t/"/\. Add common C escapes b (backspace), f (form feed), v (vertical tab). Unknowns remain literal (good).
- val specialEscape: Char? = if (quoted) when (this) { + val specialEscape: Char? = if (quoted) when (this) { 'n' -> '\n' 'r' -> '\r' 't' -> '\t' + 'b' -> '\b' + 'f' -> '\u000C' + 'v' -> '\u000B' '"' -> '"' '\\' -> '\\' else -> null } else nullAdd one assertion that unknown escapes (e.g., "\q") are preserved as two characters and that "\b" maps to backspace when enabled.
Also applies to: 138-143
backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoParserTest.kt (1)
30-31: Avoid index fragility; assert by msgid.Order can shift; look up entries by msgid for robustness.
- assertThat(result.translations[10].msgstr.toString()).isEqualTo("This\nis\na\nmultiline\nstring") - assertThat(result.translations[11].msgstr.toString()).isEqualTo("This\r\nis\r\na\r\nmultiline\r\nstring") + val lf = result.translations.first { it.msgid.toString() == "Multiline message with \\n" } + assertThat(lf.msgstr.toString()).isEqualTo("This\nis\na\nmultiline\nstring") + val crlf = result.translations.first { it.msgid.toString() == "Multiline message with \\n\\r" } + assertThat(crlf.msgstr.toString()).isEqualTo("This\r\nis\r\na\r\nmultiline\r\nstring")backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoFileProcessorTest.kt (1)
32-32: Consider asserting presence, not only count.Exact size can fluctuate as the fixture evolves. Optionally assert keys exist to make intent explicit.
assertThat(mockUtil.fileProcessorContext.translations).hasSize(11) +assertThat(mockUtil.fileProcessorContext.translations.keys) + .contains("Multiline message with \\n", "Multiline message with \\n\\r")
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
- backend/data/src/main/kotlin/io/tolgee/formats/po/in/PoParser.kt(2 hunks)
- backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoFileProcessorTest.kt(1 hunks)
- backend/data/src/test/kotlin/io/tolgee/unit/formats/po/in/PoParserTest.kt(1 hunks)
- backend/data/src/test/resources/import/po/example.po(1 hunks)
🔇 Additional comments (1)
backend/data/src/test/resources/import/po/example.po (1)
53-57: Test data additions LGTM.The LF/CRLF cases are well chosen and align with parser changes.
Summary by CodeRabbit
Bug Fixes
Tests