Feat/1253 basic punctuation #1294
Open
sudhanshu112233shukla wants to merge 2 commits into EpicenterHQ:main from
…terHQ#1253) - Adds 'simple_punctuation' transformation step type - Implements regex-based replacement for period, comma, newline, etc. - Updates database schema to Version 3 - Includes unit tests verifying logic
…atches and improve whitespace handling (EpicenterHQ#1253)
This PR adds the “Simple Punctuation” transformation step requested in issue #1253. During post-processing, it converts spoken punctuation commands such as “period,” “comma,” and “new line” into the corresponding symbols.

The change adds a new simple_punctuation entry to the transformation registry and updates the database schema to version 3. The transformation logic lives in transformation-logic.ts and uses regular expressions with word-boundary checks to avoid false positives, so a word like “commander” is not altered by the “comma” rule. The “new line” command gets special handling: the leading space that would otherwise start the following line is removed.

Unit tests were added in transformer.test.ts, covering case insensitivity, spacing variations, and edge cases. I verified the change locally by running bun test src/lib/query/isomorphic/transformer.test.ts, which passed. The tests confirm:

- Basic replacement: “Hello comma world” → “Hello, world”
- Case-insensitive matching: “Period” → “.”
- Word-boundary safety: “The commander” remains unchanged
- Newline handling: “Line one new line Line two” → “Line one\nLine two”

Closes #1253.
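For reviewers who want a picture of the approach before opening the diff, here is a minimal sketch of regex-based replacement with word-boundary checks. It is not the code in transformation-logic.ts: the names PUNCTUATION_RULES, NEW_LINE_PATTERN, and applySimplePunctuation are hypothetical, and the rule set is limited to the three commands named in the description.

```typescript
// Illustrative sketch only; names and rule set are assumptions,
// not the actual contents of transformation-logic.ts.

type PunctuationRule = { pattern: RegExp; symbol: string };

// Each pattern consumes the whitespace before the spoken command so the
// symbol attaches to the preceding word. \b on both sides keeps words
// such as "commander" from matching the "comma" rule.
const PUNCTUATION_RULES: ReadonlyArray<PunctuationRule> = [
  { pattern: /\s*\bperiod\b/gi, symbol: "." },
  { pattern: /\s*\bcomma\b/gi, symbol: "," },
];

// "new line" consumes whitespace on both sides so the following line
// does not start with a stray space.
const NEW_LINE_PATTERN = /\s*\bnew\s+line\b\s*/gi;

function applySimplePunctuation(text: string): string {
  // Handle the two-word command first, then the single-word commands.
  let result = text.replace(NEW_LINE_PATTERN, "\n");
  for (const { pattern, symbol } of PUNCTUATION_RULES) {
    result = result.replace(pattern, symbol);
  }
  return result;
}

console.log(JSON.stringify(applySimplePunctuation("Hello comma world")));
console.log(JSON.stringify(applySimplePunctuation("The commander")));
console.log(JSON.stringify(applySimplePunctuation("Line one new line Line two")));
```

Under these assumptions the sketch reproduces the four cases listed in the description. A known limitation of this style of matching is that a literal use of the word (for example “a long period of time”) is also replaced; whether the PR guards against that is not stated here.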