Python runtime: NoViableAltException at <EOF> reported by listener despite successful exit from start rule in trace

**Versions:**
*   ANTLR Tool version: 4.13.2
*   antlr4-python3-runtime version: 4.13.2
*   Python version: 3.13.2
*   OS: macOS 13.7 Ventura (darwin 22.6.0)

**Problem Description:**
The ANTLR Python3 runtime reports a `NoViableAltException` at the `<EOF>` token for certain input files, even though the parser trace (`parser.setTrace(True)`) shows what looks like a successful exit from the start rule (`program`) with `<EOF>` as the next token.

**Grammar (Minimal Relevant Parts):**

*KumirLexer.g4:*
```antlr
lexer grammar KumirLexer;

// ... (keywords, operators, literals) ...

ID : LETTER (LETTER | DIGIT | '_' | '@')* ;

LINE_COMMENT : '|' ~[\r\n]* -> channel(HIDDEN);
DOC_COMMENT  : '#' ~[\r\n]* -> channel(HIDDEN);
WS           : [ \t\r\n]+ -> skip;

fragment LETTER : [a-zA-Zа-яА-ЯёЁ]; // Note: Includes Cyrillic letters
// ... (other fragments) ...
```
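
As a sanity check on the lexer side, the `ID` rule can be mirrored with a Python regular expression. This is only a sketch: the `DIGIT` fragment is elided above, so `[0-9]` is an assumption, and `ID_PATTERN` and `is_id` are hypothetical names used for illustration. It also shows why `ё`/`Ё` are listed explicitly in `LETTER`: they fall outside the `а-я` and `А-Я` code point ranges.

```python
import re

# Mirror of the LETTER fragment from KumirLexer.g4.
LETTER = "a-zA-Zа-яА-ЯёЁ"
# Assumption: the elided DIGIT fragment is [0-9].
DIGIT = "0-9"

# ID : LETTER (LETTER | DIGIT | '_' | '@')* ;
ID_PATTERN = re.compile(f"[{LETTER}][{LETTER}{DIGIT}_@]*")


def is_id(text: str) -> bool:
    """Return True if the whole string matches the ID rule as mirrored here."""
    return ID_PATTERN.fullmatch(text) is not None


# 'ё' (U+0451) and 'Ё' (U+0401) lie outside а-я (U+0430..U+044F)
# and А-Я (U+0410..U+042F), hence the explicit entries in LETTER.
assert not ("а" <= "ё" <= "я")
assert not ("А" <= "Ё" <= "Я")
```

Note that this regex ignores rule priority: in the real lexer, keywords such as `алг` are matched by their own rules before `ID`.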

*KumirParser.g4:*
```antlr
parser grammar KumirParser;

options { tokenVocab=KumirLexer; }

// ... (expression rules, type rules, etc.) ...

statement
    : variableDeclaration SEMICOLON?
    | assignmentStatement SEMICOLON?
    | ioStatement SEMICOLON?
    | ifStatement SEMICOLON?
    | switchStatement SEMICOLON?
    | loopStatement SEMICOLON?
    | exitStatement SEMICOLON?
    | pauseStatement SEMICOLON?
    | stopStatement SEMICOLON?
    | assertionStatement SEMICOLON?
    | procedureCallStatement SEMICOLON?
    | SEMICOLON
    ;

statementSequence
    : statement*
    ;

algorithmBody
    : statementSequence
    ;

// Captures tokens for algorithm name using a predicate
algorithmNameTokens
    : ( {self._input.LA(1) != self.LPAREN and \
         self._input.LA(1) != self.ALG_BEGIN and \
         self._input.LA(1) != self.PRE_CONDITION and \
         self._input.LA(1) != self.POST_CONDITION and \
         self._input.LA(1) != self.SEMICOLON and \
         self._input.LA(1) != self.EOF}? .
      )+
    ;

algorithmHeader
    : ALG_HEADER (typeSpecifier)? algorithmNameTokens (LPAREN parameterList? RPAREN)? SEMICOLON?
    ;

algorithmDefinition
    : algorithmHeader (preCondition | postCondition | variableDeclaration)*
      ALG_BEGIN
      algorithmBody
      ALG_END (algorithmName)? SEMICOLON?
    ;

// ... (module definition rules) ...

program // Start Rule
    : programItem* (moduleDefinition | algorithmDefinition)* SEMICOLON? EOF
    ;
```
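
The intent of the `algorithmNameTokens` predicate can be sketched in plain Python: consume any token until one of the listed stop types appears, and require at least one token. This is a simplified model, not how ANTLR's adaptive prediction actually evaluates predicates; the `(type, text)` tuples, the assumption that name words lex as `ID`, and the helper `take_name_tokens` are hypothetical.

```python
from typing import List, Tuple

Token = Tuple[str, str]  # (token type name, text); a stand-in for ANTLR tokens

# Token types that the predicate in algorithmNameTokens refuses to consume.
STOP_TYPES = {"LPAREN", "ALG_BEGIN", "PRE_CONDITION", "POST_CONDITION", "SEMICOLON", "EOF"}


def take_name_tokens(tokens: List[Token], start: int) -> Tuple[List[str], int]:
    """Consume tokens from `start` until a stop type is seen.

    Returns the consumed texts and the index of the first unconsumed token.
    Raises ValueError if nothing can be consumed, mirroring the `+` quantifier.
    """
    i = start
    name = []
    while i < len(tokens) and tokens[i][0] not in STOP_TYPES:
        name.append(tokens[i][1])
        i += 1
    if not name:
        raise ValueError("algorithm name requires at least one token")
    return name, i


# Hand-written token list for the header of 15-while.kum
# (hidden-channel comments omitted; token types are assumed).
header = [("ALG_HEADER", "алг"), ("ID", "Количество"), ("ID", "цифр"), ("ALG_BEGIN", "нач")]
name, next_index = take_name_tokens(header, 1)
```

Under these assumptions the name is captured as the two tokens `Количество` and `цифр`, and parsing resumes at `нач`.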

**Example Input File (`15-while.kum` - triggers the error):**
(Note: the file uses `\n` line endings after normalizing `\r\n` to `\n`.)
```kumir
| Программа к учебнику информатики для 10 класса
| К.Ю. Полякова и Е.А. Еремина.
| Глава 8.
| Программа № 15. Цикл "пока": количество цифр числа
| Вход:
|   12345
| Результат:
|   Цифр в числе: 5
алг Количество цифр
нач
цел n, count
вывод 'Введите целое число: '
ввод n
count:= 0
нц пока n > 0
n:= div(n,10)
count:= count + 1
кц
вывод 'Цифр в числе: ', count
кон



```

**Code to Reproduce (Python):**
```python
import sys

from antlr4 import InputStream, CommonTokenStream
from antlr4.error.ErrorListener import ErrorListener
# Assuming the generated KumirLexer and KumirParser modules are importable
from KumirLexer import KumirLexer
from KumirParser import KumirParser

class SyntaxErrorListener(ErrorListener):
    def __init__(self):
        super().__init__()
        self.errors = []

    def syntaxError(self, recognizer, offendingSymbol, line, column, msg, e):
        offending_symbol_text = repr(offendingSymbol.text) if offendingSymbol else 'None'
        exception_type = type(e).__name__ if e else 'None'
        error_msg = (f"line {line}:{column} MSG: {msg} | "
                     f"OFFENDING_SYMBOL: {offending_symbol_text} | "
                     f"EXCEPTION: {exception_type}")
        self.errors.append(error_msg)
        # Also print to stderr immediately for visibility
        print(f"[ERROR] {error_msg}", file=sys.stderr)


def parse_kumir_code(code: str):
    # Normalize line endings
    code = code.replace('\r\n', '\n')
    input_stream = InputStream(code)
    lexer = KumirLexer(input_stream)
    stream = CommonTokenStream(lexer)
    parser = KumirParser(stream)

    parser.removeErrorListeners()
    error_listener = SyntaxErrorListener()
    parser.addErrorListener(error_listener)

    # Enable rule enter/exit tracing
    print("\n--- Parser Trace Start ---", file=sys.stderr)
    parser.setTrace(True)

    try:
        tree = parser.program() # Start rule
        print("--- Parser Trace End ---", file=sys.stderr)  # Reached whenever program() returns without raising
        return tree, error_listener.errors
    except Exception as e:
        print(f"[EXCEPTION DURING PARSE] {e}", file=sys.stderr)
        return None, [str(e)]

# Example usage:
file_content = """
| ... comments ...
алг Количество цифр
нач
цел n, count
вывод 'Введите целое число: '
ввод n
count:= 0
нц пока n > 0
n:= div(n,10)
count:= count + 1
кц
вывод 'Цифр в числе: ', count
кон



""" # Content of 15-while.kum

tree, errors = parse_kumir_code(file_content)

if errors:
    print("\n--- Parsing Errors Reported by Listener: ---")
    for err in errors:
        print(err)
else:
    print("\n--- No Parsing Errors Reported by Listener ---")

if tree:
    print("\n--- Parse Tree Built Successfully ---")
else:
    print("\n--- Parse Tree NOT Built ---")

```

**Observed Behavior:**
When running the code with the `15-while.kum` input (and 11 other similar files from our test suite):
1.  The parser trace (`parser.setTrace(True)`) shows a successful entry into and **successful exit** from the `program` rule. `LT(1)` upon exiting is `<EOF>`. Example trace output:
    ```
    --- Parser Trace Start ---
    enter   program, LT(1)=алг
    ... (trace of parsing the whole file) ...
    exit    program, LT(1)=<EOF>
    --- Parser Trace End ---
    ```
2.  However, the `SyntaxErrorListener` **is still invoked** and reports an error:
    ```
    [ERROR] line 24:0 MSG: no viable alternative at input 'алгКоличествоцифрнач...' | OFFENDING_SYMBOL: '<EOF>' | EXCEPTION: NoViableAltException
    ```
3.  A test checking `assert not errors` fails.
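
One caveat worth noting for triage: as far as we understand the generated Python code, each rule method wraps its body in `try` / `except RecognitionException` / `finally`, and `exitRule()` (which fires the trace listener's exit event) is called from the `finally` clause. If so, an `exit program` line would appear in the trace whether or not an error was reported inside the rule. The sketch below imitates that control flow with hypothetical stand-in classes (`ToyParser`, `RecognitionError`); it does not use the ANTLR runtime.

```python
class RecognitionError(Exception):
    """Stand-in for antlr4's RecognitionException (hypothetical simplification)."""


class ToyParser:
    """Hypothetical stand-in that mimics the try/except/finally shape of a rule method."""

    def __init__(self, fail: bool):
        self.fail = fail
        self.trace = []   # what a trace listener would print
        self.errors = []  # what an error listener would receive

    def program(self) -> None:
        self.trace.append("enter   program")
        try:
            if self.fail:
                raise RecognitionError("no viable alternative at input '...'")
        except RecognitionError as err:
            # The error is reported and recovered from here instead of re-raised.
            self.errors.append(str(err))
        finally:
            # The exit event fires on both the success path and the error path.
            self.trace.append("exit    program")


ok, bad = ToyParser(fail=False), ToyParser(fail=True)
ok.program()
bad.program()
```

In this model both parsers produce the same enter/exit trace, and only the `errors` list distinguishes the failed run.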

**Expected Behavior:**
If the parser trace indicates a successful exit from the start rule `program` upon reaching `EOF`, the `SyntaxErrorListener` should not be invoked with a `NoViableAltException` error for that `EOF`. The parsing should be considered successful.

**Additional Notes:**
*   This issue only occurs for 12 out of 60 test files. The other 48 files (including some with very similar structure but lacking certain elements like loops) parse successfully without errors.
*   Normalizing line endings (`\r\n` -> `\n`) did not resolve the issue.
*   Changing the `program` rule's quantifier from `...+ EOF` to `...* EOF` allowed parsing files with a single top-level definition but did not fix the error-on-EOF issue for these 12 files.
*   This problem appears similar to issues #4242 and #3851.
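
The normalization mentioned above only covers `\r\n`. A slightly broader variant, sketched below, also folds lone `\r` into `\n` and strips a leading byte order mark. Whether either occurs in the affected files is an assumption we have not verified, and `normalize_source` is a hypothetical helper name.

```python
def normalize_source(code: str) -> str:
    """Normalize CRLF and lone CR to LF, and drop a leading BOM if present."""
    if code.startswith("\ufeff"):
        code = code[1:]
    # Replace CRLF first so that its CR is not converted a second time.
    return code.replace("\r\n", "\n").replace("\r", "\n")
```

Since the `WS` rule already skips `\r`, this mainly affects reported line numbers; the BOM case matters only if a file was saved with one.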

Could this be a bug in the Python3 runtime's error reporting mechanism or state handling related to EOF, especially since the internal trace reports success?

Python runtime: NoViableAltException at <EOF> reported by listener despite successful exit from start rule in trace #4830
