
[BUG] CEA-708 hangs forever if character set is specified on rust builds #1697


Closed
hrideshmg opened this issue Apr 29, 2025 · 1 comment · Fixed by #1698

Comments

@hrideshmg
Contributor

Can be verified by running this sample using the following arguments: --service 1[EUC-KR].
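For readers unfamiliar with the argument format, the value 1[EUC-KR] pairs a CEA-708 service number with a character set name in square brackets. The following is only a rough sketch of how such a value could be split apart; parse_service_spec is a hypothetical helper written for illustration and is not CCExtractor's actual argument parser.

```rust
/// Hypothetical helper (NOT CCExtractor's real parser): splits a service
/// spec such as "1[EUC-KR]" into a service number and an optional
/// character set name.
fn parse_service_spec(spec: &str) -> Option<(u32, Option<String>)> {
    let spec = spec.trim();
    match spec.find('[') {
        // No bracket: the whole string must be the service number.
        None => spec.parse::<u32>().ok().map(|n| (n, None)),
        Some(open) => {
            // The charset name must be terminated by a trailing ']'.
            let inner = spec[open + 1..].strip_suffix(']')?;
            if inner.is_empty() {
                return None;
            }
            // Everything before '[' must be the service number.
            let number = spec[..open].parse::<u32>().ok()?;
            Some((number, Some(inner.to_string())))
        }
    }
}

fn main() {
    // Prints the parsed form of the argument used in this report.
    println!("{:?}", parse_service_spec("1[EUC-KR]"));
    // A spec without a charset yields only the service number.
    println!("{:?}", parse_service_spec("2"));
}
```

In this sketch a malformed value (missing closing bracket, empty charset, or non-numeric service) yields None rather than a partial result.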

It works as expected when ccextractor is built with -without-rust, but hangs forever otherwise. This is currently causing the sample platform tests to abort on a timeout, as evidenced by this run taking more than 3 hours:

[INFO] Starting with entry 10 of 14
[WARN] Aborting CCExtractor, maximum time elapsed.
[ERROR] Path is empty
[INFO] Finished entry 10 with exit code: -1
[INFO] Starting with entry 11 of 14
[WARN] Aborting CCExtractor, maximum time elapsed.
[ERROR] Path is empty
[INFO] Finished entry 11 with exit code: -1
[INFO] Starting with entry 12 of 14
[ERROR] Path is empty
[INFO] Finished entry 12 with exit code: 0
[INFO] Starting with entry 13 of 14
[WARN] Aborting CCExtractor, maximum time elapsed.
[ERROR] Path is empty
[INFO] Finished entry 13 with exit code: -1
[INFO] Starting with entry 14 of 14
[ERROR] Path is empty
[INFO] Finished entry 14 with exit code: 0
[INFO] Runtime: 03:21:00.9565500
@hrideshmg
Contributor Author

hrideshmg commented May 1, 2025

I've managed to debug this issue after messing around with gdb a bit. It seems to be caused by the iconv crate (which is currently unmaintained) being broken on the latest version of Rust; see this issue.

I've fixed it in this PR by swapping out the iconv crate for encoding_rs, which appears to be the most popular and well-maintained option.

Note that the Rust implementation for this particular case was, and still is, buggy. Regression tests 142, 147 and 149, all of which test this specific scenario, have historically had empty output files (see here for an example), which is why the tests were passing. The C builds, however, produce the proper outputs.
