BLD,BUG: Add charset-normalizer to improve compatibility with non-ascii environments. #266

BeiyanYunyi · 2025-04-16T16:42:56Z

On a system whose LANG environment variable selects a non-English locale, gfortran produces non-ASCII output. My working environment is Linux with LANG=zh_CN.UTF-8. In this environment,

gfortran -E ompgen.F90 -o omp.f90 -cpp

will output:

# 1 "ompgen.F90"
# 1 "<built-in>"
# 1 "<命令行>"
# 1 "ompgen.F90"
!... other code

instead of:

# 1 "ompgen.F90"
# 1 "<built-in>"
# 1 "<command-line>"
# 1 "ompgen.F90"

The Chinese characters on line 3 cause the project to fail to build:

      Traceback (most recent call last):
        File "/home/BeiyanYunyi/.cache/uv/builds-v0/.tmpP5ioKB/lib/python3.11/site-packages/numpy/f2py/crackfortran.py", line 391, in
      readfortrancode
          l = fin.readline()
              ^^^^^^^^^^^^^^
        File "/home/BeiyanYunyi/.local/share/uv/python/cpython-3.11.12-linux-x86_64-gnu/lib/python3.11/fileinput.py", line 292, in
      readline
          line = self._readline()
                 ^^^^^^^^^^^^^^^^
        File "/home/BeiyanYunyi/.local/share/uv/python/cpython-3.11.12-linux-x86_64-gnu/lib/python3.11/fileinput.py", line 372, in
      _readline
          return self._readline()
                 ^^^^^^^^^^^^^^^^
        File "/home/BeiyanYunyi/.local/share/uv/python/cpython-3.11.12-linux-x86_64-gnu/lib/python3.11/encodings/ascii.py", line 26, in
      decode
          return codecs.ascii_decode(input, self.errors)[0]
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
      UnicodeDecodeError: 'ascii' codec can't decode byte 0xe5 in position 71: ordinal not in range(128)

To reproduce the bug, simply run this command in the repo (POSIX environment):

LANG=zh_CN.UTF-8 pip install

As numpy.f2py suggests, installing the charset_normalizer package will likely help f2py determine the input file encoding correctly. Adding charset-normalizer to build-system.requires lets it infer the encoding correctly. With that change, I have successfully built this package.
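The failure mode can be illustrated with a minimal sketch that uses only the Python standard library. It is not f2py's actual code path: the `try_decode` helper is hypothetical, and the byte string simply re-encodes the preprocessor output quoted above as UTF-8, which is an assumption about what gfortran wrote under LANG=zh_CN.UTF-8.

```python
# Preprocessor output as quoted above, assumed to be written as UTF-8.
preprocessed = (
    '# 1 "ompgen.F90"\n'
    '# 1 "<built-in>"\n'
    '# 1 "<命令行>"\n'
    '# 1 "ompgen.F90"\n'
).encode("utf-8")


def try_decode(data: bytes, encoding: str):
    """Hypothetical helper: return (text, None) on success, (None, error) on failure."""
    try:
        return data.decode(encoding), None
    except UnicodeDecodeError as exc:
        return None, exc


# Decoding as ASCII fails on the first byte of the localized marker.
ascii_text, ascii_error = try_decode(preprocessed, "ascii")

# Decoding with the correct encoding succeeds.
utf8_text, utf8_error = try_decode(preprocessed, "utf-8")

print("ascii error:", ascii_error)
print("utf-8 decoded cleanly:", utf8_error is None)
```

The ASCII attempt raises the same kind of UnicodeDecodeError as the traceback, while decoding with the right encoding succeeds. That is presumably why letting f2py detect the encoding fixes the build.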


kafitzgerald · 2025-04-17T21:22:44Z

Thanks for the PR!

I'll take a look at this tomorrow.

BLD,BUG: Add charset-normalizer to improve compability with non-ascii environment

c7dfeb3
