Support non-ASCII characters in PyGMT arguments and text in Figure.text

## Problems

Because of limitations in the PostScript language, GMT can only work with ASCII characters and a small set of non-ASCII characters. See https://docs.generic-mapping-tools.org/latest/cookbook/octal-codes.html for the full list of characters that PostScript/GMT/PyGMT can accept.

These non-ASCII characters must be specified using their [octal codes](https://docs.generic-mapping-tools.org/latest/cookbook/octal-codes.html) or [character escape sequences](https://docs.generic-mapping-tools.org/latest/cookbook/features.html#character-escape-sequences). A few non-ASCII characters (e.g., ü, Î) can be typed directly, because GMT can substitute them with the correct PostScript octal codes.

Users who don't know the limitations may pass non-ASCII characters directly in the arguments. For example:
```
import pygmt
fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="x1c", frame="WSen+tTime (s) vs Distance (°)")
fig.show()
```
The above script produces this "surprising" figure:

![non-ascii](https://user-images.githubusercontent.com/3974108/203463340-bad4d200-252d-477e-8428-0a451fe0c7b0.png)


So, to add a non-ASCII character to a plot, users must know about the limitations, go to https://docs.generic-mapping-tools.org/latest/cookbook/octal-codes.html, look for the character in the four tables, and work out the corresponding octal code (`\260` for the symbol `°`), which is tedious and far from obvious.

After finding the octal code, users may think that changing `°` to `\260` should work:
```
import pygmt
fig = pygmt.Figure()
fig.basemap(region=[0, 10, 0, 5], projection="x1c", frame="WSen+tTime (s) vs Distance (\260)")
fig.show()
```
but it still produces the same "surprising" figure, because the Python interpreter processes the `\260` escape sequence first and converts it to `°` before the string is passed to the GMT API. So, users have to use double backslashes or raw strings:
```
frame="WSen+tTime (s) vs Distance (\\260)"
```
or
```
frame=r"WSen+tTime (s) vs Distance (\260)"
```
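The difference between the three spellings can be checked in plain Python, without calling GMT. A small sketch (the variable names are only for illustration):

```python
# Three ways of writing the same title text in Python source code.
plain = "Distance (\260)"      # octal escape is interpreted by Python
doubled = "Distance (\\260)"   # escaped backslash survives as a literal backslash
raw = r"Distance (\260)"       # raw string: no escape processing at all

# Python has already turned \260 into the single character "°" (U+00B0).
assert plain == "Distance (°)"

# The other two spellings are identical and keep the backslash followed by
# the digits 260, which is the form GMT needs to receive.
assert doubled == raw
assert doubled.endswith("\\260)")
```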

## Solutions

Since Python works well with non-ASCII characters (actually, it works with any Unicode character), it's possible to pass `°` in Python and let PyGMT substitute the non-ASCII characters with the corresponding octal codes.

Here are some tests in Python:

```python
# Python supports non-ASCII characters
>>> "WSen+tTime (s) vs Distance (°)"
'WSen+tTime (s) vs Distance (°)'

# Python knows how to convert \260 to °
>>> "WSen+tTime (s) vs Distance (\260)"
'WSen+tTime (s) vs Distance (°)'

# replace ° with \\260
>>> "WSen+tTime (s) vs Distance (°)".replace("°", "\\260")
'WSen+tTime (s) vs Distance (\\260)'

# how to convert ° to \\260; works only if the code point equals the GMT octal code (true for °)
>>> oct(ord("°")).replace("0o", '\\')
'\\260'
```
So, if PyGMT does these substitutions/conversions internally, it can support non-ASCII characters better. The simplest solution is to define a large dictionary that maps non-ASCII characters (e.g., `°`) to octal codes (e.g., `\260`). Cleverer solutions may also be possible.
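A minimal sketch of the dictionary-based idea, assuming a hypothetical helper named `non_ascii_to_octal` (not part of PyGMT) and a deliberately tiny mapping. Only `°` → `\260` is taken from this issue; the entries for `ü` and `Î` are assumptions that should be checked against the octal-code tables in the GMT documentation. Characters that are not in the mapping are left unchanged.

```python
# Hypothetical mapping from non-ASCII characters to GMT octal codes.
# Only the entry for "°" comes from this issue; the others are assumptions.
CHAR_TO_OCTAL = {
    "°": "\\260",
    "ü": "\\374",
    "Î": "\\316",
}

# str.translate needs a table keyed by code points; str.maketrans builds it.
_TABLE = str.maketrans(CHAR_TO_OCTAL)


def non_ascii_to_octal(text):
    """Replace every mapped character with its octal code; leave the rest alone."""
    return text.translate(_TABLE)


print(non_ascii_to_octal("WSen+tTime (s) vs Distance (°)"))
# prints: WSen+tTime (s) vs Distance (\260)
```

The converted string could then be passed as the `frame` argument. A complete mapping would have to cover every character in the four tables of the GMT documentation.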

## Notes about the possible limitations of the solutions

Non-ASCII characters can be used in many cases:

1. PyGMT arguments, e.g., `frame="WSen+tTime (s) vs Distance (°)"`
2. Text strings as input data, e.g., `fig.text(x=0, y=0, text="Distance (°)")`
3. Text strings in a plaintext file, e.g., a plaintext file with a record like `0 0 Distance (°)`

The above solution should work well for case 1, may or may not work for case 2 (depending on the implementation), and likely won't work for case 3.
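To make the difference between the cases concrete, here is a rough sketch in which one hypothetical helper, `convert`, handles a single argument string (case 1) as well as a sequence of text strings (case 2). The helper name and the single-entry mapping are assumptions for illustration only. Case 3 is not attempted: the text lives in a file, so the file would have to be read and rewritten before its records could be converted.

```python
# Hypothetical single-entry table; "°" -> "\260" is the pair used in this issue.
_TABLE = str.maketrans({"°": "\\260"})


def convert(text):
    """Convert one string, or every string in a sequence of strings."""
    if isinstance(text, str):
        # Case 1 (an argument) or case 2 with a single text string.
        return text.translate(_TABLE)
    # Case 2 with several text strings, e.g. a list.
    return [str(item).translate(_TABLE) for item in text]


# Case 1: an argument string such as a frame setting.
frame = convert("WSen+tTime (s) vs Distance (°)")

# Case 2: text strings given as input data.
labels = convert(["Distance (°)", "Time (s)"])
```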

**Are you willing to help implement and maintain this feature?**

Yes, but more discussion is needed.

Support non-ASCII characters in PyGMT arguments and text in Figure.text #2204
