Internationalization (i18n) / multilingualism for some text fields ? #86

LaurentAjdnik · 2024-11-29T10:51:45Z

Hello.

All text fields included in the spec are provided on a single-language basis.

As is often the case, English is the most widely used language, both in the spec and in examples found elsewhere.

It is usually not a problem when dealing with APIs, since these are seen only by developers or machines.

However, with MCP, I understand some texts (prompts, descriptions...) might be read / selected / typed directly by end users.

Are there any plans to provide an internationalization (i18n) / multilingualism mechanism in the spec?

Or shall we rely on the LLM to handle the (possibly not totally accurate) translations?

Thanks!

jspahrsummers · 2024-12-02T13:16:17Z

Good flag! We indeed overlooked this when designing the spec.

The main things intended to be human-readable are names (e.g., of resources and prompts) and descriptions, so I think we'd want to focus i18n support on those.

Would you be interested in drafting a proposal PR?

LaurentAjdnik · 2024-12-03T08:15:58Z

Would you be interested in drafting a proposal PR?

I'd love to.

However, this would be a somewhat ambitious PR, with quite a few design choices to be made beforehand.

Do you really think this is a must-have feature? If so, I'll gladly give it some work.

I'll follow up with some more targeted questions.

LaurentAjdnik · 2024-12-03T08:32:17Z

What would be the most appropriate standard to encode language codes? ISO 639-1? ISO 639-2? IETF BCP 47? Any other?

LaurentAjdnik · 2024-12-03T08:33:47Z

Should we provide backward compatibility? If so, any advice on how to achieve this here?

LaurentAjdnik · 2024-12-03T08:43:59Z

In terms of workflow, where should language selection occur?

- Server-side: The client adds an optional language code parameter to its requests, and the server returns only the appropriate language if available, or its default language otherwise.
  - This might not make sense if notifications sent by the server must also be i18n-compliant
  - More complex servers
  - Reduced load on network
- Client-side: The server returns all available languages and the client selects its preferred one.
  - More complex clients
  - Heavier load on network

These options would have pretty different impacts on the spec.
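To make the trade-off concrete, here is a minimal sketch of the two response shapes. Everything in it is an assumption for illustration only: the `TRANSLATIONS` table, the `description`, `language` and `defaultLanguage` field names, and the helper functions are hypothetical and not part of the spec.

```python
import json

# Hypothetical translations of a single tool description.
TRANSLATIONS = {
    "en": "Search the knowledge base.",
    "fr": "Rechercher dans la base de connaissances.",
    "es": "Buscar en la base de conocimientos.",
}
DEFAULT_LANGUAGE = "en"


def server_side_response(language_code=None):
    """Server-side selection: only one language travels over the wire."""
    chosen = language_code if language_code in TRANSLATIONS else DEFAULT_LANGUAGE
    return {"description": TRANSLATIONS[chosen], "language": chosen}


def client_side_response():
    """Client-side selection: every available language is returned."""
    return {"description": dict(TRANSLATIONS), "defaultLanguage": DEFAULT_LANGUAGE}


def client_pick(response, language_code):
    """The client chooses from the full set, falling back to the server default."""
    descriptions = response["description"]
    chosen = language_code if language_code in descriptions else response["defaultLanguage"]
    return descriptions[chosen]


one_language = server_side_response("fr")
all_languages = client_side_response()

print(one_language["description"])
print(client_pick(all_languages, "fr"))
# The client-side payload grows with every language the server supports.
print(len(json.dumps(one_language)), "<", len(json.dumps(all_languages)))
```

With server-side selection the response keeps a single string per field, so existing clients could keep reading it as before; with client-side selection the field becomes a map, which changes the shape every client has to parse.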

LaurentAjdnik · 2024-12-03T09:26:00Z

The main things intended to be human-readable are names (e.g., of resources and prompts) and descriptions [...]

Names are used as identifiers. I fear we might end up having problems by making them multilingual, for prompt names or tool names in particular.

Especially if they end up being hardcoded in some clients when calling "well-known" servers, or for inter-server communication someday, if this ever occurs.

On the other hand, descriptions can be translated harmlessly.

jspahrsummers · 2024-12-05T15:00:13Z

Sorry for the delay in getting back to you here!

Would you be interested in drafting a proposal PR?

I'd love to.

However, this would be a somewhat ambitious PR, with quite a few design choices to be made beforehand.

Makes sense. We can hold off on the PR while we figure out these key questions, just to avoid unnecessary work that then requires changes.

Do you really think this is a must-have feature? If so, I'll gladly give it some work.

I don't know if it's must have, but I think it's pretty silly in 2024 to not consider internationalization. Really glad you brought it up!

Regarding the specific questions, I'd be curious for your initial opinions. Always happy to discuss and suggest alternatives, if needed, but you've had a lot of good thoughts so far and I want to make sure to hear you first.

LaurentAjdnik · 2024-12-07T11:45:14Z

Regarding the specific questions, I'd be curious for your initial opinions. Always happy to discuss and suggest alternatives, if needed, but you've had a lot of good thoughts so far and I want to make sure to hear you first.

OK, my suggestions will follow. I'll address each topic in a specific "thread" inside this issue, to keep things separated.

LaurentAjdnik · 2024-12-07T12:16:00Z

What would be the most appropriate standard to encode language codes? ISO 639-1? ISO 639-2? IETF BCP 47? Any other?

Let's go for the wiiiiiiidely used IETF BCP 47. For instance:

- Basic language codes: en for English, fr for French, es for Spanish, zh for "Generic Chinese" (usually defaulted to Mandarin), yue for Cantonese...
- Language + Region Codes: en-US for American English, en-GB for British English, fr-CA for Canadian French...

The component that selects the language (either client-side or server-side, see other topic) SHOULD provide a fallback mechanism to a close variant if the preferred language is not available. For instance: en=>en-US or en-US=>en or en-GB=>en-US.

If no close variant is available, or if i18n is not handled by the component, the "default" language MUST be returned (text + corresponding language code).
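As a rough illustration of that fallback, here is a minimal sketch. It assumes a simple two-step strategy: first truncate the preferred tag subtag by subtag (in the spirit of the "lookup" scheme of RFC 4647, which covers en-US=>en), then accept any available tag sharing the same primary language (which covers en=>en-US and en-GB=>en-US). The second step and the function names `candidates` and `select_language` are assumptions of this sketch, not something the spec or BCP 47 prescribes.

```python
from typing import Optional


def candidates(tag: str) -> list[str]:
    """Progressively shorter prefixes of a tag: 'en-us' -> ['en-us', 'en']."""
    parts = tag.split("-")
    return ["-".join(parts[:i]) for i in range(len(parts), 0, -1)]


def select_language(preferred: Optional[str], available: list[str], default: str) -> str:
    """Pick the tag to answer in, falling back to the default language."""
    if not preferred:
        return default

    # Language tags are case-insensitive, so compare in lower case.
    by_lower = {tag.lower(): tag for tag in available}

    # Step 1: exact match, then truncation (en-US => en).
    for candidate in candidates(preferred.lower()):
        if candidate in by_lower:
            return by_lower[candidate]

    # Step 2 (assumption): any variant of the same primary language
    # (en => en-US, en-GB => en-US).
    primary = preferred.lower().split("-")[0]
    for tag in available:
        if tag.lower().split("-")[0] == primary:
            return tag

    # No close variant: return the default language.
    return default


print(select_language("en-US", ["en", "fr"], default="fr"))
print(select_language("en-GB", ["en-US", "fr"], default="fr"))
print(select_language("de", ["en-US", "fr"], default="fr"))
```

Whatever tag is selected would be returned alongside the text, so the caller always knows which language it actually received.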

LaurentAjdnik · 2024-12-07T12:48:32Z

Names are used as identifiers. I fear we might end up having problems by making them multilingual [...]

The more I think about it, the more I feel it would be a hassle to translate names for tools and prompts:

- Servers would have to check for variants of names (much higher complexity)
- The spec indeed states they are "unique identifiers"

On the other hand, names for resources could be i18n-compliant, since the identifier would be the uri in that case. The spec indeed states they are "human-readable names".
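A small sketch of what this distinction could look like on a server, assuming a hypothetical in-memory registry: the tool `name` stays a single stable identifier, and only the `description` varies with the language. The `TOOLS` table, the `get_weather` tool and both helper functions are made up for this example.

```python
# Hypothetical tool registry: the key is the stable, untranslated identifier.
TOOLS = {
    "get_weather": {
        "description": {
            "en": "Get the current weather for a city.",
            "fr": "Obtenir la météo actuelle d'une ville.",
        },
        "handler": lambda city: f"Sunny in {city}",
    },
}


def list_tools(language="en"):
    """List tools with a localized description but an unchanged name."""
    return [
        {
            "name": name,
            "description": tool["description"].get(language, tool["description"]["en"]),
        }
        for name, tool in TOOLS.items()
    ]


def call_tool(name, *args):
    """Dispatch on the identifier only; no name variants to check."""
    return TOOLS[name]["handler"](*args)


print(list_tools("fr"))
print(call_tool("get_weather", "Paris"))
```

A client that hardcodes `get_weather` keeps working regardless of the language in which the tool list was requested.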

That being said, we have another problem that is not spec-related: in Claude Desktop, the "Allow" popup says "Run tool_name from server" without showing the description, which is a flaw in itself. It gets worse if only the description is translated.

LaurentAjdnik · 2024-12-07T17:39:48Z

In terms of workflow, where should language selection occur?
[...]
These options would have pretty different impacts on the spec.

Let's avoid unnecessary network traffic.

For requests, clients should add an optional preferredLanguage parameter.

For responses, servers should add a (mandatory?) language field.

Language selection would be handled by the server.

The server would fall back to its default language if:

- It does not support i18n
- No preferred language was specified
- The preferred language is not available
- No equivalence was found (see above)
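The request/response flow described here could be sketched as follows. Only the `preferredLanguage` and `language` names come from this thread; the `params` dictionary, the `DESCRIPTIONS` table, the helper names and the deliberately simplified equivalence rule (strip the region, e.g. fr-CA => fr) are assumptions of this sketch.

```python
from typing import Optional

# Hypothetical translations held by an i18n-aware server.
DESCRIPTIONS = {
    "en": "Summarize the selected document.",
    "fr": "Résumer le document sélectionné.",
}


def resolve_language(preferred: Optional[str], translations: dict, default: str) -> str:
    """Return the language the server will answer in."""
    if not preferred:
        return default  # no preferred language was specified
    if preferred in translations:
        return preferred  # exact match
    primary = preferred.split("-")[0]
    if primary in translations:
        return primary  # simplified equivalence, e.g. fr-CA => fr
    return default  # not available and no equivalence found


def handle_request(params: dict, translations: dict = DESCRIPTIONS, default: str = "en") -> dict:
    """Build a response that always states which language was used."""
    language = resolve_language(params.get("preferredLanguage"), translations, default)
    return {"description": translations[language], "language": language}


print(handle_request({"preferredLanguage": "fr-CA"}))
print(handle_request({}))

# A server without i18n support only knows its default language.
print(handle_request({"preferredLanguage": "fr"}, translations={"en": "Summarize."}))
```

Because the response always carries `language`, a client can detect that it received the default instead of what it asked for and, if it wants, fall back to translating the text itself.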

LaurentAjdnik · 2024-12-07T17:42:43Z

Should we provide backward compatibility?

Not sure yet, but I feel we might figure out a smooth solution.

If not, this would probably join other major and impactful changes in a future version of the spec.

jspahrsummers · 2024-12-09T13:51:19Z

Thanks for all the thoughts! Broadly, this sounds great. I'd suggest the following:

- preferredLanguage becomes part of the client's advertised capabilities, under a new locale subobject (i.e., capabilities.locale.preferredLanguage)
- language is then part of the server's response in a similar object, iff supported (i.e., capabilities.locale.language)

Done this way, it should actually be 100% backwards compatible, since this doesn't involve changes to the shapes of any other responses—and we can add it to any protocol version.
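Here is one possible reading of that negotiation, as a minimal sketch. The `capabilities.locale.preferredLanguage` and `capabilities.locale.language` paths come from the suggestion above; the `negotiate_locale` function, its parameters, and the choice to omit `locale` entirely when either side lacks i18n support are assumptions, not settled spec behavior.

```python
from typing import Optional


def negotiate_locale(
    client_capabilities: dict,
    supported: Optional[list[str]],
    default: str,
) -> dict:
    """Return the server's capabilities for the initialization response.

    `supported` is None for a server that does not implement i18n.
    Other capabilities are elided from this sketch.
    """
    server_capabilities: dict = {}
    client_locale = client_capabilities.get("locale")

    # Legacy client or legacy server: the response shape is unchanged.
    if client_locale is None or supported is None:
        return server_capabilities

    preferred = client_locale.get("preferredLanguage")
    language = preferred if preferred in supported else default
    server_capabilities["locale"] = {"language": language}
    return server_capabilities


# New client, new server.
print(negotiate_locale({"locale": {"preferredLanguage": "fr"}}, ["en", "fr"], "en"))
# Legacy client: no locale object is added.
print(negotiate_locale({}, ["en", "fr"], "en"))
# Legacy server: no locale object is added either.
print(negotiate_locale({"locale": {"preferredLanguage": "fr"}}, None, "en"))
```

Since both objects are optional additions inside `capabilities`, an implementation that ignores unknown capability keys would be unaffected, which is what makes the change backwards compatible.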

What do you think? If this sounds good to you, let's get started on the spec changes. 🙏

LaurentAjdnik added the enhancement (New feature or request) label Nov 29, 2024

jspahrsummers added this to the after-2024-11-05 milestone Dec 2, 2024

LaurentAjdnik linked a pull request Dec 13, 2024 that will close this issue

Adds i18n support #115

Open

9 tasks
