
ChatCompletions->Choices->Messages->AdditionalProperties not set #415


Closed
jpalvarezl opened this issue Apr 8, 2025 · 8 comments

@jpalvarezl

jpalvarezl commented Apr 8, 2025

Hello,

I am experiencing deserialization issues with ChatCompletions in a newer version of the SDK.

I am currently using version 0.43.0 but I was able to verify that the SDK was working as expected back in version 0.37.0.

Detailed description

Today I was testing the library against an Azure OpenAI endpoint. While debugging, I noticed that some of the Azure-specific fields were not being populated in the additionalProperties map of the returned ChatCompletionMessage.

My request looks something like this:

ChatCompletionCreateParams params = createParamsBuilder("gpt-4o-mini")
        .messages(asList(
                createSystemMessageParam(),
                createUserMessageParam("What do most contributions require you to do?")))
        .additionalBodyProperties(createExtraBodyForByod()) // This method populates the necessary additional fields for the Azure side of things
        .build();
ChatCompletion completion = client.chat().completions().create(params);
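
For reference, here is roughly what createExtraBodyForByod() builds (a sketch only; the field names follow Azure OpenAI's "on your data" request schema, the search endpoint and index name are placeholders, and JsonValue is the SDK's untyped JSON wrapper):

import com.openai.core.JsonValue;

import java.util.List;
import java.util.Map;

final class ByodExtraBody {
    // Hypothetical helper: shape follows the Azure "on your data" request schema,
    // values are placeholders for an actual Azure Search resource.
    static Map<String, JsonValue> createExtraBodyForByod() {
        return Map.of("data_sources", JsonValue.from(List.of(Map.of(
                "type", "azure_search",
                "parameters", Map.of(
                        "endpoint", "https://<search-resource>.search.windows.net",
                        "index_name", "<index-name>",
                        "authentication", Map.of("type", "system_assigned_managed_identity"))))));
    }
}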

While digging into the library code, I set a breakpoint at line 118 of ChatCompletionsServiceImpl.kt and called response.body().readAllBytes().decodeToString(), which returned the following value (I've edited the JSON to make it a bit shorter):

{
  "id" : "REDACTED",
  "model" : "gpt-4o-mini",
  "created" : 1744116299,
  "object" : "extensions.chat.completion",
  "choices" : [ {
    "index" : 0,
    "finish_reason" : "stop",
    "message" : {
      "role" : "assistant",
      "content" : "Most contributions require you to agree to a Contributor License Agreement (CLA), which declares that you have the right to, and actually do, grant the rights to use your contribution [doc1].",
      "end_turn" : true,
      "context" : {
        "citations" : [ {
          "content" : "...",
          "title" : "...",
          "url" : "...",
          "filepath" : "...",
          "chunk_id" : "0"
        }, 
        // ...
        ],
        "intent" : "[\"What do most contributions require?\", \"Requirements for contributions\", \"What is needed for contributions?\"]"
      }
    }
  } ],
  "usage" : {
    "prompt_tokens" : 5436,
    "completion_tokens" : 60,
    "total_tokens" : 5496
  },
  "system_fingerprint" : "REDACTED"
}

The result of calling ChatCompletion completion = client.chat().completions().create(params); looks like this in the debugger:

[debugger screenshot]

As you can see from the image, completion.choices[0].message.additionalProperties is empty, despite having the JSON described above received as response from the request.

Back in version 0.37.0 of the SDK, the "context", "intent" and "end_turn" JSON fields were correctly added under ChatCompletionMessage::additionalProperties, but after updating to 0.43.0 that is no longer the case.
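
For clarity, this is roughly how I expect to read those fields back (a sketch; the accessor and package names are based on the SDK's generated models and may differ between versions):

import com.openai.core.JsonValue;
import com.openai.models.ChatCompletion;
import com.openai.models.ChatCompletionMessage;

import java.util.Map;

final class AdditionalPropertiesCheck {
    static void printAzureFields(ChatCompletion completion) {
        ChatCompletionMessage message = completion.choices().get(0).message();
        // Unknown top-level fields of the message should be captured here.
        Map<String, JsonValue> extra = message._additionalProperties();
        System.out.println(extra.get("end_turn")); // expected: true
        System.out.println(extra.get("context"));  // expected: the citations/intent block
    }
}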

Let me know if there is anything else I can supply for this issue. Thank you for your help!

@TomerAberbach
Collaborator

TomerAberbach commented Apr 8, 2025

I'm surprised you're noticing this behavior because we have JSON roundtripping tests throughout the SDK. e.g.

[permalink to one of the SDK's JSON roundtripping tests]
I tested adding an additional property to these and it works fine (I can also print out the JSON and object and it's correct).

I tried doing a test Azure request, but I don't see any additional properties in the response JSON. How do I get Azure to respond with stuff like that?

@jpalvarezl
Author

I think it's a bit cumbersome, but it needs setup with Azure Search for the "on your data" functionality. Here are the docs for this, but it's a lot of steps! I will try to get you a simpler repro of this.

What I am currently seeing though is that if I do the following:

// Approach 1. Read the raw response body into a String
byte[] responseAsBytes = client.chat().completions().withRawResponse().create(params).body().readAllBytes();
String responseAsString = new String(responseAsBytes, StandardCharsets.UTF_8);
System.out.println(responseAsString);

// Approach 2. Parse into a ChatCompletion and re-serialize
ChatCompletion chatCompletion = client.chat().completions().create(params);
System.out.println(ObjectMappers.jsonMapper().writeValueAsString(chatCompletion));

(Note that params is identical in both approaches.)

The resulting JSON is different: all the Azure fields are gone when you use approach 2 and parse the response into a ChatCompletion through the SDK. Not sure if this provides a hint as to what else could be causing this.

@TomerAberbach
Collaborator

Yeah, unfortunately this doesn't really clarify what the issue is for me, since the roundtripping tests should be doing the same thing as approach 2.

I probably need a simpler repro. I am also curious whether you are depending on Jackson directly, and if so, which version?

@jpalvarezl
Author

Hi @TomerAberbach, I have a "simpler" repro for this. But first, to answer your question about our Jackson version: you can see the full pom.xml file I am working with here, but the TL;DR is that I believe we are getting Jackson transitively from the SDK. The dependencies section looks like this:

 <dependencies>
    <!-- compile scope -->
    <dependency>
      <groupId>com.openai</groupId>
      <artifactId>openai-java</artifactId>
      <version>0.43.0</version> <!-- {x-version-update;com.openai:openai-java;external_dependency} -->
    </dependency>

    <!-- provided scope -->

    <dependency>
      <groupId>com.azure</groupId>
      <artifactId>azure-identity</artifactId>
      <version>1.15.1</version> <!-- {x-version-update;com.azure:azure-identity;dependency} -->
      <exclusions>
        <exclusion>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-annotations</artifactId>
        </exclusion>
        <exclusion>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-core</artifactId>
        </exclusion>
        <exclusion>
          <groupId>com.fasterxml.jackson.core</groupId>
          <artifactId>jackson-databind</artifactId>
        </exclusion>
        <exclusion>
          <groupId>com.fasterxml.jackson.datatype</groupId>
          <artifactId>jackson-datatype-jsr310</artifactId>
        </exclusion>
      </exclusions>
      <scope>test</scope>
    </dependency>

...
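
For what it's worth, a quick way to confirm which Jackson version actually ends up on the classpath is mvn dependency:tree -Dincludes=com.fasterxml.jackson.core, or a small runtime probe like this (sketch only):

import com.fasterxml.jackson.core.json.PackageVersion;

final class JacksonVersionProbe {
    public static void main(String[] args) {
        // Prints the jackson-core and jackson-databind versions resolved on the classpath.
        System.out.println("jackson-core: " + PackageVersion.VERSION);
        System.out.println("jackson-databind: " + com.fasterxml.jackson.databind.cfg.PackageVersion.VERSION);
    }
}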

@jpalvarezl
Author

jpalvarezl commented Apr 9, 2025

Regarding the repro, I did the following:

  • I created a simple Python proxy (openai_proxy.py). I will paste it at the end of this message because it's a lot of text (also, this is LLM-generated code, so I'm not 100% vouching for the implementation).
  • You can run it locally by:
chmod +x openai_proxy.py
./openai_proxy.py
  • Then, in the same folder as the script, create a response.json file with the contents I will share below.
  • Then from a Java project, initialize your OpenAIClient with a proxy. I am doing the following:
// client initialization
private OpenAIClient createClient() {
        OpenAIOkHttpClient.Builder clientBuilder = OpenAIOkHttpClient.builder();
        clientBuilder.proxy(new Proxy(
                Proxy.Type.HTTP, new InetSocketAddress("127.0.0.1", 8080)
        ));
// maybe this is optional. Sorry if this doesn't work "as is"; I edited the code in the issue to keep it brief.
//       .setAzureServiceApiVersion(clientBuilder, apiVersion)
//            .credential(AzureApiKeyCredential.create(System.getenv("AZURE_OPENAI_KEY")))
//            .baseUrl(getEndpoint());

        return clientBuilder.build();
    }
  • Add a unit test that looks like this:
@Test
public void testChatCompletionByod() {
    ChatCompletionCreateParams params = createParamsBuilder("gpt-4o-mini")
            .messages(asList(
                    createSystemMessageParam(),
                    createUserMessageParam("What do most contributions require you to do?")))
            .additionalBodyProperties(createExtraBodyForByod())
            .build();

    OpenAIClient proxyClient = createClient();

    ChatCompletion completion = proxyClient.chat().completions().create(params);
    // If you set a breakpoint above, you can inspect it. I don't see any additional properties under `message` 
    // ... assertions
}

@jpalvarezl
Author

jpalvarezl commented Apr 9, 2025

Here is the code for the openai_proxy.py:

#!/usr/bin/env python3
"""
Simple proxy server that returns a static JSON response.
Usage:
    python openai_proxy.py [port]  # Start server (default port: 8080)
    Ctrl+C                         # Stop server
"""
from http.server import HTTPServer, BaseHTTPRequestHandler
import os
import sys
import socket
import select

class ProxyHandler(BaseHTTPRequestHandler):
    def do_GET(self):
        file_path = "response.json"
        if os.path.exists(file_path):
            with open(file_path, "r", encoding="utf-8") as file:
                content = file.read()
            self.send_response(200)
            self.send_header('Content-type', 'application/json')
            self.end_headers()
            self.wfile.write(content.encode())
        else:
            self.send_response(404)
            self.send_header('Content-type', 'text/plain')
            self.end_headers()
            self.wfile.write(b"File 'response.json' not found.")

    def do_POST(self):
        content_length = int(self.headers['Content-Length'])
        post_body = self.rfile.read(content_length).decode('utf-8')
        print("Request body:", post_body)

        file_path = "response.json"
        if os.path.exists(file_path):
            with open(file_path, "r", encoding="utf-8") as file:
                content = file.read()
            self.send_response(200)
            self.send_header('Content-type', 'application/json')
            self.end_headers()
            self.wfile.write(content.encode())
        else:
            self.send_response(404)
            self.send_header('Content-type', 'text/plain')
            self.end_headers()
            self.wfile.write(b"File 'response.json' not found.")

    def do_CONNECT(self):
        try:
            host, port = self.path.split(":")
            port = int(port)

            with socket.create_connection((host, port)) as remote_socket:
                self.send_response(200, "Connection Established")
                self.end_headers()

                # Tunnel data between client and remote server
                self.connection.setblocking(False)
                remote_socket.setblocking(False)

                while True:
                    readable, _, _ = select.select([self.connection, remote_socket], [], [], 0.1)

                    if self.connection in readable:
                        data = self.connection.recv(4096)
                        if not data:
                            break
                        remote_socket.sendall(data)

                    if remote_socket in readable:
                        data = remote_socket.recv(4096)
                        if not data:
                            break
                        self.connection.sendall(data)
        except Exception as e:
            self.send_error(500, str(e))

if __name__ == '__main__':
    port = int(sys.argv[1]) if len(sys.argv) > 1 else 8080
    server = HTTPServer(('localhost', port), ProxyHandler)
    print(f"Proxy server running on http://localhost:{port}")
    try:
        server.serve_forever()
    except KeyboardInterrupt:
        print("\nShutting down server...")
        server.server_close()

And here are the contents of response.json:

{
  "id": "foo-bar",
  "model": "gpt-4o-mini",
  "created": 1744190318,
  "object": "extensions.chat.completion",
  "choices": [
    {
      "index": 0,
      "finish_reason": "stop",
      "message": {
        "role": "assistant",
        "content": "Most contributions require you to agree to a Contributor License Agreement (CLA), which declares that you have the right to, and actually do, grant the rights to use your contribution [doc1].",
        "end_turn": true,
        "context": {
          "citations": [
            {
              "content": "foo-bar",
              "title": "foo-bar",
              "url": "foo-bar",
              "filepath": "foo-bar",
              "chunk_id": "0"
            },
            {
              "content": "foo-bar",
              "title": "foo-bar",
              "url": "foo-bar",
              "filepath": "foo-bar",
              "chunk_id": "2"
            },
            {
              "content": "foo-bar",
              "title": "foo-bar",
              "url": "foo-bar",
              "filepath": "foo-bar",
              "chunk_id": "0"
            },
            {
              "content": "foo-bar",
              "title": "foo-bar",
              "url": "foo-bar",
              "filepath": "foo-bar",
              "chunk_id": "0"
            },
            {
              "content": "foo-bar",
              "title": "foo-bar",
              "url": "foo-bar",
              "filepath": "foo-bar",
              "chunk_id": "0"
            }
          ],
          "intent": "[\"What do most contributions require?\", \"Requirements for contributions\", \"What is needed for contributions?\"]"
        }
      }
    }
  ],
  "usage": {
    "prompt_tokens": 5436,
    "completion_tokens": 60,
    "total_tokens": 5496
  },
  "system_fingerprint": "foo"
}

@TomerAberbach
Collaborator

I couldn't get the proxy working properly, but I simply hardcoded your response in OkHttpClient, and now I understand what happened.

In our test suite we test with Jackson 2.13.4 to ensure we're compatible with old versions, but we publish with Jackson 2.18.1, so that's what's used by default. It turns out that Jackson 2.18.1 specifically has a bug that affects us and causes this issue: FasterXML/jackson-databind#4639
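
For context, additional properties rely on Jackson's any-setter/any-getter mechanism: unknown JSON fields are captured into a map on deserialization and written back on serialization. A minimal, non-SDK sketch of that roundtrip looks like this (the SDK's generated models use a more elaborate builder-based variant of the same pattern):

import com.fasterxml.jackson.annotation.JsonAnyGetter;
import com.fasterxml.jackson.annotation.JsonAnySetter;
import com.fasterxml.jackson.annotation.JsonProperty;
import com.fasterxml.jackson.databind.ObjectMapper;

import java.util.LinkedHashMap;
import java.util.Map;

public class AnySetterRoundtrip {
    public static class Message {
        @JsonProperty("role")
        public String role;

        private final Map<String, Object> additionalProperties = new LinkedHashMap<>();

        @JsonAnySetter
        public void putAdditionalProperty(String key, Object value) {
            additionalProperties.put(key, value);
        }

        @JsonAnyGetter
        public Map<String, Object> additionalProperties() {
            return additionalProperties;
        }
    }

    public static void main(String[] args) throws Exception {
        ObjectMapper mapper = new ObjectMapper();
        String json = "{\"role\":\"assistant\",\"end_turn\":true,\"context\":{\"intent\":\"...\"}}";
        Message message = mapper.readValue(json, Message.class);
        // On an unaffected Jackson version the unknown fields survive the roundtrip.
        System.out.println(message.additionalProperties());
        System.out.println(mapper.writeValueAsString(message));
    }
}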

I'm going to do two things:

  1. Upgrade our published Jackson dependency version to 2.18.2, which I've confirmed does not have the issue
  2. Disallow Jackson 2.18.1 here:
    internal fun checkJacksonVersionCompatibility() {

Thanks for reporting!

@TomerAberbach
Collaborator

This should be fixed in v1.1.1 (getting published now)
