Support images directly in UserMessage #387

Merged
jackmpcollins merged 48 commits into main from allow-images-in-user-message on Jan 6, 2025

jackmpcollins (Owner) commented on Dec 3, 2024

  • Enable UserMessage to contain both text and image parts. Introduce the new types ImageBytes and ImageUrl to identify the image parts.
  • Deprecate UserImageMessage.
  • Update the docs to use UserMessage instead of UserImageMessage.

Possible breaking changes

  • Under strict type checking, UserMessage type hints need to be changed to UserMessage[Any].
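
To illustrate the typing change, here is a minimal sketch, assuming UserMessage became generic over its content type in this PR. The `Message` class below is a hypothetical stand-in, not magentic's implementation. It only shows why a bare annotation can be flagged by a strict type checker while the parameterised form passes.

```python
from typing import Any, Generic, TypeVar

ContentT = TypeVar("ContentT")


class Message(Generic[ContentT]):
    """Hypothetical stand-in for a message class that is generic over its content."""

    def __init__(self, content: ContentT) -> None:
        self.content = content


# A bare `Message` annotation leaves the type parameter implicit, which strict
# checkers (for example mypy with --disallow-any-generics) report as an error.
# Spelling it out as `Message[Any]` keeps the old "accept anything" meaning.
def content_length(message: Message[Any]) -> int:
    return len(message.content)


text_message = Message("Describe the following image in one sentence.")
parts_message = Message(["Describe the image.", b"\x89PNG"])
print(content_length(text_message), content_length(parts_message))
```

At runtime both annotations behave the same; the difference only shows up when a type checker runs in strict mode.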

Example

from magentic import chatprompt, ImageUrl, Placeholder, UserMessage


IMAGE_URL_WOODEN_BOARDWALK = "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg"


@chatprompt(
    UserMessage(
        [
            "Describe the following image in one sentence.",
            Placeholder(ImageUrl, "image_url"),
        ]
    ),
)
def describe_image(image_url: str) -> str: ...


describe_image(IMAGE_URL_WOODEN_BOARDWALK)
# 'A wooden boardwalk meanders through lush green wetlands under a partly cloudy blue sky.'

This more closely aligns magentic with the provider APIs, which accept a list of text and image parts as the message content:

openai
https://platform.openai.com/docs/guides/vision?lang=python

from openai import OpenAI

client = OpenAI()

response = client.chat.completions.create(
  model="gpt-4o-mini",
  messages=[
    {
      "role": "user",
      "content": [
        {"type": "text", "text": "What’s in this image?"},
        {
          "type": "image_url",
          "image_url": {
            "url": "https://upload.wikimedia.org/wikipedia/commons/thumb/d/dd/Gfp-wisconsin-madison-the-nature-boardwalk.jpg/2560px-Gfp-wisconsin-madison-the-nature-boardwalk.jpg",
          },
        },
      ],
    }
  ],
  max_tokens=300,
)

print(response.choices[0])

anthropic
https://docs.anthropic.com/en/docs/build-with-claude/vision#about-the-prompt-examples

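import base64
from pathlib import Path

import anthropic

# Assumption: the original snippet leaves image1_media_type and image1_data
# undefined; "image1.jpg" is a hypothetical local JPEG used to fill them in.
image1_media_type = "image/jpeg"
image1_data = base64.standard_b64encode(Path("image1.jpg").read_bytes()).decode("utf-8")

client = anthropic.Anthropic()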
message = client.messages.create(
    model="claude-3-5-sonnet-20241022",
    max_tokens=1024,
    messages=[
        {
            "role": "user",
            "content": [
                {
                    "type": "image",
                    "source": {
                        "type": "base64",
                        "media_type": image1_media_type,
                        "data": image1_data,
                    },
                },
                {
                    "type": "text",
                    "text": "Describe this image."
                }
            ],
        }
    ],
)
print(message)
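
The two provider payloads above share one idea: message content is a list of typed parts. As a rough sketch of how a mixed list of text and image parts could be mapped onto each shape, consider the following. The `UrlImage` and `BytesImage` dataclasses and both converter functions are hypothetical names chosen for illustration; they are not magentic's ImageUrl or ImageBytes, and this is not how magentic performs the conversion.

```python
import base64
from dataclasses import dataclass
from typing import Any, Union


@dataclass
class UrlImage:
    """Hypothetical stand-in for an image referenced by URL."""

    url: str


@dataclass
class BytesImage:
    """Hypothetical stand-in for raw image bytes plus their media type."""

    data: bytes
    media_type: str


Part = Union[str, UrlImage, BytesImage]


def to_openai_content(parts: list[Part]) -> list[dict[str, Any]]:
    """Build an OpenAI-style `content` list from text and image parts."""
    blocks: list[dict[str, Any]] = []
    for part in parts:
        if isinstance(part, str):
            blocks.append({"type": "text", "text": part})
        elif isinstance(part, UrlImage):
            blocks.append({"type": "image_url", "image_url": {"url": part.url}})
        else:
            # Raw bytes are sent inline as a base64 data URL.
            encoded = base64.b64encode(part.data).decode("utf-8")
            data_url = f"data:{part.media_type};base64,{encoded}"
            blocks.append({"type": "image_url", "image_url": {"url": data_url}})
    return blocks


def to_anthropic_content(parts: list[Part]) -> list[dict[str, Any]]:
    """Build an Anthropic-style `content` list; this sketch handles base64 images only."""
    blocks: list[dict[str, Any]] = []
    for part in parts:
        if isinstance(part, str):
            blocks.append({"type": "text", "text": part})
        elif isinstance(part, BytesImage):
            encoded = base64.b64encode(part.data).decode("utf-8")
            source = {"type": "base64", "media_type": part.media_type, "data": encoded}
            blocks.append({"type": "image", "source": source})
        else:
            raise TypeError("this sketch only sends base64 images to Anthropic")
    return blocks


parts: list[Part] = ["Describe this image.", BytesImage(b"abc", "image/png")]
print(to_openai_content(parts))
print(to_anthropic_content(parts))
```

Keeping the parts provider-neutral and converting at the boundary is the design choice the sketch is meant to show: the same list of parts feeds both payload shapes.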

@jackmpcollins jackmpcollins self-assigned this Dec 3, 2024
@jackmpcollins jackmpcollins changed the title from "Support for multi-part UserMessage" to "Support images directly in UserMessage" Jan 6, 2025
@jackmpcollins jackmpcollins marked this pull request as ready for review January 6, 2025 02:19
@jackmpcollins jackmpcollins merged commit 0cb9e7b into main Jan 6, 2025
1 check passed
@jackmpcollins jackmpcollins deleted the allow-images-in-user-message branch January 6, 2025 02:19