From 2c09198ec8b117946ad7c38e31e35765ca35e028 Mon Sep 17 00:00:00 2001 From: Guillaume Vernade Date: Thu, 11 Jul 2024 20:45:44 +0200 Subject: [PATCH 01/16] Aligning the README with the Python's one (#180) --- README.md | 158 ++++++++++++++++++++++++++++++------------------------ 1 file changed, 88 insertions(+), 70 deletions(-) diff --git a/README.md b/README.md index 7de87dd..2704c01 100644 --- a/README.md +++ b/README.md @@ -1,109 +1,127 @@ +# Google AI SDK for Swift + [![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fgoogle-gemini%2Fgenerative-ai-swift%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/google-gemini/generative-ai-swift) [![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fgoogle-gemini%2Fgenerative-ai-swift%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/google-gemini/generative-ai-swift) -# Google AI SDK for Swift - -> [!CAUTION] -> **The Google AI SDK for Swift is recommended for prototyping only.** If you plan to enable -> billing, we strongly recommend that you use a backend SDK to access the Google AI Gemini API. You -> risk potentially exposing your API key to malicious actors if you embed your API key directly in -> your Swift app or fetch it remotely at runtime. +The Google AI Swift SDK is the easiest way for Swift developers to build with +the Gemini API. The Gemini API gives you access to Gemini +[models](https://ai.google.dev/models/gemini) created by +[Google DeepMind](https://deepmind.google/technologies/gemini/#introduction). +Gemini models are built from the ground up to be multimodal, so you can reason +seamlessly across text, images, and code. + +> [!CAUTION] **The Google AI SDK for Swift is recommended for prototyping +> only.** If you plan to enable billing, we strongly recommend that you use a +> backend SDK to access the Google AI Gemini API. You risk potentially exposing +> your API key to malicious actors if you embed your API key directly in your +> Swift app or fetch it remotely at runtime. + +## Get started with the Gemini API + +1. Go to [Google AI Studio](https://aistudio.google.com/). +2. Login with your Google account. +3. [Create an API key](https://aistudio.google.com/app/apikey). Note that in + Europe the free tier is not available. +4. Check out this repository. \ + `git clone https://github.com/google/generative-ai-swift` +5. Open and build the sample app in the `Examples` folder of this repo. +6. Run the app once to ensure the build script generates an empty + `GenerativeAI-Info.plist` file +7. Paste your API key into the `API_KEY` property in the + `GenerativeAI-Info.plist` file. +8. Run the app +9. For detailed instructions, try the + [Swift SDK tutorial](https://ai.google.dev/tutorials/swift_quickstart) on + [ai.google.dev](https://ai.google.dev). + +## Usage example + +1. Add [`generative-ai-swift`](https://github.com/google/generative-ai-swift) + to your Xcode project using Swift Package Manager. + +2. Import the `GoogleGenerativeAI` module -The Google AI SDK for Swift enables developers to use Google's state-of-the-art generative AI models -(like Gemini) to build AI-powered features and applications. 
This SDK supports use cases like: -- Generate text from text-only input -- Generate text from text-and-images input (multimodal) -- Build multi-turn conversations (chat) +```swift +import GoogleGenerativeAI +``` -For example, with just a few lines of code, you can access Gemini's multimodal capabilities to -generate text from text-and-image input: +1. Initialize the model ```swift let model = GenerativeModel(name: "gemini-1.5-flash-latest", apiKey: "YOUR_API_KEY") +``` + +1. Run a prompt + +```swift let cookieImage = UIImage(...) let prompt = "Do these look store-bought or homemade?" let response = try await model.generateContent(prompt, cookieImage) ``` -## Try out the sample Swift app - -This repository contains a sample app demonstrating how the SDK can access and utilize the Gemini -model for various use cases. - -To try out the sample app, follow these steps: - -1. Check out this repository.\ -`git clone https://github.com/google/generative-ai-swift` - -1. [Obtain an API key](https://makersuite.google.com/app/apikey) to use with the Google AI SDKs. - -1. Open and build the sample app in the `Examples` folder of this repo. - -1. Run the app once to ensure the build script generates an empty `GenerativeAI-Info.plist` file - -1. Paste your API key into the `API_KEY` property in the `GenerativeAI-Info.plist` file. - -1. Run the app. - -## Use the SDK in your app - -Add [`generative-ai-swift`](https://github.com/google/generative-ai-swift) to your Xcode project -using Swift Package Manager. - For detailed instructions, you can find a -[quickstart](https://ai.google.dev/tutorials/swift_quickstart) for the Google AI SDK for Swift in the -Google documentation. +[quickstart](https://ai.google.dev/tutorials/swift_quickstart) for the Google AI +SDK for Swift in the Google documentation. -This quickstart describes how to add your API key and the Swift package to your app, initialize the -model, and then call the API to access the model. It also describes some additional use cases and -features, like streaming, counting tokens, and controlling responses. +This quickstart describes how to add your API key and the Swift package to your +app, initialize the model, and then call the API to access the model. It also +describes some additional use cases and features, like streaming, counting +tokens, and controlling responses. ## Logging -To enable additional logging in the Xcode console, including a cURL command and raw stream -response for each model request, add `-GoogleGenerativeAIDebugLogEnabled` as -`Arguments Passed On Launch` in the Xcode scheme. +To enable additional logging in the Xcode console, including a cURL command and +raw stream response for each model request, add +`-GoogleGenerativeAIDebugLogEnabled` as `Arguments Passed On Launch` in the +Xcode scheme. ## Command Line Tool -A command line tool is available to experiment with Gemini model requests via Xcode or the command -line: +A command line tool is available to experiment with Gemini model requests via +Xcode or the command line: -1. `open Examples/GenerativeAICLI/Package.swift` -1. Run in Xcode and examine the console to see the options. -1. Edit the scheme's `Arguments Passed On Launch` with the desired options. +1. `open Examples/GenerativeAICLI/Package.swift` +1. Run in Xcode and examine the console to see the options. +1. Edit the scheme's `Arguments Passed On Launch` with the desired options. 
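As a rough sketch of the streaming support mentioned above, the same model can
also return partial results as they are generated (model name and API key
handling as in the usage example earlier):

```swift
import GoogleGenerativeAI

let model = GenerativeModel(name: "gemini-1.5-flash-latest", apiKey: "YOUR_API_KEY")

// Print each partial chunk of text as it arrives instead of waiting for the
// full response.
for try await chunk in model.generateContentStream("Write a story about a magic backpack.") {
  if let text = chunk.text {
    print(text)
  }
}
```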
## Documentation -Find complete documentation for the Google AI SDKs and the Gemini model in the Google -documentation: https://ai.google.dev/docs +See the +[Gemini API Cookbook](https://github.com/google-gemini/gemini-api-cookbook/) or +[ai.google.dev](https://ai.google.dev) for complete documentation. ## Contributing -See [Contributing](https://github.com/google/generative-ai-swift/blob/main/docs/CONTRIBUTING.md) -for more information on -contributing to the Google AI SDK for Swift. - +See +[Contributing](https://github.com/google/generative-ai-swift/blob/main/docs/CONTRIBUTING.md) +for more information on contributing to the Google AI SDK for Swift. ## Developers who use the PaLM SDK for Swift (Deprecated) -> [!IMPORTANT] -> The PaLM API is deprecated for use with Google AI services and tools (but _not_ for Vertex AI). -> Learn more about this deprecation, its timeline, and how to migrate to use Gemini in the +> [!IMPORTANT] The PaLM API is deprecated for use with Google AI services and +> tools (but *not* for Vertex AI). Learn more about this deprecation, its +> timeline, and how to migrate to use Gemini in the > [PaLM API deprecation guide](http://ai.google.dev/palm_docs/deprecation). -​​If you're using the PaLM SDK for Swift, review the information below to continue using the -**deprecated** PaLM SDK until you've migrated to the new version that allows you to use Gemini. +​​If you're using the PaLM SDK for Swift, review the information below to +continue using the **deprecated** PaLM SDK until you've migrated to the new +version that allows you to use Gemini. -- To continue using PaLM models, make sure your app depends on version -[`0.3.0`](https://github.com/google/generative-ai-swift/releases/tag/0.3.0) -_up to_ the next minor version -([`0.4.0`](https://github.com/google/generative-ai-swift/releases/tag/0.4.0)) -of `generative-ai-swift`. +- To continue using PaLM models, make sure your app depends on version + [`0.3.0`](https://github.com/google/generative-ai-swift/releases/tag/0.3.0) + *up to* the next minor version + ([`0.4.0`](https://github.com/google/generative-ai-swift/releases/tag/0.4.0)) + of `generative-ai-swift`. -- When you're ready to use Gemini models, migrate your code to the Gemini API and update your app's -`generative-ai-swift` dependency to version `0.4.0` or higher. +- When you're ready to use Gemini models, migrate your code to the Gemini API + and update your app's `generative-ai-swift` dependency to version `0.4.0` or + higher. To see the PaLM documentation and code, go to the [`palm` branch](https://github.com/google/generative-ai-swift/tree/palm). + +## License + +The contents of this repository are licensed under the +[Apache License, version 2.0](http://www.apache.org/licenses/LICENSE-2.0). 
\ No newline at end of file From b72e9ba6293823a5f50fd1898b4193ffdfe635cb Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Thu, 11 Jul 2024 20:03:50 -0400 Subject: [PATCH 02/16] Add code snippets for text generation (#181) --- Package.swift | 5 ++ samples/APIKey.swift | 62 +++++++++++++ samples/TextGeneration.swift | 166 +++++++++++++++++++++++++++++++++++ 3 files changed, 233 insertions(+) create mode 100644 samples/APIKey.swift create mode 100644 samples/TextGeneration.swift diff --git a/Package.swift b/Package.swift index 8402e52..2a4cb43 100644 --- a/Package.swift +++ b/Package.swift @@ -44,5 +44,10 @@ let package = Package( .process("GoogleAITests/GenerateContentResponses"), ] ), + .testTarget( + name: "CodeSnippetTests", + dependencies: ["GoogleGenerativeAI"], + path: "samples" + ), ] ) diff --git a/samples/APIKey.swift b/samples/APIKey.swift new file mode 100644 index 0000000..3368712 --- /dev/null +++ b/samples/APIKey.swift @@ -0,0 +1,62 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import Foundation +import XCTest + +/// A private wrapper for `APIKey`, hiding it from test files. +private enum APIKeyCodeSnippet { + // The implementation of `APIKey` for use in documentation code snippets; shown in + // https://ai.google.dev/gemini-api/docs/quickstart?lang=swift + // [START setup_api_key] + enum APIKey { + // Fetch the API key from `GenerativeAI-Info.plist` + static var `default`: String { + guard let filePath = Bundle.main.path(forResource: "GenerativeAI-Info", ofType: "plist") + else { + fatalError("Couldn't find file 'GenerativeAI-Info.plist'.") + } + let plist = NSDictionary(contentsOfFile: filePath) + guard let value = plist?.object(forKey: "API_KEY") as? String else { + fatalError("Couldn't find key 'API_KEY' in 'GenerativeAI-Info.plist'.") + } + if value.starts(with: "_") { + fatalError( + "Follow the instructions at https://ai.google.dev/tutorials/setup to get an API key." + ) + } + return value + } + } + // [END setup_api_key] +} + +/// Protocol to ensure that the `APIKey` APIs do not diverge. +protocol APIKeyProtocol { + static var `default`: String { get } +} + +extension APIKeyCodeSnippet.APIKey: APIKeyProtocol {} + +/// An implementation of `APIKey` for use in code snippet tests only. +enum APIKey: APIKeyProtocol { + static let apiKeyEnvVar = "API_KEY" + + static var `default`: String { + guard let apiKey = ProcessInfo.processInfo.environment[apiKeyEnvVar] else { + return "" + } + return apiKey + } +} diff --git a/samples/TextGeneration.swift b/samples/TextGeneration.swift new file mode 100644 index 0000000..6849fdd --- /dev/null +++ b/samples/TextGeneration.swift @@ -0,0 +1,166 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +#if canImport(UIKit) + @available(iOS 15.0, macCatalyst 15.0, *) + final class TextGeneration: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testTextOnlyPrompt() async throws { + // [START text_gen_text_only_prompt] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + let prompt = "Write a story about a magic backpack." + let response = try await generativeModel.generateContent(prompt) + if let text = response.text { + print(text) + } + // [END text_gen_text_only_prompt] + } + + func testTextOnlyPromptStreaming() async throws { + // [START text_gen_text_only_prompt_streaming] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + let prompt = "Write a story about a magic backpack." + // Use streaming with text-only input + for try await response in generativeModel.generateContentStream(prompt) { + if let text = response.text { + print(text) + } + } + // [END text_gen_text_only_prompt_streaming] + } + + func testMultimodalOneImagePrompt() async throws { + // [START text_gen_multimodal_one_image_prompt] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + guard let image = UIImage(systemName: "cloud.sun") else { fatalError() } + + let prompt = "What's in this picture?" + + let response = try await generativeModel.generateContent(image, prompt) + if let text = response.text { + print(text) + } + // [END text_gen_multimodal_one_image_prompt] + } + + func testMultimodalOneImagePromptStreaming() async throws { + // [START text_gen_multimodal_one_image_prompt_streaming] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + guard let image = UIImage(systemName: "cloud.sun") else { fatalError() } + + let prompt = "What's in this picture?" 
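      // Stream the response to the image-and-text prompt; each chunk may carry a partial
      // piece of the generated text.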
+ + for try await response in generativeModel.generateContentStream(image, prompt) { + if let text = response.text { + print(text) + } + } + // [END text_gen_multimodal_one_image_prompt_streaming] + } + + func testMultimodalMultiImagePrompt() async throws { + // [START text_gen_multimodal_multi_image_prompt] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + guard let image1 = UIImage(systemName: "cloud.sun") else { fatalError() } + guard let image2 = UIImage(systemName: "cloud.heavyrain") else { fatalError() } + + let prompt = "What's the difference between these pictures?" + + let response = try await generativeModel.generateContent(image1, image2, prompt) + if let text = response.text { + print(text) + } + // [END text_gen_multimodal_multi_image_prompt] + } + + func testMultimodalMultiImagePromptStreaming() async throws { + // [START text_gen_multimodal_multi_image_prompt_streaming] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + guard let image1 = UIImage(systemName: "cloud.sun") else { fatalError() } + guard let image2 = UIImage(systemName: "cloud.heavyrain") else { fatalError() } + + let prompt = "What's the difference between these pictures?" + + for try await response in generativeModel.generateContentStream(image1, image2, prompt) { + if let text = response.text { + print(text) + } + } + // [END text_gen_multimodal_multi_image_prompt_streaming] + } + } +#endif // canImport(UIKit) From 8ae5205f1f5f85a7dd4b99a68291d3924043f04a Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Fri, 12 Jul 2024 16:11:32 -0400 Subject: [PATCH 03/16] Add documentation code snippets for counting tokens (#185) --- samples/CountTokens.swift | 101 ++++++++++++++++++++++++++++++++++++++ 1 file changed, 101 insertions(+) create mode 100644 samples/CountTokens.swift diff --git a/samples/CountTokens.swift b/samples/CountTokens.swift new file mode 100644 index 0000000..2bb5a49 --- /dev/null +++ b/samples/CountTokens.swift @@ -0,0 +1,101 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class CountTokensSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." 
+ ) + } + + func testCountTokensTextOnly() async throws { + // [START tokens_text_only] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + let prompt = "Write a story about a magic backpack." + + let response = try await generativeModel.countTokens(prompt) + + print("Total Tokens: \(response.totalTokens)") + // [END tokens_text_only] + } + + func testCountTokensChat() async throws { + // [START tokens_chat] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + // Optionally specify existing chat history + let history = [ + ModelContent(role: "user", parts: "Hello, I have 2 dogs in my house."), + ModelContent(role: "model", parts: "Great to meet you. What would you like to know?"), + ] + + // Initialize the chat with optional chat history + let chat = generativeModel.startChat(history: history) + + let response = try await generativeModel.countTokens(chat.history + [ + ModelContent(role: "user", parts: "This is the message I intend to send"), + ]) + print("Total Tokens: \(response.totalTokens)") + // [END tokens_chat] + } + + #if canImport(UIKit) + func testCountTokensMultimodalInline() async throws { + // [START tokens_multimodal_image_inline] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + guard let image1 = UIImage(systemName: "cloud.sun") else { fatalError() } + guard let image2 = UIImage(systemName: "cloud.heavyrain") else { fatalError() } + + let prompt = "What's the difference between these pictures?" + + let response = try await generativeModel.countTokens(image1, image2, prompt) + print("Total Tokens: \(response.totalTokens)") + // [END tokens_multimodal_image_inline] + } + #endif // canImport(UIKit) +} From 5ec651cbde9e50860b94fb237c52341bdd05fd8a Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Fri, 12 Jul 2024 16:12:07 -0400 Subject: [PATCH 04/16] Add documentation code snippets for Chat (#184) --- samples/ChatSnippets.swift | 125 +++++++++++++++++++++++++++++++++++++ 1 file changed, 125 insertions(+) create mode 100644 samples/ChatSnippets.swift diff --git a/samples/ChatSnippets.swift b/samples/ChatSnippets.swift new file mode 100644 index 0000000..10690bb --- /dev/null +++ b/samples/ChatSnippets.swift @@ -0,0 +1,125 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. 
+ +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class ChatSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testChat() async throws { + // [START chat] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + // Optionally specify existing chat history + let history = [ + ModelContent(role: "user", parts: "Hello, I have 2 dogs in my house."), + ModelContent(role: "model", parts: "Great to meet you. What would you like to know?"), + ] + + // Initialize the chat with optional chat history + let chat = generativeModel.startChat(history: history) + + // To generate text output, call sendMessage and pass in the message + let response = try await chat.sendMessage("How many paws are in my house?") + if let text = response.text { + print(text) + } + // [END chat] + } + + func testChatStreaming() async throws { + // [START chat_streaming] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + // Optionally specify existing chat history + let history = [ + ModelContent(role: "user", parts: "Hello, I have 2 dogs in my house."), + ModelContent(role: "model", parts: "Great to meet you. 
What would you like to know?"), + ] + + // Initialize the chat with optional chat history + let chat = generativeModel.startChat(history: history) + + // To stream generated text output, call sendMessageStream and pass in the message + let contentStream = chat.sendMessageStream("How many paws are in my house?") + for try await chunk in contentStream { + if let text = chunk.text { + print(text) + } + } + // [END chat_streaming] + } + + #if canImport(UIKit) + func testChatStreamingWithImages() async throws { + // [START chat_streaming_with_images] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default + ) + + // Optionally specify existing chat history + let history = [ + ModelContent(role: "user", parts: "I'm trying to remember a fable about two animals."), + ModelContent(role: "model", parts: "Do you remember what kind of animals were they?"), + ] + + guard let image1 = UIImage(systemName: "tortoise") else { fatalError() } + guard let image2 = UIImage(systemName: "hare") else { fatalError() } + + // Initialize the chat with optional chat history + let chat = generativeModel.startChat(history: history) + + // To stream generated text output, call sendMessageStream and pass in the message + let contentStream = chat.sendMessageStream("The animals from these pictures.", image1, image2) + for try await chunk in contentStream { + if let text = chunk.text { + print(text) + } + } + // [END chat_streaming_with_images] + } + #endif // canImport(UIKit) +} From bad1c2ba8b41e0110874592584bc6aa0d0557674 Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Fri, 12 Jul 2024 17:30:38 -0400 Subject: [PATCH 05/16] Add documentation code snippets for model config (#186) --- samples/GenerationConfig.swift | 58 +++++++++++++++++++++++++ samples/SafetySettings.swift | 72 ++++++++++++++++++++++++++++++++ samples/SystemInstructions.swift | 49 ++++++++++++++++++++++ 3 files changed, 179 insertions(+) create mode 100644 samples/GenerationConfig.swift create mode 100644 samples/SafetySettings.swift create mode 100644 samples/SystemInstructions.swift diff --git a/samples/GenerationConfig.swift b/samples/GenerationConfig.swift new file mode 100644 index 0000000..cdc4fc3 --- /dev/null +++ b/samples/GenerationConfig.swift @@ -0,0 +1,58 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. 
To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class GenerationConfigSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testConfigureModelParameters() { + // [START configure_model_parameters] + let config = GenerationConfig( + temperature: 0.9, + topP: 0.1, + topK: 16, + candidateCount: 1, + maxOutputTokens: 200, + stopSequences: ["red", "orange"] + ) + + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + generationConfig: config + ) + // [END configure_model_parameters] + + // Added to silence the compiler warning about unused variable. + let _ = String(describing: generativeModel) + } +} diff --git a/samples/SafetySettings.swift b/samples/SafetySettings.swift new file mode 100644 index 0000000..a92afeb --- /dev/null +++ b/samples/SafetySettings.swift @@ -0,0 +1,72 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class SafetySettingsSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testSafetySettings() { + // [START safety_settings] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + safetySettings: [SafetySetting(harmCategory: .harassment, threshold: .blockLowAndAbove)] + ) + // [END safety_settings] + + // Added to silence the compiler warning about unused variable. 
+ let _ = String(describing: generativeModel) + } + + func testSafetySettingsMulti() { + // [START safety_settings_multi] + let safetySettings = [ + SafetySetting(harmCategory: .dangerousContent, threshold: .blockLowAndAbove), + SafetySetting(harmCategory: .harassment, threshold: .blockMediumAndAbove), + SafetySetting(harmCategory: .hateSpeech, threshold: .blockOnlyHigh), + ] + + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + safetySettings: safetySettings + ) + // [END safety_settings_multi] + + // Added to silence the compiler warning about unused variable. + let _ = String(describing: generativeModel) + } +} diff --git a/samples/SystemInstructions.swift b/samples/SystemInstructions.swift new file mode 100644 index 0000000..46cdd11 --- /dev/null +++ b/samples/SystemInstructions.swift @@ -0,0 +1,49 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class SystemInstructionsSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testSystemInstruction() { + // [START system_instruction] + let generativeModel = + GenerativeModel( + // Specify a model that supports system instructions, like a Gemini 1.5 model + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + systemInstruction: ModelContent(role: "system", parts: "You are a cat. Your name is Neko.") + ) + // [END system_instruction] + + // Added to silence the compiler warning about unused variable. + let _ = String(describing: generativeModel) + } +} From 74d2f2b973f34068ffcbe47e2b07c5445d24fa67 Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Fri, 12 Jul 2024 19:29:45 -0400 Subject: [PATCH 06/16] Add a code snippet for function calling (#187) --- samples/FunctionCalling.swift | 109 ++++++++++++++++++++++++++++++++++ 1 file changed, 109 insertions(+) create mode 100644 samples/FunctionCalling.swift diff --git a/samples/FunctionCalling.swift b/samples/FunctionCalling.swift new file mode 100644 index 0000000..020d5d4 --- /dev/null +++ b/samples/FunctionCalling.swift @@ -0,0 +1,109 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. 
+// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class FunctionCallingSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testFunctionCalling() async throws { + // [BEGIN function_calling] + // Calls a hypothetical API to control a light bulb and returns the values that were set. + func controlLight(brightness: Double, colorTemperature: String) -> JSONObject { + return ["brightness": .number(brightness), "colorTemperature": .string(colorTemperature)] + } + + let generativeModel = + GenerativeModel( + // Use a model that supports function calling, like a Gemini 1.5 model + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + tools: [Tool(functionDeclarations: [ + FunctionDeclaration( + name: "controlLight", + description: "Set the brightness and color temperature of a room light.", + parameters: [ + "brightness": Schema( + type: .number, + format: "double", + description: "Light level from 0 to 100. Zero is off and 100 is full brightness." + ), + "colorTemperature": Schema( + type: .string, + format: "enum", + description: "Color temperature of the light fixture.", + enumValues: ["daylight", "cool", "warm"] + ), + ], + requiredParameters: ["brightness", "colorTemperature"] + ), + ])] + ) + + let chat = generativeModel.startChat() + + let prompt = "Dim the lights so the room feels cozy and warm." + + // Send the message to the model. + let response1 = try await chat.sendMessage(prompt) + + // Check if the model responded with a function call. + // For simplicity, this sample uses the first function call found. + guard let functionCall = response1.functionCalls.first else { + fatalError("Model did not respond with a function call.") + } + // Print an error if the returned function was not declared + guard functionCall.name == "controlLight" else { + fatalError("Unexpected function called: \(functionCall.name)") + } + // Verify that the names and types of the parameters match the declaration + guard case let .number(brightness) = functionCall.args["brightness"] else { + fatalError("Missing argument: brightness") + } + guard case let .string(colorTemperature) = functionCall.args["colorTemperature"] else { + fatalError("Missing argument: colorTemperature") + } + + // Call the executable function named in the FunctionCall with the arguments specified in the + // FunctionCall and let it call the hypothetical API. + let apiResponse = controlLight(brightness: brightness, colorTemperature: colorTemperature) + + // Send the API response back to the model so it can generate a text response that can be + // displayed to the user. 
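    // Wrapping it in a ModelContent with the role "function" marks it as a function result,
    // giving the model generation context for its next turn rather than new user input.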
+ let response2 = try await chat.sendMessage([ModelContent( + role: "function", + parts: [.functionResponse(FunctionResponse(name: "controlLight", response: apiResponse))] + )]) + + if let text = response2.text { + print(text) + } + // [END function_calling] + } +} From ce5715a83cee78022613e7e24d86fb958f444391 Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Fri, 12 Jul 2024 21:28:31 -0400 Subject: [PATCH 07/16] Add code snippets for controlled generation (#188) --- samples/ControlledGeneration.swift | 89 ++++++++++++++++++++++++++++++ 1 file changed, 89 insertions(+) create mode 100644 samples/ControlledGeneration.swift diff --git a/samples/ControlledGeneration.swift b/samples/ControlledGeneration.swift new file mode 100644 index 0000000..2fdb4d0 --- /dev/null +++ b/samples/ControlledGeneration.swift @@ -0,0 +1,89 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macCatalyst 15.0, *) +final class ControlledGenerationSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testJSONControlledGeneration() async throws { + // [START json_controlled_generation] + let jsonSchema = Schema( + type: .array, + description: "List of recipes", + items: Schema( + type: .object, + properties: [ + "recipeName": Schema(type: .string, description: "Name of the recipe", nullable: false), + ], + requiredProperties: ["recipeName"] + ) + ) + + let generativeModel = GenerativeModel( + // Specify a model that supports controlled generation like Gemini 1.5 Pro + name: "gemini-1.5-pro", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + generationConfig: GenerationConfig( + responseMIMEType: "application/json", + responseSchema: jsonSchema + ) + ) + + let prompt = "List a few popular cookie recipes." 
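    // With the schema above, the response is constrained to a JSON array of objects, each
    // with a required "recipeName" string, for example:
    //   [{"recipeName": "Chocolate Chip Cookies"}, {"recipeName": "Snickerdoodles"}]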
+ let response = try await generativeModel.generateContent(prompt) + if let text = response.text { + print(text) + } + // [END json_controlled_generation] + } + + func testJSONNoSchema() async throws { + // [START json_no_schema] + let generativeModel = GenerativeModel( + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + generationConfig: GenerationConfig(responseMIMEType: "application/json") + ) + + let prompt = """ + List a few popular cookie recipes using this JSON schema: + + Recipe = {'recipeName': string} + Return: Array + """ + let response = try await generativeModel.generateContent(prompt) + if let text = response.text { + print(text) + } + // [END json_no_schema] + } +} From 124584b089009c19a19c9e90de8c49a831a29ddf Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Fri, 12 Jul 2024 21:29:03 -0400 Subject: [PATCH 08/16] Add count tokens snippets for system instructions and tools (#189) --- samples/CountTokens.swift | 57 +++++++++++++++++++++++++++++++++++++++ 1 file changed, 57 insertions(+) diff --git a/samples/CountTokens.swift b/samples/CountTokens.swift index 2bb5a49..7ec3d87 100644 --- a/samples/CountTokens.swift +++ b/samples/CountTokens.swift @@ -98,4 +98,61 @@ final class CountTokensSnippets: XCTestCase { // [END tokens_multimodal_image_inline] } #endif // canImport(UIKit) + + func testCountTokensSystemInstruction() async throws { + // [START tokens_system_instruction] + let generativeModel = + GenerativeModel( + // Specify a model that supports system instructions, like a Gemini 1.5 model + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + systemInstruction: ModelContent(role: "system", parts: "You are a cat. Your name is Neko.") + ) + + let prompt = "What is your name?" + + let response = try await generativeModel.countTokens(prompt) + print("Total Tokens: \(response.totalTokens)") + // [END tokens_system_instruction] + } + + func testCountTokensTools() async throws { + // [START tokens_tools] + let generativeModel = + GenerativeModel( + // Specify a model that supports system instructions, like a Gemini 1.5 model + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see "Set up your API key" + // above) + apiKey: APIKey.default, + tools: [Tool(functionDeclarations: [ + FunctionDeclaration( + name: "controlLight", + description: "Set the brightness and color temperature of a room light.", + parameters: [ + "brightness": Schema( + type: .number, + format: "double", + description: "Light level from 0 to 100. Zero is off and 100 is full brightness." + ), + "colorTemperature": Schema( + type: .string, + format: "enum", + description: "Color temperature of the light fixture.", + enumValues: ["daylight", "cool", "warm"] + ), + ], + requiredParameters: ["brightness", "colorTemperature"] + ), + ])] + ) + + let prompt = "Dim the lights so the room feels cozy and warm." + + let response = try await generativeModel.countTokens(prompt) + print("Total Tokens: \(response.totalTokens)") + // [END tokens_tools] + } } From 7d3e4883be6f431ad57a49ee2c0376f95a74b28c Mon Sep 17 00:00:00 2001 From: Mark Daoust Date: Wed, 17 Jul 2024 15:07:22 -0700 Subject: [PATCH 09/16] Fix region tag. 
(#191) --- samples/FunctionCalling.swift | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/samples/FunctionCalling.swift b/samples/FunctionCalling.swift index 020d5d4..d4fe423 100644 --- a/samples/FunctionCalling.swift +++ b/samples/FunctionCalling.swift @@ -31,7 +31,7 @@ final class FunctionCallingSnippets: XCTestCase { } func testFunctionCalling() async throws { - // [BEGIN function_calling] + // [START function_calling] // Calls a hypothetical API to control a light bulb and returns the values that were set. func controlLight(brightness: Double, colorTemperature: String) -> JSONObject { return ["brightness": .number(brightness), "colorTemperature": .string(colorTemperature)] From f012e913819b103b9e3d911a5355c590860f53ff Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Mon, 29 Jul 2024 12:34:46 -0400 Subject: [PATCH 10/16] Update Xcode version in CLI workflow (#194) --- .github/workflows/cli.yml | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/.github/workflows/cli.yml b/.github/workflows/cli.yml index 857c46c..b7344ce 100644 --- a/.github/workflows/cli.yml +++ b/.github/workflows/cli.yml @@ -18,7 +18,7 @@ jobs: os: [macos-13] include: - os: macos-13 - xcode: Xcode_15.0.1 + xcode: Xcode_15.1 runs-on: ${{ matrix.os }} steps: - uses: actions/checkout@v4 From 672fdb723da8d1af24bb8d3a2e53afbd1e30bf6b Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Wed, 31 Jul 2024 18:01:48 -0400 Subject: [PATCH 11/16] Fix caution callout in README (#197) --- README.md | 11 ++++++----- 1 file changed, 6 insertions(+), 5 deletions(-) diff --git a/README.md b/README.md index 2704c01..89565e6 100644 --- a/README.md +++ b/README.md @@ -10,11 +10,12 @@ the Gemini API. The Gemini API gives you access to Gemini Gemini models are built from the ground up to be multimodal, so you can reason seamlessly across text, images, and code. -> [!CAUTION] **The Google AI SDK for Swift is recommended for prototyping -> only.** If you plan to enable billing, we strongly recommend that you use a -> backend SDK to access the Google AI Gemini API. You risk potentially exposing -> your API key to malicious actors if you embed your API key directly in your -> Swift app or fetch it remotely at runtime. +> [!CAUTION] +> **The Google AI SDK for Swift is recommended for prototyping only.** If you +> plan to enable billing, we strongly recommend that you use a backend SDK to +> access the Google AI Gemini API. You risk potentially exposing your API key to +> malicious actors if you embed your API key directly in your Swift app or fetch +> it remotely at runtime. 
## Get started with the Gemini API From 2bf9fe511480f6ea8bbd792d4b207f85815257bd Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Thu, 1 Aug 2024 10:40:18 -0400 Subject: [PATCH 12/16] Add code execution support (#196) --- .../ViewModels/FunctionCallingViewModel.swift | 2 +- Sources/GoogleAI/Chat.swift | 3 +- Sources/GoogleAI/FunctionCalling.swift | 85 ++++++++- .../GoogleAI/GenerateContentResponse.swift | 21 ++- Sources/GoogleAI/ModelContent.swift | 19 ++ Tests/GoogleAITests/CodeExecutionTests.swift | 154 ++++++++++++++++ .../GenerateContentRequestTests.swift | 20 +- .../GenerateContentResponseTests.swift | 173 ++++++++++++++++++ .../streaming-success-code-execution.txt | 16 ++ .../unary-success-code-execution.json | 54 ++++++ .../GoogleAITests/GenerativeModelTests.swift | 109 ++++++++++- 11 files changed, 644 insertions(+), 12 deletions(-) create mode 100644 Tests/GoogleAITests/CodeExecutionTests.swift create mode 100644 Tests/GoogleAITests/GenerateContentResponseTests.swift create mode 100644 Tests/GoogleAITests/GenerateContentResponses/streaming-success-code-execution.txt create mode 100644 Tests/GoogleAITests/GenerateContentResponses/unary-success-code-execution.json diff --git a/Examples/GenerativeAISample/FunctionCallingSample/ViewModels/FunctionCallingViewModel.swift b/Examples/GenerativeAISample/FunctionCallingSample/ViewModels/FunctionCallingViewModel.swift index 7ebb821..8d65c5f 100644 --- a/Examples/GenerativeAISample/FunctionCallingSample/ViewModels/FunctionCallingViewModel.swift +++ b/Examples/GenerativeAISample/FunctionCallingSample/ViewModels/FunctionCallingViewModel.swift @@ -157,7 +157,7 @@ class FunctionCallingViewModel: ObservableObject { case let .functionCall(functionCall): messages.insert(functionCall.chatMessage(), at: messages.count - 1) functionCalls.append(functionCall) - case .data, .fileData, .functionResponse: + case .data, .fileData, .functionResponse, .executableCode, .codeExecutionResult: fatalError("Unsupported response content.") } } diff --git a/Sources/GoogleAI/Chat.swift b/Sources/GoogleAI/Chat.swift index 6549df4..5f8dddf 100644 --- a/Sources/GoogleAI/Chat.swift +++ b/Sources/GoogleAI/Chat.swift @@ -160,7 +160,8 @@ public class Chat { case let .text(str): combinedText += str - case .data, .fileData, .functionCall, .functionResponse: + case .data, .fileData, .functionCall, .functionResponse, .executableCode, + .codeExecutionResult: // Don't combine it, just add to the content. If there's any text pending, add that as // a part. if !combinedText.isEmpty { diff --git a/Sources/GoogleAI/FunctionCalling.swift b/Sources/GoogleAI/FunctionCalling.swift index 57130eb..159c8b4 100644 --- a/Sources/GoogleAI/FunctionCalling.swift +++ b/Sources/GoogleAI/FunctionCalling.swift @@ -161,6 +161,9 @@ public struct Tool { /// A list of `FunctionDeclarations` available to the model. let functionDeclarations: [FunctionDeclaration]? + /// Enables the model to execute code as part of generation. + let codeExecution: CodeExecution? + /// Constructs a new `Tool`. /// /// - Parameters: @@ -172,8 +175,11 @@ public struct Tool { /// populating ``FunctionCall`` in the response. The next conversation turn may contain a /// ``FunctionResponse`` in ``ModelContent/Part/functionResponse(_:)`` with the /// ``ModelContent/role`` "function", providing generation context for the next model turn. - public init(functionDeclarations: [FunctionDeclaration]?) { + /// - codeExecution: Enables the model to execute code as part of generation, if provided. 
+ public init(functionDeclarations: [FunctionDeclaration]? = nil, + codeExecution: CodeExecution? = nil) { self.functionDeclarations = functionDeclarations + self.codeExecution = codeExecution } } @@ -244,6 +250,55 @@ public struct FunctionResponse: Equatable { } } +/// Tool that executes code generated by the model, automatically returning the result to the model. +/// +/// This type has no fields. See ``ExecutableCode`` and ``CodeExecutionResult``, which are only +/// generated when using this tool. +public struct CodeExecution { + /// Constructs a new `CodeExecution` tool. + public init() {} +} + +/// Code generated by the model that is meant to be executed, and the result returned to the model. +/// +/// Only generated when using the ``CodeExecution`` tool, in which case the code will automatically +/// be executed, and a corresponding ``CodeExecutionResult`` will also be generated. +public struct ExecutableCode: Equatable { + /// The programming language of the ``code``. + public let language: String + + /// The code to be executed. + public let code: String +} + +/// Result of executing the ``ExecutableCode``. +/// +/// Only generated when using the ``CodeExecution`` tool, and always follows a part containing the +/// ``ExecutableCode``. +public struct CodeExecutionResult: Equatable { + /// Possible outcomes of the code execution. + public enum Outcome: String { + /// An unrecognized code execution outcome was provided. + case unknown = "OUTCOME_UNKNOWN" + /// Unspecified status; this value should not be used. + case unspecified = "OUTCOME_UNSPECIFIED" + /// Code execution completed successfully. + case ok = "OUTCOME_OK" + /// Code execution finished but with a failure; ``CodeExecutionResult/output`` should contain + /// the failure details from `stderr`. + case failed = "OUTCOME_FAILED" + /// Code execution ran for too long, and was cancelled. There may or may not be a partial + /// ``CodeExecutionResult/output`` present. + case deadlineExceeded = "OUTCOME_DEADLINE_EXCEEDED" + } + + /// Outcome of the code execution. + public let outcome: Outcome + + /// Contains `stdout` when code execution is successful, `stderr` or other description otherwise. + public let output: String +} + // MARK: - Codable Conformance extension FunctionCall: Decodable { @@ -293,3 +348,31 @@ extension FunctionCallingConfig.Mode: Encodable {} extension ToolConfig: Encodable {} extension FunctionResponse: Encodable {} + +extension CodeExecution: Encodable {} + +extension ExecutableCode: Codable {} + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +extension CodeExecutionResult.Outcome: Codable { + public init(from decoder: any Decoder) throws { + let value = try decoder.singleValueContainer().decode(String.self) + guard let decodedOutcome = CodeExecutionResult.Outcome(rawValue: value) else { + Logging.default + .error("[GoogleGenerativeAI] Unrecognized Outcome with value \"\(value)\".") + self = .unknown + return + } + + self = decodedOutcome + } +} + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +extension CodeExecutionResult: Codable { + public init(from decoder: any Decoder) throws { + let container = try decoder.container(keyedBy: CodingKeys.self) + outcome = try container.decode(Outcome.self, forKey: .outcome) + output = try container.decodeIfPresent(String.self, forKey: .output) ?? 
"" + } +} diff --git a/Sources/GoogleAI/GenerateContentResponse.swift b/Sources/GoogleAI/GenerateContentResponse.swift index 04c41f7..44083c4 100644 --- a/Sources/GoogleAI/GenerateContentResponse.swift +++ b/Sources/GoogleAI/GenerateContentResponse.swift @@ -46,16 +46,31 @@ public struct GenerateContentResponse { return nil } let textValues: [String] = candidate.content.parts.compactMap { part in - guard case let .text(text) = part else { + switch part { + case let .text(text): + return text + case let .executableCode(executableCode): + let codeBlockLanguage: String + if executableCode.language == "LANGUAGE_UNSPECIFIED" { + codeBlockLanguage = "" + } else { + codeBlockLanguage = executableCode.language.lowercased() + } + return "```\(codeBlockLanguage)\n\(executableCode.code)\n```" + case let .codeExecutionResult(codeExecutionResult): + if codeExecutionResult.output.isEmpty { + return nil + } + return "```\n\(codeExecutionResult.output)\n```" + case .data, .fileData, .functionCall, .functionResponse: return nil } - return text } guard textValues.count > 0 else { Logging.default.error("Could not get a text part from the first candidate.") return nil } - return textValues.joined(separator: " ") + return textValues.joined(separator: "\n") } /// Returns function calls found in any `Part`s of the first candidate of the response, if any. diff --git a/Sources/GoogleAI/ModelContent.swift b/Sources/GoogleAI/ModelContent.swift index 979c406..59bf1be 100644 --- a/Sources/GoogleAI/ModelContent.swift +++ b/Sources/GoogleAI/ModelContent.swift @@ -51,6 +51,12 @@ public struct ModelContent: Equatable { /// A response to a function call. case functionResponse(FunctionResponse) + /// Code generated by the model that is meant to be executed. + case executableCode(ExecutableCode) + + /// Result of executing the ``ExecutableCode``. + case codeExecutionResult(CodeExecutionResult) + // MARK: Convenience Initializers /// Convenience function for populating a Part with JPEG data. 
@@ -129,6 +135,8 @@ extension ModelContent.Part: Codable { case fileData case functionCall case functionResponse + case executableCode + case codeExecutionResult } enum InlineDataKeys: String, CodingKey { @@ -164,6 +172,10 @@ extension ModelContent.Part: Codable { try container.encode(functionCall, forKey: .functionCall) case let .functionResponse(functionResponse): try container.encode(functionResponse, forKey: .functionResponse) + case let .executableCode(executableCode): + try container.encode(executableCode, forKey: .executableCode) + case let .codeExecutionResult(codeExecutionResult): + try container.encode(codeExecutionResult, forKey: .codeExecutionResult) } } @@ -181,6 +193,13 @@ extension ModelContent.Part: Codable { self = .data(mimetype: mimetype, bytes) } else if values.contains(.functionCall) { self = try .functionCall(values.decode(FunctionCall.self, forKey: .functionCall)) + } else if values.contains(.executableCode) { + self = try .executableCode(values.decode(ExecutableCode.self, forKey: .executableCode)) + } else if values.contains(.codeExecutionResult) { + self = try .codeExecutionResult(values.decode( + CodeExecutionResult.self, + forKey: .codeExecutionResult + )) } else { throw DecodingError.dataCorrupted(.init( codingPath: [CodingKeys.text, CodingKeys.inlineData], diff --git a/Tests/GoogleAITests/CodeExecutionTests.swift b/Tests/GoogleAITests/CodeExecutionTests.swift new file mode 100644 index 0000000..2818fe6 --- /dev/null +++ b/Tests/GoogleAITests/CodeExecutionTests.swift @@ -0,0 +1,154 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import XCTest + +@testable import GoogleGenerativeAI + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class CodeExecutionTests: XCTestCase { + let decoder = JSONDecoder() + let encoder = JSONEncoder() + + let languageKey = "language" + let languageValue = "PYTHON" + let codeKey = "code" + let codeValue = "print('Hello, world!')" + let outcomeKey = "outcome" + let outcomeValue = "OUTCOME_OK" + let outputKey = "output" + let outputValue = "Hello, world!" 
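  // These fixtures use the raw wire values ("PYTHON", "OUTCOME_OK"). When parts decoded from
  // values like these surface through `GenerateContentResponse.text` (see the accessor changes
  // earlier in this patch), the language is lowercased into a fenced code block and the
  // execution output follows in a fence of its own, for example:
  //
  //   ```python
  //   print('Hello, world!')
  //   ```
  //   ```
  //   Hello, world!
  //   ```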
+ + override func setUp() { + encoder.outputFormatting = .init( + arrayLiteral: .prettyPrinted, .sortedKeys, .withoutEscapingSlashes + ) + } + + func testEncodeCodeExecution() throws { + let jsonData = try encoder.encode(CodeExecution()) + + let json = try XCTUnwrap(String(data: jsonData, encoding: .utf8)) + XCTAssertEqual(json, """ + { + + } + """) + } + + func testDecodeExecutableCode() throws { + let expectedExecutableCode = ExecutableCode(language: languageValue, code: codeValue) + let json = """ + { + "\(languageKey)": "\(languageValue)", + "\(codeKey)": "\(codeValue)" + } + """ + let jsonData = try XCTUnwrap(json.data(using: .utf8)) + + let executableCode = try XCTUnwrap(decoder.decode(ExecutableCode.self, from: jsonData)) + + XCTAssertEqual(executableCode, expectedExecutableCode) + } + + func testEncodeExecutableCode() throws { + let executableCode = ExecutableCode(language: languageValue, code: codeValue) + + let jsonData = try encoder.encode(executableCode) + + let json = try XCTUnwrap(String(data: jsonData, encoding: .utf8)) + XCTAssertEqual(json, """ + { + "\(codeKey)" : "\(codeValue)", + "\(languageKey)" : "\(languageValue)" + } + """) + } + + func testDecodeCodeExecutionResultOutcome_ok() throws { + let expectedOutcome = CodeExecutionResult.Outcome.ok + let json = "\"\(outcomeValue)\"" + let jsonData = try XCTUnwrap(json.data(using: .utf8)) + + let outcome = try XCTUnwrap(decoder.decode(CodeExecutionResult.Outcome.self, from: jsonData)) + + XCTAssertEqual(outcome, expectedOutcome) + } + + func testDecodeCodeExecutionResultOutcome_unknown() throws { + let expectedOutcome = CodeExecutionResult.Outcome.unknown + let json = "\"OUTCOME_NEW_VALUE\"" + let jsonData = try XCTUnwrap(json.data(using: .utf8)) + + let outcome = try XCTUnwrap(decoder.decode(CodeExecutionResult.Outcome.self, from: jsonData)) + + XCTAssertEqual(outcome, expectedOutcome) + } + + func testEncodeCodeExecutionResultOutcome() throws { + let jsonData = try encoder.encode(CodeExecutionResult.Outcome.ok) + + let json = try XCTUnwrap(String(data: jsonData, encoding: .utf8)) + XCTAssertEqual(json, "\"\(outcomeValue)\"") + } + + func testDecodeCodeExecutionResult() throws { + let expectedCodeExecutionResult = CodeExecutionResult(outcome: .ok, output: "Hello, world!") + let json = """ + { + "\(outcomeKey)": "\(outcomeValue)", + "\(outputKey)": "\(outputValue)" + } + """ + let jsonData = try XCTUnwrap(json.data(using: .utf8)) + + let codeExecutionResult = try XCTUnwrap(decoder.decode( + CodeExecutionResult.self, + from: jsonData + )) + + XCTAssertEqual(codeExecutionResult, expectedCodeExecutionResult) + } + + func testDecodeCodeExecutionResult_missingOutput() throws { + let expectedCodeExecutionResult = CodeExecutionResult(outcome: .deadlineExceeded, output: "") + let json = """ + { + "\(outcomeKey)": "OUTCOME_DEADLINE_EXCEEDED" + } + """ + let jsonData = try XCTUnwrap(json.data(using: .utf8)) + + let codeExecutionResult = try XCTUnwrap(decoder.decode( + CodeExecutionResult.self, + from: jsonData + )) + + XCTAssertEqual(codeExecutionResult, expectedCodeExecutionResult) + } + + func testEncodeCodeExecutionResult() throws { + let codeExecutionResult = CodeExecutionResult(outcome: .ok, output: outputValue) + + let jsonData = try encoder.encode(codeExecutionResult) + + let json = try XCTUnwrap(String(data: jsonData, encoding: .utf8)) + XCTAssertEqual(json, """ + { + "\(outcomeKey)" : "\(outcomeValue)", + "\(outputKey)" : "\(outputValue)" + } + """) + } +} diff --git a/Tests/GoogleAITests/GenerateContentRequestTests.swift 
b/Tests/GoogleAITests/GenerateContentRequestTests.swift index a808799..0ef1ac4 100644 --- a/Tests/GoogleAITests/GenerateContentRequestTests.swift +++ b/Tests/GoogleAITests/GenerateContentRequestTests.swift @@ -42,11 +42,16 @@ final class GenerateContentRequestTests: XCTestCase { harmCategory: .dangerousContent, threshold: .blockLowAndAbove )], - tools: [Tool(functionDeclarations: [FunctionDeclaration( - name: "test-function-name", - description: "test-function-description", - parameters: nil - )])], + tools: [ + Tool(functionDeclarations: [ + FunctionDeclaration( + name: "test-function-name", + description: "test-function-description", + parameters: nil + ), + ]), + Tool(codeExecution: CodeExecution()), + ], toolConfig: ToolConfig(functionCallingConfig: FunctionCallingConfig(mode: .auto)), systemInstruction: ModelContent(role: "system", parts: "test-system-instruction"), isStreaming: false, @@ -102,6 +107,11 @@ final class GenerateContentRequestTests: XCTestCase { } } ] + }, + { + "codeExecution" : { + + } } ] } diff --git a/Tests/GoogleAITests/GenerateContentResponseTests.swift b/Tests/GoogleAITests/GenerateContentResponseTests.swift new file mode 100644 index 0000000..46ee8b7 --- /dev/null +++ b/Tests/GoogleAITests/GenerateContentResponseTests.swift @@ -0,0 +1,173 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import Foundation +import XCTest + +@testable import GoogleGenerativeAI + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class GenerateContentResponseTests: XCTestCase { + let testText1 = "test-text-1" + let testText2 = "test-text-2" + let testLanguage = "PYTHON" + let testCode = "print('Hello, world!')" + let testOutput = "Hello, world!" 
+ + override func setUp() {} + + func testText_textPart() throws { + let parts = [ModelContent.Part.text(testText1)] + let candidate = CandidateResponse( + content: ModelContent(role: "model", parts: parts), + safetyRatings: [], + finishReason: nil, + citationMetadata: nil + ) + let response = GenerateContentResponse(candidates: [candidate]) + + let text = try XCTUnwrap(response.text) + + XCTAssertEqual(text, "\(testText1)") + } + + func testText_textParts_concatenated() throws { + let parts = [ModelContent.Part.text(testText1), ModelContent.Part.text(testText2)] + let candidate = CandidateResponse( + content: ModelContent(role: "model", parts: parts), + safetyRatings: [], + finishReason: nil, + citationMetadata: nil + ) + let response = GenerateContentResponse(candidates: [candidate]) + + let text = try XCTUnwrap(response.text) + + XCTAssertEqual(text, """ + \(testText1) + \(testText2) + """) + } + + func testText_executableCodePart_python() throws { + let parts = [ModelContent.Part.executableCode(ExecutableCode( + language: testLanguage, + code: testCode + ))] + let candidate = CandidateResponse( + content: ModelContent(role: "model", parts: parts), + safetyRatings: [], + finishReason: nil, + citationMetadata: nil + ) + let response = GenerateContentResponse(candidates: [candidate]) + + let text = try XCTUnwrap(response.text) + + XCTAssertEqual(text, """ + ```\(testLanguage.lowercased()) + \(testCode) + ``` + """) + } + + func testText_executableCodePart_unspecifiedLanguage() throws { + let parts = [ModelContent.Part.executableCode(ExecutableCode( + language: "LANGUAGE_UNSPECIFIED", + code: "echo $SHELL" + ))] + let candidate = CandidateResponse( + content: ModelContent(role: "model", parts: parts), + safetyRatings: [], + finishReason: nil, + citationMetadata: nil + ) + let response = GenerateContentResponse(candidates: [candidate]) + + let text = try XCTUnwrap(response.text) + + XCTAssertEqual(text, """ + ``` + echo $SHELL + ``` + """) + } + + func testText_codeExecutionResultPart_hasOutput() throws { + let parts = [ModelContent.Part.codeExecutionResult(CodeExecutionResult( + outcome: .ok, + output: testOutput + ))] + let candidate = CandidateResponse( + content: ModelContent(role: "model", parts: parts), + safetyRatings: [], + finishReason: nil, + citationMetadata: nil + ) + let response = GenerateContentResponse(candidates: [candidate]) + + let text = try XCTUnwrap(response.text) + + XCTAssertEqual(text, """ + ``` + \(testOutput) + ``` + """) + } + + func testText_codeExecutionResultPart_emptyOutput() throws { + let parts = [ModelContent.Part.codeExecutionResult(CodeExecutionResult( + outcome: .deadlineExceeded, + output: "" + ))] + let candidate = CandidateResponse( + content: ModelContent(role: "model", parts: parts), + safetyRatings: [], + finishReason: nil, + citationMetadata: nil + ) + let response = GenerateContentResponse(candidates: [candidate]) + + XCTAssertNil(response.text) + } + + func testText_codeExecution_concatenated() throws { + let parts: [ModelContent.Part] = [ + .text("test-text-1"), + .executableCode(ExecutableCode(language: testLanguage, code: testCode)), + .codeExecutionResult(CodeExecutionResult(outcome: .ok, output: testOutput)), + .text("test-text-2"), + ] + let candidate = CandidateResponse( + content: ModelContent(role: "model", parts: parts), + safetyRatings: [], + finishReason: nil, + citationMetadata: nil + ) + let response = GenerateContentResponse(candidates: [candidate]) + + let text = try XCTUnwrap(response.text) + + XCTAssertEqual(text, """ + 
\(testText1) + ```\(testLanguage.lowercased()) + \(testCode) + ``` + ``` + \(testOutput) + ``` + \(testText2) + """) + } +} diff --git a/Tests/GoogleAITests/GenerateContentResponses/streaming-success-code-execution.txt b/Tests/GoogleAITests/GenerateContentResponses/streaming-success-code-execution.txt new file mode 100644 index 0000000..24c9ef6 --- /dev/null +++ b/Tests/GoogleAITests/GenerateContentResponses/streaming-success-code-execution.txt @@ -0,0 +1,16 @@ +data: {"candidates": [{"content": {"parts": [{"text": "Thoughts"}],"role": "model"},"finishReason": "STOP","index": 0}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 1,"totalTokenCount": 22}} + +data: {"candidates": [{"content": {"parts": [{"text": ": I can use the `print()` function in Python to print strings. "}],"role": "model"},"finishReason": "STOP","index": 0,"safetyRatings": [{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HATE_SPEECH","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HARASSMENT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_DANGEROUS_CONTENT","probability": "NEGLIGIBLE"}]}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 16,"totalTokenCount": 37}} + +data: {"candidates": [{"content": {"parts": [{"text": "\n\n"}],"role": "model"},"finishReason": "STOP","index": 0,"safetyRatings": [{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HATE_SPEECH","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HARASSMENT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_DANGEROUS_CONTENT","probability": "NEGLIGIBLE"}]}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 16,"totalTokenCount": 37}} + +data: {"candidates": [{"content": {"parts": [{"executableCode": {"language": "PYTHON","code": "\nprint(\"Hello, world!\")\n"}}],"role": "model"},"finishReason": "STOP","index": 0,"safetyRatings": [{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HATE_SPEECH","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HARASSMENT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_DANGEROUS_CONTENT","probability": "NEGLIGIBLE"}]}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 29,"totalTokenCount": 50}} + +data: {"candidates": [{"content": {"parts": [{"codeExecutionResult": {"outcome": "OUTCOME_OK","output": "Hello, world!\n"}}],"role": "model"},"index": 0}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 29,"totalTokenCount": 50}} + +data: {"candidates": [{"content": {"parts": [{"text": "OK"}],"role": "model"},"finishReason": "STOP","index": 0}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 1,"totalTokenCount": 22}} + +data: {"candidates": [{"content": {"parts": [{"text": ". I have printed \"Hello, world!\" using the `print()` function in"}],"role": "model"},"finishReason": "STOP","index": 0,"safetyRatings": [{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HATE_SPEECH","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HARASSMENT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_DANGEROUS_CONTENT","probability": "NEGLIGIBLE"}]}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 17,"totalTokenCount": 38}} + +data: {"candidates": [{"content": {"parts": [{"text": " Python. 
\n"}],"role": "model"},"finishReason": "STOP","index": 0,"safetyRatings": [{"category": "HARM_CATEGORY_SEXUALLY_EXPLICIT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HATE_SPEECH","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_HARASSMENT","probability": "NEGLIGIBLE"},{"category": "HARM_CATEGORY_DANGEROUS_CONTENT","probability": "NEGLIGIBLE"}]}],"usageMetadata": {"promptTokenCount": 21,"candidatesTokenCount": 19,"totalTokenCount": 40}} + diff --git a/Tests/GoogleAITests/GenerateContentResponses/unary-success-code-execution.json b/Tests/GoogleAITests/GenerateContentResponses/unary-success-code-execution.json new file mode 100644 index 0000000..0b5a955 --- /dev/null +++ b/Tests/GoogleAITests/GenerateContentResponses/unary-success-code-execution.json @@ -0,0 +1,54 @@ +{ + "candidates": [ + { + "content": { + "parts": [ + { + "text": "To print strings in Python, you use the `print()` function. Here's how you can print \"Hello, world!\":\n\n" + }, + { + "executableCode": { + "language": "PYTHON", + "code": "\nprint(\"Hello, world!\")\n" + } + }, + { + "codeExecutionResult": { + "outcome": "OUTCOME_OK", + "output": "Hello, world!\n" + } + }, + { + "text": "The code successfully prints the string \"Hello, world!\". \n" + } + ], + "role": "model" + }, + "finishReason": "STOP", + "index": 0, + "safetyRatings": [ + { + "category": "HARM_CATEGORY_SEXUALLY_EXPLICIT", + "probability": "NEGLIGIBLE" + }, + { + "category": "HARM_CATEGORY_HATE_SPEECH", + "probability": "NEGLIGIBLE" + }, + { + "category": "HARM_CATEGORY_HARASSMENT", + "probability": "NEGLIGIBLE" + }, + { + "category": "HARM_CATEGORY_DANGEROUS_CONTENT", + "probability": "NEGLIGIBLE" + } + ] + } + ], + "usageMetadata": { + "promptTokenCount": 21, + "candidatesTokenCount": 11, + "totalTokenCount": 32 + } +} diff --git a/Tests/GoogleAITests/GenerativeModelTests.swift b/Tests/GoogleAITests/GenerativeModelTests.swift index 5a20343..3dbe7cd 100644 --- a/Tests/GoogleAITests/GenerativeModelTests.swift +++ b/Tests/GoogleAITests/GenerativeModelTests.swift @@ -285,7 +285,61 @@ final class GenerativeModelTests: XCTestCase { let functionCalls = response.functionCalls XCTAssertEqual(functionCalls.count, 2) let text = try XCTUnwrap(response.text) - XCTAssertEqual(text, "The sum of [1, 2, 3] is") + XCTAssertEqual(text, "The sum of [1, 2,\n3] is") + } + + func testGenerateContent_success_codeExecution() async throws { + MockURLProtocol + .requestHandler = try httpRequestHandler( + forResource: "unary-success-code-execution", + withExtension: "json" + ) + let expectedText1 = """ + To print strings in Python, you use the `print()` function. \ + Here's how you can print \"Hello, world!\":\n\n + """ + let expectedText2 = "The code successfully prints the string \"Hello, world!\". 
\n" + let expectedLanguage = "PYTHON" + let expectedCode = "\nprint(\"Hello, world!\")\n" + let expectedOutput = "Hello, world!\n" + + let response = try await model.generateContent(testPrompt) + + XCTAssertEqual(response.candidates.count, 1) + let candidate = try XCTUnwrap(response.candidates.first) + XCTAssertEqual(candidate.content.parts.count, 4) + guard case let .text(text1) = candidate.content.parts[0] else { + XCTFail("Expected first part to be text.") + return + } + XCTAssertEqual(text1, expectedText1) + guard case let .executableCode(executableCode) = candidate.content.parts[1] else { + XCTFail("Expected second part to be executable code.") + return + } + XCTAssertEqual(executableCode.language, expectedLanguage) + XCTAssertEqual(executableCode.code, expectedCode) + guard case let .codeExecutionResult(codeExecutionResult) = candidate.content.parts[2] else { + XCTFail("Expected second part to be a code execution result.") + return + } + XCTAssertEqual(codeExecutionResult.outcome, .ok) + XCTAssertEqual(codeExecutionResult.output, expectedOutput) + guard case let .text(text2) = candidate.content.parts[3] else { + XCTFail("Expected fourth part to be text.") + return + } + XCTAssertEqual(text2, expectedText2) + XCTAssertEqual(try XCTUnwrap(response.text), """ + \(expectedText1) + ```\(expectedLanguage.lowercased()) + \(expectedCode) + ``` + ``` + \(expectedOutput) + ``` + \(expectedText2) + """) } func testGenerateContent_usageMetadata() async throws { @@ -818,6 +872,59 @@ final class GenerativeModelTests: XCTestCase { })) } + func testGenerateContentStream_success_codeExecution() async throws { + MockURLProtocol + .requestHandler = try httpRequestHandler( + forResource: "streaming-success-code-execution", + withExtension: "txt" + ) + let expectedTexts1 = [ + "Thoughts", + ": I can use the `print()` function in Python to print strings. ", + "\n\n", + ] + let expectedTexts2 = [ + "OK", + ". I have printed \"Hello, world!\" using the `print()` function in", + " Python. 
\n", + ] + let expectedTexts = Set(expectedTexts1 + expectedTexts2) + let expectedLanguage = "PYTHON" + let expectedCode = "\nprint(\"Hello, world!\")\n" + let expectedOutput = "Hello, world!\n" + + var textValues = [String]() + let stream = model.generateContentStream(testPrompt) + for try await content in stream { + let candidate = try XCTUnwrap(content.candidates.first) + let part = try XCTUnwrap(candidate.content.parts.first) + switch part { + case let .text(textPart): + XCTAssertTrue(expectedTexts.contains(textPart)) + case let .executableCode(executableCode): + XCTAssertEqual(executableCode.language, expectedLanguage) + XCTAssertEqual(executableCode.code, expectedCode) + case let .codeExecutionResult(codeExecutionResult): + XCTAssertEqual(codeExecutionResult.outcome, .ok) + XCTAssertEqual(codeExecutionResult.output, expectedOutput) + default: + XCTFail("Unexpected part type: \(part)") + } + try textValues.append(XCTUnwrap(content.text)) + } + + XCTAssertEqual(textValues.joined(separator: "\n"), """ + \(expectedTexts1.joined(separator: "\n")) + ```\(expectedLanguage.lowercased()) + \(expectedCode) + ``` + ``` + \(expectedOutput) + ``` + \(expectedTexts2.joined(separator: "\n")) + """) + } + func testGenerateContentStream_usageMetadata() async throws { MockURLProtocol .requestHandler = try httpRequestHandler( From 2b8f14cfd158982fe3b2c8a8582ac3af5743742f Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Thu, 1 Aug 2024 12:06:52 -0400 Subject: [PATCH 13/16] Add snippets for code execution (#198) --- samples/CodeExecution.swift | 80 +++++++++++++++++++++++++++++++++++++ 1 file changed, 80 insertions(+) create mode 100644 samples/CodeExecution.swift diff --git a/samples/CodeExecution.swift b/samples/CodeExecution.swift new file mode 100644 index 0000000..e9c5f88 --- /dev/null +++ b/samples/CodeExecution.swift @@ -0,0 +1,80 @@ +// Copyright 2024 Google LLC +// +// Licensed under the Apache License, Version 2.0 (the "License"); +// you may not use this file except in compliance with the License. +// You may obtain a copy of the License at +// +// http://www.apache.org/licenses/LICENSE-2.0 +// +// Unless required by applicable law or agreed to in writing, software +// distributed under the License is distributed on an "AS IS" BASIS, +// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. +// See the License for the specific language governing permissions and +// limitations under the License. + +import GoogleGenerativeAI +import XCTest + +// Set up your API Key +// ==================== +// To use the Gemini API, you'll need an API key. To learn more, see the "Set up your API Key" +// section in the Gemini API quickstart: +// https://ai.google.dev/gemini-api/docs/quickstart?lang=swift#set-up-api-key + +@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) +final class CodeExecutionSnippets: XCTestCase { + override func setUpWithError() throws { + try XCTSkipIf( + APIKey.default.isEmpty, + "`\(APIKey.apiKeyEnvVar)` environment variable not set." + ) + } + + func testCodeExecutionBasic() async throws { + // [START code_execution_basic] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see + // "Set up your API key" above) + apiKey: APIKey.default, + tools: [Tool(codeExecution: CodeExecution())] + ) + + let prompt = """ + What is the sum of the first 50 prime numbers? 
+ Generate and run code for the calculation, and make sure you get all 50. + """ + let response = try await generativeModel.generateContent(prompt) + if let text = response.text { + print(text) + } + // [END code_execution_basic] + } + + func testCodeExecutionChat() async throws { + // [START code_execution_chat] + let generativeModel = + GenerativeModel( + // Specify a Gemini model appropriate for your use case + name: "gemini-1.5-flash", + // Access your API key from your on-demand resource .plist file (see + // "Set up your API key" above) + apiKey: APIKey.default, + tools: [Tool(codeExecution: CodeExecution())] + ) + + let chat = generativeModel.startChat() + + let prompt = """ + What is the sum of the first 50 prime numbers? + Generate and run code for the calculation, and make sure you get all 50. + """ + let response = try await chat.sendMessage(prompt) + if let text = response.text { + print(text) + } + // [END code_execution_chat] + } +} From 15c12087a97d1991b6d871f43c785e633100ff7f Mon Sep 17 00:00:00 2001 From: Mark McDonald Date: Fri, 16 Aug 2024 10:58:12 +0800 Subject: [PATCH 14/16] Add a README for /samples (#202) --- samples/README.md | 22 ++++++++++++++++++++++ 1 file changed, 22 insertions(+) create mode 100644 samples/README.md diff --git a/samples/README.md b/samples/README.md new file mode 100644 index 0000000..86c9335 --- /dev/null +++ b/samples/README.md @@ -0,0 +1,22 @@ +# Gemini API Swift SDK sample code + +This directory contains sample code for key features of the SDK, organised by high level feature. + +These samples are embedded in parts of the [documentation](https://ai.google.dev), most notably in the [API reference](https://ai.google.dev/api). + +Each file is structured as a runnable test case, ensuring that samples are executable and functional. Each test demonstrates a single concept, and contains region tags that are used to demarcate the test scaffolding from the spotlight code. If you are contributing, code within region tags should follow sample code best practices - being clear, complete and concise. + +## Contents + +| File | Description | +| ---- | ----------- | +| [APIKey.swift](./APIKey.swift) | Setting up your API key | +| [ChatSnippets.swift](./ChatSnippets.swift) | Multi-turn chat conversations | +| [CodeExecution.swift](./CodeExecution.swift) | Executing code | +| [ControlledGeneration.swift](./ControlledGeneration.swift) | Generating content with output constraints (e.g. 
JSON mode) | +| [CountTokens.swift](./CountTokens.swift) | Counting input and output tokens | +| [FunctionCalling.swift](./FunctionCalling.swift) | Using function calling | +| [GenerationConfig.swift](./GenerationConfig.swift) | Setting model parameters | +| [SafetySettings.swift](./SafetySettings.swift) | Setting and using safety controls | +| [SystemInstructions.swift](./SystemInstructions.swift) | Setting system instructions | +| [TextGeneration.swift](./TextGeneration.swift) | Generating text | From c94127c3baad2b55a45793aa42ffefbdeee28ee1 Mon Sep 17 00:00:00 2001 From: Andrew Heard Date: Wed, 21 Aug 2024 13:11:28 -0400 Subject: [PATCH 15/16] Update `main` branch README for PaLM API decommissioning (#205) --- README.md | 41 +++++++++++++++++------------------------ 1 file changed, 17 insertions(+), 24 deletions(-) diff --git a/README.md b/README.md index 89565e6..169e582 100644 --- a/README.md +++ b/README.md @@ -98,31 +98,24 @@ See [Contributing](https://github.com/google/generative-ai-swift/blob/main/docs/CONTRIBUTING.md) for more information on contributing to the Google AI SDK for Swift. -## Developers who use the PaLM SDK for Swift (Deprecated) - -> [!IMPORTANT] The PaLM API is deprecated for use with Google AI services and -> tools (but *not* for Vertex AI). Learn more about this deprecation, its -> timeline, and how to migrate to use Gemini in the -> [PaLM API deprecation guide](http://ai.google.dev/palm_docs/deprecation). - -​​If you're using the PaLM SDK for Swift, review the information below to -continue using the **deprecated** PaLM SDK until you've migrated to the new -version that allows you to use Gemini. - -- To continue using PaLM models, make sure your app depends on version - [`0.3.0`](https://github.com/google/generative-ai-swift/releases/tag/0.3.0) - *up to* the next minor version - ([`0.4.0`](https://github.com/google/generative-ai-swift/releases/tag/0.4.0)) - of `generative-ai-swift`. - -- When you're ready to use Gemini models, migrate your code to the Gemini API - and update your app's `generative-ai-swift` dependency to version `0.4.0` or - higher. - -To see the PaLM documentation and code, go to the -[`palm` branch](https://github.com/google/generative-ai-swift/tree/palm). +## Developers who use the PaLM SDK for Swift (Decommissioned) + +> [!IMPORTANT] +> The PaLM API is now +> [decommissioned](https://ai.google.dev/palm_docs/deprecation). This means that +> users cannot use a PaLM model in a prompt, tune a new PaLM model, or run +> inference on PaLM-tuned models. +> +> Note: This is different from the +> [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text), +> which is scheduled to be decommissioned in October 2024. + +​​If you're using the PaLM SDK for Swift, migrate your code to the Gemini API +and update your app's `generative-ai-swift` dependency to version `0.4.0` or +higher. For more information on migrating from PaLM to Gemini, see the +[migration guide](https://ai.google.dev/docs/migration_guide). ## License The contents of this repository are licensed under the -[Apache License, version 2.0](http://www.apache.org/licenses/LICENSE-2.0). \ No newline at end of file +[Apache License, version 2.0](http://www.apache.org/licenses/LICENSE-2.0). 
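
For readers following the README change above and migrating off the PaLM SDK, a minimal `Package.swift` sketch of what "update your app's `generative-ai-swift` dependency to version `0.4.0` or higher" can look like is shown below. This is illustrative only and not part of the patch: the package name `MyGeminiApp`, the target layout, and the platform floors are placeholders, while the repository URL and the `GoogleGenerativeAI` product name come from this SDK.

```swift
// swift-tools-version:5.9
// Minimal manifest sketch (assumed app/target names): pins generative-ai-swift
// at 0.4.0 or later so the Gemini-capable GoogleGenerativeAI module resolves.
import PackageDescription

let package = Package(
    name: "MyGeminiApp",
    platforms: [.iOS(.v15), .macOS(.v12)],
    dependencies: [
        // Resolves any release from 0.4.0 up to the next major version.
        .package(url: "https://github.com/google/generative-ai-swift", from: "0.4.0"),
    ],
    targets: [
        .executableTarget(
            name: "MyGeminiApp",
            dependencies: [
                .product(name: "GoogleGenerativeAI", package: "generative-ai-swift"),
            ]
        ),
    ]
)
```
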
From 44b8ce120425f9cf53ca756f3434ca2c2696f8bd Mon Sep 17 00:00:00 2001 From: Anup D'Souza <103429618+anupdsouza@users.noreply.github.com> Date: Thu, 22 Aug 2024 21:48:49 +0530 Subject: [PATCH 16/16] Update version to 0.5.6 (#206) --- Sources/GoogleAI/GenerativeAISwift.swift | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/Sources/GoogleAI/GenerativeAISwift.swift b/Sources/GoogleAI/GenerativeAISwift.swift index 0eb6253..910881e 100644 --- a/Sources/GoogleAI/GenerativeAISwift.swift +++ b/Sources/GoogleAI/GenerativeAISwift.swift @@ -21,7 +21,7 @@ import Foundation @available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *) public enum GenerativeAISwift { /// String value of the SDK version - public static let version = "0.5.4" + public static let version = "0.5.6" /// The Google AI backend endpoint URL. static let baseURL = "https://generativelanguage.googleapis.com" }
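
As a quick sanity check after picking up the release bumped in the final patch, an app can log the SDK's public version constant. This is a small hedged sketch, not part of the patch series; it only uses the `GenerativeAISwift.version` property shown in the diff above.

```swift
import GoogleGenerativeAI

// With the dependency resolved at this release, prints "0.5.6".
print("generative-ai-swift version: \(GenerativeAISwift.version)")
```
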