merged upstream changes
NatashaTheRobot committed Sep 28, 2024
2 parents c25a62a + 44b8ce1 commit ea6cffd
Showing 26 changed files with 1,728 additions and 90 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/cli.yml
@@ -18,7 +18,7 @@ jobs:
os: [macos-13]
include:
- os: macos-13
xcode: Xcode_15.0.1
xcode: Xcode_15.1
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v4
@@ -157,7 +157,7 @@ class FunctionCallingViewModel: ObservableObject {
case let .functionCall(functionCall):
messages.insert(functionCall.chatMessage(), at: messages.count - 1)
functionCalls.append(functionCall)
case .data, .fileData, .functionResponse:
case .data, .fileData, .functionResponse, .executableCode, .codeExecutionResult:
fatalError("Unsupported response content.")
}
}
162 changes: 87 additions & 75 deletions README.md
@@ -1,109 +1,121 @@
# Google AI SDK for Swift

[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fgoogle-gemini%2Fgenerative-ai-swift%2Fbadge%3Ftype%3Dswift-versions)](https://swiftpackageindex.com/google-gemini/generative-ai-swift)
[![](https://img.shields.io/endpoint?url=https%3A%2F%2Fswiftpackageindex.com%2Fapi%2Fpackages%2Fgoogle-gemini%2Fgenerative-ai-swift%2Fbadge%3Ftype%3Dplatforms)](https://swiftpackageindex.com/google-gemini/generative-ai-swift)

# Google AI SDK for Swift
The Google AI Swift SDK is the easiest way for Swift developers to build with
the Gemini API. The Gemini API gives you access to Gemini
[models](https://ai.google.dev/models/gemini) created by
[Google DeepMind](https://deepmind.google/technologies/gemini/#introduction).
Gemini models are built from the ground up to be multimodal, so you can reason
seamlessly across text, images, and code.

> [!CAUTION]
> **The Google AI SDK for Swift is recommended for prototyping only.** If you plan to enable
> billing, we strongly recommend that you use a backend SDK to access the Google AI Gemini API. You
> risk potentially exposing your API key to malicious actors if you embed your API key directly in
> your Swift app or fetch it remotely at runtime.
> **The Google AI SDK for Swift is recommended for prototyping only.** If you
> plan to enable billing, we strongly recommend that you use a backend SDK to
> access the Google AI Gemini API. You risk potentially exposing your API key to
> malicious actors if you embed your API key directly in your Swift app or fetch
> it remotely at runtime.
## Get started with the Gemini API

1. Go to [Google AI Studio](https://aistudio.google.com/).
2. Log in with your Google account.
3. [Create an API key](https://aistudio.google.com/app/apikey). Note that in
Europe the free tier is not available.
4. Check out this repository. \
`git clone https://github.com/google/generative-ai-swift`
5. Open and build the sample app in the `Examples` folder of this repo.
6. Run the app once to ensure the build script generates an empty
`GenerativeAI-Info.plist` file.
7. Paste your API key into the `API_KEY` property in the
`GenerativeAI-Info.plist` file.
8. Run the app.
9. For detailed instructions, try the
[Swift SDK tutorial](https://ai.google.dev/tutorials/swift_quickstart) on
[ai.google.dev](https://ai.google.dev).

## Usage example

1. Add [`generative-ai-swift`](https://github.com/google/generative-ai-swift)
to your Xcode project using Swift Package Manager.

2. Import the `GoogleGenerativeAI` module

The Google AI SDK for Swift enables developers to use Google's state-of-the-art generative AI models
(like Gemini) to build AI-powered features and applications. This SDK supports use cases like:
- Generate text from text-only input
- Generate text from text-and-images input (multimodal)
- Build multi-turn conversations (chat)
```swift
import GoogleGenerativeAI
```

For example, with just a few lines of code, you can access Gemini's multimodal capabilities to
generate text from text-and-image input:
1. Initialize the model

```swift
let model = GenerativeModel(name: "gemini-1.5-flash-latest", apiKey: "YOUR_API_KEY")
```

1. Run a prompt

```swift
let cookieImage = UIImage(...)
let prompt = "Do these look store-bought or homemade?"

let response = try await model.generateContent(prompt, cookieImage)
```

## Try out the sample Swift app

This repository contains a sample app demonstrating how the SDK can access and utilize the Gemini
model for various use cases.

To try out the sample app, follow these steps:

1. Check out this repository.\
`git clone https://github.com/google/generative-ai-swift`

1. [Obtain an API key](https://makersuite.google.com/app/apikey) to use with the Google AI SDKs.

1. Open and build the sample app in the `Examples` folder of this repo.

1. Run the app once to ensure the build script generates an empty `GenerativeAI-Info.plist` file

1. Paste your API key into the `API_KEY` property in the `GenerativeAI-Info.plist` file.

1. Run the app.

## Use the SDK in your app

Add [`generative-ai-swift`](https://github.com/google/generative-ai-swift) to your Xcode project
using Swift Package Manager.

For detailed instructions, you can find a
[quickstart](https://ai.google.dev/tutorials/swift_quickstart) for the Google AI SDK for Swift in the
Google documentation.
[quickstart](https://ai.google.dev/tutorials/swift_quickstart) for the Google AI
SDK for Swift in the Google documentation.

This quickstart describes how to add your API key and the Swift package to your app, initialize the
model, and then call the API to access the model. It also describes some additional use cases and
features, like streaming, counting tokens, and controlling responses.
This quickstart describes how to add your API key and the Swift package to your
app, initialize the model, and then call the API to access the model. It also
describes some additional use cases and features, like streaming, counting
tokens, and controlling responses.

## Logging

To enable additional logging in the Xcode console, including a cURL command and raw stream
response for each model request, add `-GoogleGenerativeAIDebugLogEnabled` as
`Arguments Passed On Launch` in the Xcode scheme.
To enable additional logging in the Xcode console, including a cURL command and
raw stream response for each model request, add
`-GoogleGenerativeAIDebugLogEnabled` as `Arguments Passed On Launch` in the
Xcode scheme.

## Command Line Tool

A command line tool is available to experiment with Gemini model requests via Xcode or the command
line:
A command line tool is available to experiment with Gemini model requests via
Xcode or the command line:

1. `open Examples/GenerativeAICLI/Package.swift`
1. Run in Xcode and examine the console to see the options.
1. Edit the scheme's `Arguments Passed On Launch` with the desired options.
1. `open Examples/GenerativeAICLI/Package.swift`
1. Run in Xcode and examine the console to see the options.
1. Edit the scheme's `Arguments Passed On Launch` with the desired options.

## Documentation

Find complete documentation for the Google AI SDKs and the Gemini model in the Google
documentation: https://ai.google.dev/docs
See the
[Gemini API Cookbook](https://github.com/google-gemini/gemini-api-cookbook/) or
[ai.google.dev](https://ai.google.dev) for complete documentation.

## Contributing

See [Contributing](https://github.com/google/generative-ai-swift/blob/main/docs/CONTRIBUTING.md)
for more information on
contributing to the Google AI SDK for Swift.

See
[Contributing](https://github.com/google/generative-ai-swift/blob/main/docs/CONTRIBUTING.md)
for more information on contributing to the Google AI SDK for Swift.

## Developers who use the PaLM SDK for Swift (Deprecated)
## Developers who use the PaLM SDK for Swift (Decommissioned)

> [!IMPORTANT]
> The PaLM API is deprecated for use with Google AI services and tools (but _not_ for Vertex AI).
> Learn more about this deprecation, its timeline, and how to migrate to use Gemini in the
> [PaLM API deprecation guide](http://ai.google.dev/palm_docs/deprecation).
If you're using the PaLM SDK for Swift, review the information below to continue using the
**deprecated** PaLM SDK until you've migrated to the new version that allows you to use Gemini.

- To continue using PaLM models, make sure your app depends on version
[`0.3.0`](https://github.com/google/generative-ai-swift/releases/tag/0.3.0)
_up to_ the next minor version
([`0.4.0`](https://github.com/google/generative-ai-swift/releases/tag/0.4.0))
of `generative-ai-swift`.

- When you're ready to use Gemini models, migrate your code to the Gemini API and update your app's
`generative-ai-swift` dependency to version `0.4.0` or higher.

To see the PaLM documentation and code, go to the
[`palm` branch](https://github.com/google/generative-ai-swift/tree/palm).
> The PaLM API is now
> [decommissioned](https://ai.google.dev/palm_docs/deprecation). This means that
> users cannot use a PaLM model in a prompt, tune a new PaLM model, or run
> inference on PaLM-tuned models.
>
> Note: This is different from the
> [Vertex AI PaLM API](https://cloud.google.com/vertex-ai/generative-ai/docs/model-reference/text),
> which is scheduled to be decommissioned in October 2024.
If you're using the PaLM SDK for Swift, migrate your code to the Gemini API
and update your app's `generative-ai-swift` dependency to version `0.4.0` or
higher. For more information on migrating from PaLM to Gemini, see the
[migration guide](https://ai.google.dev/docs/migration_guide).

## License

The contents of this repository are licensed under the
[Apache License, version 2.0](http://www.apache.org/licenses/LICENSE-2.0).
3 changes: 2 additions & 1 deletion Sources/GoogleAI/Chat.swift
@@ -160,7 +160,8 @@ public class Chat {
case let .text(str):
combinedText += str

case .data, .fileData, .functionCall, .functionResponse:
case .data, .fileData, .functionCall, .functionResponse, .executableCode,
.codeExecutionResult:
// Don't combine it, just add to the content. If there's any text pending, add that as
// a part.
if !combinedText.isEmpty {
85 changes: 84 additions & 1 deletion Sources/GoogleAI/FunctionCalling.swift
@@ -161,6 +161,9 @@ public struct Tool {
/// A list of `FunctionDeclarations` available to the model.
let functionDeclarations: [FunctionDeclaration]?

/// Enables the model to execute code as part of generation.
let codeExecution: CodeExecution?

/// Constructs a new `Tool`.
///
/// - Parameters:
@@ -172,8 +175,11 @@ public struct Tool {
/// populating ``FunctionCall`` in the response. The next conversation turn may contain a
/// ``FunctionResponse`` in ``ModelContent/Part/functionResponse(_:)`` with the
/// ``ModelContent/role`` "function", providing generation context for the next model turn.
public init(functionDeclarations: [FunctionDeclaration]?) {
/// - codeExecution: Enables the model to execute code as part of generation, if provided.
public init(functionDeclarations: [FunctionDeclaration]? = nil,
codeExecution: CodeExecution? = nil) {
self.functionDeclarations = functionDeclarations
self.codeExecution = codeExecution
}
}
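
Because both initializer parameters now default to `nil`, a code-execution-only tool can be built without mentioning function declarations. The sketch below is not the SDK's code: it uses local stand-in types that mirror the declarations in this hunk, simplifies `functionDeclarations` to `[String]`, and assumes `Tool` is `Encodable` with synthesized conformance. The helper `encodedJSON` is hypothetical, and the request body the SDK actually sends may differ.

```swift
import Foundation

// Local stand-ins mirroring the declarations in this diff (not the SDK's own types).
struct CodeExecution: Encodable {}

struct Tool: Encodable {
  // Simplified to [String] for illustration; the SDK uses [FunctionDeclaration].
  let functionDeclarations: [String]?
  let codeExecution: CodeExecution?

  // Both parameters default to nil, as in the updated initializer.
  init(functionDeclarations: [String]? = nil, codeExecution: CodeExecution? = nil) {
    self.functionDeclarations = functionDeclarations
    self.codeExecution = codeExecution
  }
}

// Hypothetical helper: encodes a tool and returns the JSON as a string.
func encodedJSON(_ tool: Tool) throws -> String {
  let data = try JSONEncoder().encode(tool)
  return String(decoding: data, as: UTF8.self)
}
```

With synthesized encoding, `nil` optionals are omitted, so `Tool(codeExecution: CodeExecution())` encodes to an object containing only an empty `codeExecution` value.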

@@ -244,6 +250,55 @@ public struct FunctionResponse: Equatable {
}
}

/// Tool that executes code generated by the model, automatically returning the result to the model.
///
/// This type has no fields. See ``ExecutableCode`` and ``CodeExecutionResult``, which are only
/// generated when using this tool.
public struct CodeExecution {
/// Constructs a new `CodeExecution` tool.
public init() {}
}

/// Code generated by the model that is meant to be executed, and the result returned to the model.
///
/// Only generated when using the ``CodeExecution`` tool, in which case the code will automatically
/// be executed, and a corresponding ``CodeExecutionResult`` will also be generated.
public struct ExecutableCode: Equatable {
/// The programming language of the ``code``.
public let language: String

/// The code to be executed.
public let code: String
}

/// Result of executing the ``ExecutableCode``.
///
/// Only generated when using the ``CodeExecution`` tool, and always follows a part containing the
/// ``ExecutableCode``.
public struct CodeExecutionResult: Equatable {
/// Possible outcomes of the code execution.
public enum Outcome: String {
/// An unrecognized code execution outcome was provided.
case unknown = "OUTCOME_UNKNOWN"
/// Unspecified status; this value should not be used.
case unspecified = "OUTCOME_UNSPECIFIED"
/// Code execution completed successfully.
case ok = "OUTCOME_OK"
/// Code execution finished but with a failure; ``CodeExecutionResult/output`` should contain
/// the failure details from `stderr`.
case failed = "OUTCOME_FAILED"
/// Code execution ran for too long, and was cancelled. There may or may not be a partial
/// ``CodeExecutionResult/output`` present.
case deadlineExceeded = "OUTCOME_DEADLINE_EXCEEDED"
}

/// Outcome of the code execution.
public let outcome: Outcome

/// Contains `stdout` when code execution is successful, `stderr` or other description otherwise.
public let output: String
}

// MARK: - Codable Conformance

extension FunctionCall: Decodable {
@@ -293,3 +348,31 @@ extension FunctionCallingConfig.Mode: Encodable {}
extension ToolConfig: Encodable {}

extension FunctionResponse: Encodable {}

extension CodeExecution: Encodable {}

extension ExecutableCode: Codable {}

@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *)
extension CodeExecutionResult.Outcome: Codable {
public init(from decoder: any Decoder) throws {
let value = try decoder.singleValueContainer().decode(String.self)
guard let decodedOutcome = CodeExecutionResult.Outcome(rawValue: value) else {
Logging.default
.error("[GoogleGenerativeAI] Unrecognized Outcome with value \"\(value)\".")
self = .unknown
return
}

self = decodedOutcome
}
}

@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *)
extension CodeExecutionResult: Codable {
public init(from decoder: any Decoder) throws {
let container = try decoder.container(keyedBy: CodingKeys.self)
outcome = try container.decode(Outcome.self, forKey: .outcome)
output = try container.decodeIfPresent(String.self, forKey: .output) ?? ""
}
}
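
The two decoders above tolerate unexpected input: an unrecognized outcome string becomes `.unknown` instead of throwing, and a missing `output` key becomes an empty string. A minimal, self-contained sketch of that behavior follows. The types are local stand-ins that mirror the diff rather than the SDK's own declarations, logging is omitted, and `decodeResult` is a hypothetical helper added for illustration.

```swift
import Foundation

// Local stand-in mirroring CodeExecutionResult from this diff (not the SDK's own type).
struct CodeExecutionResult: Decodable, Equatable {
  enum Outcome: String, Decodable {
    case unknown = "OUTCOME_UNKNOWN"
    case unspecified = "OUTCOME_UNSPECIFIED"
    case ok = "OUTCOME_OK"
    case failed = "OUTCOME_FAILED"
    case deadlineExceeded = "OUTCOME_DEADLINE_EXCEEDED"

    init(from decoder: Decoder) throws {
      let value = try decoder.singleValueContainer().decode(String.self)
      // Unrecognized strings fall back to .unknown instead of throwing.
      self = Outcome(rawValue: value) ?? .unknown
    }
  }

  let outcome: Outcome
  let output: String

  enum CodingKeys: String, CodingKey {
    case outcome, output
  }

  init(from decoder: Decoder) throws {
    let container = try decoder.container(keyedBy: CodingKeys.self)
    outcome = try container.decode(Outcome.self, forKey: .outcome)
    // A missing "output" key is treated as an empty string.
    output = try container.decodeIfPresent(String.self, forKey: .output) ?? ""
  }
}

// Hypothetical helper: decodes a CodeExecutionResult from a JSON string.
func decodeResult(_ json: String) throws -> CodeExecutionResult {
  try JSONDecoder().decode(CodeExecutionResult.self, from: Data(json.utf8))
}
```

The fallback keeps older clients decoding successfully if the backend introduces a new outcome value, at the cost of hiding which value was received unless it is logged.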
21 changes: 18 additions & 3 deletions Sources/GoogleAI/GenerateContentResponse.swift
@@ -46,16 +46,31 @@ public struct GenerateContentResponse {
return nil
}
let textValues: [String] = candidate.content.parts.compactMap { part in
guard case let .text(text) = part else {
switch part {
case let .text(text):
return text
case let .executableCode(executableCode):
let codeBlockLanguage: String
if executableCode.language == "LANGUAGE_UNSPECIFIED" {
codeBlockLanguage = ""
} else {
codeBlockLanguage = executableCode.language.lowercased()
}
return "```\(codeBlockLanguage)\n\(executableCode.code)\n```"
case let .codeExecutionResult(codeExecutionResult):
if codeExecutionResult.output.isEmpty {
return nil
}
return "```\n\(codeExecutionResult.output)\n```"
case .data, .fileData, .functionCall, .functionResponse:
return nil
}
return text
}
guard textValues.count > 0 else {
Logging.default.error("Could not get a text part from the first candidate.")
return nil
}
return textValues.joined(separator: " ")
return textValues.joined(separator: "\n")
}

/// Returns function calls found in any `Part`s of the first candidate of the response, if any.
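
The effect of this hunk on the `text` value can be sketched in isolation: executable code and non-empty execution output are wrapped in Markdown code fences, and the pieces are joined with newlines rather than spaces. The `Part` enum and `renderText` function below are hypothetical stand-ins reduced to the cases that matter here, not the SDK's `ModelContent.Part` or its `text` property.

```swift
// Hypothetical stand-in for the SDK's part type, reduced to the relevant cases.
enum Part {
  case text(String)
  case executableCode(language: String, code: String)
  case codeExecutionResult(output: String)
  case other
}

// Mirrors the joining logic in the hunk above: fence code and output, skip the rest.
func renderText(_ parts: [Part]) -> String? {
  let values: [String] = parts.compactMap { part in
    switch part {
    case let .text(text):
      return text
    case let .executableCode(language, code):
      // An unspecified language yields a fence with no language tag.
      let fenceLanguage = language == "LANGUAGE_UNSPECIFIED" ? "" : language.lowercased()
      return "```\(fenceLanguage)\n\(code)\n```"
    case let .codeExecutionResult(output):
      // Empty output contributes nothing to the text.
      return output.isEmpty ? nil : "```\n\(output)\n```"
    case .other:
      return nil
    }
  }
  return values.isEmpty ? nil : values.joined(separator: "\n")
}
```

Joining with `"\n"` matters here because a fence that starts mid-line, as it would with the previous `" "` separator, would not render as a code block.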
2 changes: 1 addition & 1 deletion Sources/GoogleAI/GenerativeAISwift.swift
@@ -21,7 +21,7 @@ import Foundation
@available(iOS 15.0, macOS 11.0, macCatalyst 15.0, *)
public enum GenerativeAISwift {
/// String value of the SDK version
public static let version = "0.5.4"
public static let version = "0.5.6"
/// The Google AI backend endpoint URL.
static let baseURL = "https://generativelanguage.googleapis.com"
}