SwiftOpenAI
An open-source Swift package designed for effortless interaction with OpenAI's public API.
Description
SwiftOpenAI is an open-source Swift package that streamlines interactions with all of OpenAI's API endpoints, now with added support for Azure, AIProxy, and the Assistant stream APIs.
Getting an API Key
⚠️ Important
To interact with OpenAI services, you'll need an API key. Follow these steps to obtain one:
- Visit OpenAI.
- Sign up for an account or log in if you already have one.
- Navigate to the API key page and follow the instructions to generate a new API key.
For more information, consult OpenAI's official documentation.
⚠️ Please take precautions to keep your API key secure per OpenAI's guidance:
Remember that your API key is a secret! Do not share it with others or expose it in any client-side code (browsers, apps). Production requests must be routed through your backend server where your API key can be securely loaded from an environment variable or key management service.
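For local development, one way to keep the key out of your source is to read it from the process environment. This is only a minimal sketch; the variable name OPENAI_API_KEY is a convention used here, not something this package requires:

import Foundation

// Resolve the API key from an environment dictionary so it never lives in source control.
// "OPENAI_API_KEY" is a conventional name used here, not a requirement of the package.
func loadAPIKey(from environment: [String: String] = ProcessInfo.processInfo.environment) -> String? {
    environment["OPENAI_API_KEY"]
}

In Xcode you can set the variable under Scheme -> Run -> Arguments -> Environment Variables. For production, route requests through your backend instead, as noted above.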
SwiftOpenAI has built-in support for AIProxy, which is a backend for AI apps, to satisfy this requirement. To configure AIProxy, see the instructions here.
Installation
Swift Package Manager
- Open your Swift project in Xcode.
- Go to File -> Add Package Dependency.
- In the search bar, enter this URL.
- Choose the version you'd like to install (see the note below).
- Click Add Package.
Note: Xcode has a quirk where it defaults an SPM package's upper limit to 2.0.0. This package is beyond that limit, so you should not accept the defaults that Xcode proposes. Instead, enter the lower bound of the release version that you'd like to support, and then tab out of the input box for Xcode to adjust the upper bound. Alternatively, you may select branch -> main to stay on the bleeding edge.
Usage
To use SwiftOpenAI in your project, first import the package:
import SwiftOpenAI
Then, initialize the service using your OpenAI API key:
let apiKey = "your_openai_api_key_here"
let service = OpenAIServiceFactory.service(apiKey: apiKey)
You can optionally specify an organization ID if needed:
let apiKey = "your_openai_api_key_here"
let organizationID = "your_organization_id"
let service = OpenAIServiceFactory.service(apiKey: apiKey, organizationID: organizationID)
That's all you need to begin accessing the full range of OpenAI endpoints.
How to get the status code of network errors
You may want to build UI around the type of error that the API returns.
For example, a 429 means that your requests are being rate limited. The APIError type has a case responseUnsuccessful with two associated values: a description and a statusCode.
Here is a usage example using the chat completion API:
let service = OpenAIServiceFactory.service(apiKey: apiKey)
let parameters = ChatCompletionParameters(messages: [.init(role: .user, content: .text("hello world"))],
model: .gpt4o)
do {
let choices = try await service.startChat(parameters: parameters).choices
// Work with choices
} catch APIError.responseUnsuccessful(let description, let statusCode) {
print("Network error with status code: \(statusCode) and description: \(description)")
} catch {
print(error.localizedDescription)
}
Audio
Audio Transcriptions
Parameters
public struct AudioTranscriptionParameters: Encodable {
/// The name of the file asset is not documented in OpenAI's official documentation; however, it is essential for constructing the multipart request.
let fileName: String
/// The audio file object (not file name) to transcribe, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
let file: Data
/// ID of the model to use. Only whisper-1 is currently available.
let model: String
/// The language of the input audio. Supplying the input language in [ISO-639-1](https://en.wikipedia.org/wiki/List_of_ISO_639-1_codes) format will improve accuracy and latency.
let language: String?
/// An optional text to guide the model's style or continue a previous audio segment. The [prompt](https://platform.openai.com/docs/guides/speech-to-text/prompting) should match the audio language.
let prompt: String?
/// The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt. Defaults to json
let responseFormat: String?
/// The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit. Defaults to 0
let temperature: Double?
public enum Model: String {
case whisperOne = "whisper-1"
}
public init(
fileName: String,
file: Data,
model: Model = .whisperOne,
prompt: String? = nil,
responseFormat: String? = nil,
temperature: Double? = nil,
language: String? = nil)
{
self.fileName = fileName
self.file = file
self.model = model.rawValue
self.prompt = prompt
self.responseFormat = responseFormat
self.temperature = temperature
self.language = language
}
}
Response
public struct AudioObject: Decodable {
/// The transcribed text if the request uses the `transcriptions` API, or the translated text if the request uses the `translations` endpoint.
public let text: String
}
Usage
let fileName = "narcos.m4a"
let data = try Data(contentsOf: url) // Data for the file named "narcos.m4a"; `url` is the file's location on disk or in your bundle.
let parameters = AudioTranscriptionParameters(fileName: fileName, file: data) // **Important**: in the file name always provide the file extension.
let audioObject = try await service.createTranscription(parameters: parameters)
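All of the optional fields from the initializer above can be supplied as well. A sketch using the same service (the prompt and language values here are purely illustrative):

let parameters = AudioTranscriptionParameters(
    fileName: "narcos.m4a",
    file: data,
    prompt: "Narcos, Pablo Escobar, Medellín", // Vocabulary hints in the audio's language.
    responseFormat: "verbose_json",            // Includes timestamps and segment metadata.
    language: "es")                            // ISO-639-1 code of the input audio.
let audioObject = try await service.createTranscription(parameters: parameters)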
Audio Translations
Parameters
public struct AudioTranslationParameters: Encodable {
/// The name of the file asset is not documented in OpenAI's official documentation; however, it is essential for constructing the multipart request.
let fileName: String
/// The audio file object (not file name) to translate, in one of these formats: flac, mp3, mp4, mpeg, mpga, m4a, ogg, wav, or webm.
let file: Data
/// ID of the model to use. Only whisper-1 is currently available.
let model: String
/// An optional text to guide the model's style or continue a previous audio segment. The [prompt](https://platform.openai.com/docs/guides/speech-to-text/prompting) should match the audio language.
let prompt: String?
/// The format of the transcript output, in one of these options: json, text, srt, verbose_json, or vtt. Defaults to json
let responseFormat: String?
/// The sampling temperature, between 0 and 1. Higher values like 0.8 will make the output more random, while lower values like 0.2 will make it more focused and deterministic. If set to 0, the model will use [log probability](https://en.wikipedia.org/wiki/Log_probability) to automatically increase the temperature until certain thresholds are hit. Defaults to 0
let temperature: Double?
public enum Model: String {
case whisperOne = "whisper-1"
}
public init(
fileName: String,
file: Data,
model: Model = .whisperOne,
prompt: String? = nil,
responseFormat: String? = nil,
temperature: Double? = nil)
{
self.fileName = fileName
self.file = file
self.model = model.rawValue
self.prompt = prompt
self.responseFormat = responseFormat
self.temperature = temperature
}
}
Response
public struct AudioObject: Decodable {
/// The transcribed text if the request uses the `transcriptions` API, or the translated text if the request uses the `translations` endpoint.
public let text: String
}
Usage
let fileName = "german.m4a"
let data = try Data(contentsOf: url) // Data for the file named "german.m4a"; `url` is the file's location on disk or in your bundle.
let parameters = AudioTranslationParameters(fileName: fileName, file: data) // **Important**: in the file name always provide the file extension.
let audioObject = try await service.createTranslation(parameters: parameters)
Audio Speech
Parameters
/// [Generates audio from the input text.](https://platform.openai.com/docs/api-reference/audio/createSpeech)
public struct AudioSpeechParameters: Encodable {
/// One of the available [TTS models](https://platform.openai.com/docs/models/tts): tts-1 or tts-1-hd
let model: String
/// The text to generate audio for. The maximum length is 4096 characters.
let input: String
/// The voice to use when generating the audio. Supported voices are alloy, echo, fable, onyx, nova, and shimmer. Previews of the voices are available in the [Text to speech guide.](https://platform.openai.com/docs/guides/text-to-speech/voice-options)
let voice: String
/// Defaults to mp3. The format to return the audio in. Supported formats are mp3, opus, aac, and flac.
let responseFormat: String?
/// Defaults to 1.0. The speed of the generated audio. Select a value from 0.25 to 4.0.
let speed: Double?
public enum TTSModel: String {
case tts1 = "tts-1"
case tts1HD = "tts-1-hd"
}
public enum Voice: String {
case alloy
case echo
case fable
case onyx
case nova
case shimmer
}
public enum ResponseFormat: String {
case mp3
case opus
case aac
case flac
}
public init(
model: TTSModel,
input: String,
voice: Voice,
responseFormat: ResponseFormat? = nil,
speed: Double? = nil)
{
self.model = model.rawValue
self.input = input
self.voice = voice.rawValue
self.responseFormat = responseFormat?.rawValue
self.speed = speed
}
}
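A usage sketch mirroring the transcription and translation examples above. The method name createSpeech and the output property are assumptions based on that pattern, not confirmed by this section:

let parameters = AudioSpeechParameters(model: .tts1, input: "Hello, world!", voice: .alloy)
let audioObject = try await service.createSpeech(parameters: parameters)
// audioObject.output holds the generated audio as Data, ready to play or write to disk.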
Response
/// The [audio