# Audio Generation API (BETA)

Generate sound effects, music, and text-to-speech audio for dynamic game content. Powered by ElevenLabs.

***

## Quick Start

```typescript
import RundotGameAPI from '@series-inc/rundot-game-sdk/api'

// Generate a sound effect
const sfx = await RundotGameAPI.audioGen.generate({
  type: 'sfx',
  description: 'Sword clashing against metal shield, heavy impact',
})

const audio = new Audio(sfx.audioUrl)
audio.play()
```

## Sound Effects (SFX)

Generate short sound effects from a text description.

```typescript
const result = await RundotGameAPI.audioGen.generate({
  type: 'sfx',
  description: 'Glass shattering on stone floor, sharp and echoey',
  durationSec: 3,       // 0.5–30 seconds (optional, model decides if omitted)
  clientRef: 'shatter-1', // Optional correlation ID for job recovery
})

console.log(result.audioUrl)     // URL to the generated audio
console.log(result.durationSec)  // Actual duration
```

### SFX Parameters

| Parameter     | Type     | Description                                           |
| ------------- | -------- | ----------------------------------------------------- |
| `type`        | `'sfx'`  | Required discriminator                                |
| `description` | `string` | Describe materials, intensity, context (required)     |
| `durationSec` | `number` | Duration 0.5–30s (optional, model decides if omitted) |
| `clientRef`   | `string` | Opaque correlation ID echoed back in job events       |

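The duration range above can be enforced on the client before a request is sent. The following is a minimal sketch, not part of the SDK: `SfxParams` is a local mirror of the table, and `buildSfxParams` is a hypothetical helper that clamps `durationSec` into the documented range and leaves it unset when omitted so the model can decide.

```typescript
// Local mirror of the SFX parameter table; not imported from the SDK.
interface SfxParams {
  type: 'sfx'
  description: string
  durationSec?: number
  clientRef?: string
}

const SFX_MIN_SEC = 0.5
const SFX_MAX_SEC = 30

// Hypothetical helper: builds an SFX request and clamps durationSec
// into the documented 0.5–30 second range.
function buildSfxParams(
  description: string,
  durationSec?: number,
  clientRef?: string,
): SfxParams {
  const trimmed = description.trim()
  if (trimmed.length === 0) {
    throw new Error('SFX description must not be empty')
  }

  const params: SfxParams = { type: 'sfx', description: trimmed }

  if (durationSec !== undefined) {
    if (!Number.isFinite(durationSec)) {
      throw new RangeError('durationSec must be a finite number')
    }
    params.durationSec = Math.min(SFX_MAX_SEC, Math.max(SFX_MIN_SEC, durationSec))
  }
  if (clientRef !== undefined) {
    params.clientRef = clientRef
  }
  return params
}
```

The returned object can be passed straight to `RundotGameAPI.audioGen.generate(...)`. Clamping is one design choice; throwing on out-of-range values is equally reasonable if silent adjustment would hide bugs.
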
## Music

Generate music tracks with a prompt describing genre, tempo, instruments, and mood.

```typescript
const music = await RundotGameAPI.audioGen.generate({
  type: 'music',
  prompt: 'Orchestral victory fanfare, triumphant brass, 120 BPM',
  durationSec: 15,       // Required, 3–300 seconds
  clientRef: 'victory-music',
})

console.log(music.audioUrl)
console.log(music.durationSec)
```

### Music Parameters

| Parameter     | Type      | Description                                     |
| ------------- | --------- | ----------------------------------------------- |
| `type`        | `'music'` | Required discriminator                          |
| `prompt`      | `string`  | Genre, tempo, instruments, mood (required)      |
| `durationSec` | `number`  | Duration 3–300s (required)                      |
| `clientRef`   | `string`  | Opaque correlation ID echoed back in job events |

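Because the prompt works best when genre, tempo, instruments, and mood appear together, it can help to assemble it from structured fields. This is a sketch under assumptions: `MusicBrief`, `buildMusicPrompt`, and `buildMusicParams` are hypothetical names, and the API itself only accepts the free-form `prompt` string.

```typescript
// Hypothetical structured brief; the API only takes a free-form prompt string.
interface MusicBrief {
  genre: string
  mood: string
  instruments: string[]
  bpm: number
}

// Local mirror of the music parameter table; not imported from the SDK.
interface MusicParams {
  type: 'music'
  prompt: string
  durationSec: number
  clientRef?: string
}

// Joins the brief into one comma-separated prompt, skipping empty parts.
function buildMusicPrompt(brief: MusicBrief): string {
  const parts = [brief.genre, brief.mood, ...brief.instruments, `${brief.bpm} BPM`]
  return parts
    .map((part) => part.trim())
    .filter((part) => part.length > 0)
    .join(', ')
}

// Builds a music request, rejecting durations outside the documented 3–300 second range.
function buildMusicParams(
  brief: MusicBrief,
  durationSec: number,
  clientRef?: string,
): MusicParams {
  if (!(durationSec >= 3 && durationSec <= 300)) {
    throw new RangeError('Music durationSec must be between 3 and 300 seconds')
  }
  const params: MusicParams = { type: 'music', prompt: buildMusicPrompt(brief), durationSec }
  if (clientRef !== undefined) {
    params.clientRef = clientRef
  }
  return params
}
```
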
## Text-to-Speech (TTS)

Convert text to spoken audio with configurable voice and expression parameters.

```typescript
const speech = await RundotGameAPI.audioGen.generate({
  type: 'tts',
  text: '[whispers] The treasure lies beneath the old oak tree.',
  voiceId: 'pNInz6obpgDQGcFmaJgB',  // ElevenLabs voice ID
  model: 'eleven_v3',
  stability: 0.3,         // Lower = more expressive
  similarityBoost: 0.8,
  speed: 1.0,
})

console.log(speech.audioUrl)
```

### TTS Parameters

| Parameter         | Type                                      | Description                                                                                      |
| ----------------- | ----------------------------------------- | ------------------------------------------------------------------------------------------------ |
| `type`            | `'tts'`                                   | Required discriminator                                                                           |
| `text`            | `string`                                  | Text to speak; supports ElevenLabs v3 audio tags: `[whispers]`, `[shouts]`, `[pause]` (required) |
| `voiceId`         | `string`                                  | ElevenLabs voice ID (required)                                                                   |
| `model`           | `'eleven_v3' \| 'eleven_multilingual_v2'` | TTS model (default: `'eleven_v3'`)                                                               |
| `stability`       | `number`                                  | 0–1, lower = more expressive (default: 0.5)                                                      |
| `similarityBoost` | `number`                                  | 0–1, voice similarity boost (default: 0.8)                                                       |
| `speed`           | `number`                                  | Speech speed 0.5–2.0 (default: 1.0)                                                              |
| `clientRef`       | `string`                                  | Opaque correlation ID echoed back in job events                                                  |

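The numeric ranges in the table can be checked before a request leaves the client. The sketch below assumes nothing about how the SDK itself validates input: `TtsParams` is a local mirror of the table, and `validateTtsParams` is a hypothetical pre-flight helper.

```typescript
// Local mirror of the TTS parameter table; not imported from the SDK.
interface TtsParams {
  type: 'tts'
  text: string
  voiceId: string
  model?: 'eleven_v3' | 'eleven_multilingual_v2'
  stability?: number
  similarityBoost?: number
  speed?: number
  clientRef?: string
}

// Hypothetical pre-flight check: returns a list of problems,
// which is empty when the request looks valid.
function validateTtsParams(params: TtsParams): string[] {
  const problems: string[] = []

  if (params.text.trim().length === 0) problems.push('text is required')
  if (params.voiceId.trim().length === 0) problems.push('voiceId is required')

  // Optional numeric fields are only checked when present.
  const checkRange = (
    name: string,
    value: number | undefined,
    min: number,
    max: number,
  ): void => {
    if (value !== undefined && !(value >= min && value <= max)) {
      problems.push(`${name} must be between ${min} and ${max}`)
    }
  }
  checkRange('stability', params.stability, 0, 1)
  checkRange('similarityBoost', params.similarityBoost, 0, 1)
  checkRange('speed', params.speed, 0.5, 2)

  return problems
}
```

Returning a list instead of throwing on the first problem lets a tool or debug overlay report every issue at once.
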
## Async Job Recovery

Audio generation runs as an async job. If the client disconnects mid-generation, use `getCompletedJobs()` to drain results on reconnect. Pass a `clientRef` in the original request to correlate results with your game state.

```typescript
// On reconnect, check for any completed jobs
const completedJobs = await RundotGameAPI.audioGen.getCompletedJobs()

for (const job of completedJobs) {
  if (job.status === 'completed' && job.result) {
    console.log(`Job ${job.params.clientRef} ready:`, job.result.audioUrl)
  } else if (job.status === 'failed') {
    console.error(`Job ${job.params.clientRef} failed:`, job.error)
  }
}
```

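To correlate drained results with game state, one option is to group the events by `clientRef` first. This is a minimal sketch: `JobEvent` only mirrors the fields used in the loop above, and `groupRecoveredJobs` is a hypothetical helper, not an SDK function.

```typescript
// Minimal local shape based on the job event fields used above.
interface JobEvent {
  jobId: string
  status: 'completed' | 'failed'
  params: { clientRef?: string }
  result?: { audioUrl: string }
  error?: string
}

interface RecoveredJobs {
  ready: Map<string, string>   // clientRef -> audioUrl
  failed: Map<string, string>  // clientRef -> error message
  unmatched: JobEvent[]        // events sent without a clientRef
}

// Hypothetical helper: groups drained job events by clientRef.
function groupRecoveredJobs(events: JobEvent[]): RecoveredJobs {
  const recovered: RecoveredJobs = { ready: new Map(), failed: new Map(), unmatched: [] }

  for (const event of events) {
    const ref = event.params.clientRef
    if (ref === undefined) {
      recovered.unmatched.push(event)
    } else if (event.status === 'completed' && event.result) {
      recovered.ready.set(ref, event.result.audioUrl)
    } else {
      // Failed jobs, plus any completed job that arrived without a result.
      recovered.failed.set(ref, event.error ?? 'unknown error')
    }
  }
  return recovered
}
```

On reconnect, the array returned by `getCompletedJobs()` can be passed to this helper, and `recovered.ready.get('victory-music')` then yields the URL for that request if it finished.
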
## API Reference

| Method                        | Returns                       | Description                         |
| ----------------------------- | ----------------------------- | ----------------------------------- |
| `audioGen.generate(params)`   | `Promise<AudioGenResult>`     | Generate audio (SFX, music, or TTS) |
| `audioGen.getCompletedJobs()` | `Promise<AudioGenJobEvent[]>` | Drain completed async job results   |

### AudioGenResult

| Field          | Type                        | Description                       |
| -------------- | --------------------------- | --------------------------------- |
| `generationId` | `string`                    | Unique ID for the generated audio |
| `audioUrl`     | `string`                    | URL to the audio file             |
| `type`         | `'sfx' \| 'music' \| 'tts'` | Type of audio generated           |
| `durationSec`  | `number`                    | Actual duration of the audio      |
| `prompt`       | `string`                    | The prompt/description used       |

### AudioGenJobEvent

| Field    | Type                          | Description                 |
| -------- | ----------------------------- | --------------------------- |
| `jobId`  | `string`                      | Job identifier              |
| `status` | `'completed' \| 'failed'`     | Job outcome                 |
| `params` | `AudioGenParams`              | Original request parameters |
| `result` | `AudioGenResult \| undefined` | Result if completed         |
| `error`  | `string \| undefined`         | Error message if failed     |

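`AudioGenParams` is not spelled out on this page. Based on the three parameter tables, it is presumably a union discriminated by `type`; the sketch below reconstructs it under that assumption (the SDK's real definitions may differ) and shows how narrowing on `type` gives access to the right text field. `sourceTextOf` is a hypothetical helper.

```typescript
// Assumed shapes, reconstructed from the parameter tables on this page.
type SfxParams = { type: 'sfx'; description: string; durationSec?: number; clientRef?: string }
type MusicParams = { type: 'music'; prompt: string; durationSec: number; clientRef?: string }
// Optional voice settings (model, stability, ...) are omitted for brevity.
type TtsParams = { type: 'tts'; text: string; voiceId: string; clientRef?: string }

type AudioGenParams = SfxParams | MusicParams | TtsParams

// Returns the text that drove the generation, whichever field holds it.
function sourceTextOf(params: AudioGenParams): string {
  switch (params.type) {
    case 'sfx':
      return params.description
    case 'music':
      return params.prompt
    case 'tts':
      return params.text
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a new type is added.
      const unreachable: never = params
      throw new Error(`Unhandled audio type: ${JSON.stringify(unreachable)}`)
    }
  }
}
```
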
## Best Practices

* Be specific in SFX descriptions: include materials, intensity, and spatial context (e.g., "metal on metal" not just "hit sound").
* For music, specify genre, tempo, instruments, and mood together for best results.
* Pass a `clientRef` on every generation call so results recovered after a disconnect can be matched back to your game state.
* Call `getCompletedJobs()` on reconnect to retrieve results from generations that finished while the client was disconnected.
* Keep TTS `stability` low (0.2–0.4) for expressive NPC dialogue, high (0.7–0.9) for UI narration.

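The stability guidance above can be captured as presets so that individual call sites do not pick values ad hoc. This is a sketch only: the role names and `ttsRequestFor` are hypothetical, and the values are simply the midpoints of the suggested ranges.

```typescript
type SpeechRole = 'npcDialogue' | 'uiNarration'

// Midpoints of the suggested ranges: 0.2–0.4 for NPC dialogue, 0.7–0.9 for UI narration.
const STABILITY_BY_ROLE: Record<SpeechRole, number> = {
  npcDialogue: 0.3,
  uiNarration: 0.8,
}

// Hypothetical helper: builds a TTS request with a role-based stability
// and always attaches a clientRef for later correlation.
function ttsRequestFor(role: SpeechRole, text: string, voiceId: string, clientRef: string) {
  return {
    type: 'tts' as const,
    text,
    voiceId,
    stability: STABILITY_BY_ROLE[role],
    clientRef,
  }
}
```
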
## Limits

* Music `durationSec` must be **3–300 seconds**; SFX `durationSec` must be **0.5–30 seconds** when provided.
* Subject to per-creator rate-limit tiers — see [Rate Limits](/rundot-docs/v5.16.0/readme/rate_limits.md).

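When a call is rejected because of a rate limit, retrying with exponential backoff is a common pattern. How the SDK reports such a rejection is not described on this page, so the sketch below takes a caller-supplied `isRetryable` predicate; `withRetry`, `backoffDelayMs`, and the default delays are all assumptions, not SDK features.

```typescript
// Exponential backoff: baseMs, then 2x, 4x, ... capped at maxMs. Attempts start at 0.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 8000): number {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt))
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms))

// Runs task up to maxAttempts times, waiting between attempts.
// Errors that isRetryable rejects are rethrown immediately.
async function withRetry<T>(
  task: () => Promise<T>,
  isRetryable: (error: unknown) => boolean,
  maxAttempts = 4,
  sleep: (ms: number) => Promise<void> = defaultSleep,
): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await task()
    } catch (error) {
      const isLastAttempt = attempt >= maxAttempts - 1
      if (isLastAttempt || !isRetryable(error)) throw error
      await sleep(backoffDelayMs(attempt))
    }
  }
}
```

A generation call could then be wrapped as `withRetry(() => RundotGameAPI.audioGen.generate(params), isRateLimitError)`, where `isRateLimitError` is a predicate you write once you know the error shape. Keeping the predicate explicit avoids retrying requests that failed for reasons a retry cannot fix, such as invalid parameters.
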
