Instant voice cloning

Prerequisites

Create a Fish Audio account

Go to fish.audio/auth/signup
Fill in your details to create an account, then complete the verification steps.
Log in to your account and navigate to the API section

Get your API key

Once you have an account, you’ll need an API key to authenticate your requests.

Log in to your Fish Audio Dashboard
Navigate to the API Keys section
Click “Create New Key”, give it a descriptive name, and set an expiration if desired
Copy your key and store it securely

Keep your API key secret! Never commit it to version control or share it publicly.
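One way to keep the key out of source files is to read it from an environment variable at startup. The sketch below is illustrative only: the helper name load_api_key and the variable name FISH_API_KEY are assumptions made for this example, so substitute whatever your deployment uses.

```python
import os


def load_api_key(env_var: str = "FISH_API_KEY") -> str:
    """Return the API key stored in an environment variable.

    Both this helper and the default variable name are assumptions
    for this sketch, not part of the Fish Audio SDK.
    """
    # Treat a missing or whitespace-only value as "not configured".
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(
            f"Set the {env_var} environment variable to your Fish Audio API key."
        )
    return key
```

Failing early with a clear message avoids sending unauthenticated requests. The returned value can then be handed to the client in whatever way the SDK supports.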

Recipe

Pass a ReferenceAudio (raw audio bytes plus an exact transcript) on the convert call. Nothing is saved server-side; the clone applies to that request only.

from fishaudio import FishAudio
from fishaudio.types import ReferenceAudio
from fishaudio.utils import save

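# No key is passed here, so the client must find your API key in its environment.
client = FishAudio()

# Read the reference clip as raw bytes and send it along with the text to speak.
with open("reference.wav", "rb") as f:
    audio = client.tts.convert(
        text="This sentence is spoken in the cloned voice.",
        references=[ReferenceAudio(
            audio=f.read(),
            text="Exact transcript of what is said in reference.wav.",
        )],
    )

# Write the generated speech to disk.
save(audio, "cloned.mp3")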

Use 10–30 s of clean speech, and make the ReferenceAudio text match what is spoken in the audio exactly (including punctuation) for the best prosody.
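The length guideline can be checked locally before uploading a clip. This is a minimal sketch using only Python's standard wave module, so it handles uncompressed PCM WAV files only. The function names are hypothetical, and the 10 and 30 second bounds simply mirror the recommendation above.

```python
import wave


def wav_duration_seconds(path: str) -> float:
    """Return the duration of an uncompressed PCM WAV file in seconds."""
    with wave.open(path, "rb") as wav:
        # Duration = number of audio frames / frames per second.
        return wav.getnframes() / wav.getframerate()


def is_good_reference_length(
    path: str, min_s: float = 10.0, max_s: float = 30.0
) -> bool:
    """Check whether a clip falls inside the recommended duration range."""
    return min_s <= wav_duration_seconds(path) <= max_s
```

This only checks length. It says nothing about background noise or transcript accuracy, which still need a manual check.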

Reuse a voice across many requests

If you’ll use the voice repeatedly, create a persistent model once and pass its id as reference_id. See the Voice Cloning guide for details.

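# Continues from the Recipe snippet: client is the FishAudio instance created there.
with open("sample.wav", "rb") as f:
    voice = client.voices.create(title="My Voice", voices=[f.read()])

# Later requests only need the saved model's id, not the audio itself.
audio = client.tts.convert(text="Reusing my saved voice.", reference_id=voice.id)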
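To create the model only once across separate runs of a script, the returned id can be cached locally. The following is a rough sketch, not part of the SDK: get_or_create_voice_id, the JSON cache file, and the create_voice callback are all hypothetical names chosen for illustration.

```python
import json
from pathlib import Path
from typing import Callable


def get_or_create_voice_id(cache_path: str, create_voice: Callable[[], str]) -> str:
    """Return a cached voice id, creating and caching one if none exists.

    create_voice is a caller-supplied function that creates the persistent
    voice model and returns its id as a string.
    """
    cache = Path(cache_path)
    if cache.exists():
        # A previous run already created the voice; reuse its id.
        return json.loads(cache.read_text())["voice_id"]
    voice_id = create_voice()
    cache.write_text(json.dumps({"voice_id": voice_id}))
    return voice_id
```

Here create_voice could wrap the client.voices.create call shown above and return voice.id. The result is then passed as reference_id on each convert call.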
