Batch-transcribe files with a language hint

Prerequisites

Create a Fish Audio account

Go to fish.audio/auth/signup
Fill in your details to create an account, then complete the verification steps.
Log in to your account and navigate to the API section

Get your API key

Once you have an account, you’ll need an API key to authenticate your requests.

Log in to your Fish Audio Dashboard
Navigate to the API Keys section
Click “Create New Key”, give it a descriptive name, and set an expiration if desired
Copy your key and store it securely

Keep your API key secret! Never commit it to version control or share it publicly.
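One common way to keep the key out of source files is to read it from an environment variable at startup. The sketch below assumes the key is exported under the name FISH_API_KEY; that name and the load_api_key helper are illustrative assumptions, not something this page specifies, so check the SDK documentation for the variable it actually reads.

```python
import os


def load_api_key(env_var="FISH_API_KEY"):
    """Return the API key from the environment, failing loudly if it is unset.

    The default variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    if not key:
        raise RuntimeError(f"Set {env_var} before running the batch")
    return key
```

Failing early with a clear message avoids discovering a missing key halfway through a batch.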

Recipe

Read each file’s bytes from disk and pass them to asr.transcribe() with an explicit language. A language hint is more reliable than auto-detection when you already know the source language, especially for phonetically similar languages. Collect one result row per file as you go.

from fishaudio import FishAudio

client = FishAudio()

paths = ["speech.wav"]  # add more file paths here
language = "en"

results = []
for path in paths:
    with open(path, "rb") as f:
        audio = f.read()

    transcript = client.asr.transcribe(audio=audio, language=language)
    results.append({
        "file": path,
        "text": transcript.text,
        "duration": transcript.duration,  # seconds
    })

for row in results:
    print(f"{row['file']} ({row['duration']:.1f}s): {row['text']}")

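Each call returns an ASRResponse with .text, a .duration in seconds, and per-phrase .segments. Each file is transcribed in its own call, so results do not depend on one another. As written, though, an exception raised for one file stops the loop; catch errors per file if one bad file must not block the rest of the batch.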
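One way to keep a failing file from stopping the batch is to catch errors per file and record them next to the successful rows. The sketch below is a minimal illustration: transcribe_batch and its transcribe_one parameter are hypothetical names, and the broad except Exception is used only because this page does not list the SDK's exception classes.

```python
def transcribe_batch(paths, transcribe_one):
    """Call transcribe_one(path) for each file and collect results and failures.

    transcribe_one is a hypothetical callable that returns an object with
    .text and .duration, such as a thin wrapper around asr.transcribe().
    """
    results, failures = [], []
    for path in paths:
        try:
            transcript = transcribe_one(path)
        except Exception as exc:  # broad on purpose; narrow it once the SDK's errors are known
            failures.append({"file": path, "error": str(exc)})
            continue
        results.append({
            "file": path,
            "text": transcript.text,
            "duration": transcript.duration,  # seconds
        })
    return results, failures
```

In the recipe above, transcribe_one would open the file in binary mode and return client.asr.transcribe(audio=f.read(), language=language). The failures list can then be retried or logged after the loop finishes.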

Auto-detection (omit language) works well, but passing an explicit language improves accuracy for similar-sounding languages. Use one language per batch: if your files span several languages, split them into separate lists, one per language.
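Splitting a mixed set of files into one list per language can be sketched as follows. The file names, the language codes, and the group_by_language helper are hypothetical and exist only to illustrate the grouping step.

```python
from collections import defaultdict

# Hypothetical mapping of file path to its known source language.
file_languages = {
    "interview_en.wav": "en",
    "podcast_ja.wav": "ja",
    "memo_en.wav": "en",
}


def group_by_language(file_languages):
    """Return {language: [paths]} so each list can be run as its own batch."""
    batches = defaultdict(list)
    for path, lang in file_languages.items():
        batches[lang].append(path)
    return dict(batches)


batches = group_by_language(file_languages)
for lang, paths in batches.items():
    print(lang, paths)
```

Each list in batches can then be passed through the recipe's loop with language set to the matching key.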
