Pronunciation API Documentation

Integrate pronunciation scoring into your application with our REST API.

Authentication

Authenticate your requests using an API key in the Authorization header:

Authorization: Bearer lx_your_api_key_here

⚠️ Keep your API key secure: Never expose your API key in client-side code or public repositories.
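One way to keep the key out of source code is to read it from the environment when building the header. The following is a minimal Python sketch, assuming the key is stored in a variable named LINGOLIX_API_KEY; the helper name auth_headers is hypothetical and not part of the API.

```python
import os


def auth_headers(env_var: str = "LINGOLIX_API_KEY") -> dict:
    """Build the Authorization header from an environment variable.

    The variable name is an assumption; use whatever name your deployment sets.
    """
    api_key = os.environ.get(env_var)
    if not api_key:
        # Fail early instead of sending a request that would return 401.
        raise RuntimeError(f"Set {env_var} before calling the API")
    return {"Authorization": f"Bearer {api_key}"}
```

Failing early on a missing key avoids a round trip that would only return a 401.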

Endpoint Details

🔬 Syllable-Level Precision

The V3 API delivers the deepest pronunciation analysis available: every word is broken down to individual syllables with IPA transcription, pitch class, timing, and per-phoneme accuracy. It supports the same 16 languages as V2.

Note: V3 is experimental. If you encounter an error, please tell us.

POST /api/pronunciation/v3/check

Syllable-level pronunciation analysis with 16-language support

✨ Operating Modes:

Known text mode: Provide sentence parameter → Scores pronunciation syllable-by-syllable against expected text
Just Talk mode: Omit sentence parameter → Whisper AI transcribes your speech first, then scores pronunciation

Request Parameters

Parameter	Type	Required	Description
speechdata	file	Yes	Audio file (WAV, MP3, M4A, OGG, FLAC, WEBM)
sentence	string	No	Expected text to compare against. If omitted, Whisper transcribes speech first ("Just Talk" mode).
language_code	string	No	Language code (default: ja). Same 16 languages as V2.
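The parameter table can be mirrored in a small client-side helper that prepares the text form fields before upload. This is only a sketch: the function name form_fields is hypothetical, and the extension check is a local convenience that assumes file extensions match the listed formats (the server may inspect the audio content itself).

```python
from pathlib import Path

# Extensions derived from the formats listed for the speechdata parameter.
SUPPORTED_EXTENSIONS = {".wav", ".mp3", ".m4a", ".ogg", ".flac", ".webm"}


def form_fields(audio_path, sentence=None, language_code="ja"):
    """Check the audio file extension and return the text form fields."""
    suffix = Path(audio_path).suffix.lower()
    if suffix not in SUPPORTED_EXTENSIONS:
        raise ValueError(f"Unsupported audio extension: {suffix or '(none)'}")
    fields = {"language_code": language_code}  # "ja" is the documented default
    if sentence is not None:
        # Known text mode; leaving the field out selects Just Talk mode.
        fields["sentence"] = sentence
    return fields
```

For example, form_fields("audio.wav", language_code="de") yields only the language_code field, which corresponds to Just Talk mode.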

V3 Code Example

Set the API key as an environment variable:

export LINGOLIX_API_KEY=your_api_key_here

# Known text mode
curl -X POST "https://api.lingolix.com/api/pronunciation/v3/check-pronunciation" \
  -H "Authorization: Bearer $LINGOLIX_API_KEY" \
  -F "speechdata=@audio.wav" \
  -F "sentence=Guten Morgen" \
  -F "language_code=de"

# Just Talk mode (sentence omitted)
curl -X POST "https://api.lingolix.com/api/pronunciation/v3/check-pronunciation" \
  -H "Authorization: Bearer $LINGOLIX_API_KEY" \
  -F "speechdata=@audio.wav" \
  -F "language_code=de"
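For callers who prefer Python, the same multipart upload can be sketched with the standard library alone. The helper names encode_multipart and build_request are hypothetical, the endpoint URL is passed in rather than hard-coded, and the file part is sent as application/octet-stream on the assumption that the server does not require a specific audio MIME type.

```python
import os
import uuid
import urllib.request


def encode_multipart(fields, file_field, filename, file_bytes):
    """Encode text fields and one file as a multipart/form-data body."""
    boundary = uuid.uuid4().hex
    chunks = []
    for name, value in fields.items():
        chunks.append((
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
            f"{value}\r\n"
        ).encode("utf-8"))
    chunks.append((
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="{file_field}"; filename="{filename}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode("utf-8"))
    chunks.append(file_bytes + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("utf-8"))
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


def build_request(url, api_key, audio_path, fields):
    """Build (but do not send) a POST request carrying the audio file."""
    with open(audio_path, "rb") as audio_file:
        audio = audio_file.read()
    body, content_type = encode_multipart(
        fields, "speechdata", os.path.basename(audio_path), audio
    )
    headers = {"Authorization": f"Bearer {api_key}", "Content-Type": content_type}
    return urllib.request.Request(url, data=body, headers=headers, method="POST")


# Usage (performs a network call, so it is left commented out):
# req = build_request(url, os.environ["LINGOLIX_API_KEY"], "audio.wav",
#                     {"language_code": "de", "sentence": "Guten Morgen"})
# with urllib.request.urlopen(req) as resp:
#     print(resp.read().decode("utf-8"))
```

Dropping the sentence key from the fields dictionary gives the Just Talk variant of the request.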

Response Format

V3 returns a richer structure than V1/V2. Each word contains a syllables array with per-syllable IPA, pitch, timing, and completeness. speaking_rate is measured in subwords (syllables) per second.

{
  "text": "Guten Morgen",
  "speaking_rate": 12.5,
  "accuracy": 0.7777777910232544,
  "completeness": 1,
  "words": [
    {
      "text": "Guten",
      "syllables": [
        {
          "text": "Gu",
          "expected_ipa": "ɡuː",
          "detected_ipa": "ɡu",
          "accuracy": 1,
          "completeness": 1,
          "pitch": "low",
          "duration_ms": 60,
          "start_ms": 20,
          "end_ms": 40,
          "is_missing": false,
          "is_extra": false
        },
        {
          "text": "ten",
          "expected_ipa": "tn̩",
          "detected_ipa": "tn",
          "accuracy": 1,
          "completeness": 1,
          "pitch": "high",
          "duration_ms": 100,
          "start_ms": 70,
          "end_ms": 110,
          "is_missing": false,
          "is_extra": false
        }
      ],
      "accuracy": 1,
      "completeness": 1,
      "start_ms": 20,
      "end_ms": 110,
      "char_start": 0,
      "char_end": 5
    },
    {
      "text": "Morgen",
      "syllables": [
        {
          "text": "Mor",
          "expected_ipa": "mɔʁ",
          "detected_ipa": "ma",
          "accuracy": 0.3333333134651184,
          "completeness": 0.6666666666666667,
          "pitch": "high",
          "duration_ms": 60,
          "start_ms": 160,
          "end_ms": 180,
          "is_missing": false,
          "is_extra": false
        },
        {
          "text": "gen",
          "expected_ipa": "ɡn̩",
          "detected_ipa": "ɡn",
          "accuracy": 1,
          "completeness": 1,
          "pitch": "flat",
          "duration_ms": 100,
          "start_ms": 220,
          "end_ms": 260,
          "is_missing": false,
          "is_extra": false
        }
      ],
      "accuracy": 0.6000000238418579,
      "completeness": 0.5,
      "start_ms": 160,
      "end_ms": 260,
      "char_start": 6,
      "char_end": 12
    }
  ]
}
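A response shaped like the one above can be walked word by word to find the syllables that need practice. The sketch below assumes the JSON has already been parsed into a Python dictionary; the function name weak_syllables and the 0.5 threshold are illustrative choices, not values defined by the API.

```python
def weak_syllables(response: dict, threshold: float = 0.5) -> list:
    """Return (word, syllable, expected_ipa, detected_ipa) tuples
    for every syllable whose accuracy is below the threshold."""
    flagged = []
    for word in response.get("words", []):
        for syllable in word.get("syllables", []):
            if syllable["accuracy"] < threshold:
                flagged.append((
                    word["text"],
                    syllable["text"],
                    syllable["expected_ipa"],
                    syllable["detected_ipa"],
                ))
    return flagged
```

Applied to the example response, this would flag only the syllable "Mor" in "Morgen", where "mɔʁ" was expected and "ma" was detected.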

Word Fields

Field	Description
text	Word text
accuracy	Mean accuracy across the word's syllables (0.0 – 1.0)
completeness	Fraction of the word's syllables detected
start_ms	Start offset in milliseconds
end_ms	End offset in milliseconds
char_start	Character index where this word starts in the response's text field. -1 for Japanese, and for extra words the user said that were not in the expected text
char_end	Character index (exclusive) where this word ends in the response's text field. -1 under the same conditions as char_start
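The char_start and char_end offsets make it possible to mark words directly in the returned text. The following sketch brackets low-scoring words and skips any word whose offsets are -1. It assumes the offsets count characters the same way Python string indices do, which the documentation does not state; the function name mark_weak_words and the 0.7 threshold are hypothetical.

```python
def mark_weak_words(response: dict, threshold: float = 0.7) -> str:
    """Wrap words scoring below the threshold in square brackets."""
    text = response["text"]
    spans = [
        (word["char_start"], word["char_end"])
        for word in response["words"]
        # Offsets of -1 mean the word cannot be located in the text.
        if word["char_start"] >= 0 and word["accuracy"] < threshold
    ]
    # Work from right to left so earlier offsets stay valid after insertion.
    for start, end in sorted(spans, reverse=True):
        text = text[:start] + "[" + text[start:end] + "]" + text[end:]
    return text
```

Because char_end is exclusive, the slice text[start:end] covers exactly the word.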

Syllable Fields

Field	Description
text	Syllable text (e.g. "ni")
expected_ipa	Expected IPA transcription
detected_ipa	Detected IPA from audio
accuracy	Phoneme-level accuracy (0.0 – 1.0)
completeness	Fraction of expected phonemes detected
pitch	Pitch class: high | low | flat | unknown
duration_ms	Syllable duration in milliseconds
start_ms	Start offset in milliseconds
end_ms	End offset in milliseconds
is_missing	true if the user skipped this syllable
is_extra	true if the user said a syllable that is not in the expected text
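The is_missing and is_extra flags can be collected into a short summary of what the speaker skipped or added. This is a sketch over an already-parsed response; the function name syllable_issues and the shape of the returned dictionary are illustrative only.

```python
def syllable_issues(response: dict) -> dict:
    """Collect the text of syllables flagged as missing or extra."""
    issues = {"missing": [], "extra": []}
    for word in response.get("words", []):
        for syllable in word.get("syllables", []):
            if syllable.get("is_missing"):
                issues["missing"].append(syllable["text"])
            if syllable.get("is_extra"):
                issues["extra"].append(syllable["text"])
    return issues
```

For the example response shown earlier, both lists would be empty because every syllable has both flags set to false.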

Rate Limits & Quotas

Quota System

Your monthly quota is measured in audio minutes and is shared across all your API keys. Check your current usage in the Dashboard.

Free Tier

15 minutes per month
Hard limit: requests are blocked once the quota is exhausted
Returns a 429 error when the quota is exceeded

Paid Tiers

Higher monthly quotas (60 to 2,000 minutes, depending on plan)
Overage allowed with per-minute billing
Overage rates: €0.02 to €0.05 per minute, depending on plan

💡 Tip: Monitor your usage regularly to avoid unexpected charges. Upgrade your plan on the Subscription page.
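To get a rough idea of overage charges on a paid tier, the arithmetic can be sketched as follows. The quota and rate passed in are placeholders within the ranges listed above, not the terms of any specific plan, and the sketch assumes fractional minutes are billed proportionally, which the documentation does not confirm.

```python
def overage_cost(minutes_used: float, quota_minutes: float,
                 rate_per_minute: float) -> float:
    """Estimate the overage charge in euros for one billing period."""
    overage_minutes = max(0.0, minutes_used - quota_minutes)
    return round(overage_minutes * rate_per_minute, 2)


# Hypothetical plan: 60 included minutes, €0.05 per extra minute.
# 75 minutes used means 15 overage minutes.
estimate = overage_cost(75, 60, 0.05)
```

Usage at or below the quota yields an estimate of zero.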

Error Handling

Status Code	Error	Solution
401	Invalid or missing API key	Check your Authorization header
400	Invalid audio format	Use WAV, MP3, M4A, OGG, FLAC, or WEBM format
429	Quota exceeded	Upgrade plan or wait for next billing period
503	Service unavailable	Retry after a short delay
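The table above suggests retrying only on 503 and treating the other codes as errors the caller must fix. A minimal sketch of that policy follows. It assumes a successful call returns a 2xx status, and it takes a hypothetical send callable returning a (status_code, body) pair so that it stays independent of any HTTP library. The exponential backoff schedule is an illustrative choice, since the documentation only says "a short delay".

```python
import time

# Hints taken from the status-code table.
ERROR_HINTS = {
    400: "Invalid audio format: use WAV, MP3, M4A, OGG, FLAC, or WEBM",
    401: "Invalid or missing API key: check the Authorization header",
    429: "Quota exceeded: upgrade the plan or wait for the next billing period",
    503: "Service unavailable: retry after a short delay",
}


class ApiError(Exception):
    """Raised when the API returns a non-success status."""

    def __init__(self, status: int):
        super().__init__(f"{status}: {ERROR_HINTS.get(status, 'Unexpected error')}")
        self.status = status


def call_with_retry(send, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Call send() and retry on 503 with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        status, body = send()
        if 200 <= status < 300:
            return body
        if status == 503 and attempt < max_attempts:
            sleep(base_delay * 2 ** (attempt - 1))  # 1s, 2s, 4s, ...
            continue
        # 400, 401, and 429 will not succeed on retry, so fail immediately.
        raise ApiError(status)
```

The sleep parameter is injectable so the backoff can be exercised in tests without real delays.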

Need Help?

Have questions or need support integrating the API? We're here to help!

Contact Support