Text to Speech

Convert text to natural-sounding speech with Fish Audio, MiniMax, Qwen, and more

Speech to Text

Transcribe uploaded audio with high accuracy

Generate images from prompts with leading models

Create videos from text and style

Lip sync & digital humans

Sync audio and video for avatars and presentations

Voice workspace

Manage projects in the speech synthesis workspace

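Short videos & dubbing

Quick narration for social media, ads, and UGC

Audiobooks & podcasts

Long-form narration with natural pacing

Education & training

Clear narration for courses and internal communications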

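Model library

Compare TTS providers, features, and specs side by side

Voice cloning guide

From sample collection to training and best practices

API playground

Try the REST API online with your API key

Create and manage tokens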


API documentation & playground

Choose an API below for endpoint details, parameters, and live testing with your API key.

Text to Speech (HTTP)
REST synthesis with your voice model ID and engine options.
Text to Speech (HTTP v2)
Synthesize speech with a voice ID and optional engine settings.
TTS WebSocket
Streaming speech over WebSocket for realtime use cases.
TTS WebSocket v2
Updated WebSocket protocol for TTS.
Speech to Text
Transcribe audio from a public URL.
Voice clone — create model
Upload reference audio to create a voice model.
Voice clone — delete model
Remove a voice model by ID.
Voice clone — list models
List public and personal voice models.
Lip sync — create task
Create a lip-sync video generation task.
Lip sync — query task
Poll task status and results by ID.
Lip sync — list tasks
List lip-sync tasks and statistics.
User profile (API)
Remaining API quota and basic account info.
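
The lip-sync entries above describe a create-then-poll workflow: one call creates a task, and a second call is polled by task ID until results are ready. The following is a minimal sketch of the polling half only, under stated assumptions. This page does not give endpoint URLs, response fields, or status values, so `fetch_status`, the `"status"` key, and the `"done"` / `"failed"` values are all hypothetical placeholders; substitute whatever the endpoint detail pages actually specify.

```python
import time
from typing import Any, Callable, Dict

# Hypothetical terminal states; the real API's status values are not
# documented on this page.
TERMINAL_STATES = ("done", "failed")


def poll_task(
    fetch_status: Callable[[str], Dict[str, Any]],
    task_id: str,
    interval: float = 2.0,
    max_attempts: int = 30,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch_status(task_id) until the task reaches a terminal state.

    fetch_status is supplied by the caller and is expected to wrap the
    "query task" HTTP request, returning the decoded JSON body as a dict.
    Raises TimeoutError if no terminal state is seen within max_attempts.
    """
    for attempt in range(max_attempts):
        task = fetch_status(task_id)
        if task.get("status") in TERMINAL_STATES:
            return task
        # Wait between attempts, but not after the final one.
        if attempt < max_attempts - 1:
            sleep(interval)
    raise TimeoutError(
        f"task {task_id} did not finish after {max_attempts} attempts"
    )
```

Passing the request function in as an argument keeps the loop independent of any particular HTTP client and lets it be exercised without network access. A fixed interval is used here for simplicity; a backoff schedule may be preferable if the service rate-limits status queries.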