AgentSkillsCN

azure-ai-transcription-py

Azure AI Transcription SDK for Python。适用于实时与批量语音转文本的转录任务,并支持时间戳与说话人分离功能。 触发词:“transcription”、“speech to text”、“Azure AI Transcription”、“TranscriptionClient”。

SKILL.md
--- frontmatter
name: azure-ai-transcription-py
description: |
  Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization.
  Triggers: "transcription", "speech to text", "Azure AI Transcription", "TranscriptionClient".
package: azure-ai-transcription

Azure AI Transcription SDK for Python

Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.

Installation

bash
pip install azure-ai-transcription

Environment Variables

bash
TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>

Authentication

Use subscription key authentication (DefaultAzureCredential is not supported for this client):

python
import os
from azure.ai.transcription import TranscriptionClient

client = TranscriptionClient(
    endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
    credential=os.environ["TRANSCRIPTION_KEY"]
)

Transcription (Batch)

python
job = client.begin_transcription(
    name="meeting-transcription",
    locale="en-US",
    content_urls=["https://<storage>/audio.wav"],
    diarization_enabled=True
)
result = job.result()
print(result.status)

Transcription (Real-time)

python
stream = client.begin_stream_transcription(locale="en-US")
stream.send_audio_file("audio.wav")
for event in stream:
    print(event.text)

Best Practices

  1. Enable diarization when multiple speakers are present
  2. Use batch transcription for long files stored in blob storage
  3. Capture timestamps for subtitle generation
  4. Specify language to improve recognition accuracy
  5. Handle streaming backpressure for real-time transcription
  6. Close transcription sessions when complete