azure-ai-transcription-py

Use the Azure AI Transcription SDK for Python to implement real-time and batch speech-to-text with timestamps and speaker diarization. Triggers include: "transcription", "speech to text", "Azure AI Transcription".

SKILL.md
--- frontmatter
name: azure-ai-transcription-py
description: Azure AI Transcription SDK for Python. Use for real-time and batch speech-to-text transcription with timestamps and diarization. Triggers: "transcription", "speech to text", "Azure AI Transcription".
category: AI & Agents
source: antigravity
tags: [python, ai, azure, rag]
url: https://github.com/sickn33/antigravity-awesome-skills/tree/main/skills/azure-ai-transcription-py

Azure AI Transcription SDK for Python

Client library for Azure AI Transcription (speech-to-text) with real-time and batch transcription.

Installation

bash
pip install azure-ai-transcription

Environment Variables

bash
TRANSCRIPTION_ENDPOINT=https://<resource>.cognitiveservices.azure.com
TRANSCRIPTION_KEY=<your-key>
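
Before constructing a client, it can help to fail fast when either variable is missing. The following is a minimal sketch using only the standard library. The helper name load_transcription_settings is hypothetical and is not part of the SDK.

```python
import os

REQUIRED_VARS = ("TRANSCRIPTION_ENDPOINT", "TRANSCRIPTION_KEY")


def load_transcription_settings(env=None):
    """Return (endpoint, key), raising if either variable is unset or empty.

    Hypothetical helper for illustration; not part of azure-ai-transcription.
    """
    env = os.environ if env is None else env
    missing = [name for name in REQUIRED_VARS if not env.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    # Drop any trailing slash so the endpoint has one canonical form.
    endpoint = env["TRANSCRIPTION_ENDPOINT"].rstrip("/")
    return endpoint, env["TRANSCRIPTION_KEY"]
```

Calling load_transcription_settings() with no arguments reads from os.environ; passing a dict makes the check easy to exercise in tests.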

Authentication

Use subscription key authentication (DefaultAzureCredential is not supported for this client):

python
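import os

from azure.core.credentials import AzureKeyCredential
from azure.ai.transcription import TranscriptionClient

# Wrap the subscription key in AzureKeyCredential rather than passing a raw string.
client = TranscriptionClient(
    endpoint=os.environ["TRANSCRIPTION_ENDPOINT"],
    credential=AzureKeyCredential(os.environ["TRANSCRIPTION_KEY"]),
)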

Transcription (Batch)

python
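# Submit a batch job for audio stored at a URL; diarization labels each speaker.
job = client.begin_transcription(
    name="meeting-transcription",
    locale="en-US",
    content_urls=["https://<storage>/audio.wav"],
    diarization_enabled=True
)
# Block until the job finishes, then inspect its final status.
result = job.result()
print(result.status)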

Transcription (Real-time)

python
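# Open a streaming session and feed it audio from a local file.
stream = client.begin_stream_transcription(locale="en-US")
stream.send_audio_file("audio.wav")
# Iterate over recognition events as they arrive.
for event in stream:
    print(event.text)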
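
When audio is produced faster than it can be uploaded, a bounded queue is one way to apply backpressure. The sketch below is independent of the SDK. send_chunk is a hypothetical callable standing in for whatever call actually uploads audio, and stream_with_backpressure and max_pending are illustrative names.

```python
import queue
import threading


def stream_with_backpressure(chunks, send_chunk, max_pending=8):
    """Feed audio chunks to send_chunk through a bounded queue.

    send_chunk is a hypothetical callable standing in for the upload call.
    The producer blocks once max_pending chunks are waiting, so memory use
    stays bounded even when the source is faster than the network.
    """
    pending = queue.Queue(maxsize=max_pending)
    done = object()  # sentinel marking the end of the audio
    errors = []

    def consumer():
        failed = False
        while True:
            item = pending.get()
            if item is done:
                return
            if failed:
                continue  # keep draining so the producer never blocks forever
            try:
                send_chunk(item)
            except Exception as exc:
                errors.append(exc)
                failed = True

    worker = threading.Thread(target=consumer, daemon=True)
    worker.start()
    for chunk in chunks:
        pending.put(chunk)  # blocks while the queue is full
    pending.put(done)
    worker.join()
    if errors:
        raise errors[0]
```

The chunks argument can be any iterable, for example a generator that reads a WAV file in fixed-size blocks. The first upload error is re-raised in the calling thread after the queue has drained.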

Best Practices

  1. Enable diarization when multiple speakers are present
  2. Use batch transcription for long files stored in blob storage
  3. Capture timestamps for subtitle generation
  4. Specify the locale (for example, en-US) to improve recognition accuracy
  5. Handle streaming backpressure for real-time transcription
  6. Close transcription sessions when complete
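
As an illustration of practices 1 and 3, captured timestamps and speaker labels can be turned into SubRip (SRT) subtitles with a few lines of plain Python. This is a sketch under an assumed input shape: the (start_ms, end_ms, speaker, text) tuple layout and the function names are hypothetical, so the fields of the actual transcription result would need to be mapped onto them.

```python
def format_srt_time(ms):
    """Convert milliseconds to an SRT timestamp (HH:MM:SS,mmm)."""
    hours, rem = divmod(int(ms), 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    seconds, millis = divmod(rem, 1_000)
    return f"{hours:02d}:{minutes:02d}:{seconds:02d},{millis:03d}"


def to_srt(segments):
    """Build SRT text from (start_ms, end_ms, speaker, text) tuples.

    The tuple layout is an assumption for illustration; map the fields of
    the actual transcription result onto it.
    """
    blocks = []
    for index, (start_ms, end_ms, speaker, text) in enumerate(segments, start=1):
        # Prefix the line with the speaker label when diarization provided one.
        label = f"[{speaker}] " if speaker else ""
        blocks.append(
            f"{index}\n"
            f"{format_srt_time(start_ms)} --> {format_srt_time(end_ms)}\n"
            f"{label}{text}\n"
        )
    # A blank line separates consecutive subtitle entries.
    return "\n".join(blocks)
```

Passing None as the speaker omits the label, which keeps the helper usable when diarization is disabled.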