Azure AI Content Understanding SDK for Python
Multimodal AI service that extracts semantic content from documents, video, audio, and image files for RAG and automated workflows.
Installation
bash
pip install azure-ai-contentunderstanding
Environment Variables
bash
CONTENTUNDERSTANDING_ENDPOINT=https://<resource>.cognitiveservices.azure.com/
Authentication
python
import os

from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.identity import DefaultAzureCredential

endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
credential = DefaultAzureCredential()
client = ContentUnderstandingClient(endpoint=endpoint, credential=credential)
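If the environment variable is unset, os.environ[...] raises a bare KeyError. A minimal sketch of a friendlier lookup follows. The helper read_endpoint is hypothetical and not part of the SDK, and the https check is an assumption about how endpoints are normally configured. It accepts the environment as a mapping so it can be exercised without touching the real process environment.

```python
import os
from typing import Mapping, Optional
from urllib.parse import urlparse

ENDPOINT_VAR = "CONTENTUNDERSTANDING_ENDPOINT"


def read_endpoint(env: Optional[Mapping[str, str]] = None) -> str:
    """Return the configured endpoint, or raise a descriptive error.

    Hypothetical helper: not provided by azure-ai-contentunderstanding.
    """
    env = os.environ if env is None else env
    value = env.get(ENDPOINT_VAR, "").strip()
    if not value:
        raise RuntimeError(f"Set {ENDPOINT_VAR} to your resource endpoint URL")
    parsed = urlparse(value)
    # Assumption: a usable endpoint is an https URL with a host name.
    if parsed.scheme != "https" or not parsed.netloc:
        raise RuntimeError(f"{ENDPOINT_VAR} must be an https URL, got {value!r}")
    return value
```

The returned string can then be passed as the endpoint argument when constructing the client.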
Core Workflow
Content Understanding operations are asynchronous long-running operations:
- Begin Analysis: start the analysis operation with begin_analyze(), which returns a poller.
- Poll for Results: poll until the analysis completes; the SDK handles this inside .result().
- Process Results: extract structured results from AnalyzeResult.contents.
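The begin, poll, and process steps can be sketched without calling the service. FakePoller below is a stand-in invented for illustration; it mimics only the done() and result() methods that Azure SDK pollers expose. The helper wait_with_progress is likewise hypothetical.

```python
import time
from typing import Any, Callable


class FakePoller:
    """Stand-in poller (illustrative only): finishes after a set number of checks."""

    def __init__(self, checks_needed: int, value: Any) -> None:
        self._remaining = checks_needed
        self._value = value

    def done(self) -> bool:
        # Each call counts as one status check against the service.
        if self._remaining > 0:
            self._remaining -= 1
        return self._remaining == 0

    def result(self) -> Any:
        return self._value


def wait_with_progress(
    poller: Any,
    interval: float = 0.0,
    on_tick: Callable[[int], None] = lambda n: None,
) -> Any:
    """Check done() until the operation finishes, then return result()."""
    ticks = 0
    while not poller.done():
        ticks += 1
        on_tick(ticks)        # report progress to the caller
        time.sleep(interval)  # wait before the next status check
    return poller.result()


ticks_seen = []
outcome = wait_with_progress(FakePoller(3, {"contents": []}), on_tick=ticks_seen.append)
```

In most scripts, calling poller.result() directly is enough because it blocks until the operation completes. An explicit loop like this is only worth it when you want progress reporting.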
Prebuilt Analyzers
| Analyzer | Content Type | Purpose |
|---|---|---|
| prebuilt-documentSearch | Documents | Extract markdown for RAG applications |
| prebuilt-imageSearch | Images | Extract content from images |
| prebuilt-audioSearch | Audio | Transcribe audio with timing |
| prebuilt-videoSearch | Video | Extract frames, transcripts, summaries |
| prebuilt-invoice | Documents | Extract invoice fields |
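When inputs arrive as mixed URLs, the table above can be turned into a small lookup. This is a sketch only: pick_analyzer is a hypothetical helper, and the extension list is an illustrative assumption rather than the service's list of supported formats.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumption: these extensions are examples, not an exhaustive or official list.
ANALYZER_BY_SUFFIX = {
    ".pdf": "prebuilt-documentSearch",
    ".docx": "prebuilt-documentSearch",
    ".jpg": "prebuilt-imageSearch",
    ".png": "prebuilt-imageSearch",
    ".mp3": "prebuilt-audioSearch",
    ".wav": "prebuilt-audioSearch",
    ".mp4": "prebuilt-videoSearch",
}


def pick_analyzer(url: str) -> str:
    """Map a file URL to a prebuilt analyzer ID based on its extension."""
    # Parse the URL first so query strings (for example SAS tokens) are ignored.
    suffix = PurePosixPath(urlparse(url).path).suffix.lower()
    try:
        return ANALYZER_BY_SUFFIX[suffix]
    except KeyError:
        raise ValueError(f"No prebuilt analyzer mapped for suffix {suffix!r}") from None
```

The returned ID can be passed as analyzer_id to begin_analyze(). Domain-specific analyzers such as prebuilt-invoice still need to be chosen explicitly, since an extension alone cannot tell an invoice from any other PDF.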
Analyze Document
python
import os
from azure.ai.contentunderstanding import ContentUnderstandingClient
from azure.ai.contentunderstanding.models import AnalyzeInput
from azure.identity import DefaultAzureCredential
endpoint = os.environ["CONTENTUNDERSTANDING_ENDPOINT"]
client = ContentUnderstandingClient(
    endpoint=endpoint,
    credential=DefaultAzureCredential()
)
# Analyze document from URL
poller = client.begin_analyze(
    analyzer_id="prebuilt-documentSearch",
    inputs=[AnalyzeInput(url="https://example.com/document.pdf")]
)
result = poller.result()
# Access markdown content (contents is a list)
content = result.contents[0]
print(content.markdown)
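Because contents is a list, a RAG pipeline will usually want the markdown from every item rather than only the first. The sketch below assumes each item exposes a markdown attribute as in the example above. The helper collect_markdown is hypothetical, and SimpleNamespace objects stand in for real content items.

```python
from types import SimpleNamespace
from typing import Iterable, List


def collect_markdown(contents: Iterable[object]) -> List[str]:
    """Return the non-empty markdown string of each content item, in order."""
    chunks = []
    for item in contents:
        # Items without markdown (missing attribute, None, or "") are skipped.
        markdown = getattr(item, "markdown", None)
        if markdown:
            chunks.append(markdown)
    return chunks


# Stand-in objects shaped like result.contents, for illustration only.
sample = [
    SimpleNamespace(markdown="# Title"),
    SimpleNamespace(markdown=None),
    SimpleNamespace(markdown="Body"),
]
joined = "\n\n".join(collect_markdown(sample))
```

With a real result, the call would be collect_markdown(result.contents).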
Access Document Content Details
python
from azure.ai.contentunderstanding.models import MediaContentKind, DocumentContent
content = result.contents[0]
if content.kind == MediaContentKind.DOCUMENT:
    # Narrow the type so document-specific attributes are available
    document_content: DocumentContent = content  # type: ignore
    print(document_content.start_page_number)
Analyze Image
python
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
    analyzer_id="prebuilt-imageSearch",
    inputs=[AnalyzeInput(url="https://example.com/image.jpg")]
)
result = poller.result()
content = result.contents[0]
print(content.markdown)
Analyze Video
python
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
    analyzer_id="prebuilt-videoSearch",
    inputs=[AnalyzeInput(url="https://example.com/video.mp4")]
)
result = poller.result()
# Access video content (AudioVisualContent)
content = result.contents[0]
# Get transcript phrases with timing
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time} - {phrase.end_time}]: {phrase.text}")
# Get key frames (for video)
for frame in content.key_frames:
    print(f"Frame at {frame.time}: {frame.description}")
Analyze Audio
python
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
    analyzer_id="prebuilt-audioSearch",
    inputs=[AnalyzeInput(url="https://example.com/audio.mp3")]
)
result = poller.result()
# Access audio transcript
content = result.contents[0]
for phrase in content.transcript_phrases:
    print(f"[{phrase.start_time}] {phrase.text}")
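The audio and video loops above share the same shape, so the formatting can be pulled into one function. This sketch assumes phrases carry start_time, end_time, and text attributes as used in the examples above, and makes no assumption about the time unit. The helper transcript_lines is hypothetical, and the sample phrases are stand-ins.

```python
from types import SimpleNamespace
from typing import Iterable, List, Optional


def transcript_lines(phrases: Optional[Iterable[object]]) -> List[str]:
    """Format phrases as '[start - end] text', skipping empty ones."""
    lines = []
    for phrase in phrases or []:
        text = (getattr(phrase, "text", "") or "").strip()
        if not text:
            continue  # nothing spoken in this segment
        lines.append(f"[{phrase.start_time} - {phrase.end_time}] {text}")
    return lines


# Stand-in phrases for illustration; real ones come from content.transcript_phrases.
sample_phrases = [
    SimpleNamespace(start_time=0, end_time=1500, text=" Hello "),
    SimpleNamespace(start_time=1500, end_time=1600, text=""),
    SimpleNamespace(start_time=1600, end_time=3000, text="world"),
]
formatted = transcript_lines(sample_phrases)
```

Accepting None keeps the helper safe to call on content that has no transcript.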
Custom Analyzers
Create custom analyzers with field schemas for specialized extraction:
python
# Create custom analyzer
analyzer = client.create_analyzer(
    analyzer_id="my-invoice-analyzer",
    analyzer={
        "description": "Custom invoice analyzer",
        "base_analyzer_id": "prebuilt-documentSearch",
        "field_schema": {
            "fields": {
                "vendor_name": {"type": "string"},
                "invoice_total": {"type": "number"},
                "line_items": {
                    "type": "array",
                    "items": {
                        "type": "object",
                        "properties": {
                            "description": {"type": "string"},
                            "amount": {"type": "number"}
                        }
                    }
                }
            }
        }
    }
)
# Use custom analyzer
from azure.ai.contentunderstanding.models import AnalyzeInput
poller = client.begin_analyze(
    analyzer_id="my-invoice-analyzer",
    inputs=[AnalyzeInput(url="https://example.com/invoice.pdf")]
)
result = poller.result()
# Access extracted fields
print