AgentSkillsCN

gemini-file-api

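Use the Gemini File API for document caching and efficient multi-query workflows. Invoke this skill when discussing file uploads, caching strategies, token optimization, or comparing the File API with inline document processing.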

SKILL.md
--- frontmatter
name: gemini-file-api
description: Work with Gemini File API for document caching and efficient multi-query workflows. Use when discussing file uploads, caching strategies, token optimization, or comparing File API vs inline document approaches.
allowed-tools:
  - Read
  - Grep
  - Bash

Gemini File API Integration

Purpose

Understand and work with Gemini's File API for efficient document processing.

Architecture Overview

code
┌─────────────────────────────────────────────┐
│  Document Upload                            │
│  UploadFile() -> Gemini File API            │
│  Returns: URI with 48hr TTL                 │
└─────────────────┬───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│  FileCache (in-memory)                      │
│  Prevents re-uploads within session         │
│  Key: file path, Value: FileReference       │
└─────────────────┬───────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────┐
│  GenerateContent()                          │
│  References files by URI (not bytes)        │
│  Reuse same URI for multiple queries        │
└─────────────────────────────────────────────┘
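The flow in the diagram can be sketched as a small get-or-upload helper. This is a minimal, self-contained illustration, not the project's code: `fileRef`, `uploader`, `getOrUpload`, and `newCountingUploader` are hypothetical names, and the stub uploader stands in for the real `UploadFile` call.

```go
package main

import (
	"fmt"
	"time"
)

// fileRef is a hypothetical, trimmed-down stand-in for FileReference.
type fileRef struct {
	URI       string
	ExpiresAt time.Time
}

// uploader is a hypothetical function type standing in for client.UploadFile.
type uploader func(path string) (fileRef, error)

// getOrUpload returns the cached reference for path when one exists and has
// not expired; otherwise it uploads the file and caches the new reference.
func getOrUpload(cache map[string]fileRef, path string, up uploader) (fileRef, error) {
	if ref, ok := cache[path]; ok && time.Now().Before(ref.ExpiresAt) {
		return ref, nil
	}
	ref, err := up(path)
	if err != nil {
		return fileRef{}, err
	}
	cache[path] = ref
	return ref, nil
}

// newCountingUploader returns a stub uploader that increments *count on
// every call; it exists only to make the demo observable.
func newCountingUploader(count *int) uploader {
	return func(path string) (fileRef, error) {
		*count++
		return fileRef{
			URI:       fmt.Sprintf("files/stub-%d", *count),
			ExpiresAt: time.Now().Add(48 * time.Hour),
		}, nil
	}
}

func main() {
	uploads := 0
	up := newCountingUploader(&uploads)
	cache := map[string]fileRef{}

	// Three queries against the same document reuse one upload.
	for i := 1; i <= 3; i++ {
		ref, err := getOrUpload(cache, "/path/to/doc.pdf", up)
		if err != nil {
			fmt.Println("upload failed:", err)
			return
		}
		fmt.Println("query", i, "uses", ref.URI)
	}
	fmt.Println("uploads performed:", uploads)
}
```

The plain map keeps the sketch short and is not safe for concurrent use.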

Client Usage

Creating a Client

go
// Reads GEMINI_API_KEY from environment
client, err := gemini.NewClient(ctx)
if err != nil {
    return err
}
defer client.Close()

Uploading Files

Single file:

go
ref, err := client.UploadFile(ctx, "/path/to/doc.pdf", "application/pdf")
// ref.URI is reusable for 48 hours

Multiple files (parallel upload):

go
paths := []string{"/path/to/w2.pdf", "/path/to/paystub.pdf"}
mimeTypes := map[string]string{
    "/path/to/w2.pdf": "application/pdf",
    "/path/to/paystub.pdf": "application/pdf",
}
refs, err := client.UploadFiles(ctx, paths, mimeTypes)
// refs is map[string]*FileReference
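How `UploadFiles` parallelizes its work is not shown here. One plausible shape, sketched with the standard library only, is a goroutine per file guarded by a `sync.WaitGroup` and a mutex. The names `uploadFunc`, `uploadAll`, and `stubUpload` are hypothetical, and the sketch returns plain URI strings rather than `*FileReference` values.

```go
package main

import (
	"fmt"
	"sync"
)

// uploadFunc is a hypothetical stand-in for a single-file upload call.
type uploadFunc func(path, mimeType string) (string, error)

// uploadAll uploads every path concurrently and returns path -> URI.
// It waits for all uploads to finish and reports the first error recorded.
func uploadAll(paths []string, mimeTypes map[string]string, upload uploadFunc) (map[string]string, error) {
	var (
		wg       sync.WaitGroup
		mu       sync.Mutex // guards uris and firstErr
		firstErr error
	)
	uris := make(map[string]string, len(paths))

	for _, p := range paths {
		wg.Add(1)
		go func(path string) {
			defer wg.Done()
			uri, err := upload(path, mimeTypes[path])

			mu.Lock()
			defer mu.Unlock()
			if err != nil {
				if firstErr == nil {
					firstErr = fmt.Errorf("upload %s: %w", path, err)
				}
				return
			}
			uris[path] = uri
		}(p)
	}
	wg.Wait()

	if firstErr != nil {
		return nil, firstErr
	}
	return uris, nil
}

// stubUpload is a hypothetical uploader used only for demonstration.
func stubUpload(path, mimeType string) (string, error) {
	if mimeType == "" {
		return "", fmt.Errorf("no MIME type for %s", path)
	}
	return "files/" + path, nil
}

func main() {
	paths := []string{"w2.pdf", "paystub.pdf"}
	mimeTypes := map[string]string{
		"w2.pdf":      "application/pdf",
		"paystub.pdf": "application/pdf",
	}
	uris, err := uploadAll(paths, mimeTypes, stubUpload)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(len(uris), "files uploaded")
}
```

A real implementation would likely also cap the number of concurrent uploads to stay within rate limits.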

Generating Content

With File API (recommended for multiple queries):

go
resp, err := client.GenerateContent(ctx, &gemini.GenerateRequest{
    Model:        gemini.ModelFlash,
    Prompt:       "Analyze this W2 for annual income",
    SystemPrompt: "You are a mortgage underwriter...",
    FileURIs:     []string{ref.URI},
    FileMIMEs:    map[string]string{ref.URI: "application/pdf"},
})

With Inline bytes (simpler for single queries):

go
resp, err := client.GenerateContent(ctx, &gemini.GenerateRequest{
    Model:  gemini.ModelFlash,
    Prompt: "Analyze this document",
    InlineFiles: []gemini.InlineFile{{
        Data:     fileBytes,
        MimeType: "application/pdf",
    }},
})

Data Structures

FileReference

go
type FileReference struct {
    Name      string    // Gemini-assigned name
    URI       string    // Reusable URI for requests
    MimeType  string
    State     string    // Processing state
    CreatedAt time.Time
    ExpiresAt time.Time // 48 hours from creation
    SizeBytes int64
}
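The `State` field suggests that an uploaded file may not be usable immediately. A caller could poll until the file is ready, as in the sketch below. The state strings follow the public Gemini Files API (`PROCESSING`, `ACTIVE`, `FAILED`); whether this wrapper stores the same values is an assumption, and `stateFunc` and `waitUntilActive` are hypothetical names.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// Assumed state values; the strings stored in FileReference.State may differ.
const (
	stateProcessing = "PROCESSING"
	stateActive     = "ACTIVE"
	stateFailed     = "FAILED"
)

// stateFunc is a hypothetical callback that returns the file's current state.
type stateFunc func() (string, error)

// waitUntilActive polls getState up to maxAttempts times, sleeping interval
// between attempts, and returns nil once the file reports ACTIVE.
func waitUntilActive(getState stateFunc, maxAttempts int, interval time.Duration) error {
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		state, err := getState()
		if err != nil {
			return fmt.Errorf("checking state: %w", err)
		}
		switch state {
		case stateActive:
			return nil
		case stateFailed:
			return errors.New("file processing failed")
		}
		// Any other state is treated as "not ready yet".
		if attempt < maxAttempts {
			time.Sleep(interval)
		}
	}
	return fmt.Errorf("file still not active after %d attempts", maxAttempts)
}

func main() {
	// Stub that reports PROCESSING twice, then ACTIVE.
	calls := 0
	stub := func() (string, error) {
		calls++
		if calls < 3 {
			return stateProcessing, nil
		}
		return stateActive, nil
	}
	err := waitUntilActive(stub, 5, 10*time.Millisecond)
	fmt.Println("calls:", calls, "err:", err)
}
```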

GenerateRequest

go
type GenerateRequest struct {
    Model        string              // ModelFlash or ModelPro
    Prompt       string              // User prompt
    SystemPrompt string              // System instructions
    FileURIs     []string            // Gemini file URIs
    FileMIMEs    map[string]string   // URI -> MIME type
    InlineFiles  []InlineFile        // Direct bytes (alternative)
    Temperature  *float32
    MaxTokens    *int32
}
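`Temperature` and `MaxTokens` are pointers, presumably so that an unset value (nil) can be told apart from an explicit zero. Go cannot take the address of a literal, so a small generic helper is a common way to fill such fields. The sketch below assumes Go 1.18 or later; `generateRequest`, `ptr`, and `describe` are hypothetical names, and the nil-means-default behavior is an assumption about the client.

```go
package main

import "fmt"

// generateRequest is a hypothetical, trimmed copy of GenerateRequest that
// keeps only the optional fields.
type generateRequest struct {
	Temperature *float32
	MaxTokens   *int32
}

// ptr returns a pointer to a copy of v, so literals can be used for
// pointer-typed fields.
func ptr[T any](v T) *T {
	return &v
}

// describe reports which optional fields were set explicitly.
func describe(r generateRequest) string {
	temp, limit := "default", "default"
	if r.Temperature != nil {
		temp = fmt.Sprint(*r.Temperature)
	}
	if r.MaxTokens != nil {
		limit = fmt.Sprint(*r.MaxTokens)
	}
	return "temperature=" + temp + " maxTokens=" + limit
}

func main() {
	// Nothing set: both fields stay nil.
	fmt.Println(describe(generateRequest{}))

	// An explicit temperature of 0 is distinguishable from "not set".
	fmt.Println(describe(generateRequest{
		Temperature: ptr[float32](0),
		MaxTokens:   ptr[int32](1024),
	}))
}
```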

GenerateResponse

go
type GenerateResponse struct {
    Text           string
    FinishReason   string
    TokensUsed     int64
    ModelUsed      string
    ProcessingTime time.Duration
}

Model Selection

Model         Constant           Use Case
Gemini Flash  gemini.ModelFlash  Fast, cheaper, most tasks
Gemini Pro    gemini.ModelPro    Complex reasoning, escalation

File API vs Inline Comparison

Aspect      File API                     Inline
Upload      Once (cached 48hr)           Every request
Best for    Multiple queries, same docs  Single query
Token cost  Lower per query              Higher per query
Latency     Higher first query           Consistent
Break-even  ~7 queries                   N/A
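A break-even point like the one in the table falls out of a simple linear cost model: the File API pays a one-time upload overhead and then a lower per-query cost, while inline bytes pay a higher cost on every query. The sketch below computes the crossover under that assumed model. The function name and the numbers in `main` are placeholders for illustration, not measurements from this project.

```go
package main

import "fmt"

// breakEvenQueries returns the smallest number of queries at which the
// File API path becomes strictly cheaper than inline bytes under a simple
// linear cost model:
//
//	File API total: uploadOverhead + n*filePerQuery
//	Inline total:   n*inlinePerQuery
//
// All costs share one arbitrary unit. ok is false when the File API saves
// nothing per query and therefore never wins.
func breakEvenQueries(uploadOverhead, inlinePerQuery, filePerQuery int) (n int, ok bool) {
	saving := inlinePerQuery - filePerQuery
	if saving <= 0 {
		return 0, false
	}
	if uploadOverhead < 0 {
		uploadOverhead = 0
	}
	// We need n*saving > uploadOverhead, so n = floor(overhead/saving) + 1.
	return uploadOverhead/saving + 1, true
}

func main() {
	// Placeholder costs in arbitrary units, not benchmark results.
	n, ok := breakEvenQueries(20, 10, 6)
	if !ok {
		fmt.Println("File API never breaks even with these costs")
		return
	}
	fmt.Println("File API is cheaper from query", n, "onward")
}
```

Measured values, for example from the benchmark tool in cmd/benchmark, would replace the placeholders.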

FileCache Details

Located in internal/gemini/cache.go:

  • Thread-safe: Uses RWMutex
  • Expiry-aware: Checks 48hr TTL before returning
  • Methods: Get, Set, Delete, Cleanup, GetAllURIs
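The properties listed above can be illustrated with a compact cache built on `sync.RWMutex`. This is a sketch of the general technique, not a copy of internal/gemini/cache.go: the type and helper names are hypothetical, and only `Get`, `Set`, and `Cleanup` are shown.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fileReference is a hypothetical, trimmed stand-in for FileReference.
type fileReference struct {
	URI       string
	ExpiresAt time.Time
}

// fileCache maps a file path to its uploaded reference.
type fileCache struct {
	mu    sync.RWMutex
	items map[string]fileReference
}

func newFileCache() *fileCache {
	return &fileCache{items: make(map[string]fileReference)}
}

// expiringIn builds a reference that expires ttl from now (demo helper).
func expiringIn(uri string, ttl time.Duration) fileReference {
	return fileReference{URI: uri, ExpiresAt: time.Now().Add(ttl)}
}

// Get returns the reference for path unless it is missing or expired.
// It only takes the read lock, so concurrent readers do not block each other.
func (c *fileCache) Get(path string) (fileReference, bool) {
	c.mu.RLock()
	defer c.mu.RUnlock()
	ref, ok := c.items[path]
	if !ok || !time.Now().Before(ref.ExpiresAt) {
		return fileReference{}, false
	}
	return ref, true
}

// Set stores or replaces the reference for path.
func (c *fileCache) Set(path string, ref fileReference) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.items[path] = ref
}

// Cleanup removes expired entries and returns how many were dropped.
func (c *fileCache) Cleanup() int {
	c.mu.Lock()
	defer c.mu.Unlock()
	removed := 0
	now := time.Now()
	for path, ref := range c.items {
		if !now.Before(ref.ExpiresAt) {
			delete(c.items, path)
			removed++
		}
	}
	return removed
}

func main() {
	cache := newFileCache()
	cache.Set("/path/to/w2.pdf", expiringIn("files/w2", 48*time.Hour))
	cache.Set("/path/to/old.pdf", expiringIn("files/old", -time.Minute))

	_, fresh := cache.Get("/path/to/w2.pdf")
	_, stale := cache.Get("/path/to/old.pdf")
	fmt.Println("fresh hit:", fresh, "stale hit:", stale)
	fmt.Println("entries removed by Cleanup:", cache.Cleanup())
}
```

In this sketch `Get` hides expired entries without deleting them, because deletion needs the write lock; `Cleanup` does the actual removal.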

Troubleshooting

"GEMINI_API_KEY not set"

bash
export GEMINI_API_KEY="your-key"

Upload failures

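  • Check that the file exists and is readable
  • Verify that the declared MIME type matches the file's actual content
  • Check the file against the size limits (the Gemini Files API documents a 2 GB per-file maximum)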
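These checks can be run locally before an upload is attempted. The sketch below uses only the Go standard library: `os.Stat` for existence and size, and `http.DetectContentType` to sniff the content type from the first bytes. The names `sniffMIME` and `preflight` are hypothetical, and the size limit in `main` is an assumed value to adjust as needed.

```go
package main

import (
	"fmt"
	"io"
	"mime"
	"net/http"
	"os"
)

// sniffMIME returns the media type detected from the leading bytes of data,
// without parameters such as "; charset=utf-8".
func sniffMIME(data []byte) string {
	full := http.DetectContentType(data)
	mediaType, _, err := mime.ParseMediaType(full)
	if err != nil {
		return full
	}
	return mediaType
}

// preflight checks that path is a readable regular file no larger than
// maxBytes and that its sniffed media type equals wantMIME.
func preflight(path, wantMIME string, maxBytes int64) error {
	info, err := os.Stat(path)
	if err != nil {
		return fmt.Errorf("stat %s: %w", path, err)
	}
	if !info.Mode().IsRegular() {
		return fmt.Errorf("%s is not a regular file", path)
	}
	if info.Size() > maxBytes {
		return fmt.Errorf("%s is %d bytes, limit is %d", path, info.Size(), maxBytes)
	}

	f, err := os.Open(path)
	if err != nil {
		return fmt.Errorf("open %s: %w", path, err)
	}
	defer f.Close()

	// DetectContentType considers at most the first 512 bytes.
	head := make([]byte, 512)
	n, err := io.ReadFull(f, head)
	if err != nil && err != io.EOF && err != io.ErrUnexpectedEOF {
		return fmt.Errorf("read %s: %w", path, err)
	}
	if got := sniffMIME(head[:n]); got != wantMIME {
		return fmt.Errorf("%s looks like %s, not %s", path, got, wantMIME)
	}
	return nil
}

func main() {
	if len(os.Args) < 2 {
		fmt.Println("usage: preflight <file.pdf>")
		return
	}
	// Assumed limit for illustration; use the limit that applies to you.
	const maxBytes = 2 << 30
	if err := preflight(os.Args[1], "application/pdf", maxBytes); err != nil {
		fmt.Println("preflight failed:", err)
		return
	}
	fmt.Println("ok to upload")
}
```

Content sniffing is a heuristic: formats the standard library does not recognize are reported as `application/octet-stream`.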

Rate limits

Add delays between requests:

go
time.Sleep(500 * time.Millisecond)
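A fixed sleep slows every request, including those that would have succeeded. A common alternative is to retry only after a rate-limit error and double the delay each time. The sketch below shows that pattern. `errRateLimited` and `retryWithBackoff` are hypothetical names, and how the real client reports a rate-limit error is not specified here.

```go
package main

import (
	"errors"
	"fmt"
	"time"
)

// errRateLimited is a hypothetical sentinel for a rate-limit response.
var errRateLimited = errors.New("rate limited")

// retryWithBackoff calls fn until it succeeds, returns an error other than
// errRateLimited, or maxAttempts is reached. The delay starts at base and
// doubles after each rate-limited attempt.
func retryWithBackoff(fn func() error, maxAttempts int, base time.Duration) error {
	if maxAttempts < 1 {
		maxAttempts = 1
	}
	delay := base
	var err error
	for attempt := 1; attempt <= maxAttempts; attempt++ {
		err = fn()
		if err == nil || !errors.Is(err, errRateLimited) {
			return err
		}
		if attempt < maxAttempts {
			time.Sleep(delay)
			delay *= 2
		}
	}
	return fmt.Errorf("giving up after %d attempts: %w", maxAttempts, err)
}

func main() {
	// Stub request that is rate limited twice before succeeding.
	calls := 0
	request := func() error {
		calls++
		if calls < 3 {
			return errRateLimited
		}
		return nil
	}

	err := retryWithBackoff(request, 5, 500*time.Millisecond)
	fmt.Println("calls:", calls, "err:", err)
}
```

Adding random jitter to each delay is a common refinement when many clients retry at the same time.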

Related Files

  • internal/gemini/client.go - Main client implementation
  • internal/gemini/cache.go - FileCache implementation
  • cmd/benchmark/main.go - Performance comparison tool