MedGemma Query Skill

Send a medical or health-related question to a locally running MedGemma model via Ollama and return the response.

When to Use

•Medical knowledge questions (symptoms, conditions, treatments, drug info)
•Clinical reasoning and differential diagnosis brainstorming
•Medical literature interpretation
•Health data analysis assistance
•Any time medical domain expertise would be helpful

How to Execute

Use curl to query the Ollama API directly. Default to the 27B model for quality; use 4B only if the user requests speed over quality.

Models

•27B (recommended): hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M
•4B (fast): MedAIBase/MedGemma1.5:4b
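
The selection rule above can be sketched as a small helper. This is only an illustration: the function name pick_model and its prefer_speed flag are hypothetical, and the two model tags are copied from the list above.

```python
# Model tags copied from the Models list; nothing else is assumed about them.
MODEL_27B = "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M"
MODEL_4B = "MedAIBase/MedGemma1.5:4b"


def pick_model(prefer_speed: bool = False) -> str:
    """Return the 27B tag by default, the 4B tag only when speed is requested.

    pick_model and prefer_speed are hypothetical names used for illustration.
    """
    return MODEL_4B if prefer_speed else MODEL_27B
```

Calling pick_model() yields the 27B tag; pick_model(prefer_speed=True) yields the 4B tag.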

Basic Query Pattern

bash

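# POST one non-streaming chat request; the Python one-liner prints only the reply text.
# $ARGUMENTS stands for the user's question. It sits inside single quotes, so the shell
# does not expand it: substitute the question text, JSON-escaping quotes, backslashes, and newlines.
curl -s http://localhost:11434/api/chat -d '{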
  "model": "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M",
  "stream": false,
  "messages": [
    {"role": "system", "content": "You are a medical AI assistant. Provide accurate, evidence-based medical information. Always note when professional medical consultation is recommended."},
    {"role": "user", "content": "$ARGUMENTS"}
  ]
}' | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['message']['content'])"
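
A question that contains double quotes or newlines would break the hand-written JSON body above. One way around that, sketched here with only the Python standard library, is to let json.dumps do the escaping. The helper names build_payload and ask, and the 600-second timeout, are assumptions for illustration; the URL, model tag, and system prompt are taken from the curl example.

```python
import json
import urllib.request

OLLAMA_CHAT_URL = "http://localhost:11434/api/chat"
MODEL = "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M"
SYSTEM_PROMPT = (
    "You are a medical AI assistant. Provide accurate, evidence-based medical "
    "information. Always note when professional medical consultation is recommended."
)


def build_payload(question: str, model: str = MODEL) -> bytes:
    """Serialize the chat request; json.dumps escapes quotes and newlines."""
    body = {
        "model": model,
        "stream": False,
        "messages": [
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": question},
        ],
    }
    return json.dumps(body).encode("utf-8")


def ask(question: str, timeout: float = 600.0) -> str:
    """POST the request to the local Ollama server and return the reply text.

    The generous default timeout is an assumption, chosen because the 27B
    model is described as slow.
    """
    request = urllib.request.Request(
        OLLAMA_CHAT_URL,
        data=build_payload(question),  # supplying data makes this a POST
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        reply = json.load(response)
    return reply["message"]["content"]
```

With Ollama running, ask('What does "PRN" mean on a prescription?') would return the same text the curl pipeline prints, without any manual escaping of the embedded quotes.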

With Structured JSON Output

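When structured output is needed (e.g., differential diagnosis list, drug info tables), add formatting instructions to the system prompt. The JSON arrives as a string in message.content and is not guaranteed to be valid, so parse it before relying on it: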

bash

# Same request as the basic pattern; only the system prompt changes, asking for a JSON object.
curl -s http://localhost:11434/api/chat -d '{
  "model": "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M",
  "stream": false,
  "messages": [
    {"role": "system", "content": "You are a medical AI assistant. Respond with a JSON object containing your analysis. Include keys: summary, details, confidence, and references_needed."},
    {"role": "user", "content": "$ARGUMENTS"}
  ]
}' | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['message']['content'])"
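
Because the JSON shape is requested only through the prompt, the reply may be wrapped in a Markdown code fence or may omit keys. The sketch below parses and checks the reply text; parse_structured_reply is a hypothetical helper, and the required keys are the four named in the system prompt above. Ollama's chat endpoint also accepts a format field in the request body (for example "format": "json") to constrain output, though how well that works with these MedGemma builds is not covered here.

```python
import json

REQUIRED_KEYS = {"summary", "details", "confidence", "references_needed"}
FENCE = "`" * 3  # Markdown code-fence marker, built indirectly to keep this block simple


def parse_structured_reply(content: str) -> dict:
    """Parse the model's reply as a JSON object and check the requested keys.

    Raises ValueError when the reply is not usable.
    """
    text = content.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line (which may carry a "json" label)...
        text = text.split("\n", 1)[1] if "\n" in text else ""
        # ...and everything from the last closing fence onward, if present.
        text = text.rsplit(FENCE, 1)[0]
    try:
        data = json.loads(text)
    except json.JSONDecodeError as exc:
        raise ValueError(f"reply is not valid JSON: {exc}") from exc
    if not isinstance(data, dict):
        raise ValueError("reply is JSON but not an object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"reply is missing keys: {sorted(missing)}")
    return data
```

Raising ValueError rather than returning partial data keeps a malformed reply from being mistaken for a complete analysis; a caller could catch it and re-ask the question.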

With Chain-of-Thought Reasoning

For complex clinical reasoning, instruct the model in the system prompt to reason step by step before answering:

bash

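# Same request as the basic pattern; the system prompt asks for step-by-step reasoning in <think> tags.
curl -s http://localhost:11434/api/chat -d '{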
  "model": "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M",
  "stream": false,
  "messages": [
    {"role": "system", "content": "You are a medical AI assistant. Think through your reasoning step by step using <think> tags before providing your answer."},
    {"role": "user", "content": "$ARGUMENTS"}
  ]
}' | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['message']['content'])"
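
When the reply does contain <think> blocks, it can help to show the reasoning and the final answer as separate parts. A minimal sketch, assuming the model emits well-formed, non-nested <think>...</think> pairs (the prompt requests them but cannot guarantee them); split_reasoning is a hypothetical helper name.

```python
import re

# Non-greedy match with DOTALL so one reasoning block can span several lines.
THINK_RE = re.compile(r"<think>(.*?)</think>", re.DOTALL)


def split_reasoning(content: str) -> tuple[list[str], str]:
    """Return (reasoning blocks, answer text with those blocks removed)."""
    reasoning = [block.strip() for block in THINK_RE.findall(content)]
    answer = THINK_RE.sub("", content).strip()
    return reasoning, answer
```

A reply without any tags comes back as an empty reasoning list plus the unchanged answer, so the same code path handles both cases.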

Important Notes

•MedGemma is NOT a substitute for professional medical advice
•The 27B model is slower but more accurate; the 4B model is faster but less reliable
•Ollama must be running locally on port 11434
•Both models have 131,072-token context windows
•If Ollama is not running, start it with ollama serve
•Responses may contain <think>...</think> reasoning blocks; include these in the output, as they show the model's reasoning
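
The requirement that Ollama be running on port 11434 can be checked before sending a query. The sketch below probes /api/tags, Ollama's endpoint for listing local models; the function name ollama_is_running and the 2-second timeout are assumptions for illustration.

```python
import urllib.error
import urllib.request


def ollama_is_running(base_url: str = "http://localhost:11434", timeout: float = 2.0) -> bool:
    """Return True if a server at base_url answers /api/tags with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url + "/api/tags", timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status: treat as not running.
        return False
```

If this returns False, starting the server with ollama serve and retrying is the step the notes above describe.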