MedGemma Query Skill
Send a medical or health-related question to a locally running MedGemma model via Ollama and return the response.
When to Use
- Medical knowledge questions (symptoms, conditions, treatments, drug info)
- Clinical reasoning and differential diagnosis brainstorming
- Medical literature interpretation
- Health data analysis assistance
- Any time medical domain expertise would be helpful
How to Execute
Use curl to query the Ollama API directly. Default to the 27B model for quality; use 4B only if the user requests speed over quality.
Models
- 27B (recommended): hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M
- 4B (fast): MedAIBase/MedGemma1.5:4b
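The default-to-27B rule can be sketched as a small shell function. `pick_model` and its `fast` keyword are hypothetical names introduced for illustration; only the two model tags come from the list above.

```bash
# Hypothetical helper (not part of the skill): map a speed preference
# to one of the two model tags listed above.
pick_model() {
  case "${1:-quality}" in
    fast|4b|4B) printf '%s\n' "MedAIBase/MedGemma1.5:4b" ;;
    # Anything else, including no argument, falls back to the recommended 27B model.
    *)          printf '%s\n' "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M" ;;
  esac
}
```

Calling `pick_model fast` selects the 4B tag; any other argument, or none, selects the 27B tag.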
Basic Query Pattern
bash
curl -s http://localhost:11434/api/chat -d '{
  "model": "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M",
  "stream": false,
  "messages": [
    {"role": "system", "content": "You are a medical AI assistant. Provide accurate, evidence-based medical information. Always note when professional medical consultation is recommended."},
    {"role": "user", "content": "$ARGUMENTS"}
  ]
}' | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['message']['content'])"
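The pattern above places the question directly inside a JSON literal, so a question containing double quotes, backslashes, or newlines would likely produce an invalid request body. This assumes `$ARGUMENTS` is substituted textually before the command runs; the source does not say how. One possible workaround is to let python3's `json` module do the escaping. `build_payload` is a hypothetical helper name, not part of the skill.

```bash
# Hypothetical helper: build the /api/chat request body with python3's json
# module so that quotes, backslashes, and newlines are escaped correctly.
build_payload() {
  # $1 = model tag, $2 = system prompt, $3 = user question
  python3 -c '
import json, sys
model, system, question = sys.argv[1:4]
print(json.dumps({
    "model": model,
    "stream": False,
    "messages": [
        {"role": "system", "content": system},
        {"role": "user", "content": question},
    ],
}))' "$1" "$2" "$3"
}

# Usage (needs a running Ollama server, so it is left as a comment):
#   build_payload "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M" \
#     "You are a medical AI assistant." 'What does "idiopathic" mean?' \
#     | curl -s http://localhost:11434/api/chat -d @- \
#     | python3 -c "import sys,json; print(json.load(sys.stdin)['message']['content'])"
```

Passing the body with `-d @-` makes curl read it from standard input, which keeps the question out of the shell's quoting rules entirely.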
With Structured JSON Output
When structured output is needed (e.g., differential diagnosis list, drug info tables), add formatting instructions to the system prompt:
bash
curl -s http://localhost:11434/api/chat -d '{
  "model": "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M",
  "stream": false,
  "messages": [
    {"role": "system", "content": "You are a medical AI assistant. Respond with a JSON object containing your analysis. Include keys: summary, details, confidence, and references_needed."},
    {"role": "user", "content": "$ARGUMENTS"}
  ]
}' | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['message']['content'])"
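A system prompt alone does not guarantee that the reply parses as JSON. Ollama's chat endpoint also accepts a top-level `"format": "json"` field in the request body, which may help here. Independently of that, the reply can be validated before it is used. The sketch below checks for the four keys named in the system prompt; `parse_structured` is a hypothetical helper name.

```bash
# Hypothetical helper: read an Ollama /api/chat response on stdin, parse the
# message content as JSON, and fail if a key requested in the prompt is missing.
parse_structured() {
  python3 -c '
import json, sys
content = json.load(sys.stdin)["message"]["content"]
data = json.loads(content)  # raises if the model did not return valid JSON
if not isinstance(data, dict):
    sys.exit("expected a JSON object")
required = ("summary", "details", "confidence", "references_needed")
missing = [k for k in required if k not in data]
if missing:
    sys.exit("missing keys: " + ", ".join(missing))
print(json.dumps(data, indent=2))'
}
```

Piping the curl output into `parse_structured` in place of the final python3 one-liner prints the pretty-printed object, or exits with a non-zero status when the reply is not usable.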
With Chain-of-Thought Reasoning
For complex clinical reasoning, instruct the model in the system prompt to reason step by step before answering:
bash
curl -s http://localhost:11434/api/chat -d '{
  "model": "hf.co/unsloth/medgemma-27b-it-GGUF:Q4_K_M",
  "stream": false,
  "messages": [
    {"role": "system", "content": "You are a medical AI assistant. Think through your reasoning step by step using <think> tags before providing your answer."},
    {"role": "user", "content": "$ARGUMENTS"}
  ]
}' | python3 -c "import sys,json; r=json.load(sys.stdin); print(r['message']['content'])"
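When the reply contains `<think>` blocks, it can be easier to read if the reasoning and the final answer are labelled separately while both are kept. A minimal sketch, assuming the model emits well-formed `<think>...</think>` pairs; `split_think` and the "Reasoning"/"Answer" labels are hypothetical choices for illustration.

```bash
# Hypothetical helper: read model output on stdin and print the <think> content
# and the remaining answer under separate labels. Nothing is discarded.
split_think() {
  python3 -c '
import re, sys
text = sys.stdin.read()
blocks = re.findall(r"<think>(.*?)</think>", text, flags=re.DOTALL)
reasoning = "\n".join(b.strip() for b in blocks)
answer = re.sub(r"<think>.*?</think>", "", text, flags=re.DOTALL).strip()
if reasoning:
    print("Reasoning:\n" + reasoning + "\n")
print("Answer:\n" + answer)'
}
```

It can be appended to the pipeline after the python3 one-liner that extracts `message.content`.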
Important Notes
- MedGemma is NOT a substitute for professional medical advice
- The 27B model is slower but more accurate; the 4B model is faster but less reliable
- Ollama must be running locally on port 11434
- Both models have 131,072-token context windows
- If Ollama is not running, start it with ollama serve
- Responses may contain <think>...</think> reasoning blocks; include these in the output, as they show the model's reasoning
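Checking that the server is reachable before sending a query gives a clearer error than a failed JSON parse. The sketch below probes Ollama's `/api/tags` endpoint, which lists locally available models. `ollama_up` and the 3-second timeout are hypothetical choices, not part of the skill.

```bash
# Hypothetical helper: succeed if an Ollama server answers at the given base URL
# (defaults to the local port noted above).
ollama_up() {
  # -s: silent, -f: treat HTTP errors as failure, --max-time: do not hang
  curl -sf --max-time 3 "${1:-http://localhost:11434}/api/tags" >/dev/null 2>&1
}

# Usage:
#   ollama_up || echo "Ollama is not reachable; start it with: ollama serve" >&2
```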