clawra-selfie

Edit Clawra's reference image with Grok Imagine (xAI Aurora) and send selfies to messaging channels via OpenClaw.

SKILL.md
--- frontmatter
name: clawra-selfie
description: Edit Clawra's reference image with Grok Imagine (xAI Aurora) and send selfies to messaging channels via OpenClaw
allowed-tools: Bash(npm:*) Bash(npx:*) Bash(openclaw:*) Bash(curl:*) Read Write WebFetch

Clawra Selfie

Edit a fixed reference image using xAI's Grok Imagine model and distribute it across messaging platforms (WhatsApp, Telegram, Discord, Slack, etc.) via OpenClaw.

Reference Image

By default, the skill edits a reference image hosted on the jsDelivr CDN:

code
https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png

To use a custom image instead, set the CLAWRA_REFERENCE_IMAGE environment variable to its URL.

When to Use

  • User says "send a pic", "send me a pic", "send a photo", "send a selfie"
  • User says "send a pic of you...", "send a selfie of you..."
  • User asks "what are you doing?", "how are you doing?", "where are you?"
  • User describes a context: "send a pic wearing...", "send a pic at..."
  • User wants Clawra to appear in a specific outfit, location, or situation

Quick Reference

Environment Variables

bash
FAL_KEY=your_fal_api_key          # Get from https://fal.ai/dashboard/keys
OPENCLAW_GATEWAY_TOKEN=your_token  # From: openclaw doctor --generate-gateway-token
CLAWRA_REFERENCE_IMAGE=url_to_img  # Optional: Custom reference image URL
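
As a rough illustration, the variables above could be read in Node.js along the following lines. `loadConfig`, `SelfieConfig`, and `DEFAULT_REFERENCE_IMAGE` are hypothetical names introduced for this sketch, not part of the skill. The sketch assumes only FAL_KEY must be present, mirroring the check in the complete script further down.

```typescript
// Hypothetical settings shape for this sketch (not defined by the skill).
interface SelfieConfig {
  falKey: string;
  gatewayToken?: string; // sent as the Bearer token in the direct gateway API call
  referenceImage: string;
}

// Default reference image URL, as given in the Reference Image section.
const DEFAULT_REFERENCE_IMAGE =
  "https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png";

// Reads settings from an env-like object, e.g. loadConfig(process.env).
function loadConfig(env: Record<string, string | undefined>): SelfieConfig {
  const falKey = env.FAL_KEY;
  if (!falKey) {
    throw new Error("FAL_KEY environment variable not set");
  }
  return {
    falKey,
    gatewayToken: env.OPENCLAW_GATEWAY_TOKEN || undefined,
    // Fall back to the default image when the variable is unset or empty.
    referenceImage: env.CLAWRA_REFERENCE_IMAGE || DEFAULT_REFERENCE_IMAGE,
  };
}
```

Taking the environment as a parameter, rather than reading `process.env` inside the function, keeps the helper easy to exercise with plain objects.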

Workflow

  1. Get the user's prompt describing how to edit the image
  2. Edit the fixed reference image via the fal.ai Grok Imagine Edit API
  3. Extract the image URL from the response
  4. Send the image to the target channel(s) via OpenClaw

Step-by-Step Instructions

Step 1: Collect User Input

Ask the user for:

  • User context: What should the person in the image be doing or wearing, and where?
  • Mode (optional): mirror or direct selfie style
  • Target channel(s): Where should it be sent? (e.g., #general, @username, channel ID)
  • Platform (optional): Which platform? (discord, telegram, whatsapp, slack)

Prompt Modes

Mode 1: Mirror Selfie (default)

Best for: outfit showcases, full-body shots, fashion content

code
make a pic of this person, but [user's context]. the person is taking a mirror selfie

Example: "wearing a santa hat" →

code
make a pic of this person, but wearing a santa hat. the person is taking a mirror selfie

Mode 2: Direct Selfie

Best for: close-up portraits, location shots, emotional expressions

code
a close-up selfie taken by herself at [user's context], direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible

Example: "a cozy cafe with warm lighting" →

code
a close-up selfie taken by herself at a cozy cafe with warm lighting, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible

Mode Selection Logic

| Keywords in Request | Auto-Select Mode |
|---|---|
| outfit, wearing, clothes, dress, suit, fashion | mirror |
| cafe, restaurant, beach, park, city, location | direct |
| close-up, portrait, face, eyes, smile | direct |
| full-body, mirror, reflection | mirror |

Step 2: Edit Image with Grok Imagine

Use the fal.ai API to edit the reference image:

bash
REFERENCE_IMAGE="${CLAWRA_REFERENCE_IMAGE:-https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png}"

# Choose ONE of the two prompts below. If both assignments are left
# active, the second overrides the first.

# Mode 1: Mirror Selfie
PROMPT="make a pic of this person, but <USER_CONTEXT>. the person is taking a mirror selfie"

# Mode 2: Direct Selfie (uncomment to use instead of Mode 1)
# PROMPT="a close-up selfie taken by herself at <USER_CONTEXT>, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible"

# Build JSON payload with jq (handles escaping properly)
JSON_PAYLOAD=$(jq -n \
  --arg image_url "$REFERENCE_IMAGE" \
  --arg prompt "$PROMPT" \
  '{image_url: $image_url, prompt: $prompt, num_images: 1, output_format: "jpeg"}')

curl -X POST "https://fal.run/xai/grok-imagine-image/edit" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD"

Response Format:

json
{
  "images": [
    {
      "url": "https://v3b.fal.media/files/...",
      "content_type": "image/jpeg",
      "width": 1024,
      "height": 1024
    }
  ],
  "revised_prompt": "Enhanced prompt text..."
}
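
Workflow step 3 (extracting the image URL) can be sketched as follows for a raw response body such as the one returned by the curl call. `extractImageUrl` and `EditResponse` are hypothetical names for this sketch; the only fields assumed are the ones shown in the response format above.

```typescript
// Only the fields this sketch reads from the documented response format.
interface EditResponse {
  images?: Array<{ url?: string }>;
}

// Hypothetical helper: returns the first image URL, or throws with context
// when the body is not JSON or contains no usable URL.
function extractImageUrl(responseBody: string): string {
  let parsed: EditResponse | null = null;
  try {
    parsed = JSON.parse(responseBody) as EditResponse | null;
  } catch {
    throw new Error("Response is not valid JSON");
  }
  const url = parsed?.images?.[0]?.url;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error(`No image URL in response: ${responseBody}`);
  }
  return url;
}
```

Including the raw body in the error message makes a failed edit easier to diagnose, since an error response has no `images` array to read.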

Step 3: Send Image via OpenClaw

Use the OpenClaw messaging API to send the edited image:

bash
openclaw message send \
  --action send \
  --channel "<TARGET_CHANNEL>" \
  --message "<CAPTION_TEXT>" \
  --media "<IMAGE_URL>"

Alternative: Direct API call

bash
curl -X POST "http://localhost:18789/message" \
  -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "send",
    "channel": "<TARGET_CHANNEL>",
    "message": "<CAPTION_TEXT>",
    "media": "<IMAGE_URL>"
  }'

Complete Script Example

bash
#!/bin/bash
# grok-imagine-edit-send.sh

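# Check required tools
for tool in curl jq openclaw; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "Error: $tool is required but not installed"
    exit 1
  fi
done

# Check required environment variables
if [ -z "$FAL_KEY" ]; then
  echo "Error: FAL_KEY environment variable not set"
  exit 1
fi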

# Reference image (env var or default)
REFERENCE_IMAGE="${CLAWRA_REFERENCE_IMAGE:-https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png}"

USER_CONTEXT="$1"
CHANNEL="$2"
MODE="${3:-auto}"  # mirror, direct, or auto
CAPTION="${4:-Edited with Grok Imagine}"

if [ -z "$USER_CONTEXT" ] || [ -z "$CHANNEL" ]; then
  echo "Usage: $0 <user_context> <channel> [mode] [caption]"
  echo "Modes: mirror, direct, auto (default)"
  echo "Example: $0 'wearing a cowboy hat' '#general' mirror"
  echo "Example: $0 'a cozy cafe' '#general' direct"
  exit 1
fi

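# Auto-detect mode based on keywords (lists match the Mode Selection Logic table).
# Mirror keywords are checked first, so a context that matches both lists selects mirror.
if [ "$MODE" = "auto" ]; then
  if echo "$USER_CONTEXT" | grep -qiE "outfit|wearing|clothes|dress|suit|fashion|full-body|mirror|reflection"; then
    MODE="mirror"
  elif echo "$USER_CONTEXT" | grep -qiE "cafe|restaurant|beach|park|city|location|close-up|portrait|face|eyes|smile"; then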
    MODE="direct"
  else
    MODE="mirror"  # default
  fi
  echo "Auto-detected mode: $MODE"
fi

# Construct the prompt based on mode
if [ "$MODE" == "direct" ]; then
  EDIT_PROMPT="a close-up selfie taken by herself at $USER_CONTEXT, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible"
else
  EDIT_PROMPT="make a pic of this person, but $USER_CONTEXT. the person is taking a mirror selfie"
fi

echo "Mode: $MODE"
echo "Editing reference image with prompt: $EDIT_PROMPT"

# Edit image (using jq for proper JSON escaping)
JSON_PAYLOAD=$(jq -n \
  --arg image_url "$REFERENCE_IMAGE" \
  --arg prompt "$EDIT_PROMPT" \
  '{image_url: $image_url, prompt: $prompt, num_images: 1, output_format: "jpeg"}')

RESPONSE=$(curl -s -X POST "https://fal.run/xai/grok-imagine-image/edit" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD")

# Extract image URL
IMAGE_URL=$(echo "$RESPONSE" | jq -r '.images[0].url')

if [ "$IMAGE_URL" == "null" ] || [ -z "$IMAGE_URL" ]; then
  echo "Error: Failed to edit image"
  echo "Response: $RESPONSE"
  exit 1
fi

echo "Image edited: $IMAGE_URL"
echo "Sending to channel: $CHANNEL"

# Send via OpenClaw and stop if the send fails
if ! openclaw message send \
  --action send \
  --channel "$CHANNEL" \
  --message "$CAPTION" \
  --media "$IMAGE_URL"; then
  echo "Error: Failed to send image to $CHANNEL"
  exit 1
fi

echo "Done!"

Node.js/TypeScript Implementation

typescript
import { fal } from "@fal-ai/client";
import { execFile } from "child_process";
import { promisify } from "util";

// execFile passes arguments to the CLI without going through a shell, so
// quotes or other shell metacharacters in the caption cannot break the command.
const execFileAsync = promisify(execFile);

const REFERENCE_IMAGE = process.env.CLAWRA_REFERENCE_IMAGE || "https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png";

interface GrokImagineResult {
  images: Array<{
    url: string;
    content_type: string;
    width: number;
    height: number;
  }>;
  revised_prompt?: string;
}

type SelfieMode = "mirror" | "direct" | "auto";

function detectMode(userContext: string): "mirror" | "direct" {
  const mirrorKeywords = /outfit|wearing|clothes|dress|suit|fashion|full-body|mirror|reflection/i;
  const directKeywords = /cafe|restaurant|beach|park|city|location|close-up|portrait|face|eyes|smile/i;

  // Check mirror keywords first, matching the order used in the bash script
  if (mirrorKeywords.test(userContext)) return "mirror";
  if (directKeywords.test(userContext)) return "direct";
  return "mirror"; // default
}

function buildPrompt(userContext: string, mode: "mirror" | "direct"): string {
  if (mode === "direct") {
    return `a close-up selfie taken by herself at ${userContext}, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible`;
  }
  return `make a pic of this person, but ${userContext}. the person is taking a mirror selfie`;
}

async function editAndSend(
  userContext: string,
  channel: string,
  mode: SelfieMode = "auto",
  caption?: string
): Promise<string> {
  // Fail early with a clear message if the API key is missing
  const falKey = process.env.FAL_KEY;
  if (!falKey) {
    throw new Error("FAL_KEY environment variable not set");
  }

  // Configure fal.ai client
  fal.config({
    credentials: falKey
  });

  // Determine mode
  const actualMode = mode === "auto" ? detectMode(userContext) : mode;
  console.log(`Mode: ${actualMode}`);

  // Construct the prompt
  const editPrompt = buildPrompt(userContext, actualMode);

  // Edit reference image with Grok Imagine
  console.log(`Editing image: "${editPrompt}"`);

  const result = await fal.subscribe("xai/grok-imagine-image/edit", {
    input: {
      image_url: REFERENCE_IMAGE,
      prompt: editPrompt,
      num_images: 1,
      output_format: "jpeg"
    }
  }) as { data: GrokImagineResult };

  // Guard against a response without images before reading the URL
  const imageUrl = result.data.images?.[0]?.url;
  if (!imageUrl) {
    throw new Error("Failed to edit image: no image URL in response");
  }
  console.log(`Edited image URL: ${imageUrl}`);

  // Send via OpenClaw
  const messageCaption = caption || `Edited with Grok Imagine`;

  await execFileAsync("openclaw", [
    "message", "send",
    "--action", "send",
    "--channel", channel,
    "--message", messageCaption,
    "--media", imageUrl
  ]);

  console.log(`Sent to ${channel}`);
  return imageUrl;
}

// Usage Examples

// Mirror mode (auto-detected from "wearing")
editAndSend(
  "wearing a cyberpunk outfit with neon lights",
  "#art-gallery",
  "auto",
  "Check out this AI-edited art!"
);
// → Mode: mirror
// → Prompt: "make a pic of this person, but wearing a cyberpunk outfit with neon lights. the person is taking a mirror selfie"

// Direct mode (auto-detected from "cafe")
editAndSend(
  "a cozy cafe with warm lighting",
  "#photography",
  "auto"
);
// → Mode: direct
// → Prompt: "a close-up selfie taken by herself at a cozy cafe with warm lighting, direct eye contact..."

// Explicit mode override
editAndSend("casual street style", "#fashion", "direct");

Supported Platforms

OpenClaw supports sending to:

| Platform | Channel Format | Example |
|---|---|---|
| Discord | #channel-name or channel ID | #general, 123456789 |
| Telegram | @username or chat ID | @mychannel, -100123456 |
| WhatsApp | Phone number (JID format) | 1234567890@s.whatsapp.net |
| Slack | #channel-name | #random |
| Signal | Phone number | +1234567890 |
| MS Teams | Channel reference | (varies) |

Grok Imagine Edit Parameters

| Parameter | Type | Default | Description |
|---|---|---|---|
| image_url | string | required | URL of image to edit (configured via CLAWRA_REFERENCE_IMAGE) |
| prompt | string | required | Edit instruction |
| num_images | 1-4 | 1 | Number of images to generate |
| output_format | enum | "jpeg" | jpeg, png, webp |
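
The constraints in this table can be checked before any request is sent. The sketch below is one possible way to do that; `buildEditInput`, `EditInput`, and `OutputFormat` are hypothetical names, and the sketch assumes `num_images` must be a whole number in the listed 1-4 range.

```typescript
type OutputFormat = "jpeg" | "png" | "webp";

interface EditInput {
  image_url: string;
  prompt: string;
  num_images: number;
  output_format: OutputFormat;
}

// Hypothetical helper: applies the defaults from the table and rejects
// values outside the documented ranges before any API call is made.
function buildEditInput(
  imageUrl: string,
  prompt: string,
  numImages: number = 1,
  outputFormat: string = "jpeg"
): EditInput {
  if (!imageUrl) throw new Error("image_url is required");
  if (!prompt.trim()) throw new Error("prompt is required");
  if (!Number.isInteger(numImages) || numImages < 1 || numImages > 4) {
    throw new Error("num_images must be an integer from 1 to 4");
  }
  const formats: readonly string[] = ["jpeg", "png", "webp"];
  if (!formats.includes(outputFormat)) {
    throw new Error("output_format must be jpeg, png, or webp");
  }
  return {
    image_url: imageUrl,
    prompt,
    num_images: numImages,
    output_format: outputFormat as OutputFormat,
  };
}
```

The returned object has the same field names as the JSON payload used in the curl and `fal.subscribe` examples, so it can be passed as the request input.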

Setup Requirements

1. Install fal.ai client (for Node.js usage)

bash
npm install @fal-ai/client

2. Install OpenClaw CLI

bash
npm install -g openclaw

3. Configure OpenClaw Gateway

bash
openclaw config set gateway.mode=local
openclaw doctor --generate-gateway-token

4. Start OpenClaw Gateway

bash
openclaw gateway start

Error Handling

  • FAL_KEY missing: Ensure the API key is set in environment
  • Image edit failed: Check prompt content and API quota
  • OpenClaw send failed: Verify gateway is running and channel exists
  • Rate limits: fal.ai has rate limits; implement retry logic if needed
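
The retry logic mentioned for rate limits could look roughly like the sketch below, which uses generic exponential backoff. `withRetry` is a hypothetical helper, the attempt count and delays are arbitrary defaults, and it retries on any thrown error because the source does not say which fal.ai errors are retryable.

```typescript
// Hypothetical helper: retries an async operation with exponential backoff.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3,
  baseDelayMs: number = 1000
): Promise<T> {
  if (maxAttempts < 1) {
    throw new RangeError("maxAttempts must be at least 1");
  }
  let lastError: unknown;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts) {
        // Wait 1x, 2x, 4x, ... the base delay between attempts.
        const delayMs = baseDelayMs * 2 ** (attempt - 1);
        await new Promise<void>((resolve) => setTimeout(resolve, delayMs));
      }
    }
  }
  // Every attempt failed: surface the most recent error to the caller.
  throw lastError;
}
```

Wrapping only the edit request (for example the `fal.subscribe` call) rather than the whole edit-and-send flow avoids sending the same message twice when a later step fails.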

Tips

  1. Mirror mode context examples (outfit focus):

    • "wearing a santa hat"
    • "in a business suit"
    • "wearing a summer dress"
    • "in streetwear fashion"
  2. Direct mode context examples (location/portrait focus):

    • "a cozy cafe with warm lighting"
    • "a sunny beach at sunset"
    • "a busy city street at night"
    • "a peaceful park in autumn"
  3. Mode selection: Let auto-detect work, or explicitly specify for control

  4. Batch sending: Edit once, send to multiple channels

  5. Scheduling: Combine with OpenClaw scheduler for automated posts
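
For the batch-sending tip, one possible approach is to edit the image once and then build one `openclaw message send` argument list per channel. `buildSendCommands` is a hypothetical helper; the flags are the ones used in Step 3, and the default caption is the one used in the complete script.

```typescript
// Hypothetical helper: builds one argument list per channel, reusing a
// single edited image URL for all of them.
function buildSendCommands(
  channels: string[],
  imageUrl: string,
  caption: string = "Edited with Grok Imagine"
): string[][] {
  // Drop blanks and duplicates so each channel receives the image once.
  const unique = Array.from(
    new Set(channels.map((c) => c.trim()).filter((c) => c.length > 0))
  );
  return unique.map((channel) => [
    "message", "send",
    "--action", "send",
    "--channel", channel,
    "--message", caption,
    "--media", imageUrl,
  ]);
}
```

Each returned array can be passed to Node's `execFile("openclaw", args)`, which runs the CLI without a shell and so avoids quoting problems in captions.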