Clawra Selfie

Edit a fixed reference image using xAI's Grok Imagine model and distribute it across messaging platforms (WhatsApp, Telegram, Discord, Slack, etc.) via OpenClaw.

Reference Image

By default, the skill uses a reference image hosted on the jsDelivr CDN:

code

https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png

To use a custom image instead, set the CLAWRA_REFERENCE_IMAGE environment variable to its URL.

When to Use

•User says "send a pic", "send me a pic", "send a photo", "send a selfie"
•User says "send a pic of you...", "send a selfie of you..."
•User asks "what are you doing?", "how are you doing?", "where are you?"
•User describes a context: "send a pic wearing...", "send a pic at..."
•User wants Clawra to appear in a specific outfit, location, or situation
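
The trigger phrases above could be matched with a small helper. This is only a sketch: `isSelfieRequest` and its patterns are hypothetical, derived from the list above, and an agent may well rely on its own intent detection instead of regular expressions.

```typescript
// Hypothetical patterns, derived from the trigger list above (not part of the skill itself).
const SELFIE_TRIGGERS: RegExp[] = [
  /\bsend\s+(me\s+)?a\s+(pic|photo|selfie)\b/i, // "send a pic", "send me a pic", "send a selfie of you..."
  /\bwhat\s+are\s+you\s+doing\b/i,
  /\bhow\s+are\s+you\s+doing\b/i,
  /\bwhere\s+are\s+you\b/i,
];

// Returns true when the message contains any of the trigger phrases.
function isSelfieRequest(message: string): boolean {
  return SELFIE_TRIGGERS.some((pattern) => pattern.test(message));
}
```

The patterns are deliberately narrow; a phrase such as "show me a photo" would not match and would need its own entry.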

Quick Reference

Required Environment Variables

bash

FAL_KEY=your_fal_api_key          # Get from https://fal.ai/dashboard/keys
OPENCLAW_GATEWAY_TOKEN=your_token  # From: openclaw doctor --generate-gateway-token
CLAWRA_REFERENCE_IMAGE=url_to_img  # Optional: Custom reference image URL
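
A startup check for these variables might look like the following sketch. The helper names (`missingEnvVars`, `resolveReferenceImage`) are hypothetical. Treating both `FAL_KEY` and `OPENCLAW_GATEWAY_TOKEN` as required follows the heading above and is otherwise an assumption.

```typescript
type Env = Record<string, string | undefined>;

// Default reference image URL, as documented in the Reference Image section.
const DEFAULT_REFERENCE_IMAGE =
  "https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png";

// Assumption: both variables are required, as the heading above states.
const REQUIRED_VARS = ["FAL_KEY", "OPENCLAW_GATEWAY_TOKEN"] as const;

// Returns the names of required variables that are unset or empty.
function missingEnvVars(env: Env): string[] {
  return REQUIRED_VARS.filter((name) => !env[name]);
}

// Falls back to the default image when CLAWRA_REFERENCE_IMAGE is unset or empty.
function resolveReferenceImage(env: Env): string {
  return env.CLAWRA_REFERENCE_IMAGE || DEFAULT_REFERENCE_IMAGE;
}
```

In a Node.js process, both helpers would typically be called with `process.env`.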

Workflow

1. Get the user's description of how to edit the image
2. Edit the fixed reference image via the fal.ai Grok Imagine Edit API
3. Extract the image URL from the response
4. Send the image to the target channel(s) via OpenClaw

Step-by-Step Instructions

Step 1: Collect User Input

Ask the user for:

•User context: What should the person in the image be doing or wearing, and where?
•Mode (optional): mirror or direct selfie style
•Target channel(s): Where should it be sent? (e.g., #general, @username, channel ID)
•Platform (optional): Which platform? (discord, telegram, whatsapp, slack)

Prompt Modes

Mode 1: Mirror Selfie (default)

Best for: outfit showcases, full-body shots, fashion content

code

make a pic of this person, but [user's context]. the person is taking a mirror selfie

Example: "wearing a santa hat" →

code

make a pic of this person, but wearing a santa hat. the person is taking a mirror selfie

Mode 2: Direct Selfie

Best for: close-up portraits, location shots, emotional expressions

code

a close-up selfie taken by herself at [user's context], direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible

Example: "a cozy cafe with warm lighting" →

code

a close-up selfie taken by herself at a cozy cafe with warm lighting, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible

Mode Selection Logic

In `auto` mode, keywords in the request select the mode. If no keyword matches, `mirror` is used.

Keywords in Request	Auto-Select Mode
outfit, wearing, clothes, dress, suit, fashion	`mirror`
cafe, restaurant, beach, park, city, location	`direct`
close-up, portrait, face, eyes, smile	`direct`
full-body, mirror, reflection	`mirror`
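
Plain substring matching on these keywords can misfire: "sparkling" contains "park" and "address" contains "dress". The sketch below matches whole words instead. `detectModeByWord` is a hypothetical name, and checking mirror keywords first when both rows match is an assumption, since the table does not state a precedence.

```typescript
type Mode = "mirror" | "direct";

// Keyword rows copied from the table above.
const MIRROR_WORDS = ["outfit", "wearing", "clothes", "dress", "suit", "fashion", "full-body", "mirror", "reflection"];
const DIRECT_WORDS = ["cafe", "restaurant", "beach", "park", "city", "location", "close-up", "portrait", "face", "eyes", "smile"];

// Builds a case-insensitive matcher that only accepts whole words.
function wordMatcher(words: string[]): RegExp {
  return new RegExp(`\\b(?:${words.join("|")})\\b`, "i");
}

const MIRROR_RE = wordMatcher(MIRROR_WORDS);
const DIRECT_RE = wordMatcher(DIRECT_WORDS);

// Assumption: mirror keywords win when both rows match; mirror is also the fallback.
function detectModeByWord(context: string): Mode {
  if (MIRROR_RE.test(context)) return "mirror";
  if (DIRECT_RE.test(context)) return "direct";
  return "mirror";
}
```

The trade-off is that whole-word matching misses inflected forms such as "dresses", so the keyword lists may need extending.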

Step 2: Edit Image with Grok Imagine

Use the fal.ai API to edit the reference image:

bash

REFERENCE_IMAGE="${CLAWRA_REFERENCE_IMAGE:-https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png}"

# Set PROMPT for ONE mode only. The direct-mode line is commented out
# so that it does not overwrite the mirror-mode prompt.

# Mode 1: Mirror Selfie
PROMPT="make a pic of this person, but <USER_CONTEXT>. the person is taking a mirror selfie"

# Mode 2: Direct Selfie (uncomment to use instead of Mode 1)
# PROMPT="a close-up selfie taken by herself at <USER_CONTEXT>, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible"

# Build JSON payload with jq (handles escaping properly)
JSON_PAYLOAD=$(jq -n \
  --arg image_url "$REFERENCE_IMAGE" \
  --arg prompt "$PROMPT" \
  '{image_url: $image_url, prompt: $prompt, num_images: 1, output_format: "jpeg"}')

curl -X POST "https://fal.run/xai/grok-imagine-image/edit" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD"

Response Format:

json

{
  "images": [
    {
      "url": "https://v3b.fal.media/files/...",
      "content_type": "image/jpeg",
      "width": 1024,
      "height": 1024
    }
  ],
  "revised_prompt": "Enhanced prompt text..."
}
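
Extracting the image URL from a response of this shape could be sketched as follows. `firstImageUrl` is a hypothetical helper. It assumes only the fields shown above and treats any body without an image URL as a failure.

```typescript
interface EditedImage {
  url: string;
  content_type: string;
  width: number;
  height: number;
}

interface EditResponse {
  images: EditedImage[];
  revised_prompt?: string;
}

// Returns the first image URL, or throws if the body has none.
// JSON.parse itself throws on a body that is not valid JSON.
function firstImageUrl(raw: string): string {
  const parsed = JSON.parse(raw) as Partial<EditResponse> | null;
  const url = parsed?.images?.[0]?.url;
  if (typeof url !== "string" || url.length === 0) {
    throw new Error("Edit response contained no image URL");
  }
  return url;
}
```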

Step 3: Send Image via OpenClaw

Use the OpenClaw CLI to send the edited image:

bash

openclaw message send \
  --action send \
  --channel "<TARGET_CHANNEL>" \
  --message "<CAPTION_TEXT>" \
  --media "<IMAGE_URL>"

Alternative: Direct API call

bash

curl -X POST "http://localhost:18789/message" \
  -H "Authorization: Bearer $OPENCLAW_GATEWAY_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "send",
    "channel": "<TARGET_CHANNEL>",
    "message": "<CAPTION_TEXT>",
    "media": "<IMAGE_URL>"
  }'
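
The same request can be assembled in TypeScript. The sketch below only builds the request, so it can be inspected or tested without a running gateway. `buildSendRequest` is a hypothetical helper; the URL, headers, and body fields are copied from the curl example above.

```typescript
interface SendRequest {
  url: string;
  init: { method: string; headers: Record<string, string>; body: string };
}

// Assumption: the gateway listens on the local address used in the curl example.
const GATEWAY_URL = "http://localhost:18789/message";

function buildSendRequest(token: string, channel: string, message: string, media: string): SendRequest {
  return {
    url: GATEWAY_URL,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      // JSON.stringify escapes quotes and newlines in the caption.
      body: JSON.stringify({ action: "send", channel, message, media }),
    },
  };
}
```

On Node.js 18 or later, the result can be passed to the built-in `fetch` as `fetch(req.url, req.init)`.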

Complete Script Example

bash

#!/bin/bash
# grok-imagine-edit-send.sh

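# Check required environment variables and tools
if [ -z "$FAL_KEY" ]; then
  echo "Error: FAL_KEY environment variable not set" >&2
  exit 1
fi
for tool in curl jq openclaw; do
  if ! command -v "$tool" >/dev/null 2>&1; then
    echo "Error: required command not found: $tool" >&2
    exit 1
  fi
done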

# Reference image (env var or default)
REFERENCE_IMAGE="${CLAWRA_REFERENCE_IMAGE:-https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png}"

USER_CONTEXT="$1"
CHANNEL="$2"
MODE="${3:-auto}"  # mirror, direct, or auto
CAPTION="${4:-Edited with Grok Imagine}"

if [ -z "$USER_CONTEXT" ] || [ -z "$CHANNEL" ]; then
  echo "Usage: $0 <user_context> <channel> [mode] [caption]"
  echo "Modes: mirror, direct, auto (default)"
  echo "Example: $0 'wearing a cowboy hat' '#general' mirror"
  echo "Example: $0 'a cozy cafe' '#general' direct"
  exit 1
fi

# Auto-detect mode based on keywords
if [ "$MODE" == "auto" ]; then
  if echo "$USER_CONTEXT" | grep -qiE "outfit|wearing|clothes|dress|suit|fashion|full-body|mirror|reflection"; then
    MODE="mirror"
  elif echo "$USER_CONTEXT" | grep -qiE "cafe|restaurant|beach|park|city|location|close-up|portrait|face|eyes|smile"; then
    MODE="direct"
  else
    MODE="mirror"  # default
  fi
  echo "Auto-detected mode: $MODE"
fi

# Construct the prompt based on mode
if [ "$MODE" == "direct" ]; then
  EDIT_PROMPT="a close-up selfie taken by herself at $USER_CONTEXT, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible"
else
  EDIT_PROMPT="make a pic of this person, but $USER_CONTEXT. the person is taking a mirror selfie"
fi

echo "Mode: $MODE"
echo "Editing reference image with prompt: $EDIT_PROMPT"

# Edit image (using jq for proper JSON escaping)
JSON_PAYLOAD=$(jq -n \
  --arg image_url "$REFERENCE_IMAGE" \
  --arg prompt "$EDIT_PROMPT" \
  '{image_url: $image_url, prompt: $prompt, num_images: 1, output_format: "jpeg"}')

RESPONSE=$(curl -s -X POST "https://fal.run/xai/grok-imagine-image/edit" \
  -H "Authorization: Key $FAL_KEY" \
  -H "Content-Type: application/json" \
  -d "$JSON_PAYLOAD")

# Extract image URL
IMAGE_URL=$(echo "$RESPONSE" | jq -r '.images[0].url')

if [ "$IMAGE_URL" == "null" ] || [ -z "$IMAGE_URL" ]; then
  echo "Error: Failed to edit image"
  echo "Response: $RESPONSE"
  exit 1
fi

echo "Image edited: $IMAGE_URL"
echo "Sending to channel: $CHANNEL"

# Send via OpenClaw
openclaw message send \
  --action send \
  --channel "$CHANNEL" \
  --message "$CAPTION" \
  --media "$IMAGE_URL"

echo "Done!"

Node.js/TypeScript Implementation

typescript

import { fal } from "@fal-ai/client";
import { exec } from "child_process";
import { promisify } from "util";

const execAsync = promisify(exec);

const REFERENCE_IMAGE = process.env.CLAWRA_REFERENCE_IMAGE || "https://cdn.jsdelivr.net/gh/SumeLabs/clawra@main/assets/clawra.png";

interface GrokImagineResult {
  images: Array<{
    url: string;
    content_type: string;
    width: number;
    height: number;
  }>;
  revised_prompt?: string;
}

type SelfieMode = "mirror" | "direct" | "auto";

function detectMode(userContext: string): "mirror" | "direct" {
  const mirrorKeywords = /outfit|wearing|clothes|dress|suit|fashion|full-body|mirror|reflection/i;
  const directKeywords = /cafe|restaurant|beach|park|city|location|close-up|portrait|face|eyes|smile/i;

  // Check mirror keywords first, matching the order used in the bash script
  if (mirrorKeywords.test(userContext)) return "mirror";
  if (directKeywords.test(userContext)) return "direct";
  return "mirror"; // default
}

function buildPrompt(userContext: string, mode: "mirror" | "direct"): string {
  if (mode === "direct") {
    return `a close-up selfie taken by herself at ${userContext}, direct eye contact with the camera, looking straight into the lens, eyes centered and clearly visible, not a mirror selfie, phone held at arm's length, face fully visible`;
  }
  return `make a pic of this person, but ${userContext}. the person is taking a mirror selfie`;
}

async function editAndSend(
  userContext: string,
  channel: string,
  mode: SelfieMode = "auto",
  caption?: string
): Promise<string> {
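  // Configure fal.ai client (fail early if the key is missing)
  const falKey = process.env.FAL_KEY;
  if (!falKey) {
    throw new Error("FAL_KEY environment variable not set");
  }
  fal.config({
    credentials: falKey
  });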

  // Determine mode
  const actualMode = mode === "auto" ? detectMode(userContext) : mode;
  console.log(`Mode: ${actualMode}`);

  // Construct the prompt
  const editPrompt = buildPrompt(userContext, actualMode);

  // Edit reference image with Grok Imagine
  console.log(`Editing image: "${editPrompt}"`);

  const result = await fal.subscribe("xai/grok-imagine-image/edit", {
    input: {
      image_url: REFERENCE_IMAGE,
      prompt: editPrompt,
      num_images: 1,
      output_format: "jpeg"
    }
  }) as { data: GrokImagineResult };

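  // Guard against a response that contains no image
  const imageUrl = result.data.images?.[0]?.url;
  if (!imageUrl) {
    throw new Error("Failed to edit image: response contained no image URL");
  }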
  console.log(`Edited image URL: ${imageUrl}`);

  // Send via OpenClaw
  const messageCaption = caption || `Edited with Grok Imagine`;

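  // Single-quote each argument (POSIX shells) so quotes or $ in a caption cannot break or inject into the command
  const q = (s: string) => `'${s.replace(/'/g, "'\\''")}'`;
  await execAsync(
    `openclaw message send --action send --channel ${q(channel)} --message ${q(messageCaption)} --media ${q(imageUrl)}`
  );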

  console.log(`Sent to ${channel}`);
  return imageUrl;
}

// Usage Examples

// Mirror mode (auto-detected from "wearing")
editAndSend(
  "wearing a cyberpunk outfit with neon lights",
  "#art-gallery",
  "auto",
  "Check out this AI-edited art!"
);
// → Mode: mirror
// → Prompt: "make a pic of this person, but wearing a cyberpunk outfit with neon lights. the person is taking a mirror selfie"

// Direct mode (auto-detected from "cafe")
editAndSend(
  "a cozy cafe with warm lighting",
  "#photography",
  "auto"
);
// → Mode: direct
// → Prompt: "a close-up selfie taken by herself at a cozy cafe with warm lighting, direct eye contact..."

// Explicit mode override
editAndSend("casual street style", "#fashion", "direct");

Supported Platforms

OpenClaw supports sending to:

Platform	Channel Format	Example
Discord	`#channel-name` or channel ID	`#general`, `123456789`
Telegram	`@username` or chat ID	`@mychannel`, `-100123456`
WhatsApp	Phone number (JID format)	`1234567890@s.whatsapp.net`
Slack	`#channel-name`	`#random`
Signal	Phone number	`+1234567890`
MS Teams	Channel reference	(varies)
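
A loose sanity check on channel strings, before handing them to OpenClaw, might look like the sketch below. The patterns are approximations inferred from the examples in the table, not official format rules, and `looksLikeChannel` is a hypothetical helper. MS Teams is omitted because its format varies.

```typescript
type Platform = "discord" | "telegram" | "whatsapp" | "slack" | "signal";

// Approximate patterns inferred from the example column above.
const CHANNEL_PATTERNS: Record<Platform, RegExp> = {
  discord: /^(#[\w-]+|\d+)$/,        // #channel-name or numeric channel ID
  telegram: /^(@\w+|-?\d+)$/,        // @username or (possibly negative) chat ID
  whatsapp: /^\d+@s\.whatsapp\.net$/, // phone number in JID format
  slack: /^#[\w-]+$/,                // #channel-name
  signal: /^\+\d+$/,                 // phone number with leading +
};

// Returns true when the channel string has the expected shape for the platform.
function looksLikeChannel(platform: Platform, channel: string): boolean {
  return CHANNEL_PATTERNS[platform].test(channel);
}
```

A check like this only catches obvious typos; whether the channel actually exists is still decided by the platform.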

Grok Imagine Edit Parameters

Parameter	Type	Default	Description
`image_url`	string	required	URL of image to edit (configured via `CLAWRA_REFERENCE_IMAGE`)
`prompt`	string	required	Edit instruction
`num_images`	1-4	1	Number of images to generate
`output_format`	enum	"jpeg"	jpeg, png, webp
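
The parameter constraints in the table can be enforced before a request is sent. The following is a minimal sketch; `buildEditPayload` is a hypothetical helper, and the defaults and limits are taken from the table above.

```typescript
type OutputFormat = "jpeg" | "png" | "webp";

interface EditPayload {
  image_url: string;
  prompt: string;
  num_images: number;
  output_format: OutputFormat;
}

// Validates the inputs against the table and fills in the documented defaults.
function buildEditPayload(
  imageUrl: string,
  prompt: string,
  numImages: number = 1,
  outputFormat: OutputFormat = "jpeg"
): EditPayload {
  if (!imageUrl) throw new Error("image_url is required");
  if (!prompt) throw new Error("prompt is required");
  if (!Number.isInteger(numImages) || numImages < 1 || numImages > 4) {
    throw new RangeError("num_images must be an integer from 1 to 4");
  }
  return { image_url: imageUrl, prompt, num_images: numImages, output_format: outputFormat };
}
```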

Setup Requirements

1. Install fal.ai client (for Node.js usage)

bash

npm install @fal-ai/client

2. Install OpenClaw CLI

bash

npm install -g openclaw

3. Configure OpenClaw Gateway

bash

openclaw config set gateway.mode=local
openclaw doctor --generate-gateway-token

4. Start OpenClaw Gateway

bash

openclaw gateway start

Error Handling

•FAL_KEY missing: Ensure the API key is set in the environment
•Image edit failed: Check the prompt content and your API quota
•OpenClaw send failed: Verify that the gateway is running and the channel exists
•Rate limits: fal.ai enforces rate limits; retry failed requests with a delay between attempts
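
Retry logic for rate-limited calls could be sketched as below, using exponential backoff. `withRetry` and `backoffDelay` are hypothetical helpers. The attempt count and base delay are illustrative defaults, not values recommended by fal.ai, and the helper retries on any error rather than inspecting HTTP status codes.

```typescript
// Delay before the retry that follows attempt number `attempt` (0-based):
// baseMs, 2 * baseMs, 4 * baseMs, ...
function backoffDelay(attempt: number, baseMs: number): number {
  return baseMs * Math.pow(2, attempt);
}

const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Runs `operation` up to `maxAttempts` times, waiting between failed attempts.
// Rethrows the last error if every attempt fails.
async function withRetry<T>(
  operation: () => Promise<T>,
  maxAttempts: number = 3,
  baseMs: number = 500
): Promise<T> {
  let lastError: unknown;
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await operation();
    } catch (error) {
      lastError = error;
      if (attempt < maxAttempts - 1) {
        await sleep(backoffDelay(attempt, baseMs));
      }
    }
  }
  throw lastError;
}
```

Wrapping only the edit call keeps a failed send from triggering a second, billable image edit.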

Tips

•Mirror mode context examples (outfit focus):
  - "wearing a santa hat"
  - "in a business suit"
  - "wearing a summer dress"
  - "in streetwear fashion"
•Direct mode context examples (location/portrait focus):
  - "a cozy cafe with warm lighting"
  - "a sunny beach at sunset"
  - "a busy city street at night"
  - "a peaceful park in autumn"
•Mode selection: Let auto-detect work, or explicitly specify for control
•Batch sending: Edit once, send to multiple channels
•Scheduling: Combine with OpenClaw scheduler for automated posts
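
The batch-sending tip can be sketched as a loop that reuses one edited image URL for several channels. `sendToChannels` is a hypothetical helper. The `send` function is injected, so the sketch makes no assumption about how OpenClaw is invoked, and a failure on one channel does not stop the rest.

```typescript
type Sender = (channel: string, imageUrl: string, caption: string) => Promise<void>;

interface BatchResult {
  sent: string[];
  failed: Array<{ channel: string; reason: string }>;
}

// Sends the same image to each channel in order, collecting successes and failures.
async function sendToChannels(
  imageUrl: string,
  caption: string,
  channels: string[],
  send: Sender
): Promise<BatchResult> {
  const result: BatchResult = { sent: [], failed: [] };
  for (const channel of channels) {
    try {
      await send(channel, imageUrl, caption);
      result.sent.push(channel);
    } catch (error) {
      const reason = error instanceof Error ? error.message : String(error);
      result.failed.push({ channel, reason });
    }
  }
  return result;
}
```

Sending sequentially keeps the order predictable; a parallel variant with `Promise.allSettled` would be faster but could hit platform rate limits sooner.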