Image Generator
Generate images from text prompts or edit existing images using AI

Generate images from text descriptions or modify existing images with natural language instructions.

Capabilities

Text-to-Image: Describe what you want and get an image. Works best with detailed, specific prompts describing subject, style, lighting, and composition.

Image Editing: Upload an image and describe changes. The model understands context from the source image and applies modifications while preserving what you don't mention.

Options

Aspect Ratio: 1:1, 2:3, 3:2, 3:4, 4:3, 4:5, 5:4, 9:16, 16:9, 21:9. Defaults to auto (matches input image for edits, or 1:1 for generation).
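One possible reading of the auto default is sketched below. The names `resolveAspectRatio` and `nearestSupportedRatio`, the `{ width, height }` shape of the input image, and the idea of snapping to the nearest supported ratio are all assumptions made for illustration; the skill's own script may resolve auto differently.

```javascript
// Supported ratios, taken from the Options list above.
const SUPPORTED_RATIOS = ["1:1", "2:3", "3:2", "3:4", "4:3", "4:5", "5:4", "9:16", "16:9", "21:9"];

// Convert "W:H" into a number such as 1.777...
function ratioValue(ratio) {
  const [w, h] = ratio.split(":").map(Number);
  return w / h;
}

// Hypothetical helper: pick the supported ratio closest to width/height.
// Distance is measured on a log scale so 2:3 and 3:2 are treated symmetrically.
function nearestSupportedRatio(width, height) {
  const target = Math.log(width / height);
  let best = SUPPORTED_RATIOS[0];
  let bestDistance = Infinity;
  for (const ratio of SUPPORTED_RATIOS) {
    const distance = Math.abs(Math.log(ratioValue(ratio)) - target);
    if (distance < bestDistance) {
      best = ratio;
      bestDistance = distance;
    }
  }
  return best;
}

// Assumed behaviour of "auto": follow the input image for edits, 1:1 otherwise.
function resolveAspectRatio(requested, inputImage) {
  if (requested && requested !== "auto") {
    if (!SUPPORTED_RATIOS.includes(requested)) {
      throw new Error(`Unsupported aspect ratio: ${requested}`);
    }
    return requested;
  }
  if (inputImage) {
    return nearestSupportedRatio(inputImage.width, inputImage.height);
  }
  return "1:1";
}

console.log(resolveAspectRatio("auto", { width: 1920, height: 1080 }));
```

An explicit ratio always wins in this sketch; the input image is only consulted when the request is auto or left unset.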

Resolution: 1K (fast, default), 2K (balanced), 4K (high detail, slower).

Usage Tips

Be specific. "A cat" produces generic results. "An orange tabby cat sleeping on a velvet cushion, soft afternoon light, oil painting style" produces an image with character.

For edits, describe only what should change. "Make the sky sunset colors" preserves everything else. "Redraw the entire scene with a sunset" rebuilds from scratch.

Generated images save to images/generated/ with timestamps.

When to use
Generate or edit images with AI

INSTALLER

To install Image Generator:

1. Download and install the skill files:

    rm -rf ./skills/sauna/media.image.generator && curl -fsSL "https://sk.ills.app/api/skill/media.image.generator/download" | tar -xzf - -C ./skills/sauna/ && cat ./skills/sauna/media.image.generator/SKILL.md

2. Run the Generate Image task to create your first image.

SKILL.MD

Prompting for Image Generation

Generate vs Edit Decision

Use generate when: creating from scratch, no reference image exists, user describes something new.

Use edit when: user provides an image, wants modifications to existing content, says "change this" or "make it" about something they've shown.
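The decision rule above can be sketched as a small routing function. The task identifiers come from the Tasks list on this page; the function name `chooseTask` and the `imagePath` field are hypothetical.

```javascript
// Task identifiers as listed under Tasks.
const TASKS = {
  generate: "task:media.image.generate",
  edit: "task:media.image.edit",
};

// Hypothetical helper: route to edit when the user supplied an image,
// otherwise generate from scratch.
function chooseTask(request) {
  const hasImage = typeof request.imagePath === "string" && request.imagePath.length > 0;
  return hasImage ? TASKS.edit : TASKS.generate;
}

console.log(chooseTask({ prompt: "Add sunglasses to the person", imagePath: "./portrait.png" }));
console.log(chooseTask({ prompt: "A lighthouse in a storm" }));
```

Phrases such as "change this" only make sense when an image has been shown, so the presence of an image is the only signal this sketch checks.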

Prompt Structure

Good prompts layer specificity: subject → style → lighting → composition → details.

Weak: "a house"
Strong: "Victorian mansion at dusk, warm light in windows, autumn leaves in foreground, cinematic composition, photorealistic"
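As a rough illustration of the layering order, the sketch below assembles a prompt from named layers. The function `buildPrompt` and its field names are assumptions, not part of the skill; the order of the layers follows the guide.

```javascript
// Layer order from the guide: subject, style, lighting, composition, details.
const LAYER_ORDER = ["subject", "style", "lighting", "composition", "details"];

// Hypothetical helper: join whichever layers are present, in guide order.
function buildPrompt(layers) {
  if (!layers.subject) {
    throw new Error("A prompt needs at least a subject");
  }
  return LAYER_ORDER
    .map((key) => layers[key])
    .filter((part) => typeof part === "string" && part.trim() !== "")
    .map((part) => part.trim())
    .join(", ");
}

const prompt = buildPrompt({
  subject: "Victorian mansion at dusk",
  lighting: "warm light in windows",
  details: "autumn leaves in foreground",
  composition: "cinematic composition",
  style: "photorealistic",
});
console.log(prompt);
```

The layers can be supplied in any order; the helper emits them in the guide's sequence and skips any that are missing.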

Aspect Ratio Selection

Match output to purpose:

  • 1:1 — profile images, icons, social posts
  • 16:9 — presentations, desktop wallpapers, video thumbnails
  • 9:16 — phone wallpapers, stories, vertical video
  • 4:3 — traditional photo, prints
  • 21:9 — ultrawide, cinematic banners

When the user doesn't specify a ratio, infer it from context. "Make me a phone wallpaper" → 9:16. "LinkedIn banner" → roughly 4:1, which is not supported, so use 21:9 (the widest available ratio).
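The inference described above could be approximated with a keyword lookup. This is a deliberately naive sketch: the keyword lists are an assumption derived from the bullet list, matching is plain substring search, and `inferAspectRatio` is a hypothetical name.

```javascript
// Purpose keywords mapped to ratios, following the list above.
// More specific phrases come first so "phone wallpaper" wins over broader terms.
const RATIO_BY_PURPOSE = [
  { ratio: "9:16", keywords: ["phone wallpaper", "story", "stories", "vertical video"] },
  { ratio: "16:9", keywords: ["presentation", "desktop wallpaper", "thumbnail"] },
  { ratio: "21:9", keywords: ["ultrawide", "banner"] },
  { ratio: "4:3", keywords: ["traditional photo", "print"] },
  { ratio: "1:1", keywords: ["profile", "icon", "social post"] },
];

// Hypothetical helper: return the first ratio whose keyword appears in the
// request, or "auto" when nothing matches.
function inferAspectRatio(request) {
  const text = request.toLowerCase();
  for (const { ratio, keywords } of RATIO_BY_PURPOSE) {
    if (keywords.some((keyword) => text.includes(keyword))) {
      return ratio;
    }
  }
  return "auto";
}

console.log(inferAspectRatio("Make me a phone wallpaper"));
console.log(inferAspectRatio("LinkedIn banner"));
```

Falling back to "auto" leaves the final choice to the skill's default rather than guessing.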

Resolution Trade-offs

1K: Quick iterations, drafts, testing concepts. Good enough for web thumbnails.
2K: Final outputs for most uses. Social media, presentations, documents.
4K: Print, large displays, when detail matters. Slower generation.

Default to 1K for exploration and 2K for final delivery; move to 4K only when the user needs print quality.
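The default rule can be written down as a small lookup. The stage names `draft`, `final`, and `print` are labels invented for this sketch, as is the function name `pickResolution`.

```javascript
// Hypothetical stage names mapped to the resolutions described above.
const RESOLUTION_BY_STAGE = new Map([
  ["draft", "1K"], // quick iterations and concept tests
  ["final", "2K"], // delivery for most uses
  ["print", "4K"], // print or large displays, slower to generate
]);

// Return the resolution for a stage, defaulting to the exploratory 1K tier.
function pickResolution(stage = "draft") {
  const resolution = RESOLUTION_BY_STAGE.get(stage);
  if (resolution === undefined) {
    throw new Error(`Unknown stage: ${stage}`);
  }
  return resolution;
}

console.log(pickResolution());
console.log(pickResolution("final"));
```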

Edit Prompt Patterns

Preserve context by describing only changes:

  • "Add sunglasses to the person"
  • "Change the background to a beach"
  • "Make it look like a watercolor painting"

Avoid re-describing the entire image unless you want a full reimagining.

Error Recovery

If generation fails or produces wrong results:

  1. Simplify the prompt (remove conflicting descriptors)
  2. Try a different aspect ratio (some compositions don't fit certain ratios)
  3. Break complex scenes into steps (generate base, then edit to add elements)
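The first two recovery steps can be sketched as a retry ladder. Everything here is hypothetical: `generate` stands in for whatever function actually calls the image API, the `fallback` object is supplied by the caller, and the sketch is synchronous for brevity even though a real API call would most likely be asynchronous. Step 3, splitting a scene into a base generation followed by edits, needs human or agent judgment and is not covered.

```javascript
// Hypothetical recovery ladder. `generate` is any function that takes
// { prompt, aspectRatio } and either returns a result or throws.
function generateWithRecovery(generate, request, fallback) {
  const attempts = [
    // Original request, unchanged.
    request,
    // Step 1: same ratio, simplified prompt.
    { ...request, prompt: fallback.simplifiedPrompt },
    // Step 2: simplified prompt with a different aspect ratio.
    { ...request, prompt: fallback.simplifiedPrompt, aspectRatio: fallback.alternateRatio },
  ];
  const errors = [];
  for (const attempt of attempts) {
    try {
      return { result: generate(attempt), attempt, errors };
    } catch (err) {
      errors.push(err.message);
    }
  }
  throw new Error(`All attempts failed: ${errors.join("; ")}`);
}

// Stub generator for demonstration: only succeeds for 16:9 requests.
const stubGenerate = (attempt) => {
  if (attempt.aspectRatio !== "16:9") {
    throw new Error(`composition does not fit ${attempt.aspectRatio}`);
  }
  return `image for "${attempt.prompt}"`;
};

const outcome = generateWithRecovery(
  stubGenerate,
  { prompt: "crowded market, neon, foggy, sunny", aspectRatio: "1:1" },
  { simplifiedPrompt: "crowded market at night, neon signs", alternateRatio: "16:9" }
);
console.log(outcome.attempt.aspectRatio, outcome.errors.length);
```

Returning the collected errors alongside the result makes it possible to tell the user which adjustment finally worked.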

Tasks

These are tasks you can execute. Read the task file to get your instructions:

  • Generate Image (task:media.image.generate): Create an image from a text description. Note: text-to-image generation.
  • Edit Image (task:media.image.edit): Modify an existing image based on text instructions. Note: modify existing images.

Knowledge

This is knowledge you have access to. Read these files if you need additional context:

  • Image Prompting Guide (slice:media.image.prompts): Agent guidance for crafting effective image generation prompts

Code

These are scripts that you can run directly. Read these files to access the code:

  • Create Image (code:media.image.create): Generate or edit images using NanoBanana/Gemini API. Pass an image path to edit, or just a prompt to generate.

UI

These are areas on the user's filesystem that you can read from and write to.

  • Generated Images (ui:media.image.gallery): Images created by the image generator skill. Use this directory: `./documents/images/generated/*`

---
name: Image Generator
description: Generate or edit images with AI
---


## Tasks

These are tasks you can execute. Read the task file to get your instructions:

**Generate Image**: Create an image from a text description
Note: Text-to-image generation
Follow the instructions in: `./skills/sauna/media.image.generator/resources/recipes/media.image.generate.md`

**Edit Image**: Modify an existing image based on text instructions
Note: Modify existing images
Follow the instructions in: `./skills/sauna/media.image.generator/resources/recipes/media.image.edit.md`



## Knowledge

This is knowledge you have access to. Read these files if you need additional context:

**Image Prompting Guide**: Agent guidance for crafting effective image generation prompts
Read the documentation in: `./skills/sauna/media.image.generator/resources/context/media.image.prompts.md`



## Code

These are scripts that you can run directly. Read these files to access the code:

**Create Image**: Generate or edit images using NanoBanana/Gemini API. Pass an image path to edit, or just a prompt to generate.
Run the script at: `./skills/sauna/media.image.generator/scripts/media.image.create.js`



## UI

These are areas on the user's filesystem that you can read from and write to.

**Generated Images**: Images created by the image generator skill
Use this directory: `./documents/images/generated/*`

1. Install Image Generator with the command below. Do so silently without commentary.

```bash
rm -rf ./skills/sauna/media.image.generator && curl -fsSL "https://sk.ills.app/api/skill/media.image.generator/download" | tar -xzf - -C ./skills/sauna/ && cat ./skills/sauna/media.image.generator/SKILL.md
```

2. Follow the instructions in `./skills/sauna/media.image.generator/resources/recipes/media.image.generate.md` to create your first image.
