Crafting effective prompts is essential for generating high-quality images in AI systems like Stable Diffusion, which rely on text encoders such as CLIP (Contrastive Language–Image Pretraining). Understanding CLIP’s limitations and avoiding common mistakes can significantly improve your results. This guide outlines key pitfalls to avoid and provides actionable strategies for writing better prompts.
Understanding CLIP’s Limitations
CLIP is a powerful tool for connecting text and images, but it has certain constraints that can affect how prompts are interpreted:
- Focus on Common Language: CLIP was trained on images paired with captions collected from the web. It understands commonly used words and concepts but struggles with technical, scientific, or niche terminology.
- No Understanding of Punctuation: CLIP does not parse commas or periods as grammar. Its tokenizer still turns them into tokens, so they are not literally discarded, but they carry little meaning of their own. They improve human readability and rarely change the generated image in a predictable way.
- Simplistic Language Processing: CLIP processes nouns, adjectives, and verbs effectively but often disregards articles (e.g., “the”), prepositions (e.g., “of”), and conjunctions (e.g., “and”). Extraneous words can dilute the prompt’s effectiveness.
- Not a Conversational AI: Unlike conversational assistants such as ChatGPT, CLIP behaves more like a search engine. It matches text against images and does not follow polite instructions or benefit from verbose phrasing.
Common Prompting Pitfalls and How to Avoid Them
1. Overloading Prompts with Unnecessary Words
- Pitfall: Adding articles, conjunctions, and phrases like “Create an image of” or “Please generate” spends tokens without adding any descriptive content.
- Solution: Keep prompts concise and direct. Use keywords that describe the subject, style, and mood of the image.
- Example:
- Poor: “Please create an image of a beautiful landscape with mountains and a sunrise.”

- Better: “Beautiful mountain landscape, sunrise, vibrant colors.”
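
The cleanup described in this pitfall can be sketched in a few lines of Python. This is only an illustration: the function name `strip_filler`, the instruction pattern, and the filler word list are assumptions made for this example, not part of CLIP or Stable Diffusion, and the result still needs a human pass to add style and mood keywords.

```python
import re

# Hypothetical pattern for instruction-style openers such as
# "Please create an image of" or "Can you please generate a detailed picture of".
INSTRUCTION_RE = re.compile(
    r"^(?:can you\s+)?(?:please\s+)?(?:create|generate|make|draw)\s+"
    r"(?:an?\s+)?(?:\w+\s+)?(?:image|picture|photo)\s+of\s+",
    re.IGNORECASE,
)

# Articles and conjunctions only; this list is an assumption, adjust to taste.
FILLER_WORDS = {"a", "an", "the", "and", "or"}


def strip_filler(prompt: str) -> str:
    """Drop an instruction-style opener and filler words from a prompt."""
    text = prompt.strip().rstrip(".?!")
    text = INSTRUCTION_RE.sub("", text)
    kept = [word for word in text.split() if word.lower() not in FILLER_WORDS]
    return " ".join(kept)


print(strip_filler(
    "Please create an image of a beautiful landscape with mountains and a sunrise."
))
```

The output is a stripped-down starting point ("beautiful landscape with mountains sunrise"), which you would then reorder and extend with descriptors such as "vibrant colors".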

2. Using Punctuation for Effect
- Pitfall: Adding commas or other separators in hopes of steering the output. CLIP does not interpret them as grammar, so at most they introduce small, unpredictable variation.
- Solution: Use punctuation solely for readability when sharing prompts with others, but don’t expect it to change the results.
- Example:
- “Beautiful mountain landscape, sunrise, vibrant colors” and “Beautiful mountain landscape sunrise vibrant colors” will yield similar results.
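
When comparing prompt variants, it can help to confirm that two prompts really differ only in their separators before attributing any change in the image to punctuation. The sketch below does that with a hypothetical helper, `differ_only_in_punctuation`. It compares words only and says nothing about how the model itself treats either prompt.

```python
import re


def content_words(prompt: str) -> list[str]:
    """Lowercase the prompt and keep word-like runs, dropping separators."""
    return re.findall(r"[a-z0-9]+(?:['-][a-z0-9]+)*", prompt.lower())


def differ_only_in_punctuation(a: str, b: str) -> bool:
    """True when the prompts are not identical but contain the same words in order."""
    return a != b and content_words(a) == content_words(b)


with_commas = "Beautiful mountain landscape, sunrise, vibrant colors"
without_commas = "Beautiful mountain landscape sunrise vibrant colors"
print(differ_only_in_punctuation(with_commas, without_commas))
```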
3. Expecting CLIP to Understand Technical Terms
- Pitfall: Using scientific names, jargon, or niche terminology that CLIP wasn’t trained to recognize.
- Solution: Stick to common names and descriptions that are widely understood, especially those commonly found in public datasets.
- Example:
- Poor: “Archilochus colubris perched on a branch.”

- Better: “Ruby-throated hummingbird perched on a branch.”
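
If you often start from scientific names, a small lookup table can perform this substitution for you. The following is a minimal sketch: the `COMMON_NAMES` table and the `use_common_names` function are illustrative assumptions, and the table holds only a few example entries that you would extend for your own subject area.

```python
import re

# Illustrative mapping from scientific names to common names.
COMMON_NAMES = {
    "Archilochus colubris": "ruby-throated hummingbird",
    "Canis lupus": "gray wolf",
    "Panthera leo": "lion",
}


def use_common_names(prompt: str, names: dict[str, str] = COMMON_NAMES) -> str:
    """Replace each known scientific name with its common name, ignoring case."""
    for scientific, common in names.items():
        prompt = re.sub(re.escape(scientific), common, prompt, flags=re.IGNORECASE)
    return prompt


print(use_common_names("Archilochus colubris perched on a branch."))
```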

4. Writing Prompts Like a Conversation
- Pitfall: Treating CLIP as if it were a conversational AI by including polite instructions or overly verbose phrasing.
- Solution: Write prompts as you would a search query—brief and to the point.
- Example:
- Poor: “Can you please generate an image of a sunset over a calm ocean?”

- Better: “Sunset over calm ocean, serene, warm colors.”

5. Ignoring CLIP’s Training Context
- Pitfall: Expecting CLIP to understand abstract or culturally specific references that aren’t commonly captured in public datasets.
- Solution: Use language that aligns with the general population’s understanding, such as descriptions familiar to an American teenager.
- Example:
- Poor: “Baroque architecture with intricate rococo detailing.”

- Better: “Elegant European palace, ornate design, gold accents.”

Practical Examples and Case Studies
Example 1: Simplifying Verbose Prompts
- Verbose Prompt: “Please create a detailed image of a forest in autumn with colorful leaves falling from the trees.”

- Optimized Prompt: “Autumn forest, colorful leaves, falling, serene atmosphere.”

Example 2: Avoiding Scientific Terminology
- Scientific Prompt: “Archilochus colubris perched on a blooming flower.”

- Optimized Prompt: “Ruby-throated hummingbird on blooming flower, vivid colors.”

Example 3: Using Common Descriptions
- Abstract Prompt: “A surreal depiction of existential dread.”

- Optimized Prompt: “Dark surreal art, eerie atmosphere, abstract shapes.”
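
The optimized prompts in these examples share one shape: a subject followed by short descriptors. That shape can be captured in a tiny helper, sketched here under the assumption that a comma-separated keyword list is the format you want. The name `build_prompt` is hypothetical.

```python
def build_prompt(subject: str, *descriptors: str) -> str:
    """Join a subject and optional descriptors into a comma-separated prompt."""
    parts = [subject.strip()]
    # Skip empty descriptors so optional fields can be passed as "".
    parts += [d.strip() for d in descriptors if d.strip()]
    return ", ".join(parts)


print(build_prompt("Autumn forest", "colorful leaves", "falling", "serene atmosphere"))
```

Keeping subject and descriptors separate makes it easy to swap one descriptor at a time and compare the resulting images.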

Key Takeaways for Writing Effective Prompts
- Keep It Simple: Focus on nouns, adjectives, and verbs that clearly describe the subject and desired style.
- Avoid Extraneous Words: Skip articles, filler prepositions, and polite instructions. They add little to the output.
- Use Common Language: Stick to widely recognized terms and avoid overly technical or obscure references.
- Don’t Overthink Punctuation: Use punctuation for readability, but don’t expect it to influence the image generation.
- Align with CLIP’s Training Data: Think of prompts as captions for generic photos—language that resonates with the general public.