AI Models Overview

ShoeCatch provides the latest AI models tailored for footwear designers.
The lineup ranges from models by leading research labs such as OpenAI, Google, and Black Forest Labs to our own models specialized in footwear design, and each model offers unique strengths.
Experiment with different models to find the one that best fits your project.

Image Generation & Editing Models

Image Nodes connect text and image inputs, allowing you to iterate on design ideas, experiment with styles and colors, and combine visuals into cohesive stories.

Model	Input	Description
GPT Image	Text, Image + Text	OpenAI’s standard model for image generation and editing.
GPT Image 1.5	Text, Image + Text	OpenAI’s Pro-quality model for image generation and editing.
Flux dev	Text	Flexible image generation model for experimental use.
Flux Pro Kontext Max	Text, Image + Text	Intelligent generation model with advanced contextual understanding.
Flux Pro 1.1	Text, Image + Text	Professional-grade image generation model.
Imagen4 Ultra	Text, Image + Text	Next-generation image synthesis preview model by Google.
Nano Banana	Text, Image + Text	Basic image editing model optimized by Google.
Nano Banana2	Text, Image + Text	Google’s latest model for generating and editing Pro-quality images at Flash speed.
Nano Banana Pro	Text, Image + Text	High-performance image editing model optimized by Google.
ShoeCatch 1.0	Text, Image + Text	ShoeCatch standard image model, optimized for speed and reliability.
Qwen	Image + Text	Specialized for selective edits such as logos, text, and partial adjustments.
Reve Remix	Image + Text	Reimagines style while preserving the essence of the original image.
Reve Edit	Image + Text	Precise model for refining text, objects, and fine details.
Seedream 4.0	Image + Text	Unified model for contextual understanding, generation, and editing.
Remove Background	Image	One-click automatic background removal.
Topaz (Upscaler)	Image	Fast, high-quality image upscaling tool.
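
The Input column above determines which models fit a given node setup. As a rough illustration only, the following Python sketch copies a few rows from the table into a plain dictionary and filters them by the input you have on hand. The dictionary layout and the `models_accepting` helper are assumptions made for this example, not part of any ShoeCatch API.

```python
# Hypothetical lookup built from a few rows of the table above.
# Each model name maps to the set of input modes it accepts.
IMAGE_MODELS = {
    "GPT Image": {"Text", "Image + Text"},
    "Flux dev": {"Text"},
    "Qwen": {"Image + Text"},
    "Reve Edit": {"Image + Text"},
    "Remove Background": {"Image"},
    "Topaz (Upscaler)": {"Image"},
}


def models_accepting(input_mode: str, catalog: dict[str, set[str]] = IMAGE_MODELS) -> list[str]:
    """Return the models in `catalog` that accept `input_mode`, sorted by name."""
    return sorted(name for name, modes in catalog.items() if input_mode in modes)


if __name__ == "__main__":
    print(models_accepting("Text"))
    print(models_accepting("Image"))
```

Keeping the input modes as exact strings from the table means a model that only edits existing images, such as Qwen, never shows up when you search for text-only generation.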

Video Models

Video Nodes turn text prompts and static images into motion, enabling dynamic storytelling and immersive experiences.

Model	Input	Description
Wan 2.2 / Wan 2.5	Text, Image + Text	Accurate generation with strong text comprehension.
Veo 2	Text, Image + Text	Converts still images into smooth, natural motion.
Veo 3	Text, Image + Text	Advanced AI model for cinematic video creation.
Kling Pro 1.6	Text, Image + Text	Generates lifelike, realistic motion videos.
Kling 2.5 Turbo Pro	Image + Text	Produces natural motion with cinematic realism.
Kling 2.5 Turbo Standard	Image + Text	Stable, balanced motion generation for general use.
Kling 3.0 Standard	Text, Image + Text	Produces fast, cost-effective cinematic clips for iteration and prototyping.
Kling 3.0 Pro	Text, Image + Text	Produces higher-fidelity cinematic video with stronger temporal coherence for final-quality shots.
Gen4-Turbo	Image + Text	Produces realistic motion with strong subject/style coherence for rapid iteration.

Text Models

Text models help you organize text, write prompts, and generate image descriptions from text or image inputs.
The generated text can then be reused as input for image or video generation.

Model	Input	Description
GPT-5.5	Text, Image	A high-performance OpenAI text model for complex planning, research, document organization, code, and data analysis. Best for long context and multi-step instructions.
GPT-5.4 mini	Text, Image	A fast and efficient lightweight model in the GPT-5.4 family. Best for repeated prompt writing, image descriptions, and short text refinement.
Claude Opus 4.6	Text, Image	A high-performance Anthropic model for complex reasoning, long document analysis, advanced writing, and structured thinking.
Claude Sonnet 4.6	Text, Image	A balanced Claude model for quality and speed. Reliable for prompt writing, document summaries, and design descriptions.
Gemini 3.1 Pro	Text, Image	A high-performance Google model for complex multimodal reasoning, long context, and image- or document-based analysis.
Gemini 3.1 Flash	Text, Image	A fast Google model for quick responses and repeated tasks. Useful for simple prompt refinement, image descriptions, and high-volume text generation.

Use text models to organize design concepts, analyze reference images, refine prompts, and write collection descriptions.
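
The hand-off described in this section, where generated text becomes the prompt for an image or video model, can be pictured as a simple chain. This is a minimal Python sketch under stated assumptions, not ShoeCatch's implementation: `describe_reference` is a stand-in for a text model call, and `ImageRequest`, `build_image_request`, and their field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class ImageRequest:
    """Hypothetical request passed to an image node."""
    model: str
    prompt: str


def describe_reference(notes: str) -> str:
    """Stand-in for a text model: tidies rough notes into a one-line prompt.

    A real text model would rewrite the notes; this stub only normalizes whitespace.
    """
    return " ".join(notes.split())


def build_image_request(model: str, generated_text: str) -> ImageRequest:
    """Use text-model output as the prompt for an image model."""
    if not generated_text:
        raise ValueError("text model returned an empty prompt")
    return ImageRequest(model=model, prompt=generated_text)


if __name__ == "__main__":
    prompt = describe_reference("  low-top sneaker,\n  recycled knit upper,  gum sole ")
    print(build_image_request("ShoeCatch 1.0", prompt))
```

Rejecting an empty prompt before the image step keeps a failed text generation from silently producing an unguided image.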

Manufacturing Integration Models

ShoeCatch’s proprietary models analyze design images to predict manufacturability, cost, and production feasibility.

Key Capabilities
- Estimate cost, lead time, and potential production risks during the design stage.
- Bridge the gap between design and manufacturing through digital sampling and AI-driven analysis.

Analysis Items
- Estimated manufacturing cost range
- Minimum order quantity (MOQ)
- Expected production lead time
- Potential manufacturing risks
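
If you record these analysis items alongside a design, one way to hold them is a small typed record. The Python sketch below is illustrative only: the `ManufacturingAnalysis` class, its field names, and the choice of days as the lead-time unit are assumptions, and the actual output format of the ShoeCatch models is not specified here.

```python
from dataclasses import dataclass, field


@dataclass
class ManufacturingAnalysis:
    """Hypothetical container for the four analysis items listed above."""
    cost_min: float       # lower bound of the estimated cost range
    cost_max: float       # upper bound of the estimated cost range
    moq: int              # minimum order quantity
    lead_time_days: int   # expected production lead time (unit assumed: days)
    risks: list[str] = field(default_factory=list)  # potential manufacturing risks

    def __post_init__(self) -> None:
        # Basic sanity checks so an inverted range or negative value fails early.
        if self.cost_min < 0 or self.cost_max < self.cost_min:
            raise ValueError("cost range must satisfy 0 <= cost_min <= cost_max")
        if self.moq < 1 or self.lead_time_days < 0:
            raise ValueError("moq must be >= 1 and lead_time_days must be >= 0")

    def summary(self) -> str:
        """One-line summary suitable for a design review note."""
        return (
            f"cost {self.cost_min:.2f}-{self.cost_max:.2f}, MOQ {self.moq}, "
            f"lead time {self.lead_time_days} days, {len(self.risks)} risk(s)"
        )


if __name__ == "__main__":
    # Placeholder values for illustration, not real estimates.
    example = ManufacturingAnalysis(12.5, 18.0, 500, 45, ["complex outsole mold"])
    print(example.summary())
```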

Several additional models are currently in closed beta.
Stay tuned as we continue to expand the ShoeCatch AI ecosystem. 👟