AI Models Overview

ShoeCatch offers a range of current AI models for footwear designers.
The lineup spans models from leading research labs such as OpenAI, Google, and Black Forest Labs, as well as our own models specialized in footwear design, and each has its own strengths.
Experiment with different models to find the one that best fits your project.

Image Generation & Editing Models

Image Nodes connect text and image inputs, allowing you to iterate on design ideas, experiment with styles and colors, and combine visuals into cohesive stories.

| Model | Input | Description |
| --- | --- | --- |
| GPT Image | Text, Image + Text | OpenAI’s standard model for image generation and editing. |
| GPT Image 1.5 | Text, Image + Text | OpenAI’s Pro-quality model for image generation and editing. |
| Flux dev | Text | Flexible image generation model for experimental use. |
| Flux Pro Kontext Max | Text, Image + Text | Intelligent generation model with advanced contextual understanding. |
| Flux Pro 1.1 | Text, Image + Text | Professional-grade image generation model. |
| Imagen4 Ultra | Text, Image + Text | Next-generation image synthesis preview model by Google. |
| Nano Banana | Text, Image + Text | Basic image editing model optimized by Google. |
| Nano Banana2 | Text, Image + Text | Pro-quality images at Flash speed. Generate and edit with Google’s latest model. |
| Nano Banana Pro | Text, Image + Text | High-performance image editing model optimized by Google. |
| ShoeCatch 1.0 | Text, Image + Text | ShoeCatch standard image model, optimized for speed and reliability. |
| Qwen | Image + Text | Specialized for selective edits such as logos, text, and partial adjustments. |
| Reve Remix | Image + Text | Reimagines style while preserving the essence of the original image. |
| Reve Edit | Image + Text | Precise model for refining text, objects, and fine details. |
| Seedream 4.0 | Image + Text | Unified model for contextual understanding, generation, and editing. |
| Remove Background | Image | One-click automatic background removal. |
| Topaz (Upscaler) | Image | Fast, high-quality image upscaling tool. |
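
If you keep your own notes on which model to reach for, the Input column of the model list above can be treated as a simple lookup. The following is a minimal sketch, not a ShoeCatch API: the `IMAGE_MODELS` dictionary, the input labels (`"text"`, `"image+text"`, `"image"`), and the `models_accepting` helper are hypothetical names used only for illustration, with the data transcribed from the list above.

```python
# Hypothetical catalog transcribed from the image model list above.
# This is illustrative data, not an official ShoeCatch interface.
IMAGE_MODELS = {
    "GPT Image": {"text", "image+text"},
    "GPT Image 1.5": {"text", "image+text"},
    "Flux dev": {"text"},
    "Flux Pro Kontext Max": {"text", "image+text"},
    "Flux Pro 1.1": {"text", "image+text"},
    "Imagen4 Ultra": {"text", "image+text"},
    "Nano Banana": {"text", "image+text"},
    "Nano Banana2": {"text", "image+text"},
    "Nano Banana Pro": {"text", "image+text"},
    "ShoeCatch 1.0": {"text", "image+text"},
    "Qwen": {"image+text"},
    "Reve Remix": {"image+text"},
    "Reve Edit": {"image+text"},
    "Seedream 4.0": {"image+text"},
    "Remove Background": {"image"},
    "Topaz (Upscaler)": {"image"},
}


def models_accepting(input_type: str) -> list[str]:
    """Return the names of models that accept the given input type, sorted."""
    return sorted(
        name for name, inputs in IMAGE_MODELS.items() if input_type in inputs
    )


if __name__ == "__main__":
    # Models that work from an image alone (utility tools in the list above).
    print(models_accepting("image"))
```

For example, `models_accepting("image+text")` lists every model that can edit or restyle an existing design image from a prompt, which is a reasonable starting set when iterating on a reference photo.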


Video Models

Video Nodes turn text prompts and static images into motion, enabling dynamic storytelling and immersive experiences.

| Model | Input | Description |
| --- | --- | --- |
| Wan 2.2 / Wan 2.5 | Text, Image + Text | Accurate generation with strong text comprehension. |
| Veo 2 | Text, Image + Text | Converts still images into smooth, natural motion. |
| Veo 3 | Text, Image + Text | Advanced AI model for cinematic video creation. |
| Kling Pro 1.6 | Text, Image + Text | Generates lifelike, realistic motion videos. |
| Kling 2.5 Turbo Pro | Image + Text | Produces natural motion with cinematic realism. |
| Kling 2.5 Turbo Standard | Image + Text | Stable, balanced motion generation for general use. |
| Kling 3.0 Standard | Text, Image + Text | Produces fast, cost-effective cinematic clips for iteration and prototyping. |
| Kling 3.0 Pro | Text, Image + Text | Produces higher-fidelity cinematic video with stronger temporal coherence for final-quality shots. |
| Gen4-Turbo | Image + Text | Produces realistic motion with strong subject/style coherence for rapid iteration. |


Text Models

Text models help you organize text, write prompts, and generate image descriptions from text or image inputs.
The generated text can then be reused as input for image or video generation.

| Model | Input | Description |
| --- | --- | --- |
| GPT-5.5 | Text, Image | A high-performance OpenAI text model for complex planning, research, document organization, code, and data analysis. Best for long context and multi-step instructions. |
| GPT-5.4 mini | Text, Image | A fast and efficient lightweight model in the GPT-5.4 family. Best for repeated prompt writing, image descriptions, and short text refinement. |
| Claude Opus 4.6 | Text, Image | A high-performance Anthropic model for complex reasoning, long document analysis, advanced writing, and structured thinking. |
| Claude Sonnet 4.6 | Text, Image | A balanced Claude model for quality and speed. Reliable for prompt writing, document summaries, and design descriptions. |
| Gemini 3.1 Pro | Text, Image | A high-performance Google model for complex multimodal reasoning, long context, and image- or document-based analysis. |
| Gemini 3.1 Flash | Text, Image | A fast Google model for quick responses and repeated tasks. Useful for simple prompt refinement, image descriptions, and high-volume text generation. |

Use text models to organize design concepts, analyze reference images, refine prompts, and write collection descriptions.


Manufacturing Integration Models

ShoeCatch’s proprietary models analyze design images to predict manufacturability, cost, and production feasibility.

  • Key Capabilities

    • Estimate cost, lead time, and potential production risks during the design stage.

    • Bridge the gap between design and manufacturing through digital sampling and AI-driven analysis.

  • Analysis Items

    • Estimated manufacturing cost range

    • Minimum order quantity (MOQ)

    • Expected production lead time

    • Potential manufacturing risks
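
If you want to record these analysis items alongside your own design notes, one possible shape for a single result is sketched below. This is an assumption for illustration only: the `ManufacturingAnalysis` class, its field names, the per-pair cost unit, and the sample values are all hypothetical and do not describe ShoeCatch's actual output format.

```python
from dataclasses import dataclass, field


@dataclass
class ManufacturingAnalysis:
    """Hypothetical record mirroring the four analysis items listed above."""

    cost_min: float        # lower bound of the estimated cost range (assumed per pair)
    cost_max: float        # upper bound of the estimated cost range (assumed per pair)
    moq: int               # minimum order quantity
    lead_time_days: int    # expected production lead time, in days
    risks: list[str] = field(default_factory=list)  # potential manufacturing risks

    def summary(self) -> str:
        """Format the record as a one-line summary for design notes."""
        risk_text = "; ".join(self.risks) if self.risks else "none flagged"
        return (
            f"Cost {self.cost_min:.2f}-{self.cost_max:.2f} per pair, "
            f"MOQ {self.moq}, lead time {self.lead_time_days} days, "
            f"risks: {risk_text}"
        )


if __name__ == "__main__":
    # Placeholder values, not real estimates.
    example = ManufacturingAnalysis(
        cost_min=18.0,
        cost_max=24.5,
        moq=500,
        lead_time_days=60,
        risks=["complex sole mold"],
    )
    print(example.summary())
```

Keeping the cost as a range rather than a single number matches how the estimate is described above, and an empty `risks` list is reported explicitly so that "no risks found" is not confused with missing data.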

Several additional models are currently in closed beta.
Stay tuned as we continue to expand the ShoeCatch AI ecosystem. 👟