Intro to Generative AI: The Big Picture of Image, Video, Text, Audio

AI Navigate Original / 5/16/2026


Key Points

  • Generative AI spans text, image, video, audio, and 3D; know what each one covers
  • Text tools can be used for free; image, video, and audio tools are mostly paid; 3D is still experimental
  • A single work often combines several AI tools; keep your own touch in the result
  • Start by genre: text→ChatGPT, image→Midjourney, video→Runway

The Big Picture of Generative AI

"Generative AI" spans many areas: image, video, text, audio, and 3D. Before starting out as a creator, get clear on which kinds exist and what each is good at.

5 Main Categories

1. Text Generation

  • Main: ChatGPT, Claude, Gemini
  • Uses: articles, novels, scenarios, social, copy
  • Free tiers are enough to get started

2. Image Generation

  • Main: Midjourney, DALL-E, Stable Diffusion, FLUX
  • Uses: illustration, photo-style, concept art, manga
  • Mostly paid (pricing and plan names change often, so check each official site)

3. Video Generation

  • Main: Google Veo, Runway, Kling, etc. (OpenAI's Sora offering has changed, so check its current status)
  • Uses: ads, social video, prototypes, visual expression
  • Pricier, with longer generation times
