CanvasX

AI-powered mobile UI/UX design tool that generates beautiful, production-ready mobile app screens from text prompts in minutes — powered by Google Gemini AI and real-time streaming.

Technology Stack

Next.js
TypeScript
React
Tailwind CSS
NeonDB (PostgreSQL)
Prisma ORM
Shadcn UI
Zod
Gemini AI
Vercel AI SDK
Inngest
NextAuth
Puppeteer
Unsplash API
Polar.sh
Upstash Redis
Motion

Key Challenges

  • Two-Phase AI Generation Pipeline
  • Real-time Streaming with Inngest
  • 22-Theme CSS Variable System
  • AI Prompt Engineering for Pixel-Perfect HTML
  • Interactive Canvas with Zoom/Pan/Pinch
  • Server-Side Screenshot Export via Puppeteer
  • Subscription-Gated Feature Access
  • Rate Limiting & Redis Caching
  • Iterative Screen Generation with Context

Key Learnings

  • Google Gemini AI Integration (Structured Output + Tool Use)
  • Inngest Background Jobs & Realtime Streaming
  • Advanced AI Prompt Engineering for UI Generation
  • CSS Variable-Based Theming Architecture
  • Prisma ORM with NeonDB
  • Polar.sh Payment & Subscription Integration
  • NextAuth v5 (Google OAuth + Credentials)
  • Puppeteer Server-Side Rendering
  • Upstash Redis Caching & Rate Limiting
  • Vercel AI SDK with Tool Calling

CanvasX: AI-Powered Mobile UI/UX Design Tool

Overview

CanvasX is a full-stack AI application that transforms text prompts into Dribbble-quality, production-ready mobile app screens in under 90 seconds. Built with Next.js 16 and powered by Google Gemini 2.5 Flash, it uses a two-phase AI generation pipeline — first planning the screen architecture, then generating pixel-perfect HTML/CSS — all streamed to the user in real-time via Inngest. Users describe the app they want, pick from 22 curated design themes, and watch their screens materialize live inside interactive iPhone device frames on a zoomable canvas.

What Users Can Do

  • Text-to-UI Generation: Describe any mobile app idea in natural language and get 2–3 production-ready screens generated automatically
  • 22 Built-in Themes: Choose from curated themes like Ocean Breeze, Netflix Dark, Acid Lime, Neo-Brutalism, Glassmorphism, Cyber, Midnight, and more
  • Interactive Canvas: Zoom, pan, and pinch to navigate generated screens rendered inside realistic iPhone device frames
  • Real-time Streaming: Watch screens generate live with progress stages — analyzing, planning, generating, completing
  • Iterative Design: Add more screens to existing projects with context-aware generation that maintains design consistency
  • Theme Switching: Instantly switch between all 22 themes on the canvas to preview different visual styles
  • PNG Export: Download high-quality screenshots of individual screens via server-side Puppeteer rendering
  • HTML Code View: Inspect and copy the raw HTML/CSS code behind any generated screen
  • Project Management: Create, browse, and manage multiple design projects from a unified dashboard
  • Subscription Plans: Free (2 projects, 10 generations/month), Pro ($6/month), and Unlimited ($20/month) tiers

Why I Built This

I built CanvasX to solve real pain points in the early-stage mobile app design process:

  • Design Bottleneck: Turning an app idea into visual mockups traditionally requires hours of manual design work or expensive tools
  • Tool Complexity: Tools like Figma have a steep learning curve for developers who just want to visualize their ideas quickly
  • Inconsistency: Manually designing multiple screens often leads to inconsistent styling, spacing, and component patterns
  • Theme Exploration: Trying different visual styles on a design is time-consuming — you have to manually restyle everything
  • No AI-Native Design Tool: Existing AI tools generate single static images, not structured, themeable, production-ready HTML/CSS screens

Tech Stack

Frontend

  • Next.js 16 (App Router + Turbopack): React framework for both UI and API routes
  • TypeScript: End-to-end type safety with Prisma-generated types
  • Tailwind CSS 4: Utility-first styling for the application UI
  • Shadcn UI: Accessible component primitives (Radix UI)
  • Motion (Framer Motion): Page transitions and micro-animations
  • React Zoom Pan Pinch: Interactive canvas with zoom, pan, and pinch gestures
  • React Resizable Panels: Adjustable layout panels in the editor
  • TanStack Query: Server state management with automatic cache invalidation
  • React Context API: Client-side state for canvas, frames, themes, and generation status

Backend & Services

  • Google Gemini 2.5 Flash: AI model for screen analysis/planning and HTML generation
  • Vercel AI SDK: Unified interface for AI model calls with structured output and tool use
  • Unsplash API: AI tool integration — Gemini calls searchUnsplash to find real images during generation
  • NeonDB (Serverless PostgreSQL): Primary database for users, projects, frames, and subscriptions
  • Prisma ORM: Type-safe database client with migrations
  • Inngest: Durable background job execution with realtime event streaming
  • NextAuth v5: Authentication with Google OAuth and credentials provider
  • Puppeteer / Puppeteer-Core: Server-side headless browser for PNG screenshot export
  • Upstash Redis: Response caching and API rate limiting
  • Polar.sh: Payment processing and subscription lifecycle management
  • DOMPurify + JSDOM: Server-side HTML sanitization for AI-generated content

Key Features

Two-Phase AI Generation Pipeline

The core innovation of CanvasX is splitting screen generation into two distinct AI phases, each optimized for its role:

  • Phase 1 — Analysis & Planning: The user's prompt is sent to Gemini with the ANALYSIS_PROMPT system instruction. Gemini returns a structured JSON object (validated via Zod) containing:
    • theme: Best-matching theme ID from 22 options
    • screens[]: Array of 2–3 screen specs, each with id, name, purpose, and a highly detailed visualDescription covering exact layout, chart types, icon names, data values, and bottom navigation configuration
  • Phase 2 — HTML Generation: For each planned screen, a second Gemini call generates self-contained HTML/CSS using:
    • Tailwind v3 utility classes for layout and styling
    • CSS custom properties (var(--primary), var(--background), etc.) for theme colors
    • SVG-only charts (area, line, circular progress, donut — never canvas or divs)
    • Iconify icons (lucide:* set) for all iconography
    • Real images via searchUnsplash tool calling
    • Realistic placeholder data ("8,432 steps", "$12.99", "7h 20m" — not generic text)
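The Phase 1 contract can be sketched as follows. The project validates the plan with Zod; this dependency-free TypeScript sketch mirrors the same shape, and the names `ScreenSpec`, `ScreenPlan`, and `validatePlan` are illustrative, not the project's actual identifiers.

```typescript
// Illustrative shape of the Phase 1 plan returned by Gemini.
interface ScreenSpec {
  id: string;
  name: string;
  purpose: string;
  visualDescription: string; // exact layout, chart types, icon names, nav config
}

interface ScreenPlan {
  theme: string;          // one of the 22 theme IDs
  screens: ScreenSpec[];  // 2–3 screens per generation
}

// Hypothetical validator: rejects plans that violate the documented constraints.
function validatePlan(input: unknown): ScreenPlan {
  const plan = input as ScreenPlan;
  if (typeof plan?.theme !== "string" || !Array.isArray(plan?.screens)) {
    throw new Error("Plan must include a theme ID and a screens array");
  }
  if (plan.screens.length < 2 || plan.screens.length > 3) {
    throw new Error("Expected 2–3 screens per plan");
  }
  for (const s of plan.screens) {
    for (const key of ["id", "name", "purpose", "visualDescription"] as const) {
      if (typeof s[key] !== "string" || s[key].length === 0) {
        throw new Error(`Screen is missing required field: ${key}`);
      }
    }
  }
  return plan;
}
```

Validating the model's JSON at this boundary means Phase 2 never receives a malformed plan, so each HTML-generation call can trust its input.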

22-Theme CSS Variable Architecture

  • Each theme defines ~20 CSS custom properties: --background, --foreground, --card, --primary, --accent, --muted, --border, --chart-1 through --chart-5, and more
  • Themes range from light (Ocean Breeze, Swiss Style, Peach) to dark (Netflix, Acid Lime, Cyber, Midnight, Neon) to special effects (Glassmorphism, Neo-Brutalism)
  • Base variables provide shared typography (--font-sans, --font-heading, --font-serif, --font-mono) and shadow scales
  • Theme CSS is injected into each frame's wrapper HTML, making generated screens instantly re-themeable without regeneration
  • The AI is instructed to use CSS variables for all foundational colors, ensuring perfect theme adherence
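The injection step can be sketched as a small wrapper function. The CSS variable names come from the architecture above; the theme values and the `wrapFrameHtml` helper are illustrative assumptions.

```typescript
// Illustrative theme record using variable names from the theming architecture.
type Theme = Record<string, string>;

const netflixDark: Theme = {
  // hypothetical values for demonstration; real themes define ~20 variables
  "--background": "#0b0b0b",
  "--foreground": "#f5f5f5",
  "--card": "#161616",
  "--primary": "#e50914",
  "--border": "#262626",
};

// Wrap raw frame HTML with a :root block so the same HTML re-themes instantly.
function wrapFrameHtml(frameHtml: string, theme: Theme): string {
  const vars = Object.entries(theme)
    .map(([name, value]) => `  ${name}: ${value};`)
    .join("\n");
  return `<!DOCTYPE html>
<html>
<head>
<style>
:root {
${vars}
}
</style>
</head>
<body>${frameHtml}</body>
</html>`;
}
```

Because the generated markup only ever references `var(--primary)` and friends, switching themes is just swapping the `:root` block, with no regeneration needed.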

Real-time Generation Streaming

  • Inngest's @inngest/realtime middleware publishes live progress events on a per-user channel (user:{userId})
  • Event stages stream to the frontend: generation.start → analysis.start → analysis.complete (skeleton frames appear with loading states) → frame.created (each screen renders as it completes) → generation.complete
  • The RealtimeProvider context subscribes to events, updates frame state, and manages loading/error UI
  • Fallback timeout (60s) catches cases where the backend never responds; generation timeout (5 min) catches stuck jobs
  • Toast notifications provide real-time feedback for completion and error states
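The stage flow above can be modeled as a small reducer. The event names come from the pipeline; the state shape and `reduce` function are a sketch of how a RealtimeProvider might fold events into UI state, not the project's actual implementation.

```typescript
// Events published on the per-user channel (names from the generation pipeline).
type GenEvent =
  | { type: "generation.start" }
  | { type: "analysis.start" }
  | { type: "analysis.complete"; screenIds: string[] } // skeleton frames appear
  | { type: "frame.created"; screenId: string; html: string }
  | { type: "generation.complete" };

interface CanvasState {
  generating: boolean;
  frames: Record<string, { loading: boolean; html?: string }>;
}

// Illustrative reducer: each realtime event advances the canvas UI state.
function reduce(state: CanvasState, event: GenEvent): CanvasState {
  switch (event.type) {
    case "generation.start":
    case "analysis.start":
      return { ...state, generating: true };
    case "analysis.complete": {
      const frames = { ...state.frames };
      for (const id of event.screenIds) frames[id] = { loading: true };
      return { ...state, frames };
    }
    case "frame.created":
      return {
        ...state,
        frames: { ...state.frames, [event.screenId]: { loading: false, html: event.html } },
      };
    case "generation.complete":
      return { ...state, generating: false };
  }
}
```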

Context-Aware Iterative Generation

  • When adding screens to an existing project, the system sends the last 4 frames' HTML as CONTEXT HTML to the analysis prompt
  • This ensures new screens maintain visual consistency: matching bottom navigation, consistent component styles, coherent color usage, and aligned spacing
  • The existing theme is preserved — the AI doesn't re-select a theme for iterative generations
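The context-selection step is simple enough to sketch directly; the `buildAnalysisContext` helper and the exact context header are assumptions, but the last-4-frames limit is from the pipeline described above.

```typescript
// Illustrative helper: include only the last 4 frames' HTML as generation context.
interface Frame { title: string; html: string; }

function buildAnalysisContext(existingFrames: Frame[]): string {
  const recent = existingFrames.slice(-4); // last 4 frames, per the pipeline's limit
  if (recent.length === 0) return "";
  return (
    "CONTEXT HTML (match navigation, components, and spacing):\n" +
    recent.map((f) => `<!-- ${f.title} -->\n${f.html}`).join("\n\n")
  );
}
```

Capping the context at four frames keeps the prompt within a predictable token budget while still giving the model enough concrete markup to match.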

Interactive Canvas & Device Frames

  • Generated HTML is rendered inside realistic iPhone device frames using iframe sandboxing
  • react-zoom-pan-pinch provides smooth canvas navigation with zoom controls
  • Floating toolbar offers actions: theme switching, HTML code view, screenshot export
  • Each frame has its own toolbar for individual actions (download, view code, delete)
  • Canvas supports multiple frames laid out in a responsive grid

Server-Side Screenshot Export

  • PNG export uses Puppeteer (full) in development and @sparticuz/chromium-min + Puppeteer-Core in production (for serverless environments)
  • The frame's HTML is wrapped in a complete HTML document with theme CSS variables, viewport meta tags, and Google Fonts
  • Screenshots are captured at iPhone dimensions with proper device pixel ratio for high-quality output

AI-Powered Project Naming

  • When a user submits a prompt, a separate Gemini call generates a concise project name (under 5 words) based on the prompt content
  • This runs as a server action before project creation, giving every project a meaningful name automatically

Technical Implementation

Generation Flow

  1. User types a prompt on the home page (e.g., "Fitness tracker app with dark theme")
  2. Server action calls Gemini to generate a short project name
  3. Project record is created in NeonDB via Prisma
  4. Inngest event ui/generate.screen is triggered with prompt, projectId, userId, and existing frames (if any)
  5. User is navigated to /project/[id] where the canvas UI subscribes to realtime events
  6. Inngest Step 1 — analyze-and-plan-screen: Gemini analyzes prompt → returns JSON plan with theme + screen specs
  7. Project's theme is updated in DB; frontend receives skeleton frames
  8. Inngest Step 2 — generated-screen-{i} (per screen): Gemini generates full HTML/CSS with Unsplash tool calling
  9. Each generated frame is saved to DB, published to frontend, and rendered in a device frame
  10. Redis cache is invalidated for the project; generation marked complete
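Steps 6–9 of this flow can be sketched as one orchestration function. The real code runs inside Inngest's `createFunction` with a durable `step` object; here `step`, `analyzePrompt`, and `generateHtml` are stubs with canned data so the control flow is self-contained.

```typescript
// Stubbed stand-in for Inngest's step API so the flow is runnable here;
// the real function receives `step` from inngest.createFunction.
const step = {
  run: async <T>(_name: string, fn: () => Promise<T> | T): Promise<T> => fn(),
};

interface Plan { theme: string; screens: { id: string; name: string }[] }

// Hypothetical AI calls, replaced with canned data for illustration.
async function analyzePrompt(_prompt: string): Promise<Plan> {
  return { theme: "midnight", screens: [{ id: "home", name: "Home" }, { id: "stats", name: "Stats" }] };
}
async function generateHtml(screen: { id: string; name: string }): Promise<string> {
  return `<main data-screen="${screen.id}"></main>`;
}

// One durable step for planning, then one step per planned screen.
async function generateScreens(prompt: string): Promise<Record<string, string>> {
  const plan = await step.run("analyze-and-plan-screen", () => analyzePrompt(prompt));
  const frames: Record<string, string> = {};
  for (let i = 0; i < plan.screens.length; i++) {
    const screen = plan.screens[i];
    frames[screen.id] = await step.run(`generated-screen-${i}`, () => generateHtml(screen));
  }
  return frames;
}
```

Wrapping each phase in its own step is what makes the job durable: if a per-screen generation fails, Inngest retries only that step rather than replanning the whole batch.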

Database Schema

  • User: Authentication user with email, password (hashed via bcryptjs), OAuth accounts, and subscription reference
  • Account: OAuth provider accounts (Google) linked to users
  • Project: Design project with name, selected theme ID, thumbnail URL, and associated user
  • Frame: Individual screen within a project — stores the title and raw HTML content
  • Subscription: Polar.sh subscription tracking with plan ID, status, billing period, and cancellation state
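A minimal Prisma sketch of the core models, inferred from the descriptions above; field names beyond those described (IDs, relation scalars) are assumptions, and the `User` model is trimmed to what the relations need.

```prisma
model User {
  id       String    @id @default(cuid())
  email    String    @unique
  projects Project[]
}

model Project {
  id        String  @id @default(cuid())
  name      String
  themeId   String
  thumbnail String?
  userId    String
  user      User    @relation(fields: [userId], references: [id])
  frames    Frame[]
}

model Frame {
  id        String  @id @default(cuid())
  title     String
  html      String  // raw AI-generated HTML for the screen
  projectId String
  project   Project @relation(fields: [projectId], references: [id])
}
```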

Technical Challenges & Solutions

Challenge 1: Consistent AI-Generated UI Quality

  • Problem: LLMs tend to generate generic, Bootstrap-like HTML that doesn't look premium or design-forward
  • Solution: Crafted an extensive system prompt (GENERATION_SYSTEM_PROMPT) that enforces Dribbble-quality standards — glassmorphism, soft glows, generous rounding, layered cards, floating navigation, gradient accents, and z-index layering. Included concrete HTML examples for SVG charts (area, circular progress, donut) so the AI produces consistent, visually stunning output

Challenge 2: Theme Consistency Across Screens

  • Problem: Generating multiple screens independently often leads to inconsistent styling — different fonts, spacing, nav patterns, and color usage
  • Solution: The two-phase pipeline solves this — Phase 1 creates a unified plan with explicit bottom navigation specs (icon names, active states, styling) that every screen must follow. The visualDescription field is hyper-specific, including exact Tailwind classes, icon names, and layout rules. For iterative generation, existing HTML is injected as context

Challenge 3: Real Images Without Hallucination

  • Problem: AI models hallucinate image URLs that return 404s, breaking the visual output
  • Solution: Integrated Unsplash as a Vercel AI SDK tool (searchUnsplash). During generation, Gemini calls the tool with a search query, and the tool returns real Unsplash image URLs. Avatars use pravatar.cc with deterministic user IDs. The system prompt explicitly prohibits hallucinated image URLs
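The heart of the tool is the mapping from Unsplash's search response to plain URLs the model can embed. This sketch follows the shape of Unsplash's /search/photos payload (`results[].urls`, `alt_description`); treat the exact field set and the `toImageResults` helper as assumptions.

```typescript
// Relevant slice of Unsplash's /search/photos response shape.
interface UnsplashSearchResponse {
  results: { urls: { regular: string }; alt_description: string | null }[];
}

// Illustrative tool body: return only real URLs so the model cannot hallucinate them.
function toImageResults(response: UnsplashSearchResponse, limit = 3) {
  return response.results.slice(0, limit).map((photo) => ({
    url: photo.urls.regular,
    alt: photo.alt_description ?? "",
  }));
}
```

Because the model only ever sees URLs that came back from a live API call, every image reference in the generated HTML is guaranteed to resolve.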

Challenge 4: Real-time Progress Without WebSockets

  • Problem: Screen generation takes 30–90 seconds; users need live feedback, but setting up WebSocket infrastructure is complex
  • Solution: Leveraged Inngest's built-in realtime pub/sub. The background function publish()es events at each stage; the frontend subscribes via @inngest/realtime/hooks. No WebSocket server, no polling — just event-driven streaming with automatic reconnection

Challenge 5: Screenshot Export in Serverless

  • Problem: Puppeteer requires a full Chromium binary (~280MB) which exceeds serverless function size limits
  • Solution: Used @sparticuz/chromium-min for production (a stripped-down Chromium binary optimized for Lambda/serverless) paired with puppeteer-core, while keeping full puppeteer for local development. The HTML is wrapped with all required CSS variables, fonts, and viewport settings before capture
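The environment switch can be sketched as a small resolver. The package names come from this document; the `LauncherChoice` shape and `resolveLauncher` function are illustrative, and the real code would follow the choice with the corresponding dynamic import and launch options.

```typescript
// Illustrative decision logic for choosing a browser launcher per environment.
interface LauncherChoice {
  pkg: "puppeteer" | "puppeteer-core";
  chromium: "@sparticuz/chromium-min" | null; // serverless-friendly binary
}

function resolveLauncher(nodeEnv: string): LauncherChoice {
  if (nodeEnv === "production") {
    // Serverless: stripped-down Chromium + puppeteer-core stays under size limits.
    return { pkg: "puppeteer-core", chromium: "@sparticuz/chromium-min" };
  }
  // Local development: full puppeteer bundles its own Chromium.
  return { pkg: "puppeteer", chromium: null };
}
```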

Challenge 6: Subscription-Gated Feature Enforcement

  • Problem: Need to enforce different limits (projects, screens per project, generations per month, available themes) across three plan tiers without duplicating logic
  • Solution: Centralized plan definitions in constant/plans.ts with a getUserPlan() function that resolves the active plan from the Subscription table — handling active, canceled-but-within-period, and expired states. Helper functions canCreateProject() and canGenerateScreen() are called before any create/generate action
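A sketch of that centralized logic, assuming the Free-tier limits quoted earlier (2 projects, 10 generations/month); the Pro and Unlimited limits here are placeholders, and the helper names mirror the document's `getUserPlan()`/`canCreateProject()`/`canGenerateScreen()` without claiming their exact signatures.

```typescript
type PlanId = "free" | "pro" | "unlimited";

// Free limits are from the pricing section; pro limits are assumed for illustration.
const PLANS: Record<PlanId, { maxProjects: number; maxGenerationsPerMonth: number }> = {
  free: { maxProjects: 2, maxGenerationsPerMonth: 10 },
  pro: { maxProjects: 50, maxGenerationsPerMonth: 200 },
  unlimited: { maxProjects: Infinity, maxGenerationsPerMonth: Infinity },
};

interface SubscriptionRow {
  planId: PlanId;
  status: "active" | "canceled" | "expired";
  currentPeriodEnd: Date;
}

// Resolve the effective plan: canceled subscriptions keep access until period end.
function getUserPlan(sub: SubscriptionRow | null, now = new Date()): PlanId {
  if (!sub) return "free";
  if (sub.status === "active") return sub.planId;
  if (sub.status === "canceled" && sub.currentPeriodEnd > now) return sub.planId;
  return "free";
}

function canCreateProject(planId: PlanId, projectCount: number): boolean {
  return projectCount < PLANS[planId].maxProjects;
}

function canGenerateScreen(planId: PlanId, generationsThisMonth: number): boolean {
  return generationsThisMonth < PLANS[planId].maxGenerationsPerMonth;
}
```

Keeping the limits in one table means adding a tier or changing a quota touches a single file rather than every gated action.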

After Launch & Impact

  • Built a complete AI-powered SaaS with a novel two-phase generation pipeline that produces consistently high-quality mobile UI
  • Designed and implemented 22 production-ready CSS variable themes covering light, dark, neon, glassmorphism, and brutalist aesthetics
  • Integrated seven third-party services (NextAuth, Gemini, Inngest, Prisma/Neon, Polar.sh, Upstash Redis, Unsplash) into a cohesive architecture
  • Implemented real-time event streaming for live generation progress without WebSocket infrastructure
  • Built an interactive canvas with zoom/pan/pinch, device frame rendering, and floating toolbars
  • Achieved server-side PNG export using Puppeteer with serverless-optimized Chromium

Future Plans

  • Add collaborative editing with shared project links
  • Support additional device frames (Android, iPad, Apple Watch)
  • Implement screen-to-code export (React Native, Flutter, SwiftUI)
  • Add AI-powered design iteration ("make the header more minimal", "change the chart to a bar chart")
  • Build a public template gallery of community-generated designs
  • Add version history and undo/redo for generated screens
  • Implement A/B testing view to compare theme variations side-by-side
  • Add Figma plugin for exporting designs directly to Figma
  • Support custom theme creation with a visual theme editor

No effort on the path of self-improvement is ever lost.

Shree Krishna, Bhagavad Gita