
CanvasX
An AI-powered web application that generates intuitive mobile app mockups from natural language prompts. Features real-time AI generation, multiple theme support, interactive canvas editing, and high-quality screenshot exports.
Key Challenges
- Real-Time AI Generation
- Multi-Step AI Pipeline
- Screenshot Generation
- Dynamic Theme System
- Interactive Canvas Controls
- HTML Rendering in IFrames
Key Learnings
- Next.js 16 App Router
- Vercel AI SDK Integration
- Inngest Real-Time Streaming
- Puppeteer Server-Side Rendering
- Advanced Prompt Engineering
- CSS Variable Theming
- MongoDB with Prisma
CanvasX: AI-Powered Mobile UI Designer
Overview
CanvasX is an AI-powered web application that transforms natural language descriptions into intuitive, production-ready mobile app mockups in minutes. Built with Next.js and powered by Google Gemini 3 Pro, the platform features a sophisticated two-stage AI pipeline that analyzes user intent, selects appropriate themes, and generates pixel-perfect HTML/CSS screens with real-time updates and an interactive canvas workspace.
What Users Can Do
- AI-Powered Generation: Describe any mobile app screen in natural language and get instant, production-ready mockups
- Smart Suggestions: Choose from pre-built prompts (Finance Tracker, Fitness Activity, Food Delivery, Travel Booking, E-Commerce, Meditation)
- Real-Time Updates: Watch screens generate live with status indicators (Analyzing → Generating → Completed)
- Multiple Themes: Select from 10+ professionally designed themes (Ocean Breeze, Netflix, Midnight, Neo-Brutalism, Acid Lime, etc.)
- Frame Management: Download PNG screenshots, view HTML source code, delete individual frames
- Project Persistence: All projects are saved automatically to the database
- Iterative Design: Generate additional screens for existing projects while maintaining theme consistency
Why I Built This
I built CanvasX to solve several key problems in the app design workflow:
- Speed Up Design Process: Designers spend hours creating mockups; AI can generate them in minutes
- Bridge Ideas to Visuals: Non-designers struggle to visualize app ideas without design skills
- Maintain Consistency: Keeping consistent themes and styles across multiple screens is tedious
- Realistic Mockups: Mock data and placeholders look unprofessional, so generated screens integrate real images via Unsplash
- Code-Ready Output: Generated screens use production-ready HTML/Tailwind code
- Real-Time Feedback: Users see generation progress instead of waiting blindly
Tech Stack
Frontend
- Next.js 16: React framework with App Router and server actions
- TypeScript: Type-safe development
- Tailwind CSS: Utility-first styling with CSS variables
- Radix UI: Accessible component primitives
- React Zoom Pan Pinch: Interactive canvas controls
- React RND: Draggable and resizable frames
- TanStack Query: Server state management
Backend & AI
- Vercel AI SDK: AI integration framework
- Google Gemini: Advanced AI model for UI generation
- Inngest: Event-driven workflows and real-time streaming
- Zod: Schema validation for AI outputs
- Unsplash API: Real image integration
Database & Services
- Prisma + PostgreSQL: Type-safe ORM with SQL database
- NextAuth.js: Authentication and user management
- Puppeteer: Server-side screenshot generation
- @sparticuz/chromium-min: Optimized Chromium for Vercel
Key Features
Two-Stage AI Generation Pipeline
Analysis Phase: Gemini 3 Pro analyzes prompts to plan 1-4 screens, automatically selects themes, and ensures consistency across screens.
Generation Phase: Creates structured HTML/CSS for each screen with theme variables, fetches real images from Unsplash, and produces production-ready code.
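The structured output of the analysis phase can be pictured as a small validated plan object. The sketch below is a dependency-free stand-in for the Zod schema listed in the stack, not the project's actual schema. The field names (`theme`, `screens`, `name`, `purpose`) are assumptions for illustration; only the 1-4 screen limit comes from the description above.

```typescript
// Hypothetical shape of the analysis-phase output; field names are assumptions.
interface ScreenPlan {
  name: string;
  purpose: string;
}

interface AnalysisPlan {
  theme: string;
  screens: ScreenPlan[];
}

function isNonEmptyString(value: unknown): value is string {
  return typeof value === "string" && value.trim().length > 0;
}

// Validates untrusted model output and returns a typed plan, or throws.
function parseAnalysisPlan(raw: unknown): AnalysisPlan {
  if (typeof raw !== "object" || raw === null) {
    throw new Error("plan must be an object");
  }
  const candidate = raw as { theme?: unknown; screens?: unknown };
  if (!isNonEmptyString(candidate.theme)) {
    throw new Error("plan.theme must be a non-empty string");
  }
  if (!Array.isArray(candidate.screens)) {
    throw new Error("plan.screens must be an array");
  }
  // The 1-4 screen limit mirrors the analysis phase described above.
  if (candidate.screens.length < 1 || candidate.screens.length > 4) {
    throw new Error("plan must contain between 1 and 4 screens");
  }
  const screens: ScreenPlan[] = candidate.screens.map((item: unknown, index: number) => {
    const screen = (typeof item === "object" && item !== null ? item : {}) as {
      name?: unknown;
      purpose?: unknown;
    };
    if (!isNonEmptyString(screen.name) || !isNonEmptyString(screen.purpose)) {
      throw new Error(`screen ${index} needs a name and a purpose`);
    }
    return { name: screen.name, purpose: screen.purpose };
  });
  return { theme: candidate.theme, screens };
}
```

Rejecting a malformed plan before the generation phase starts keeps a bad model response from producing half-built screens.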
Theme System
- 10+ premium themes with CSS variable architecture
- Instant theme switching without regenerating screens
- Visual color palette previews
- Dark/light mode support
- Theme persistence across sessions
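A CSS variable architecture like the one listed above can be sketched as a theme object serialized into a `:root` rule. The `Theme` shape, the variable names, and the color values here are assumptions for illustration; only the theme name "Ocean Breeze" appears in the source.

```typescript
// Hypothetical theme shape: a display name plus CSS custom properties
// (keys are written without the leading "--").
interface Theme {
  name: string;
  variables: { [key: string]: string };
}

// Serializes a theme into a :root rule so generated markup can use var(--token).
function themeToCss(theme: Theme): string {
  const declarations = Object.keys(theme.variables)
    .sort() // stable output regardless of key insertion order
    .map((key) => `  --${key}: ${theme.variables[key]};`)
    .join("\n");
  return `:root {\n${declarations}\n}`;
}

// Illustrative values only; the real palette is not given in the source.
const oceanBreeze: Theme = {
  name: "Ocean Breeze",
  variables: { primary: "#0284c7", background: "#f0f9ff" },
};

const oceanBreezeCss = themeToCss(oceanBreeze);
```

Because screens reference tokens such as `var(--primary)` rather than literal colors, swapping the `:root` rule is enough to restyle an already generated screen.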
Real-Time Updates
- WebSocket-based event streaming via Inngest Realtime
- Status tracking (Running → Analyzing → Generating → Completed)
- Frames appear on canvas immediately as they're generated
- Dedicated user channels for isolated updates
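On the client, the status progression above can be kept monotonic with a small reducer. This is a sketch under the assumption that events might arrive late or more than once, which the source does not state; the lowercase status strings are likewise assumed.

```typescript
// Status order taken from the list above; lowercase identifiers are an assumption.
const STATUS_ORDER = ["running", "analyzing", "generating", "completed"] as const;
type GenerationStatus = (typeof STATUS_ORDER)[number];

// Accepts an incoming status only when it moves the project forward,
// so a late "analyzing" event cannot overwrite "generating".
function nextStatus(current: GenerationStatus, incoming: GenerationStatus): GenerationStatus {
  return STATUS_ORDER.indexOf(incoming) > STATUS_ORDER.indexOf(current) ? incoming : current;
}
```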
Interactive Canvas Workspace
- Infinite canvas with zoom (10%-300%) and pan controls
- Realistic mobile device frames (420px width)
- Draggable frame positioning
- Select and Hand tool modes
- Auto-layout with smart spacing
- High-quality screenshot exports (individual frames or entire canvas)
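The zoom limits and auto-layout above reduce to a little arithmetic. In this sketch the 420px frame width and the 10%-300% zoom range come from the list; the 80px gap and the single-row layout are assumptions.

```typescript
const FRAME_WIDTH = 420; // device frame width from the list above
const MIN_ZOOM = 0.1; // 10%
const MAX_ZOOM = 3; // 300%
const FRAME_GAP = 80; // assumed spacing between frames

interface FramePosition {
  x: number;
  y: number;
}

// Keeps a requested zoom level inside the supported range.
function clampZoom(zoom: number): number {
  return Math.min(MAX_ZOOM, Math.max(MIN_ZOOM, zoom));
}

// Places frames left to right in one row, starting at the given origin.
function layoutFrames(count: number, originX: number = 0, originY: number = 0): FramePosition[] {
  const positions: FramePosition[] = [];
  for (let i = 0; i < count; i++) {
    positions.push({ x: originX + i * (FRAME_WIDTH + FRAME_GAP), y: originY });
  }
  return positions;
}
```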
Project Management
- AI-generated project names based on prompts
- Dashboard with thumbnail previews
- Full CRUD operations with user authorization
- Cascade delete for frames
- Projects sorted by most recent
Technical Implementation
AI Generation Flow
- User submits a prompt → API creates the project in the database
- Inngest triggers two-stage AI pipeline
- Analysis: Gemini analyzes prompt using Zod schema to plan screens and select theme
- Generation: For each screen, Gemini generates HTML with theme variables
- Unsplash tool fetches real images during generation
- Frames are stored in the database and published via real-time events
- React Context updates canvas state instantly
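The flow above can be sketched as one orchestration function with its side effects injected. This is an illustration rather than the project's Inngest function: `PipelineDeps` and all of its members are hypothetical stand-ins for the Gemini calls, the database write, and the real-time publish.

```typescript
interface PlannedScreen {
  name: string;
  purpose: string;
}

interface Plan {
  theme: string;
  screens: PlannedScreen[];
}

interface Frame {
  title: string;
  html: string;
}

// Hypothetical dependencies; real code would call the model, the ORM, and the realtime layer.
interface PipelineDeps {
  analyze(prompt: string): Promise<Plan>;
  generate(screen: PlannedScreen, plan: Plan): Promise<string>;
  saveFrame(frame: Frame): Promise<void>;
  publish(event: { status: string; frame?: Frame }): Promise<void>;
}

async function runPipeline(prompt: string, deps: PipelineDeps): Promise<Frame[]> {
  // Stage 1: plan the screens and pick a theme.
  await deps.publish({ status: "analyzing" });
  const plan = await deps.analyze(prompt);

  // Stage 2: generate each planned screen, persisting and publishing as it completes.
  await deps.publish({ status: "generating" });
  const frames: Frame[] = [];
  for (const screen of plan.screens) {
    const html = await deps.generate(screen, plan);
    const frame: Frame = { title: screen.name, html };
    await deps.saveFrame(frame);
    await deps.publish({ status: "generating", frame });
    frames.push(frame);
  }

  await deps.publish({ status: "completed" });
  return frames;
}
```

Publishing after every saved frame is what lets the canvas show each screen as soon as it exists instead of waiting for the whole batch.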
Screenshot Architecture
- Puppeteer integration with environment detection (local/production)
- Extracts all CSS using document.styleSheets
- Creates self-contained HTML documents
- 2x scale factor for retina-quality images
- Auto-generated project thumbnails
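The CSS extraction and self-contained document steps above might look like the following sketch. The structural `SheetLike` and `RuleLike` types stand in for the browser's `CSSStyleSheet` and `CSSRule` so the code runs without DOM typings, and the 900px viewport height is an assumption. `deviceScaleFactor` is the Puppeteer viewport option that produces the 2x output.

```typescript
// Minimal structural types so the sketch runs outside a browser.
interface RuleLike {
  cssText: string;
}

interface SheetLike {
  cssRules: ArrayLike<RuleLike>;
}

// Concatenates every readable rule. In a browser, reading cssRules on a
// cross-origin stylesheet throws, so unreadable sheets are skipped.
function collectCss(sheets: ArrayLike<SheetLike>): string {
  const chunks: string[] = [];
  for (let i = 0; i < sheets.length; i++) {
    const sheet = sheets[i];
    if (!sheet) continue;
    try {
      const rules = sheet.cssRules;
      for (let j = 0; j < rules.length; j++) {
        const rule = rules[j];
        if (rule) chunks.push(rule.cssText);
      }
    } catch (error) {
      // Unreadable sheet: leave it out rather than fail the whole export.
    }
  }
  return chunks.join("\n");
}

// Wraps frame markup and the collected CSS into a document with no external dependencies.
function buildStandaloneHtml(css: string, bodyHtml: string): string {
  return [
    "<!DOCTYPE html>",
    "<html>",
    "<head>",
    '<meta charset="utf-8">',
    `<style>${css}</style>`,
    "</head>",
    `<body>${bodyHtml}</body>`,
    "</html>",
  ].join("\n");
}

// Viewport for a 2x capture; width matches the device frame, height is assumed.
const screenshotViewport = { width: 420, height: 900, deviceScaleFactor: 2 };
```

In the browser, `collectCss(document.styleSheets)` would supply the first argument; the resulting document can then be handed to a headless page for capture.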
Technical Challenges & Solutions
Challenge 1: Real-Time AI Generation
- Problem: Long-running AI tasks exceeding serverless timeouts
- Solution: Implemented Inngest event-driven architecture with channel-based messaging (user:{userId}) and topic subscriptions for live status updates
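The channel-and-topic idea can be illustrated with a tiny in-memory publisher. This is not Inngest's API; `InMemoryRealtime` and the `generation.status` topic name are hypothetical, and only the `user:{userId}` channel format comes from the text above.

```typescript
type Handler = (payload: unknown) => void;

// Channel name format taken from the solution described above.
function userChannel(userId: string): string {
  return `user:${userId}`;
}

// Hypothetical stand-in for a realtime layer: handlers are keyed by channel + topic.
class InMemoryRealtime {
  private handlers: { [key: string]: Handler[] } = {};

  subscribe(channel: string, topic: string, handler: Handler): void {
    const key = `${channel}/${topic}`;
    const list = this.handlers[key] || [];
    list.push(handler);
    this.handlers[key] = list;
  }

  publish(channel: string, topic: string, payload: unknown): void {
    const list = this.handlers[`${channel}/${topic}`] || [];
    list.forEach((handler) => handler(payload));
  }
}
```

Keying subscriptions by user channel means one user's generation progress is never delivered to another user's canvas.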
Challenge 2: Multi-Step AI Pipeline
- Problem: Inconsistent AI outputs without proper planning
- Solution: Designed two-stage pipeline where Gemini first analyzes prompts (using Zod for structured output), then generates screens with context-aware prompts
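A context-aware prompt for the second stage could be assembled from the first stage's plan, roughly as below. The prompt wording, the `ScreenContext` shape, and `buildScreenPrompt` are all assumptions for illustration; the source does not show its actual prompts.

```typescript
interface PlanScreen {
  name: string;
  purpose: string;
}

// Hypothetical context carried from the analysis stage into each generation call.
interface ScreenContext {
  theme: string;
  screens: PlanScreen[];
}

// Builds the prompt for one screen, reminding the model of the shared theme
// and of the sibling screens it should stay consistent with.
function buildScreenPrompt(context: ScreenContext, index: number): string {
  const target = context.screens[index];
  if (!target) {
    throw new Error(`no planned screen at index ${index}`);
  }
  const siblings = context.screens
    .filter((_, i) => i !== index)
    .map((screen) => `- ${screen.name}: ${screen.purpose}`);
  return [
    `Generate the "${target.name}" screen. Purpose: ${target.purpose}`,
    `Use only the CSS variables of the "${context.theme}" theme for colors.`,
    siblings.length > 0
      ? `Stay visually consistent with these other screens:\n${siblings.join("\n")}`
      : "This is the only screen in the project.",
  ].join("\n");
}
```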
Challenge 3: Server-Side Screenshot Generation
- Problem: Rendering HTML with proper styling in serverless environment
- Solution: Integrated Puppeteer with CSS extraction from all stylesheets and created self-contained HTML. Optimized for Vercel with cached Chromium
Challenge 4: Dynamic Theme System
- Problem: Changing themes required regenerating all screens
- Solution: Created 10+ themes with 20+ CSS variables each. Implemented instant switching by injecting variables into iframe contexts
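Instant switching can be pictured as replacing one `<style>` block in a frame's document while leaving the generated markup untouched. The sketch below works on the HTML string (for example, an iframe's `srcDoc`); the `theme-vars` id and the `withTheme` helper are hypothetical.

```typescript
const THEME_STYLE_ID = "theme-vars"; // hypothetical id for the injected style tag

// Returns the document with exactly one theme style block containing themeCss.
function withTheme(html: string, themeCss: string): string {
  const styleTag = `<style id="${THEME_STYLE_ID}">${themeCss}</style>`;
  const existing = new RegExp(`<style id="${THEME_STYLE_ID}">[\\s\\S]*?</style>`);
  // Function replacers avoid "$" in the CSS being treated as a replacement pattern.
  if (existing.test(html)) {
    return html.replace(existing, () => styleTag);
  }
  if (html.indexOf("</head>") !== -1) {
    return html.replace("</head>", () => `${styleTag}</head>`);
  }
  return styleTag + html;
}
```

Since the screen markup only references tokens such as `var(--primary)`, a theme change never requires another model call.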
Challenge 5: Interactive Canvas Controls
- Problem: Managing zoom, pan, and draggable frames together
- Solution: Integrated React Zoom Pan Pinch with mode switching (Select/Hand) and React RND for scale-aware positioning
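Scale-aware positioning comes down to converting pointer movement from screen pixels into canvas units. A minimal sketch, assuming the canvas exposes its current scale and active tool; the `dragFrame` helper and its types are hypothetical, not part of either library.

```typescript
interface Point {
  x: number;
  y: number;
}

type CanvasTool = "select" | "hand";

// Applies a pointer movement (in screen pixels) to a frame position (in canvas units).
function dragFrame(position: Point, screenDelta: Point, scale: number, tool: CanvasTool): Point {
  // In Hand mode the gesture pans the canvas, so frames stay where they are.
  if (tool === "hand") {
    return position;
  }
  if (scale <= 0) {
    throw new Error("scale must be positive");
  }
  // At 50% zoom a 30px pointer move covers 60 canvas units, hence the division.
  return {
    x: position.x + screenDelta.x / scale,
    y: position.y + screenDelta.y / scale,
  };
}
```

Without the division, frames would lag behind the cursor when zoomed in and overshoot it when zoomed out.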
Challenge 6: HTML Rendering Security
- Problem: Safely rendering AI-generated HTML produced from user prompts
- Solution: Rendered in sandboxed iframes with theme CSS injection. Implemented postMessage for dynamic height calculation
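The iframe side of this can be sketched as a document builder plus a strict parser for the height message. The message type `frame-height`, the helper names, and the payload shape are assumptions; `frameId` is assumed to be a trusted identifier from the database, not user input.

```typescript
const HEIGHT_MESSAGE_TYPE = "frame-height"; // hypothetical message type

interface HeightMessage {
  frameId: string;
  height: number;
}

// Builds the iframe document: theme CSS, generated markup, and a small script
// that reports the rendered height to the parent window once loaded.
function buildSrcDoc(frameId: string, themeCss: string, bodyHtml: string): string {
  const payload =
    `{ type: ${JSON.stringify(HEIGHT_MESSAGE_TYPE)}, frameId: ${JSON.stringify(frameId)}, ` +
    "height: document.documentElement.scrollHeight }";
  return [
    "<!DOCTYPE html>",
    "<html>",
    `<head><style>${themeCss}</style></head>`,
    `<body>${bodyHtml}`,
    "<script>",
    `window.addEventListener("load", function () { parent.postMessage(${payload}, "*"); });`,
    "</script>",
    "</body>",
    "</html>",
  ].join("\n");
}

// Parent side: accept only well-formed height messages and ignore everything else.
function parseHeightMessage(data: unknown): HeightMessage | null {
  if (typeof data !== "object" || data === null) return null;
  const message = data as { type?: unknown; frameId?: unknown; height?: unknown };
  if (message.type !== HEIGHT_MESSAGE_TYPE) return null;
  if (typeof message.frameId !== "string") return null;
  if (typeof message.height !== "number" || !isFinite(message.height) || message.height <= 0) {
    return null;
  }
  return { frameId: message.frameId, height: Math.ceil(message.height) };
}
```

For the reporting script to run, the iframe would need `sandbox="allow-scripts"`; leaving out `allow-same-origin` keeps the generated markup away from the parent page's cookies and DOM. The parent's `message` listener would pass `event.data` to `parseHeightMessage` and could additionally compare `event.source` with the iframe's `contentWindow`.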
After Launch & Impact
- Built a complete AI-powered design tool with Next.js and React
- Successfully implemented multi-step AI pipeline with real-time streaming
- Integrated Google Gemini 3 Pro for high-quality UI generation
- Created sophisticated theme system with 10+ professional themes
- Implemented server-side screenshot generation for both local and production
- Learned advanced prompt engineering for consistent code generation
- Deployed full-stack serverless application on Vercel
Future Plans
- Export to React/Vue/Flutter components
- Multi-user collaboration with real-time co-editing
- Version history with undo/redo
- Custom theme creator
- Component library extraction
- Figma export integration
- Animation support
- Responsive variants (tablet, desktop)
- Chat-based iterative editing