
Summarist.ai
AI-powered SaaS that transforms PDFs into clear, structured summaries and lets you chat with your documents to get instant answers.
Key Challenges
- PDF Text Extraction
- AI Summary Generation
- Structured Summary Formatting
- RAG Pipeline with pgvector
- Real-time Background Processing
- Conversational AI with Context
- File Upload Handling
- Payment Integration
- Subscription Management
Key Learnings
- Google Gemini AI Integration
- RAG (Retrieval-Augmented Generation)
- pgvector Similarity Search
- LangChain PDF Processing
- Inngest Background Jobs & Realtime
- Drizzle ORM with NeonDB
- Stripe Payment Integration
- UploadThing File Management
- Clerk Authentication
- Structured AI Prompting
Summarist.ai: AI-Powered PDF Summarization Platform
Overview
Summarist.ai is a full-stack AI application that transforms PDF documents into two powerful experiences: structured visual summaries and intelligent multi-turn conversations. Built with Next.js and powered by Google Gemini AI, it lets users upload PDFs, receive beautifully formatted AI summaries, and then have context-aware conversations with their documents — all backed by a vector search RAG pipeline and real-time background job processing.
What Users Can Do
- Upload PDFs: Drag and drop or click to upload PDF documents (up to 20MB)
- AI Summarization: Get instant AI-generated summaries using Google Gemini 2.5 Flash with structured JSON output (title, overview, key points, sections, action items)
- Chat with PDF: Ask questions about any uploaded PDF and get accurate, source-grounded answers powered by RAG (Retrieval-Augmented Generation)
- Real-time Processing: Watch live progress updates as PDFs are parsed, chunked, embedded and indexed in the background
- Vault: Unified library to browse all your summaries and PDF chats in one place
- Interactive Summary Viewer: Navigate summaries with section-based structure and reading progress tracking
- Dashboard Management: Access and manage all your AI-generated content in one place
- Download Summaries: Export summaries as markdown files for offline use
- Subscription Plans: Free, Pro ($5/month), and Unlimited ($20/month) tiers with per-plan usage limits
Why I Built This
I built Summarist.ai to tackle the core friction points of working with lengthy PDF documents:
- Time Consumption: Reading long research papers and reports takes too much time
- No Quick Answers: You can read a summary but still need to dig back into the document for specific questions
- Information Overload: Extracting key insights from dense documents is mentally taxing
- Lack of Interactivity: Static summaries don't let you explore the content on your own terms
- Export Limitations: Notes taken from PDFs lack a portable, shareable format
Tech Stack
Frontend
- Next.js 16 (App Router + Turbopack): React framework for both UI and API routes
- TypeScript: End-to-end type safety, including inferred Drizzle ORM types
- Tailwind CSS 4: Utility-first styling
- Shadcn UI: Accessible component primitives (Radix UI)
- Framer Motion / Motion: Page and micro-animations
- Lenis: Smooth scroll provider
Backend & Services
- Google Gemini 2.5 Flash: AI model for summary generation via structured prompts
- Google Gemini Embedding (gemini-embedding-001): 3072-dimensional text embeddings for semantic search
- LangChain (PDFLoader + CharacterTextSplitter): PDF parsing and document chunking
- NeonDB (Serverless PostgreSQL + pgvector): Primary database with vector similarity search
- Drizzle ORM: Type-safe query builder and schema management
- Inngest: Durable background job execution with realtime event streaming
- Clerk: Authentication, user management, and webhook-based user sync
- UploadThing: Secure cloud file upload and storage
- Stripe: Payment processing and subscription lifecycle management
Key Features
AI-Powered PDF Summarization
- Uploads via UploadThing → text extraction via LangChain PDFLoader → Gemini 2.5 Flash generates a structured JSON summary
- Prompt engineering produces consistent output: `title`, `readTime`, `overview`, `keyPoints[]`, `sections[]`, `actionItems[]`
- Markdown code fences are stripped from the AI response before saving
- Graceful fallback: if JSON parsing fails, the raw text is still displayed without crashing the UI
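The structured output described above can be sketched as a TypeScript type. The top-level field names come from the documented prompt schema; the nested section shape and the example values are assumptions for illustration:

```typescript
// Sketch of the structured summary shape. Top-level fields match the
// documented prompt output; SummarySection's internals are assumed.
interface SummarySection {
  heading: string;
  points: string[];
}

interface StructuredSummary {
  title: string;
  readTime: string; // e.g. "4 min read"
  overview: string;
  keyPoints: string[];
  sections: SummarySection[];
  actionItems: string[];
}

// Example instance, as the viewer might receive it after JSON.parse:
const example: StructuredSummary = {
  title: "Attention Is All You Need",
  readTime: "4 min read",
  overview: "Introduces the Transformer architecture...",
  keyPoints: ["Self-attention replaces recurrence"],
  sections: [{ heading: "Architecture", points: ["Encoder-decoder stack"] }],
  actionItems: ["Read section 3 for the attention formula"],
};
```

Typing the parsed response this way lets the summary viewer render each field without defensive checks scattered through the UI.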
Chat with PDF — Full RAG Pipeline
- Upload: PDF is uploaded and a `chatPdfs` DB record is created with status `processing`
- Inngest triggers a durable background function (`process-pdf-for-chat`)
- Parse (LangChain PDFLoader): PDF loaded from URL into page-level documents
- Chunk (CharacterTextSplitter): Split into 1,000-character chunks with 200-character overlap
- Embed (gemini-embedding-001): Each chunk embedded into a 3,072-dimensional vector
- Store: Chunk content + vector inserted into `pdf_chunks` table as `vector(3072)` via raw SQL
- Ready: Status updated to `ready`; user can now chat
- Chat query: User message is embedded → top-5 similar chunks retrieved via pgvector `<=>` cosine distance → chunks injected into Gemini multi-turn chat as context
- History: All messages persisted in `chat_messages`, rebuilt as strictly-alternating `user → model` pairs for Gemini's history API
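The chunking step above can be sketched in plain TypeScript. This is a simplified stand-in for LangChain's `CharacterTextSplitter` (the real splitter also respects separator boundaries); it only illustrates the 1,000-character window with 200-character overlap:

```typescript
// Simplified stand-in for the chunking step: fixed-size character
// windows with overlap. The real CharacterTextSplitter also splits on
// separators; this sketch only shows the size/overlap arithmetic.
function chunkText(text: string, chunkSize = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  const step = chunkSize - overlap; // advance 800 chars per chunk
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    if (start + chunkSize >= text.length) break; // last window reached the end
  }
  return chunks;
}
```

With these defaults, consecutive chunks share their last/first 200 characters, so a sentence cut at a chunk boundary still appears whole in at least one chunk — which is what keeps retrieval from missing answers that straddle a boundary.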
Real-time Processing Progress
- Inngest's `@inngest/realtime` middleware publishes live progress events on a per-PDF channel (`pdf-processing:<chatPdfId>`)
- Each stage (parsing → chunking → embedding → ready / error) broadcasts `{ status, message, progress: 0-100, metadata }`
- The frontend subscribes and shows a live progress bar — no polling required
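A minimal sketch of the payload and channel naming described above. The event shape is taken from the bullets; the stage-to-percent mapping is an assumption (the real values live inside the Inngest function):

```typescript
// Progress event shape, per the channel description above.
type Stage = "parsing" | "chunking" | "embedding" | "ready" | "error";

interface ProgressEvent {
  status: Stage;
  message: string;
  progress: number; // 0-100
  metadata?: Record<string, unknown>;
}

// Hypothetical stage → percent mapping for illustration only.
const STAGE_PROGRESS: Record<Stage, number> = {
  parsing: 25,
  chunking: 50,
  embedding: 75,
  ready: 100,
  error: 100,
};

function channelFor(chatPdfId: string): string {
  return `pdf-processing:${chatPdfId}`; // one channel per PDF
}

function progressEvent(stage: Stage, message: string): ProgressEvent {
  return { status: stage, message, progress: STAGE_PROGRESS[stage] };
}
```

Keying the channel by `chatPdfId` means two PDFs processing at once never interleave their progress bars.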
Unified Vault
- Combines `pdfSummaries` and `chatPdfs` into a single sorted feed (`VaultItem` union type)
- Sorted by most recently updated (chats use `updatedAt`; summaries use `createdAt`)
- Each item links to its dedicated viewer / chat page
Technical Implementation
Summary Generation Flow
- User uploads PDF via UploadThing
- Server action extracts text using LangChain PDFLoader
- Text is sent to Google Gemini AI with structured prompt
- AI generates formatted summary with emojis and markdown
- Summary saved to NeonDB with user association
- User redirected to summary viewer page
Database Schema
- pdf_summaries: Stores summaries with user_id, file_url, summary_text, title, file_name
- users: Manages user subscriptions with customer_id, price_id, status
- payments: Tracks payment transactions
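A plausible DDL sketch for the tables listed above. Column types, constraints, and any column not named in the bullets are assumptions inferred from the field names, not the project's actual schema:

```sql
-- Hedged sketch: types and constraints are assumptions
-- inferred from the documented field names.
CREATE TABLE users (
  id          SERIAL PRIMARY KEY,
  email       TEXT UNIQUE NOT NULL,
  customer_id TEXT,                -- Stripe customer
  price_id    TEXT,                -- active Stripe price
  status      TEXT                 -- subscription status
);

CREATE TABLE pdf_summaries (
  id           UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_id      TEXT NOT NULL,
  file_url     TEXT NOT NULL,
  summary_text TEXT NOT NULL,
  title        TEXT,
  file_name    TEXT,
  created_at   TIMESTAMPTZ DEFAULT now()
);

CREATE TABLE payments (
  id         UUID PRIMARY KEY DEFAULT gen_random_uuid(),
  user_email TEXT NOT NULL,
  amount     INTEGER,              -- smallest currency unit (cents)
  status     TEXT,
  created_at TIMESTAMPTZ DEFAULT now()
);
```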
Technical Challenges & Solutions
Challenge 1: PDF Text Extraction at Scale
- Problem: Handling various PDF structures, scanned pages, and multi-page documents
- Solution: LangChain's `PDFLoader` with blob-based loading inside Inngest steps; each page is a separate `Document` with metadata for page-number tracking
Challenge 2: Scalable Vector Storage
- Problem: Drizzle ORM has no native pgvector column type
- Solution: Defined a custom `customType` for `vector(N)` with `toDriver`/`fromDriver` serializers; used raw ``db.execute(sql`...`)`` for inserts and similarity queries to avoid ORM limitations
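The serializers at the heart of such a `customType` can be sketched in isolation. pgvector accepts and returns vectors as bracketed text like `[0.1,0.2,0.3]`, so the mapping is a string round-trip (the real code wires these two functions into Drizzle's `customType` definition):

```typescript
// Sent to the driver: a pgvector text literal like "[0.1,0.2,0.3]".
function toDriver(value: number[]): string {
  return `[${value.join(",")}]`;
}

// Received from the driver: parse the bracketed text back to numbers.
function fromDriver(value: string): number[] {
  return value.slice(1, -1).split(",").map(Number);
}

// A similarity query then uses the cosine-distance operator in raw SQL,
// e.g.: SELECT content FROM pdf_chunks ORDER BY embedding <=> $1 LIMIT 5;
```

Because the column is opaque to Drizzle's query builder, the similarity search itself stays in raw SQL, while the custom type keeps inserts and reads type-safe.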
Challenge 3: Real-time Progress Without Polling
- Problem: Background embedding jobs take 30–120 seconds; users need live feedback
- Solution: Inngest's realtime middleware publishes progress events per PDF; the client subscribes via `@inngest/realtime` channels, avoiding long-polling or WebSocket infrastructure
Challenge 4: Gemini Conversation History Format
- Problem: Gemini's `startChat` API requires strictly alternating `user → model` turns; mismatched history throws an error
- Solution: Chat messages are loaded, the latest (unsaved) user turn is dropped, and only complete user+model pairs are included — partial turns are silently skipped
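The pairing logic above can be sketched as follows (the message shape is an assumption; the real code reads rows from `chat_messages`):

```typescript
// Rebuild Gemini chat history as strictly alternating user → model
// pairs. The trailing (unsaved) user turn never starts a pair because
// the loop stops one short of the end; partial turns are skipped.
interface Msg { role: "user" | "model"; content: string }

function toGeminiHistory(messages: Msg[]): Msg[] {
  const history: Msg[] = [];
  for (let i = 0; i < messages.length - 1; i++) {
    if (messages[i].role === "user" && messages[i + 1].role === "model") {
      history.push(messages[i], messages[i + 1]); // complete pair
      i++; // consume the model turn as well
    }
  }
  return history;
}
```

Silently skipping malformed turns trades completeness for robustness: a single bad row in `chat_messages` degrades the context slightly instead of crashing every subsequent chat request.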
Challenge 5: User Sync Reliability
- Problem: Clerk webhooks occasionally miss `user.created` events, leaving actions with no DB user row
- Solution: Every server action runs a fallback check — if the user row is missing, `ensureFreeUserExists` creates it before proceeding; this pattern is consistent across all three entry points (upload, chat, credits)
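The fallback-sync pattern can be sketched with an in-memory `Map` standing in for the database (the real `ensureFreeUserExists` performs a DB lookup-or-insert; row shape is an assumption):

```typescript
// Sketch of the webhook-fallback pattern. A Map stands in for the
// users table; the real code does a DB lookup-or-insert instead.
interface UserRow { email: string; status: "free" | "pro" | "unlimited" }

const usersTable = new Map<string, UserRow>();

function ensureFreeUserExists(email: string): UserRow {
  const existing = usersTable.get(email);
  if (existing) return existing;
  // Clerk webhook missed user.created → create the row on first touch
  const created: UserRow = { email, status: "free" };
  usersTable.set(email, created);
  return created;
}

// Every server action calls the guard before touching user data:
function uploadAction(email: string): UserRow["status"] {
  const user = ensureFreeUserExists(email);
  return user.status; // proceed with per-plan limits
}
```

Because the guard is idempotent, it is safe to call on every request: existing rows pass through untouched, and a missed webhook is repaired on the user's first action.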
Challenge 6: Structured AI Output Consistency
- Problem: Gemini sometimes wraps JSON in markdown code fences or adds commentary
- Solution: Post-process the raw response with regex to strip ```` ```json ```` fences before parsing; a fallback `parseSummaryText` handles unparseable text gracefully without crashing the viewer
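A minimal sketch of the fence-stripping and fallback described above (helper names are illustrative; the real `parseSummaryText` fallback is more involved):

```typescript
// Strip a leading ``` or ```json fence and a trailing ``` fence
// (the `{3}` quantifier matches three literal backticks).
function stripFences(raw: string): string {
  return raw
    .replace(/^`{3}(?:json)?\s*/m, "")
    .replace(/`{3}\s*$/m, "")
    .trim();
}

// Parse the cleaned response; on failure fall back to the raw text
// so the viewer renders something instead of crashing.
function parseSummary(raw: string): { ok: boolean; data: unknown } {
  try {
    return { ok: true, data: JSON.parse(stripFences(raw)) };
  } catch {
    return { ok: false, data: raw };
  }
}
```

The try/catch boundary is the key design choice: malformed model output degrades to a plain-text rendering rather than a runtime error in the summary viewer.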
After Launch & Impact
- Built a complete full-stack AI application with two distinct AI-powered workflows
- Designed and implemented a production RAG pipeline from scratch: chunk → embed → store → retrieve → generate
- Integrated six third-party services (Clerk, Stripe, UploadThing, Gemini, Inngest, NeonDB) in a single coherent architecture
- Implemented durable background job processing with real-time client-side progress updates
- Gained deep experience with pgvector, Drizzle ORM custom types, and serverless database constraints
- Built a subscription-gated SaaS with multiple plan tiers and Stripe lifecycle management
Future Plans
- Enforce chat upload limits (currently only summary limits are gated)
- Add summary sharing via public shareable links
- Support additional file formats (DOCX, TXT, EPUB)
- Implement batch PDF processing
- Add summary search and filtering in the Vault
- Export summaries to PDF format
- Add summary editing capabilities
- Create a public API for third-party integrations