Summarist.ai

AI-powered SaaS that transforms PDFs into clear, structured summaries and lets you chat with your documents to get instant answers.

Technology Stack

Next.js
TypeScript
React
Tailwind CSS
NeonDB (PostgreSQL + pgvector)
Drizzle ORM
Shadcn UI
Zod
Gemini AI
LangChain
RAG
Inngest
Clerk
UploadThing
Stripe
Framer Motion

Key Challenges

  • PDF Text Extraction
  • AI Summary Generation
  • Structured Summary Formatting
  • RAG Pipeline with pgvector
  • Real-time Background Processing
  • Conversational AI with Context
  • File Upload Handling
  • Payment Integration
  • Subscription Management

Key Learnings

  • Google Gemini AI Integration
  • RAG (Retrieval-Augmented Generation)
  • pgvector Similarity Search
  • LangChain PDF Processing
  • Inngest Background Jobs & Realtime
  • Drizzle ORM with NeonDB
  • Stripe Payment Integration
  • UploadThing File Management
  • Clerk Authentication
  • Structured AI Prompting

Summarist.ai: AI-Powered PDF Summarization Platform

Overview

Summarist.ai is a full-stack AI application that transforms PDF documents into two powerful experiences: structured visual summaries and intelligent multi-turn conversations. Built with Next.js and powered by Google Gemini AI, it lets users upload PDFs, receive beautifully formatted AI summaries, and then have context-aware conversations with their documents — all backed by a vector search RAG pipeline and real-time background job processing.

What Users Can Do

  • Upload PDFs: Drag and drop or click to upload PDF documents (up to 20MB)
  • AI Summarization: Get instant AI-generated summaries using Google Gemini 2.5 Flash with structured JSON output (title, overview, key points, sections, action items)
  • Chat with PDF: Ask questions about any uploaded PDF and get accurate, source-grounded answers powered by RAG (Retrieval-Augmented Generation)
  • Real-time Processing: Watch live progress updates as PDFs are parsed, chunked, embedded and indexed in the background
  • Vault: Unified library to browse all your summaries and PDF chats in one place
  • Interactive Summary Viewer: Navigate summaries with section-based structure and reading progress tracking
  • Dashboard Management: Access and manage all your AI-generated content in one place
  • Download Summaries: Export summaries as markdown files for offline use
  • Subscription Plans: Free, Pro ($5/month), and Unlimited ($20/month) tiers with per-plan usage limits

Why I Built This

I built Summarist.ai to tackle the core friction points of working with lengthy PDF documents:

  • Time Consumption: Reading long research papers and reports takes too much time
  • No Quick Answers: You can read a summary but still need to dig back into the document for specific questions
  • Information Overload: Extracting key insights from dense documents is mentally taxing
  • Lack of Interactivity: Static summaries don't let you explore the content on your own terms
  • Export Limitations: Needing a portable, shareable format for notes taken from PDFs

Tech Stack

Frontend

  • Next.js 16 (App Router + Turbopack): React framework for both UI and API routes
  • TypeScript: End-to-end type safety, including inferred Drizzle ORM types
  • Tailwind CSS 4: Utility-first styling
  • Shadcn UI: Accessible component primitives (Radix UI)
  • Framer Motion (Motion): Page transitions and micro-animations
  • Lenis: Smooth scroll provider

Backend & Services

  • Google Gemini 2.5 Flash: AI model for summary generation via structured prompts
  • Google Gemini Embedding (gemini-embedding-001): 3072-dimensional text embeddings for semantic search
  • LangChain (PDFLoader + CharacterTextSplitter): PDF parsing and document chunking
  • NeonDB (Serverless PostgreSQL + pgvector): Primary database with vector similarity search
  • Drizzle ORM: Type-safe query builder and schema management
  • Inngest: Durable background job execution with realtime event streaming
  • Clerk: Authentication, user management, and webhook-based user sync
  • UploadThing: Secure cloud file upload and storage
  • Stripe: Payment processing and subscription lifecycle management

Key Features

AI-Powered PDF Summarization

  • Uploads via UploadThing → text extraction via LangChain PDFLoader → Gemini 2.5 Flash generates a structured JSON summary
  • Prompt engineering produces consistent output: title, readTime, overview, keyPoints[], sections[], actionItems[]
  • Markdown code fences are stripped from the AI response before saving
  • Graceful fallback: if JSON parsing fails, the raw text is still displayed without crashing the UI
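
The structured output and graceful fallback can be sketched as follows; the interface fields come from the prompt schema described above, but the type and function names here are illustrative, not the actual implementation:

```typescript
// Hypothetical shape of the structured summary the Gemini prompt asks for.
interface SummaryData {
  title: string;
  readTime: string;
  overview: string;
  keyPoints: string[];
  sections: { heading: string; content: string }[];
  actionItems: string[];
}

// Parse the model response; if JSON parsing fails, return the raw text
// so the viewer can still render something instead of crashing.
function parseSummaryResponse(raw: string): SummaryData | { raw: string } {
  try {
    return JSON.parse(raw) as SummaryData;
  } catch {
    return { raw };
  }
}
```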

Chat with PDF — Full RAG Pipeline

  1. Upload: PDF is uploaded and a chatPdfs DB record is created with status processing
  2. Inngest triggers a durable background function (process-pdf-for-chat)
  3. Parse (LangChain PDFLoader): PDF loaded from URL into page-level documents
  4. Chunk (CharacterTextSplitter): Split into 1,000-character chunks with 200-character overlap
  5. Embed (gemini-embedding-001): Each chunk embedded into a 3,072-dimensional vector
  6. Store: Chunk content + vector inserted into pdf_chunks table as vector(3072) via raw SQL
  7. Ready: Status updated to ready; user can now chat
  8. Chat query: User message is embedded → top-5 similar chunks retrieved via pgvector <=> cosine distance → chunks injected into Gemini multi-turn chat as context
  9. History: All messages persisted in chat_messages, rebuilt as strictly-alternating user → model pairs for Gemini's history API
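
Steps 4 and 8 above can be sketched in plain TypeScript. In production the chunking is done by LangChain's CharacterTextSplitter and the ranking happens inside Postgres via pgvector's `<=>` operator; this in-memory version only illustrates the same logic, and all names are assumptions:

```typescript
// Step 4: split text into fixed-size chunks with overlap
// (1,000 characters with 200-character overlap, as above).
function chunkText(text: string, size = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
    if (start + size >= text.length) break;
  }
  return chunks;
}

// Step 8: cosine distance (1 - cosine similarity), the metric behind
// pgvector's <=> operator.
function cosineDistance(a: number[], b: number[]): number {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return 1 - dot / (Math.sqrt(na) * Math.sqrt(nb));
}

// Retrieve the top-k chunks closest to the query embedding.
function topK(
  query: number[],
  chunks: { content: string; embedding: number[] }[],
  k = 5
): string[] {
  return [...chunks]
    .sort((x, y) => cosineDistance(query, x.embedding) - cosineDistance(query, y.embedding))
    .slice(0, k)
    .map((c) => c.content);
}
```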

Real-time Processing Progress

  • Inngest's @inngest/realtime middleware publishes live progress events on a per-PDF channel (pdf-processing:<chatPdfId>)
  • Each stage (parsing → chunking → embedding → ready / error) broadcasts { status, message, progress: 0-100, metadata }
  • The frontend subscribes and shows a live progress bar — no polling required
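
The event and channel described above might be modeled like this; the actual publishing goes through Inngest's @inngest/realtime middleware, so only the data shape is shown, and the names are illustrative:

```typescript
// Illustrative shape of a per-PDF progress event.
type ProcessingStatus = "parsing" | "chunking" | "embedding" | "ready" | "error";

interface ProgressEvent {
  status: ProcessingStatus;
  message: string;
  progress: number; // 0-100
  metadata?: Record<string, unknown>;
}

// Build the per-PDF channel name, e.g. "pdf-processing:abc123".
function processingChannel(chatPdfId: string): string {
  return `pdf-processing:${chatPdfId}`;
}
```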

Unified Vault

  • Combines pdfSummaries and chatPdfs into a single sorted feed (VaultItem union type)
  • Sorted by most recently updated (chats use updatedAt; summaries use createdAt)
  • Each item links to its dedicated viewer / chat page
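
The merge-and-sort behavior can be sketched with a simplified version of the union type; the field names follow the description above, but the exact VaultItem definition is an assumption:

```typescript
// Simplified VaultItem union combining both content types into one feed.
type VaultItem =
  | { kind: "summary"; title: string; createdAt: Date }
  | { kind: "chat"; title: string; updatedAt: Date };

// Sort by most recent activity: chats use updatedAt, summaries use createdAt.
function sortVault(items: VaultItem[]): VaultItem[] {
  const ts = (i: VaultItem) =>
    (i.kind === "chat" ? i.updatedAt : i.createdAt).getTime();
  return [...items].sort((a, b) => ts(b) - ts(a));
}
```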

Technical Implementation

Summary Generation Flow

  1. User uploads PDF via UploadThing
  2. Server action extracts text using LangChain PDFLoader
  3. Text is sent to Google Gemini AI with structured prompt
  4. AI generates formatted summary with emojis and markdown
  5. Summary saved to NeonDB with user association
  6. User redirected to summary viewer page

Database Schema

  • pdf_summaries: Stores summaries with user_id, file_url, summary_text, title, file_name
  • users: Manages user subscriptions with customer_id, price_id, status
  • payments: Tracks payment transactions

Technical Challenges & Solutions

Challenge 1: PDF Text Extraction at Scale

  • Problem: Handling various PDF structures, scanned pages, and multi-page documents
  • Solution: LangChain's PDFLoader with blob-based loading inside Inngest steps; each page is a separate Document with metadata for page number tracking

Challenge 2: Scalable Vector Storage

  • Problem: Drizzle ORM has no native pgvector column type
  • Solution: Defined a Drizzle customType for vector(N) with toDriver / fromDriver serializers; used raw db.execute(sql`...`) calls for inserts and similarity queries to avoid ORM limitations
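
Standalone versions of the two serializers such a customType would wrap might look like this (pgvector's text format is "[0.1,0.2,...]" on the wire; the function names mirror Drizzle's customType hooks but the bodies are a sketch):

```typescript
// Serialize a number[] into pgvector's literal text format.
function toDriver(value: number[]): string {
  return `[${value.join(",")}]`;
}

// Parse pgvector's "[0.1,0.2]" text format back into a number[].
function fromDriver(value: string): number[] {
  return value.slice(1, -1).split(",").map(Number);
}
```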

Challenge 3: Real-time Progress Without Polling

  • Problem: Background embedding jobs take 30–120 seconds; users need live feedback
  • Solution: Inngest's realtime middleware publishes progress events per-PDF; the client subscribes via @inngest/realtime channels, avoiding long-polling or WebSocket infrastructure

Challenge 4: Gemini Conversation History Format

  • Problem: Gemini's startChat API requires strictly alternating user → model turns; mismatched history throws an error
  • Solution: Chat messages are loaded, the latest (unsaved) user turn is dropped, and only complete user+model pairs are included — partial turns are silently skipped
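
The pairing logic can be sketched as a pure function; the ChatMessage shape and function name are assumptions, but the rule matches the description above — only complete user → model pairs survive:

```typescript
interface ChatMessage {
  role: "user" | "model";
  content: string;
}

// Rebuild stored messages into the strictly alternating user -> model
// pairs Gemini's startChat history requires; partial turns are skipped.
function buildHistory(messages: ChatMessage[]): ChatMessage[] {
  const history: ChatMessage[] = [];
  for (let i = 0; i + 1 < messages.length; i++) {
    if (messages[i].role === "user" && messages[i + 1].role === "model") {
      history.push(messages[i], messages[i + 1]);
      i++; // consume the model half of the pair
    }
  }
  return history;
}
```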

Challenge 5: User Sync Reliability

  • Problem: Clerk webhooks occasionally miss user.created events, leaving actions with no DB user row
  • Solution: Every server action runs a fallback check — if the user row is missing, ensureFreeUserExists creates it before proceeding; this pattern is consistent across all three entry points (upload, chat, credits)

Challenge 6: Structured AI Output Consistency

  • Problem: Gemini sometimes wraps JSON in markdown code fences or adds commentary
  • Solution: Post-process the raw response with regex to strip ```json fences before parsing; fallback parseSummaryText handles unparseable text gracefully without crashing the viewer
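
The fence-stripping step might be as simple as the following regex pass (a sketch; the real post-processing may handle more cases):

```typescript
// Strip markdown code fences (```json ... ```) that the model sometimes
// wraps around its JSON output, so JSON.parse receives clean text.
function stripCodeFences(raw: string): string {
  return raw
    .replace(/^```(?:json)?\s*/i, "")
    .replace(/```\s*$/, "")
    .trim();
}
```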

After Launch & Impact

  • Built a complete full-stack AI application with two distinct AI-powered workflows
  • Designed and implemented a production RAG pipeline from scratch: chunk → embed → store → retrieve → generate
  • Integrated six third-party services (Clerk, Stripe, UploadThing, Gemini, Inngest, NeonDB) in a single coherent architecture
  • Implemented durable background job processing with real-time client-side progress updates
  • Gained deep experience with pgvector, Drizzle ORM custom types, and serverless database constraints
  • Built a subscription-gated SaaS with multiple plan tiers and Stripe lifecycle management

Future Plans

  • Enforce chat upload limits (currently only summary limits are gated)
  • Add summary sharing via public shareable links
  • Support additional file formats (DOCX, TXT, EPUB)
  • Implement batch PDF processing
  • Add summary search and filtering in the Vault
  • Export summaries to PDF format
  • Add summary editing capabilities
  • Create a public API for third-party integrations

Set your heart upon your work, but never on its reward.

Shree Krishna, Bhagavad Gita