Multi-tenant chatbots
Each user signs in and creates their own named bots, each with an isolated knowledge base. Retrieval is filtered by user and bot, so one account's documents and answers never leak into another's.
A live, multi-tenant platform where anyone can spin up a chatbot grounded in their own documents — a Next.js app on Vercel talks to a Cloudflare Worker that retrieves from a vector index and answers with gpt-5.1, streamed token by token.
AskMyBot lets anyone build a chatbot that answers from their own material instead of the open internet. You sign in, create a named bot, upload your documents — a policy handbook, product docs, lecture notes — and from then on the bot answers questions using what's in those files, and points back to where it found the answer. Every account is isolated: your bots and your knowledge are yours alone.
Answering is a retrieval-augmented generation (RAG) loop. When you ask something, the question is turned into a vector and matched against your documents in a vector index; the closest passages are pulled back as context, combined with the recent conversation, and handed to the model, which writes the reply. The answer streams back token by token, so you watch it form in real time rather than waiting for the whole thing.
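As a rough illustration of that loop, here is a minimal in-memory sketch: it ranks passages by cosine similarity to a question vector and assembles a prompt from the best matches plus the most recent turns. The names (Passage, ChatTurn, topK, buildPrompt), the prompt wording and the four-turn history window are assumptions made for the example; the live system uses OpenAI embeddings and Cloudflare Vectorize rather than an array scan.

```typescript
// Hypothetical shapes for the sketch; not the project's actual types.
interface Passage {
  id: string;
  text: string;
  vector: number[];
}

interface ChatTurn {
  role: "user" | "assistant";
  content: string;
}

// Cosine similarity between two vectors (missing components count as 0).
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  a.forEach((x, i) => {
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
  });
  b.forEach((y) => {
    normB += y * y;
  });
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k passages closest to the query vector, best match first.
function topK(query: number[], passages: Passage[], k: number): Passage[] {
  return passages
    .map((passage) => ({ passage, score: cosineSimilarity(query, passage.vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, k)
    .map((scored) => scored.passage);
}

// Combine retrieved context, recent conversation and the question into one prompt.
function buildPrompt(
  question: string,
  context: Passage[],
  history: ChatTurn[],
  maxTurns = 4, // assumed window size for "recent conversation"
): string {
  const contextBlock = context.map((p, i) => `[${i + 1}] ${p.text}`).join("\n");
  const historyBlock = history
    .slice(-maxTurns)
    .map((turn) => `${turn.role}: ${turn.content}`)
    .join("\n");
  return [
    "Answer using only the context below.",
    `Context:\n${contextBlock}`,
    `Conversation:\n${historyBlock}`,
    `Question: ${question}`,
  ].join("\n\n");
}
```

Numbering the passages in the prompt is one simple way to let the model point back to where an answer came from.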
Under the hood it's a deliberately serverless, two-part system. A Next.js app on Vercel handles the interface, login and uploads; a separate Cloudflare Worker does the heavy retrieval and generation close to the data. Documents live in Cloudflare R2, their text and your bots' prompts in Cloudflare D1, the searchable vectors in Cloudflare Vectorize, and your account in Neon Postgres — so each concern scales on its own and the whole thing runs without a server to babysit. It is deployed and live at askmybot.me.
Files upload straight to Cloudflare R2 through presigned URLs, then get parsed and chunked by Unstructured.io (by-title, with overlap), embedded with OpenAI, and written into the vector index and D1 — turning raw PDFs and Word docs into searchable knowledge.
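To make the idea of overlapping chunks concrete, the sketch below splits text into fixed-size character windows that share a margin with their neighbours. This is a deliberately simplified stand-in: the real pipeline delegates chunking to Unstructured.io's by-title strategy, and the function name and parameters here are hypothetical.

```typescript
// Fixed-size windows with overlap: a simplified substitute for by-title chunking.
// chunkSize and overlap are measured in characters (an assumption for this sketch).
function chunkWithOverlap(text: string, chunkSize: number, overlap: number): string[] {
  if (chunkSize <= 0) {
    throw new Error("chunkSize must be positive");
  }
  if (overlap < 0 || overlap >= chunkSize) {
    throw new Error("overlap must be at least 0 and smaller than chunkSize");
  }
  const chunks: string[] = [];
  const step = chunkSize - overlap; // how far each window advances
  for (let start = 0; start < text.length; start += step) {
    chunks.push(text.slice(start, start + chunkSize));
    // Stop once a window has reached the end, so no tail-only duplicate is emitted.
    if (start + chunkSize >= text.length) break;
  }
  return chunks;
}
```

The overlap means a sentence that straddles a boundary still appears whole in at least one chunk, at the cost of embedding some text twice.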
A question is first reformulated against the chat history, then embedded and matched against Cloudflare Vectorize; the top-k vector hits are resolved to their full passages in D1 by a custom LangChain retriever and passed to the model as grounded context.
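The tenant filter and the two-step lookup (vector hits first, full passages second) can be sketched with plain data structures. Everything below is an assumption for illustration: the VectorHit shape, the metadata keys userId and botId, and the Map standing in for D1. In the deployed system the filter would more plausibly be applied inside the Vectorize query itself rather than after the fact, and both lookups would be asynchronous.

```typescript
// Hypothetical hit shape: an id, a similarity score, and tenant metadata.
interface VectorHit {
  id: string;
  score: number;
  metadata: { userId: string; botId: string };
}

interface Tenant {
  userId: string;
  botId: string;
}

// Keep only hits owned by this user and bot, take the k best,
// then resolve each id to its full passage text (Map stands in for D1).
function retrieveForBot(
  hits: VectorHit[],
  tenant: Tenant,
  passagesById: Map<string, string>,
  k: number,
): string[] {
  return hits
    .filter(
      (hit) =>
        hit.metadata.userId === tenant.userId && hit.metadata.botId === tenant.botId,
    )
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((hit) => passagesById.get(hit.id))
    // Drop ids with no stored passage instead of passing undefined to the model.
    .filter((text): text is string => text !== undefined);
}
```

Filtering on both identifiers matters: a user-only filter would let one of a user's bots answer from another bot's knowledge base.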
A history-aware retriever and QA chain generate the reply, streamed to the browser over Server-Sent Events. The UI shows live status — connecting, retrieving, generating — so the wait is legible, and answers render as formatted markdown.
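For readers unfamiliar with Server-Sent Events, the sketch below shows how status updates and answer tokens might be framed on the wire and reassembled in the browser. The event names and the StreamEvent union are assumptions; only the three status stages (connecting, retrieving, generating) come from the description above. Each frame follows the standard SSE layout of an event line, a data line and a blank line.

```typescript
// Hypothetical event vocabulary for the stream.
type StreamEvent =
  | { type: "status"; stage: "connecting" | "retrieving" | "generating" }
  | { type: "token"; text: string }
  | { type: "done" };

// Encode one event as an SSE frame. JSON.stringify escapes newlines,
// so the payload always fits on a single "data:" line.
function encodeSse(event: StreamEvent): string {
  return `event: ${event.type}\ndata: ${JSON.stringify(event)}\n\n`;
}

// Decode a run of complete frames back into events (client-side view).
function decodeSse(stream: string): StreamEvent[] {
  return stream
    .split("\n\n")
    .filter((frame) => frame.trim().length > 0)
    .map((frame) => {
      const dataLine = frame.split("\n").find((line) => line.startsWith("data: "));
      if (dataLine === undefined) {
        throw new Error("SSE frame has no data line");
      }
      return JSON.parse(dataLine.slice("data: ".length)) as StreamEvent;
    });
}

// Rebuild the answer text by concatenating token events in order.
function assembleAnswer(events: StreamEvent[]): string {
  return events.map((e) => (e.type === "token" ? e.text : "")).join("");
}
```

In a browser, an EventSource or a fetch-based stream reader would typically do the frame splitting; decodeSse is only there to make the format visible.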
A Next.js frontend on Vercel is cleanly separated from a Cloudflare Worker (Hono) that owns retrieval and generation. The two talk over a bearer-token API, letting the RAG engine sit next to its data and scale independently of the web tier.
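One way the Worker side of that bearer-token check could look is sketched here as framework-free helpers, so it does not depend on any particular Hono API. The function names and the idea of a single shared token are assumptions; the source only states that the two tiers talk over a bearer-token API.

```typescript
// Pull the token out of an "Authorization: Bearer <token>" header value.
// The scheme name is matched case-insensitively.
function extractBearerToken(authorization: string | null | undefined): string | null {
  if (!authorization) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(authorization.trim());
  return match?.[1] ?? null;
}

// Compare two strings without stopping at the first differing character.
// (The early length check still reveals whether the lengths match.)
function constantTimeEqual(a: string, b: string): boolean {
  if (a.length !== b.length) return false;
  let diff = 0;
  for (let i = 0; i < a.length; i++) {
    diff |= a.charCodeAt(i) ^ b.charCodeAt(i);
  }
  return diff === 0;
}

// True only when the header carries exactly the expected token.
function isAuthorized(
  authorization: string | null | undefined,
  expectedToken: string,
): boolean {
  const token = extractBearerToken(authorization);
  return token !== null && constantTimeEqual(token, expectedToken);
}
```

A request handler would call isAuthorized with the incoming Authorization header and respond with 401 when it returns false.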
NextAuth handles email/password sign-up with verification plus Google and Facebook OAuth; accounts and sessions live in Neon Postgres via Drizzle ORM, and transactional email (verification, password reset) goes out through Resend.
Each bot carries its own system prompt and a default-bot setting, and knowledge can be added or removed by named domain — so owners tune voice and scope, and curate exactly what their bot knows.
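A small data-model sketch may help picture this. The field names (systemPrompt, isDefault, domain) and both helpers are hypothetical; the actual D1 schema is not described here. The fallback to the first bot when none is flagged as default is likewise an assumption.

```typescript
// Hypothetical records for a bot and a piece of its knowledge.
interface Bot {
  id: string;
  name: string;
  systemPrompt: string;
  isDefault: boolean;
}

interface KnowledgeChunk {
  id: string;
  botId: string;
  domain: string; // the named domain the chunk was added under
  text: string;
}

// Choose the bot flagged as default, falling back to the first one (assumed behaviour).
function pickDefaultBot(bots: Bot[]): Bot | undefined {
  return bots.find((bot) => bot.isDefault) ?? bots[0];
}

// Remove every chunk of one named domain from one bot, leaving the rest untouched.
function removeDomain(
  chunks: KnowledgeChunk[],
  botId: string,
  domain: string,
): KnowledgeChunk[] {
  return chunks.filter((chunk) => !(chunk.botId === botId && chunk.domain === domain));
}
```

Tagging chunks with a domain at ingestion time is what makes later removal a simple filter rather than a per-file hunt.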
Copy and regenerate any reply, listen to answers via browser text-to-speech, and manage files in a drag-and-drop browser built on Atlaskit pragmatic drag-and-drop — consumer-grade polish on top of the AI plumbing.