AI / RAG System | In Progress

Knowledge Vault — Multi-Tenant RAG Platform

A domain-agnostic, multi-tenant RAG knowledge platform. One pipeline (PDF ingestion, chunking, embeddings, pgvector similarity search, and citation-grounded answers) serves legal documents, company policies, research papers, compliance manuals, and internal knowledge bases. The architecture is identical across industries; only the content changes.
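The chunking step could be sketched as below. This is a minimal illustration, not the platform's actual code: the `Chunk` type, the `chunk_text` function, and the word-based window with `size` and `overlap` parameters are all assumptions, and a real pipeline may split on tokens, sentences, or PDF layout instead.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    """One retrievable unit of text (hypothetical shape)."""
    tenant_id: str
    document_id: str
    index: int
    text: str


def chunk_text(tenant_id, document_id, text, size=200, overlap=50):
    """Split text into overlapping word windows tagged with tenant and document."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("require size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap  # how far the window advances each time
    chunks = []
    for index, start in enumerate(range(0, len(words), step)):
        window = words[start:start + size]
        chunks.append(Chunk(tenant_id, document_id, index, " ".join(window)))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks
```

Carrying `tenant_id` on every chunk from the start means the later embedding and search stages never handle a vector whose owner is unknown.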

Highlights

  • Document ingestion and chunking pipeline
  • Embeddings stored in PostgreSQL via pgvector
  • Citation-grounded answers, so each response can be checked against the passages it cites
  • Per-tenant isolation of documents and vectors
  • Reusable across legal, policy, research, and compliance content
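One way citation grounding might work is to number the retrieved passages in the prompt and then reject any answer that cites nothing or cites a source that was never supplied. The sketch below assumes that approach; `build_grounded_prompt`, `cited_sources`, `is_grounded`, and the `[n]` citation format are hypothetical names chosen for illustration, not the platform's confirmed design.

```python
import re


def build_grounded_prompt(question, sources):
    """Assemble a prompt from (label, text) pairs, numbered for citation."""
    lines = [
        "Answer using only the numbered sources below.",
        "Cite each claim as [n]. If the sources do not contain the answer, say so.",
        "",
    ]
    for n, (label, text) in enumerate(sources, start=1):
        lines.append(f"[{n}] ({label}) {text}")
    lines += ["", f"Question: {question}"]
    return "\n".join(lines)


def cited_sources(answer):
    """Return the set of source numbers referenced as [n] in the answer."""
    return {int(n) for n in re.findall(r"\[(\d+)\]", answer)}


def is_grounded(answer, source_count):
    """True if the answer cites at least one source and only supplied ones."""
    cited = cited_sources(answer)
    return bool(cited) and all(1 <= n <= source_count for n in cited)
```

This check only confirms that citations point at real sources. It does not verify that the cited passage actually supports the claim, which would need a separate step.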

Architecture Decision Records

pgvector instead of a separate vector database

Context: We needed vector search that respects tenant boundaries without operating an extra service.

Decision: Store embeddings in PostgreSQL with pgvector, alongside the relational data. This keeps the stack simple and transactional, and tenant scoping becomes an ordinary WHERE clause or join rather than a cross-service concern.
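A tenant-scoped similarity query under this decision might look like the sketch below. The `chunks` and `documents` tables, their columns, and the `tenant_search_query` helper are assumptions made for illustration. The `<=>` operator is pgvector's cosine distance, and the `%s` placeholders follow the psycopg parameter style.

```python
def tenant_search_query(tenant_id, query_embedding, limit=5):
    """Build a parameterized pgvector query restricted to one tenant.

    Returns (sql, params) for a DB-API cursor, e.g. cursor.execute(sql, params).
    Table and column names are hypothetical.
    """
    # pgvector accepts a text literal such as '[0.1,0.2]' cast to vector.
    vector_literal = "[" + ",".join(str(float(x)) for x in query_embedding) + "]"
    sql = (
        "SELECT d.title, c.content, c.embedding <=> %s::vector AS distance\n"
        "FROM chunks AS c\n"
        "JOIN documents AS d ON d.id = c.document_id\n"
        "WHERE c.tenant_id = %s\n"
        "ORDER BY c.embedding <=> %s::vector\n"
        "LIMIT %s"
    )
    params = (vector_literal, tenant_id, vector_literal, limit)
    return sql, params
```

Because the tenant filter and the vector ordering sit in the same statement, a query cannot return another tenant's chunks unless the filter itself is omitted. Row-level security is one possible way to enforce that at the database level.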

Domain-agnostic pipeline over a niche legal app

Context: The original concept was law-firm specific, but the underlying pipeline is identical for any document corpus.

Decision: Present the system as a reusable multi-tenant RAG platform, demonstrating versatility across industries while keeping a single technical architecture.