knowledge-retrieval-agent-fabric
A system for ingesting web content via crawling to establish AI-driven search utilities that amplify content accessibility and facilitate advanced interactions with large language models.
Author

madarco
Knowledge Retrieval Agent Fabric (RagRabbit)
Self-Managed Web Content Intelligence Engine, LLMs.txt Generator, and MCP Gateway for seamless integration with generative AI tools. Features one-click deployment on Vercel.
Operational Mechanics
RagRabbit is a Next.js application organized as a Turborepo monorepo. It uses LlamaIndex for data processing and stores vectors in PostgreSQL through the pgVector extension.
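To make the role of the vector store concrete, here is a minimal in-memory sketch of nearest-neighbour lookup over embeddings. It assumes cosine distance (one of the distance measures pgVector supports); the source does not say which measure RagRabbit actually uses, and the `IndexedChunk` type and function names are hypothetical.

```typescript
// Hypothetical in-memory stand-in for a pgVector table of page embeddings.
interface IndexedChunk {
  url: string;
  embedding: number[];
}

// Cosine distance: 0 for vectors pointing the same way, 1 for orthogonal ones.
function cosineDistance(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error("embedding dimensions differ");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error("zero-length embedding");
  }
  return 1 - dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Return the k chunks closest to the query embedding (assumed ranking rule).
function nearestChunks(query: number[], chunks: IndexedChunk[], k: number): IndexedChunk[] {
  return chunks
    .slice()
    .sort((x, y) => cosineDistance(query, x.embedding) - cosineDistance(query, y.embedding))
    .slice(0, k);
}
```

In a real deployment this ranking would happen inside PostgreSQL rather than in application code; the sketch only illustrates the idea.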
Key Capabilities
- 💬 Conversational Interface: Deployable AI agent and immediate informational retrieval module.
- 🕸️ Web Content Indexer: Systematically scrapes and builds indices of web pages, storing embeddings in pgVector via PostgreSQL.
- 📄 LLMs.txt Artifact Generation: Fully adaptable text output creation with granular control over the Table of Contents (ToC) sequencing.
- 🔌 MCP Gateway: Execute npx @ragrabbit/mcp to grant supported AI clients (like Claude Desktop and Cursor IDE) access to your indexed documentation.
- 🛠️ Adaptability: Supports configurable authentication protocols, is open source, and manages API key access.
- 🚀 Simplified Provisioning: One-step setup process available on Vercel.
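As an illustration of the LLMs.txt capability listed above, the following sketch renders a file in the layout commonly used for llms.txt (an H1 title, a blockquote summary, then H2 sections of links) and uses an explicit `order` field to control ToC sequencing. The `DocPage` shape and `renderLlmsTxt` are hypothetical and are not taken from RagRabbit's code.

```typescript
// Hypothetical page record; `order` controls the position in the ToC.
interface DocPage {
  title: string;
  url: string;
  section: string;
  order: number;
  description?: string;
}

function renderLlmsTxt(siteName: string, summary: string, pages: DocPage[]): string {
  const lines: string[] = [`# ${siteName}`, "", `> ${summary}`];

  // Group pages by section, visiting them in ToC order so that both the
  // sections and the links inside them follow the requested sequence.
  const sections = new Map<string, DocPage[]>();
  const sorted = pages.slice().sort((a, b) => a.order - b.order);
  sorted.forEach((page) => {
    const bucket = sections.get(page.section) ?? [];
    bucket.push(page);
    sections.set(page.section, bucket);
  });

  sections.forEach((entries, section) => {
    lines.push("", `## ${section}`, "");
    entries.forEach((page) => {
      const note = page.description ? `: ${page.description}` : "";
      lines.push(`- [${page.title}](${page.url})${note}`);
    });
  });

  return lines.join("\n") + "\n";
}
```

Reordering the ToC then amounts to changing the `order` values before rendering.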
Interoperability: Fumadocs (see Fumadocs Adapter under Interoperability Protocols).
Live Demonstration
Explore the RagRabbit Demonstration Portal
Deployment Procedure
To initiate the service on Vercel:
Prerequisites for Operation:
- Node.js runtime version 20.x or newer
- A PostgreSQL instance equipped with the pgVector extension
- A valid OpenAI API credential key
- (Optional) An API credential key for Trigger.dev
Parameter Configuration
Set the following environmental parameters:
- OPENAI_API_KEY
For credentials-based user access:
- ADMIN_USER
- ADMIN_PASSWORD
For email-based access:
- RESEND_AUTH=true
- To limit email recipients: RESEND_ALLOWED_EMAILS="user1@domain.com,user2@domain.com"
- To log login links instead of sending emails (visible in Vercel logs): SIMULATE_EMAILS=true
Refer to .env.example for the comprehensive variable manifest.
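The settings above could be checked before starting the app with a small helper like the one below. This is a sketch under two assumptions that the source does not confirm: that credentials login and email login are alternatives (only one needs to be configured), and that an unset RESEND_ALLOWED_EMAILS means no restriction. The function names are hypothetical.

```typescript
type Env = Record<string, string | undefined>;

// List the variables that still need a value for the chosen login mode.
function missingSettings(env: Env): string[] {
  const missing: string[] = [];
  if (!env.OPENAI_API_KEY) missing.push("OPENAI_API_KEY");

  // Assumption: RESEND_AUTH=true selects email login, so the admin
  // credentials are only required when it is not enabled.
  if (env.RESEND_AUTH !== "true") {
    if (!env.ADMIN_USER) missing.push("ADMIN_USER");
    if (!env.ADMIN_PASSWORD) missing.push("ADMIN_PASSWORD");
  }
  return missing;
}

// Parse the comma-separated allow-list; null means "no restriction" (assumed).
function allowedEmails(env: Env): string[] | null {
  const raw = env.RESEND_ALLOWED_EMAILS;
  if (!raw) return null;
  return raw
    .split(",")
    .map((entry) => entry.trim())
    .filter((entry) => entry.length > 0);
}

function isEmailAllowed(env: Env, email: string): boolean {
  const list = allowedEmails(env);
  return list === null || list.indexOf(email) !== -1;
}
```

In a Node.js script, `missingSettings(process.env)` would report what is still unset.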
Usage Instructions
Navigate to the Indexing control panel to introduce new targets for data ingestion, supporting either a singular URL or a complete website for recursive web-crawling:
Subsequently, activate the Job Runner service (maintain the browser tab open until the process finalizes):
Within the LLMs.txt interface, you can review the generated LLMs.txt document before finalization:
You can then embed the interactive widget onto your webpage via the subsequent snippet:
Chat Activation Button
Place a button element at the bottom periphery of your page:
Search Interface Widget
Integrate a dedicated search input field at any desired page location:
React.js Integration Context
```typescript
"use client";

import Script from "next/script";

export function RagRabbitSearch() {
  return (
    <>
      {/* ... */}
    </>
  );
}
```
MCP Interface Server
This dedicated server component enables any compatible AI application to semantically query and retrieve content from your documentation repository.
Claude Desktop Client
Configure a custom MCP server within Claude Desktop, naming it after your documentation product, allowing Claude AI to reference it for information retrieval.
In claude_desktop_config.json (open it via Claude -> Settings -> Developer -> Edit Config):
{
"mcpServers": {
"
Cursor IDE Integration
Access Cursor -> Settings -> Cursor Settings -> MCP Menu.
Add a new MCP entry designated as type command with the following directive:
npx @ragrabbit/mcp", "http://
Configuration Fields:
- ragrabbit-url: (Mandatory) The base URL of your RagRabbit deployment, e.g., https://my-ragrabbit-instance.vercel.com/
- name: (Mandatory) A distinct identifier for the documentation retrieval service (defaults to "RagRabbit"), which the AI uses to recognize it as a source of context.
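To show how the two fields fit together, the sketch below builds an MCP client configuration object of the `mcpServers` / `command` / `args` shape that Claude Desktop reads. It assumes the URL and the name are passed to `npx @ragrabbit/mcp` as positional arguments in that order, which is an inference from the field list above rather than something the source states in full. `ragRabbitMcpConfig` is a hypothetical helper.

```typescript
interface McpServerEntry {
  command: string;
  args: string[];
}

interface McpClientConfig {
  mcpServers: Record<string, McpServerEntry>;
}

// Build a config entry keyed by the documentation name.
// Assumed argument order: package, ragrabbit-url, name.
function ragRabbitMcpConfig(ragrabbitUrl: string, name: string = "RagRabbit"): McpClientConfig {
  if (!/^https?:\/\/\S+$/.test(ragrabbitUrl)) {
    throw new Error("ragrabbit-url must be an http(s) URL");
  }
  if (name.trim() === "") {
    throw new Error("name must not be empty");
  }
  return {
    mcpServers: {
      [name]: {
        command: "npx",
        args: ["@ragrabbit/mcp", ragrabbitUrl, name],
      },
    },
  };
}
```

Calling `JSON.stringify(ragRabbitMcpConfig("https://my-ragrabbit-instance.vercel.com/", "my-docs"), null, 2)` yields text that could be pasted into the config file, provided the assumed argument order matches your version of the package.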
Configurability Parameters
Chat Button Customization
You can tailor the appearance of the chat activation button by supplying specific query parameters to the widget.js script tag:
buttonText Parameter
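One way to produce the script URL with such query parameters is sketched below. It assumes that widget.js is served from the root of your RagRabbit deployment, which the source does not state explicitly; `widgetScriptSrc` is a hypothetical helper, and `buttonText` is the only parameter name taken from this document.

```typescript
// Build the src attribute for the widget script tag, URL-encoding each
// query parameter so that spaces and punctuation survive intact.
function widgetScriptSrc(baseUrl: string, params: Record<string, string>): string {
  const root = baseUrl.replace(/\/+$/, ""); // drop trailing slashes
  const query = Object.keys(params)
    .map((key) => `${encodeURIComponent(key)}=${encodeURIComponent(params[key])}`)
    .join("&");
  return query.length > 0 ? `${root}/widget.js?${query}` : `${root}/widget.js`;
}
```

For example, `widgetScriptSrc("https://my-ragrabbit-instance.vercel.com/", { buttonText: "Ask AI" })` gives a value suitable for the `src` of the script tag.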
Search Widget Customization
Parameters can be supplied via the initialization call to mountSearch after defining the target container element:
searchPlaceholder Parameter
Interoperability Protocols
Fumadocs Adapter
Implement a component to supplant the default Search Dialog interface:
```bash
pnpm add @ragrabbit/search-react
```

```typescript
"use client";

import type { SharedProps } from "fumadocs-ui/components/dialog/search";
import { RagRabbitModal } from "@ragrabbit/search-react";

export default function SearchDialog({ open, onOpenChange }: SharedProps) {
  return /* ... */;
}
```
Then, declare this custom component within the layout.tsx configuration:
```tsx
<RootProvider
  search={{
    SearchDialog,
  }}
  ...
```
Optionally, introduce the floating Chat interface trigger:
```typescript
"use client";

import { RagRabbitChatButton } from "@ragrabbit/search-react";

export default function ChatButton() {
  return /* ... */;
}
```
And integrate this component into your layout.tsx structure:
tsx
