🛠️ Tool

rag

by RoodyCode

Nexus Index (FNI V2.0)
40.9 · Top 100%
Semantic (S): 50
Authority (A): 0
Popularity (P): 26
Recency (R): 96
Quality (Q): 50

Tech Context
Downloads (30 days): 0 (0.0%)
Language: Python
License: Open Source
Stars: 2
Version: 1.0.0
Reliability: Alpha

Tool Information Summary

Entity Passport
Registry ID: gh-tool--roodycode--rag
Provider: github
📜 Cite this tool

Academic & Research Attribution

BibTeX
@misc{gh_tool__roodycode__rag,
  author = {RoodyCode},
  title = {rag Tool},
  year = {2026},
  howpublished = {\url{https://free2aitools.com/tool/gh-tool--roodycode--rag}},
  note = {Accessed via Free2AITools Knowledge Fortress}
}
APA Style
RoodyCode. (2026). rag [Tool]. Free2AITools. https://free2aitools.com/tool/gh-tool--roodycode--rag

🔬 Technical Deep Dive

Quick Commands

🐍 Install (from source, via uv; this project is not distributed on PyPI)
bash
uv sync

📋 Specs

Language: Python
License: Open Source
Version: 1.0.0

Usage documentation not yet indexed for this tool.

Technical Documentation

RAG Knowledge Base

A private document RAG (Retrieval-Augmented Generation) system that ingests PDFs and exposes a search tool via an MCP server. Retrieval combines vector search (pgvector) and BM25 keyword search with cross-encoder reranking, and answers are generated by an AWS Bedrock LLM.
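The relative-score fusion used to combine the two retrievers can be illustrated with a small, dependency-free sketch: each retriever's scores are min-max normalized so cosine similarities and BM25 scores become comparable, then summed per document. Names and numbers below are illustrative, not taken from the project.

```python
def relative_score_fusion(result_lists, weights=None):
    """Fuse ranked (doc_id, score) lists: min-max normalize each
    retriever's scores, then sum (optionally weighted) per document."""
    weights = weights or [1.0] * len(result_lists)
    fused = {}
    for results, w in zip(result_lists, weights):
        scores = [s for _, s in results]
        lo, hi = min(scores), max(scores)
        for doc_id, s in results:
            norm = (s - lo) / (hi - lo) if hi > lo else 1.0
            fused[doc_id] = fused.get(doc_id, 0.0) + w * norm
    return sorted(fused.items(), key=lambda kv: kv[1], reverse=True)

# Vector search and BM25 score on very different scales.
vector_hits = [("doc_a", 0.82), ("doc_b", 0.75), ("doc_c", 0.40)]
bm25_hits = [("doc_b", 14.2), ("doc_d", 9.1), ("doc_a", 2.3)]
print(relative_score_fusion([vector_hits, bm25_hits]))
```

A document that ranks well in both lists (doc_b here) rises to the top even though neither raw score is comparable to the other retriever's.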

Architecture

mermaid
flowchart LR
    subgraph Ingestion
        direction TB
        A[📄 data/ PDFs] --> B[Docling PDF Parser]
        B --> C[HybridChunker BAAI/bge-m3 tokenizer]
        C --> D[HuggingFace Embeddings BAAI/bge-m3 · 1024-dim]
        D --> E[(pgvector PostgreSQL)]
        C --> F[(Redis BM25 Docstore)]
    end
    subgraph Query["Query — mcp_server.py"]
        direction TB
        G[search_knowledge tool call] --> H[Vector Retriever pgvector]
        G --> I[BM25 Retriever Redis]
        H & I --> J[QueryFusionRetriever relative_score fusion]
        J --> K[Cross-encoder Reranker BAAI/bge-reranker-large]
        K --> L[BedrockConverse LLM]
        L --> M[Answer + Sources]
    end
    E --> H
    F --> I
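The keyword half of the retrieval path is standard Okapi BM25. A minimal scoring sketch in plain Python (k1 = 1.5 and b = 0.75 are common defaults, not values confirmed from this project; the real docstore lives in Redis):

```python
import math

def bm25_score(query_terms, doc, corpus, k1=1.5, b=0.75):
    """Okapi BM25 score of one tokenized document for a bag-of-words query."""
    n = len(corpus)
    avgdl = sum(len(d) for d in corpus) / n  # average document length
    score = 0.0
    for term in query_terms:
        df = sum(1 for d in corpus if term in d)      # document frequency
        idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
        tf = doc.count(term)                          # term frequency
        score += idf * (tf * (k1 + 1)) / (
            tf + k1 * (1 - b + b * len(doc) / avgdl)
        )
    return score

corpus = [
    "the cat sat on the mat".split(),
    "postgres stores vectors with pgvector".split(),
    "redis keeps the bm25 docstore".split(),
]
print(bm25_score(["bm25", "docstore"], corpus[2], corpus))
```

Documents containing none of the query terms score zero, which is why BM25 alone misses paraphrases; the vector retriever covers that case, and the fusion step merges both views.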

Prerequisites

  • Python 3.11+
  • uv
  • Docker & Docker Compose
  • AWS credentials with Bedrock access

Build and Run

Configuration

ingestion/config.py is the source of truth for supported environment variables and defaults. Copy .env.example to .env and update credentials/endpoints for your environment.

env
DATABASE_URL=postgresql://chat-app:admin@localhost:5432/chat_app
BEDROCK_API_KEY=
AWS_REGION=eu-central-1
DATA_DIR=./data
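As a rough, stdlib-only illustration of how such settings resolve (the real implementation uses Pydantic settings in ingestion/config.py and may differ in field names and defaults):

```python
import os
from dataclasses import dataclass, field

@dataclass
class Settings:
    # Defaults mirror the sample .env above; authoritative defaults
    # live in ingestion/config.py.
    database_url: str = field(default_factory=lambda: os.environ.get(
        "DATABASE_URL",
        "postgresql://chat-app:admin@localhost:5432/chat_app",
    ))
    bedrock_api_key: str = field(
        default_factory=lambda: os.environ.get("BEDROCK_API_KEY", ""))
    aws_region: str = field(
        default_factory=lambda: os.environ.get("AWS_REGION", "eu-central-1"))
    data_dir: str = field(
        default_factory=lambda: os.environ.get("DATA_DIR", "./data"))

settings = Settings()
print(settings.aws_region)
```

Pydantic settings add validation and .env loading on top of this pattern; the point is simply that environment variables override the defaults.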

Local Build and Run

  1. Install Python dependencies:
bash
uv sync
  2. Start backing services:
bash
docker compose up pgvector redis -d
  3. Enqueue ingestion jobs:
bash
uv run python ingest.py
  4. Start one or more workers:
bash
uv run python worker.py
  5. Query locally (optional):
bash
uv run python ask.py "your question"
  6. Run MCP server:
bash
uv run python mcp_server.py

The server starts on http://localhost:8000 and exposes:

Tool: search_knowledge
Description: Searches the knowledge base and returns an answer with source file citations
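MCP tool invocations travel as JSON-RPC 2.0 messages with the tools/call method; a client calling search_knowledge would send a payload shaped roughly like the one below. The argument name query is an assumption; check the tool's signature in mcp_server.py for the real parameter name.

```python
import json

# Hypothetical tools/call request for the search_knowledge tool.
# "query" is a guessed argument name, not confirmed from mcp_server.py.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_knowledge",
        "arguments": {"query": "What does the handbook say about VPN access?"},
    },
}
print(json.dumps(request, indent=2))
```

In practice an MCP client library builds and sends this for you; the sketch only shows the wire shape a tool call takes.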

Docker Build and Run

Build images:

bash
docker compose build

Run full stack:

bash
docker compose up -d

Run only ingestion infrastructure + workers:

bash
docker compose up -d pgvector redis worker

Scale worker count:

bash
docker compose up -d --scale worker=3 worker

Run ingestion from inside the worker container (optional):

bash
docker compose exec worker bash
uv run python ingest.py

Maintenance

  • Add dependencies with uv add <package> and commit both pyproject.toml and uv.lock.
  • Rebuild images after dependency changes with docker compose build.
  • When adding or renaming settings, update both ingestion/config.py and .env.example.

Project Structure

text
.
├── data/                  # PDF documents to ingest
├── ingestion/
│   ├── config.py          # Pydantic settings (loaded from .env)
│   ├── pipeline.py        # Docling parsing, embedding, pgvector + Redis ingestion
│   ├── queue.py           # Redis/RQ enqueueing for ingestion jobs
│   └── tasks.py           # Worker task wrappers around ingestion functions
├── query/
│   └── engine.py          # Hybrid retriever + reranker + Bedrock LLM query engine
├── ingest.py              # Ingestion queue producer entry point
├── worker.py              # RQ worker entry point
├── ask.py                 # Local interactive query CLI
├── mcp_server.py          # FastMCP server exposing search_knowledge tool
├── Dockerfile
└── docker-compose.yml

Social Proof

GitHub Repository
2 stars
🔄 Daily sync (03:00 UTC)

AI Summary: Based on GitHub metadata. Not a recommendation.


๐Ÿ›ก๏ธ Tool Transparency Report

Technical metadata sourced from upstream repositories.

Open Metadata

🆔 Identity & Source

id: gh-tool--roodycode--rag
slug: roodycode--rag
source: github
author: RoodyCode
license:
tags: ai, document-ingestion, embeddings, llamaindex, pdf-processing, personal-knowledge-base, postgres, rag, redis, retrieval-augmented-generation, self-hosted, semantic-search, vector-database, python

⚙️ Technical Specs

architecture: null
params (billions): null
context length: null
pipeline tag: feature-extraction

📊 Engagement & Metrics

downloads: 0
stars: 2
forks: 0
github stars: 2

Data indexed from public sources. Updated daily.