Give your LLM a memory it never forgets
Drop-in REST API that adds persistent, cross-session memory to any LLM application. 2ms retrieval. Zero infrastructure.
Memory Pipeline
Proven results — not just promises
Benchmarked against a baseline LLM with no memory layer
Benchmark results (Exp 44)
| Mode | Factual score | Keyword hit rate |
|---|---|---|
| ★ NGT Memory | 2.44 / 3 | 57% |
| No memory | 1.33 / 3 | 27% |
LLMs without memory are broken by design
Every session starts fresh. Your users have to repeat themselves. Your AI gives generic, potentially dangerous advice. NGT Memory fixes this.
Real example — Restaurant recommendation in Kyoto
“Ippudo is great for ramen lovers” — recommends meat to a vegetarian
“Shigetsu at Tenryu-ji serves shojin ryori (Buddhist vegan cuisine)” — personalized because it remembers you’re vegetarian
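The difference comes down to what the model sees. A minimal sketch of the idea (not the NGT Memory API; `build_prompt` and the fact strings are illustrative) shows how a remembered fact turns the bare question into a personalized prompt:

```python
# Illustrative sketch only: how injecting a remembered fact changes the
# prompt an LLM receives. Function and fact names are hypothetical.

def build_prompt(question: str, memories: list[str]) -> str:
    """Prepend retrieved memories to the user's question."""
    if not memories:
        return question
    context = "\n".join(f"- {m}" for m in memories)
    return f"Known facts about this user:\n{context}\n\nQuestion: {question}"

question = "Recommend a restaurant in Kyoto."

# Without memory: the model sees only the bare question.
print(build_prompt(question, []))

# With memory: the vegetarian fact from earlier sessions is injected.
print(build_prompt(question, ["The user is vegetarian."]))
```

With the fact injected, a recommendation like Shigetsu's shojin ryori becomes the natural answer instead of a generic ramen pick.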
How NGT Memory works
A simple pipeline that injects relevant memories into every LLM prompt
Request Pipeline
Cosine Similarity
Semantically close facts retrieved via vector similarity search
Hebbian Graph
Associative links between concepts, like the human brain
Hierarchical Consolidation
Important facts promoted to long-term memory automatically
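The three stages above can be sketched in a few lines. This is a toy illustration of the concepts, not NGT Memory's internals; the data structures, threshold, and learning rate are assumptions:

```python
# Toy sketch of the three pipeline stages. Embeddings, weights, and the
# promotion threshold are made up for illustration.
import math
from collections import defaultdict

def cosine(a, b):
    """Cosine similarity between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

# Stage 1: cosine-similarity retrieval over stored fact embeddings.
facts = {
    "user is vegetarian": [0.9, 0.1, 0.0],
    "user lives in Berlin": [0.1, 0.9, 0.2],
}
query = [0.8, 0.2, 0.1]  # embedding of a food-related question
best = max(facts, key=lambda f: cosine(query, facts[f]))

# Stage 2: Hebbian graph -- concepts retrieved together strengthen
# their associative link ("fire together, wire together").
edge_weight = defaultdict(float)
def reinforce(a, b, rate=0.1):
    edge_weight[(a, b)] += rate * (1.0 - edge_weight[(a, b)])

reinforce("vegetarian", "restaurant")
reinforce("vegetarian", "restaurant")  # repeated co-activation -> stronger link

# Stage 3: hierarchical consolidation -- links past a threshold are
# promoted to long-term memory.
LONG_TERM_THRESHOLD = 0.15
long_term = [pair for pair, w in edge_weight.items() if w >= LONG_TERM_THRESHOLD]
```

The Hebbian update is saturating, so repeated co-activation approaches but never exceeds a weight of 1.0, and only repeatedly reinforced associations cross the consolidation threshold.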
Up and running in 5 minutes
Drop-in REST API — no new infrastructure, no vector database, no vendor lock-in
# 1. Clone the repository
git clone https://github.com/ngt-memory/ngt-memory.git
cd ngt-memory
# 2. Configure environment
cp .env.example .env
# Set OPENAI_API_KEY in .env
# 3. Start the service
docker-compose up -d
# ✓ NGT Memory is running at http://localhost:8000
Everything you need
Production-ready memory layer with all the features your LLM app needs
Persistent Memory
Stores facts between sessions — users never repeat themselves
2ms Retrieval
Graph + cosine search with no external database required
Drop-in REST API
Integrates into any LLM app in under 5 minutes
Multi-session
Isolated memory per user — scales to thousands of concurrent sessions
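Conceptually, isolation means each session id maps to its own fact store. A minimal sketch (illustrative only; class and method names are not the NGT Memory schema):

```python
# Illustrative sketch of per-session memory isolation. The class and
# method names are hypothetical, not the NGT Memory API.
from collections import defaultdict

class MemoryStore:
    def __init__(self):
        self._sessions = defaultdict(list)  # session_id -> list of facts

    def remember(self, session_id: str, fact: str):
        self._sessions[session_id].append(fact)

    def recall(self, session_id: str) -> list[str]:
        # Only this session's facts are ever returned.
        return list(self._sessions[session_id])

store = MemoryStore()
store.remember("alice", "allergic to penicillin")
store.remember("bob", "prefers email support")
```

One user's recall never touches another user's facts, which is what lets the store scale horizontally by session.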
Docker Ready
One-command deployment — docker-compose up -d
Local-first
Runs entirely on your infrastructure — no cloud dependency
Hebbian Graph
Associative links between concepts, like the human brain
Built-in Analytics
Memory metrics, session stats, retrieval performance
API Key Auth
Optional endpoint protection with configurable API keys
How we compare
NGT Memory is the only solution in this comparison that requires no external vector database and delivers 2ms retrieval
| Feature | ★ NGT Memory | Mem0 | Zep | LangChain Memory |
|---|---|---|---|---|
| Self-hosted | ✓ | | | |
| No vector DB required | ✓ | | | |
| Hebbian graph | ✓ | | | |
| Retrieval latency | 2ms | ~50ms | ~100ms | ~30ms |
| Open source | ✓ | | | |
| REST API | ✓ | | | |
Built for real-world AI applications
From healthcare to consumer apps — NGT Memory makes every LLM application smarter with context
Medical AI Assistant
Remembers allergies, medications, and patient history across sessions. Never gives advice that conflicts with known conditions.
💡 Patient mentioned penicillin allergy 3 sessions ago → avoided in all subsequent recommendations
Personal AI Companion
Remembers preferences, plans, and important life events. Grows smarter and more personal with every conversation.
💡 Knows you're vegetarian, live in Berlin, and training for a marathon
Customer Support Bot
Remembers support history, preferences, and past resolutions. No more asking customers to repeat themselves.
💡 Customer contacted support 3 times about billing → context injected automatically