The Technology Behind Instant Documentation Search
Ever wonder how ULPI makes your AI instantly know your entire codebase? It's not magic: it's a carefully engineered pipeline that transforms scattered documentation into semantic search that actually works. This guide explains:
🔍 How semantic search understands meaning, not just keywords
⚡ Why ULPI is 25x more efficient than loading full docs
🔄 How updates sync in under 60 seconds
🔒 How your private docs stay secure
Not interested in technical details? Skip to Search Features for practical usage tips. Want to integrate via API? Jump to API Integration.
One-click OAuth connection to GitHub, GitLab, Bitbucket, or Gitea. ULPI automatically discovers all your documentation:
README files
docs/ directories
Wikis
Markdown files everywhere
2. AI Indexes Your Docs (2-5 minutes)
Automatic processing in the background:
Parse every documentation file
Break into logical sections (chunking)
Convert to vector embeddings (AI representation)
Store in lightning-fast search engine (Typesense)
You don’t do anything. It just works.
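The four steps above can be sketched as a tiny pipeline. This is an illustrative sketch, not ULPI's actual internals: `embed()` and `store()` are placeholders for the real embedding model and the Typesense client.

```javascript
// Illustrative indexing pipeline: parse -> chunk -> embed -> store.
// embed() and store() are placeholders for the real embedding model
// and Typesense client (assumptions, not ULPI's actual internals).
async function indexFile(path, markdown, embed, store) {
  // Naive heading-based chunking (the chunking strategy below is smarter)
  const chunks = markdown.split(/\n(?=## )/);
  for (const [i, chunk] of chunks.entries()) {
    const vector = await embed(chunk);                        // AI representation
    await store({ id: `${path}#${i}`, path, chunk, vector }); // searchable doc
  }
  return chunks.length;
}
```

Each chunk becomes one searchable document, so a query can return a single relevant section instead of the whole file.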
3. Your AI Asks Questions (instant)
When your AI assistant needs docs:
```
AI: "I need docs about authentication"
        ↓
ULPI: Searches all repos semantically (40ms)
        ↓
ULPI: Returns relevant sections (2,000 tokens)
        ↓
AI: "Based on your docs/auth.md, here's how..."
```
Total time: Under 50 milliseconds
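That round trip boils down to a single HTTP call. Here is a hedged sketch of building it: the endpoint path matches the `POST /api/v1/documentation/search` call shown in the MCP walkthrough later in this guide, but the header set and any payload fields beyond `query` and `limit` are assumptions.

```javascript
// Build the search request a client would send to ULPI.
// Endpoint path is from the MCP example in this guide; headers are assumptions.
function buildSearchRequest(query, limit = 5) {
  return {
    url: "https://api.ulpi.io/api/v1/documentation/search",
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ query, limit }),
  };
}

// Usage (Node 18+ or browser):
//   const req = buildSearchRequest("authentication");
//   const res = await fetch(req.url, req);
```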
4. Auto-Sync on Every Push (60 seconds)
You push to main:
```
git push origin main
```
ULPI automatically:
Receives webhook notification (1 second)
Re-indexes changed files (30-60 seconds)
AI now sees updated documentation
No manual sync button. Always up-to-date.
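The re-index step above starts from the webhook payload. A minimal sketch of picking out the changed documentation files, assuming the payload mirrors GitHub's push event (`commits[].added` / `commits[].modified`), which is an assumption about what ULPI consumes:

```javascript
// Pick out the documentation files a push webhook says have changed.
// Payload shape mirrors GitHub's push event; field names are assumptions.
function changedDocFiles(pushEvent) {
  const files = new Set();
  for (const commit of pushEvent.commits ?? []) {
    for (const path of [...(commit.added ?? []), ...(commit.modified ?? [])]) {
      if (path.endsWith(".md")) files.add(path); // only re-index docs
    }
  }
  return [...files];
}
```

Only the files returned here need re-chunking and re-embedding, which is why a typical push syncs in well under a minute.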
The magic: AI understands meaning, not just keywords. That’s why it finds the right docs even when you use different terminology.
Smart chunking strategy (not just splitting every N characters):
Respect document structure:
Preserve headings and sections
Keep code blocks together
Don’t split tables or lists
Optimal chunk size: ~512 tokens
Large enough for context
Small enough for precision
Equivalent to 2-3 paragraphs
Add overlap:
50-token overlap between chunks
Prevents losing context at chunk boundaries
Example:
```
## Authentication

Our API uses OAuth 2.0 for authentication...
[500 tokens about OAuth]

─────── CHUNK BOUNDARY with 50-token overlap ───────

## Authorization

After authentication, users are assigned roles...
[500 tokens about authorization]
```
Result: When searching for “authentication”, you get the OAuth section—not the entire auth guide.
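A minimal version of this heading-aware chunking can be sketched as follows. It approximates tokens with whitespace-split words (a real indexer would use a proper tokenizer) and omits the code-block and table handling described above:

```javascript
// Heading-aware chunking with overlap (simplified sketch).
// "Tokens" are approximated by whitespace-split words.
function chunkByHeadings(markdown, maxTokens = 512, overlap = 50) {
  const chunks = [];
  // Split so each "## " heading stays attached to its own body
  for (const section of markdown.split(/\n(?=## )/)) {
    const words = section.split(/\s+/).filter(Boolean);
    for (let start = 0; start < words.length; start += maxTokens - overlap) {
      chunks.push(words.slice(start, start + maxTokens).join(" "));
      if (start + maxTokens >= words.length) break; // section fully covered
    }
  }
  return chunks;
}
```

Short sections become a single chunk; long sections are windowed with a 50-token overlap so no sentence is stranded at a boundary.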
✅ “redis.conf” (keyword match + somewhat semantic)
✅ “Caching setup with Redis” (both)
✅ “Session storage using cache” (semantic)
Result: Gets both exact matches AND related concepts
4. Ranking & Filtering
Results are ranked by:
Relevance score (hybrid score)
Document recency (newer docs ranked higher)
Repository priority (if you specified repos)
File type:
README.md (most important)
docs/*.md (documentation)
Other files (lower priority)
Filters applied:
Repository scope (if API key is scoped)
Branch (default: main)
File type (if specified)
Date range (if specified)
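One way to combine those ranking signals into a single score looks like this. The weights, decay curve, and file-type boosts are illustrative assumptions, not ULPI's actual ranking formula:

```javascript
// Combine hybrid relevance, recency, and file-type priority into one score.
// All weights and the decay curve are illustrative assumptions.
function rankScore(result, now = Date.now()) {
  const ageDays = (now - Date.parse(result.last_modified)) / 86_400_000;
  const recency = Math.exp(-ageDays / 365); // decays over roughly a year
  const fileBoost = result.file.endsWith("README.md") ? 1.2
                  : result.file.includes("docs/")     ? 1.1
                  : 1.0;
  return result.score * fileBoost * (0.8 + 0.2 * recency);
}
```

With this shape, a fresh README narrowly outranks a years-old miscellaneous file even when their raw relevance scores are identical.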
5. Context Assembly
Return top results with metadata:
```json
{
  "results": [
    {
      "content": "Deploy to production using GitHub Actions...",
      "file": "infrastructure/deployment.md",
      "repository": "backend-api",
      "branch": "main",
      "url": "https://github.com/org/backend-api/blob/main/infrastructure/deployment.md#L42-L67",
      "score": 0.94,
      "last_modified": "2025-11-10T14:23:00Z"
    }
  ],
  "total": 12,
  "tokens_used": 2340
}
```
AI receives this and synthesizes answer:
```
"Based on your deployment guide in infrastructure/deployment.md:

To deploy to production:
1. Run `npm run build`
2. Push to deploy branch: `git push origin main:deploy`
3. GitHub Actions automatically deploys to AWS

[Link to deployment.md:42-67]"
```
MCP (Model Context Protocol) is a standard for connecting AI tools to external data sources. ULPI provides an MCP server that bridges AI assistants to the ULPI API:
```
┌─────────────────┐
│    Your AI      │  (Claude Desktop, Cursor, etc.)
│   Assistant     │
└────────┬────────┘
         │
         │ MCP Protocol
         │
┌────────▼────────┐
│    ULPI MCP     │  (@ulpi/mcp-server-documentation)
│     Server      │  - Runs locally in your IDE
│                 │  - Authenticates with API key
└────────┬────────┘
         │
         │ HTTPS
         │
┌────────▼────────┐
│    ULPI API     │  (api.ulpi.io)
│    - Search     │  - Hosted in AWS
│    - Indexing   │  - Handles all teams
└─────────────────┘
```
```
User: "How do we handle authentication in the API?"

Claude (thinking): I need to search the user's documentation for
authentication info. I'll use the search_documentation tool.

Claude → MCP Server:
  search_documentation({ query: "API authentication implementation", limit: 5 })

MCP Server → ULPI API:
  POST /api/v1/documentation/search
  { "query": "API authentication implementation", "limit": 5 }

ULPI API → MCP Server:
  {
    "results": [
      {
        "content": "Our API uses Laravel Sanctum for authentication...",
        "file": "backend-api/docs/auth.md",
        "url": "github.com/org/backend/blob/main/docs/auth.md#L15"
      }
    ]
  }

MCP Server → Claude: [Returns search results]

Claude → User:
  "Based on your documentation in backend-api/docs/auth.md, your API uses
  Laravel Sanctum for authentication. Here's how it works:
  1. Client sends email + password to /api/login
  2. Server validates credentials
  3. Sanctum generates API token
  4. Client includes token in Authorization header
  [Link to auth.md:15-42]"
```
Seamless. User doesn’t see any of this—just accurate answers from their docs.
Location: User’s browser or IDE
Duration: 5 minutes
Purpose: Instant results for repeated queries in the same session
How it works:
```javascript
// MCP server caches responses locally
const cache = new Map();

async function search(query) {
  const cacheKey = `search:${query}`;

  // Check cache first
  if (cache.has(cacheKey)) {
    const cached = cache.get(cacheKey);
    if (Date.now() - cached.timestamp < 5 * 60 * 1000) {
      return cached.results; // 5ms response
    }
  }

  // Cache miss - call API
  const results = await callULPIAPI(query);
  cache.set(cacheKey, { results, timestamp: Date.now() });
  return results; // 200ms response
}
```
Benefit:
First query: 200ms (API call)
Repeat query: 5ms (local cache)
Invalidation: Automatic after 5 minutes
Location: Node.js process running the MCP server
Duration: 1 hour
Purpose: Persist cache across AI chat restarts
Storage: In-memory (cleared when the IDE restarts)
Benefit:
Reduces API calls
Faster responses throughout work session
Survives Claude Desktop window close/reopen (process keeps running)
Invalidation:
After 1 hour
When you restart IDE completely
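Layers 1 and 2 differ only in where the cache lives and how long entries survive, so both can be expressed with one small TTL cache. A minimal sketch (the class name and shape are mine, not from the ULPI codebase):

```javascript
// Generic TTL cache generalizing both client-side layers:
// new TtlCache(5 * 60 * 1000) for layer 1, new TtlCache(60 * 60 * 1000) for layer 2.
class TtlCache {
  constructor(ttlMs) {
    this.ttlMs = ttlMs;
    this.map = new Map();
  }
  get(key) {
    const entry = this.map.get(key);
    if (!entry) return undefined;
    if (Date.now() - entry.at > this.ttlMs) {
      this.map.delete(key); // expired: drop and report a miss
      return undefined;
    }
    return entry.value;
  }
  set(key, value) {
    this.map.set(key, { value, at: Date.now() });
  }
}
```

Expired entries are evicted lazily on read, which keeps the implementation tiny at the cost of holding stale entries in memory until they are next requested.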
Location: ULPI API servers (Redis)
Duration: 5 minutes
Purpose: Fast responses for popular queries across all users
How it works:
```php
// Laravel API endpoint
public function search(Request $request)
{
    // Note: $request->query is Symfony's query ParameterBag, not a string,
    // so read the inputs explicitly before building the cache key.
    $query = $request->input('query');
    $cacheKey = "search:{$query}:{$request->input('repository')}";

    return Cache::remember($cacheKey, 300, function () use ($query) {
        // Perform a vector search in Typesense
        return $this->typesenseSearch($query);
    });
}
```