Master Documentation Search: Beyond Keywords

Tired of searching “authentication” and getting 47 irrelevant results? ULPI’s semantic search understands what you mean, not just what you typed. This guide teaches you:
  • 🎯 How to write queries that return exactly what you need
  • 🔍 Advanced filters (repository, branch, file type, date)
  • ⚡ Pro tips to search 10x faster
  • 🚀 Real-world search scenarios with examples
New to ULPI? Start with Getting Started to set up search first.
Want technical details? See How It Works for the architecture.

The Problem: Why Most Searches Fail

You’ve experienced this frustration:
Generic keywords (fails):
You search: “database”
You get:
  • 127 results across 23 files
  • Database setup guide
  • Database migration docs
  • Database troubleshooting
  • Database performance tuning
  • 123 other “database” mentions
Problem: Too broad. Which one do you actually need?
Time wasted: 10 minutes scanning results
The difference? Specificity + natural language.

Natural Language Queries: Ask Like a Human

ULPI understands questions, not just keywords.

The Formula: Be Specific + Ask Naturally

How-To Questions

Ask for step-by-step guidance:
"How do I deploy to production?"
"How do we handle rate limiting?"
"How can I optimize slow queries?"
"How to rollback a deployment?"
Returns: Guides, tutorials, step-by-step docs

What-Is Questions

Ask for definitions or explanations:
"What is our API rate limit?"
"What environment variables are required?"
"What testing framework do we use?"
"What's the deployment approval process?"
Returns: Configuration, policies, explanations

Show-Me Requests

Ask for examples or code:
"Show me unit test examples"
"Show me Docker configuration"
"Show me the deployment script"
"Show me error handling patterns"
Returns: Code examples, templates, patterns

Where-Is/Find Requests

Ask for locations or specific docs:
"Where are logs stored?"
"Find the setup guide"
"Find webhook documentation"
"Where is the Redis config?"
Returns: Specific files, locations, links
Why it works: Semantic search analyzes the entire question to understand your intent.
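To make the four question types above concrete, here is a minimal sketch that guesses the kind of answer a query is asking for from its opening words. It is only an illustration: ULPI's actual intent detection is semantic and is not described in this guide, and the function name `guess_intent` and the intent labels are assumptions made up for this example.

```python
# Hypothetical illustration only: real semantic search does not rely on prefixes.
INTENT_BY_PREFIX = [
    (("how",), "guide"),              # "How do I...", "How to..."
    (("what",), "explanation"),       # "What is...", "What's..."
    (("show me",), "example"),        # "Show me..."
    (("where", "find"), "location"),  # "Where is...", "Find..."
]


def guess_intent(query: str) -> str:
    """Return a rough intent label based on how the query starts."""
    normalized = query.strip().strip('"').lower()
    for prefixes, intent in INTENT_BY_PREFIX:
        # str.startswith accepts a tuple of candidate prefixes.
        if normalized.startswith(prefixes):
            return intent
    return "keyword"  # a bare keyword carries no explicit intent
```

A bare keyword such as `database` falls through to `"keyword"`, which is the case the rest of this guide recommends avoiding.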

Real-World Search Scenarios

How to search in common situations:
Situation: Sarah just joined the team. She needs to set up her local environment.
Wrong approach:
Searches: “setup”
Gets: 89 results
  • Docker setup
  • AWS setup
  • Environment setup
  • CI/CD setup
  • Database setup
  • IDE setup
Time: 20 minutes reading 6 different docs
Key insight: Include “local” + “first time” + “development environment” = precise results
Situation: 3 AM. Production throwing 500 errors. Need troubleshooting steps FAST.
Panic search (fails):
Searches: “500 error”
Gets: 67 results about error handling in general
Time wasted: 5 minutes during critical outage
Pro tip: Bookmark common runbooks, but ULPI search is faster than scrolling bookmarks.
Situation: Need to add OAuth authentication. Want to follow existing patterns.
Pattern: Start specific → Get exact docs → AI codes using your standards
Situation: Inherited old codebase. Need to understand why things were built this way.
Code search (wrong tool):
Searches code: “Redis session storage”
Finds: Implementation code
Missing: WHY we use Redis, not just HOW
Search tip: Look for Architecture Decision Records (ADRs) with “why” questions
Situation: Third-party integrating with your API. Needs clear examples.
Search progression:
  1. Start Broad
    • Search: “API authentication documentation”
    • Gets: Overview of auth methods
  2. Get Specific
    • Search: “How to authenticate API requests with API key?”
    • Gets: Step-by-step auth guide
  3. Find Examples
    • Search: “Show me cURL examples for authenticated API requests”
    • Gets: Working code examples
  4. Handle Errors
    • Search: “What error codes does the API return for failed authentication?”
    • Gets: Error reference table
Pattern: Broad → Specific → Examples → Edge cases
Situation: App is slow. Need to find performance best practices.
Wrong searches:
  • ❌ “performance” (too broad)
  • ❌ “slow” (no context)
  • ❌ “optimize” (optimize what?)
Right searches:
  • ✅ “How to optimize slow database queries?”
  • ✅ “What caching strategies do we use?”
  • ✅ “Show me Redis caching configuration for sessions”
  • ✅ “What are our API response time targets?”
Result: Specific, actionable documentation

Advanced Filters

Narrow your search when you know where to look.

Repository Filter

Search specific repos when you know the context:
  • Single Repository
  • Multiple Repositories
  • All Repositories (Default)
Syntax: Use repository: filter
"authentication" repository:backend-api
Use case: Only need backend auth docs, not frontend
Result: Only searches backend-api repo
Use repository filter when:
  • ✅ You know exactly which repo has the docs
  • ✅ You want to exclude noise from other repos
  • ✅ Searching microservices (each service = separate repo)
Skip repository filter when:
  • ❌ You’re not sure where docs live
  • ❌ Documentation might span multiple repos
  • ❌ You want comprehensive results

Branch Filter

Search specific branches for work-in-progress docs:
  • Main Branch (Default)
  • Development Branch
  • Feature Branch
No filter needed:
"setup guide"
Searches: main or master branch (production docs)
Use 99% of the time
Pro tip: Use branch filters when reviewing PRs to see documentation changes before merging.

File Type Filter

Limit to specific file types:
  • README Files Only
  • Docs Directory
  • Markdown Files
Syntax: filetype:README
"installation" filetype:README
Use case: Quick setup instructions (READMEs are usually higher quality)
Result: Only searches README.md files

Date Filter

Find recent or historical documentation:
  • Recently Updated
  • Older Docs
  • Specific Date Range
Syntax: updated:7d
"deployment process" updated:7d
Use case: Find docs changed in last week
Result: Only docs modified in last 7 days
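If you compose searches from a script, the filters above can be appended mechanically. The sketch below is a hypothetical helper (the name `build_query` and its parameters are assumptions, not part of ULPI); it only concatenates the `repository:`, `branch:`, `filetype:`, and `updated:` tokens in the form shown in this guide, with several repositories joined by commas as in the cheat sheet at the end.

```python
def build_query(text, *, exact=False, repository=None, branch=None,
                filetype=None, updated=None):
    """Compose a search string from free text plus optional filters.

    Hypothetical helper: it formats filter tokens as written in this guide
    and does not contact any search service.
    """
    # Quotes request an exact phrase; unquoted text is searched semantically.
    parts = [f'"{text}"' if exact else text]
    if repository:
        # Accept a single name or an iterable of names (joined with commas).
        repos = [repository] if isinstance(repository, str) else list(repository)
        parts.append("repository:" + ",".join(repos))
    for name, value in (("branch", branch), ("filetype", filetype),
                        ("updated", updated)):
        if value:
            parts.append(f"{name}:{value}")
    return " ".join(parts)


print(build_query("deployment process", exact=True, updated="7d"))
```

Keeping every filter optional mirrors the advice above: start unfiltered, then add a filter only when you know where the docs live.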

Understanding the difference between semantic search and exact matching improves your searches.

How Semantic Search Works

  • Example 1: Synonyms
  • Example 3: Technical Jargon
You search: “How do I send emails?”
Semantic search understands these mean the same:
  • “send emails”
  • “email delivery”
  • “SMTP configuration”
  • “mail service”
  • “notification emails”
Finds docs about:
  • Email configuration
  • SMTP setup
  • Mail service integration
  • Notification delivery
Even though your exact phrase isn’t in the docs!

When to Use Exact Match

Force exact phrase matching with quotes:
  • Exact Phrase
  • Semantic (Default)
With quotes:
"database migration"
Finds: Only docs with exact phrase “database migration”
Use when: You know the exact technical term
Rule of thumb: Start without quotes (semantic). Add quotes if too many results.
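As a rough illustration of the quoting rule, the sketch below splits a raw search string into its text, its mode (exact or semantic), and any filters. It assumes that surrounding double quotes mark an exact-phrase query and that filters use the `key:value` tokens listed in this guide; `parse_query` is a hypothetical name, and ULPI's real parser may behave differently.

```python
import re

# Filter keys as they appear in this guide's cheat sheet.
FILTER_KEYS = ("repository", "branch", "path", "filetype", "updated")
FILTER_RE = re.compile(r"\b(" + "|".join(FILTER_KEYS) + r"):(\S+)")


def parse_query(raw: str) -> dict:
    """Split a raw search string into text, mode, and filters (sketch only)."""
    filters = dict(FILTER_RE.findall(raw))
    # Remove the filter tokens, then collapse leftover whitespace.
    text = " ".join(FILTER_RE.sub(" ", raw).split())
    exact = len(text) >= 2 and text[0] == '"' and text[-1] == '"'
    return {
        "text": text[1:-1] if exact else text,
        "mode": "exact" if exact else "semantic",
        "filters": filters,
    }
```

For example, `parse_query('"database migration" repository:backend-api')` reports the `exact` mode with one repository filter, while the same words without quotes are treated as a semantic query.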

Search Optimization Tips

Get better results faster:
Bad (generic):
  • ❌ “database”
  • ❌ “error”
  • ❌ “API”
  • ❌ “setup”
Good (specific):
  • ✅ “How to optimize slow PostgreSQL queries?”
  • ✅ “How to handle 500 errors in production?”
  • ✅ “What rate limits does our REST API have?”
  • ✅ “How to set up local development environment on macOS?”
Why it works: Specific queries match specific docs
Why questions work better:
Questions include context and intent:
  • “How do I…” → Looking for step-by-step guide
  • “What is…” → Looking for definition or explanation
  • “Where is…” → Looking for location or reference
  • “Why do we…” → Looking for rationale or decision
Examples:
  • ✅ “How do I deploy to staging environment?”
  • ✅ “What is our database backup schedule?”
  • ✅ “Where are application logs stored in production?”
  • ✅ “Why do we use Redis instead of Memcached?”
vs. Keywords:
  • ❌ “deploy staging”
  • ❌ “backup schedule”
  • ❌ “logs”
  • ❌ “Redis”
Improvement: 3-5x more relevant results
Add environment, technology, or use case:
Without context:
  • “authentication” → 100 results
  • “cache” → 50 results
  • “deployment” → 75 results
With context:
  • “API authentication with JWT tokens” → 5 results
  • “Redis cache configuration for sessions” → 3 results
  • “Deployment to AWS production environment” → 4 results
Context types:
  • Environment: local, staging, production
  • Technology: Redis, PostgreSQL, Docker, Kubernetes
  • Use case: authentication, caching, logging, monitoring
  • Platform: AWS, GCP, Azure, Heroku
If the first search doesn’t work, rephrase:
Deployment example:
  1. “How do I deploy to production?”
  2. “What’s the production deployment process?”
  3. “How to push code to prod?”
  4. “Deployment steps for live environment”
Authentication example:
  1. “How do we authenticate users?”
  2. “What’s our login process?”
  3. “User verification flow”
  4. “How to sign in users?”
You will usually find the docs by attempt 2 or 3.
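The rephrasing strategy can be sketched as a simple loop: try each phrasing in order and stop at the first one that returns anything. The `search` argument below is a stand-in for whatever function you use to run a query, and `fake_search` with its `docs/deploy.md` result is invented purely so the example runs; neither is a ULPI API.

```python
from typing import Callable, Iterable, Optional


def first_hit(phrasings: Iterable[str],
              search: Callable[[str], list]) -> Optional[tuple]:
    """Return (phrasing, results) for the first phrasing with results, else None."""
    for phrasing in phrasings:
        results = search(phrasing)
        if results:
            return phrasing, results
    return None


def fake_search(query: str) -> list:
    """Stand-in search: only one phrasing matches an (invented) document."""
    known = {"What's the production deployment process?": ["docs/deploy.md"]}
    return known.get(query, [])


attempts = [
    "How do I deploy to production?",
    "What's the production deployment process?",
    "How to push code to prod?",
]
print(first_hit(attempts, fake_search))
```

Here the second phrasing succeeds, which matches the observation that a rephrase or two is usually enough.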
Search progression strategy:
  1. Start Broad
    • Search: “API documentation”
    • Get: Overview of all API docs
  2. Add Specificity
    • Search: “API authentication documentation”
    • Get: Auth-specific API docs
  3. Get Very Specific
    • Search: “How to authenticate API requests with Bearer tokens?”
    • Get: Exact guide you need
Why this works: First search orients you, subsequent searches zoom in
Technology changes fast. Prioritize recent docs:
Examples:
  • “Kubernetes deployment” updated:30d → Latest K8s practices
  • “React patterns” updated:90d → Modern React (not class components)
  • “API changes” updated:7d → Recent breaking changes
Use case: Avoid outdated practices
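To see what a relative value such as `updated:30d` means in practice, the sketch below turns it into a cutoff timestamp and checks a document's last-modified time against it. This is an assumption about the semantics (N days back from now, inclusive), and `cutoff_for` and `is_recent` are hypothetical names; only the plain `Nd` form is handled, not the range or `<` forms from the cheat sheet.

```python
import re
from datetime import datetime, timedelta, timezone


def cutoff_for(updated: str, now: datetime) -> datetime:
    """Translate a value like '7d' into the earliest allowed timestamp."""
    match = re.fullmatch(r"(\d+)d", updated)
    if not match:
        raise ValueError(f"unsupported updated value: {updated!r}")
    return now - timedelta(days=int(match.group(1)))


def is_recent(last_updated: datetime, updated: str, now: datetime) -> bool:
    """True if the document was modified on or after the cutoff."""
    return last_updated >= cutoff_for(updated, now)


now = datetime(2025, 1, 31, tzinfo=timezone.utc)
print(cutoff_for("30d", now))  # 30 days before the chosen 'now'
```

Passing `now` explicitly keeps the helper deterministic, which makes it easy to test.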

Understanding Search Results

What you get when ULPI returns results:

Result Format

  • Content Preview
  • Metadata
  • Relevance Score
Relevant excerpt from documentation:
## Production Deployment

To deploy to production:

1. Run tests: `npm test`
2. Build assets: `npm run build`
3. Push to deploy branch: `git push origin main:deploy`
4. Monitor deployment: https://dashboard.example.com

Deployment takes ~5 minutes. Check #deployments Slack channel for status.
Highlights:
  • Most relevant section (not entire file)
  • Preserved formatting
  • Code examples intact
  • Links included

Result Ranking

How results are ordered:
  1. Relevance score (semantic similarity)
  2. Keyword matches (exact words in query)
  3. Document recency (newer docs ranked higher)
  4. File importance (README > docs/ > other)
  5. Repository priority (if you specified repos)
Pro tip: First result is usually what you need. If not, check result #2-3 before re-searching.
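One way to picture the ordering above is as a sort on several keys, where each later criterion only breaks ties left by the earlier ones. That strict tie-breaking is an assumption made for this sketch (the real ranking may blend the signals into one score), and the `Result` fields and sample paths are invented for illustration.

```python
from dataclasses import dataclass
from datetime import date

# README > docs/ > other, as listed above.
FILE_IMPORTANCE = {"readme": 2, "docs": 1, "other": 0}


@dataclass
class Result:
    path: str
    relevance: float          # semantic similarity score
    keyword_matches: int      # exact query words found
    updated: date             # last modification date
    kind: str                 # "readme", "docs", or "other"
    in_requested_repo: bool   # True if it matches a repository filter


def rank(results):
    """Sort best-first; later keys only break ties in earlier ones (assumed)."""
    return sorted(
        results,
        key=lambda r: (r.relevance, r.keyword_matches, r.updated,
                       FILE_IMPORTANCE[r.kind], r.in_requested_repo),
        reverse=True,
    )


sample = [
    Result("notes.md", 0.5, 5, date(2025, 1, 20), "other", True),
    Result("README.md", 0.9, 2, date(2025, 1, 10), "readme", False),
    Result("docs/deploy.md", 0.9, 3, date(2024, 6, 1), "docs", False),
]
print([r.path for r in rank(sample)])
```

In the sample, the two results with equal relevance are separated by keyword matches, and the low-relevance result comes last despite having the most keyword hits.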

Common Search Patterns

Copy these proven query patterns:

Development Workflows

"How do I set up my local development environment?"
"What dependencies are required for local dev?"
"How to run the application locally?"

API & Integration

"How to authenticate API requests?"
"What OAuth providers do we support?"
"How to refresh expired tokens?"

Infrastructure & Deployment

"How to deploy to production?"
"What's the rollback procedure?"
"How to deploy to staging first?"

Troubleshooting Search Issues

When a search returns no results, possible causes:
  1. Documentation doesn’t exist
    • Verify docs exist in your repositories
    • Check if topic is actually documented
  2. Too specific query
    • Try removing details: “OAuth JWT authentication” → “authentication”
    • Use broader terms first
  3. Wrong terminology
    • Your docs say “login”, you search “authentication”
    • Try synonyms: “sign in”, “user verification”
  4. Repository scope too narrow
    • Remove repository: filter
    • Search all repos, not just one
Fixes:
  • Simplify query
  • Try synonyms
  • Remove filters
  • Ask teammate “what do we call this?”
When a search returns too many results, solutions:
  1. Be more specific:
    • ❌ “database” → ✅ “PostgreSQL query optimization”
  2. Use filters:
    • Add repository:backend-api
    • Add path:docs/
    • Add updated:30d
  3. Use exact phrases:
    • Add quotes: "database migration"
  4. Ask a question:
    • ❌ “deployment” → ✅ “How do I deploy to AWS production?”
Pattern: Generic query → Too many results → Add specificity → Perfect results
When results are irrelevant, why this happens:
  1. Ambiguous terminology
    • “Cache” could mean Redis, browser cache, CDN cache
  2. Multiple contexts
    • “Deploy” could mean backend, frontend, infrastructure
Solutions:
  1. Add context:
    • ❌ “cache setup” → ✅ “Redis cache setup for API sessions”
  2. Specify environment:
    • ❌ “deployment” → ✅ “deployment to AWS production”
  3. Include technology:
    • ❌ “authentication” → ✅ “JWT authentication for REST API”
When results are outdated, solutions:
  1. Filter by recency:
    "deployment guide" updated:30d
    
  2. Check result metadata:
    • Look at “Last Updated” date
    • If >6 months old, might be outdated
  3. Search specific branch:
    "API docs" branch:main
    
    (Ensures you get production docs, not old feature branches)
Prevention: Keep docs updated, archive old files
When docs from a specific repository don’t appear, check:
  1. Repository connected?
    • Dashboard → Repositories → Verify repo is listed
  2. API key has access?
    • Dashboard → API Keys → Check scope includes that repo
  3. Indexing complete?
    • Dashboard → Repositories → Status should be “Indexed”
  4. OAuth permissions?
    • GitHub → Settings → Applications → ULPI → Verify repo access
Fix: Re-connect repository or regenerate API key with correct scope

Search Performance Metrics

What to expect:

Average Response Time

45ms
Simple queries return in under 50 milliseconds
Compare: GitHub search takes 200-500ms

Result Accuracy

92%
Top 3 results match user intent
Measured: User clicks first result 92% of the time

Index Coverage

100%
All documentation files searchable within 60 seconds of push
Real-time: Webhook-based auto-sync

Performance by Query Type

Query Type | Avg Latency | Accuracy | Token Usage
Simple (1-2 words) | 30ms | 85% | 500 tokens
Question (5-10 words) | 45ms | 95% | 1,200 tokens
Complex (15+ words) | 120ms | 92% | 2,500 tokens
With filters | 35ms | 97% | 800 tokens
Insight: Questions and filters improve accuracy (95% and 97%, versus 85% for simple queries) at a cost of only a few milliseconds of latency.

Limitations & Constraints

Current limitations:
  1. Query length: Max 500 characters
    • Most queries are under 100 characters
    • If hitting limit, query is probably too complex
  2. Results per query: Max 20 results
    • Configurable in API
    • UI shows top 10 by default
    • If you need >20 results, query is too broad
  3. Language support: Optimized for English
    • Other languages work but ~10-15% lower accuracy
    • Non-Latin scripts (Chinese, Arabic) supported but experimental
  4. Code search: Documentation only, not code
    • Code comments can be indexed (opt-in)
    • Use GitHub/grep for code search
  5. Binary files: Not indexed
    • PDFs, Word docs, images not searchable
    • Convert to Markdown for indexing
  6. Real-time sync: 30-60 second delay
    • Webhook processing takes 30-60 seconds
    • Not instant (like grep on local files)
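If you send queries from your own tooling, it can help to check the first two limits before making a request. The sketch below uses the numbers stated above (500 characters, 20 results, 10 shown by default); the function name `check_request` and the choice to clamp the result limit rather than reject it are assumptions for illustration, not documented ULPI behavior.

```python
MAX_QUERY_CHARS = 500   # query length limit stated above
MAX_RESULTS = 20        # maximum results per query stated above
DEFAULT_RESULTS = 10    # what the UI shows by default


def check_request(query: str, limit: int = DEFAULT_RESULTS):
    """Validate a query and clamp the requested result count (sketch only)."""
    query = query.strip()
    if not query:
        raise ValueError("query is empty")
    if len(query) > MAX_QUERY_CHARS:
        raise ValueError(
            f"query is {len(query)} characters; the limit is {MAX_QUERY_CHARS}"
        )
    # Clamp into the range 1..MAX_RESULTS instead of failing.
    return query, max(1, min(limit, MAX_RESULTS))


print(check_request("  setup guide  ", limit=50))
```

A request for 50 results is reduced to 20 here; per the list above, needing more than that usually means the query is too broad.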


Quick Reference: Search Cheat Sheet

Bookmark this for quick access:
# Natural language (recommended)
"How do I deploy to production?"
"What environment variables are required?"
"Show me unit test examples"

# With filters
"authentication" repository:backend-api
"setup guide" branch:develop
"API docs" updated:30d

# Exact phrases
"database migration"  # Exact match
database migration    # Semantic search

# Combined
"How to configure Redis cache?" repository:backend-api updated:30d

# Repository
repository:backend-api
repository:api,frontend,infrastructure

# Branch
branch:main
branch:develop
branch:feature/new-auth

# Path
path:docs/
path:README.md

# File type
filetype:md
filetype:README

# Date
updated:7d         # Last 7 days
updated:30d        # Last 30 days
updated:<90d       # Older than 90 days
updated:2025-01-01..2025-01-31  # Date range

# Setup & Installation
"How to set up local development?"
"What dependencies are required?"
"Installation guide for {technology}"

# API Documentation
"API authentication methods"
"List all API endpoints"
"API error codes and formats"

# Deployment
"How to deploy to {environment}?"
"Deployment rollback procedure"
"Environment variables for production"

# Troubleshooting
"How to debug {issue}?"
"Where are logs stored?"
"Common errors and solutions"

# Configuration
"How to configure {technology}?"
"Configuration file location"
"Default settings and overrides"

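The `{technology}`, `{environment}`, and `{issue}` placeholders in the patterns above can be filled in programmatically if you keep the patterns in a script or snippet tool. This is a minimal sketch using Python's built-in `str.format`; the `TEMPLATES` keys and the `fill` helper are hypothetical names chosen for the example.

```python
# A few of the patterns from the cheat sheet, keyed by invented short names.
TEMPLATES = {
    "install": "Installation guide for {technology}",
    "deploy": "How to deploy to {environment}?",
    "debug": "How to debug {issue}?",
    "configure": "How to configure {technology}?",
}


def fill(name: str, **values: str) -> str:
    """Fill one template; a missing placeholder value raises KeyError."""
    return TEMPLATES[name].format(**values)


print(fill("deploy", environment="staging"))
print(fill("configure", technology="Redis"))
```
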
Need help with search?
Average response time: Under 2 hours during business hours
Pro tip: Share successful search patterns with your team in your internal wiki!