Common Development Tasks

Adding a New Facet Field

•In DocumentIndexer.java, register the field in FacetsConfig
•Add the field to the Lucene document in the indexing method
•Update the field schema table in README.md
•Note: This is a breaking change. Documents already in the index lack the new facet, so a full reindex of all documents is required

Adding File Format Support

•Add the file extension pattern to application.yaml under include-patterns
•Apache Tika auto-detects and parses most formats, so usually no code change is needed
•For custom parsing beyond Tika: extend FileContentExtractor.java
•Test with real documents of the new format
•Update README.md supported formats section
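
Before reindexing, it can help to confirm that a new pattern actually matches the files you expect. The sketch below assumes the entries under include-patterns are glob patterns matched against the file name; that is an assumption about the configuration format, not something confirmed here, and IncludePatternCheck is a hypothetical helper rather than a class in the project.

```java
import java.nio.file.FileSystems;
import java.nio.file.Path;
import java.nio.file.PathMatcher;
import java.util.List;

// Hypothetical helper: checks a file name against glob-style include patterns.
final class IncludePatternCheck {

    // Returns true if the file name (without directories) matches any pattern.
    static boolean isIncluded(List<String> patterns, String file) {
        Path name = Path.of(file).getFileName();
        for (String pattern : patterns) {
            PathMatcher matcher = FileSystems.getDefault().getPathMatcher("glob:" + pattern);
            if (matcher.matches(name)) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Assumed pattern list; the real values live in application.yaml.
        List<String> patterns = List.of("*.pdf", "*.epub");
        System.out.println(isIncluded(patterns, "docs/book.epub"));
        System.out.println(isIncluded(patterns, "notes.txt"));
    }
}
```

Glob matching on the default file system follows the platform's case rules, so a pattern such as *.epub may or may not match BOOK.EPUB depending on the operating system.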

Debugging STDIO Issues

The MCP server uses STDIO transport, so stdout carries the JSON-RPC messages. Any other output written to stdout corrupts the protocol stream.

Checklist:

•Must use deployed profile: java -Dspring.profiles.active=deployed -jar target/luceneserver-*.jar
•The deployed profile disables all console logging appenders
•Search the codebase for System.out.println -- there must be NONE
•Search for System.err.println too -- it bypasses the file logger and is also problematic
•Logger output goes to file only in deployed mode
•Test command (same as the launch command above): java -Dspring.profiles.active=deployed -jar target/luceneserver-*.jar -- stdout should then contain only JSON-RPC messages
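
The two search steps in the checklist can be automated with a small scan. The following is a minimal sketch, not part of the project: ConsoleWriteScan is a hypothetical helper that reports the 1-based line numbers where source text writes directly to System.out or System.err. It works on lines already read into memory, so it could be fed from Files.readAllLines for each source file.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: finds direct console writes in Java source lines.
final class ConsoleWriteScan {

    // Matches System.out/System.err followed by print, println, or printf.
    private static final Pattern CONSOLE_WRITE =
            Pattern.compile("System\\.(out|err)\\.print(ln|f)?\\s*\\(");

    // Returns the 1-based numbers of lines that contain a console write.
    static List<Integer> offendingLines(List<String> sourceLines) {
        List<Integer> hits = new ArrayList<>();
        for (int i = 0; i < sourceLines.size(); i++) {
            if (CONSOLE_WRITE.matcher(sourceLines.get(i)).find()) {
                hits.add(i + 1);
            }
        }
        return hits;
    }

    public static void main(String[] args) {
        List<String> sample = List.of(
                "int count = 1;",
                "System.out.println(count);",
                "log.info(\"count={}\", count);",
                "System.err.printf(\"%d\", count);");
        System.out.println(offendingLines(sample));
    }
}
```

A plain text match like this also flags occurrences inside comments and string literals, which is usually acceptable for a checklist item that expects zero hits.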

Performance Tuning

Indexing is slow

•Increase thread-pool-size in config (more parallel file walkers)
•Increase batch-size (fewer Lucene commits)
•Increase batch-timeout-ms (larger batches)
•Check if max-content-length is causing excessive content truncation
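
To see why batch-size and batch-timeout-ms affect the number of commits, the sketch below models a buffer that commits when it is full or when its oldest entry has waited longer than the timeout. This is an assumed reading of the two settings, and BatchSketch is a hypothetical class, not the project's indexing code.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model of size- and time-bounded batching.
final class BatchSketch {
    private final int batchSize;        // assumed meaning of batch-size
    private final long batchTimeoutMs;  // assumed meaning of batch-timeout-ms
    private final List<String> pending = new ArrayList<>();
    private long oldestPendingMs;
    private int commits;

    BatchSketch(int batchSize, long batchTimeoutMs) {
        this.batchSize = batchSize;
        this.batchTimeoutMs = batchTimeoutMs;
    }

    // Buffers one document; commits when the batch is full or has timed out.
    void add(String doc, long nowMs) {
        if (pending.isEmpty()) {
            oldestPendingMs = nowMs;
        }
        pending.add(doc);
        boolean full = pending.size() >= batchSize;
        boolean timedOut = nowMs - oldestPendingMs >= batchTimeoutMs;
        if (full || timedOut) {
            commit();
        }
    }

    // Stands in for writing the batch and committing the index.
    void commit() {
        if (!pending.isEmpty()) {
            pending.clear();
            commits++;
        }
    }

    // Counts commits for documents arriving one millisecond apart.
    static int commitsFor(int docCount, int batchSize, long batchTimeoutMs) {
        BatchSketch batch = new BatchSketch(batchSize, batchTimeoutMs);
        for (int i = 0; i < docCount; i++) {
            batch.add("doc-" + i, i);
        }
        batch.commit(); // flush whatever is left at the end
        return batch.commits;
    }

    public static void main(String[] args) {
        System.out.println(commitsFor(1000, 50, 5000));  // small batches
        System.out.println(commitsFor(1000, 250, 5000)); // larger batches
        System.out.println(commitsFor(1000, 250, 100));  // timeout cuts batches short
    }
}
```

In this model a short timeout keeps batches from ever reaching the configured size, which is why raising batch-size alone may not reduce the commit count.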

Search is slow

•Check for leading wildcard queries (*term) -- these are expensive even with the content_reversed optimization
•Reduce pageSize in search requests
•Check total index size -- very large indices may benefit from optimizeIndex()
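
As background on the content_reversed optimization: a common technique is to index each token reversed in a second field, so that a leading wildcard on the original field becomes a trailing wildcard (a prefix match) on the reversed one. The sketch below shows only that rewrite step. Whether the project rewrites queries exactly this way is an assumption, and ReversedWildcard is a hypothetical helper.

```java
// Hypothetical helper: rewrites a leading-wildcard term for a reversed-token field.
final class ReversedWildcard {

    // Turns "*term" into the reversed term followed by a trailing wildcard.
    static String toReversedPrefix(String leadingWildcard) {
        if (leadingWildcard.length() < 2 || !leadingWildcard.startsWith("*")) {
            throw new IllegalArgumentException("expected a term of the form *suffix");
        }
        String suffix = leadingWildcard.substring(1);
        return new StringBuilder(suffix).reverse().append('*').toString();
    }

    public static void main(String[] args) {
        // "*ation" on the normal field corresponds to "noita*" on the reversed field.
        System.out.println(toReversedPrefix("*ation"));
    }
}
```

A prefix match still has to enumerate every term that shares the prefix, so short suffixes such as *s remain costly.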

Out of Memory (OOM)

•Set max-content-length to limit per-document content size
•Increase JVM heap: -Xmx2g or higher
•Reduce thread-pool-size (each thread holds document content in memory)
•Check for very large files being indexed (e.g., multi-GB archives)
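
The interaction between thread-pool-size and max-content-length can be estimated with simple arithmetic. The sketch below assumes max-content-length is measured in characters and that each worker thread holds at most one truncated content string at a time; both are assumptions, and ContentBudget is a hypothetical helper. It counts only the content strings, so real usage will be higher.

```java
// Hypothetical helper: rough memory budget for in-flight document content.
final class ContentBudget {

    // Cuts content to at most maxContentLength characters.
    static String truncate(String content, int maxContentLength) {
        if (maxContentLength < 0) {
            throw new IllegalArgumentException("maxContentLength must not be negative");
        }
        return content.length() <= maxContentLength
                ? content
                : content.substring(0, maxContentLength);
    }

    // Upper bound in bytes, using 2 bytes per char (UTF-16 worst case).
    static long worstCaseBytes(int threadPoolSize, int maxContentLength) {
        return 2L * threadPoolSize * maxContentLength;
    }

    public static void main(String[] args) {
        // Example values only; the real ones come from the configuration.
        System.out.println(truncate("abcdef", 3));
        System.out.println(worstCaseBytes(8, 10_000_000));
    }
}
```

With 8 threads and a limit of 10,000,000 characters the bound is 160,000,000 bytes, so halving either setting halves the estimate.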

Adding a New Indexed Field

WARNING: Adding a new indexed field is a breaking change. Existing documents in the index will not have the new field until they are reindexed.

Steps:

•Add the field where DocumentIndexer.java creates the document
•If the field needs special analysis: update PerFieldAnalyzerWrapper in LuceneIndexService
•If the field should be searchable: update query parsing in LuceneIndexService
•If the field should be highlighted: ensure it is stored (Store.YES) and has term vectors
•Update the field schema table in README.md
•Document in README.md that a reindex is required
•Test using the patterns in SearchHighlightingIntegrationTest

Testing Patterns

•Integration tests use @TempDir for isolated Lucene indices
•Mock ApplicationConfig to supply configuration
•The indexDocument(Path) helper extracts content, creates the document, indexes it, commits, and refreshes
•Run all tests: mvn test (~88 tests, ~10 seconds)
•Run specific test: mvn test -Dtest=SearchHighlightingIntegrationTest