Search Engine

AP 53/80 points

A database engine built around inverted indexes for full-text search, relevance scoring, and faceted navigation. Optimized for finding needles in haystacks across unstructured or semi-structured text data.

Scale 8
Perf 8
Rely 5
Ops 5
Query 7
Schema 7
Eco 8
Learn 5
Σ Total 53/80

Character

The librarian with a photographic memory. It has indexed every word in every document and can find exactly what you're looking for before you finish typing. Brilliant at search and discovery, but don't trust it as your only copy of the data. It's a finding aid, not a vault.

When to Use

  • Full-text search across product catalogs or content libraries
  • Log aggregation and observability (ELK/OpenSearch stack)
  • E-commerce faceted navigation and typeahead suggestions
  • Security analytics and threat detection (SIEM)

Avoid When

  • Used as the sole primary data store without a separate source of truth
  • Strict transactional consistency is required for writes
  • Simple key-value lookups don't justify the cluster overhead
  • Budget constraints cannot support resource-intensive clusters

Dimension Analysis

Scalability 8/10

Elasticsearch and Solr distribute indexes across shards and nodes, and search throughput scales near-linearly as nodes are added. Adding nodes raises both query capacity and the total index size the cluster can hold.
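
As a rough illustration of how documents spread across shards, the sketch below routes each document by hashing its ID modulo the number of primary shards. This is a simplification: `route_to_shard` is a hypothetical name, the document IDs are made up, and `zlib.crc32` stands in for the Murmur3 hash that Elasticsearch itself uses.

```python
import zlib
from collections import Counter


def route_to_shard(doc_id: str, num_primary_shards: int) -> int:
    """Pick a shard for a document (simplified; crc32 stands in for Murmur3)."""
    return zlib.crc32(doc_id.encode("utf-8")) % num_primary_shards


# Distribute 10,000 hypothetical document IDs over 5 primary shards.
counts = Counter(route_to_shard(f"doc-{i}", 5) for i in range(10_000))
for shard in sorted(counts):
    print(shard, counts[shard])
```

Because the shard is derived from a modulo over the primary shard count, that count is normally fixed when the index is created. Extra nodes add capacity by hosting relocated shards and replicas rather than by changing the routing.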

Performance 8/10

Inverted indexes enable sub-second full-text search across billions of documents. Relevance scoring (BM25, or the older TF-IDF) runs directly against the index, and aggregations execute efficiently thanks to columnar doc values stored with each segment.
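
To make the inverted index and BM25 scoring concrete, here is a minimal in-memory sketch. `TinyIndex`, the whitespace tokenizer, and the sample documents are all illustrative assumptions. It uses the classic BM25 formula with the commonly cited defaults `k1 = 1.2` and `b = 0.75`, and is not how Lucene stores or scores segments internally.

```python
import math
from collections import Counter, defaultdict


def tokenize(text: str) -> list[str]:
    # Naive whitespace tokenizer; real engines use configurable analyzers.
    return text.lower().split()


class TinyIndex:
    """Toy inverted index with classic BM25 scoring."""

    def __init__(self, k1: float = 1.2, b: float = 0.75):
        self.k1, self.b = k1, b
        self.postings = defaultdict(dict)  # term -> {doc_id: term frequency}
        self.doc_len = {}                  # doc_id -> number of tokens

    def add(self, doc_id: str, text: str) -> None:
        tokens = tokenize(text)
        self.doc_len[doc_id] = len(tokens)
        for term, tf in Counter(tokens).items():
            self.postings[term][doc_id] = tf

    def search(self, query: str) -> list[tuple[str, float]]:
        if not self.doc_len:
            return []
        n_docs = len(self.doc_len)
        avg_len = sum(self.doc_len.values()) / n_docs
        scores = defaultdict(float)
        for term in tokenize(query):
            docs = self.postings.get(term, {})
            # Smoothed IDF: always positive, larger for rarer terms.
            idf = math.log(1 + (n_docs - len(docs) + 0.5) / (len(docs) + 0.5))
            for doc_id, tf in docs.items():
                # Length normalization: longer documents score lower per match.
                norm = self.k1 * (1 - self.b + self.b * self.doc_len[doc_id] / avg_len)
                scores[doc_id] += idf * tf * (self.k1 + 1) / (tf + norm)
        return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


index = TinyIndex()
index.add("a", "red running shoes")
index.add("b", "blue running jacket with red trim")
index.add("c", "leather office shoes")
results = index.search("red shoes")
print([doc_id for doc_id, _ in results])  # ['a', 'c', 'b']
```

Document "a" matches both query terms and ranks first. "c" outranks "b" because both match one equally rare term, but "c" is shorter. A lookup only touches the postings lists for the query terms, which is why query time does not grow with the total number of documents scanned.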

Reliability 5/10

Search engines are typically used as secondary indexes, not primary data stores. Replica shards provide redundancy, but reindexing from a primary source is the standard recovery pattern.

Operational Simplicity 5/10

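Cluster management (shard allocation, index lifecycle, JVM tuning) requires expertise. Elasticsearch clusters are notoriously resource-hungry and need careful capacity planning. Master-eligible nodes must also be set up so that a quorum can form: versions before 7.0 relied on a manually configured minimum_master_nodes setting, and a wrong value could lead to split-brain scenarios.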
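
Split-brain is generally avoided by requiring a majority of master-eligible nodes before a master can be elected. The sketch below models only that majority rule. The function names are hypothetical and this is not Elasticsearch's actual coordination code.

```python
def quorum(master_eligible_nodes: int) -> int:
    """Smallest strict majority of the master-eligible nodes."""
    return master_eligible_nodes // 2 + 1


def can_elect_master(reachable: int, master_eligible_nodes: int) -> bool:
    """A partition may elect a master only if it can see a majority."""
    return reachable >= quorum(master_eligible_nodes)


# A 3-node cluster split 2/1 by a network partition:
print(can_elect_master(2, 3))  # True
print(can_elect_master(1, 3))  # False
```

With only two master-eligible nodes the quorum is also two, so losing either node blocks elections. This is one reason three master-eligible nodes is a common minimum.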

Query Flexibility 7/10

The query DSL supports full-text search, fuzzy matching, geospatial queries, aggregations, and complex boolean logic. However, it lacks relational joins and strict transactional semantics.
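
As an example of how these pieces combine, the sketch below builds one request body containing a fuzzy full-text match, boolean filters, and a terms aggregation for facets. The field names (`title`, `category`, `price`, `brand`) and the helper `build_product_query` are assumptions for illustration, and `category` and `brand` are assumed to be keyword fields.

```python
import json


def build_product_query(text: str, category: str, max_price: float) -> dict:
    """Build an Elasticsearch-style search body as a plain dict."""
    return {
        "query": {
            "bool": {
                # Scored clause: fuzzy full-text match tolerates small typos.
                "must": [
                    {"match": {"title": {"query": text, "fuzziness": "AUTO"}}}
                ],
                # Unscored clauses: exact-value and range filters.
                "filter": [
                    {"term": {"category": category}},
                    {"range": {"price": {"lte": max_price}}},
                ],
            }
        },
        # Facet counts per brand over the matching documents.
        "aggs": {"by_brand": {"terms": {"field": "brand", "size": 10}}},
        "size": 20,
    }


body = build_product_query("runing shoes", "footwear", 100)
print(json.dumps(body, indent=2))
```

The resulting JSON would be sent as the body of a search request. Clauses under `filter` do not affect the relevance score, which also makes them cacheable.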

Schema Flexibility 7/10

Dynamic mapping automatically detects and indexes new fields. Schema evolution is straightforward for adding fields, but changing field types requires reindexing the entire dataset.
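
The behaviour can be approximated with a small type-inference sketch. It is a deliberately simplified model: `infer_type` and `apply_dynamic_mapping` are hypothetical names, and real dynamic mapping also handles arrays, date detection, keyword sub-fields, and value coercion.

```python
def infer_type(value) -> str:
    """Map a JSON value to a simplified field type name."""
    # bool must be checked before int: bool is a subclass of int in Python.
    if isinstance(value, bool):
        return "boolean"
    if isinstance(value, int):
        return "long"
    if isinstance(value, float):
        return "float"
    if isinstance(value, str):
        return "text"
    if isinstance(value, dict):
        return "object"
    raise TypeError(f"unsupported value: {value!r}")


def apply_dynamic_mapping(mapping: dict, doc: dict) -> list[str]:
    """Add unseen fields to the mapping; return fields whose type conflicts."""
    conflicts = []
    for field, value in doc.items():
        inferred = infer_type(value)
        if field not in mapping:
            mapping[field] = inferred      # new field: accepted automatically
        elif mapping[field] != inferred:
            conflicts.append(field)        # existing field: type is locked in
    return conflicts


mapping = {}
apply_dynamic_mapping(mapping, {"title": "Trail shoe", "price": 89})
print(mapping)  # {'title': 'text', 'price': 'long'}
print(apply_dynamic_mapping(mapping, {"price": "89 USD", "in_stock": True}))  # ['price']
```

The new `in_stock` field is accepted without any schema change, while `price` conflicts with its established type. Resolving that kind of conflict is what forces a new index and a full reindex.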

Ecosystem Maturity 8/10

Elasticsearch has powered search at Wikipedia, GitHub, and thousands of enterprises. The ELK stack (Elasticsearch, Logstash, Kibana) is the de facto standard for log analytics and observability.

Learning Curve 5/10

The query DSL, mapping configuration, analyzer chains, and cluster topology concepts create a significant learning curve. Effective relevance tuning requires understanding information retrieval theory.
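
An analyzer chain is easier to picture as a pipeline: a tokenizer splits the text, then each token filter transforms the token list in turn. The sketch below is a toy version. The stop-word list is an illustrative subset and the plural stripper is a crude stand-in for a real stemmer.

```python
import re

STOP_WORDS = {"a", "an", "and", "the", "of", "for"}  # illustrative subset


def standard_tokenizer(text: str) -> list[str]:
    """Split text into word tokens."""
    return re.findall(r"\w+", text)


def lowercase_filter(tokens: list[str]) -> list[str]:
    return [t.lower() for t in tokens]


def stop_filter(tokens: list[str]) -> list[str]:
    return [t for t in tokens if t not in STOP_WORDS]


def plural_stem_filter(tokens: list[str]) -> list[str]:
    # Crude stand-in for a stemmer: strip a trailing "s" from longer words.
    return [t[:-1] if t.endswith("s") and len(t) > 3 else t for t in tokens]


def analyze(text: str, filters=(lowercase_filter, stop_filter, plural_stem_filter)) -> list[str]:
    """Run the tokenizer, then each token filter in order."""
    tokens = standard_tokenizer(text)
    for token_filter in filters:
        tokens = token_filter(tokens)
    return tokens


print(analyze("The Running Shoes for Trails"))  # ['running', 'shoe', 'trail']
```

Filter order matters: the stop filter here only recognizes lowercase words, so it must run after lowercasing. The same chain has to be applied at index time and at query time, otherwise query terms will not match the indexed tokens.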

CAP Theorem

AP Availability + Partition Tolerance

Search engines prioritize availability and partition tolerance. Elasticsearch search is near real-time and eventually consistent: a write becomes searchable only after the next index refresh (every 1 second by default, and configurable), not immediately.
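
The refresh delay can be modelled with a toy index that buffers writes until a refresh publishes them. `NearRealTimeIndex` and its simulated clock are assumptions made for illustration. They mimic the visible behaviour only, not Lucene's segment mechanics.

```python
class NearRealTimeIndex:
    """Toy model: writes sit in a buffer until a refresh makes them searchable."""

    def __init__(self, refresh_interval: float = 1.0):
        self.refresh_interval = refresh_interval
        self.buffer = []        # accepted but not yet searchable
        self.searchable = []    # visible to queries
        self.last_refresh = 0.0

    def index(self, doc: str) -> None:
        self.buffer.append(doc)

    def tick(self, now: float) -> None:
        """Advance the simulated clock; refresh if the interval has elapsed."""
        if now - self.last_refresh >= self.refresh_interval:
            self.searchable.extend(self.buffer)
            self.buffer.clear()
            self.last_refresh = now

    def search(self, term: str) -> list[str]:
        return [d for d in self.searchable if term in d]


idx = NearRealTimeIndex()
idx.index("error: disk full")
idx.tick(now=0.5)
print(idx.search("error"))  # []
idx.tick(now=1.0)
print(idx.search("error"))  # ['error: disk full']
```

In Elasticsearch the interval is controlled by the `index.refresh_interval` index setting. A write request can also pass `refresh=wait_for` to block until the document is searchable, trading write latency for read-your-writes behaviour.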

Top Databases

Elasticsearch Elastic License 2.0 / SSPL (source-available since 2021; AGPLv3 option added in 2024)

The most popular search and analytics engine, built on Apache Lucene. Powers search, logging, and security analytics for thousands of organizations with a rich query DSL and visualization stack.

Apache Solr Apache 2.0

Enterprise search platform built on Apache Lucene with advanced features like faceted search, distributed indexing, and rich document handling. Preferred in traditional enterprise environments.

Lightning-fast, typo-tolerant search engine designed for end-user-facing search. Prioritizes developer experience with simple APIs and instant out-of-the-box relevance.

Open-source search engine focused on developer productivity and ease of use. Features typo tolerance, faceting, and geo-search with a simple RESTful API and low operational overhead.