
The Problem
Imagine you have millions of events in your database and you want to find every single one that contains the word "soccer."
A traditional database query would have to scan every single event row-by-row to check for that word.
That is slow, expensive, and unscalable.
The Solution
A data structure called an Inverted Index.
It flips the relationship entirely.
❌ Instead of storing: Event → List of words it contains
✅ It stores: Word → List of Events where it appears
How it works
If you have three events:
- Event1: "soccer championship finals"
- Event2: "youth soccer meetup"
- Event3: "music festival"
The inverted index looks like:
- soccer → [Event1, Event2]
- championship → [Event1]
- music → [Event3]
Now, when a user searches for "soccer", the system doesn't scan the database. It just looks up the word "soccer" in the index and instantly retrieves [Event 1, Event 2].
It's basically the index page at the back of a textbook, but applied to millions (or billions) of documents.
This simple concept is the engine behind Elasticsearch, Solr, Lucene, and almost every major search bar you use today.
Next: Once you understand how inverted indexes work, the next question is: where should this index live? Learn why search needs its own service and infrastructure.