Search and Matching Algorithms

Behind every successful marketplace transaction is an algorithm that connected the right buyer with the right seller — search and matching is the invisible engine that turns a catalog of supply into a personalized, high-converting experience.

Why This Matters

🏢 Owner: Search and matching quality is often the biggest lever on conversion rate. A marketplace with great supply but poor matching is like a library with no catalog: the value is there, but nobody can find it.
💻 Dev: You'll design and build the ranking models, scoring functions, and matching algorithms that power the core marketplace experience. This is where engineering effort has the highest ROI.
📋 PM: Every ranking decision involves trade-offs — relevance vs revenue (promoted listings), fairness vs optimization (new seller visibility), personalization vs privacy. You'll navigate these daily.
🎨 Designer: The results your algorithms produce are only as good as how they're presented. You design the surfaces that make ranking decisions visible and useful to buyers.

The Concept (Simple)

Think of a marketplace matching algorithm like a dating app.

A bad dating app shows you everyone in the city. You scroll endlessly, see mostly irrelevant profiles, and give up. A great dating app learns your preferences, filters out bad matches, and shows you a curated set of people you're most likely to connect with — and it gets better the more you use it.

Marketplace search works the same way. The algorithm's job is to take a massive catalog of supply and surface the few options most likely to result in a successful transaction for this specific buyer at this specific moment. It learns from every search, click, booking, review, and abandonment.

How It Works (Detailed)

Search vs Matching: Two Paradigms

┌─────────────────────────────────────────────────────────────────┐
│            SEARCH vs MATCHING PARADIGMS                           │
├──────────────────────────────┬──────────────────────────────────┤
│         SEARCH               │          MATCHING                │
├──────────────────────────────┼──────────────────────────────────┤
│ Buyer initiates              │ Platform initiates               │
│ "Show me options"            │ "Here's your match"              │
│ Buyer chooses from results   │ Algorithm assigns/recommends     │
│ Many results returned        │ One or few results returned      │
│                              │                                  │
│ Examples:                    │ Examples:                        │
│ • Airbnb search results     │ • Uber driver assignment         │
│ • Etsy product search       │ • DoorDash dasher assignment     │
│ • Upwork freelancer browse  │ • Thumbtack pro quotes           │
│                              │ • Tinder profile suggestions     │
├──────────────────────────────┼──────────────────────────────────┤
│ Best for: High-consideration │ Best for: Commodity/urgent       │
│ purchases where buyer wants  │ services where speed matters     │
│ to compare and choose        │ more than selection              │
└──────────────────────────────┴──────────────────────────────────┘

Building a Ranking Model

A ranking model scores every candidate listing against a query and returns results in order of predicted transaction probability:

┌─────────────────────────────────────────────────────────────────┐
│                    RANKING MODEL LAYERS                           │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Layer 1: RETRIEVAL (narrow the universe)                       │
│  ┌─────────────────────────────────────────────────────┐       │
│  │  Input: 500,000 active listings                     │       │
│  │  Filter: location, category, availability, price    │       │
│  │  Output: 2,000 candidate listings                   │       │
│  │  Tech: Inverted index (Elasticsearch), geo-hash     │       │
│  │  Speed: < 50ms                                      │       │
│  └─────────────────────────────────────────────────────┘       │
│           │                                                     │
│           ▼                                                     │
│  Layer 2: SCORING (rank the candidates)                         │
│  ┌─────────────────────────────────────────────────────┐       │
│  │  Input: 2,000 candidates                            │       │
│  │  Score each on: relevance, quality, personalization │       │
│  │  Output: Scored and sorted list                     │       │
│  │  Tech: ML model (gradient boosted trees, neural)    │       │
│  │  Speed: < 100ms                                     │       │
│  └─────────────────────────────────────────────────────┘       │
│           │                                                     │
│           ▼                                                     │
│  Layer 3: RE-RANKING (apply business logic)                     │
│  ┌─────────────────────────────────────────────────────┐       │
│  │  Input: Scored list                                 │       │
│  │  Apply: diversity rules, new seller boost,          │       │
│  │         promoted listing insertion, deduplication    │       │
│  │  Output: Final result set (20-50 per page)          │       │
│  │  Tech: Rule engine + A/B testing framework          │       │
│  │  Speed: < 20ms                                      │       │
│  └─────────────────────────────────────────────────────┘       │
│                                                                 │
│  Total pipeline latency target: < 200ms                         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
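The three layers above can be sketched as three small functions chained together. This is a minimal illustration, not a production design: the `Listing` fields, the `model_score` stand-in for the scoring model's output, and the one-listing-per-seller rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    listing_id: str
    seller_id: str
    category: str
    price: float
    available: bool
    model_score: float  # hypothetical stand-in for the Layer 2 model's output


def retrieve(listings, category, max_price):
    """Layer 1: cheap hard filters that narrow the candidate set."""
    return [
        item for item in listings
        if item.available and item.category == category and item.price <= max_price
    ]


def score(candidates):
    """Layer 2: order candidates by score, best first."""
    return sorted(candidates, key=lambda item: item.model_score, reverse=True)


def rerank(ranked, page_size):
    """Layer 3: business rules. Here: keep at most one listing per seller."""
    seen_sellers, page = set(), []
    for item in ranked:
        if item.seller_id in seen_sellers:
            continue
        seen_sellers.add(item.seller_id)
        page.append(item)
        if len(page) == page_size:
            break
    return page


def search(listings, category, max_price, page_size=20):
    return rerank(score(retrieve(listings, category, max_price)), page_size)


catalog = [
    Listing("a", "s1", "cleaning", 40.0, True, 0.91),
    Listing("b", "s1", "cleaning", 55.0, True, 0.88),   # same seller as "a"
    Listing("c", "s2", "cleaning", 35.0, True, 0.75),
    Listing("d", "s3", "cleaning", 90.0, True, 0.99),   # over the price filter
    Listing("e", "s4", "plumbing", 30.0, True, 0.95),   # wrong category
    Listing("f", "s5", "cleaning", 45.0, False, 0.97),  # unavailable
]
page = search(catalog, "cleaning", max_price=60.0, page_size=2)
print([item.listing_id for item in page])  # ['a', 'c']
```

Keeping the layers as separate functions means each one can be swapped out (for example, replacing `score` with a trained model) without touching the other two.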

Scoring Features

The scoring layer combines dozens of features into a single relevance score:

Feature Category	Features	Weight (typical)
Text relevance	Title match, description match, category match, synonym expansion	20%
Quality signals	Average rating, review count, completion rate, response time	25%
Geo-relevance	Distance from search location, neighborhood match, service area	20% (location marketplaces)
Personalization	Past searches, past purchases, saved items, price sensitivity	15%
Freshness	Last updated, recently active, calendar current	10%
Conversion history	Click-through rate, booking rate, purchase rate for this listing	10%

Scoring Formula Example

A simplified scoring function for a service marketplace:

┌─────────────────────────────────────────────────────────┐
│           SCORING FUNCTION (Simplified)                   │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  score = (0.20 × text_relevance)                        │
│        + (0.25 × quality_score)                         │
│        + (0.20 × geo_score)                             │
│        + (0.15 × personal_score)                        │
│        + (0.10 × freshness_score)                       │
│        + (0.10 × conversion_score)                      │
│                                                         │
│  Where:                                                 │
│  text_relevance  = BM25(query, listing_text)            │
│  quality_score   = normalize(rating × log(reviews+1))   │
│  geo_score       = 1 / (1 + distance_km)                │
│  personal_score  = cosine_sim(user_vector, listing_vec) │
│  freshness_score = decay(days_since_update, half_life)  │
│  conversion_score= historical_CTR × booking_rate        │
│                                                         │
│  In practice: replace with ML model trained on          │
│  click/booking data once you have enough volume         │
│                                                         │
└─────────────────────────────────────────────────────────┘
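The weighted sum above translates almost line for line into code. The sketch below is one possible reading, with several assumptions: the text relevance value is taken as an already-normalized BM25 score in [0, 1], `normalize` is implemented as division by a capped maximum (`review_cap` is a made-up parameter), and `decay` is implemented as exponential half-life decay.

```python
import math

# Weights from the scoring function above; they sum to 1.0.
WEIGHTS = {
    "text_relevance": 0.20,
    "quality": 0.25,
    "geo": 0.20,
    "personalization": 0.15,
    "freshness": 0.10,
    "conversion": 0.10,
}


def quality_score(rating, reviews, max_rating=5.0, review_cap=1000):
    """rating * log(reviews + 1), scaled to [0, 1] by an assumed ceiling."""
    raw = rating * math.log(reviews + 1)
    ceiling = max_rating * math.log(review_cap + 1)
    return min(raw / ceiling, 1.0)


def geo_score(distance_km):
    return 1.0 / (1.0 + distance_km)


def freshness_score(days_since_update, half_life_days=30.0):
    """Halves every `half_life_days`; 1.0 for a listing updated today."""
    return 0.5 ** (days_since_update / half_life_days)


def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def listing_score(features):
    """Weighted sum of the six component scores."""
    return sum(WEIGHTS[name] * features[name] for name in WEIGHTS)


features = {
    "text_relevance": 0.8,                      # assumed pre-normalized BM25
    "quality": quality_score(rating=4.8, reviews=120),
    "geo": geo_score(distance_km=3.0),
    "personalization": cosine_sim([1, 0, 1], [1, 1, 1]),
    "freshness": freshness_score(days_since_update=30),
    "conversion": 0.05 * 0.20,                  # historical CTR x booking rate
}
print(round(listing_score(features), 3))
```

One thing the example makes visible: a product of two rates such as CTR × booking rate is much smaller than the other components, so a fixed-weight sum behaves more predictably when every component is rescaled to a comparable range first.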

Matching Algorithms for Service Marketplaces

Service marketplaces often use algorithmic matching instead of search:

Nearest-Neighbor Matching (Uber Model)

┌─────────────────────────────────────────────────────────┐
│           NEAREST-NEIGHBOR MATCHING                      │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Rider requests pickup at location (x, y)               │
│       │                                                 │
│       ▼                                                 │
│  Query geo-index for drivers within 5km radius          │
│       │                                                 │
│       ▼                                                 │
│  Filter: available, correct vehicle type, rating > 4.5  │
│       │                                                 │
│       ▼                                                 │
│  Score remaining drivers:                               │
│  • ETA to pickup (50% weight)                           │
│  • Driver rating (20% weight)                           │
│  • Trip acceptance rate (20% weight)                    │
│  • Driver earnings balance (10% weight — fairness)      │
│       │                                                 │
│       ▼                                                 │
│  Assign top-scored driver, notify, start countdown      │
│  If declined → assign next driver within 10 seconds     │
│                                                         │
│  Tech: Geospatial index (H3, S2), real-time updates    │
│  Latency: < 3 seconds end-to-end                       │
│                                                         │
└─────────────────────────────────────────────────────────┘
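A rough sketch of the filter-then-score flow above, using the same weights. It makes several simplifying assumptions: straight-line (haversine) distance stands in for ETA, a linear scan stands in for a real geospatial index such as H3 or S2, and the `Driver` fields and the `max_earnings` normalizer are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class Driver:
    driver_id: str
    lat: float
    lon: float
    available: bool
    vehicle_type: str
    rating: float           # 0 to 5
    acceptance_rate: float  # 0 to 1
    earnings_today: float


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points, in kilometres."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * 6371.0 * math.asin(math.sqrt(a))


def rank_drivers(drivers, lat, lon, vehicle_type,
                 radius_km=5.0, min_rating=4.5, max_earnings=300.0):
    """Return eligible driver ids, best match first."""
    ranked = []
    for d in drivers:
        dist = haversine_km(lat, lon, d.lat, d.lon)
        if (not d.available or d.vehicle_type != vehicle_type
                or dist > radius_km or d.rating <= min_rating):
            continue
        eta_score = 1.0 - dist / radius_km            # distance as an ETA proxy
        rating_score = d.rating / 5.0
        fairness = 1.0 - min(d.earnings_today / max_earnings, 1.0)
        total = (0.5 * eta_score + 0.2 * rating_score
                 + 0.2 * d.acceptance_rate + 0.1 * fairness)
        ranked.append((total, d.driver_id))
    ranked.sort(reverse=True)
    return [driver_id for _, driver_id in ranked]


drivers = [
    Driver("d1", 37.7760, -122.4180, True, "sedan", 4.8, 0.90, 50.0),
    Driver("d2", 37.7900, -122.4000, True, "sedan", 4.9, 0.95, 200.0),
    Driver("d3", 37.7750, -122.4195, True, "sedan", 4.2, 0.99, 10.0),   # rating too low
    Driver("d4", 37.7755, -122.4190, False, "sedan", 4.9, 0.99, 10.0),  # unavailable
    Driver("d5", 37.9000, -122.4194, True, "sedan", 5.0, 1.00, 0.0),    # outside radius
]
queue = rank_drivers(drivers, 37.7749, -122.4194, "sedan")
print(queue)  # ['d1', 'd2']
```

Returning the whole ranked list rather than only the top driver gives the dispatcher a ready fallback order if the first driver declines.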

Auction-Based Matching (Thumbtack Model)

Step	What Happens
1	Buyer submits a job request with details
2	Platform identifies qualified pros in the area
3	Pros receive the lead and decide whether to respond
4	Pros who respond send custom quotes to the buyer
5	Buyer reviews quotes, profiles, and reviews, then selects
6	Platform charges the pro for the lead (lead-gen model) or takes commission on transaction

Optimal Two-Sided Matching

For marketplaces where assignments need to be globally optimal (not just locally greedy):

┌─────────────────────────────────────────────────────────┐
│        TWO-SIDED OPTIMAL MATCHING                        │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Problem: Assign N orders to M drivers to minimize      │
│  total delivery time across ALL assignments             │
│                                                         │
│  Orders:    O1    O2    O3    O4                        │
│              \   / \   / \   /                          │
│               \ /   \ /   \ /                           │
│  ┌─────────────────────────────────────┐               │
│  │     MATCHING OPTIMIZER              │               │
│  │     (Hungarian Algorithm /          │               │
│  │      Linear Programming)            │               │
│  └─────────────────────────────────────┘               │
│               / \   / \   / \                           │
│              /   \ /   \ /   \                          │
│  Drivers:   D1    D2    D3    D4                        │
│                                                         │
│  Greedy: Assign each order to nearest driver            │
│  → locally optimal but often globally suboptimal        │
│                                                         │
│  Optimal: Consider all assignments simultaneously       │
│  → minimizes total wait time across all orders          │
│                                                         │
│  Used by: DoorDash, Uber (batched matching),           │
│           Instacart, Postmates                          │
│                                                         │
└─────────────────────────────────────────────────────────┘
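The gap between greedy and optimal assignment shows up even with two orders and two drivers. The sketch below uses made-up cost numbers and a brute-force search over all permutations as a stand-in for the Hungarian algorithm; brute force is only workable for tiny inputs, and it assumes equal numbers of orders and drivers.

```python
from itertools import permutations

# cost[i][j] = estimated minutes for driver j to serve order i (made-up values)
cost = [
    [4, 5],
    [5, 20],
]


def greedy_assignment(cost):
    """Give each order, in turn, the cheapest driver still free."""
    taken, assignment, total = set(), [], 0
    for row in cost:
        free = [j for j in range(len(row)) if j not in taken]
        best = min(free, key=lambda j: row[j])
        taken.add(best)
        assignment.append(best)
        total += row[best]
    return assignment, total


def optimal_assignment(cost):
    """Try every one-to-one assignment and keep the cheapest overall."""
    n = len(cost)

    def total_cost(perm):
        return sum(cost[i][perm[i]] for i in range(n))

    best = min(permutations(range(n)), key=total_cost)
    return list(best), total_cost(best)


print(greedy_assignment(cost))   # ([0, 1], 24)
print(optimal_assignment(cost))  # ([1, 0], 10)
```

Greedy gives order 0 its nearest driver and leaves order 1 with a 20-minute trip. The optimal assignment accepts one extra minute on order 0 to save 15 on order 1. At realistic batch sizes a polynomial-time solver is the usual choice; SciPy's `scipy.optimize.linear_sum_assignment` is one readily available option.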

Personalization Engine

As you collect more user behavior data, layer in personalization:

Signal	What It Reveals	How to Use It
Search history	Categories and price ranges of interest	Boost results in preferred categories
Click behavior	Which listing attributes attract this user	Emphasize similar attributes in results
Purchase history	Past transaction patterns	Recommend complementary or repeat purchases
Saved/wishlist	Aspirational preferences	Weight results toward saved listing characteristics
Location history	Where the user typically searches	Pre-filter to relevant geography
Device/time	Usage context (mobile/desktop, morning/evening)	Adjust result format and ranking for context
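One simple way to turn these signals into a ranking input is to build a preference vector per user from their past interactions and compare it with each listing's attribute vector. The sketch below assumes hypothetical three-dimensional listing vectors and made-up signal weights (a purchase counts more than a save, which counts more than a click).

```python
import math

# Hypothetical attribute vectors: [budget, luxury, family_friendly]
LISTING_VECTORS = {
    "cabin": [0.9, 0.1, 0.8],
    "penthouse": [0.0, 1.0, 0.1],
    "hostel": [1.0, 0.0, 0.2],
}

# Assumed relative strength of each behavioral signal.
SIGNAL_WEIGHTS = {"click": 1.0, "save": 2.0, "purchase": 5.0}


def build_user_vector(events, listing_vectors, signal_weights):
    """Weighted sum of the vectors of listings the user interacted with."""
    dims = len(next(iter(listing_vectors.values())))
    total = [0.0] * dims
    for listing_id, signal in events:
        weight = signal_weights[signal]
        for k, value in enumerate(listing_vectors[listing_id]):
            total[k] += weight * value
    return total


def cosine_sim(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def personalize(candidates, user_vector, listing_vectors):
    """Order candidate listing ids by similarity to the user's vector."""
    return sorted(
        candidates,
        key=lambda lid: cosine_sim(user_vector, listing_vectors[lid]),
        reverse=True,
    )


events = [("hostel", "click"), ("cabin", "purchase"), ("hostel", "save")]
user_vector = build_user_vector(events, LISTING_VECTORS, SIGNAL_WEIGHTS)
ordering = personalize(["penthouse", "hostel", "cabin"], user_vector, LISTING_VECTORS)
print(ordering)  # ['cabin', 'hostel', 'penthouse']
```

In a full system this similarity would feed the scoring layer as one feature among several rather than decide the order on its own.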

Search Fairness and Diversity

Ranking purely by predicted conversion can create winner-take-all dynamics. Apply diversity rules:

Rule	Purpose	Implementation
New seller boost	Help new sellers get initial visibility and reviews	Temporary ranking bonus for first 30 days
Seller rotation	Prevent same top sellers dominating page 1	Soft cap on impressions per seller per day
Category diversity	Show variety in search results	Ensure top 10 results include 3+ different sub-categories
Price diversity	Don't only show cheapest or most expensive	Include results across the price spectrum
Geographic spread	For location marketplaces, cover the search area	Don't cluster all results in one neighborhood
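Two of these rules are easy to sketch in the re-ranking layer. The code below is an illustrative take, not a reference implementation: the boost size, the swap strategy for category diversity, and the function names are all assumptions.

```python
from collections import Counter


def boosted_score(score, days_since_signup, boost=0.1, window_days=30):
    """New seller boost: add a fixed bonus during the seller's first 30 days."""
    return score + boost if days_since_signup < window_days else score


def diversify(ranked, top_k=10, min_categories=3):
    """Ensure the first `top_k` results span at least `min_categories`.

    `ranked` is a list of (listing_id, category) pairs, best first. Lower-ranked
    duplicates of an over-represented category are swapped out for the best
    item from a category not yet shown.
    """
    head, tail = list(ranked[:top_k]), list(ranked[top_k:])
    while len({category for _, category in head}) < min_categories:
        seen = {category for _, category in head}
        swap_in = next((item for item in tail if item[1] not in seen), None)
        if swap_in is None:
            break  # not enough distinct categories in the result set
        counts = Counter(category for _, category in head)
        drop_idx = next(
            (i for i in range(len(head) - 1, -1, -1) if counts[head[i][1]] > 1),
            None,
        )
        if drop_idx is None:
            break  # nothing can be removed without losing a category
        del head[drop_idx]
        tail.remove(swap_in)
        head.append(swap_in)
    return head


ranked = [
    ("a", "yoga"), ("b", "yoga"), ("c", "yoga"), ("d", "yoga"),
    ("e", "pilates"), ("f", "boxing"),
]
top = diversify(ranked, top_k=4, min_categories=3)
print([listing_id for listing_id, _ in top])  # ['a', 'b', 'e', 'f']
```

The two strongest yoga listings keep their positions, so the relevance cost of the diversity rule falls on the weakest duplicates.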

In Practice

What Good Looks Like: Airbnb's Search Ranking

Airbnb's ranking considers 100+ features, including:

Location relevance — how well the listing matches the searched area
Price competitiveness — reasonable pricing vs comparable listings
Listing quality — photo count and quality, description completeness, accuracy score
Host behavior — response rate, acceptance rate, cancellation history
Guest fit — past booking patterns, group size, trip type
Freshness — recently updated calendar signals active host

What Good Looks Like: DoorDash's Matching

DoorDash uses batched optimal matching:

Collects orders over a short window (30-60 seconds)
Runs an optimization algorithm across all pending orders and available drivers
Considers: driver location, restaurant prep time, delivery distance, driver schedule preferences
Assigns batches (sometimes 2 orders to 1 driver) to minimize total delivery time
Re-optimizes continuously as new orders arrive and drivers complete deliveries

Common Anti-Patterns

Keyword-only search — relying on text match without quality, location, or personalization signals. Results feel random.
Static ranking — hardcoding ranking factors instead of learning from user behavior. The model never improves.
Ignoring position bias — items ranked first get more clicks regardless of quality. Your click data is biased by your own ranking. Use techniques like randomization or inverse propensity weighting.
Over-optimizing for revenue — stuffing promoted listings into top positions degrades search quality and buyer trust.
Cold start neglect — new listings have no click/conversion data, so they rank poorly, so they never get data. Create explicit cold-start strategies.
One-size-fits-all — using the same ranking model for different intents (browsing vs buying vs comparing). Detect intent and adapt.
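The position bias point can be made concrete with a small inverse propensity weighting (IPW) sketch. The examination probabilities per rank below are hypothetical; in practice they would have to be estimated, for example from randomized result swaps.

```python
# Assumed probability that a user even looks at each rank position.
PROPENSITY = {1: 1.0, 2: 0.6, 3: 0.4, 4: 0.25}


def naive_ctr(impressions):
    """Clicks divided by impressions, ignoring where the listing was shown."""
    return sum(clicked for _, clicked in impressions) / len(impressions)


def ipw_ctr(impressions, propensity):
    """Weight each click by 1 / P(examined at that position)."""
    weighted = sum(clicked / propensity[pos] for pos, clicked in impressions)
    return weighted / len(impressions)


# Each impression is (position, clicked). Made-up logs for two listings:
listing_a = [(1, 1)] * 20 + [(1, 0)] * 80   # always shown first, 20 clicks
listing_b = [(4, 1)] * 10 + [(4, 0)] * 90   # always shown fourth, 10 clicks

print(naive_ctr(listing_a), naive_ctr(listing_b))                      # 0.2 0.1
print(ipw_ctr(listing_a, PROPENSITY), ipw_ctr(listing_b, PROPENSITY))  # 0.2 0.4
```

On raw CTR listing A looks twice as good as listing B. After correcting for how rarely position 4 is examined, B appears to be the more attractive listing, which is the kind of signal a model trained on uncorrected clicks would miss.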

ML Model Evolution

Stage	Approach	Data Required
Pre-data	Hand-tuned rules (location, rating, price)	None
Early	Logistic regression on click data	10K+ searches with outcomes
Growth	Gradient boosted trees (XGBoost, LightGBM)	100K+ searches
Scale	Deep learning (embedding models, transformers)	1M+ searches
Advanced	Real-time personalization + contextual bandits	Continuous stream

Key Takeaways

Search and matching are different paradigms: search lets buyers choose from ranked options, matching assigns optimal pairings algorithmically
Build ranking in three layers: retrieval (narrow the universe), scoring (rank candidates), re-ranking (apply business rules and diversity)
Start with hand-tuned scoring weights, evolve to ML models as you collect transaction data
Quality signals (ratings, completion rate, response time) should be the heaviest-weighted features in most marketplace ranking models
Service marketplaces benefit from optimal matching algorithms that consider all supply and demand simultaneously, not greedy nearest-neighbor
Personalization compounds over time: each interaction gives the model more signal for the next recommendation, building a data advantage that is hard for competitors to replicate
Balance ranking optimization with fairness — new seller boosts, seller rotation, and diversity rules prevent winner-take-all dynamics
Measure ranking quality with offline metrics (NDCG) and online metrics (CTR, conversion rate, search-to-transaction rate)
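For the offline metric mentioned in the last takeaway, here is a minimal NDCG sketch. It uses the linear-gain form of DCG (relevance divided by log2 of rank + 1); some implementations use an exponential gain instead, so treat the exact formula as one common variant. The graded relevance labels are made up.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: higher ranks count for more."""
    return sum(
        rel / math.log2(rank + 1)
        for rank, rel in enumerate(relevances, start=1)
    )


def ndcg(relevances, k=None):
    """DCG of the actual ordering divided by DCG of the ideal ordering."""
    k = len(relevances) if k is None else k
    ideal = dcg(sorted(relevances, reverse=True)[:k])
    return dcg(relevances[:k]) / ideal if ideal > 0 else 0.0


# Relevance labels of the results in the order the ranker returned them
# (for example 3 = booked, 2 = saved, 1 = clicked, 0 = ignored).
print(ndcg([3, 2, 1, 0]))            # 1.0, already in the ideal order
print(round(ndcg([0, 1, 2, 3]), 3))  # below 1.0, best result ranked last
```

Because NDCG is normalized per query, scores can be averaged across queries with different numbers of relevant results.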

Action Items

☐ 🏢 Owner: Establish search quality as a core KPI — track search-to-transaction rate weekly alongside GMV
☐ 🏢 Owner: Decide your marketplace's search vs matching paradigm based on transaction urgency and consideration level
☐ 💻 Dev: Build a three-layer ranking pipeline (retrieval → scoring → re-ranking) with configurable feature weights
☐ 💻 Dev: Implement click and conversion tracking on search results from day one — this data trains your future ML models
☐ 📋 PM: Define ranking factor weights for launch and create an A/B testing framework to iterate on them
☐ 📋 PM: Design cold-start strategies for new listings (temporary boost, random exposure, editorial featuring)
☐ 🎨 Designer: Design search result cards that surface the signals your ranking model values (rating, reviews, response time, price)
☐ 🎨 Designer: Create distinct UI patterns for search results vs algorithmic matches so users understand what they're seeing

Next: Payments, Escrow, and Payouts
