
Search and Matching Algorithms

Behind every successful marketplace transaction is an algorithm that connected the right buyer with the right seller — search and matching is the invisible engine that turns a catalog of supply into a personalized, high-converting experience.

Why This Matters

  • 🏢 Owner: Search and matching quality is often one of the biggest levers on conversion rate. A marketplace with great supply but poor matching is like a library with no catalog: the value is there, but nobody can find it.
  • 💻 Dev: You'll design and build the ranking models, scoring functions, and matching algorithms that power the core marketplace experience. This is often where engineering effort has the highest ROI.
  • 📋 PM: Every ranking decision involves trade-offs — relevance vs revenue (promoted listings), fairness vs optimization (new seller visibility), personalization vs privacy. You'll navigate these daily.
  • 🎨 Designer: The results your algorithms produce are only as good as how they're presented. You design the surfaces that make ranking decisions visible and useful to buyers.

The Concept (Simple)

Think of a marketplace matching algorithm like a dating app.

A bad dating app shows you everyone in the city. You scroll endlessly, see mostly irrelevant profiles, and give up. A great dating app learns your preferences, filters out bad matches, and shows you a curated set of people you're most likely to connect with — and it gets better the more you use it.

Marketplace search works the same way. The algorithm's job is to take a massive catalog of supply and surface the few options most likely to result in a successful transaction for this specific buyer at this specific moment. It learns from every search, click, booking, review, and abandonment.

How It Works (Detailed)

Search vs Matching: Two Paradigms

┌─────────────────────────────────────────────────────────────────┐
│            SEARCH vs MATCHING PARADIGMS                         │
├──────────────────────────────┬──────────────────────────────────┤
│         SEARCH               │          MATCHING                │
├──────────────────────────────┼──────────────────────────────────┤
│ Buyer initiates              │ Platform initiates               │
│ "Show me options"            │ "Here's your match"              │
│ Buyer chooses from results   │ Algorithm assigns/recommends     │
│ Many results returned        │ One or few results returned      │
│                              │                                  │
│ Examples:                    │ Examples:                        │
│ • Airbnb search results      │ • Uber driver assignment         │
│ • Etsy product search        │ • DoorDash dasher assignment     │
│ • Upwork freelancer browse   │ • Thumbtack pro quotes           │
│                              │ • Tinder profile suggestions     │
├──────────────────────────────┼──────────────────────────────────┤
│ Best for: High-consideration │ Best for: Commodity/urgent       │
│ purchases where buyer wants  │ services where speed matters     │
│ to compare and choose        │ more than selection              │
└──────────────────────────────┴──────────────────────────────────┘

Building a Ranking Model

A ranking model scores every candidate listing against a query and returns results in order of predicted transaction probability:

┌─────────────────────────────────────────────────────────────────┐
│                    RANKING MODEL LAYERS                         │
├─────────────────────────────────────────────────────────────────┤
│                                                                 │
│  Layer 1: RETRIEVAL (narrow the universe)                       │
│  ┌─────────────────────────────────────────────────────┐       │
│  │  Input: 500,000 active listings                     │       │
│  │  Filter: location, category, availability, price    │       │
│  │  Output: 2,000 candidate listings                   │       │
│  │  Tech: Inverted index (Elasticsearch), geo-hash     │       │
│  │  Speed: < 50ms                                      │       │
│  └─────────────────────────────────────────────────────┘       │
│           │                                                     │
│           ▼                                                     │
│  Layer 2: SCORING (rank the candidates)                         │
│  ┌─────────────────────────────────────────────────────┐       │
│  │  Input: 2,000 candidates                            │       │
│  │  Score each on: relevance, quality, personalization │       │
│  │  Output: Scored and sorted list                     │       │
│  │  Tech: ML model (gradient boosted trees, neural)    │       │
│  │  Speed: < 100ms                                     │       │
│  └─────────────────────────────────────────────────────┘       │
│           │                                                     │
│           ▼                                                     │
│  Layer 3: RE-RANKING (apply business logic)                     │
│  ┌─────────────────────────────────────────────────────┐       │
│  │  Input: Scored list                                 │       │
│  │  Apply: diversity rules, new seller boost,          │       │
│  │         promoted listing insertion, deduplication   │       │
│  │  Output: Final result set (20-50 per page)          │       │
│  │  Tech: Rule engine + A/B testing framework          │       │
│  │  Speed: < 20ms                                      │       │
│  └─────────────────────────────────────────────────────┘       │
│                                                                 │
│  Total pipeline latency target: < 200ms                         │
│                                                                 │
└─────────────────────────────────────────────────────────────────┘
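
As a rough illustration of how the three layers compose, here is a minimal Python sketch. The `Listing` fields, the `model_score` stand-in for the Layer 2 model output, and the one-listing-per-seller rule are assumptions made for this example, not a description of any particular marketplace's stack. A production system would typically back retrieval with a search index rather than a list scan.

```python
from dataclasses import dataclass


@dataclass
class Listing:
    listing_id: str
    seller_id: str
    category: str
    price: float
    available: bool
    model_score: float  # hypothetical stand-in for the Layer 2 model output


def retrieve(listings, category, max_price):
    """Layer 1: cheap hard filters that narrow the universe."""
    return [
        item for item in listings
        if item.available and item.category == category and item.price <= max_price
    ]


def score(candidates):
    """Layer 2: order candidates by score, best first."""
    return sorted(candidates, key=lambda item: item.model_score, reverse=True)


def rerank(ranked, page_size=20):
    """Layer 3: business rules. Here: keep only each seller's best listing."""
    seen_sellers = set()
    page = []
    for item in ranked:
        if item.seller_id in seen_sellers:
            continue  # deduplicate by seller
        seen_sellers.add(item.seller_id)
        page.append(item)
        if len(page) == page_size:
            break
    return page


def search(listings, category, max_price, page_size=20):
    """Run the full pipeline: retrieval, then scoring, then re-ranking."""
    return rerank(score(retrieve(listings, category, max_price)), page_size)
```

Keeping each layer a separate function makes it possible to swap the scorer (hand-tuned weights at first, an ML model later) without touching retrieval or the business rules.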

Scoring Features

The scoring layer combines dozens of features into a single relevance score:

| Feature Category | Features | Weight (typical) |
| --- | --- | --- |
| Text relevance | Title match, description match, category match, synonym expansion | 20% |
| Quality signals | Average rating, review count, completion rate, response time | 25% |
| Geo-relevance | Distance from search location, neighborhood match, service area | 20% (location marketplaces) |
| Personalization | Past searches, past purchases, saved items, price sensitivity | 15% |
| Freshness | Last updated, recently active, calendar current | 10% |
| Conversion history | Click-through rate, booking rate, purchase rate for this listing | 10% |

Scoring Formula Example

A simplified scoring function for a service marketplace:

┌─────────────────────────────────────────────────────────┐
│           SCORING FUNCTION (Simplified)                 │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  score = (0.20 × text_relevance)                        │
│        + (0.25 × quality_score)                         │
│        + (0.20 × geo_score)                             │
│        + (0.15 × personal_score)                        │
│        + (0.10 × freshness_score)                       │
│        + (0.10 × conversion_score)                      │
│                                                         │
│  Where:                                                 │
│  text_relevance  = BM25(query, listing_text)            │
│  quality_score   = normalize(rating × log(reviews+1))   │
│  geo_score       = 1 / (1 + distance_km)                │
│  personal_score  = cosine_sim(user_vector, listing_vec) │
│  freshness_score = decay(days_since_update, half_life)  │
│  conversion_score= historical_CTR × booking_rate        │
│                                                         │
│  In practice: replace with ML model trained on          │
│  click/booking data once you have enough volume         │
│                                                         │
└─────────────────────────────────────────────────────────┘
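
The formula above translates fairly directly into code. The sketch below is one possible reading rather than a reference implementation: it assumes `text_relevance` has already been scaled to [0, 1] (raw BM25 scores are unbounded), caps the review count at a hypothetical 1,000 for normalization, uses a 30-day half-life for freshness, and clamps negative cosine similarity to zero. All of those constants are placeholders to tune.

```python
import math

# Weights from the simplified scoring function; they sum to 1.0.
WEIGHTS = {
    "text": 0.20,
    "quality": 0.25,
    "geo": 0.20,
    "personal": 0.15,
    "freshness": 0.10,
    "conversion": 0.10,
}


def quality_score(rating, reviews, max_rating=5.0, review_cap=1000):
    """rating * log(reviews + 1), divided by an assumed ceiling to land in [0, 1]."""
    raw = rating * math.log(reviews + 1)
    ceiling = max_rating * math.log(review_cap + 1)
    return min(raw / ceiling, 1.0)


def geo_score(distance_km):
    """1.0 at zero distance, falling toward 0 as distance grows."""
    return 1.0 / (1.0 + distance_km)


def freshness_score(days_since_update, half_life_days=30.0):
    """Exponential decay: the score halves every half_life_days."""
    return 0.5 ** (days_since_update / half_life_days)


def cosine_sim(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def listing_score(text_relevance, rating, reviews, distance_km,
                  user_vector, listing_vec, days_since_update,
                  historical_ctr, booking_rate):
    """Weighted sum of the six components, each expected to lie in [0, 1]."""
    components = {
        "text": text_relevance,  # assumed to be a pre-normalized BM25 score
        "quality": quality_score(rating, reviews),
        "geo": geo_score(distance_km),
        "personal": max(0.0, cosine_sim(user_vector, listing_vec)),
        "freshness": freshness_score(days_since_update),
        "conversion": historical_ctr * booking_rate,
    }
    return sum(WEIGHTS[name] * value for name, value in components.items())
```

Scaling every component to the same range matters here: without it, an unbounded term such as raw BM25 would swamp the others no matter what weights are chosen.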

Matching Algorithms for Service Marketplaces

Service marketplaces often use algorithmic matching instead of search:

Nearest-Neighbor Matching (Uber Model)

┌─────────────────────────────────────────────────────────┐
│           NEAREST-NEIGHBOR MATCHING                     │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Rider requests pickup at location (x, y)               │
│       │                                                 │
│       ▼                                                 │
│  Query geo-index for drivers within 5km radius          │
│       │                                                 │
│       ▼                                                 │
│  Filter: available, correct vehicle type, rating > 4.5  │
│       │                                                 │
│       ▼                                                 │
│  Score remaining drivers:                               │
│  • ETA to pickup (50% weight)                           │
│  • Driver rating (20% weight)                           │
│  • Trip acceptance rate (20% weight)                    │
│  • Driver earnings balance (10% weight — fairness)      │
│       │                                                 │
│       ▼                                                 │
│  Assign top-scored driver, notify, start countdown      │
│  If declined → assign next driver within 10 seconds     │
│                                                         │
│  Tech: Geospatial index (H3, S2), real-time updates     │
│  Latency: < 3 seconds end-to-end                        │
│                                                         │
└─────────────────────────────────────────────────────────┘
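
The filter-and-score step in the flow above could be sketched as follows. This assumes the geo-index query has already returned nearby drivers with an ETA attached; the `Driver` fields, the 15-minute ETA ceiling, and the way each signal is scaled to [0, 1] are illustrative assumptions, and the vehicle-type filter is omitted for brevity. Only the weights (50/20/20/10) and the 4.5 rating threshold come from the diagram.

```python
from dataclasses import dataclass

MAX_ETA_MIN = 15.0  # assumed ceiling used to scale ETA into [0, 1]
MIN_RATING = 4.5    # rating filter from the flow above


@dataclass
class Driver:
    driver_id: str
    eta_min: float          # estimated minutes to reach the pickup point
    rating: float           # 1.0 to 5.0
    acceptance_rate: float  # 0.0 to 1.0
    earnings_today: float   # used for the fairness term
    available: bool = True


def driver_score(driver, max_earnings):
    """Weighted score: ETA 50%, rating 20%, acceptance 20%, fairness 10%."""
    eta_part = max(0.0, 1.0 - driver.eta_min / MAX_ETA_MIN)
    rating_part = driver.rating / 5.0
    # Drivers who have earned less today get a small boost.
    fairness_part = 1.0 - driver.earnings_today / max_earnings if max_earnings else 1.0
    return (0.5 * eta_part + 0.2 * rating_part
            + 0.2 * driver.acceptance_rate + 0.1 * fairness_part)


def rank_drivers(drivers):
    """Filter ineligible drivers, then return the rest ordered best first."""
    eligible = [d for d in drivers if d.available and d.rating > MIN_RATING]
    max_earnings = max((d.earnings_today for d in eligible), default=0.0)
    return sorted(eligible, key=lambda d: driver_score(d, max_earnings), reverse=True)
```

Returning the whole ordered list, rather than only the top driver, gives the dispatcher its fallback queue: if the first driver declines, the request goes to the next entry.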

Auction-Based Matching (Thumbtack Model)

| Step | What Happens |
| --- | --- |
| 1 | Buyer submits a job request with details |
| 2 | Platform identifies qualified pros in the area |
| 3 | Pros receive the lead and decide whether to respond |
| 4 | Pros who respond send custom quotes to the buyer |
| 5 | Buyer reviews quotes, profiles, and reviews, then selects |
| 6 | Platform charges the pro for the lead (lead-gen model) or takes commission on transaction |

Optimal Two-Sided Matching

For marketplaces where assignments need to be globally optimal (not just locally greedy):

┌─────────────────────────────────────────────────────────┐
│        TWO-SIDED OPTIMAL MATCHING                       │
├─────────────────────────────────────────────────────────┤
│                                                         │
│  Problem: Assign N orders to M drivers to minimize      │
│  total delivery time across ALL assignments             │
│                                                         │
│  Orders:    O1    O2    O3    O4                        │
│              \   / \   / \   /                          │
│               \ /   \ /   \ /                           │
│  ┌─────────────────────────────────────┐               │
│  │     MATCHING OPTIMIZER              │               │
│  │     (Hungarian Algorithm /          │               │
│  │      Linear Programming)            │               │
│  └─────────────────────────────────────┘               │
│               / \   / \   / \                           │
│              /   \ /   \ /   \                          │
│  Drivers:   D1    D2    D3    D4                        │
│                                                         │
│  Greedy: Assign each order to nearest driver            │
│  → locally optimal, but can be globally suboptimal      │
│                                                         │
│  Optimal: Consider all assignments simultaneously       │
│  → minimizes total delivery time across all orders      │
│                                                         │
│  Used by: DoorDash, Uber (batched matching),            │
│           Instacart, Postmates                          │
│                                                         │
└─────────────────────────────────────────────────────────┘
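
A tiny worked example shows why greedy assignment can lose to a global optimizer. The sketch below uses brute force over all permutations, which is only viable for a handful of orders; it is a stand-in for the Hungarian algorithm, not an implementation of it. The cost matrix is made up, and the code assumes the same number of orders and drivers.

```python
from itertools import permutations


def total_cost(cost, assignment):
    """Sum of cost[order][driver] over an assignment list (index = order)."""
    return sum(cost[order][driver] for order, driver in enumerate(assignment))


def greedy_assignment(cost):
    """Each order, in arrival sequence, takes the cheapest driver still free."""
    free = set(range(len(cost[0])))
    assignment = []
    for row in cost:
        best = min(free, key=lambda driver: row[driver])
        assignment.append(best)
        free.remove(best)
    return assignment


def optimal_assignment(cost):
    """Try every one-to-one assignment and keep the cheapest (tiny inputs only)."""
    n = len(cost)
    return list(min(permutations(range(n)), key=lambda p: total_cost(cost, p)))


# cost[order][driver] = delivery minutes (hypothetical numbers)
COST = [
    [4, 5],
    [5, 20],
]

greedy = greedy_assignment(COST)    # [0, 1], total 24 minutes
optimal = optimal_assignment(COST)  # [1, 0], total 10 minutes
```

Greedy gives the first order its nearest driver and leaves the second order with a 20-minute trip; accepting one extra minute on the first order saves 15 on the second. At realistic sizes, a polynomial-time solver such as SciPy's `scipy.optimize.linear_sum_assignment` would replace the brute-force search.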

Personalization Engine

As you collect more user behavior data, layer in personalization:

| Signal | What It Reveals | How to Use It |
| --- | --- | --- |
| Search history | Categories and price ranges of interest | Boost results in preferred categories |
| Click behavior | Which listing attributes attract this user | Emphasize similar attributes in results |
| Purchase history | Past transaction patterns | Recommend complementary or repeat purchases |
| Saved/wishlist | Aspirational preferences | Weight results toward saved listing characteristics |
| Location history | Where the user typically searches | Pre-filter to relevant geography |
| Device/time | Usage context (mobile/desktop, morning/evening) | Adjust result format and ranking for context |
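
One simple way to turn signals like these into a ranking input is to average the feature vectors of listings a user has clicked or saved, then rank candidates by cosine similarity to that profile. The sketch below assumes each listing already has a numeric feature vector; the three dimensions used in the usage note (budget, premium, pet-friendly) are hypothetical.

```python
import math


def mean_vector(vectors):
    """Element-wise average of equal-length vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def cosine_sim(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def personalize(history_vectors, candidates):
    """Order candidate ids by similarity to the user's averaged history.

    history_vectors: feature vectors of listings the user engaged with.
    candidates:      dict mapping listing id -> feature vector.
    """
    if not history_vectors:
        return list(candidates)  # no history yet: keep the incoming order
    profile = mean_vector(history_vectors)
    return sorted(candidates,
                  key=lambda lid: cosine_sim(profile, candidates[lid]),
                  reverse=True)
```

For example, a user whose history is `[[1, 0, 1], [1, 0, 0]]` gets the profile `[1.0, 0.0, 0.5]`, so a budget pet-friendly listing `[1, 0, 1]` ranks ahead of a plain budget listing `[1, 0, 0]`, and a premium listing `[0, 1, 0]` ranks last.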

Search Fairness and Diversity

Ranking purely by predicted conversion can create winner-take-all dynamics. Apply diversity rules:

| Rule | Purpose | Implementation |
| --- | --- | --- |
| New seller boost | Help new sellers get initial visibility and reviews | Temporary ranking bonus for first 30 days |
| Seller rotation | Prevent same top sellers dominating page 1 | Soft cap on impressions per seller per day |
| Category diversity | Show variety in search results | Ensure top 10 results include 3+ different sub-categories |
| Price diversity | Don't only show cheapest or most expensive | Include results across the price spectrum |
| Geographic spread | For location marketplaces, cover the search area | Don't cluster all results in one neighborhood |
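
Two of these rules, the new seller boost and category diversity, can be sketched as re-ranking passes over an already scored list. The dict keys (`score`, `category`, `seller_age_days`), the 10% boost, and the swap strategy are assumptions for illustration; the 30-day window and the "3+ sub-categories in the top 10" target come from the rules above.

```python
from collections import Counter


def apply_new_seller_boost(results, boost=1.10, window_days=30):
    """Multiply the score of sellers within window_days of joining, then re-sort."""
    boosted = []
    for r in results:
        factor = boost if r["seller_age_days"] <= window_days else 1.0
        boosted.append({**r, "score": r["score"] * factor})
    return sorted(boosted, key=lambda r: r["score"], reverse=True)


def enforce_category_diversity(ranked, top_k=10, min_categories=3):
    """Swap items into the top_k until it spans min_categories, where possible."""
    top, rest = list(ranked[:top_k]), list(ranked[top_k:])
    while len({r["category"] for r in top}) < min_categories:
        present = {r["category"] for r in top}
        # Best-ranked item below the fold from a category not yet shown.
        newcomer = next((r for r in rest if r["category"] not in present), None)
        if newcomer is None:
            break  # no unseen category left to promote
        counts = Counter(r["category"] for r in top)
        duplicates = [i for i, r in enumerate(top) if counts[r["category"]] > 1]
        if not duplicates:
            break  # nothing can be demoted without losing a category
        demoted = top.pop(duplicates[-1])  # lowest-ranked duplicate
        rest.remove(newcomer)
        top.append(newcomer)
        rest.insert(0, demoted)
    return top + rest
```

Because the diversity pass only demotes the lowest-ranked item from an over-represented category, the strongest results keep their positions and the cost to relevance stays small.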

In Practice

What Good Looks Like: Airbnb's Search Ranking

Airbnb's ranking considers 100+ features, including:

  • Location relevance — how well the listing matches the searched area
  • Price competitiveness — reasonable pricing vs comparable listings
  • Listing quality — photo count and quality, description completeness, accuracy score
  • Host behavior — response rate, acceptance rate, cancellation history
  • Guest fit — past booking patterns, group size, trip type
  • Freshness — recently updated calendar signals active host

What Good Looks Like: DoorDash's Matching

DoorDash uses batched optimal matching:

  1. Collects orders over a short window (30-60 seconds)
  2. Runs an optimization algorithm across all pending orders and available drivers
  3. Considers: driver location, restaurant prep time, delivery distance, driver schedule preferences
  4. Assigns batches (sometimes 2 orders to 1 driver) to minimize total delivery time
  5. Re-optimizes continuously as new orders arrive and drivers complete deliveries

Common Anti-Patterns

  • Keyword-only search — relying on text match without quality, location, or personalization signals. Results feel random.
  • Static ranking — hardcoding ranking factors instead of learning from user behavior. The model never improves.
  • Ignoring position bias — items ranked first get more clicks regardless of quality. Your click data is biased by your own ranking. Use techniques like randomization or inverse propensity weighting.
  • Over-optimizing for revenue — stuffing promoted listings into top positions degrades search quality and buyer trust.
  • Cold start neglect — new listings have no click/conversion data, so they rank poorly, so they never get data. Create explicit cold-start strategies.
  • One-size-fits-all — using the same ranking model for different intents (browsing vs buying vs comparing). Detect intent and adapt.
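
To make the position-bias point concrete, here is a minimal sketch of inverse propensity weighting applied to click-through rate. It assumes a simple examination model in which a click requires the user to notice the slot first, so each click is divided by the probability that its slot is examined. The propensity values and click logs are hypothetical; in practice propensities are usually estimated from randomization experiments.

```python
def ipw_ctr(impressions, propensity):
    """Estimate a position-debiased click-through rate.

    impressions: list of (position, clicked) pairs, with clicked as 0 or 1.
    propensity:  dict mapping position -> probability that slot is examined.
    """
    if not impressions:
        return 0.0
    weighted_clicks = sum(clicked / propensity[position]
                          for position, clicked in impressions)
    return weighted_clicks / len(impressions)


def raw_ctr(impressions):
    """Plain clicks divided by impressions, with no position correction."""
    return sum(clicked for _, clicked in impressions) / len(impressions)


# Hypothetical examination probabilities per result slot.
PROPENSITY = {1: 1.0, 5: 0.25}

listing_a = [(1, 1)] * 20 + [(1, 0)] * 80  # always shown in slot 1
listing_b = [(5, 1)] * 10 + [(5, 0)] * 90  # always shown in slot 5

# raw_ctr(listing_a) is 0.2 and raw_ctr(listing_b) is 0.1, so A looks better.
# ipw_ctr gives 0.2 for A and 0.4 for B: B was held back by its slot.
```

Under these assumed propensities, the listing that looked half as good on raw clicks turns out to be twice as attractive once its position is accounted for.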

ML Model Evolution

| Stage | Approach | Data Required |
| --- | --- | --- |
| Pre-data | Hand-tuned rules (location, rating, price) | None |
| Early | Logistic regression on click data | 10K+ searches with outcomes |
| Growth | Gradient boosted trees (XGBoost, LightGBM) | 100K+ searches |
| Scale | Deep learning (embedding models, transformers) | 1M+ searches |
| Advanced | Real-time personalization + contextual bandits | Continuous stream |
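
For the "Early" stage, a logistic regression can be small enough to write by hand. The sketch below trains one with plain stochastic gradient descent on a toy dataset; the two features (normalized rating and normalized distance), the learning rate, and the epoch count are assumptions for illustration, and a real project would more likely use an established ML library.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict(weights, bias, features):
    """Predicted probability that a shown listing gets booked."""
    return sigmoid(bias + sum(w * x for w, x in zip(weights, features)))


def train_logistic(rows, labels, learning_rate=0.1, epochs=200):
    """Fit weights and bias with per-example gradient steps on log loss."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for features, label in zip(rows, labels):
            error = predict(weights, bias, features) - label
            bias -= learning_rate * error
            weights = [w - learning_rate * error * x
                       for w, x in zip(weights, features)]
    return weights, bias


# Toy training data: [rating_norm, distance_norm] -> booked (1) or not (0).
ROWS = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.9], [0.2, 0.8]]
LABELS = [1, 1, 0, 0]

weights, bias = train_logistic(ROWS, LABELS)
```

The learned weights play the same role as the hand-tuned ones: here the rating weight comes out positive and the distance weight negative, but the magnitudes are fitted from outcomes instead of guessed.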

Key Takeaways

  • Search and matching are different paradigms: search lets buyers choose from ranked options, matching assigns optimal pairings algorithmically
  • Build ranking in three layers: retrieval (narrow the universe), scoring (rank candidates), re-ranking (apply business rules and diversity)
  • Start with hand-tuned scoring weights, evolve to ML models as you collect transaction data
  • Quality signals (ratings, completion rate, response time) should be the heaviest-weighted features in most marketplace ranking models
  • Service marketplaces that assign supply to demand benefit from optimal matching, which considers all pending requests and available providers together, rather than greedy nearest-neighbor assignment
  • Personalization compounds over time: each interaction improves the next recommendation, building a data advantage that competitors find hard to replicate
  • Balance ranking optimization with fairness — new seller boosts, seller rotation, and diversity rules prevent winner-take-all dynamics
  • Measure ranking quality with offline metrics (NDCG) and online metrics (CTR, conversion rate, search-to-transaction rate)
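
NDCG, the offline metric named above, can be computed in a few lines. This sketch uses the linear-gain form (relevance divided by log2 of rank + 1); another common variant uses 2^relevance − 1 as the gain. The relevance grades are assumed to come from your own labels, for example 0 for no interaction, 1 for a click, and 3 for a booking.

```python
import math


def dcg(relevances):
    """Discounted cumulative gain: relevance discounted by log2(rank + 1)."""
    return sum(rel / math.log2(rank + 1)
               for rank, rel in enumerate(relevances, start=1))


def ndcg(relevances, k=None):
    """DCG of the ranking as shown, divided by the DCG of the ideal ordering.

    relevances: graded relevance of each result, in the order it was displayed.
    k:          optional cutoff (evaluate only the top k positions).
    """
    shown = relevances[:k] if k is not None else relevances
    ideal = sorted(relevances, reverse=True)[:len(shown)]
    ideal_dcg = dcg(ideal)
    return dcg(shown) / ideal_dcg if ideal_dcg else 0.0
```

A ranking that shows results in perfect relevance order scores 1.0; pushing the most relevant result down the page lowers the score, and a list with no relevant results scores 0.0.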

Action Items

  • 🏢 Owner: Establish search quality as a core KPI — track search-to-transaction rate weekly alongside GMV
  • 🏢 Owner: Decide your marketplace's search vs matching paradigm based on transaction urgency and consideration level
  • 💻 Dev: Build a three-layer ranking pipeline (retrieval → scoring → re-ranking) with configurable feature weights
  • 💻 Dev: Implement click and conversion tracking on search results from day one — this data trains your future ML models
  • 📋 PM: Define ranking factor weights for launch and create an A/B testing framework to iterate on them
  • 📋 PM: Design cold-start strategies for new listings (temporary boost, random exposure, editorial featuring)
  • 🎨 Designer: Design search result cards that surface the signals your ranking model values (rating, reviews, response time, price)
  • 🎨 Designer: Create distinct UI patterns for search results vs algorithmic matches so users understand what they're seeing


The Product Builder's Playbook