Recommendation Model [WIP]

Status: This spec is a work-in-progress. The custom model architecture, scoring pipeline, training cycle, and embedding-based approach will be designed in a later phase. The current implementation uses a simple interim ranking (see below) to unblock the feed endpoint and UI.

The recommendation model provides personalized ranking of events and gigs for each user. It operates independently from the LLM chatbot. The final model will be a purpose-built system trained on platform-specific signals.

Until the custom model is designed, GET /recommendations uses a simple heuristic ranking:

  1. Candidate selection: Events with status OPEN or IN_PROGRESS and startAt in the future.
  2. Interest matching: Events whose category matches any of the user’s interests are boosted.
  3. Popularity: Events with more total interactions rank higher.
  4. Recency: Events starting sooner rank higher.
  5. Dismiss penalty: Events the user has dismissed are excluded from the results entirely, not just ranked lower.

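This is a deterministic query-time sort: no model training, no vectors, no caching layer. It is implemented as a database query with ORDER BY clauses. The sort keys apply in the order listed above: interest match first, then popularity, then start time, which is the tie-break order the scenarios below assume.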
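The interim ranking could be sketched as a single query along the following lines. This is only an illustration under stated assumptions: the table names (`events`, `interactions`), the column names, the `'DISMISS'` interaction type, and the `recommend` helper are all hypothetical, and SQLite stands in for whatever database the platform actually uses. Popularity is counted over all interaction rows, following "total interactions" literally.

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE events (
    id TEXT PRIMARY KEY,
    category TEXT NOT NULL,
    status TEXT NOT NULL,
    start_at TEXT NOT NULL      -- ISO 8601 UTC timestamp, compared as text
);
CREATE TABLE interactions (
    user_id TEXT NOT NULL,
    event_id TEXT NOT NULL,
    type TEXT NOT NULL          -- assumed values such as 'VIEW' or 'DISMISS'
);
"""


def recommend(conn, user_id, interests, now):
    """Return event ids ordered by interest match, popularity, then start time."""
    # One "?" per interest; an empty list becomes IN (NULL), which matches nothing.
    # Only "?" or "NULL" is interpolated, so user input never reaches the SQL text.
    placeholders = ",".join("?" for _ in interests) or "NULL"
    query = f"""
        SELECT e.id
        FROM events AS e
        LEFT JOIN (
            SELECT event_id, COUNT(*) AS n
            FROM interactions
            GROUP BY event_id
        ) AS p ON p.event_id = e.id
        WHERE e.status IN ('OPEN', 'IN_PROGRESS')      -- candidate selection
          AND e.start_at > ?
          AND NOT EXISTS (                              -- dismissed events are excluded
              SELECT 1 FROM interactions AS d
              WHERE d.event_id = e.id AND d.user_id = ? AND d.type = 'DISMISS'
          )
        ORDER BY
            (e.category IN ({placeholders})) DESC,     -- interest match first
            COALESCE(p.n, 0) DESC,                     -- then popularity
            e.start_at ASC                             -- then soonest start
    """
    rows = conn.execute(query, (now, user_id, *interests)).fetchall()
    return [row[0] for row in rows]
```

Because every sort key is computed inside the query, the result is deterministic for a given database state and needs no precomputed scores, which matches the "no caching layer" constraint above.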

| Signal | Source | Used in interim? |
| --- | --- | --- |
| User interests | User.interests | Yes: category match boost |
| Interaction history | Interaction table | Partial: dismiss exclusion only |
| Event recency | Event.startAt | Yes |
| Event popularity | Aggregate interaction count | Yes |
| Category affinity | Frequency of interactions per category | No: deferred to custom model |
| Geographic proximity | User location vs. event lat/lng | No: deferred to custom model |
| Embeddings / semantic similarity | Event embeddings via pgvector | No: deferred to custom model |
| Collaborative filtering | User-event matrix | No: deferred to custom model |

The custom model phase will evaluate:

  • Embedding-based content scoring: Represent events as embeddings (via the AI module’s embedding task) and compute semantic similarity against user profile embeddings. Replaces manual feature vectors.
  • Collaborative filtering: User-event interaction matrix with matrix factorization.
  • Hybrid approach: Content-based as primary ranker, collaborative as boost signal.
  • Training/update cycle: Nightly retraining for collaborative; real-time embedding updates for content.
  • Caching: Per-user cache with TTL and invalidation on interaction/interest change.

These details will be spec’d when the model is designed.
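To make the hybrid option listed above more concrete, the following minimal sketch scores an event by cosine similarity between a user profile embedding and an event embedding, then adds a weighted collaborative score as a boost. Nothing here is decided by this spec: the function names, the additive combination, and the `collab_weight` default of 0.2 are assumptions chosen purely for illustration.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def hybrid_score(user_vec, event_vec, collab_score, collab_weight=0.2):
    """Content similarity is the primary term; the collaborative score is a boost.

    collab_weight is a hypothetical tuning parameter, not a value from the spec.
    """
    return cosine_similarity(user_vec, event_vec) + collab_weight * collab_score
```

Keeping the collaborative term as a small additive boost means an event with no interaction data still gets a usable score from its content alone, which is one way to handle new events.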

GIVEN user A has interests ["music", "tech"]
WHEN user A sends GET /recommendations
THEN events with category "music" or "tech" appear higher in the list

GIVEN event E has 50 total interactions and event F has 5
AND both match user A's interests equally
WHEN user A sends GET /recommendations
THEN event E ranks higher than event F

GIVEN event A starts tomorrow and event B starts in 30 days
AND both have equal popularity and interest match
WHEN user A sends GET /recommendations
THEN event A ranks higher than event B

GIVEN user A dismissed event E
WHEN user A sends GET /recommendations
THEN event E does not appear in the results

GIVEN event E has startAt in the past and status COMPLETED
WHEN user A sends GET /recommendations
THEN event E is not in the results

S-REC-MODEL-6: No interests, no interactions — popularity fallback

GIVEN user A has no interests and no interactions
WHEN user A sends GET /recommendations
THEN events are sorted by popularity (interaction count) then recency
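
Read together, the scenarios describe a lexicographic sort key. The sketch below is an in-memory stand-in that could be used to check them in a unit test. The `Event` dataclass, its field names, and the `rank` function are hypothetical and are not the platform's actual types.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class Event:
    id: str
    category: str
    status: str
    start_at: datetime
    interactions: int  # total interaction count, used as popularity


def rank(events, interests, dismissed_ids, now):
    """Filter candidates, then sort by (interest match, popularity, start time)."""
    candidates = [
        e for e in events
        if e.status in {"OPEN", "IN_PROGRESS"}
        and e.start_at > now
        and e.id not in dismissed_ids
    ]
    # False sorts before True, so events matching an interest come first.
    return sorted(
        candidates,
        key=lambda e: (e.category not in interests, -e.interactions, e.start_at),
    )


# Popularity scenario: equal interest match, so the event with 50 interactions leads.
now = datetime(2030, 1, 1, tzinfo=timezone.utc)
e = Event("E", "music", "OPEN", now + timedelta(days=5), 50)
f = Event("F", "music", "OPEN", now + timedelta(days=5), 5)
assert [x.id for x in rank([f, e], {"music"}, set(), now)] == ["E", "F"]
```

With an empty `interests` set every event fails the interest test equally, so the order falls through to popularity and then start time, which is the behavior S-REC-MODEL-6 asks for.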