spotify_vibe/DESIGN.md
2026-02-26 20:40:18 +00:00

Design / Architecture

Goal

The service generates a Spotify "daily vibe" playlist based on:

  • the user's recent listening
  • a local cache of liked tracks
  • the history of tracks previously recommended by the bot

The main user interface is a Telegram bot (/generate, /connect, /status, etc.), with an optional nightly cron trigger.

High-level overview

Core components:

  • FastAPI application
    • health check
    • Spotify OAuth start/callback
    • internal endpoint for cron (/internal/jobs/nightly)
  • TelegramBotRunner (polling)
    • handles user commands
    • starts generation and sends status updates
  • PlaylistJobService
    • orchestrates a single run (token -> sync likes -> candidates -> playlist -> persist)
  • RecommendationEngine
    • builds seed profile
    • collects candidate pool
    • ranks and selects tracks
  • SpotifyClient / LastFmClient
    • external API calls
  • SQLite (via async SQLAlchemy)
    • users, liked cache, recommendation history, run log

Runtime / lifecycle

Entry point: app/main.py.

On startup:

  1. Load Settings (app/config.py)
  2. Create async SQLAlchemy engine and session factory (app/db/session.py)
  3. Run create_all (auto-create tables)
  4. Create shared httpx.AsyncClient
  5. Create API clients:
    • SpotifyClient
    • LastFmClient
  6. Create services:
    • SpotifyAuthService
    • RecommendationEngine
    • PlaylistJobService
  7. Initialize TelegramBotRunner and start polling
  8. Store runtime/service objects in app.state.runtime and app.state.services

On shutdown:

  • stop Telegram polling
  • close httpx.AsyncClient
  • dispose DB engine

Containers / deployment

docker-compose.yml defines:

  • app (main service, FastAPI + Telegram polling)
  • cron (optional service with supercronic)

Important:

  • cron is under profiles: ["cron"] and does not start by default
  • the project is now manual-first: users generate playlists via Telegram /generate

cron runs scripts/run_nightly.sh, which calls:

  • POST /internal/jobs/nightly with Authorization: Bearer <INTERNAL_JOB_TOKEN>

Application layers

1. API layer (app/api/routes.py)

Responsibilities:

  • HTTP endpoints for OAuth and internal jobs

Endpoints:

  • GET /health
  • GET /auth/spotify/start
  • GET /auth/spotify/callback
  • POST /internal/jobs/nightly

Notes:

  • OAuth callback sends a Telegram notification to the user on success
  • nightly endpoint is protected by INTERNAL_JOB_TOKEN
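
A minimal sketch of how the INTERNAL_JOB_TOKEN check could look, assuming the token arrives as `Authorization: Bearer <token>`. The function name `is_authorized` is hypothetical, and the constant-time comparison is a suggestion rather than a description of the existing route.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only for 'Bearer <token>' with a matching, non-empty token."""
    if not authorization_header or not expected_token:
        # A missing header or an unset token must never authorize a request.
        return False
    scheme, _, token = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking the mismatch position through timing.
    return hmac.compare_digest(token.strip().encode(), expected_token.encode())
```

Rejecting an empty expected token guards against an unconfigured deployment where an empty bearer value would otherwise match.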

2. Bot layer (app/bot/telegram_bot.py)

Responsibilities:

  • user-facing interface via Telegram commands and reply-keyboard buttons

Supported commands:

  • /start
  • /help
  • /connect
  • /status
  • /generate
  • /latest
  • /setsize
  • /setratio
  • /sync
  • /lang

Notes:

  • /generate calls PlaylistJobService.generate_for_user(..., force=True, notify=False)
  • /sync only refreshes liked tracks cache
  • each command uses a short-lived DB session from session_factory
  • bot UI supports ru, en, uk, and nl (localized text/buttons)

3. Service layer

SpotifyAuthService (app/services/spotify_auth.py)

Responsibilities:

  • create OAuth state
  • exchange code for tokens
  • refresh access token
  • ensure valid access token before Spotify calls

Notes:

  • datetime comparisons are normalized to UTC (important because datetimes read back from SQLite are naive)
  • stores scopes and expiry on the users row
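
The UTC normalization can be illustrated as follows. This is a sketch under the assumption that naive values read from the database were originally stored in UTC; `as_utc`, `token_is_expired`, and the 60-second leeway are hypothetical, not taken from app/services/spotify_auth.py.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional


def as_utc(value: datetime) -> datetime:
    """Attach UTC to naive datetimes; convert aware ones to UTC."""
    if value.tzinfo is None:
        # Assumption: naive values were stored as UTC.
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def token_is_expired(
    expires_at: datetime,
    now: Optional[datetime] = None,
    leeway: timedelta = timedelta(seconds=60),
) -> bool:
    """Treat the token as expired slightly early so it is refreshed in time."""
    current = as_utc(now or datetime.now(timezone.utc))
    return as_utc(expires_at) - leeway <= current
```

Without this step, comparing a naive `expires_at` with an aware "now" raises `TypeError` in Python.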

RecommendationEngine (app/services/recommendation.py)

Responsibilities:

  • sync liked tracks into local cache
  • build seed profile
  • collect candidates from multiple sources
  • rank/select final track list

Current candidate sources:

  • Spotify recommendations
  • Spotify artist top tracks
  • Spotify search (seed-artist fallback)
  • Last.fm track similar -> Spotify search
  • Last.fm artist similar -> Spotify search

Key implementation details:

  • respects Spotify recommendations seed limit: max 5 seeds per request
  • degrades gracefully when some sources fail
  • falls back to liked tracks when every candidate is already liked
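
Two of the details above, the 5-seed limit and graceful degradation, can be sketched like this. The helper names and the shape of a "source" (a zero-argument callable returning track ids) are assumptions made for illustration; the real engine is async and works with richer objects.

```python
from typing import Callable, Dict, Iterable, List, Tuple

MAX_SEEDS = 5  # seed limit per recommendations request, as stated above


def chunk_seeds(seeds: List[str], size: int = MAX_SEEDS) -> List[List[str]]:
    """Split seeds into batches that each fit into one request."""
    return [seeds[i:i + size] for i in range(0, len(seeds), size)]


def collect_candidates(
    sources: Dict[str, Callable[[], Iterable[str]]],
) -> Tuple[List[str], Dict[str, str]]:
    """Call every source; record failures instead of aborting the run."""
    candidates: List[str] = []
    errors: Dict[str, str] = {}
    for name, fetch in sources.items():
        try:
            candidates.extend(fetch())
        except Exception as exc:  # one failing source must not sink the others
            errors[name] = str(exc)
    return candidates, errors


def broken_lastfm() -> Iterable[str]:
    """Hypothetical source that simulates an outage."""
    raise RuntimeError("lastfm unavailable")
```

For example, `collect_candidates({"spotify": lambda: ["t1", "t2"], "lastfm": broken_lastfm})` still returns the Spotify tracks and reports the Last.fm failure separately.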

PlaylistJobService (app/services/playlist_job.py)

Responsibilities:

  • orchestrate an end-to-end playlist generation run
  • create Spotify playlist and add tracks
  • persist run details and track list
  • update recommendation history
  • send Telegram notifications (if notifier is configured)

Run sequence:

  1. Validate user / Spotify connection
  2. Create playlist_runs row with running status
  3. Get valid access token
  4. Sync liked tracks
  5. Build playlist via RecommendationEngine
  6. Create playlist in Spotify
  7. Add tracks to playlist
  8. Persist run tracks / history / metadata
  9. Commit and return JobOutcome

On error:

  • playlist_runs.status = failed
  • error message is written to notes
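
The run sequence and its error handling can be sketched as a loop over named steps. `RunRecord` and `execute_run` are hypothetical simplifications of the playlist_runs row and the job service; the note format is also an assumption.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable, List, Tuple

Step = Tuple[str, Callable[[], Awaitable[None]]]


@dataclass
class RunRecord:
    """In-memory stand-in for a playlist_runs row."""
    status: str = "running"
    notes: str = ""
    completed_steps: List[str] = field(default_factory=list)


async def execute_run(steps: List[Step]) -> RunRecord:
    run = RunRecord()  # the row is created with status "running" first
    try:
        for name, step in steps:
            await step()
            run.completed_steps.append(name)
        run.status = "success"
    except Exception as exc:
        # On error: mark the run as failed and keep the message in notes.
        run.status = "failed"
        run.notes = f"{type(exc).__name__}: {exc}"
    return run


async def ok() -> None:
    return None


async def boom() -> None:
    raise RuntimeError("Spotify returned 403")
```

Running `execute_run([("token", ok), ("sync_likes", boom), ("create_playlist", ok)])` stops at the failing step and leaves a `failed` record whose notes contain the error text.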

Client layer

SpotifyClient (app/clients/spotify.py)

Encapsulates Spotify Web API calls.

Important implementation choices:

  • create_playlist() uses POST /me/playlists
    • chosen because POST /users/{id}/playlists can return 403 in some app/account combinations
  • add_playlist_items() uses POST /playlists/{playlist_id}/items
    • chosen because POST /playlists/{playlist_id}/tracks may return 403 while /items succeeds
  • delete_playlist() uses DELETE /playlists/{playlist_id}/followers
    • this unfollows the playlist (Spotify does not support hard-delete of playlists)
  • built-in retry for 429 rate limiting using Retry-After
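
A minimal sketch of a 429 retry loop driven by Retry-After. It is transport-agnostic on purpose: `send` is a hypothetical callable returning `(status, headers)`, and the attempt count and default delay are assumptions rather than the values used in app/clients/spotify.py.

```python
import asyncio
from typing import Awaitable, Callable, List, Mapping, Tuple

Response = Tuple[int, Mapping[str, str]]  # (status code, headers), simplified


async def send_with_retry(
    send: Callable[[], Awaitable[Response]],
    max_attempts: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Response:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        status, headers = await send()
        if status != 429 or attempt == max_attempts:
            return status, headers
        try:
            # Retry-After is given in seconds for rate-limit responses.
            delay = float(headers.get("Retry-After", default_delay))
        except ValueError:
            delay = default_delay
        await sleep(max(delay, 0.0))
    raise AssertionError("unreachable")


async def demo() -> Tuple[int, List[float]]:
    responses = iter([(429, {"Retry-After": "2"}), (200, {})])
    waits: List[float] = []

    async def fake_send() -> Response:
        return next(responses)

    async def fake_sleep(seconds: float) -> None:
        waits.append(seconds)  # record instead of actually sleeping

    status, _ = await send_with_retry(fake_send, sleep=fake_sleep)
    return status, waits
```

Injecting `sleep` keeps the retry logic testable without real delays. A plain dict is used for headers here; real HTTP headers are case-insensitive.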

LastFmClient (app/clients/lastfm.py)

Optional enrichment source for similarity.

  • can be disabled (empty LASTFM_API_KEY)
  • Last.fm errors should not fail the whole run if other sources still work

Persistence layer (SQLite + SQLAlchemy)

Tables (app/db/models.py)

users

Stores:

  • Telegram identity (telegram_chat_id, telegram_username)
  • Spotify identity/tokens/scopes (spotify_user_id, access/refresh token, expiry, scopes)
  • user settings (playlist_size, min_new_ratio, timezone)
  • last outputs (last_generated_date, latest_playlist_id, latest_playlist_url)

auth_states

Temporary OAuth state for callback:

  • state
  • telegram_chat_id
  • expires_at

saved_tracks

Local cache of the user's Liked Songs:

  • spotify_track_id
  • track/artist metadata, album, popularity
  • added_at

recommendation_history

History of previously recommended tracks:

  • spotify_track_id
  • first_recommended_at
  • last_recommended_at
  • times_recommended
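
The three counters above suggest an upsert per recommended track. The sketch below uses the standard-library `sqlite3` module instead of async SQLAlchemy to stay self-contained; the `user_id` column, the primary key, and the text timestamps are assumptions, not the schema in app/db/models.py. It relies on SQLite's `ON CONFLICT ... DO UPDATE` (available since SQLite 3.24).

```python
import sqlite3

SCHEMA = """
CREATE TABLE recommendation_history (
    user_id INTEGER NOT NULL,               -- assumed column
    spotify_track_id TEXT NOT NULL,
    first_recommended_at TEXT NOT NULL,
    last_recommended_at TEXT NOT NULL,
    times_recommended INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (user_id, spotify_track_id)  -- assumed key
)
"""


def record_recommendation(
    conn: sqlite3.Connection, user_id: int, track_id: str, when: str
) -> None:
    """Insert a first recommendation or bump the counters of an existing one."""
    conn.execute(
        """
        INSERT INTO recommendation_history
            (user_id, spotify_track_id, first_recommended_at,
             last_recommended_at, times_recommended)
        VALUES (?, ?, ?, ?, 1)
        ON CONFLICT(user_id, spotify_track_id) DO UPDATE SET
            last_recommended_at = excluded.last_recommended_at,
            times_recommended = times_recommended + 1
        """,
        (user_id, track_id, when, when),
    )
```

`first_recommended_at` is deliberately left out of the update clause so it keeps the timestamp of the first recommendation.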

playlist_runs

Playlist generation run log:

  • status (running/success/failed)
  • Spotify playlist metadata
  • stats (total/new/reused)
  • notes

playlist_run_tracks

Snapshot of tracks in a specific run:

  • track id / name / artists
  • source (which source produced the track)
  • position
  • is_new_to_bot

Repository layer (app/db/repositories.py)

Pattern:

  • thin repositories over AsyncSession
  • isolates CRUD/query logic from the service layer

Repositories include:

  • UserRepository
  • AuthStateRepository
  • SavedTrackRepository
  • RecommendationHistoryRepository
  • PlaylistRunRepository

Data flows

OAuth flow

  1. Telegram /connect
  2. SpotifyAuthService.create_connect_url()
  3. User opens Spotify auth page
  4. GET /auth/spotify/callback
  5. SpotifyAuthService.handle_callback()
  6. Tokens and Spotify profile are saved to users
  7. User receives a Telegram confirmation message
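
Steps 2 and 5 of this flow revolve around the `state` value stored in auth_states. The sketch below shows one way to issue and consume it, using an in-memory dict in place of the table. The function signatures, the 10-minute TTL, and the example scope are assumptions; only the authorize URL and its query parameters follow Spotify's authorization code flow.

```python
import secrets
from datetime import datetime, timedelta, timezone
from typing import Dict, List, Optional, Tuple
from urllib.parse import parse_qs, urlencode, urlparse

AUTHORIZE_URL = "https://accounts.spotify.com/authorize"
StateStore = Dict[str, Tuple[int, datetime]]  # state -> (telegram_chat_id, expires_at)


def create_connect_url(
    store: StateStore,
    chat_id: int,
    client_id: str,
    redirect_uri: str,
    scopes: List[str],
    ttl: timedelta = timedelta(minutes=10),  # assumed lifetime
) -> str:
    state = secrets.token_urlsafe(24)  # unguessable, ties callback to chat
    store[state] = (chat_id, datetime.now(timezone.utc) + ttl)
    query = urlencode({
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),
        "state": state,
    })
    return f"{AUTHORIZE_URL}?{query}"


def extract_state(url: str) -> str:
    """Read the state parameter back out of a connect URL."""
    return parse_qs(urlparse(url).query)["state"][0]


def consume_state(store: StateStore, state: str) -> Optional[int]:
    """Return the chat id for a valid state; each state works only once."""
    entry = store.pop(state, None)
    if entry is None:
        return None
    chat_id, expires_at = entry
    return chat_id if datetime.now(timezone.utc) < expires_at else None
```

Popping the state on first use means a replayed callback cannot be matched to a chat a second time.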

Manual generation flow (/generate)

  1. Telegram /generate
  2. PlaylistJobService.generate_for_user(..., force=True)
  3. Sync likes + load recent listening + collect candidates
  4. Create playlist + add items in Spotify
  5. Persist run/history
  6. Reply to user in Telegram

Nightly cron flow (optional)

  1. supercronic in the cron container
  2. scripts/run_nightly.sh
  3. POST /internal/jobs/nightly
  4. PlaylistJobService.generate_for_all_connected_users()

Concurrency / consistency

  • Generation is protected by a single asyncio.Lock (generate_lock) in PlaylistJobService
    • prevents overlapping runs and history update races
  • Most run operations happen in one DB session
  • Errors inside a run mark the run as failed
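
The effect of a single `asyncio.Lock` can be demonstrated with a small sketch. `GenerationGuard` is a hypothetical class that only tracks how many runs are active at once; it is not the real PlaylistJobService.

```python
import asyncio
from typing import List, Tuple


class GenerationGuard:
    def __init__(self) -> None:
        self.generate_lock = asyncio.Lock()
        self.active = 0
        self.max_active = 0

    async def generate(self, user_id: int) -> str:
        # Only one run may be inside this block at any time.
        async with self.generate_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            await asyncio.sleep(0)  # yield so other tasks get a chance to enter
            self.active -= 1
            return f"done:{user_id}"


async def demo() -> Tuple[List[str], int]:
    guard = GenerationGuard()
    results = await asyncio.gather(*(guard.generate(i) for i in range(3)))
    return list(results), guard.max_active
```

Even with three concurrent calls, `max_active` stays at 1. The trade-off is that one lock serializes generation for all users, not just repeated requests from the same user.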

Recommendation algorithm (summary)

Detailed explanation is in README.md, but architecturally the pipeline is:

  1. Build seed profile (recent + liked)
  2. Collect candidate pool (Spotify + Last.fm + fallback search)
  3. Deduplicate
  4. Rank (penalties/boosts)
  5. Select (min_new_ratio + artist caps)
  6. Persist stats/history
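
Steps 3 to 5 can be sketched as a two-pass selection: first reserve slots for tracks that are new to the bot, then fill the rest by score while enforcing an artist cap. The `Candidate` fields, the cap of 2, and the two-pass strategy itself are assumptions for illustration; the actual ranking rules are described in README.md.

```python
import math
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass(frozen=True)
class Candidate:
    track_id: str
    artist: str
    score: float
    is_new: bool  # not previously recommended by the bot


def select_tracks(
    candidates: List[Candidate],
    size: int,
    min_new_ratio: float,
    max_per_artist: int = 2,  # assumed cap
) -> List[Candidate]:
    ranked = sorted(candidates, key=lambda c: c.score, reverse=True)
    # Deduplicate by track id, keeping the best-scored copy.
    seen: Set[str] = set()
    unique = [c for c in ranked if not (c.track_id in seen or seen.add(c.track_id))]

    # Small epsilon guards against float noise such as 10 * 0.3.
    need_new = math.ceil(size * min_new_ratio - 1e-9)
    picked: List[Candidate] = []
    picked_ids: Set[str] = set()
    per_artist: Dict[str, int] = {}

    def try_add(c: Candidate) -> None:
        if c.track_id in picked_ids or per_artist.get(c.artist, 0) >= max_per_artist:
            return
        picked.append(c)
        picked_ids.add(c.track_id)
        per_artist[c.artist] = per_artist.get(c.artist, 0) + 1

    # Pass 1: reserve the required number of slots for new tracks.
    for c in unique:
        if len(picked) >= min(need_new, size):
            break
        if c.is_new:
            try_add(c)
    # Pass 2: fill the remaining slots purely by score.
    for c in unique:
        if len(picked) >= size:
            break
        try_add(c)
    return sorted(picked, key=lambda c: c.score, reverse=True)
```

If there are not enough new tracks to satisfy `min_new_ratio`, this sketch simply fills the remaining slots with the best-scored tracks that are left.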

Configuration

Main environment variables (app/config.py):

  • TELEGRAM_BOT_TOKEN
  • SPOTIFY_CLIENT_ID
  • SPOTIFY_CLIENT_SECRET
  • SPOTIFY_REDIRECT_URI
  • SPOTIFY_DEFAULT_MARKET
  • LASTFM_API_KEY (optional)
  • INTERNAL_JOB_TOKEN
  • DB_PATH
  • DEFAULT_PLAYLIST_SIZE
  • MIN_NEW_RATIO
  • RECENT_DAYS_WINDOW
  • PLAYLIST_VISIBILITY
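
A minimal sketch of loading a subset of these variables with validation. The `Settings` fields shown, the `load_settings` helper, and the defaults (30 tracks, 0.5 ratio) are assumptions for illustration; app/config.py may use a different mechanism and different defaults.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    telegram_bot_token: str
    spotify_client_id: str
    internal_job_token: str
    lastfm_api_key: Optional[str]  # None disables Last.fm enrichment
    default_playlist_size: int
    min_new_ratio: float


def load_settings(env: Optional[Mapping[str, str]] = None) -> Settings:
    source = os.environ if env is None else env

    def required(name: str) -> str:
        value = source.get(name, "").strip()
        if not value:
            raise ValueError(f"missing required environment variable: {name}")
        return value

    ratio = float(source.get("MIN_NEW_RATIO", "0.5"))  # assumed default
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("MIN_NEW_RATIO must be between 0 and 1")
    return Settings(
        telegram_bot_token=required("TELEGRAM_BOT_TOKEN"),
        spotify_client_id=required("SPOTIFY_CLIENT_ID"),
        internal_job_token=required("INTERNAL_JOB_TOKEN"),
        lastfm_api_key=source.get("LASTFM_API_KEY", "").strip() or None,
        default_playlist_size=int(source.get("DEFAULT_PLAYLIST_SIZE", "30")),  # assumed default
        min_new_ratio=ratio,
    )
```

Failing fast on a missing required variable surfaces misconfiguration at startup rather than during the first generation run.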

Diagnostics / observability

Current state:

  • primary feedback comes from Telegram messages and playlist_runs.notes
  • HTTP /health for liveness
  • tests cover critical Spotify routes and parts of the recommendation pipeline

Possible improvements:

  • structured logs for source coverage (how many candidates from each source)
  • metrics for Spotify/Last.fm errors and latency
  • dedicated debug dry-run endpoint (without creating a playlist)

Known limitations

  • SQLite is suitable for small-scale / single-node setups
  • Telegram polling + FastAPI run in the same process/container
  • per-user timezone support is limited (the cron schedule is global)
  • external API limitations (Spotify/Last.fm) vary by app/account