heboba/spotify_vibe

heboba e3ae678fea Add uk and nl

2026-02-26 20:40:18 +00:00

Design / Architecture

Goal

The service generates a Spotify "daily vibe" playlist based on:

the user's recent listening
a local cache of liked tracks
the history of tracks previously recommended by the bot

The main user interface is a Telegram bot (/generate, /connect, /status, etc.), with an optional nightly cron trigger.

High-level overview

Core components:

FastAPI application
- health check
- Spotify OAuth start/callback
- internal endpoint for cron (/internal/jobs/nightly)
TelegramBotRunner (polling)
- handles user commands
- starts generation and sends status updates
PlaylistJobService
- orchestrates a single run (token -> sync likes -> candidates -> playlist -> persist)
RecommendationEngine
- builds seed profile
- collects candidate pool
- ranks and selects tracks
SpotifyClient / LastFmClient
- external API calls
SQLite (via async SQLAlchemy)
- users, liked cache, recommendation history, run log

Runtime / lifecycle

Entry point: app/main.py.

On startup:

Load Settings (app/config.py)
Create async SQLAlchemy engine and session factory (app/db/session.py)
Run create_all (auto-create tables)
Create shared httpx.AsyncClient
Create API clients:
- SpotifyClient
- LastFmClient
Create services:
- SpotifyAuthService
- RecommendationEngine
- PlaylistJobService
Initialize TelegramBotRunner and start polling
Store runtime/service objects in app.state.runtime and app.state.services

On shutdown:

stop Telegram polling
close httpx.AsyncClient
dispose DB engine
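
The startup and shutdown order above can be sketched with an async context manager. This is a minimal illustration only: `Resource` is a hypothetical stand-in for the real engine, `httpx.AsyncClient`, and `TelegramBotRunner`, and the actual wiring in app/main.py may differ.

```python
import asyncio
from contextlib import asynccontextmanager


class Resource:
    """Hypothetical stand-in for the DB engine, HTTP client, or bot runner."""

    def __init__(self, name: str, log: list[str]) -> None:
        self.name = name
        self.log = log

    async def start(self) -> None:
        self.log.append(f"start {self.name}")

    async def stop(self) -> None:
        self.log.append(f"stop {self.name}")


@asynccontextmanager
async def lifespan(log: list[str]):
    # Startup order follows the list above: DB engine, HTTP client, bot polling.
    engine = Resource("db_engine", log)
    http = Resource("http_client", log)
    bot = Resource("telegram_polling", log)
    for resource in (engine, http, bot):
        await resource.start()
    try:
        yield {"engine": engine, "http": http, "bot": bot}
    finally:
        # Shutdown order from the list above: polling, HTTP client, DB engine.
        for resource in (bot, http, engine):
            await resource.stop()


async def lifecycle_demo() -> list[str]:
    log: list[str] = []
    async with lifespan(log):
        log.append("serving")
    return log
```

Stopping polling first means no new command can reach a client or engine that is already closed.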

Containers / deployment

docker-compose.yml defines:

app (main service, FastAPI + Telegram polling)
cron (optional service with supercronic)

Important:

cron is under profiles: ["cron"] and does not start by default
the project is now manual-first: users generate playlists via Telegram /generate

cron runs scripts/run_nightly.sh, which calls:

POST /internal/jobs/nightly with Authorization: Bearer <INTERNAL_JOB_TOKEN>
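
For illustration, the same call expressed in Python rather than shell. The helper name `build_nightly_request` and the base URL are assumptions; the real scripts/run_nightly.sh may be implemented differently.

```python
import urllib.request


def build_nightly_request(base_url: str, token: str) -> urllib.request.Request:
    """Build (but do not send) the POST request the cron job issues."""
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/internal/jobs/nightly",
        data=b"",  # empty request body
        method="POST",
        headers={"Authorization": f"Bearer {token}"},
    )
```

Passing the result to `urllib.request.urlopen` would send it; the token value should come from INTERNAL_JOB_TOKEN.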

Application layers

1. API layer (`app/api/routes.py`)

Responsibilities:

HTTP endpoints for OAuth and internal jobs

Endpoints:

GET /health
GET /auth/spotify/start
GET /auth/spotify/callback
POST /internal/jobs/nightly

Notes:

OAuth callback sends a Telegram notification to the user on success
nightly endpoint is protected by INTERNAL_JOB_TOKEN
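
One way the token check could look, as a framework-independent sketch. The function name `is_authorized` is hypothetical, and the constant-time comparison is an assumption about good practice rather than a description of the existing route.

```python
import hmac
from typing import Optional


def is_authorized(authorization_header: Optional[str], expected_token: str) -> bool:
    """Return True only for 'Bearer <token>' headers carrying the expected token."""
    if not authorization_header or not expected_token:
        return False
    scheme, _, presented = authorization_header.partition(" ")
    if scheme.lower() != "bearer":
        return False
    # compare_digest avoids leaking how many leading characters matched.
    return hmac.compare_digest(presented.strip().encode(), expected_token.encode())
```

Rejecting an empty `expected_token` keeps the endpoint closed when INTERNAL_JOB_TOKEN is unset.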

2. Bot layer (`app/bot/telegram_bot.py`)

Responsibilities:

user-facing interface via Telegram commands and reply-keyboard buttons

Supported commands:

/start
/help
/connect
/status
/generate
/latest
/setsize
/setratio
/sync
/lang

Notes:

/generate calls PlaylistJobService.generate_for_user(..., force=True, notify=False)
/sync only refreshes liked tracks cache
each command uses a short-lived DB session from session_factory
bot UI supports ru, en, uk, and nl (localized text/buttons)

3. Service layer

`SpotifyAuthService` (`app/services/spotify_auth.py`)

Responsibilities:

create OAuth state
exchange code for tokens
refresh access token
ensure valid access token before Spotify calls

Notes:

datetime comparison is normalized to UTC (important for SQLite naive datetimes)
stores scopes and expiry on the users row
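
The UTC normalization matters because SQLite has no timezone-aware datetime type, so values read back are naive, and Python raises `TypeError` when a naive datetime is compared with an aware one. A minimal sketch, assuming naive values were written as UTC; the names `as_utc`, `token_needs_refresh`, and the 60-second skew are illustrative, not taken from the code base.

```python
from datetime import datetime, timedelta, timezone


def as_utc(value: datetime) -> datetime:
    """Treat naive datetimes (as read back from SQLite) as UTC."""
    if value.tzinfo is None:
        return value.replace(tzinfo=timezone.utc)
    return value.astimezone(timezone.utc)


def token_needs_refresh(expires_at: datetime, now: datetime, skew_seconds: int = 60) -> bool:
    """Refresh slightly early so a token does not expire mid-request."""
    return as_utc(expires_at) - timedelta(seconds=skew_seconds) <= as_utc(now)
```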

`RecommendationEngine` (`app/services/recommendation.py`)

Responsibilities:

sync liked tracks into local cache
build seed profile
collect candidates from multiple sources
rank/select final track list

Current candidate sources:

Spotify recommendations
Spotify artist top tracks
Spotify search (seed-artist fallback)
Last.fm track similar -> Spotify search
Last.fm artist similar -> Spotify search

Key implementation details:

respects Spotify recommendations seed limit: max 5 seeds per request
degrades gracefully when some sources fail
falls back to liked tracks when every candidate is already liked
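
The 5-seed limit can be honored by splitting the seed list into batches. The sketch below is one possible approach; whether the engine batches seeds or simply truncates to the first five is not stated here, and `seed_batches` is a hypothetical helper.

```python
from typing import Iterator, Sequence

MAX_SEEDS_PER_REQUEST = 5  # limit stated above for Spotify recommendations


def seed_batches(seeds: Sequence[str], size: int = MAX_SEEDS_PER_REQUEST) -> Iterator[list[str]]:
    """Yield de-duplicated seeds in order, at most `size` per batch."""
    unique = list(dict.fromkeys(seeds))  # preserves first-seen order
    for start in range(0, len(unique), size):
        yield unique[start:start + size]
```

Each batch then maps to one recommendations request.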

`PlaylistJobService` (`app/services/playlist_job.py`)

Responsibilities:

orchestrate an end-to-end playlist generation run
create Spotify playlist and add tracks
persist run details and track list
update recommendation history
send Telegram notifications (if notifier is configured)

Run sequence:

Validate user / Spotify connection
Create playlist_runs row with running status
Get valid access token
Sync liked tracks
Build playlist via RecommendationEngine
Create playlist in Spotify
Add tracks to playlist
Persist run tracks / history / metadata
Commit and return JobOutcome

On error:

playlist_runs.status = failed
error message is written to notes
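
The run sequence and its error handling can be sketched as a loop over steps wrapped in one try/except. `PlaylistRun` and `execute_run` are simplified, hypothetical stand-ins for the real playlist_runs row and service method; only the status values and the use of notes come from the description above.

```python
import asyncio
from dataclasses import dataclass
from typing import Awaitable, Callable, Optional


@dataclass
class PlaylistRun:
    status: str = "running"
    notes: Optional[str] = None


async def execute_run(steps: list[Callable[[], Awaitable[None]]]) -> PlaylistRun:
    """Run steps in order; any exception marks the run as failed."""
    run = PlaylistRun()
    try:
        for step in steps:
            await step()
        run.status = "success"
    except Exception as exc:
        run.status = "failed"
        run.notes = f"{type(exc).__name__}: {exc}"
    return run
```

Creating the row with running status before the first step means a crash still leaves a visible trace in the run log.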

Client layer

`SpotifyClient` (`app/clients/spotify.py`)

Encapsulates Spotify Web API calls.

Important implementation choices:

create_playlist() uses POST /me/playlists
- chosen because POST /users/{id}/playlists can return 403 in some app/account combinations
add_playlist_items() uses POST /playlists/{playlist_id}/items
- /tracks may return 403 while /items succeeds
delete_playlist() uses DELETE /playlists/{playlist_id}/followers
- this unfollows the playlist; the Spotify Web API has no hard-delete for playlists
built-in retry for 429 rate limiting using Retry-After
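
A minimal sketch of a 429 retry driven by Retry-After. The `Response` type, the `send` callable, and the attempt and delay defaults are assumptions made to keep the example self-contained; the real client works with httpx responses.

```python
import asyncio
from dataclasses import dataclass, field
from typing import Awaitable, Callable


@dataclass
class Response:
    """Hypothetical minimal response shape (status code plus headers)."""
    status_code: int
    headers: dict[str, str] = field(default_factory=dict)


async def request_with_retry(
    send: Callable[[], Awaitable[Response]],
    max_attempts: int = 3,
    default_delay: float = 1.0,
    sleep: Callable[[float], Awaitable[None]] = asyncio.sleep,
) -> Response:
    """Re-send while the server answers 429, waiting Retry-After seconds between tries."""
    response = await send()
    for _ in range(max_attempts - 1):
        if response.status_code != 429:
            break
        try:
            delay = max(0.0, float(response.headers.get("Retry-After", "")))
        except ValueError:
            delay = default_delay  # header missing or not a number
        await sleep(delay)
        response = await send()
    return response
```

The injectable `sleep` parameter exists so tests can run without real waiting.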

`LastFmClient` (`app/clients/lastfm.py`)

Optional enrichment source for similarity.

can be disabled (empty LASTFM_API_KEY)
Last.fm errors should not fail the whole run if other sources still work
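
Graceful degradation across sources can be sketched with `asyncio.gather(..., return_exceptions=True)`. The function `collect_candidates` and the source names are hypothetical; the engine may instead wrap each source in its own try/except.

```python
import asyncio
from typing import Awaitable, Callable

Source = Callable[[], Awaitable[list[str]]]


async def collect_candidates(
    sources: dict[str, Source],
) -> tuple[dict[str, list[str]], dict[str, str]]:
    """Query every source; keep successes, record failures, fail only if all fail."""
    names = list(sources)
    results = await asyncio.gather(
        *(sources[name]() for name in names), return_exceptions=True
    )
    collected: dict[str, list[str]] = {}
    errors: dict[str, str] = {}
    for name, result in zip(names, results):
        if isinstance(result, BaseException):
            errors[name] = repr(result)
        else:
            collected[name] = result
    if not collected:
        raise RuntimeError(f"all candidate sources failed: {errors}")
    return collected, errors
```

Returning the error map alongside the results makes it easy to write the failures into the run notes.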

Persistence layer (SQLite + SQLAlchemy)

Tables (`app/db/models.py`)

`users`

Stores:

Telegram identity (telegram_chat_id, telegram_username)
Spotify identity/tokens/scopes (spotify_user_id, access/refresh token, expiry, scopes)
user settings (playlist_size, min_new_ratio, timezone)
last outputs (last_generated_date, latest_playlist_id, latest_playlist_url)

`auth_states`

Temporary OAuth state for callback:

state
telegram_chat_id
expires_at
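
As an illustration of how such a record might be created and checked on callback: the 10-minute TTL, the integer chat id, and the helper names are assumptions, not values taken from the code base.

```python
import secrets
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class AuthState:
    state: str
    telegram_chat_id: int
    expires_at: datetime


def new_auth_state(telegram_chat_id: int, now: datetime, ttl_minutes: int = 10) -> AuthState:
    """Create an unguessable state value bound to one Telegram chat."""
    return AuthState(
        state=secrets.token_urlsafe(32),
        telegram_chat_id=telegram_chat_id,
        expires_at=now + timedelta(minutes=ttl_minutes),
    )


def is_usable(record: AuthState, presented_state: str, now: datetime) -> bool:
    """Accept the callback only if the state matches and has not expired."""
    matches = secrets.compare_digest(record.state.encode(), presented_state.encode())
    return matches and now < record.expires_at
```

Binding the state to telegram_chat_id is what lets the callback know which Telegram user to notify.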

`saved_tracks`

Local cache of the user's Liked Songs:

spotify_track_id
track/artist metadata, album, popularity
added_at

`recommendation_history`

History of previously recommended tracks:

spotify_track_id
first_recommended_at
last_recommended_at
times_recommended
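
How these counters might be maintained can be shown with an upsert in plain `sqlite3`. The `user_id` column, the composite primary key, and the text timestamps are assumptions for the sketch; the real model lives in app/db/models.py and goes through SQLAlchemy. `ON CONFLICT ... DO UPDATE` needs SQLite 3.24 or newer.

```python
import sqlite3

SCHEMA = """
CREATE TABLE recommendation_history (
    user_id INTEGER NOT NULL,             -- assumed foreign key to users
    spotify_track_id TEXT NOT NULL,
    first_recommended_at TEXT NOT NULL,
    last_recommended_at TEXT NOT NULL,
    times_recommended INTEGER NOT NULL DEFAULT 1,
    PRIMARY KEY (user_id, spotify_track_id)
)
"""

UPSERT = """
INSERT INTO recommendation_history
    (user_id, spotify_track_id, first_recommended_at, last_recommended_at, times_recommended)
VALUES (?, ?, ?, ?, 1)
ON CONFLICT (user_id, spotify_track_id) DO UPDATE SET
    last_recommended_at = excluded.last_recommended_at,
    times_recommended = times_recommended + 1
"""


def record_recommendation(conn: sqlite3.Connection, user_id: int, track_id: str, when_iso: str) -> None:
    """Insert a first recommendation or bump the counters of an existing one."""
    conn.execute(UPSERT, (user_id, track_id, when_iso, when_iso))
```

first_recommended_at is only written on insert, so it keeps the original timestamp across repeats.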

`playlist_runs`

Playlist generation run log:

status (running/success/failed)
Spotify playlist metadata
stats (total/new/reused)
notes

`playlist_run_tracks`

Snapshot of tracks in a specific run:

track id / name / artists
source (which source produced the track)
position
is_new_to_bot

Repository layer (`app/db/repositories.py`)

Pattern:

thin repositories over AsyncSession
isolates CRUD/query logic from the service layer

Repositories include:

UserRepository
AuthStateRepository
SavedTrackRepository
RecommendationHistoryRepository
PlaylistRunRepository

Data flows

OAuth flow

Telegram /connect
SpotifyAuthService.create_connect_url()
User opens Spotify auth page
GET /auth/spotify/callback
SpotifyAuthService.handle_callback()
Tokens and Spotify profile are saved to users
User receives a Telegram confirmation message
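
The second step of this flow amounts to building a Spotify authorization-code URL. A sketch under assumptions: `build_connect_url` is a hypothetical helper, and the scope list passed in is for the caller to decide; the scopes the project actually requests are not listed here.

```python
from urllib.parse import urlencode

SPOTIFY_AUTHORIZE_URL = "https://accounts.spotify.com/authorize"


def build_connect_url(client_id: str, redirect_uri: str, state: str, scopes: list[str]) -> str:
    """Return the URL the user opens to grant access (authorization code flow)."""
    query = urlencode(
        {
            "client_id": client_id,
            "response_type": "code",
            "redirect_uri": redirect_uri,
            "state": state,  # echoed back on the callback and checked against auth_states
            "scope": " ".join(scopes),
        }
    )
    return f"{SPOTIFY_AUTHORIZE_URL}?{query}"
```

The `redirect_uri` must match SPOTIFY_REDIRECT_URI exactly as registered for the Spotify app.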

Manual generation flow (`/generate`)

Telegram /generate
PlaylistJobService.generate_for_user(..., force=True)
Sync likes + load recent listening + collect candidates
Create playlist + add items in Spotify
Persist run/history
Reply to user in Telegram

Nightly cron flow (optional)

supercronic in the cron container
scripts/run_nightly.sh
POST /internal/jobs/nightly
PlaylistJobService.generate_for_all_connected_users()

Concurrency / consistency

Generation is protected by a single asyncio.Lock (generate_lock) in PlaylistJobService
- prevents overlapping runs and history update races
Most run operations happen in one DB session
Errors inside a run mark the run as failed
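
The effect of the single lock can be demonstrated in a few lines. `GenerationGuard` is a hypothetical reduction of PlaylistJobService that only tracks how many runs are active at once.

```python
import asyncio


class GenerationGuard:
    def __init__(self) -> None:
        self.generate_lock = asyncio.Lock()
        self.active = 0
        self.max_active = 0

    async def generate_for_user(self, user_id: int) -> int:
        # Only one coroutine at a time gets past this line.
        async with self.generate_lock:
            self.active += 1
            self.max_active = max(self.max_active, self.active)
            await asyncio.sleep(0)  # yield to the event loop, as a real API call would
            self.active -= 1
            return user_id


async def lock_demo() -> int:
    guard = GenerationGuard()
    await asyncio.gather(*(guard.generate_for_user(i) for i in range(5)))
    return guard.max_active
```

A single lock serializes all users, which is simple and safe on one node; a per-user lock would allow more parallelism at the cost of more bookkeeping.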

Recommendation algorithm (summary)

Detailed explanation is in README.md, but architecturally the pipeline is:

Build seed profile (recent + liked)
Collect candidate pool (Spotify + Last.fm + fallback search)
Deduplicate
Rank (penalties/boosts)
Select (min_new_ratio + artist caps)
Persist stats/history
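
The deduplicate, rank, and select stages can be sketched as follows. This is an illustration, not the engine's actual algorithm: the `Candidate` type, the precomputed `score` (standing in for penalties and boosts), and the default cap of 2 tracks per artist are all assumptions.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    track_id: str
    artist: str
    score: float
    is_new: bool  # not previously recommended by the bot


def select_tracks(
    candidates: list[Candidate], size: int, min_new_ratio: float, max_per_artist: int = 2
) -> list[Candidate]:
    # Deduplicate by track id, keeping the highest-scoring copy.
    best: dict[str, Candidate] = {}
    for c in candidates:
        if c.track_id not in best or c.score > best[c.track_id].score:
            best[c.track_id] = c
    ranked = sorted(best.values(), key=lambda c: c.score, reverse=True)

    required_new = math.ceil(size * min_new_ratio)
    chosen: list[Candidate] = []
    per_artist: dict[str, int] = {}

    def take(pool: list[Candidate], limit: int) -> None:
        for c in pool:
            if len(chosen) >= limit:
                break
            if c in chosen or per_artist.get(c.artist, 0) >= max_per_artist:
                continue
            chosen.append(c)
            per_artist[c.artist] = per_artist.get(c.artist, 0) + 1

    take([c for c in ranked if c.is_new], required_new)  # reserve the new-track quota
    take(ranked, size)  # fill the remaining slots by score
    return sorted(chosen, key=lambda c: c.score, reverse=True)
```

If the pool holds fewer new tracks than the quota, or the artist cap excludes too many, the result is simply shorter than `size`.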

Configuration

Main environment variables (app/config.py):

TELEGRAM_BOT_TOKEN
SPOTIFY_CLIENT_ID
SPOTIFY_CLIENT_SECRET
SPOTIFY_REDIRECT_URI
SPOTIFY_DEFAULT_MARKET
LASTFM_API_KEY (optional)
INTERNAL_JOB_TOKEN
DB_PATH
DEFAULT_PLAYLIST_SIZE
MIN_NEW_RATIO
RECENT_DAYS_WINDOW
PLAYLIST_VISIBILITY
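
A sketch of loading a subset of these variables into a typed settings object. Which variables are mandatory and the defaults shown (30 tracks, ratio 0.5) are assumptions for illustration; app/config.py is the authority.

```python
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class Settings:
    telegram_bot_token: str
    spotify_client_id: str
    spotify_client_secret: str
    spotify_redirect_uri: str
    internal_job_token: str
    lastfm_api_key: Optional[str]
    default_playlist_size: int
    min_new_ratio: float


def load_settings(env: Mapping[str, str]) -> Settings:
    def required(name: str) -> str:
        value = env.get(name, "").strip()
        if not value:
            raise ValueError(f"missing required environment variable: {name}")
        return value

    ratio = float(env.get("MIN_NEW_RATIO", "0.5"))  # assumed default
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("MIN_NEW_RATIO must be between 0 and 1")
    return Settings(
        telegram_bot_token=required("TELEGRAM_BOT_TOKEN"),
        spotify_client_id=required("SPOTIFY_CLIENT_ID"),
        spotify_client_secret=required("SPOTIFY_CLIENT_SECRET"),
        spotify_redirect_uri=required("SPOTIFY_REDIRECT_URI"),
        internal_job_token=required("INTERNAL_JOB_TOKEN"),
        # An empty LASTFM_API_KEY disables the Last.fm source.
        lastfm_api_key=env.get("LASTFM_API_KEY", "").strip() or None,
        default_playlist_size=int(env.get("DEFAULT_PLAYLIST_SIZE", "30")),  # assumed default
        min_new_ratio=ratio,
    )
```

In the running service the mapping would be `os.environ`; taking a plain mapping keeps the function easy to test.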

Diagnostics / observability

Current state:

primary feedback comes from Telegram messages and playlist_runs.notes
HTTP /health for liveness
tests cover critical Spotify routes and parts of the recommendation pipeline

Possible improvements:

structured logs for source coverage (how many candidates from each source)
metrics for Spotify/Last.fm errors and latency
dedicated debug dry-run endpoint (without creating a playlist)

Known limitations

SQLite is suitable for small-scale / single-node setups
Telegram polling + FastAPI run in the same process/container
per-user timezone support is limited (cron is global)
external API limitations (Spotify/Last.fm) vary by app/account
