# Spotify Daily Vibe Bot (Telegram + Spotify + Docker)

Telegram bot: https://t.me/spotify_vibe_bot (`@spotify_vibe_bot`)

Ready-to-run backend service that:
- connects to your Spotify account
- reads your liked tracks (`Liked Songs`)
- uses your recent listening history
- generates a Spotify playlist with a similar vibe via `/generate`
- can optionally run on a schedule via `cron`
- minimizes repeats and tries to keep `>=80%` of tracks "new" (not liked and not previously recommended by the bot)
- is controlled via Telegram
- runs in Docker (`app`, optional `cron`)

## What's inside

- `FastAPI` backend (OAuth callback + internal job endpoint)
- `python-telegram-bot` (polling)
- `SQLite` (recommendation history, liked-track cache, run log)
- `supercronic` in a separate container for nightly cron trigger (optional)

## Important note about the Spotify API

Spotify endpoint `/recommendations` may be limited/unavailable for some apps.

The service includes fallbacks:
- Spotify recommendations (if available)
- top tracks by artists from your recent listening / liked library
- Spotify search by seed artists (fallback when recommendations/top-tracks are unavailable)
- optional Last.fm similarity (very helpful for better "vibe" quality)

For better recommendation quality, adding `LASTFM_API_KEY` is recommended.

## Quick Start

1. Create a Telegram bot via `@BotFather` and get a token.
2. Create a Spotify App: https://developer.spotify.com/dashboard
3. Add a Redirect URI in the Spotify App (must match exactly), for example:
   - `https://your-domain.com/auth/spotify/callback`
   - or for local development via tunnel: `https://xxxx.ngrok-free.app/auth/spotify/callback`
4. Copy `.env.example` to `.env` and fill in the values.
5. Start:

   If you are using the provided `docker-compose.yml` as-is, create the external Docker network once (used by Traefik labels/network wiring):

   ```bash
   docker network create web || true
   ```

   Then start:

   ```bash
   docker compose up -d --build
   ```

   By default this starts only `app` (manual mode via Telegram `/generate`).

   If you want nightly `cron`, start it separately:

   ```bash
   docker compose --profile cron up -d cron
   ```

6. Open Telegram and message the bot:
   - `/start`
   - `/connect` (get the Spotify auth link)
   - after connecting: `/generate`

## `.env` configuration

Minimum required fields:
- `TELEGRAM_BOT_TOKEN`
- `SPOTIFY_CLIENT_ID`
- `SPOTIFY_CLIENT_SECRET`
- `SPOTIFY_REDIRECT_URI`
- `INTERNAL_JOB_TOKEN`

Recommended:
- `LASTFM_API_KEY` (improves similarity quality)
- `APP_TIMEZONE` / `TZ`
- `SPOTIFY_DEFAULT_MARKET` (two-letter country code, e.g. `NL`, `DE`, `US`)
- `CRON_SCHEDULE` (e.g. `15 2 * * *`, only if you enable `cron`)

## Telegram commands

- `/connect` - connect Spotify
- `/status` - connection status and latest playlist run
- `/generate` - generate a playlist now
- `/latest` - latest playlist link
- `/setsize 30` - playlist size (5..100)
- `/setratio 0.8` - target new-track ratio (0.5..1.0)
- `/sync` - force sync liked tracks
- `/lang ru|en` - switch bot language

## Recommendation Algorithm

This is the actual playlist generation pipeline used by the current code.

### 1. Input preparation

Before generation, the bot:
- refreshes Spotify access token if needed
- syncs liked tracks from `Liked Songs` into the local cache (`saved_tracks`)
- loads recent listening for the `RECENT_DAYS_WINDOW` period (default `5` days)
- loads history of previously recommended tracks (`recommendation_history`)

### 2. Seed profile construction

The bot builds seeds from two sources: recent plays and liked library.
- Recent plays:
  - each track gets a recency-weighted score (newer plays matter more)
  - weights are accumulated for both tracks and artists
- Liked tracks:
  - takes a slice of recent likes (`~120`)
  - adds a random sample from older likes (for exploration/diversity)
  - accumulates artist weights from this pool as well

Seed profile output includes:
- `seed_track_ids` (up to ~10 tracks)
- `seed_artists` (up to ~20 artists)
- `seed_artist_names` (used by Last.fm and Spotify Search fallback)
- `recent_track_meta` (used for Last.fm track-similar lookups)

### 3. Candidate collection (candidate pool)

The bot builds a shared candidate pool from multiple sources and deduplicates results.

Sources (in order):
1. `Spotify recommendations`
   - requested in batches
   - respects Spotify limit: max `5` seeds per request (track + artist combined)
2. `Spotify artist top tracks`
   - by seed artists
3. `Spotify search` by seed artists (fallback)
   - used when recommendations / top-tracks are restricted or return too few results
4. `Last.fm track similar` -> `Spotify search`
   - for recent seed tracks
5. `Last.fm artist similar` -> `Spotify search`
   - for seed artists

If Spotify/Last.fm fails on individual calls, the bot tries to degrade gracefully (use other sources) instead of failing the whole run immediately.

### 4. Candidate deduplication

Candidates are deduplicated:
- by `spotify_track_id`
- by normalized signature `track_name + artist_names` (to catch duplicates / alternate versions)

If the same track is found via multiple sources:
- the best score is kept
- the source field is merged (e.g. `source1+source2`)

### 5. Filtering and ranking

Base logic:
- first, tracks already in your likes (`liked_ids`) are excluded
- if that leaves an empty pool, a fallback is enabled:
  - already-liked tracks may be used (with a penalty) so the run does not fail with an empty result

Additional score adjustments:
- penalty for tracks previously recommended by the bot (`history_ids`)
- penalty for liked tracks (only if liked fallback is active)
- small boost for collaborations / multiple artists
- small boost for tracks with multiple source/reason signals
- popularity scoring slightly favors mid-popularity tracks (not only mainstream and not only obscure tracks)

### 6. Final selection

After ranking, candidates are split into:
- `novel` - not previously recommended and not in likes
- `reused` - previously recommended or (fallback case) already liked

Then the bot:
- first tries to satisfy `min_new_ratio`
- enforces artist caps (limit tracks per artist)
- relaxes caps if there are not enough new tracks
- fills the remainder with reused candidates

Result includes:
- `tracks` - final ordered playlist tracks
- `new_count` / `reused_count`
- `notes` - explanation if the target new ratio could not be met

### 7. Playlist creation and history persistence

After the final track list is selected, the bot:
- creates a Spotify playlist
- adds tracks to it
- writes the run to `playlist_runs` and `playlist_run_tracks`
- updates `recommendation_history`
- stores `latest_playlist_url` for the user

## Anti-repeat behavior

The bot stores:
- all tracks it has recommended before
- all your liked tracks (cached and refreshed)

When building a new playlist:
- it first excludes liked tracks (when possible)
- prioritizes tracks that have not been recommended before
- fills with history repeats only if there are not enough new tracks
- may use a liked-track fallback instead of failing the run if all candidates are already liked
- stores `new / reused` stats in the DB

If there are not enough new tracks to satisfy the `80%` target, the run status includes a note explaining that.

## Cron (nightly run)

`cron` is disabled by default (manual-first mode: run `/generate` manually in Telegram).

In `docker-compose.yml`, the `cron` service is under profile `cron`, so it does not start with a normal:

```bash
docker compose up -d --build
```

To enable nightly runs:

```bash
docker compose --profile cron up -d cron
```

`cron` calls the internal endpoint on schedule:
- `POST /internal/jobs/nightly`

Change time via `.env`:

```env
CRON_SCHEDULE=15 2 * * *
TZ=Europe/Amsterdam
```

Disable again:

```bash
docker compose stop cron
```

## Data storage

- SQLite DB: `./data/app.db`

This folder is mounted as a Docker volume, so data persists across container restarts.
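
Because the database is a plain SQLite file, it can be inspected from the host while the service runs. The following is a minimal sketch, not part of this repository: `table_row_counts` is a hypothetical helper name, and the sketch assumes nothing about the schema beyond the file path above. It opens the file read-only and counts rows in whatever tables exist (for example `saved_tracks` or `recommendation_history`, if they are present).

```python
import sqlite3
from pathlib import Path


def table_row_counts(db_path: str) -> dict[str, int]:
    """Return {table_name: row_count} for every user table in the SQLite file."""
    # Open read-only so inspection cannot modify the DB or create a missing file.
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    conn = sqlite3.connect(uri, uri=True)
    try:
        # sqlite_master lists all schema objects; skip SQLite's internal tables.
        names = [
            row[0]
            for row in conn.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        # Names come from the schema itself, not user input; quote them anyway.
        return {
            name: conn.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        conn.close()
```

Calling `table_row_counts("./data/app.db")` from the project root should give a quick view of how much history and liked-track cache has accumulated, without relying on any column names.
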
## Health check / verification

- `GET /health` should return `{"ok": true}`
- after `/generate`, Telegram should send a Spotify playlist link

## Typical deployment

- VPS + Docker Compose
- `APP_BASE_URL` = public service URL
- `SPOTIFY_REDIRECT_URI` = `${APP_BASE_URL}/auth/spotify/callback`
- Telegram runs via polling (no webhook required)
- `cron` can remain disabled if you only want manual generation

## Architecture

Detailed architecture, data flow, and DB table docs are in `DESIGN.md`.

## Feature Plans

Roadmap items that fit the current architecture well:
- Explicit feedback loop:
  - commands like `/ban`, `/unban`, `/prefer`
  - separate blacklist table so "didn't like it" != "just didn't save it"
- Anti-repeat controls:
  - hard no-repeat window (N days/weeks)
  - separate rules for liked / previously recommended tracks
- Explainability / debug:
  - why-this-track (source, score, reasons)
  - dry-run endpoint/command without creating a playlist
- Fine-tuning the algorithm:
  - source weights (Spotify / Last.fm / search fallback)
  - generation modes (explore / familiar / mixed)
- Better candidate sources:
  - additional music metadata sources
  - smarter genre/artist clustering
- Personal scheduler:
  - per-user timezone and per-user cron schedule
  - weekday / time selection
- Observability:
  - structured logs for source coverage and filtering reasons
  - basic metrics for Spotify/Last.fm errors and latency
- Storage / scaling:
  - migrations (Alembic)
  - Postgres instead of SQLite for multi-user usage

## Limitations / future improvements

- Per-user timezone support is only partially used today (cron is global, though manual per-user generation is supported)
- More candidate sources could improve quality (e.g. MusicBrainz/Discogs mapping)
- Postgres would be better than SQLite for higher multi-user load
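
The `GET /health` check described under "Health check / verification" can also be scripted, for example after a deploy. This is a minimal sketch using only the Python standard library: `health_ok` and `check_health` are hypothetical helper names, and the base URL is whatever you set as `APP_BASE_URL`.

```python
import json
import urllib.request


def health_ok(body: bytes) -> bool:
    """True if the body is a JSON object whose "ok" field is true."""
    try:
        payload = json.loads(body)
    except ValueError:
        # Not valid JSON (or not valid UTF-8).
        return False
    return isinstance(payload, dict) and payload.get("ok") is True


def check_health(base_url: str, timeout: float = 5.0) -> bool:
    """GET {base_url}/health and report whether the service says it is healthy."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200 and health_ok(resp.read())
    except OSError:
        # URLError and HTTPError subclass OSError, as do socket timeouts.
        return False
```

For example, `check_health("https://your-domain.com")` would return `True` only when the endpoint answers with status 200 and `{"ok": true}`; the domain here is the same placeholder used in Quick Start.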