"why is my model so suspiciously good at predicting the exact opposite of what I asked it to predict?" says idiot who is about to discover that he set all his labels backwards
Anna's Archive backed up Spotify. They got 99.9% of metadata, and 300TB of music representing 86 million tracks - original 160kbps OGG for tracks with popularity>0, and re-encoded 75kbps for popularity=0. absolutely wild project.
the metadata in particular is a hugely useful data source. MusicBrainz catalogues 5 million unique ISRCs (like ISBNs but for music releases), whereas this archive has a whopping 186 million.