These tutorials on the Million Song Dataset should help you get started.
First, here are some longer tutorials (with code and a PDF version) that take you step by step through some simple tasks, like checking the artist names in the dataset.
|tutorial 1||Python||simple exploration of the subset data||[pdf] [code]|
|tutorial 2||MATLAB||simple MATLAB exploration of the subset data|
|tutorial 3||Python||use of the SQLite track_metadata.db database||[pdf] [code]|
|tutorial 4||Python||use of the SQLite artist_term.db database||[pdf] [code]|
|tutorial 5||Python||use of the SQLite artist_similarity.db database||[pdf] [code]|
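Before working through the SQLite tutorials, it can help to look at what a database file actually contains. The following is a minimal sketch using only Python's standard `sqlite3` module; it makes no assumption about the tables inside `track_metadata.db`, `artist_term.db`, or `artist_similarity.db`, and the helper name `list_tables` is hypothetical (it is not part of the dataset's code).

```python
import sqlite3


def list_tables(db_path):
    """Return {table_name: [column names]} for every table in a SQLite file.

    Hypothetical helper for illustration; not part of the dataset's own code.
    """
    conn = sqlite3.connect(db_path)
    try:
        # sqlite_master is SQLite's built-in catalog of schema objects.
        cur = conn.execute(
            "SELECT name FROM sqlite_master WHERE type='table' ORDER BY name"
        )
        tables = [row[0] for row in cur.fetchall()]
        schema = {}
        for name in tables:
            # PRAGMA table_info returns one row per column;
            # the column name is the second field of each row.
            cols = conn.execute('PRAGMA table_info("{}")'.format(name)).fetchall()
            schema[name] = [col[1] for col in cols]
        return schema
    finally:
        conn.close()
```

Calling `list_tables` with the path to your local copy of one of the database files should show the table and column names to use in your own queries.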
The topic-specific tutorials below cover the following:
- Basic getter functions
- Iterate over all songs
- SQLite interfaces for Python and MATLAB
- Find a song with a specific name or feature
- Find all songs from a list of artists
- Get all artists and their tags
- Get beat-aligned chromas
- Fast k-NN using HDF5
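As a rough illustration of the "Iterate over all songs" topic, the walk can be sketched with the standard library alone. This assumes the dataset is stored as one file per song with a `.h5` extension somewhere under a root directory; the function names `iter_song_files` and `count_song_files` are hypothetical, and the tutorial's own code may differ.

```python
import os


def iter_song_files(root, ext=".h5"):
    """Yield the path of every file under `root` whose name ends with `ext`.

    Assumption: each song is stored as its own file with this extension.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # sort in place so subdirectories are visited in a stable order
        for fname in sorted(filenames):
            if fname.endswith(ext):
                yield os.path.join(dirpath, fname)


def count_song_files(root, ext=".h5"):
    """Count matching files without keeping all paths in memory."""
    return sum(1 for _ in iter_song_files(root, ext))
```

Because `iter_song_files` is a generator, you can apply any per-song function inside a `for` loop over it without building the full file list first.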
You can leave comments on the tutorial pages, but for security reasons you must first be registered as a user on this site. You can use OpenID.
Note for MATLAB users who wish to move to Python but don't want to lose all their code: look at the excellent mlabwrap, which lets you call MATLAB from Python. Another (less tested) tool is ompc, which lets you run m-files using the Python interpreter. Both can help you make the transition smoothly.