Please leave us feedback
I've only been working with the subset for the moment, trying to get my head around how I can make use of this data, and what I'd really like to do is compare the metadata (e.g. location, artist name etc.) with analysis information (e.g. density, beats, segments etc.).
I have no idea how to achieve this since, as far as I can tell, these different kinds of information are contained in different groups within the HDF5 summary file, and the only open-source programme I can get to read HDF5 files and allow me to filter them (ViTables in Ubuntu) won't allow me to filter across different groups.
I've been looking for ways to combine the data into a single summary group, but so far no luck. Has anyone out there already done this successfully?
I'm a real newbie at this data mining game, but I hope this can lead me onto some really interesting research opportunities, and give me something to write a thesis on.
Thanks in advance for any and all assistance.
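One possible approach to the question above, as a minimal sketch rather than a tested recipe: read the two groups with h5py and join them row by row in Python. It assumes the summary file stores its tables at "/metadata/songs" and "/analysis/songs" and that row i in one table describes the same track as row i in the other; please check both assumptions against your own file. The helper names and the field names in the usage note are illustrative only.

```python
def merge_aligned_tables(metadata, analysis, meta_fields, analysis_fields):
    """Join two row-aligned tables into a list of flat dicts.

    Rows can be anything indexable by field name, e.g. rows of a
    NumPy structured array or plain dicts.
    """
    if len(metadata) != len(analysis):
        raise ValueError("tables have different lengths, so they are not row-aligned")
    merged = []
    for meta_row, analysis_row in zip(metadata, analysis):
        row = {name: meta_row[name] for name in meta_fields}
        row.update({name: analysis_row[name] for name in analysis_fields})
        merged.append(row)
    return merged


def load_summary_tables(path):
    """Read both tables into memory (needs the third-party h5py package).

    The group paths below are an assumption about the summary file layout.
    """
    import h5py

    with h5py.File(path, "r") as f:
        metadata = f["/metadata/songs"][:]
        analysis = f["/analysis/songs"][:]
    return metadata, analysis
```

Usage would look something like `metadata, analysis = load_summary_tables("summary.h5")` followed by `merge_aligned_tables(metadata, analysis, ["artist_name"], ["tempo"])`, where the file name and field names are placeholders. The resulting list of dicts can then be filtered on metadata and analysis fields together.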
I downloaded the whole dataset through infochimps and compared the checksums I got with md5sum (on Ubuntu 10.10) with the checksums given by you. None of the checksums matched. For example I got:
And your checksum is:
2011-01-28 18:55 8251685808 a4ebc00350644bf21bc065c782dd0e0d-16
It might of course be that there was something wrong with all of my files. Am I doing something wrong when comparing checksums?
Hi Rasmus, you're probably alright; you're the second person to tell us that the checksums seem wrong. We don't know how infochimps keeps the data; they might have moved / uncompressed / ... the files since they gave us the checksums. Soon, the checksums will be directly included in their API.
In the meantime, the best way to check your download is to run this Python code.
It simply opens every file and reads all the fields it expects to find. It can take a few hours, but if it does not crash, you're fine.
We'll try to solve this MD5 issue once and for all.
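For anyone comparing checksums by hand, here is a small sketch, not the script mentioned above. The first function computes a plain MD5 the way md5sum does. The second is speculative: the "-16" suffix on the published value resembles the checksum format Amazon S3 reports for files uploaded in several parts (the MD5 of the concatenated per-part MD5 digests, followed by the number of parts). Whether the published values were produced that way, and with what part size, is not confirmed anywhere in this thread, so `part_size` is a guess you would have to experiment with.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Plain MD5 of a file, read in chunks so large files fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def multipart_style_checksum(path, part_size):
    """MD5 of the concatenated per-part MD5 digests, plus '-<number of parts>'.

    part_size (in bytes) is an assumption; each part is read fully into
    memory, so very large part sizes need a matching amount of RAM.
    """
    part_digests = []
    with open(path, "rb") as f:
        for part in iter(lambda: f.read(part_size), b""):
            part_digests.append(hashlib.md5(part).digest())
    combined = hashlib.md5(b"".join(part_digests)).hexdigest()
    return f"{combined}-{len(part_digests)}"
```

If the plain MD5 never matches but the multipart-style value does for some part size, the files are probably fine and only the checksum format differs.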
I'd like to download the list of unique terms (Echo Nest Tags) for Automatic Tagging task (http://labrosa.ee.columbia.edu/projects/millionsong/files/unique_terms.txt) but the link keeps sending me to the home page. Is the file missing?
Sorry for that, here is the list:
I'll try to find the wrong link you had. If you can point me to the page where you got it, it would be even easier.
Is the following possible with the dataset?
- user uploads a short sample (10 sec) of an unknown song/music
- the server does some magic with the dataset
- the server will output: artist, song name, etc.?
No, you're looking for a fingerprinter, for instance Shazam: http://www.shazam.com/
The Echo Nest is building one too, but I don't believe it is available at the moment.
Which is your favorite track, of the million?
And how would you measure it?
April 25, 2012
The MSD Challenge has launched!
October 20, 2011
We release the Last.fm dataset of tags and similarity!
April 12, 2011
We release the musiXmatch dataset of lyrics!
March 15, 2011
We release the SecondHandSongs dataset of cover songs!
February 8, 2011
We release the dataset!
(and get Dan to blog)