A "song file" refers to the typical HDF5 file containing information for only one song.
An "aggregate file" is also an HDF5 file, but it contains the information for several songs. Aggregate files are useful if you run I/O-intensive experiments, since they reduce the number of file open/close operations you need to perform.
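As a rough sketch of how an aggregate file might be read, the loop below opens the file once and walks over every song by index. It assumes the getter functions from the dataset's hdf5_getters.py (open_h5_file_read, get_num_songs, get_artist_name, get_title, each accepting a songidx argument). The names iter_songs and print_songs and the file path are made up for illustration.

```python
def iter_songs(h5, getters):
    """Yield (artist_name, title) for every song in an open HDF5 file.

    `getters` is any module or object exposing get_num_songs,
    get_artist_name and get_title (assumed to match hdf5_getters.py).
    """
    # A song file holds 1 song; an aggregate file holds several.
    num_songs = getters.get_num_songs(h5)
    for songidx in range(num_songs):
        yield (getters.get_artist_name(h5, songidx=songidx),
               getters.get_title(h5, songidx=songidx))


def print_songs(path):
    """Open `path` once, print all of its songs, then close it."""
    # Assumption: hdf5_getters.py is on the Python path (it needs PyTables).
    import hdf5_getters
    h5 = hdf5_getters.open_h5_file_read(path)
    try:
        for artist, title in iter_songs(h5, hdf5_getters):
            print(artist, "-", title)
    finally:
        h5.close()


# Example call (hypothetical path):
# print_songs("my_aggregate_file.h5")
```

The same loop works on a single song file, where it simply runs once; the saving with an aggregate file comes from paying the open/close cost once for many songs.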
A "summary file" is similar to an aggregate file, but contains just the metadata, i.e. we remove all the tables (analysis of bars, beats, segments, ..., artist similarity, tags). This is useful if you want to search the metadata quickly, since a lot of space is saved! See the scripts create_summary_file.py and create_aggregate_file.py. The summary file of the whole dataset is available (only 300 MB!): msd_summary_file.h5.
Note on summary files: if you're using the script display_song.py, you need the '-summary' flag to tell it that some getters won't find their field, e.g. bars_start.
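If you call getters on a summary file from your own code, the missing fields need the same care. The sketch below is not how display_song.py handles it; safe_get is a hypothetical wrapper that calls a getter and returns a default when the field is absent. Which exception a getter raises for a missing field is an assumption here, so the exception types are left to the caller.

```python
def safe_get(getter, h5, songidx=0, default=None, missing_errors=(Exception,)):
    """Call `getter(h5, songidx=...)`; return `default` if the field is missing.

    `missing_errors` lists the exception types treated as "field not found".
    The broad default is a placeholder: narrow it once you know what your
    HDF5 library raises for a missing node.
    """
    try:
        return getter(h5, songidx=songidx)
    except missing_errors:
        return default


# Example call, assuming hdf5_getters.py is importable and `h5` is an
# open summary file:
# bars = safe_get(hdf5_getters.get_bars_start, h5, songidx=0)
# if bars is None:
#     print("bars_start is not stored in this file")
```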
The dataset you received should contain one million song files. You can create aggregate and/or summary files using the Python scripts create_aggregate_file.py and create_summary_file.py.