
AOTM & audio

I matched the regularized Art of the Mix (AOTM) lists against the 414-artist
playola DB to see what kind of overlap we have. The results:

16% of the songs are by playola artists
7% of the songs are in our DB
35% of the lists have two or more songs in our DB.
346/417 "new playola" artists are represented.
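
For reference, here is a rough sketch of how overlap numbers like these could be computed. It is not the script that produced the figures above. The function name, the (artist, title) tuple representation, and the choice to count every song occurrence in every list (rather than unique songs) are all assumptions made for illustration.

```python
def overlap_stats(playlists, db_artists, db_songs):
    """Summarize overlap between playlists and a song database.

    playlists  : list of playlists, each a list of (artist, title) tuples
    db_artists : set of artist names in the DB (hypothetical representation)
    db_songs   : set of (artist, title) tuples in the DB (hypothetical)

    Assumption: fractions are taken over song occurrences, so a song that
    appears in three lists is counted three times.
    """
    songs = [song for playlist in playlists for song in playlist]
    if not songs:
        raise ValueError("no songs to analyze")

    by_db_artist = sum(1 for artist, _ in songs if artist in db_artists)
    in_db = sum(1 for song in songs if song in db_songs)

    # Lists that contain at least two songs present in the DB.
    lists_2plus = sum(
        1 for playlist in playlists
        if sum(1 for song in playlist if song in db_songs) >= 2
    )

    # DB artists that show up in at least one list.
    artists_seen = {artist for artist, _ in songs if artist in db_artists}

    return {
        "frac_songs_by_db_artists": by_db_artist / len(songs),
        "frac_songs_in_db": in_db / len(songs),
        "frac_lists_with_2plus_db_songs": lists_2plus / len(playlists),
        "db_artists_represented": len(artists_seen),
    }
```

The third number is the one that matters most for what follows, since only lists with two or more songs in the DB yield co-occurring pairs that the audio metric can be checked against.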

This is good news! I think with these numbers, we have enough data to
explore the relationship between the audio-based sim metric and the AOTM
lists. Here's what I'm planning on doing:

Let's assume that songs that co-occur in a playlist are similar, i.e., the
probability of co-occurrence is some function of similarity. So I'd like to
see a plot of the (empirical) conditional probability of co-occurrence
against distance under the similarity metric. I'm hoping it looks like an
exponential density: the probability of seeing something very similar is
high, and it quickly falls off as dissimilarity (distance) increases. The
question is how to use this as a quantitative measure of how good the
similarity metric is. Perhaps we fit an exponential to the plot and look at
the rate of decay: a faster rate means that co-occurrence probability falls
off faster with distance, so the similarity metric is better.
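
A minimal sketch of that idea, under some assumptions of mine: each song pair comes with a distance from the audio metric and a 0/1 flag for whether the pair shares at least one playlist. The function name, the equal-width binning, and the log-linear least-squares fit are illustrative choices, not a settled method. The model being fit is p(d) = a * exp(-rate * d).

```python
import numpy as np


def cooccurrence_decay(distances, cooccur, n_bins=10):
    """Estimate how fast co-occurrence probability decays with distance.

    distances : 1-D array of pairwise distances between songs
    cooccur   : 1-D 0/1 array, 1 if the pair shares at least one playlist
    n_bins    : number of equal-width distance bins (an arbitrary choice)

    Returns (bin_centers, empirical_prob, rate), where rate is the decay
    constant of the fitted exponential.
    """
    distances = np.asarray(distances, dtype=float)
    cooccur = np.asarray(cooccur, dtype=float)

    edges = np.linspace(distances.min(), distances.max(), n_bins + 1)
    # Map each pair to a bin; clip so the maximum lands in the last bin.
    idx = np.clip(np.digitize(distances, edges) - 1, 0, n_bins - 1)

    counts = np.bincount(idx, minlength=n_bins)
    hits = np.bincount(idx, weights=cooccur, minlength=n_bins)
    centers = 0.5 * (edges[:-1] + edges[1:])

    # Drop empty bins and bins with no co-occurrences (log(0) is undefined).
    valid = (counts > 0) & (hits > 0)
    if valid.sum() < 2:
        raise ValueError("need at least two usable bins to fit a decay rate")
    prob = hits[valid] / counts[valid]

    # Fit log p = log a - rate * d by ordinary least squares.
    slope, _intercept = np.polyfit(centers[valid], np.log(prob), 1)
    return centers[valid], prob, -slope
```

One caveat: the rate depends on the scale of the distance, so comparing two metrics this way only makes sense after putting their distances on a common scale (e.g., ranks or quantiles). Also, the unweighted log fit gives sparse, noisy bins as much say as well-populated ones.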

anyway, it's something to try.
