April 19, 2003
results w/ new musicseer data
The table is really big and ugly, so it lives in the extended entry.
April 11, 2003
disk full; expediency
Rob filled up blush2, and I just got 3 albums from Brian for nelly furtado, simple minds, & liz phair, so I have to put them somewhere else for the time being. I'll leave them in my home directory, redirect DATAROOT there, and hand-create a .list file. When space frees up, I should move them to data/music and run filldb on them.
April 09, 2003
New Artist Set
Ta-da, presenting the new artist set, christened aset400.3. It has 400 artists, chosen to be in the intersection of all the following sets:
It's basically the old topset I had at NEC and used for the ICME paper, but I dropped 5 artists (Eros Ramazzotti, Anouk, Laibach, Rockapella, and SR-71) because there wasn't enough of something (AOTM or all-responses), and added 5 more (Liz Phair, Simple Minds, PJ Harvey, Nelly Furtado, and The Verve) that fit the bill. Sorry Dan, Tori is forever banished because she's not in the opennap/erdos set.
Beth, the htk cepstra files for all of these except the following 3 are in the htk directory: furtado_nelly phair_liz simple_minds
Brian, when you come up for air, can you send us audio for these guys? I have some liz phair but I think you probably have more. Also I'd still like to fill in the gaps with the missing artists I sent last time; I have enough songs to get by, but I'd like to have the full albums if you got em. And good luck tomorrow!!! Dan, be gentle.
April 03, 2003
another (minor) disaster
In trying to evaluate the old non-audio SIMs (opennap, erdos, etc) against the AOTM data for the ADVENT talk, I found that I had an ID mapping problem. The audio SIMs use topset-400, but the mapping I had from those artists to the old opennap artist IDs was wrong for 67 of the artists (because at some point last summer I merged database entries for these artists, and the wrong ID got kept).
But why are the results for the old stuff (erdos, opennap, n2) different? Esp. n2 is now worse than random!!?
I figured out the first part: the numbers for erdos etc changed when I replaced the buggy topset-400 id mapping because of another bug in the scoring code. When the code saw a response that it didn't have a sim value for, it was returning from the function Response() instead of "next'ing" to the next SIM-type for the same user judgment. So if erdos came after, say, one of the ank14 things, and that one was missing a value, the erdos judgment would get ignored too, and a bunch of judgments were lost that way. Now that it's fixed, the numbers look like the numbers from the old "Quest for ground truth" paper, which is good.
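The return-vs-next bug is easy to show in miniature. The actual scoring code isn't reproduced here, so this is only a hypothetical Python rendering: the function names, the dict layout, and the toy sim values are all made up for illustration.

```python
def score_judgment_buggy(response, sims):
    """List the SIM types that could be scored for one response (buggy)."""
    scored = []
    for sim_name, sim_values in sims.items():
        if response not in sim_values:
            return scored      # BUG: bails out, so every later SIM type is skipped
        scored.append(sim_name)
    return scored


def score_judgment_fixed(response, sims):
    """Same loop, but a missing value only skips that one SIM type."""
    scored = []
    for sim_name, sim_values in sims.items():
        if response not in sim_values:
            continue           # the "next" that should have been there
        scored.append(sim_name)
    return scored


# Hypothetical data: the first SIM type has no value for this response.
sims = {
    "ank14": {},
    "erdos": {"liz phair": 0.4},
    "opennap": {"liz phair": 0.7},
}
buggy = score_judgment_buggy("liz phair", sims)
fixed = score_judgment_fixed("liz phair", sims)
print(buggy)  # []
print(fixed)  # ['erdos', 'opennap']
```

With the early return, whichever SIM types happen to come later in the loop silently lose judgments, which is why their numbers shifted once the mapping changed.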
But what about n2? Looking more closely at the SIM file, I'm not sure that it's wrong after all. It looks horrible; maybe the question is, how was it getting decent scores before?! Under AOTM eval, it also does extremely badly: .12 while the other ones do about .36 (random is .07). Maybe I have the wrong SIM file somehow? I'll get Beth to run it and see what she comes up with...
I fixed the ICME paper and resubmitted it.
April 02, 2003
bug fixes.
Of course everything was wrong. In the AOTM eval I found 2 bugs, and some other issues.
Bug 1: I wasn't treating distance matrices differently from similarity matrices.
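For the record, the distinction in Bug 1 comes down to sort direction when turning a matrix row into a ranking. A minimal sketch, assuming NumPy and a hypothetical `is_distance` flag (neither is from the real eval code):

```python
import numpy as np


def rank_neighbors(row, is_distance):
    """Return artist indices ordered from most to least similar.

    For a distance row, small values mean "close", so sort ascending.
    For a similarity row, large values mean "close", so sort descending.
    """
    order = np.argsort(row, kind="stable")  # ascending by value
    return order if is_distance else order[::-1]


# Toy rows (made up): the same three artists expressed both ways.
sim_row = np.array([0.9, 0.1, 0.5])
dist_row = np.array([0.1, 0.9, 0.5])

print(rank_neighbors(sim_row, is_distance=False).tolist())   # [0, 2, 1]
print(rank_neighbors(dist_row, is_distance=True).tolist())   # [0, 2, 1]
```

Treating a distance row as a similarity row flips the ranking end for end, which would put the least similar artists at the top.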
Now fixed, the results make sense. I threw out the ranking agreement eval and went with the Information Retrieval style eval: treat the top N most probable co-occurring artists in AOTM playlists as ground truth similar hits, and treat the ranked SIM row as "retrieval results". Each hit gets a score that decays exponentially with its rank (halflife 20), and the row's score is the mean over the N hits. So for N=10, optimal is .8598. This gives me a score for each row, and then I take the mean for an overall score.
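Here is a small sketch of that row score. The exact formula isn't written out above; scoring a hit at position i (counting from 0) as 0.5^(i/halflife) is an assumption, chosen because it reproduces the .8598 optimum for N=10. The name `ir_score` and the choice to give unranked hits a score of zero are also assumptions.

```python
def ir_score(ranked, hits, halflife=20):
    """Mean exponentially decaying rank score over the ground-truth hits.

    ranked: artists ordered from most to least similar (one SIM row).
    hits:   the top-N co-occurring artists treated as ground truth.
    The first position scores 1.0 and the score halves every
    `halflife` positions (assumed form, see lead-in).
    """
    position = {artist: i for i, artist in enumerate(ranked)}
    # Assumption: a hit missing from the ranking contributes 0.
    scores = [0.5 ** (position[h] / halflife) for h in hits if h in position]
    return sum(scores) / len(hits)


# Best case: the 10 hits occupy the first 10 positions of a 400-artist row.
ranked = list(range(400))
optimal = ir_score(ranked, hits=range(10))
print(round(optimal, 4))  # 0.8598

# Worst case for comparison: the same hits sit at the very bottom.
worst = ir_score(ranked[::-1], hits=range(10))
```

The overall score is then just the mean of `ir_score` across all rows.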
Using N=10, random permutations score like this histogram. It's not Gaussian (because it bumps up against zero on the left), but I treat it like a Gaussian and do the same confidence test that I tried before. (If I make the cutoff larger, e.g. 20, it starts to look Gaussian, and as cutoff gets bigger the variance shrinks.)
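A rough sketch of the random-permutation baseline and the Gaussian-style comparison. The confidence test itself isn't spelled out here, so a plain z-score against the permutation distribution is an assumption, as are the function names, the trial count, and the observed score of 0.5 (a hypothetical number, not a result).

```python
import random
import statistics


def ir_score(ranked, hits, halflife=20):
    """Assumed row score: mean of 0.5 ** (position / halflife) over the hits."""
    position = {artist: i for i, artist in enumerate(ranked)}
    return sum(0.5 ** (position[h] / halflife) for h in hits) / len(hits)


def random_baseline(n_artists=400, n_hits=10, trials=2000, seed=0):
    """Score many random permutations; return the mean and standard deviation."""
    rng = random.Random(seed)
    hits = list(range(n_hits))
    artists = list(range(n_artists))
    scores = []
    for _ in range(trials):
        rng.shuffle(artists)               # a random ranking of all artists
        scores.append(ir_score(artists, hits))
    return statistics.mean(scores), statistics.stdev(scores)


def z_score(observed, mean, std):
    """How many standard deviations the observed score sits above random."""
    return (observed - mean) / std


mean, std = random_baseline()
z = z_score(0.5, mean, std)   # 0.5 is a hypothetical observed score
print(mean, std, z)
```

Since the random scores pile up near zero, the Gaussian approximation is crude for small N, which is the caveat noted above.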
So here are the results using the AOTM IR-type evaluation with N=10 and halflife=20, and normalizing by the prior:
Related to a suggestion of Dan's, I wanted to see if the score was correlated with the prior probability of the artist (i.e., popularity). So I made some plots, and it looks pretty much uncorrelated.