April 19, 2003

results w/ new musicseer data

the table is really big and ugly, so click on the extended entry to see it: Continue reading "results w/ new musicseer data"
Posted by madadam at 01:47 PM | Comments (0)

April 11, 2003

disk full; expediency

Rob filled up blush2, and I just got 3 albums from Brian for Nelly Furtado, Simple Minds, & Liz Phair, so I have to put them somewhere else for the time being.

I'll leave them in my home directory, redirect DATAROOT there, and hand-create a .list file. When space frees up, I should put them in data/music and run filldb on them.
Posted by madadam at 06:55 PM | Comments (1)

April 09, 2003

New Artist Set

Ta-da, presenting the new artist set, christened aset400.3. It has 400 artists, chosen to be in the intersection of all the following sets:
  • Musicseer survey data (all-responses, in particular)
  • Opennap, erdos, n2 SIM matrices
  • AOTM lists (enough to be useful)
  • Audio we have (or can get soon)

It's basically the old topset I had at NEC and used for the ICME paper, but I dropped 5 artists (Eros Ramazzotti, Anouk, Laibach, Rockapella, and SR-71) because there wasn't enough of something (AOTM or all-responses), and added 5 more (Liz Phair, Simple Minds, PJ Harvey, Nelly Furtado, and The Verve) that fit the bill. Sorry Dan, Tori is forever banished because she's not in the opennap/erdos set.

Beth, the htk cepstra files for all of these except the following 3 are in the htk directory: furtado_nelly phair_liz simple_minds

Brian, when you come up for air, can you send us audio for these guys? I have some liz phair but I think you probly have more. Also I'd still like to fill in the gaps with the missing artists I sent last time; I have enough songs to get by, but I'd like to have the full albums if you got em. and good luck tomorrow!!! Dan, be gentle.

Posted by madadam at 11:02 AM | Comments (0)

April 03, 2003

another (minor) disaster

In trying to evaluate the old non-audio SIMs (opennap, erdos, etc.) against the AOTM data for the ADVENT talk, I found that I had an ID mapping problem. The audio SIMs use topset-400, but the mapping I had from those artists to the old opennap artist IDs was wrong for 67 of the artists (because at some point last summer I merged database entries for these artists, and the wrong ID got kept).

But why are the results for the old stuff (erdos, opennap, n2) different? n2 especially is now worse than random!!?

I figured out the first part: the numbers for erdos etc changed when I replaced the buggy topset-400 id mapping because of another bug in the scoring code. When the code saw a response that it didn't have a sim value for, it was returning from the function Response() instead of "next'ing" to the next SIM-type for the same user judgment. So if erdos was after one of the ank14 things, e.g., then a bunch of judgments would get ignored. Now that it's fixed, the numbers look like the numbers from the old "Quest for ground truth" paper, which is good.
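The return-versus-next bug is easy to reproduce in a toy scorer. This is a sketch in Python, not the original scoring code; the function name, the dictionary layout, and the SIM values are all made up for illustration. It only shows how returning on a missing value silently drops every SIM type that comes later in the list for that judgment.

```python
def score_judgment(pair, sims, sim_order, buggy=False):
    """Return the names of the SIM types that get scored for one judgment.

    Hypothetical helper: `sims` maps a SIM-type name to a dict of
    {artist_pair: similarity value}.
    """
    scored = []
    for name in sim_order:
        if pair not in sims[name]:
            if buggy:
                return scored  # bug: gives up on all remaining SIM types
            continue           # fix: skip only this SIM type
        scored.append(name)
    return scored


# Made-up data: the first SIM type has no value for this pair.
sims = {
    "ank14": {},
    "erdos": {("a", "b"): 0.4},
    "opennap": {("a", "b"): 0.7},
}
order = ["ank14", "erdos", "opennap"]

buggy_result = score_judgment(("a", "b"), sims, order, buggy=True)
fixed_result = score_judgment(("a", "b"), sims, order)
```

With the buggy branch, erdos and opennap never see the judgment at all, which is the kind of silent undercounting described above.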

But what about n2? Looking more closely at the SIM file, I'm not sure that it's wrong after all. It looks horrible; maybe the question is, how was it getting decent scores before?! Under AOTM eval, it also does extremely badly: .12 while the other ones do about .36 (random is .07). Maybe I have the wrong SIM file somehow? I'll get Beth to run it and see what she comes up with...

I fixed the ICME paper and resubmitted it.

Posted by madadam at 10:10 PM | Comments (0)

April 02, 2003

bug fixes.

Of course everything was wrong. In the AOTM eval I found 2 bugs, and some other issues.

Bug 1: I wasn't treating distance matrices differently from similarity matrices.
Bug 2: I didn't understand what I was doing with some index sorting business (cdix) in the ranking eval, so I was getting basically random results.
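Bug 1 comes down to sort direction. Here is a minimal sketch, assuming a matrix row is just a list of numbers and that the caller knows whether it holds distances or similarities; `rank_row` and the example rows are hypothetical, not the actual eval code.

```python
def rank_row(row, is_distance):
    """Return column indices ordered from most similar to least similar."""
    # Distances: small means close, so sort ascending.
    # Similarities: large means close, so sort descending.
    return sorted(range(len(row)), key=lambda j: row[j], reverse=not is_distance)


sim_row = [0.9, 0.1, 0.5]   # made-up similarity values
dist_row = [0.1, 0.9, 0.5]  # the same ordering expressed as distances

sim_ranking = rank_row(sim_row, is_distance=False)
dist_ranking = rank_row(dist_row, is_distance=True)
wrong_ranking = rank_row(dist_row, is_distance=False)  # bug 1: treated as similarity
```

Treating a distance row as a similarity row reverses the ranking, so the best matches end up last.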

Now fixed, the results make sense. I threw out the ranking agreement eval and went with the Information Retrieval style eval: treat the top N most probable co-occurring artists in AOTM playlists as ground truth similar hits, and treat the ranked SIM row as "retrieval results". Each hit gets a score that decreases exponentially with its rank (halflife 20), and the row's score is the mean over its hits. So for N=10, optimal is .8598. This gives me a score for each row, and then I take the mean for an overall score.
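For concreteness, here is a sketch of that score in Python. The exact decay formula is an assumption: a hit at 1-based rank r scores 2^(-(r-1)/halflife). That choice reproduces the .8598 optimum for N=10 quoted above. The function names are hypothetical, and the code assumes every ground-truth artist appears somewhere in the ranked row.

```python
def hit_score(rank, halflife=20):
    """Score for one hit at a 1-based rank; halves every `halflife` ranks."""
    return 2.0 ** (-(rank - 1) / halflife)


def row_score(ranked, truth, halflife=20):
    """Mean hit score for one SIM row.

    ranked: artists ordered from most to least similar
    truth:  the top-N ground-truth artists for this row
    """
    position = {artist: i + 1 for i, artist in enumerate(ranked)}
    return sum(hit_score(position[a], halflife) for a in truth) / len(truth)


def overall_score(rows, halflife=20):
    """Mean of the row scores; rows is a list of (ranked, truth) pairs."""
    scores = [row_score(ranked, truth, halflife) for ranked, truth in rows]
    return sum(scores) / len(scores)


# Best case for N=10: the ten true hits occupy ranks 1 through 10.
ranked = list(range(400))
optimal = row_score(ranked, truth=ranked[:10])
```

A hit at rank 21 scores exactly half of a hit at rank 1, which is what "halflife 20" means under this assumption.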

Using N=10, random permutations score like this histogram. It's not Gaussian (because it bumps up against zero on the left), but I treat it like a Gaussian and do the same confidence test that I tried before. (If I make the cutoff larger, e.g. 20, it starts to look Gaussian, and as cutoff gets bigger the variance shrinks.)
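The random baseline and the Gaussian cutoff can be sketched as below. Several things here are assumptions rather than the actual procedure: the per-hit score is 2^(-(r-1)/halflife), a random permutation is modeled by drawing the N hit ranks uniformly without replacement from 400 positions, the test is one-sided, and no normalization by the prior is applied. All names are hypothetical.

```python
import random
from statistics import NormalDist, mean, stdev


def random_row_score(n_artists, n_hits, halflife, rng):
    """Row score when the hits land at uniformly random ranks."""
    ranks = rng.sample(range(1, n_artists + 1), n_hits)
    return sum(2.0 ** (-(r - 1) / halflife) for r in ranks) / n_hits


def gaussian_threshold(scores, level):
    """One-sided cutoff, pretending the null scores are Gaussian."""
    return mean(scores) + NormalDist().inv_cdf(level) * stdev(scores)


rng = random.Random(0)  # fixed seed so the run is repeatable
null_scores = [random_row_score(400, 10, 20, rng) for _ in range(2000)]

t95 = gaussian_threshold(null_scores, 0.95)
t995 = gaussian_threshold(null_scores, 0.995)
```

Raising `n_hits` averages over more hits per row, which is one way to see the variance shrink as the cutoff grows.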

So here are the results using the AOTM IR-type evaluation with N=10 and halflife=20, and normalizing by the prior:

The 95% significance level is .1644 and the 99.5% significance level is about .2157, the lowest score we see. So that's comforting. Of course, being significantly better than random is the least we should expect; in fact, we want a much more demanding evaluation. But at least we're sane now.

Related to a suggestion of Dan's, I wanted to see if the score was correlated with the prior probability of the artist (i.e., popularity). So I made some plots, and it looks pretty much uncorrelated.

Posted by madadam at 11:33 PM | Comments (0)