Author |
Message |
stiltr01
Joined: 06 Jun 2007 Posts: 8
|
Posted: Sun Jan 06, 2008 7:03 pm Post subject: Statistical Analysis |
|
|
Now that we have a substantial number of users (10,000+) it should be easy to get good statistics on the Guitar Hero community as a whole. Even something as simple as community mean/standard deviation would be motivating for some, and give everyone a better sense of where they stand than just simple ranking. The data's all there; I just can't think of a simple way to export it all into a spreadsheet. Anyone else think it's a good idea? |
|
|
|
KurtCobain
Joined: 21 Jun 2007 Posts: 1044 Location: Minneapolis, Minnesota
|
Posted: Sun Jan 06, 2008 7:06 pm Post subject: |
|
|
stiltr01 wrote: | 100,000+ users |
Fixed.
So you're saying we should have an average of our community. It doesn't take that much effort to find the game you're looking for, divide the total users by two, and see where you rank compared to the average player.
But if I took this completely out of context, tell me. _________________
|
|
|
|
Cliff
Joined: 06 May 2006 Posts: 3002 Location: Springfield, IL
|
Posted: Sun Jan 06, 2008 8:33 pm Post subject: |
|
|
KurtCobain wrote: | stiltr01 wrote: | 100,000+ users |
Fixed.
So you're saying we should have an average of our community. It doesn't take that much effort to find the game you're looking for, divide the total users by two, and see where you rank compared to the average player.
But if I took this completely out of context, tell me. |
That's not technically the mean score. That's the mid-ranking player's score. And I think it'd be nice to be able to see the standard deviation; something that would take way too much effort on our part and far less on a computer's.
Obviously, if this does end up on the list of Stuff To Do, it'll be pretty low, but I don't think it's a bad idea. _________________
Alakaiser wrote: | I will eat your fucking children. |
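The difference Cliff points out, between the mean score and the mid-ranking player's score (the median), is easy to see on a small list. A minimal sketch using Python's standard `statistics` module; the scores below are made up for illustration and are not real ScoreHero data.

```python
import statistics

# Hypothetical scores for one song (assumption: not real leaderboard data).
scores = [88_000, 95_000, 101_000, 104_000, 150_000]

mean_score = statistics.mean(scores)      # arithmetic average of all scores
median_score = statistics.median(scores)  # score of the mid-ranking player
std_dev = statistics.stdev(scores)        # sample standard deviation (divides by n - 1)

print(f"mean={mean_score:.1f} median={median_score} stdev={std_dev:.1f}")
```

One very high score pulls the mean above the median here, which is why "divide the total users by two" gives a different answer from averaging.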
|
|
|
|
MPChedda
Joined: 06 Nov 2007 Posts: 1299 Location: Evansville, IN
|
Posted: Mon Jan 07, 2008 4:32 am Post subject: |
|
|
This actually is a pretty decent idea. It would be interesting to see how our community does cumulatively. It would give some of the lower ranked expert players something to look towards to say that they were above the average on SH. _________________
|
|
|
|
ricecake
Joined: 17 May 2007 Posts: 1890 Location: Linthicum Heights, MD
|
Posted: Mon Jan 07, 2008 6:01 am Post subject: |
|
|
Along these same lines, I think it would be interesting to see the score distribution plotted on a graph, to see if the scores really follow a Gaussian (Normal) distribution or if they're somehow skewed due to the number of awesome players here. _________________
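Short of plotting, one rough numeric check for skew is the sample skewness: close to 0 for a symmetric distribution, negative when the tail stretches toward low scores. A minimal sketch, assuming the adjusted Fisher-Pearson formula; the function name and the example values are hypothetical.

```python
import statistics

def sample_skewness(values):
    """Adjusted Fisher-Pearson sample skewness (needs at least 3 values)."""
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation (n - 1)
    cubed = sum(((v - mean) / sd) ** 3 for v in values)
    return n / ((n - 1) * (n - 2)) * cubed

# Hypothetical data: one symmetric set, one with a long tail of low scores.
symmetric = [1, 2, 3, 4, 5]
low_tail = [10, 90, 95, 98, 100]

print(sample_skewness(symmetric), sample_skewness(low_tail))
```

A single number cannot prove a distribution is Gaussian, so this is only a first look before graphing.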
|
|
|
|
SpoonMan
Joined: 06 Dec 2006 Posts: 3631
|
Posted: Mon Jan 07, 2008 6:37 am Post subject: |
|
|
^ I think it would vary by song, actually. There would be a normal curve for harder songs, I'd imagine, but really easy FCs like HSB, Yes We Can, and Closer would ramp up and flatten out, I think, due to the high number of close-to-upper-bound scores on those songs. _________________
|
|
|
|
thecaptainof
Joined: 04 May 2007 Posts: 7571 Location: ¯\(°_o)/¯
|
Posted: Mon Jan 07, 2008 4:43 pm Post subject: |
|
|
I've got an awesome piece of statistical analysis software (Cliff: I'd guess you're familiar with SPSS?) that might be able to very quickly do exactly what you're talking about... and from tomorrow morning I've got very little to do until Thursday, so I'll give it a try.
EDIT:
A quick experiment using only a handful of values (in this case, all the scores entered for Memories Of The Grove medium, since there's only ten of them) gave this:
Code: | Descriptives
| Variable | Statistic                                     | Value        | Std. Error |
| -------- | --------------------------------------------- | ------------ | ---------- |
| score    | Mean                                          | 106293.90    | 2900.291   |
|          | 95% Confidence Interval for Mean: Lower Bound | 99732.99     |            |
|          | 95% Confidence Interval for Mean: Upper Bound | 112854.81    |            |
|          | 5% Trimmed Mean                               | 106740.28    |            |
|          | Median                                        | 109716.00    |            |
|          | Variance                                      | 84116860.100 |            |
|          | Std. Deviation                                | 9171.524     |            |
|          | Minimum                                       | 88864        |            |
|          | Maximum                                       | 115689       |            |
|          | Range                                         | 26825        |            |
|          | Interquartile Range                           | 16081        |            |
|          | Skewness                                      | -.764        | .687       |
|          | Kurtosis                                      | -.594        | 1.334      |
| -------- | --------------------------------------------- | ------------ | ---------- | |
I could undoubtedly do more with it given a bit of time to RTFM and play around a bit; unfortunately, I've got stuff to do for the rest of the day so I can't get on it right away. Actually, the first thing I'd have to do is sort out a way of getting all of the scores entered for any given song; I tried copy/paste with the Jordan leaderboard and crashed the browser and everything else that I tried to paste it into... so that'll be something to think about. _________________
yksi-kaksi-kolme wrote: | Wow Mr. Mad, who fucked your buffalo? |
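The Std. Error and confidence bounds in the Descriptives output can be sanity-checked from the mean, the standard deviation, and the count of ten scores mentioned in the post. A minimal sketch, assuming the usual Student's t interval for a mean; the critical value is hard-coded here as an assumption (two-sided 95%, 9 degrees of freedom) rather than taken from the thread.

```python
import math

n = 10              # number of scores, per the post
mean = 106293.90    # Mean, from the Descriptives output
std_dev = 9171.524  # Std. Deviation, from the Descriptives output
t_crit = 2.262157   # assumption: 97.5th percentile of Student's t with n - 1 = 9 df

std_error = std_dev / math.sqrt(n)  # standard error of the mean
lower = mean - t_crit * std_error   # lower 95% confidence bound
upper = mean + t_crit * std_error   # upper 95% confidence bound

print(f"SE={std_error:.3f} CI=({lower:.2f}, {upper:.2f})")
```

The results should agree with the table to within rounding, which suggests the output was computed this way.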
|
|
|
|
tma
Joined: 03 May 2007 Posts: 1414 Location: Australia
|
Posted: Mon Jan 07, 2008 9:26 pm Post subject: |
|
|
thecaptainof wrote: | I could undoubtedly do more with it given a bit of time to RTFM and play around a bit; unfortunately, I've got stuff to do for the rest of the day so I can't get on it right away. Actually, the first thing I'd have to do is sort out a way of getting all of the scores entered for any given song; I tried copy/paste with the Jordan leaderboard and crashed the browser and everything else that I tried to paste it into... so that'll be something to think about. |
I currently scrape the top scores page for each difficulty once per day for my own little stat-tracking project I've been working on (stathero.com). I could fairly easily extract the scores from these pages if that would be useful. My records go back at least 2-3 months.
I don't scrape the top 50 per song however, that would be a bit too heavy on the ScoreHero server. |
|
|
|
Cliff
Joined: 06 May 2006 Posts: 3002 Location: Springfield, IL
|
Posted: Mon Jan 07, 2008 9:51 pm Post subject: |
|
|
thecaptainof wrote: | (Cliff: I'd guess you're familiar with SPSS?) |
Don't own it, but the college does. I use it for all my stats needs, because it's prettier than Excel. _________________
Alakaiser wrote: | I will eat your fucking children. |
|
|
|
|
Barfo
Joined: 10 Oct 2006 Posts: 2596
|
Posted: Tue Jan 08, 2008 9:34 pm Post subject: |
|
|
I've put a decent amount of effort into this sort of thing in the past. The most detailed analysis I did was one time where I took the entire database at that time of GH80s scores (about 2-4k per song at the time, I think?) and plotted them all up in terms of the actual score distribution. I even started to create some measurements and categorizations of all the songs based on the properties of those distributions.
It's a decidedly non-Gaussian distribution overall, except for some of the trickiest songs. Based on the distribution shape you can figure out some rough heuristics about the difficulty of the song per se; for example, songs like The Warrior or WGTB look very similar to each other and much different from, say, PWM and 17 (which also may have some shape similarities). And within distributions with the same basic shape type you can definitely see differences, which can hopefully be explained by (and perhaps be quantitative measures of) factors such as how easy the song is to FC, how easy/obvious the best non-squeeze SP path is, and also how easy the squeezes are and how much they influence the points.
I never really did a complete analysis of even the GH80s data to try to rank songs based on all of these individual assessments, and I meant to revisit this whole idea at some point in reference to GH3, but it's not something I've gotten around to yet, considering how much work it would be to manually go through and pull out the parameters of the distribution curves. Also, I can't see a really pressing use for such data, other than just basic scientific interest. _________________
Watching her, these things she said / "Time," she cried "failed to wait, this time"
***
Hush now / Let it go now / I know it's time to go / Time to let this fall / From my hands |
|
|
|
thegibbonator
Joined: 10 Jun 2007 Posts: 2496 Location: Cardiff / Weston-super-Mare
|
Posted: Tue Jan 08, 2008 10:37 pm Post subject: |
|
|
Barfo wrote: | *tons of awesome stuff that I have no chance in hell of understanding before I reach university* ...other than just basic scientific interest. |
Everybody loves random facts.
I'm pretty sure you could get people to help you with it. Scorehero is an intelligent place... especially for a forum. _________________
|
|
|
|
Monk
Joined: 04 Aug 2007 Posts: 539 Location: The Chaos Sanctuary
|
Posted: Tue Jan 08, 2008 10:57 pm Post subject: |
|
|
Maybe instead of a distribution curve, the scores themselves could be divided up into a circular or bar graph type thing... for example:
Out of the entire community on x song, 10% of the people scored over 400K and x% scored under 100K.
I guess it could still work as a distribution curve, but the graph would probably work better, since you would have a max of 10 bars/slices on any given single-player song, and some songs would only have 1-2 (unless, like in Mississippi Queen, you divided that bar up by 25K or 50K increments). Does that make any sense to you guys? |
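The bar-graph idea amounts to a histogram with fixed-width bins. A minimal sketch, assuming a 50K bin width as in the post's example; the function name and the scores are hypothetical.

```python
from collections import Counter

def bin_scores(scores, width=50_000):
    """Count scores per fixed-width bin; each key is the bin's lower edge."""
    counts = Counter((s // width) * width for s in scores)
    return dict(sorted(counts.items()))

# Hypothetical scores (assumption: not real leaderboard data).
scores = [42_000, 97_500, 120_000, 149_999, 150_000, 410_000]

for lower, count in bin_scores(scores).items():
    share = 100 * count / len(scores)  # percentage of all scores in this bin
    print(f"{lower:>7} to {lower + 50_000 - 1:>7}: {count} ({share:.0f}%)")
```

Bins with no scores are simply omitted here; a real chart would probably want to show them as empty bars.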
|
|
|
stiltr01
Joined: 06 Jun 2007 Posts: 8
|
Posted: Thu Jan 10, 2008 12:40 am Post subject: |
|
|
Those heuristics are very interesting. I've suspected that there are a few songs that are "too" difficult, in the sense that their cutoffs are so high that a normal person can't be expected to 5* it. In that regard, I think it'd be interesting to see a sort of error function, that is, what % of the community scored worse than a given score. |
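The "% of the community that scored worse than a given score" curve is what statisticians call an empirical cumulative distribution function. A minimal sketch using binary search on the sorted scores; the function name and the sample values are hypothetical.

```python
from bisect import bisect_left

def percent_below(scores, target):
    """Percentage of scores strictly lower than target."""
    ordered = sorted(scores)
    # bisect_left returns how many entries are strictly less than target.
    return 100 * bisect_left(ordered, target) / len(ordered)

# Hypothetical scores (assumption: not real leaderboard data).
scores = [100, 200, 300, 400]
print(percent_below(scores, 300))
```

Evaluating this at a song's 5* cutoff would show what share of submitted scores fall short of it, under the assumption that the submitted scores represent the community.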
|
|
|
Barfo
Joined: 10 Oct 2006 Posts: 2596
|
Posted: Thu Jan 10, 2008 1:22 am Post subject: |
|
|
stiltr01 wrote: | Those heuristics are very interesting. I've suspected that there are a few songs that are "too" difficult, in the sense that their cutoffs are so high that a normal person can't be expected to 5* it. In that regard, I think it'd be interesting to see a sort of error function, that is, what % of the community scored worse than a given score. |
I don't agree with the practice of using such data to assign value (such as "too" or "not enough" or even "just right") to the difficulty, but yeah, that is more or less exactly what I settled on as the most useful way of plotting up the data. If I find myself unaccountably at my home PC with nothing to do, maybe I'll revisit my 80s data (which is from a very outdated midsummer data snapshot) and put up screencaps of a couple of representative curves or whatever, so people can see what I'm talking about. _________________
Watching her, these things she said / "Time," she cried "failed to wait, this time"
***
Hush now / Let it go now / I know it's time to go / Time to let this fall / From my hands |
|
|
|
Boon
Joined: 25 Oct 2006 Posts: 269 Location: Paris France
|
Posted: Tue Jan 22, 2008 2:41 pm Post subject: |
|
|
Another interesting point using advanced statistics is inferring a player profile, i.e. what your expected score for a song is based on your other scores.
Each song has different characteristics: hard rhythm, difficult chords, long solos, etc.
And each of us has different capabilities (which could be nicely represented using a Kiviat diagram).
Maybe it is possible to go further than mean and standard deviation?
But at least this basic information for every song would be very interesting:
- number of players for every ranking (and a bar graph?)
- number of scores for every ranking (maybe interesting)
- mean score (for all players, for all scores)
- standard deviation
- median score
- total number of scores
- total number of players having a score for this song _________________
Goro is a lucky guy... He can play coop all by himself !
My blog (in French) |
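Several items on Boon's list (mean, median, standard deviation, total scores, total players per song) can be computed in one pass over the raw records. A minimal sketch, assuming the data is available as `(player, song, score)` tuples; that layout, the function name, and the sample records are all hypothetical.

```python
import statistics
from collections import defaultdict

def summarize(records):
    """Per-song summary from an iterable of (player, song, score) tuples."""
    by_song = defaultdict(list)
    for player, song, score in records:
        by_song[song].append((player, score))

    summary = {}
    for song, entries in by_song.items():
        scores = [score for _, score in entries]
        summary[song] = {
            "scores": len(scores),                               # total number of scores
            "players": len({player for player, _ in entries}),   # distinct players
            "mean": statistics.mean(scores),
            "median": statistics.median(scores),
            # stdev needs at least two data points, so fall back to 0.0
            "stdev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
        }
    return summary

# Hypothetical records (assumption: not real leaderboard data).
records = [
    ("a", "Song X", 100), ("b", "Song X", 200),
    ("a", "Song X", 300), ("c", "Song Y", 50),
]
print(summarize(records))
```

Counting scores and players separately matters when a player has more than one score on a song, as player "a" does in the sample.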
|
|
|
|