ScoreHero
Statistical Analysis
stiltr01  





Joined: 06 Jun 2007
Posts: 8

PostPosted: Sun Jan 06, 2008 7:03 pm    Post subject: Statistical Analysis Reply with quote

Now that we have a substantial number of users (10,000+) it should be easy to get good statistics on the Guitar Hero community as a whole. Even something as simple as community mean/standard deviation would be motivating for some, and give everyone a better sense of where they stand than just simple ranking. The data's all there, I just can't think of a simple way to export it all into a spreadsheet. Anyone else think it's a good idea?
Back to top
View user's profile Send private message
KurtCobain  





Joined: 21 Jun 2007
Posts: 1044
Location: Minneapolis, Minnesota

PostPosted: Sun Jan 06, 2008 7:06 pm    Post subject: Reply with quote

stiltr01 wrote:
100,000+ users


Fixed.

So you're saying we should have an average of our community. It doesn't take that much effort to find the game you're looking for, divide the total users by two, and see where you rank compared to the average player.

But if I took this completely out of context, tell me.
_________________
46511.png
Back to top
View user's profile Send private message Send e-mail Visit poster's website Yahoo Messenger MSN Messenger XBL Gamertag: TTFAFWeCarryOn
Cliff  





Joined: 06 May 2006
Posts: 3002
Location: Springfield, IL

PostPosted: Sun Jan 06, 2008 8:33 pm    Post subject: Reply with quote

KurtCobain wrote:
stiltr01 wrote:
100,000+ users


Fixed.

So you're saying we should have an average of our community. It doesn't take that much effort to find the game you're looking for, divide the total users by two, and see where you rank compared to the average player.

But if I took this completely out of context, tell me.


That's not technically the mean score. That's the mid-ranking player's score. And I think it'd be nice to be able to see the standard deviation; something that would take way too much effort on our part and far less on a computer's.

Obviously, if this does end up on the list of Stuff To Do, it'll be pretty low, but I don't think it's a bad idea.
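To make the mean versus mid-ranking distinction concrete, here is a minimal Python sketch. The scores are invented for illustration and are not real ScoreHero data; the variable names are placeholders as well.

```python
import statistics

# Hypothetical scores for one song (assumption: not real ScoreHero data).
scores = [52000, 61000, 64000, 70000, 75000, 98000, 140000]

# "Divide the total users by two": take the score of the mid-ranking player.
ranked = sorted(scores, reverse=True)
mid_rank_score = ranked[len(ranked) // 2]

# The mean uses every score, so a few very high scores pull it upward.
mean_score = statistics.mean(scores)

# Sample standard deviation (n - 1 in the denominator).
std_dev = statistics.stdev(scores)

print("mid-ranking score:", mid_rank_score)
print("mean score:", mean_score)
print("standard deviation:", round(std_dev, 1))
```

With an odd number of entries the mid-ranking score is the median, and in this made-up sample it sits well below the mean because of the single very high score.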
_________________
Alakaiser wrote:
I will eat your fucking children.
Back to top
View user's profile Send private message XBL Gamertag: Clyph
MPChedda  





Joined: 06 Nov 2007
Posts: 1299
Location: Evansville, IN

PostPosted: Mon Jan 07, 2008 4:32 am    Post subject: Reply with quote

This actually is a pretty decent idea. It would be interesting to see how our community does cumulatively. It would give some of the lower ranked expert players something to look towards to say that they were above the average on SH.
_________________
Back to top
View user's profile Send private message Visit poster's website MSN Messenger
ricecake  





Joined: 17 May 2007
Posts: 1890
Location: Linthicum Heights, MD

PostPosted: Mon Jan 07, 2008 6:01 am    Post subject: Reply with quote

Along these same lines, I think it would be interesting to see the score distribution plotted on a graph, to see if the scores really follow a Gaussian (Normal) distribution or if they're somehow skewed due to the number of awesome players here.
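Besides plotting, one rough numeric check is to compute skewness and excess kurtosis, which are both near zero for a Gaussian sample. The sketch below uses the plain moment-based formulas without small-sample corrections (so the numbers will differ slightly from what a package such as SPSS reports), and the scores are invented for illustration.

```python
import statistics

def central_moment(values, k):
    """k-th central moment, dividing by n (no small-sample correction)."""
    mean = statistics.fmean(values)
    return sum((v - mean) ** k for v in values) / len(values)

def skewness(values):
    """Moment-based skewness: negative means a longer tail of low values."""
    return central_moment(values, 3) / central_moment(values, 2) ** 1.5

def excess_kurtosis(values):
    """Moment-based kurtosis minus 3, so a Gaussian would give about 0."""
    return central_moment(values, 4) / central_moment(values, 2) ** 2 - 3

# Hypothetical scores bunched near a ceiling, with a tail of lower scores.
scores = [60000, 82000, 91000, 95000, 97000, 98000, 99000, 99500, 100000, 100000]

print("skewness:", round(skewness(scores), 2))
print("excess kurtosis:", round(excess_kurtosis(scores), 2))
```

Values far from zero suggest the distribution is not Gaussian, though with small samples these statistics are noisy and should be read alongside a plot.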
_________________
Back to top
View user's profile Wiki User Page Send private message Visit poster's website PSN Name: ricecake138
SpoonMan  





Joined: 06 Dec 2006
Posts: 3631

PostPosted: Mon Jan 07, 2008 6:37 am    Post subject: Reply with quote

^ i think it would vary by song actually, there would be a normal curve for harder songs i'd imagine, but really easy FCs like HSB, yes we can, and closer would ramp up and flatten out i think due to the high number of close to upperbound scores on those songs.
_________________
My stupid GH achievements
My stupid RB achievements
Participate in the Master FC Breakdown Project! GH1 - GH2 - GH80s - GH3 - GH3 DLC - GHA - GHWT - GHM - GHSH - WTDLC - GHVH - GH5 - BH - GH5 DLC
All Guitar-Only Games
WINNING.
Back to top
View user's profile Wiki User Page Send private message
thecaptainof  





Joined: 04 May 2007
Posts: 7571
Location: ¯\(°_o)/¯

PostPosted: Mon Jan 07, 2008 4:43 pm    Post subject: Reply with quote

I've got an awesome piece of statistical analysis software (Cliff: I'd guess you're familiar with SPSS?) that might be able to very quickly do exactly what you're talking about... and from tomorrow morning I've got very little to do until Thursday, so I'll give it a try.

EDIT:

A quick experiment using only a handful of values (in this case, all the scores entered for Memories Of The Grove medium, since there's only ten of them) gave this:

Code:
Descriptives
 | ----- | -------------------------------------------- | ------------ | ---------- |
 |       |                                              | Statistic    | Std. Error |
 | ----- | -------------------------------------------- | ------------ | ---------- |
 | score | Mean                                         | 106293.90    | 2900.291   |
 |       | 95% Confidence Interval for Mean Lower Bound | 99732.99     |            |
 |       | 95% Confidence Interval for Mean Upper Bound | 112854.81    |            |
 |       | 5% Trimmed Mean                              | 106740.28    |            |
 |       | Median                                       | 109716.00    |            |
 |       | Variance                                     | 84116860.100 |            |
 |       | Std. Deviation                               | 9171.524     |            |
 |       | Minimum                                      | 88864        |            |
 |       | Maximum                                      | 115689       |            |
 |       | Range                                        | 26825        |            |
 |       | Interquartile Range                          | 16081        |            |
 |       | Skewness                                     | -.764        | .687       |
 |       | Kurtosis                                     | -.594        | 1.334      |
 | ----- | -------------------------------------------- | ------------ | ---------- |


I could undoubtedly do more with it given a bit of time to RTFM and play around a bit; unfortunately, I've got stuff to do for the rest of the day so I can't get on it right away. Actually, the first thing I'd have to do is sort out a way of getting all of the scores entered for any given song; I tried copy/paste with the Jordan leaderboard and crashed the browser and everything else that I tried to paste it into... so that'll be something to think about.
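For anyone without SPSS, most of the Descriptives figures can be computed with the Python standard library. The sketch below is only an illustration: the `describe` function and its `t_crit` parameter are made-up names, and since the ten underlying scores are not listed in the thread, it only checks that the quoted standard error and lower confidence bound are consistent with the quoted mean, standard deviation, and n = 10.

```python
import math
import statistics

def describe(scores, t_crit):
    """Basic descriptives; t_crit is the two-sided 95% t value for n - 1 df."""
    n = len(scores)
    mean = statistics.fmean(scores)
    sd = statistics.stdev(scores)      # sample standard deviation (n - 1)
    se = sd / math.sqrt(n)             # standard error of the mean
    return {
        "n": n,
        "mean": mean,
        "median": statistics.median(scores),
        "variance": sd ** 2,
        "std_dev": sd,
        "std_error": se,
        "ci95": (mean - t_crit * se, mean + t_crit * se),
        "min": min(scores),
        "max": max(scores),
        "range": max(scores) - min(scores),
    }

# Consistency check using only figures quoted in the table (n = 10, so 9 df).
reported_mean, reported_sd, n = 106293.90, 9171.524, 10
t_crit_9df = 2.262                     # 97.5th percentile of t with 9 df
se = reported_sd / math.sqrt(n)
lower = reported_mean - t_crit_9df * se

print("standard error:", round(se, 3))  # should be close to the reported 2900.291
print("lower bound:", round(lower))     # should be close to the reported 99732.99
```

Given a real list of scores, calling `describe(scores, t_crit)` would return the same kinds of figures; the t value has to be looked up for the sample size in question.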
_________________


yksi-kaksi-kolme wrote:
Wow Mr. Mad, who fucked your buffalo?
Back to top
View user's profile Wiki User Page Send private message
tma  





Joined: 03 May 2007
Posts: 1414
Location: Australia

PostPosted: Mon Jan 07, 2008 9:26 pm    Post subject: Reply with quote

thecaptainof wrote:
I could undoubtedly do more with it given a bit of time to RTFM and play around a bit; unfortunately, I've got stuff to do for the rest of the day so I can't get on it right away. Actually, the first thing I'd have to do is sort out a way of getting all of the scores entered for any given song; I tried copy/paste with the Jordan leaderboard and crashed the browser and everything else that I tried to paste it into... so that'll be something to think about.


I currently scrape the top scores page for each difficulty once per day for my own little stat-tracking project I've been working on (stathero.com). I could fairly easily extract the scores from these pages if that would be useful. My records go back at least 2-3 months.

I don't scrape the top 50 per song however, that would be a bit too heavy on the ScoreHero server.
Back to top
View user's profile Wiki User Page Send private message XBL Gamertag: zzUrbanSpaceman
Cliff  





Joined: 06 May 2006
Posts: 3002
Location: Springfield, IL

PostPosted: Mon Jan 07, 2008 9:51 pm    Post subject: Reply with quote

thecaptainof wrote:
(Cliff: I'd guess you're familiar with SPSS?)


Don't own it, but the college does. I use it for all my stats needs, because it's prettier than Excel.
_________________
Alakaiser wrote:
I will eat your fucking children.
Back to top
View user's profile Send private message XBL Gamertag: Clyph
Barfo  





Joined: 10 Oct 2006
Posts: 2596

PostPosted: Tue Jan 08, 2008 9:34 pm    Post subject: Reply with quote

I've put a decent amount of effort into this sort of thing in the past. The most detailed analysis I did was one time where I took the entire database at that time of GH80s scores (about 2-4k per song at the time, I think?) and plotted them all up in terms of the actual score distribution. I even started to create some measurements and categorizations of all the songs based on the properties of those distributions.

It's a decidedly non-Gaussian distribution overall, except for some of the trickiest songs. Based on the distribution shape you can figure out some rough heuristics about the difficulty of the song per se; for example, songs like The Warrior or WGTB look very similar to each other and much different from, say, PWM and 17 (which also may have some shape similarities). And within distributions with the same basic shape type you can definitely see differences which can hopefully be explained by (and perhaps be quantitative measures of) factors such as how easy the song is to FC, how easy/obvious the best non-squeeze SP path is, and also how easy the squeezes are and how many points they influence.

I never really did a complete analysis of even the GH80s data to try and rank songs based on all of these individual assessments, and I meant to at some point revisit this whole idea in reference to GH3, but it's not something I've gotten around to yet, considering how much work it would be to manually go through and pull out the parameters of the distribution curves. Also, I can't see a really pressing use for such data, other than just basic scientific interest.
_________________
Watching her, these things she said / "Time," she cried "failed to wait, this time"
***
Hush now / Let it go now / I know it's time to go / Time to let this fall / From my hands
Back to top
View user's profile Wiki User Page Send private message
thegibbonator  





Joined: 10 Jun 2007
Posts: 2496
Location: Cardiff / Weston-super-Mare

PostPosted: Tue Jan 08, 2008 10:37 pm    Post subject: Reply with quote

Barfo wrote:
*tons of awesome stuff that I have no chance in hell of understanding before I reach university* ...other than just basic scientific interest.


Everybody loves random facts.

I'm pretty sure you could get people to help you with it. Scorehero is an intelligent place... especially for a forum.
_________________



My last.fm profile. My musical taste is obviously superior to everybody else's. Even if it does include black metal, classical and synthpop.
Back to top
View user's profile Wiki User Page Send private message Send e-mail Visit poster's website MSN Messenger PSN Name: Tetrinity
Monk  





Joined: 04 Aug 2007
Posts: 539
Location: The Chaos Sanctuary

PostPosted: Tue Jan 08, 2008 10:57 pm    Post subject: Reply with quote

Maybe instead of a distribution curve the scores themselves could be divided up into a pie chart or bar graph type thing... for example:

Out of the entire community on song X, 10% of the people scored over 400K and x% scored under 100K.

I guess it could still work as a distribution curve, but the graph would probably work better since you would have a max of 10 bars/slices on any given single-player song, and some songs would only have 1-2 (unless, like in Mississippi Queen, you divided the bars up into 25K or 50K increments). Does that make any sense to you guys?
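The binning idea can be sketched in a few lines of Python. The function name `share_by_bin`, the 50K bin width, and the scores are all assumptions made up for illustration.

```python
from collections import Counter

def share_by_bin(scores, bin_width):
    """Percentage of scores falling in each [start, start + bin_width) bin."""
    counts = Counter((s // bin_width) * bin_width for s in scores)
    total = len(scores)
    return {start: 100 * counts[start] / total for start in sorted(counts)}

# Hypothetical scores for one song, binned in 50K increments.
scores = [95000, 120000, 130000, 160000, 180000,
          210000, 240000, 260000, 310000, 420000]

for start, pct in share_by_bin(scores, 50000).items():
    label = f"{start // 1000}K-{(start + 50000) // 1000}K"
    print(f"{label:>10}: {pct:.0f}% {'#' * round(pct / 5)}")
```

Each printed row is one bar; a narrower bin width (25K, say) gives more bars for songs whose scores fall in a small range.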
Back to top
View user's profile Send private message
stiltr01  





Joined: 06 Jun 2007
Posts: 8

PostPosted: Thu Jan 10, 2008 12:40 am    Post subject: Reply with quote

Those heuristics are very interesting. I've suspected that there are a few songs that are "too" difficult, in the sense that their cutoffs are so high that a normal person can't be expected to 5* it. In that regard, I think it'd be interesting to see a sort of error function, that is, what % of the community scored worse than a given score.
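The "what % scored worse than a given score" curve is what statisticians usually call an empirical cumulative distribution function. A minimal Python sketch, with an invented function name and invented scores, could look like this:

```python
from bisect import bisect_left

def percent_below(sorted_scores, score):
    """Percentage of entries strictly below `score`; input must be sorted ascending."""
    return 100 * bisect_left(sorted_scores, score) / len(sorted_scores)

# Hypothetical scores for one song (assumption: not real ScoreHero data).
scores = sorted([88000, 92000, 101000, 104000, 109000,
                 110000, 112000, 113000, 115000, 116000])

# Where would a 110,000 run sit relative to everyone else?
print(percent_below(scores, 110000))
```

Evaluating `percent_below` at a song's 5* cutoff would show what share of submitted scores fall short of it, keeping in mind that people who never submitted a score are not counted.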
Back to top
View user's profile Send private message
Barfo  





Joined: 10 Oct 2006
Posts: 2596

PostPosted: Thu Jan 10, 2008 1:22 am    Post subject: Reply with quote

stiltr01 wrote:
Those heuristics are very interesting. I've suspected that there are a few songs that are "too" difficult, in the sense that their cutoffs are so high that a normal person can't be expected to 5* it. In that regard, I think it'd be interesting to see a sort of error function, that is, what % of the community scored worse than a given score.

I don't agree with the practice of using such data to assign value (such as "too" or "not enough" or even "just right") to the difficulty, but yeah, that is more or less exactly what I settled on as the most useful way of plotting up the data. If I find myself unaccountably at my home PC with nothing to do, maybe I'll revisit my 80s data (which is from a very outdated midsummer data snapshot) and put up screencaps of a couple of representative curves or whatever, so people can see what I'm talking about.
_________________
Watching her, these things she said / "Time," she cried "failed to wait, this time"
***
Hush now / Let it go now / I know it's time to go / Time to let this fall / From my hands
Back to top
View user's profile Wiki User Page Send private message
Boon  





Joined: 25 Oct 2006
Posts: 269
Location: Paris France

PostPosted: Tue Jan 22, 2008 2:41 pm    Post subject: Reply with quote

Another interesting use of advanced statistics is inferring a player profile, i.e. what your expected score for a song is based on your other scores.

Each song has different characteristics: hard rhythm, difficult chords, long solos, etc.
And every one of us has different capabilities (which could be nicely represented using a Kiviat diagram).
Maybe it is possible to go further than mean and standard deviation?

But at the least, this basic information for every song would be very interesting:
- number of players at every ranking (and a bar graph?)
- number of scores at every ranking (maybe interesting)
- mean score (over all players, over all scores)
- standard deviation
- median score
- total number of scores
- total number of players having a score for this song
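Most of the list above can be computed from a flat list of (player, score) entries. The sketch below is one possible reading, not a description of how ScoreHero stores anything: it assumes a player may have several scores on a song, and it interprets "mean over all players" as the mean of each player's best score. The function name and sample data are invented.

```python
import statistics

def song_summary(entries):
    """Summarise a list of (player, score) entries for a single song."""
    scores = [score for _, score in entries]

    # Keep each player's best score (assumed meaning of "for all players").
    best = {}
    for player, score in entries:
        best[player] = max(score, best.get(player, 0))

    return {
        "total_scores": len(scores),
        "total_players": len(best),
        "mean_all_scores": statistics.fmean(scores),
        "mean_best_per_player": statistics.fmean(list(best.values())),
        "median": statistics.median(scores),
        "std_dev": statistics.stdev(scores) if len(scores) > 1 else 0.0,
    }

# Hypothetical entries: "ann" has submitted two scores for this song.
entries = [("ann", 100000), ("ann", 120000), ("bob", 90000), ("cy", 110000)]

for name, value in song_summary(entries).items():
    print(f"{name}: {value}")
```

The per-ranking counts from the list would need the star cutoffs for each song, which are not given in this thread, so they are left out here.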
_________________
Goro is a lucky guy... He can play coop all by himself !

My blog (in French)
Back to top
View user's profile Send private message Visit poster's website
All times are GMT. Page 1 of 3.