Ever since the formation of VGChartz, we have faced the same difficult scenario. Every piece of data we report is an estimate. It will always be higher or lower than the actual figure and there will always be some for which the figure is favourable and some for which the figure is not. Whenever we release a piece of data that isn't favourable for some reason, the easiest thing to do is to simply discredit the data. The most recent example of this is Aaron Greenberg's Twitter post commenting on our initial Kinect sales estimate:
"LOL'ing at sales reports from VGChartz, why do people release info as official when there is no source or science behind the #s?"
Firstly, the data was never presented as official. Secondly, our estimate of 475,000 units was pretty close to the final figure, which seems to be around 550,000 units as confirmed by Microsoft's recent announcement of over 1 million units sold worldwide in ten days. So why feel the need to comment when the sales are in the right area? Perhaps Greenberg didn't feel that the 475,000 figure was favourable given the hype behind the Kinect launch. Thirdly, attempting to discredit VGChartz with the claim that there is no science or source is the same line that everyone takes when they don't like a figure we publish - it is an easy way to deflect attention.
So what was the science behind Kinect estimates? Where does the 475,000 figure come from? We arrived at the figure via four distinct routes:
- A telephone survey of retailers across America. We called over 200 retailers (distributed across the country and across different chains in a proportion representative of market share) and asked how many Kinect units they had purchased and how many they had sold. Now 200 retailers out of 30,000 isn't a lot, but one GameStop will sell similar amounts to another - each new store you call follows the law of diminishing returns in terms of accuracy improvement. Then, on a retailer-by-retailer basis, we took an average per store and multiplied it by the number of stores. We then scaled this up to account for any missing retailers, in line with their share of the overall market. This gives one estimate of Kinect sell-through.
- A pre-order analysis using a typical pre-order to week-one ratio for a casual Xbox 360 title to arrive at a second estimate. An explanation of where our pre-order data comes from could form its own editorial, but the two main contributors are retail pre-order info from best-seller lists at various stores and user pre-order data (purchase intent) via trawls of major sites.
- Data from a small retail panel that provides regular data for VGChartz, weighted and scaled to provide a third estimate. We have a lot of experience with this data, so while it isn't particularly representative, we have a lookup table of scaling factors and adjustments for different types of game / hardware.
- A Gamercard analysis of more than 5 million Gamercards to calculate the proportion of players playing Kinect games (specifically Kinect Adventures), and from the raw figures we can scale up to produce a fourth estimate of sales via an analysis of gamercard to sales ratios for different types of game.
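The first route in the list above (a per-store average for each chain, multiplied by that chain's store count, then scaled up for retailers that were not surveyed) can be sketched roughly as follows. This is only an illustration of the arithmetic described, not the actual VGChartz code: the names `ChainSample`, `estimate_sell_through` and `covered_share`, and every number in the example, are assumptions made up for the sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class ChainSample:
    """Survey results for one retail chain (all values hypothetical)."""
    name: str
    units_sold_per_store: List[int]  # units reported by each store called
    total_stores: int                # number of stores the chain operates


def estimate_sell_through(samples: List[ChainSample], covered_share: float) -> float:
    """Scale per-store averages up to a market-wide unit estimate."""
    if not 0 < covered_share <= 1:
        raise ValueError("covered_share must be in (0, 1]")
    covered_units = 0.0
    for chain in samples:
        if not chain.units_sold_per_store:
            raise ValueError(f"no surveyed stores for {chain.name}")
        # Average units per surveyed store, applied to every store in the chain.
        average = sum(chain.units_sold_per_store) / len(chain.units_sold_per_store)
        covered_units += average * chain.total_stores
    # The surveyed chains cover only part of the market; scale up for the rest.
    return covered_units / covered_share


# Made-up numbers purely for illustration.
samples = [
    ChainSample("Chain A", [10, 14, 12], total_stores=1000),
    ChainSample("Chain B", [4, 6], total_stores=2000),
]
print(round(estimate_sell_through(samples, covered_share=0.8)))
```

Here the two hypothetical chains are assumed to represent 80% of the market, so their combined total is divided by 0.8 to cover the retailers that were not called.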
Taking a weighted average of the data arrived at via these four methods produced the 475,000 figure. These are just four of the ten different processes we have available for data collection (most of them automated) and the four that were applicable in this example. So, while VGChartz certainly doesn't have direct access to sell-through data from major retailers (and has never claimed that to be the case), there is still a science behind the data and some very clever and innovative work going on behind the scenes, using data that most could access but that only we know what to do with (the competitive advantage that leads to such mystery surrounding our methodology).
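To make the weighted-average step concrete, here is a minimal sketch. The individual estimates and weights behind the 475,000 figure are not published in this editorial, so the values below (in thousands of units) and the function name `weighted_average` are purely hypothetical placeholders.

```python
from typing import Dict


def weighted_average(estimates: Dict[str, float], weights: Dict[str, float]) -> float:
    """Combine several estimates, giving more reliable sources more weight."""
    if estimates.keys() != weights.keys():
        raise ValueError("estimates and weights must cover the same sources")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    return sum(estimates[k] * weights[k] for k in estimates) / total_weight


# Hypothetical estimates (thousands of units) and hypothetical weights.
estimates = {"survey": 500, "preorder": 400, "panel": 450, "gamercard": 550}
weights = {"survey": 4, "preorder": 2, "panel": 2, "gamercard": 2}
print(weighted_average(estimates, weights))
```

Dividing by the sum of the weights means they do not need to add up to 1; only their relative sizes matter.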
What VGChartz offers is timely data that isn't meant to be 100% accurate but to be in the right range. We don't compete with the likes of NPD, GfK or Chart-Track; we offer a service that is totally different. One that is not based on comprehensive and direct retail tracking, but rather uses modern and alternative methods to quickly arrive at estimates, combined with a database of historical sales - constantly adjusted and tweaked to be as accurate as possible. Timely data is a much sought-after commodity, especially in the world of investment and retail where knowing the success of a product 12 hours before your competitors can result in a huge advantage. The way we track sales at VGChartz is pretty much real time - we don't have to wait for data to be aggregated or sent over (in the case of NPD, at the end of the month, for some retailers). We can pull figures in minutes after the close of the day. Accuracy is obviously the biggest sacrifice, but that is the balance we have to manage. To some users of VGChartz data, knowing that a game sold 7 million copies on day 1 with a margin of ±0.5 million and having that data hours after the end of the day is more valuable than having a more accurate figure two days later. These are the people who understand the idea behind VGChartz and what I am striving to achieve with the service.
It saddens me to see a senior figure like Greenberg taking cheap shots at VGChartz to mask their own apparent disappointment at initial Kinect sales. I don't understand why they would think that half a million units in 3 days in the Americas is something to laugh at - that puts sales right in line with Wii launch sales. Why bother to try to discredit VGChartz when, firstly, the sales are very good, and secondly, our estimates were clearly in the right area? It also saddens me to see major publishers wanting to hide and control the flow of data to the public. This is where VGChartz comes in - our primary goal is to provide independent data that is unavailable elsewhere to enable users to analyse the videogame industry. It shouldn't come as a major surprise that publishers, who have no influence or control over the information shown on VGChartz, should resort to trying to undermine a figure when sales of a product don't meet their expectations - it is no different to the behaviour of users on sites like NeoGAF and N4G who clearly fail to understand the point of VGChartz.
As VGChartz grows in importance and influence, I expect to see this happening more and more. Publishers will be happy to use our data when a product performs well but quick to dismiss it when something doesn't. It will always be easy to attack the methodology and sources of our data, as it is something I keep closely guarded and for good reason. Anything that isn't understood is easy to dismiss and this mindset is unlikely to change. Fortunately, our data often speaks for itself. For every time we get a piece of data wrong, and I'm the first to admit that it happens, we get ten pieces of data right. We pegged Black Ops at 7 million on day 1 with 5.4 million across North America and the UK; Activision later confirmed 5.6 million across NA and UK (but including sales via Steam, which VGChartz doesn't track) - therefore our 7 million day 1 figure was spot on. For Kinect, we reported 475,000 units sold in 3 days in the Americas compared to 550,000 as the latest estimate. Initial Fable III figures were very close, same with Medal of Honor and Fallout: New Vegas. The only recent product we undertracked initially was PlayStation Move, by around 20%, and this is mainly due to the more casual nature of the product and inexperience with scaling ratios etc. Now that we have experience with Move, future estimates should be far more accurate.
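For readers who want to check how far off an estimate is, the comparison above is simple arithmetic. The sketch below uses the two Kinect numbers quoted in this editorial (the 475,000 estimate and the roughly 550,000 later figure, which is itself approximate); the function name `relative_error` is just an illustrative choice.

```python
def relative_error(estimate: float, reference: float) -> float:
    """Signed error of an estimate as a fraction of the reference figure."""
    if reference == 0:
        raise ValueError("reference must be non-zero")
    return (estimate - reference) / reference


# Kinect: VGChartz estimate vs. the later (still approximate) figure.
kinect = relative_error(475_000, 550_000)
print(f"{kinect:.1%}")  # -13.6%, i.e. about 14% under
```

A negative result means the estimate was under the reference figure, a positive one that it was over.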
The key to success with the data on VGChartz (or in fact ANY piece of data) is understanding its nature and limitations. Too many people fail to understand that the vast majority of figures reported in the press are estimates of one kind or another. "A third of 16 to 24-year-olds lost their virginity below the age of consent" according to a poll of 29,623 listeners of BBC Radio 1 in the UK. Seems like a reasonable sample size but is it representative? Are 30,000 Radio 1 listeners (for those not familiar, Radio 1's audience tends to be in the 16-24, single, hip, sexually-aware category) who replied to an online poll about sex likely to be any more representative of the whole population than 30,000 World of Warcraft users? Is a poll on gamrConnect asking "which of the VGChartz sub-sites is your favourite" going to return a result other than gamrConnect? Is asking ten people if they bought a game likely to return good results? Is reporting a game increasing 758% in sales on Amazon following a price cut newsworthy with Amazon's 3% market share? With VGChartz all of our samples are representative. If we carry out telephone polls of retailers, we ensure we target the different retailer chains in the same proportion as their overall market share. When we weigh different sources of data, we do so with weightings to reflect the overall market or the reliability of that source of data.
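As an illustration of targeting retailer chains in proportion to their market share, here is one possible way to split a fixed number of survey calls. It is a sketch under stated assumptions: the chain names and percentage shares are invented, and the largest-remainder rounding used to hand out leftover calls is simply one reasonable choice, not something this editorial specifies.

```python
from typing import Dict


def allocate_calls(share_percent: Dict[str, int], total_calls: int) -> Dict[str, int]:
    """Split total_calls across chains in proportion to their market share."""
    total_share = sum(share_percent.values())
    if total_share <= 0 or total_calls < 0:
        raise ValueError("need positive total share and non-negative total_calls")
    # Whole calls each chain is entitled to, using integer arithmetic.
    calls = {n: s * total_calls // total_share for n, s in share_percent.items()}
    # Fractional part left over for each chain (as a numerator over total_share).
    remainder = {n: s * total_calls % total_share for n, s in share_percent.items()}
    leftover = total_calls - sum(calls.values())
    # Give the remaining calls to the chains with the largest remainders.
    for name in sorted(remainder, key=remainder.get, reverse=True)[:leftover]:
        calls[name] += 1
    return calls


# Hypothetical chains and market shares (in percent).
print(allocate_calls({"Chain A": 50, "Chain B": 30, "Chain C": 20}, 200))
```

With these made-up shares, 200 calls split cleanly into 100, 60 and 40; when the split is not exact, the leftover calls go to the chains whose proportional share was rounded down the most.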
However, even with the greatest of diligence, our data is still just an estimate, so whenever quoting a figure from VGChartz it should be listed as an estimate and readers should be made aware that there is a margin of error associated. With this in mind, VGChartz data is fine for most applications - from a year-on-year genre analysis to first-day estimates for a major title to a ballpark estimate of total sales to date for a given game. It just requires the user to have a little common sense and realise that an estimate is not exact but is better than having no information and is intended to point you in the right direction. If we list the sales of a game at 600,000 then you know it hasn't sold 1 million and you know it hasn't sold a few hundred thousand. It might not have sold exactly 600,000 but it should be around that range. It gives you more information than you had before but you must remember that it isn't an exact figure.
Maybe if websites, readers, retailers and major publishers got on board with VGChartz, dismissed the various political reasons they have not to support the site (which could form an editorial of its own) and understood the nature of the data and the insight it gives into the videogame industry to the extent that developers, investors and VGChartz readers seem to appreciate, then maybe the data could improve even further and that way everybody could benefit.
Any questions or comments, please leave a comment below or email me - firstname.lastname@example.org