I managed to fish out quite a few errors and inconsistencies. Some were systemic and could be fixed in software; others required manual intervention, either by editing the source file or by providing a correct value that overrides what the parser would read. These interventions, classified into a variety of types, formed another library.
But I also wanted to share the errors I found and the fixes I figured out with the analytics community, so I decided to create a website dedicated to them. It took a while, but a couple of days ago I was finally able to open NHLErrata.com. There you can find:
- An overview of data sources
- Information on missing players and events
- Information on broken reports, players and events
- Systemic problems encountered with the reports
Both of the libraries mentioned, Test.pm and Errors.pm, are part of my scrape-to-database package on CPAN.