Jump to content
  • entries
    37
  • comments
    27
  • views
    12,649

NHLErrata launch


Ever since I started collecting the NHL data it was very important to me to validate the collected information. So I created a set of checks that finally formed a whole library to test both the NHL boxscore feeds and the HTML reports.

I managed to fish out quite a few errors and inconsistencies. Some where systemic, and could be fixed in software, some required a manual intervention, sometimes by editing the source file, sometimes by providing the correct value overriding what the parser would read. These interventions, classified into a variety of types, formed another library.

But I also wanted to share the errors that I found and the fixes I figured out with the analytics community. So I decided to create a website dedicated to it. It took a while, but finally a couple of days ago I was able to open NHLErrata.com. There you can find:
 
  • An overview of data sources.
  • Information on missing players and events
  • Information of broken reports, players and events
  • Systemic problems encountered with the reports

Both mentioned libraries, Test.pm and Errors.pm are part of my scrape-to-database package on CPAN.

0 Comments


Recommended Comments

There are no comments to display.

Guest
Add a comment...

×   Pasted as rich text.   Paste as plain text instead

  Only 75 emoji are allowed.

×   Your link has been automatically embedded.   Display as a link instead

×   Your previous content has been restored.   Clear editor

×   You cannot paste images directly. Upload or insert images from URL.

×
×
  • Create New...