A Look at Election Campaign Contributions with R

Every four years, Florida is a definitive swing state in the US Presidential election. In every election since 1996 (five election cycles ago), the candidate who captured Florida’s electoral votes became the next US President. Assuming a strong correlation between campaign contributions and election results and having perfect hindsight of knowing the 2012 Presidential election … [Read more…]

MongoDB & pymongo: Tutorial

In this post I’ll pretend that I am teaching a data science course on collecting, cleaning, storing, and updating data. Rather than pretending my students are computer scientists or software developers, I’ll pretend that they are business analysts or college grads going on to become business analysts. However, I’ll assume they know some programming (Python), … [Read more…]

MongoDB & pymongo: Step by Step

As I ventured into Lesson 4 of Udacity’s Data Wrangling with MongoDB, I really wanted to run the first script (inserting a record into the database) locally. I feel like I really damaged the sanctity of my files by installing, uninstalling, messing with permissions, etc. for hours in all different locations in my … [Read more…]

Worksheet for Udacity’s Intro Statistics Courses

I’ve created an ‘in-progress’ Google spreadsheet for many of the exercises and examples in Udacity’s Intro Statistics courses (Intro to Descriptive Statistics & Intro to Statistical Inference); the link is below. When I used to teach finance, I taught class with spreadsheets like this one rather than with PowerPoint. Preparing this kind of document … [Read more…]

Pleasantly Distracted by Google’s Foobar Challenges

One day late last week, a Google search for ‘itertools’ resulted in my Chrome browser caving in to reveal a hidden message. It said ‘You’re speaking our language. Up for a challenge?’ But of course! I’ve since read a few articles about what this is (a Business Insider article and a Reddit thread) and … [Read more…]

What is the Most Harmful Storm in the US?

The National Oceanic and Atmospheric Administration (NOAA) regularly publishes data on storm occurrences in the US. It makes annual data available dating back to 1950, including time-series, geographic proximity, and financial destruction information as well as storm characteristics (event type, width of tornado, wind gust estimates, etc.). While you could do thousands of … [Read more…]

Fixing Excel’s Sci Not Faux Pas with R

I encountered what I think is a pretty common Excel problem at work today. A colleague showed me an Excel spreadsheet that was reading warehouse locations as scientific notation. For example, location 05E03 was being read into Excel as 5.00E+03, and if you tried to edit the cell or convert it to text, you’d be given … [Read more…]