MongoDB & pymongo: Tutorial

In this post I’ll pretend that I am teaching a data science course on collecting, cleaning, storing, and updating data. Rather than pretending my students are computer scientists or software developers, I’ll pretend that they are business analysts or college grads going on to become business analysts. However, I’ll assume they know some programming (Python), …

MongoDB & pymongo: Step by Step

As I ventured into Lesson 4 in Udacity’s Data Wrangling with MongoDB, I really wanted to run the first script (inserting a record into the database) locally. I feel like I really damaged the sanctity of my files by installing, uninstalling, messing with permissions, etc. for hours in all different locations in my …

Worksheet for Udacity’s Intro Statistics Courses

I’ve created an ‘in-progress’ Google spreadsheet for a lot of the exercises and examples in Udacity’s Intro Statistics courses (Intro to Descriptive Statistics & Intro to Statistical Inference); the link is below. When I used to teach finance, I taught class with spreadsheets like this one rather than with PowerPoint. Preparing this kind of document …

Pleasantly Distracted by Google’s Foobar Challenges

A Google search for ‘itertools’ one day late last week resulted in my Chrome browser caving in to reveal a hidden message. It said ‘You’re speaking our language. Up for a challenge?’ But of course! I’ve since read a few articles about what this is (a Business Insider article and a Reddit thread) and …

What is the Most Harmful Storm in the US?

The National Oceanic and Atmospheric Administration (NOAA) regularly publishes data on storm occurrences in the US. They make available annual data dating back to 1950 and it includes time-series, geographic proximity, and financial destruction information as well as storm characteristics (event type, width of tornado, wind gust estimates, etc.). While you could do thousands of …

Fixing Excel’s Sci Not Faux Pas with R

Encountered what I think is a pretty common Excel problem at work today. A colleague showed me an Excel spreadsheet that was reading warehouse locations as scientific notation. For example, location 05E03 was being read into Excel as 5.00E+03, and if you tried to edit the cell or convert it to text, you’d be given …
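
To see why a code like 05E03 gets mangled, here is a minimal Python sketch (not the R fix from the post). It assumes a made-up two-row CSV and a hypothetical `coerce` helper that mimics eager number parsing; neither comes from the original spreadsheet.

```python
import csv
import io

# Hypothetical sample data: warehouse locations that happen to look numeric.
raw = "location,qty\n05E03,12\n11A07,4\n"


def coerce(value):
    """Mimic a tool that treats anything number-like as a number."""
    try:
        return float(value)
    except ValueError:
        return value


rows = list(csv.DictReader(io.StringIO(raw)))

# Eager coercion: "05E03" parses as 5 x 10^3, so the original code is lost.
eager = [coerce(row["location"]) for row in rows]

# Formatting the parsed value in scientific notation gives the mangled display.
sci = f"{eager[0]:.2E}"

# Keeping the column as text preserves the codes exactly as written.
as_text = [row["location"] for row in rows]

print(eager)
print(sci)
print(as_text)
```

The point of the sketch is that the damage happens at read time: once the value has been parsed as a number, the leading zero and the letter E cannot be recovered, so the column needs to be treated as text before any parsing happens.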

Micro Time Study

Of late, I’ve been feeling accomplishment-less. So I decided to conduct a micro time study to figure out exactly how I was utilizing all of my minutes and see if that allocation was efficiently contributing to what I was trying to achieve. The first step was identifying exactly what I was trying to achieve so …

30 Miles on Foot

Last week, with a trusty Sachmel by my side, I ventured to run 30 miles on the day I turned 30 years old. In return for the effort, I gained an immense sense of satisfaction along with 25.5 miles of data collected by my Garmin Forerunner. Unfortunately, the battery only lasted 85% of the run. …