MLB Bunt Tracker (or My First Twitter Bot)

Last July I was sitting at home, trying to think of a fun side project—something to occupy a weekend and help me procrastinate on writing Thank You notes for our wedding gifts—when I came across this Tweet:

Of course I’d known about the excellent MLB Home Run Tracker Twitter, and I realized Molly was, in all likelihood, joking, but I sort of loved this idea.

A bunt, for the uninitiated, is about as far from a home run as baseball gets. Instead of trying to hit the ball over the outfield wall, the goal is to deaden and knock down the ball. A perfect bunt may travel only a few feet and stay just inside fair territory. A really fast runner might reach base on a perfectly executed bunt, but the goal is normally for the batter to sacrifice himself in order to advance a runner already on the bases.

My grandfather was born in Poland in 1921 and never knew much about baseball, but for whatever reason, he LOVED bunts. My father likes to tell me that his father would yell at the TV for the batter to bunt even when the situation didn’t really call for it. And I get it: there are few offensive plays quicker or more exciting than the bunt.

So with all that in mind, I thought “why not build a bunt tracker Twitter bot? How hard could it be?” As it turned out, not super hard! I ended up building a proof of concept over a 3-day weekend and then refining and re-platforming a few weeks later.

I’ll give you all the gory details down below, but if you’re just looking to see the thing in action, you can follow my Bunt Tracker Bot at @MLBBunts, and you can have a peek at the current source code on my GitHub page.

How I Did It

As I approached this project, the broad strokes of what I needed to accomplish were pretty obvious: 1) Figure out when someone bunts, then 2) Tweet about it.

I took that broad overview and broke it down a bit more (because that’s what Product Managers do). I did some quick research and found the MLB Stats API, a publicly accessible REST API with real-time data (used to power the GameDay experience and other MLB applications). Suddenly the pieces of the puzzle began to fall into place; the app would run on a schedule (every minute to start) and would do the following on each run:

  1. Call the MLB API to get a list of today’s games and current game statuses.
  2. Compare today’s game date to the date of the list fetched during the prior run.
    • If the dates are the same, refresh the game/processing statuses only.
    • If the dates are different, wipe out the list of games and start fresh.
  3. Determine which games need to be processed:
    • Games that started (moved from a Not Started to an In Progress status) since the last app run.
    • Games that had In Progress status during the previous run.
  4. Loop through games, one at a time.
  5. For each game, starting at the play (atBatNumber) where we left off last time, loop through plays one at a time to determine if it was a bunt.
    • If a play was a bunt:
      • Gather the play and game details needed.
      • Compile the Tweet text using a tokenized template.
      • Call the Twitter v2 API to create a new Tweet.
  6. After the play loop, log the atBatNumber of the last play processed.
  7. If the last play processed was a game-ending play, mark the game as complete.
  8. Sleep, rinse, repeat.
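
Here is a minimal Python sketch of steps 2 through 7 above. It is an illustration, not the bot's actual source. Only `atBatNumber` and the "Not Started" / "In Progress" statuses come from the outline; every other field name (`event`, `description`, `batter`, `team`, `half`, `inning`, `game_ending`), the template wording, and the helper names are hypothetical. The bunt check (looking for the word "bunt" in the play text) is an assumed heuristic, since the detection logic isn't spelled out here. The `send_tweet` argument stands in for whatever function calls the Twitter v2 API.

```python
from string import Template

# Hypothetical tweet template; the real bot's wording and tokens may differ.
TWEET_TEMPLATE = Template(
    "$batter ($team) lays down a bunt in the $half of inning $inning. $description"
)

# Default record for a game we have not seen before (assumed shape).
NEW_GAME = {"status": "Not Started", "last_at_bat": 0, "complete": False}


def refresh_games(state, today, fetched_statuses):
    """Step 2: start fresh on a new date, otherwise only refresh statuses."""
    if state.get("date") != today:
        state = {"date": today, "games": {}}
    for game_id, status in fetched_statuses.items():
        game = state["games"].setdefault(game_id, dict(NEW_GAME))
        game["previous_status"] = game["status"]
        game["status"] = status
    return state


def games_to_process(games):
    """Step 3: ids of unfinished games in progress now or on the previous run."""
    return [
        game_id
        for game_id, game in games.items()
        if not game["complete"]
        and "In Progress" in (game["status"], game["previous_status"])
    ]


def is_bunt(play):
    """Assumed heuristic: treat any play whose text mentions 'bunt' as a bunt."""
    text = f"{play.get('event', '')} {play.get('description', '')}".lower()
    return "bunt" in text


def process_game(game, plays, send_tweet):
    """Steps 5-7: scan unprocessed plays, tweet bunts, record progress."""
    last_seen = game["last_at_bat"]
    for play in plays:
        if play["atBatNumber"] <= last_seen:
            continue  # already handled on an earlier run
        if is_bunt(play):
            send_tweet(TWEET_TEMPLATE.substitute(
                batter=play["batter"], team=play["team"], half=play["half"],
                inning=play["inning"], description=play["description"],
            ))
        game["last_at_bat"] = max(game["last_at_bat"], play["atBatNumber"])
        if play.get("game_ending"):
            game["complete"] = True
    return game


if __name__ == "__main__":
    # Made-up data standing in for the API responses.
    state = refresh_games({}, "2023-07-01", {"game-1": "In Progress"})
    plays = [{
        "atBatNumber": 1, "event": "Sac Bunt", "batter": "Some Batter",
        "team": "#SomeTeam", "half": "top", "inning": 3,
        "description": "Some Batter out on a sacrifice bunt.",
    }]
    for game_id in games_to_process(state["games"]):
        process_game(state["games"][game_id], plays, print)
```

Passing `send_tweet` in as a plain function keeps the processing logic testable without touching the network: in a dry run it can be `print` or a list's `append`, and in production it would be the function that posts the Tweet.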

I initially built the entire thing as a proof of concept using Google Apps Script. It was sort of a clunky way to do things, but it had a couple of advantages. Most importantly, I could store persistent data in Google Sheets rather than having to spin up a database just yet. That allowed me to work out all the kinks with the APIs and the processing logic before worrying about where to house the data. Another plus was that I had worked in Apps Script before and had even used it to call third-party APIs and parse the resulting payloads.

Once I’d made this initial decision, it was a matter of a few hours’ work (about 25% of it just figuring out how to detect when a play was a bunt and probably 50% spent just getting the Twitter API to authenticate) before I got the bot to send its first-ever automated Tweet:

Bunt Bot’s first automated tweet

After that I gave myself a much-deserved break and just let the thing run for a couple of weeks. Google Apps Script and its associated task scheduler are remarkably stable, and I briefly considered just leaving things there. But I had some ideas to improve performance (and Tweet content), and I decided that if I was going to touch it, I might as well find Bunt Bot a permanent home.

Through a bunch of research (read: Googling) I decided the best solution was to house the data in a MySQL database and manage all the logic and processing in Python. That sounded great, except I’d never written a single line of Python before! Luckily I’m a quick learner (and an even quicker reader of Stack Overflow), and in early September I re-launched Bunt Bot (v2) as a Google Cloud Function using Python and MySQL. I took advantage of the refactoring to improve the look of the Tweets a bit (and to leverage the team social handles and hashtags, in hopes of attracting more views):

I have some ideas for future improvements as well: Tweeting weekly and monthly stats, auto-retweeting certain accounts when they tweet about bunts, adding in more info like where and how far the ball went, etc. For now those are all in the backlog as I work on other projects. But Bunt Bot taught me a lot about the Twitter API, MLB’s data, Python, and MySQL. I’ll bring all those new skills and experience to my next job or my next impulse project, whatever that turns out to be.