Selected posts

Last time, we used a minimalist parser combinator library to build a parser for an oddly familiar language called OBAN. The problem with our previous parser is that it produces extremely unhelpful error messages. This is probably fine for a parser which runs as part of an automated toolchain and processes almost-always-valid input, but is completely unacceptable for a user-facing tool.

We’ll address this, while making only minimal changes to the parser’s structure, by tweaking the “base monad” on which it's built. In other words, we’ll change what it means to chain parsers together.

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.
–Jamie Zawinski

The fundamental problem with regular expressions is that they only recognize regular languages. This sounds, and is, tautological, but it has huge implications.

A Dialogue


SOCRATES: What is “machine learning”?

It’s important, as a rule of thumb, when operating or investing in a firm which produces a physical commodity, to have the ability to reliably quantify the expected future production of the commodity given the firm’s assets. In the oil and gas industry, we have many different ways to forecast future production from an oil or gas well. Some rely on detailed measurements and explicitly incorporate detailed mathematical models of flow physics. Others use whatever historical data we can scrape together and a bit of curve-fitting.

Today, we’re talking about the second kind.

A couple weekends ago, I found myself with the desire to fetch oil and gas production data for a specific county in New Mexico from the New Mexico Oil Conservation Division (OCD).

Fortunately, the OCD provides access to historical well production via an FTP server. The OCD doesn’t seem to provide a way to query a limited time- or area-based subset of production history data, so we’re stuck with a single ZIP file for “all of New Mexico since the dawn of time”. The result is a whopping 712MB ZIP file.

Here’s where I knew I was in trouble: the only thing inside was a single 38 GB file called wcproduction.xml.

If you wish to make an apple pie from scratch, you must first create the universe.
–Carl Sagan

Many clients, friends, acquaintances, and (in the Before Times) strangers in bars have asked me over the years: what do I think about “business intelligence” (BI) tools? These are applications which make it easy, without any custom programming required, to connect to data sources, visualize data, and create interactive analytics and “dashboards”.

What is object-oriented programming really about? What’s so special about “late binding”? And why do I have to pass self around everywhere in Python? We’ll take a meandering path in today’s post which will try to answer each of these questions, and build our own miniature object system along the way.

Oh, R. I can’t tell you why “data scientists” have switched, for the same reason I can’t tell you how Santa’s reindeer achieve lift: I simply don’t believe in them. But, I can tell you why I no longer use R for new projects.

A recent discussion on the SPE discussion board inspired me to jot down some thoughts on how a young (or experienced!) petroleum engineer—or for that matter any other technically skilled non-programmer—might best engage with “learning to code”.