Surviving the Energy (To “Coding”) Transition
It’s (finally) 2021, and the flood of oil and gas professionals interested in software development and data science shows no signs of slowing. I think that’s great. 2020 was a hard year for all of us, and especially so in the oil and gas industry. Climate change, the “energy transition”, and a worldwide pandemic (on top of the multi-year bender of shale drilling in the US) have all made for an uncertain future, and experts in other quantitative fields certainly have an advantage when it comes to learning and applying these skills. Whether you view building programming and data science skills as a hedge against uncertainty, a “force multiplier” for your existing work, an escape route from a contracting industry, a great learning opportunity during the COVID slowdown, or just a stimulating intellectual puzzle, you’re not wrong!
Along with this flood of interest, though, come a few things that aren’t so great. Where there’s excitement and money, there are sure to be grifts and get-rich-quick schemes, and the “oil-and-gas to coding” boom is no exception. I’d like to share some thoughts, and maybe a little unsolicited advice, based on my experience in the grey space between “engineering” and “tech” (and perhaps motivated by the record amount of time I spent drifting through LinkedIn feeds in late 2020). I’m going to caveat this advice with a personal note on goals and desires. I’ve run a consulting and software development business since 2013, after working as an engineer in oil and gas. Last year we grew from a one-person operation to a two-person operation. At no point has my goal been to raise outside funding, “growth hack”, etc., so if your dream is to go down the Silicon Valley-style VC-backed startup route, you may well find you need an entirely different mindset. I’d also rather gnaw my own arm off than spend the next ten years optimizing click-through rates on ads for erectile dysfunction pills displayed above videos about how lizard people rigged the last election, so if your dream is to join a “FAANG”, you’re on your own.
There is no Royal Road to Geometry, nor Anything Else
In a time of fear and uncertainty, nothing sells like the quick fix. There’s no shortage of shallow “content” out there on programming and data science: emoji polls, funny GIFs (if this were that sort of thing, you’d see Lady Gaga and Bradley Cooper shortly), feel-good platitudes, empty “empowerment”. At least that stuff is free (if, as the old saying about Linux goes, your time is worth nothing).
I also see lots of oil and gas professionals turning to coding or data science “bootcamps”, or going back to school for a master’s program. The bootcamp business model frightens me: I see extraordinary tuition costs, curricula designed by committee to produce completely fungible entry-level programmers, and extremely sketchy contracts. To elaborate on that last point, bootcamps often require income sharing arrangements where the student may owe a significant percentage of their salary for the first years of their post-bootcamp career to the bootcamp, or offer a tuition refund to students who aren’t able to find a programming job after the bootcamp, citing phenomenal job placement numbers—with the catch that those numbers are inflated by hiring bootcamp graduates immediately as bootcamp instructors. (Imagine the impact on teaching quality from that particular gambit.) And speaking of teaching quality, the impression I get is that many bootcamps, even with quality instructors, have such high student-to-instructor ratios that instruction boils down mostly to reading assignments and independent or team project work with minimal lecture or instructor-guided work time.
The worst-kept secret in this whole no-longer-cottage industry is that everything you need to learn can be had from at most a few hundred dollars’ worth of textbooks (and likely for free on the Internet). The most common argument I hear from people who know this but nonetheless decide to pursue a “paid” learning model is that they need the structure provided by the bootcamp or degree program. I am not at all convinced that the bootcamp model provides enough of this structure to be worth the cost (and the sometimes unconscionable contract). It’s possible that a master’s program is a good option, but I wouldn’t pursue it unless my goal were to specialize in a technical discipline and I had particular instructors in mind whom I wanted to work with directly.
The truth is that, as Euclid said to Ptolemy I, “there is no royal road to geometry”. It doesn’t matter how much money or desire you throw at the problem: what you really need is the discipline to work through challenging topics, and engage in consistent and deliberate practice. You do also need some structure, or at least a plan: what do you want to learn? How are you going to get there? For a given focus area, this plan might be provided by a good introductory book, by your own research, or by one of the growing number of university courses whose course material is freely available online.
A good mentor, or mentors, can also be a great help. Potential mentors are everywhere, but the good ones are in demand—anyone offering open-ended free mentoring to all and sundry may have too much time on their hands. That said, remember that sometimes the best mentors are peers just one or two steps ahead, and that everyone has different expertise and experience. A friendly peer who recently made a similar career change might be a great source of guidance on job hunting and interview preparation—but maybe not so much for providing rigorous code review or critical feedback on your machine learning projects. The old joke is true, too: sometimes the fastest way to learn how to do something the right way is to post the wrong way on the Internet. StackOverflow and its ilk are the obvious suspects for most beginners, but copy-and-paste programming can be a fast track to more problems: focus instead on understanding why a proposed solution is (or isn’t!) correct. Read, but don’t obsess, about “How to Ask Questions the Smart Way”, and consider hanging out on more interactive channels. The “kids these days” have all kinds of Slacks, Discords, and other negative-valence abstract nouns, but good old IRC is still my favorite. The Freenode network has fantastic channels for all kinds of programming topics, including newbie-friendly ones (search for channels with “beginner” in the name or description).
Above all else, practice! Write code, train models, and build “muscle memory” for the basics. The goal isn’t memorization of trivia (although you’ll find that just naturally happens) but to reach the point that your mental energy can be directed at the essential complexity of the problem you are solving (How can I predict these outcomes? How can I ensure that this constraint holds?), rather than the accidental complexity of the tool you are using to solve it (How do I launch an editor? How do I compile and run a program? How do I iterate over a data set?).
Sometimes “Impostor Syndrome” is Your Body Telling You to Read a Book
While other phrases come to mind for 2020 (“spike protein”…), the phrase “impostor syndrome” might be one of the most iconic buzzwords of the last few years in the technology industry. It’s a feeling I think everyone can relate to: comparing your own inner state of uncertainty, doubt, and “not knowing” to the polished and confident exterior presentation of your peers, and wondering whether you truly belong in [insert industry, discipline, organization, or barbeque cookoff here]. Social media is essentially a pressure cooker for this kind of insecurity, and as historically underrepresented groups grow in representation in technology fields, it’s little wonder that a lot of people feel a distinct lack of confidence about their place in the industry.
I think fighting against that feeling is absolutely necessary and noble: many people never realize just how competent they are compared to peers who are more adept at self-promotion. Indeed, the only people who never feel that way are exactly the kind of Dunning-Kruger casualties whose glib confidence coupled with incompetence leads to ruin. But therein lies the rub: the cure for impostor syndrome isn’t empty affirmations, it’s building (and recognizing!) an authentic competence and the confidence that goes with it. Sometimes the feeling of uncertainty and doubt is your brain, finely honed by 4.5 billion years of evolution, telling you to gather more information before proceeding!
I bring this up because I’ve seen some alarming posts around LinkedIn lately, to the tune of “don’t worry about what qualifications you have, start calling yourself a data scientist today”. Well, to be fair, “data science” is a poorly-defined field. But imagine that someone were to give aspiring surgeons the same advice! Impostor syndrome is a pernicious foe, and we need to do better at helping new programmers and data scientists build and recognize their own authentic competence. But this does not mean that we should fling the doors wide open to actual impostors! (If this article were to appear on Towards Data Science, there would be an animated GIF of popular children’s video game “Among Us” here.)
Don’t Speculate When You Can Investigate
Free and open-source software has won the revolution in the fields of software development tools, data science, and machine learning so thoroughly that for many of us even trying to perceive its reach is like a fish trying to comprehend “water”. This has many positive consequences, but there’s one I feel is underappreciated by many new programmers and data scientists, and is intimately linked to one of the most powerful techniques I know for overcoming “impostor syndrome” and the feeling of “not-knowing”.
That consequence is this: when faced with a question about our tools, we need never speculate, only investigate! Questions like “are for loops faster or slower than list comprehensions in Python, and why?” need not be open-ended invitations to quasi-theological speculation (any answer that begins “Python is interpreted line by line…” should be treated the same way as arguments about how many angels can dance on the head of a pin!). Instead, we can just dig in and investigate, like scientists and engineers. For example, to answer that question, we might first profile our code to answer the empirical question (which is faster, or is it a wash?). We might observe that list comprehensions are indeed faster, especially for trivial transformations on long sequences. Then, we might begin to answer the “why” question by researching and understanding Python’s execution model. Python is compiled (well, let’s assume we’re using CPython) to bytecode and then executed by an interpreter. We can use built-in tools (the dis module) to inspect the bytecode generated for both cases and observe that we do indeed use fewer function call operations in the list comprehension case, since the generated code uses the LIST_APPEND bytecode instruction directly rather than the CALL_FUNCTION instruction on the append function. If we wanted to assure ourselves of why LIST_APPEND is faster, we could even dig into the source code for Python itself!
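Here is a minimal sketch of that investigation, assuming CPython 3.8 or later. The helper names (with_loop, with_comprehension, opnames, compare) are made up for illustration, and the exact opcode names depend on the interpreter version: what older releases call CALL_FUNCTION shows up as CALL_METHOD or CALL in newer ones. Treat the timings as something to measure on your own machine, not as a result to take on faith.

```python
import dis
import timeit


def with_loop(xs):
    """Build a list with an explicit for loop and list.append."""
    out = []
    for x in xs:
        out.append(x * 2)
    return out


def with_comprehension(xs):
    """Build the same list with a list comprehension."""
    return [x * 2 for x in xs]


def opnames(code):
    """Collect bytecode instruction names, including nested code objects.

    Before CPython 3.12 a comprehension compiles to its own nested code
    object, so we recurse into co_consts to find its instructions.
    """
    names = [ins.opname for ins in dis.get_instructions(code)]
    for const in code.co_consts:
        if hasattr(const, "co_code"):
            names.extend(opnames(const))
    return names


def compare(n=1_000, repeats=200):
    """Time both versions; returns (loop_seconds, comprehension_seconds)."""
    data = list(range(n))
    loop_t = timeit.timeit(lambda: with_loop(data), number=repeats)
    comp_t = timeit.timeit(lambda: with_comprehension(data), number=repeats)
    return loop_t, comp_t


if __name__ == "__main__":
    # Step 1: the empirical question (which is faster, or is it a wash?).
    loop_t, comp_t = compare()
    print(f"for loop:      {loop_t:.4f} s")
    print(f"comprehension: {comp_t:.4f} s")

    # Step 2: the "why" question. Look for a call instruction on append in
    # the first listing and for LIST_APPEND in the second.
    dis.dis(with_loop)
    dis.dis(with_comprehension)
```

Running this as a script prints the two timings followed by the two bytecode listings. The opnames helper is handy if you would rather check for an instruction programmatically, for example "LIST_APPEND" in opnames(with_comprehension.__code__).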
It’s this last step that’s a unique and underappreciated property of the open-source ecosystem. One step at a time, we can answer any question about our tools. Sometimes this means keeping a deep “recursive call stack” of side references, but there’s no better way to learn and build a sense of true competence than to get your hands dirty on an open-ended problem and emerge with a solution (and a new understanding).
You Can Do It! (If You Want To.)
…I don’t believe in the idea that there are a few peculiar people capable of understanding math and the rest of the world is normal. Math is a human discovery, and it’s no more complicated than humans can understand. I had a calculus book once that said, “What one fool can do, another fool can.”
It’s a great time, maybe the best time in the short history of computer technology to date, to learn programming and data science. You’re surrounded by an incredible wealth of free tools and resources, and the excitement about these disciplines in society at large is palpable. But with that wealth comes danger: snake oil salesmen and the “quick fix”. The answer is the same as it always was.
Bonus: FUQ (Frequently Unasked Questions)
- What are some warning signs that the [course/degree program/Medium article] I am learning from is a total waste of time?
  - Every program you write takes the form of a Jupyter notebook.
  - Strong opinions on the superiority of [R/Python] over [R/Python].
  - Every program you write contains the line import pandas as pd.
  - What’s “overfitting”?
  - What’s a “type”?
- What technologies aren’t going to revolutionize life as we know it and instantly make everything I’ve learned obsolete?
  - NoSQL databases
  - “Robotic process automation”
  - Yet another JIT-to-GPU array library for Python
  - Cold fusion
- What are the most important Python libraries for me to learn?
  - the standard library
- What are the most important R libraries for me to learn?
  - reticulate (joke! Or not.)
- Are recruiters actually impressed by my GitHub with 14,000 forks of well-known open-source projects and no original code?
  - Hell if I know! Surely someone must be, or this strategy wouldn’t have been converged upon by every newly minted “data scientist” on the planet, right?