

Before I condemn the IDE, I want to say that I do look back fondly on my days writing in SQL IDEs. The process was spare and brutally technical in a way that nourished my obsession with Unix fundamentalism. While I was at Airbnb, I even wrote an open source library to be able to build my CLI setup into a proper SQL IDE, data discovery and all.
But it's high time I admit something I've always known deep in my shell-script heart:
The IDE was not built for analytics.
Let's talk about why.
Note: if you're searching for the bias, I am a co-founder of Hyperquery, where we're building a notebook for analytics. You'll find that the thrust of this article pushes towards a notebook shape as the optimal solution. This is not meant to be a sales pitch, but to share an argument (and it just happens to be the motivation behind what we've built). I'll try to stay measured.
Why the IDE isn't for analytics:
analytics isn't about development.
Well, the IDE is not for analytics, by definition. The IDE is an integrated development environment. It consolidates the needs of the development workflow.
But analytics is not about development. To the extent that we use a programming language at all, it is not to develop an application, but to serve as an access and manipulation layer. And everything else that happens in analytics revolves around interpreting the resulting data payload, not hardening the code into a codebase.
Analytics is primarily about alignment, interpretation, and communication: the non-SQL behaviors that enable us to establish an interface between data and impact. While our scripting chops open the door to a world of data inaccessible to the rest of the business, our subsequent behaviors unlock the value therein.
What the solution should look like:
the notebook, where data and interpretation mix.
So what's the solution?
We don't need an integrated development environment, because analytics isn't primarily about development. We need an integrated analytics environment that addresses the needs of analytics, not just SQL. It's time we stopped co-opting an interface from another field when our needs are different.
We need a proper analytics notebook.
Why?
1. Notebooks fit the analytics workflow better.
While SQL IDEs push you towards consolidation (one, final query), notebooks push you towards exploration. And the latter is the preferred pattern for analytics: your queries are rarely ends in and of themselves. They deserve, at minimum, a line or two of explanation, contextualization, always. Notebooks are better for this.
2. Notebooks reinforce better behaviors.
While it might seem that dumpster diving into your IDE is the fastest way to get going, it's seldom the best solution: it's the fast food of analytics work. It may work in a pinch, but relying on it for the bulk of your work will only reinforce bad habits and degrade the quality of your work in the long term. Work should always be aligned and interpreted on either end of the technical work.
3. Notebooks elevate data to knowledge, and that's what we care about.
Notebooks represent knowledge, and knowledge is the currency of the business that analytics teams should peddle (not data!). SQL queries deal in data. Orientation around the thing that matters aligns all ancillary processes to it in a more coherent way. Knowledge should be organized, not data. Knowledge should be shared, not data.
Final comments
A few caveats:
There are certainly workflows where development is appropriate: building pipelines, data models, etc. But these fall within the realm of data engineering and analytics engineering. While these are often within the scope of analytics work, they are not analytics.
Some of you may be chanting "Jupyter" or its derivatives at this point, but I don't think this is the optimal solution. It's not built from first principles for analytics, meaning its shape will inevitably bear fundamental shortcomings and clumsy vestiges. But that's another post for another time.
And all that said, certainly I'm biased. We have a lot of sunk cost here. But I hope you find the original reasoning sound (and if not, let me know; there is precious little I care about more than challenging this line of reasoning).
Notebook or not, an upheaval is overdue. Not everything is a nail. We deserve a tool purpose-built for analytics, not another re-purposed development tool.
The SQL IDE should die
Just gonna jump in and pile on the hot-takes here...
First off, I don't know anything about hyperquery so I'm staying out of that part of it.
But analysis both is and isn't development.
Analysts write code, and it should be repeatable and correct. In other words, it is development. There is nothing fundamentally incompatible or suboptimal with using an IDE for this. Depending on the IDE of course.
But analysts don't ship code. Not the same way. I mean, you could set up a CI system to generate and publish a report on merges, but that's quite fanatic. I tried. It was short-lived. So we default to think "notebook" as the solution because it's code but has plots and isn't an IDE.
RStudio is an interesting case study here. It is unquestionably an IDE. I have shown it to colleagues in IT and they all agree it is an IDE. It has all the debugging features. The git integration. The testing framework integrations. But it is also definitely a tool for analysis. It displays plots. It produces reports. It publishes interactive visualizations. It can even help you write to Word if you need to.
So where does this leave us? A lot of IDEs aren't good IDEs for analysts. DataGrip (or whatever other SQL client) isn't, and afaik never attempted to be. Same with IntelliJ, even though someone has tried to duct-tape a "show plots" feature onto it. But there are IDEs for analysis. In addition to RStudio, there is DataSpell (which I haven't used in a while, and had a very "notebook with stuff glued on" feeling to it if memory serves). Most seem to be Python-first though. SQL is left behind.
I have left out any discussion of new-ish entrants like hex (and hyperquery) because I don't know enough about them. But it feels like we are at a Henry Ford kind of moment where everyone is so used to notebooks, it is the only thing they can think to ask for. The next winner in this space introduces something new that is neither a notebook nor IntelliJ. But they will have to teach people how to use it.