The metrics layer is dead
Long live the metrics layer.
If you don't smell war in the data space, you haven't been paying attention. Forces are amassing along political boundaries, and everything is centering around a tiny yet strategic slice of the modern data stack: the metrics layer, i.e. the tool within which metrics are defined and disseminated. The two key players: dbt and Google (Looker).
Before we dive in, why should you care? Well, the outcome of The Great Metrics War is going to be pivotal for us as data practitioners. Whoever wins here is not only going to set the standard for how metrics are defined and interfaced with, but also define the zeitgeist of data science and analytics over the next decade, owing to the simple fact that there are a lot of loops connected to the metrics layer. If you own the pipes, you control where they flow. A metrics layer can easily nudge users toward particular downstream tools, routing momentum in a way consistent with its owner's priorities and vision. In the case of Google, we know how this’ll play out: we can see the bundling already happening, with clear pushes towards Looker Studio. For dbt, one might imagine the world would stay a bit more open. So at the end of the day, will analytics infrastructure be open or closed? Will the virtues of bundling finally win out, or will the modern data stack prevail in all its Florentine (albeit vendor-addled) glory?
But I’m getting ahead of myself. Let's review what’s happened and make some wild guesses as to what will happen next.
The last two years: dbt vs. Looker
Let's take a walk through the last two years in data. In 2021, the metrics layer became the next hot thing for the modern data stack. Companies like Airbnb and Lyft had already proven the value prop of building a centralized engine where metrics could be defined, calculated, and universally accessed. All that was left was for it to be properly commoditized. VCs latched on, and we, the data community, eagerly anticipated the arrival of something we hoped might be as groundbreaking as dbt. Startups made their announcements, practitioners drooled.
But then the game changed.
Oct 12, 2021: Google pivots Looker.
“It's just LookML now, really, and it’s a metrics layer.”
Oct 15, 2021: dbt says they're building one too.
And so two of the biggest names in the data space decided they could own metrics: dbt and Google (Looker).
It made a lot of sense, of course. LookML was already a metrics layer in some sense: if you squint, measures in Looker are metrics. They were just split-brained between their semantic layer and BI. So they killed the BI part.
dbt, on the other hand, already owned transformations, and the extension to metrics was natural. And as the darling of the data world, they likely knew they'd be able to leverage their goodwill to force the creation of a new standard.
And so the scale of the battle instantly changed.
What was a game of territories now became a game of thrones. Google and dbt decided they were willing to take on any lingering PMF risk. The new game: who’ll become the standard? Looker, with the whole force of Google's ecosystem (and enterprise mindshare, probably) on its back? Or dbt, with the grassroots world of data rallying behind it?
2022: The battle for the standard.
Fast forward one year…
Oct 11, 2022: Google announces a bunch of metrics layer integrations.
Oct 18, 2022: dbt announces a bunch of metrics layer integrations.
It finally happened: both sides scrambled to establish a standard.
Google went after BI. Data Studio (an already thriving BI offering) became Looker Studio, with semantics natively accessible within. They then announced integrations with Google Sheets and BI behemoths Tableau and PowerBI.
dbt went after ubiquity. They built up partnerships with a bunch of different tools (notebooks, monitoring tools, data discovery tools, ETL tools). They've reminded the world that they have the weight of everyone in the modern data stack behind them.
The only question left: what’s going to happen when the dust settles?
To be completely fair with my predictions here, I would ideally dive into the core differences between dbt and Looker, but I’ll leave that for another post. In short, Looker’s key advantage is that it’s more built out in terms of raw functionality. Looker is a full semantic layer that lets you replicate almost everything you would want to do in SQL, while dbt only lets you define metrics (at least as of the time I’m writing this article).
But dbt’s strength is that they seem to be hyper-focused on building something that directly solves the problems that [I presume] need to be solved by a metrics layer — version control, basic exploration, SSOT. Metrics are an elevated first-class citizen in dbt in a way that they aren’t in Looker. It’s a stronger wedge for a particular category of use cases.
With that in mind, here’s how I imagine this might play out:
Why dbt wins
Their metrics wedge wins. It directly solves the primary pain points the world wants solved.
Looker integrations are botched, in proper Google fashion.
Why Google wins
Looker users + bundling with the Google ecosystem drive adoption forward faster than dbt can build. It’s a smaller hop from Looker measures → metrics than from 0 → dbt metrics, and it turns out to be not as much cognitive load as you might expect.
Faster time to market. While dbt is the crowd favorite, it’s tough to go up against a more feature-rich competitor, and dbt metrics v1 just missed too many essential things.
Looker dramatically alters their cost structure and takes over as the standard.
Why somebody else (Transform?) wins
The value of metrics only makes sense at scale. dbt and Looker are both notoriously cumbersome for cross-team collaboration. A team that's been there / done that could anticipate this and build a more robust at-scale engine from the ground up.
Something, something data mesh.
The metrics layer from 2021 is dead. Google and dbt have broken the startups, annexed their syntaxes. But long live the new metrics (semantic, sure) layer. It's hard to tell what's in store for us, but the next few years in data are going to be titillating. Competition is better for consumers, after all. And I for one can’t wait for all the semantic goodies headed our way.
Thanks for reading Win With Data! Subscribe for free to receive new posts.
I’m going to use metrics layer and semantic layer interchangeably, until I don’t.
Though I find it really interesting that they insist on this product shape while rebranding as a “semantic layer”.
I can’t be sure but iirc dbt’s syntax looks really suspiciously like Supergrain’s did before they pivoted.
dbt Metrics are so immature. I think dbt's definition of a metric shows the constraints of their approach...
dbt Metrics definition:
A metric is a timeseries aggregation over a table that supports zero or more dimensions.
The need for a time dimension and for aggregation over a (single) table is very limiting.
I recommend checking the following article: https://medium.com/gooddata-developers/gooddata-and-dbt-metrics-aa8edd3da4e3
The article compares GoodData metrics with dbt metrics, and it is really interesting to see the key differences.
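To make the definition quoted in the comment above concrete, here is a minimal Python sketch of "a timeseries aggregation over a table that supports zero or more dimensions". It is not dbt's API or syntax: the `Metric` class, the `evaluate` function, the fixed monthly grain, and the `orders` sample table are all hypothetical names and choices made up for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date
from typing import Callable

# Hypothetical illustration only; none of these names come from dbt.
AGGREGATIONS: dict[str, Callable[[list[float]], float]] = {
    "sum": sum,
    "count": len,
    "average": lambda values: sum(values) / len(values),
}


@dataclass(frozen=True)
class Metric:
    name: str
    time_column: str                   # the required time dimension
    value_column: str                  # the column being aggregated
    aggregation: str                   # a key of AGGREGATIONS
    dimensions: tuple[str, ...] = ()   # zero or more allowed dimensions


def evaluate(metric: Metric, rows: list[dict], group_by: tuple[str, ...] = ()) -> dict:
    """Aggregate one table by month, optionally split by declared dimensions."""
    unknown = set(group_by) - set(metric.dimensions)
    if unknown:
        raise ValueError(f"{metric.name} does not declare dimensions: {sorted(unknown)}")

    buckets: dict[tuple, list[float]] = defaultdict(list)
    for row in rows:
        # Assumption: the grain is fixed to calendar months for simplicity.
        period = row[metric.time_column].replace(day=1)
        key = (period, *(row[dim] for dim in group_by))
        buckets[key].append(row[metric.value_column])

    aggregate = AGGREGATIONS[metric.aggregation]
    return {key: aggregate(values) for key, values in buckets.items()}


# A single made-up table, which is all this metric shape can read from.
orders = [
    {"ordered_at": date(2022, 1, 5), "amount": 100.0, "country": "US"},
    {"ordered_at": date(2022, 1, 20), "amount": 50.0, "country": "DE"},
    {"ordered_at": date(2022, 2, 3), "amount": 70.0, "country": "US"},
]

revenue = Metric("revenue", "ordered_at", "amount", "sum", dimensions=("country",))

monthly_revenue = evaluate(revenue, orders)
monthly_revenue_by_country = evaluate(revenue, orders, group_by=("country",))
```

The sketch also shows the limitation the commenter points at: every result is keyed by a time period, and `evaluate` only ever sees one table, so a metric that needs a join or has no time dimension does not fit this shape.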
An interesting topic that, to me, is nothing new. As a BI engineer, it is something I have worked on since I started. The only difference is that it has now been ripped out of the BI tool and given additional features. Interesting to see the Evolution of the Semantic Layer 📈 :
1991: SAP BusinessObjects Universe and BI semantic layer
2008: Master Data Management (MDM) (with MDS from Microsoft in 2008)
2013: Kimball discussed the concept of a semantic layer in #158 Making Sense of the Semantic Layer
2016: Maturing BI tools with an integrated semantic layer such as Tableau, TARGIT, PowerBI, Apache Superset, etc. have their own metrics layer definition
2018: Jinja templates and dbt eroding the transformation layer into a semantic layer
2019: Looker and LookML popularized as the first real semantic layer
2022: Modern Semantic Layer, Metric Layer or Headless BI tools such as MetriQL, MetricFlow, Minerva, dbt arose with the explosion of data tools (BI tools, notebooks, spreadsheets, machine learning models, data apps, reverse ETL, …)
More on https://airbyte.com/blog/the-rise-of-the-semantic-layer-metrics-on-the-fly in case of interest.