But wait - haven’t Modern Data Stacks been around since 2016? How is this still new?
So everyone uses a Modern Data Stack now, right?
Not even remotely.
While building infrastructure to handle data is of the utmost importance (at least, in our opinion), it can seem like a big and expensive project. Plenty of organizations have other priorities, and because maintaining a data lake is relatively easy and cheap, creating an MDS falls to the bottom of the to-do list. As a result, data scientists and analysts are left to figure out solutions on their own.
At Pickaxe, we’ve met many clients who were still using Excel to manage their data at the beginning of our working relationship. And some of these are large corporations working across multiple verticals. They’d been hoarding data in a big swamp for over a decade and hadn’t begun looking into it at all, mostly for cost reasons. They’re just now beginning to invest in this technology. Even though it seems passé to us, it’s brand new to them.
In our experience, once we introduce the concept of a Modern Data Stack and begin building warehouse infrastructure, our clients are thrilled. They find that it solves problems they didn’t even know were problems. And it’s a massive improvement over what they were doing before.
Does “Modern Data Stack” still mean anything?
As time has gone on – especially in the last 12-18 months – many of the third-party tools commonly used in Modern Data Stacks have been acquired or copied by big companies. The term “Modern Data Stack” is cropping up in contexts that may seem irrelevant to experts in the subject.
The ubiquity of the term might make it seem like MDS has become a buzzword rather than a salient concept, which could be an indication that the Modern Data Stack is dying.
However, the ubiquity of the term “Modern Data Stack” could also imply that the concept is becoming mainstream. And backing from big companies means tools can be souped up with more functionality – for example, observability, lineage, and orchestration can be built into a tool, because there are more resources behind it.
Is the Modern Data Stack perfect? Definitely not. However, as data scientists who closely follow the development of these tools, we naturally tend to assume that they’re more prevalent than they are.
It’s easy to hyperfocus on the next thing when you’ve got deep industry knowledge and you know what’s happening on the bleeding edge of data. But outside of our specialized circle, most people haven’t put an MDS into practice, and many don’t even know what it is.
Before discarding the Modern Data Stack to move to the next thing, let’s try to put it into practice fully, and then think about what we can do to improve.