By Hannah Smalltree, Cazena
If you’re considering a Platform as a Service for big data, analytics or data science, this is no holiday wishlist: these are requirements.
The platform-as-a-service (PaaS) market is growing quickly for big data and analytics, and for good reasons. The cloud PaaS model promises more agility, less administration, big cost savings and the list goes on. So picking one out should be easy, right? Not so fast. PaaS labels are applied to many different services, and organizations report very different results.
Part of the confusion is a growing market with fuzzy boundaries. PaaS is a broad category with many flavors, from aPaaS (application PaaS) to iPaaS (integration PaaS). But analyst firm Gartner Inc. recently found that one PaaS category is growing faster than the others. The November 2017 Gartner Market Guide for Database Platform as a Service (dbPaaS) reported that the dbPaaS segment became the largest PaaS segment in 2016, surpassing application PaaS. That’s a big shift. Further, Gartner forecasts dbPaaS revenue to double over the next five years.
That incredible growth is somewhat explained by the category’s breadth and diversity, as well as the ongoing momentum in big data and analytics in general. The Gartner guide includes a list of over 25 representative vendors with varied data platform solutions, from Cazena’s fully-managed Big Data as a Service to longer listings of components from Amazon Web Services and Microsoft Azure.
The variables in big data PaaS offerings can be profound, as explored in this recent blog post about enterprise challenges with big data PaaS. A common theme reported by enterprise teams is finding out after the fact (or mid-deployment) that their PaaS is missing something very important. This can be a show-stopper with a cloud project. The age-old advice is obvious but true again: ignore labels, define your expectations and focus on those requirements.
Five Must Haves if You’re Thinking About a Big Data PaaS
- The Integration Plan. A PaaS doesn’t necessarily integrate seamlessly with the enterprise. Unless your company has the luxury of growing up in the “-as-a-service world” where all data and tools are already in the cloud, significant work can be required to deploy and manage PaaS as part of an on-premises data flow. Make sure you know what that will take, or choose a service that works with your company’s requirements.
- Turnkey, Component or End-to-End? Perhaps the biggest difference among PaaS offerings is exactly how much of the stack they cover. For example, if you’re looking to use your data platform for production analytic processing, you will need to develop a secure pipeline and processes. Otherwise, you won’t be able to move data from your various sources and have it be analytic-ready in real time. Put some time into considering how the platform will ingest data from various sources, including on-premises datacenter systems or apps in the cloud.
- How Managed? There are varying degrees of “managed” and “fully-managed” and “super-duper-managed” (OK, maybe not quite that). Get a list of what’s included in “managed” PaaS, check it twice and make sure it covers old chestnuts like: Who handles upgrades, backup and restore, and security patching? Who’s monitoring the service? Who’s looking out for your workloads? Who will be responsible for managing the cloud vs. the platform? What DevOps skills are required to maintain the platform?
- Security, Obviously. Not that you weren’t going to consider this in your PaaS evaluations, but especially if you’re with an enterprise – plan for this. Don’t assume all cloud PaaSs have some ‘standard’ level of security or automatically meet GDPR requirements. Double check that your PaaS of choice offers capabilities for security, encryption, compliance and data governance, then get someone else to check again, then call your CSO for a final evaluation.
- What’s Required for Operations? A PaaS will certainly automate many tasks, but that doesn’t mean all platform operations magically go away. There will still be things to be done. Understand those. Who will manage the pipeline, make sure the cloud components work together and manage the upgrade process? Can you talk with another customer about how it works? What skills and time will be required for operations?
PaaS, in theory, offers tremendous benefits, but the category is new and growing. Knowing your requirements, planning ahead and never taking labels at face value will always serve you well. Explore more resources about PaaS and Big Data as a Service at Cazena.com.
Bio: Hannah Smalltree is on the leadership team at Cazena, Big Data as a Service for enterprises. She’s worked for several data software companies and spent over a decade as a technology journalist, interviewing companies about their data and analytics programs.