The New Model for Data Ops: Q&A with Rohit Choudhary, Founder & CEO at Acceldata
How have data and analytics evolved recently? What have been some of the most significant developments?
The shift to distributed data processing has been one of the biggest changes over the past decade. Building on these technological advances, massively successful companies based on open source have emerged, such as Databricks, Cloudera, and Confluent. Cloud providers now offer purpose-built databases for specific use cases, and enterprises are adopting them.
Data has also become much more operational. In the past, data was primarily used to populate monthly or quarterly reports to give people a snapshot of their businesses at a given point in time. Now, data powers a variety of real-time use cases that are core to business operations, and that is putting extra pressure on data teams to ensure the quality of data.
The time for data democratization is now, especially when you consider what lies ahead. Within five years, everyone will have the same access to technology via the cloud, including everything from compute power to analytical and algorithmic capabilities. The whole curve is being flattened, which means data teams need the right capabilities to manage the technology they adopt.
What should enterprises focus on to gain ground on data and analytics modernization initiatives?
Start by establishing a clear understanding of your key business use cases. Do you need to put data to operational use? What types of insights do analytical users need to perform their jobs? Do you plan to monetize your data? Consider how these use cases might evolve over the next 18 to 36 months.
Once you’ve decided to invest in a particular use case, it’s important to find the best technology with the largest community and the fastest-developing ecosystem. You also need to plan for unexpected operational concerns, which is why you should start thinking about data observability sooner rather than later.
How can businesses better plan and execute self-service analytics and data democratization for users who actually need it?
Select technology that fits your user community and is widely applicable. Avoid overly complex solutions when something simple will work.
Leverage a data lake or data warehouse architecture to centralize as much data as possible, avoid data silos, and power more use cases with less complexity and administrative effort.
Why do data teams need data observability?
New data pipelines and databases continue to come into existence, further increasing data volumes and the number of supported use cases. Modern data systems are getting more complex. At the same time, data systems are opaque, so you can’t get the necessary visibility into what’s happening.
When something breaks or goes wrong, data engineers don’t have the context to understand the issue. That lack of context has a compounding effect, placing even more pressure on data teams to monitor everything effectively. The result is a great deal of work that, without the right approach, can quickly erode data engineering productivity and the success of your data teams.
Data observability gives you end-to-end visibility into the health of your data and pipelines, along with the context to understand why things break or fail.
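To make the idea of data and pipeline health a little more concrete, here is a minimal sketch of two checks often associated with data observability: freshness (did the data arrive recently?) and volume (does the row count look normal?). It is an illustration only, not Acceldata's product or API. The names `BatchStats` and `check_health`, and the thresholds used, are hypothetical assumptions.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class BatchStats:
    """Hypothetical metadata recorded for one pipeline run."""
    table: str
    loaded_at: datetime
    row_count: int


def check_health(
    latest: BatchStats,
    history: List[int],
    now: datetime,
    max_age: timedelta = timedelta(hours=1),  # assumed freshness threshold
    tolerance: float = 0.5,                   # assumed allowed volume deviation
) -> List[str]:
    """Return a list of human-readable issues for the latest batch."""
    issues = []

    # Freshness: flag the table if the last load is older than max_age.
    age = now - latest.loaded_at
    if age > max_age:
        issues.append(f"{latest.table}: stale, last load was {age} ago")

    # Volume: compare the latest row count with the average of past runs.
    if history:
        baseline = sum(history) / len(history)
        if baseline > 0:
            deviation = abs(latest.row_count - baseline) / baseline
            if deviation > tolerance:
                issues.append(
                    f"{latest.table}: row count {latest.row_count} "
                    f"deviates from baseline {baseline:.0f}"
                )
    return issues


if __name__ == "__main__":
    now = datetime(2024, 1, 1, 12, 0)
    latest = BatchStats("orders", datetime(2024, 1, 1, 9, 0), 100)
    for issue in check_health(latest, [1000, 1100, 900], now):
        print(issue)
```

In practice, checks like these would run against many tables and pipelines, and the context attached to each issue (which run, which upstream source) is what helps explain why something failed.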
Developer productivity is the hidden key to data success. How do you increase productivity? By keeping developers focused on business problems rather than on operational issues related to compute, data quality, or data pipelines. Data observability covers the surface area of the technology you’re implementing, saving developers time and effort and increasing their productivity.
Data observability also provides a common vocabulary to align your data science and analytics, operations, and engineering groups. At Acceldata, we accomplish this by providing a single pane of glass for data teams to manage their respective concerns while ensuring data is reliable, scalable, and optimized.
How do inefficiencies in the end-to-end orchestration of analytics pipelines affect organizations’ data scientists and engineers?
Data’s journey begins when it is generated by your ERP and other source systems. The journey continues as data is ingested into your data warehouse or data lake. It’s then transformed into something that can be consumed for self-service analytics, ML models, and recommendation engines.
As with any journey, there can be unexpected detours and roadblocks along the way. Delays are particularly disruptive for data scientists who are waiting to apply value-added algorithms to fresh data sets. They also frustrate data engineers who are responsible for ensuring that processes complete as expected. In either case, it would be nice to know what went wrong, and why.
Data observability provides end-to-end visibility across the entire data journey.
Can you share an insightful use case for Acceldata?
Mobile payment app PhonePe (a division of Walmart) handles 400 million cash transactions per month. The company needed a better way to distinguish between infrastructure, seasonal, and campaign-based anomalies. PhonePe began using Acceldata to monitor HBase, Spark, and Kafka, which led to improved engineering productivity and $5 million in annual savings on software licensing costs.
According to Burzin Engineer, Founder & Chief Reliability Officer at PhonePe, “Acceldata supports our hypergrowth and helps us manage one of the world’s largest instant payment systems. PhonePe’s biggest-ever data infrastructure initiative would never have been possible without Acceldata.”
What are the key trends driving the growth in data observability?
Data volume is the main driver. Enterprises are overwhelmed by the amount of data and use cases. Some organizations collect more data in a single week than they used to collect in an entire year. If you’re on-prem, you have to keep adding resources to avoid falling behind. If you’re in the cloud, cost becomes a big factor.
And there simply aren’t enough engineers to support all of this growth. That’s why the top job in the US is data scientist / data engineer.
How do C-suite executives leverage data to deliver business value to their organizations?
Data becomes a competitive advantage when effectively utilized. C-suite executives at organizations — both large and small — are increasingly aware of this reality, which further accelerates data’s rapid growth. Data-driven dashboards and reports help C-suites and their teams understand current performance, identify market opportunities, and innovate faster. Leveraging real-time data to automate everything from manufacturing processes to online shopping experiences elevates performance at a lower cost.
Looking at the immediate future, what are you excited about? What is your larger vision for big data and data observability?
More companies need to be successful with their data initiatives — not just a handful of large, Internet-focused companies. At Acceldata, we’re trying to level the playing field through data observability. Our vision is to increase data efficiency for 95% of the companies on Forbes’ Global 2000 list.
What advice would you give companies that are at the beginning of their digital transformation?
Think long term. Think systems. Think interconnectivity. And, think data.
After all, data is intertwined with your digital transformation strategy.
How can one learn more about data observability?
This article was repurposed from “The Time For Data Democratisation Is Now.”