Crack the Code on Data Observability
1. Job latency, SQL analytics, Spark analytics
What is the shortest distance between two points? A straight line, of course. What if there are multiple points? Then it depends.
A job executed in response to a user action — refreshing a dashboard, aggregating data, building a report, developing an ML algorithm, performing analytics — requires multiple hops through the data ecosystem. To know how long a job takes, and more importantly where that time is spent, you need to know all the points in the journey: how long each task takes, the dependencies among those tasks, and what may be causing you to miss your SLAs.
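As an illustration, here is a minimal Python sketch, with made-up task names, durations, and dependencies, of how end-to-end latency decomposes across such a chain of dependent tasks:

```python
from functools import cache

# Hypothetical pipeline: task -> (duration in seconds, dependencies).
tasks = {
    "ingest":    (120, []),
    "validate":  (45,  ["ingest"]),
    "transform": (300, ["validate"]),
    "aggregate": (90,  ["transform"]),
    "dashboard": (30,  ["aggregate"]),
}

@cache
def finish_time(task: str) -> int:
    """Earliest finish time: the task's own duration plus the latest
    finish time among its dependencies."""
    duration, deps = tasks[task]
    return duration + max((finish_time(d) for d in deps), default=0)

total = finish_time("dashboard")
print(f"end-to-end latency: {total}s")
for name, (duration, _) in sorted(tasks.items(), key=lambda kv: -kv[1][0]):
    print(f"  {name:10s} {duration:4d}s ({duration / total:.0%} of the total)")
```

In this toy pipeline, "transform" alone accounts for over half of the end-to-end time, which is exactly the kind of answer a latency question needs.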
This level of detail can only be gathered from observation. Data observation. Various artifacts such as logs, traces, and events collect bits and pieces of the information — let’s call them the signals. These are observable data points. To get an actionable insight (e.g., data has drifted, a data schema has changed, data has anomalies), these signals must be correlated and synthesized. Further, they must pass through a “pattern of use” filter so that false positives are minimized.
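A toy sketch of the idea, where the signals are made up and a simple expected-behavior list stands in for the “pattern of use” filter:

```python
from collections import defaultdict

# Illustrative signals: (dataset, source, observation).
signals = [
    ("orders", "log",    "row_count_drop"),
    ("orders", "metric", "null_ratio_spike"),
    ("users",  "metric", "null_ratio_spike"),
]

# Pattern of use: the `users` table is reloaded nightly, so a brief
# null-ratio spike during the load window is expected, not an incident.
expected = {("users", "null_ratio_spike")}

by_dataset = defaultdict(list)
for dataset, source, observation in signals:
    if (dataset, observation) not in expected:
        by_dataset[dataset].append((source, observation))

for dataset, evidence in by_dataset.items():
    # Require corroboration from more than one signal before alerting,
    # which keeps one-off blips from becoming false positives.
    if len(evidence) > 1:
        print(f"actionable insight for {dataset!r}: {evidence}")
```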
Acceldata’s multidimensional data observability platform improves data reliability, eliminates complexity, and scales data usage so enterprises can accelerate their data and digital transformation. We have built an AI-enabled data monitoring solution to detect, predict, and prevent data issues early in the data lifecycle, before they hit the target systems and create a big disruption and a bigger bill. This requires a holistic approach. And we believe that in order to build trust in your data, you need deep visibility into all the data layers — compute engines, data processing pipelines, and workflow orchestration.
2. Resource management and containerized deployment
To optimize the response time for requests, the resources those requests consume must be optimized: CPU, memory, I/O, and network. Sometimes the goal is total cost, total throughput, or peak performance, but it is always in support of the SLOs set by the business users and decision makers who need the data and insights. In containerized deployments, these resources are typically declared on the Kubernetes (K8s) container itself.
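For example, a container's CPU and memory are declared as requests and limits. A minimal sketch using the official `kubernetes` Python client, with purely illustrative values and workload names:

```python
from kubernetes import client

# Illustrative requests/limits; real values should come from observed usage.
resources = client.V1ResourceRequirements(
    requests={"cpu": "500m", "memory": "1Gi"},  # what the scheduler reserves
    limits={"cpu": "2", "memory": "4Gi"},       # hard ceiling for the container
)

container = client.V1Container(
    name="spark-executor",         # hypothetical workload name
    image="apache/spark:3.5.0",
    resources=resources,
)
```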
Knowing when a job fails, takes longer than expected, or delivers delayed data is something you should demand of any monitoring solution. What data observability adds is deeper, multi-layered inspection and visibility. With that level of detail, patterns of use can drive the optimization of resources.
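A toy example of that idea: using the peak usage observed across many runs (made-up numbers here, standing in for what an observability tool would collect) to suggest a right-sized memory request:

```python
import statistics

requested_mem_gib = 4.0
observed_peaks_gib = [1.1, 1.3, 0.9, 1.2, 1.4, 1.0, 1.3]  # peak per run

p95 = statistics.quantiles(observed_peaks_gib, n=20)[18]  # ~95th percentile
headroom = 1.25  # keep 25% slack above the observed pattern of use
suggested = round(p95 * headroom, 1)

if suggested < requested_mem_gib:
    print(f"over-provisioned: request {suggested} GiB "
          f"instead of {requested_mem_gib} GiB")
```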
3. Data as a critical business asset
As long as businesses have been around, data has played a role. In the last decade, businesses have been collecting far more data in the hope that it holds some “golden nuggets” of insight and information that can give them an edge, a competitive advantage. This is truer now than ever.
We need to take control of our data and protect it as a critical business asset.
Data is the lifeblood of a modern enterprise. It is also the most dynamic of valuable assets — it moves, it changes, it creates new data, it mingles with other data, it gets dirty, it gets dropped — and yet we have ignored it for a long time. Businesses make decisions every day based on data; life decisions rest on data in healthcare, financial services, and a host of other fields; governments and militaries base key defense strategies on data. There is not a single area of our lives where data is not used. Even our intuitions are, at a deeper level, based on historical data. And in order to use data for such key decisions, we must be able to trust it.
It is said that trust is built over time, and that is true of trust in data as well.
With such enormous volumes of data, the humans who use, manage, and make decisions with it cannot act with the level of care needed to protect these assets and treat them with respect. Machines and automation are needed. The good news is that in the last few years, data automation has exploded!
Data observability, born out of this need, collects, curates, and applies automation to observable data (metrics, logs, events, and traces) and intelligently synthesizes these signals into actionable insights. Knowing when data drifts, when a data schema changes, when data picks up anomalies, and where data has been in its journey (its lineage) — all of these are necessary to build trust in data.
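Two of these checks can be sketched in a few lines of Python. The column names, history, and three-sigma threshold below are illustrative assumptions, not how any particular platform implements them:

```python
import statistics

# Hypothetical yesterday/today schemas for an `orders` table.
yesterday = {"id": "bigint", "amount": "double", "region": "string"}
today     = {"id": "bigint", "amount": "string", "region": "string",
             "channel": "string"}

# Schema change check: new columns and changed types.
added   = today.keys() - yesterday.keys()
retyped = {c for c in yesterday.keys() & today.keys()
           if yesterday[c] != today[c]}
if added or retyped:
    print(f"schema change detected: added={added}, retyped={retyped}")

# Drift check: flag today's mean if it falls outside the historical pattern.
history    = [104.2, 98.7, 101.5, 99.9, 102.3]  # daily means of `amount`
today_mean = 137.8
mu, sigma  = statistics.mean(history), statistics.stdev(history)
if abs(today_mean - mu) > 3 * sigma:
    print(f"drift detected: mean {today_mean} is more than 3 sigma from {mu:.1f}")
```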
Just as you can’t drive a vehicle safely without checking your blind spots, the same is true for data journeys. Acceldata has created a data observability platform that prevents, predicts, and remediates data blind spots.
4. Extensible platform
Our needs with data are constantly changing and evolving, and this is reflected in the complexity of the modern data stack. Any point tool or solution has a limited useful life: once the use case changes, it no longer fits. The big investments in data teams and data tooling must be preserved and extended to cover both anticipated and unanticipated changes.
This may seem obvious or simple, but it is key to a data solution remaining useful to data teams over the long term. The data landscape changes often: new sources are added, new use cases are onboarded, and different consumption patterns for the trusted data emerge. All of this is to say that extensibility should be a core characteristic of any technology solution platform.
Acceldata took this to heart from the very beginning. We have built an automated, low-code data observability platform for the 90% of tasks that any data team is responsible for, and we created a comprehensive set of APIs and an SDK that instrument your data processes to reflect the uniqueness of your business, with exactly what your business needs.
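To give a feel for what instrumenting a pipeline step through an SDK can look like, here is a hypothetical sketch. The `observed` decorator and `emit_event` helper are stand-ins invented for illustration, not Acceldata’s actual API:

```python
import functools
import time

def emit_event(name, **fields):
    # Stand-in for shipping an event to an observability backend.
    print({"event": name, **fields})

def observed(step_name):
    """Wrap a pipeline step so its duration and outcome are reported."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            start = time.monotonic()
            try:
                result = fn(*args, **kwargs)
                emit_event(step_name, status="ok",
                           seconds=time.monotonic() - start)
                return result
            except Exception as exc:
                emit_event(step_name, status="failed", error=str(exc),
                           seconds=time.monotonic() - start)
                raise
        return wrapper
    return decorator

@observed("load_orders")
def load_orders():
    time.sleep(0.1)  # placeholder for the real work

load_orders()
```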
5. Automate, automate, automate
The premise of the modern data stack is to relieve data teams of tedious, repetitive, mundane tasks. This allows them to focus their time and energy on more strategic, value-added projects, e.g., finding better ways to use the data for the business.
Acceldata takes the approach that everything that can be automated should be. Automation also has another beautiful side effect: it doesn’t wait for someone to be available at dinner time or 3 a.m. to take action. While many platforms may have auto-alerting capabilities, taking known actions based on SME knowledge is another way Acceldata helps data teams offload part of the tedious work.
Wouldn’t it be nice if, once you solved a pattern of problems — triage, RCA, and remediation — the solution became a reusable knowledge asset? Acceldata has the unique capability of building Runbooks that capture this essence and can be reused again and again.
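In spirit, a runbook is a named, ordered list of steps that can be replayed whenever the same pattern recurs. The sketch below is an illustrative stand-in, not Acceldata’s Runbook format:

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class Runbook:
    name: str
    steps: list[Callable[[dict], dict]] = field(default_factory=list)

    def run(self, incident: dict) -> dict:
        # Replay the captured steps in order, threading the incident through.
        for step in self.steps:
            incident = step(incident)
        return incident

def triage(incident):
    incident["severity"] = "high" if incident["sla_missed"] else "low"
    return incident

def root_cause(incident):
    incident["cause"] = "skewed partition"  # diagnosed once, reused forever
    return incident

def remediate(incident):
    incident["action"] = "repartition and rerun stage"
    return incident

late_job = Runbook("late-spark-job", [triage, root_cause, remediate])
print(late_job.run({"job": "daily_agg", "sla_missed": True}))
```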