Part 1: Alerting for the Open-Core Enterprise Data Stack

Acceldata’s alerting platform is built for data-intensive applications that handle stream, real-time, and batch processing.

Typical problems that call for an alert in such a stack include:

  • Lack of capacity on a YARN queue
  • Ever-increasing size of Hive data partitions
  • Ever-increasing number of small HDFS files
  • SLA violations on critical Hive business processes
  • Stuck jobs due to straggling Spark SQL
  • Lagging consumers of a Kafka topic
  • Poorly written Spark code resulting in excessive garbage collection
  • PySpark ML algorithms running slower than their standard SLAs
  • Hardware issues such as CPU utilisation, slow disks, and I/O issues

An alerting system for this environment should provide:

  • An abstract separation from, and absolute non-interference with, the core data systems
  • A unified DSL for creating alerts across all kinds of data systems
  • Robust evaluation of comparative, mathematical, statistical, and ML rules
  • An evaluation engine that works across metrics data sources of various types, such as document stores, time-series stores, and in-stream data
  • A unified, intuitive system for configuring infrastructure and application alerts alike
  • Communication of alerts over process-appropriate channels, ranging from email to pagers
  • The ability to trigger pre-configured, auto-corrective workflows

The platform consists of the following components:

  • Alert Service: the glue component for the rest of the system. It runs the evaluators corresponding to the configured actions.
  • Evaluators: convert the alert definition DSL into appropriate database queries and execute them on the proper schedule.
  • Notification System: when an evaluator triggers an incident, the notification system sends the incident across various channels.
  • REST Server: provides APIs for alert CRUD, incidents, and executions.
  • Administration Interface: a single-page application that runs in a browser and is used to configure the system.
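
To make the roles of the Evaluators and the Notification System more concrete, here is a minimal Python sketch of how an alert definition might be turned into a query, evaluated against metric samples, and dispatched to channels. It is not Acceldata's actual DSL or API: the `AlertDefinition` fields, the SQL-like query format, and the `to_query`, `evaluate`, and `notify` helpers are all assumptions made up for illustration, and only a simple comparative rule is covered.

```python
import operator
from dataclasses import dataclass, field
from statistics import mean
from typing import Callable, Dict, List

# Hypothetical rule vocabulary; the real DSL is not shown in this article.
OPERATORS: Dict[str, Callable[[float, float], bool]] = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
}
AGGREGATES: Dict[str, Callable[[List[float]], float]] = {
    "avg": mean,
    "max": max,
    "min": min,
}


@dataclass
class AlertDefinition:
    """Illustrative alert definition (all field names are assumptions)."""
    name: str
    metric: str
    aggregate: str
    op: str
    threshold: float
    window_minutes: int
    channels: List[str] = field(default_factory=list)


def to_query(alert: AlertDefinition) -> str:
    """Render the definition as a SQL-like query string (illustrative only)."""
    return (
        f"SELECT {alert.aggregate}(value) FROM {alert.metric} "
        f"WHERE time > now() - {alert.window_minutes}m"
    )


def evaluate(alert: AlertDefinition, samples: List[float]) -> bool:
    """Return True when the aggregated samples violate the rule."""
    if not samples:
        return False  # no data in the window: nothing to compare
    value = AGGREGATES[alert.aggregate](samples)
    return OPERATORS[alert.op](value, alert.threshold)


def notify(alert: AlertDefinition,
           senders: Dict[str, Callable[[str], None]]) -> List[str]:
    """Send an incident message on each configured channel that has a sender.

    Returns the list of channels the message was actually delivered to.
    """
    message = f"Incident: {alert.name} breached {alert.op} {alert.threshold}"
    delivered = []
    for channel in alert.channels:
        sender = senders.get(channel)
        if sender is not None:
            sender(message)
            delivered.append(channel)
    return delivered


if __name__ == "__main__":
    lag_alert = AlertDefinition(
        name="kafka-consumer-lag",
        metric="kafka_consumer_lag",
        aggregate="avg",
        op=">",
        threshold=1000.0,
        window_minutes=5,
        channels=["email", "pager"],
    )
    print(to_query(lag_alert))
    if evaluate(lag_alert, [900.0, 1200.0, 1500.0]):
        print(notify(lag_alert, {"email": print, "pager": print}))
```

In a real deployment the samples would come from running the generated query against a metrics store on a schedule, and the senders would wrap actual email or pager integrations. Keeping the definition as plain data is one way to let the same rule be evaluated against different kinds of data sources.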

Thoughts and trends on data observability

The Data Observer
