Data engineers can modernize fast with data observability if they avoid common pitfalls, or “gotchas”

How to Avoid Modernization “Gotchas” with Data Observability

The Data Observer
6 min read · Oct 11, 2021


“Modernization” can mean different things depending on the data professional you’re talking to. It could be a tactical effort that replaces existing technology with something newer and better, or one that adds new capabilities to support new use cases. Or it could be a more strategic initiative that involves moving to the cloud or outsourcing to managed services. Regardless, getting the best return from a modernization effort requires avoiding some common pitfalls. But let’s start with some of the major potential benefits of modernization.

  • Business outcome focus: Modern technology and managed services shift the focus of your technical talent off of data and infrastructure management and onto making improvements to your business, whether that’s insurance, manufacturing, healthcare, or something else.
  • Increased scalability and elasticity: Cloud-native technologies support rapidly scaling workloads and data volumes as needed. This provides tremendous flexibility to meet increased demand immediately, without a drop in performance, and to scale down when appropriate, thereby avoiding unnecessary long-term capital investment. Moreover, only paying for the resources that you actually use can deliver significant cost savings.
  • Accelerated speed of execution: Creating new solutions no longer requires a significant delay to procure, deliver, configure, and test new hardware and software. In the cloud, new capabilities and capacity can be provisioned instantly with the click of a few buttons. This translates into faster time to value.

Despite the benefits of modernization, wholesale changes such as a complete “lift and shift” to the cloud may not always be a winning strategy. You need measurement and analysis of the current and future state of your environment to help you make informed data modernization decisions and avoid these common issues faced by modern enterprises.

Watch Out for These Modernization “Gotchas”

From planning to implementation and post-implementation management, here are eight modernization “gotchas” to be aware of on your modernization journey.


1. Giving Up Too Much Control

Outsourcing infrastructure can present tradeoffs in terms of performance, capabilities, security, cost, and other aspects that may be more easily controlled in an on-premise environment. Loss of control is a major reason why many organizations opt for a hybrid model rather than moving all applications, workloads, and data sources to the cloud. What’s more, for large, consistent workloads, the economics of on-premise can outweigh the benefits of the cloud.

2. New Tech, Same Problems

Moving to a new home is a great opportunity to clean out the clutter in your garage, attic, and basement. The same is true for modernizing your data environment. There’s a real cost (and risk) of retaining unused and redundant data, workloads, and pipelines. Consider decluttering and consolidating your data environment as part of your modernization effort.

3. Migration Costs

A lift-and-shift approach can make migration easier but might not take advantage of the benefits of the new technology or environment. Refactoring or re-architecting solutions can provide bigger benefits, including lowering production costs and improving performance. Yet these engineering efforts bring additional cost and risk. In many cases, neither option will yield a return on investment, and precious time and resources are better spent elsewhere.

4. Tech Sprawl

Cloud marketplaces can make a creative engineer feel like a kid in a candy store with a vast array of capabilities at their fingertips. Leveraging new technology can be necessary to support innovation for your business. The downside is this can also lead to tech sprawl that’s difficult to oversee, manage, and budget for — especially in a multi-cloud environment. Risk increases from dependency on additional technology, additional skill sets, and greater complexity overall.


5. Data Migration Risks

Many aspects of data can get “lost in translation” as it gets migrated from one technology or environment to another. For example, moving data from a data warehouse to a data lake can provide flexibility on the one hand but a lack of control on the other. That’s why data reconciliation — ensuring data arrives as expected — should be a key component for any modernization strategy. Of course these sorts of checks and balances are becoming ever more important post-migration as data becomes increasingly distributed and accessed on-demand and in real-time.
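As a rough illustration of what data reconciliation can look like, the sketch below compares a source and a target table by row count and by an order-independent checksum. The function names (`table_fingerprint`, `reconcile`) and the in-memory row format are assumptions made for this example; they are not part of any particular observability product, and a real migration would also need to normalize types and encodings that differ between systems.

```python
import hashlib
from typing import Dict, Iterable, Sequence, Tuple

_MODULUS = 2 ** 256  # keeps the running checksum at a fixed width


def table_fingerprint(rows: Iterable[Sequence[object]]) -> Tuple[int, str]:
    """Return (row_count, checksum) where the checksum ignores row order."""
    count = 0
    combined = 0
    for row in rows:
        # Hash a canonical text form of the row, then add the digests
        # modulo 2**256 so the result does not depend on row order.
        encoded = "\x1f".join(repr(value) for value in row).encode("utf-8")
        digest = hashlib.sha256(encoded).digest()
        combined = (combined + int.from_bytes(digest, "big")) % _MODULUS
        count += 1
    return count, f"{combined:064x}"


def reconcile(source_rows, target_rows) -> Dict[str, object]:
    """Compare two row sets (hypothetical helper for this example)."""
    src_count, src_sum = table_fingerprint(source_rows)
    tgt_count, tgt_sum = table_fingerprint(target_rows)
    return {
        "source_rows": src_count,
        "target_rows": tgt_count,
        "row_count_match": src_count == tgt_count,
        "checksum_match": src_sum == tgt_sum,
    }
```

A matching row count with a mismatched checksum suggests that values changed in transit, while a mismatched row count points to dropped or duplicated records.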

6. The Data Swamp

Siloed data in the cloud is still siloed data — it’s distributed and is only usable if identifiable. Migrating a data swamp from on-prem to the cloud doesn’t solve much. In fact, doing so might actually increase confusion and cost. Data was hard enough to find before, and now it has moved. To maximize the impact of your data, you need to make it easy to find, explore, and validate.


7. The Meter is Running

Limitless scalability sounds great — until someone forgets to turn off a service that’s no longer needed. It can also be quite easy to generate workloads that scale out of control. Failing to implement the right monitoring and alerts could lead to a massive cloud bill. Insight into how and why resources are consumed and how to improve workload efficiency can improve the price / performance ratio. Multiplied across all of your workloads and use cases, it’s easy to see why data observability has become such a hot topic for budget owners.
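To make the "forgotten service" problem concrete, here is a minimal sketch of an idle-resource check. The `Resource` record, its fields, and the seven-day threshold are all hypothetical; in practice the last-used timestamps and hourly prices would come from your cloud provider's usage data or an observability tool.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Tuple


@dataclass
class Resource:
    name: str
    hourly_cost: float   # metered price per hour (hypothetical figure)
    last_used: datetime  # last time any workload touched the resource


def find_idle(
    resources: List[Resource],
    now: datetime,
    max_idle: timedelta = timedelta(days=7),
) -> List[Tuple[str, float]]:
    """Return (name, cost accrued while idle) for resources idle too long."""
    idle = []
    for resource in resources:
        idle_for = now - resource.last_used
        if idle_for > max_idle:
            idle_hours = idle_for.total_seconds() / 3600
            idle.append((resource.name, round(resource.hourly_cost * idle_hours, 2)))
    return idle
```

The output of a check like this could feed an alert or an automated shutdown, depending on how much control you want to keep in human hands.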

8. Cost vs. Benefit

In theory, accurately assigning compute and storage utilization costs to individual business processes should be easier with cloud technology that is metered. Unfortunately, that’s typically not how the billing works, which makes internal chargebacks much more complex than they should be. To truly optimize data investments for the cloud you need visibility into the cost of supporting specific business processes so that you can align business and technical strategy and tactics for the maximum return on data investment. Unlike the sunk costs of on-premise infrastructure, the cloud brings the flexibility to adapt usage quickly based on business priorities, but this only benefits you if you measure what you manage.
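One simple way to approximate a chargeback is to split a shared bill in proportion to measured usage. The sketch below assumes you already have a usage figure per business process (for example, compute hours); the function name, the process names, and the proportional rule are illustrative choices, not a description of how any vendor bills.

```python
from typing import Dict


def allocate_cost(total_bill: float, usage_by_process: Dict[str, float]) -> Dict[str, float]:
    """Split a shared bill across business processes in proportion to usage."""
    total_usage = sum(usage_by_process.values())
    if total_usage <= 0:
        raise ValueError("no usage recorded; cannot allocate cost")
    return {
        process: total_bill * usage / total_usage
        for process, usage in usage_by_process.items()
    }


# Hypothetical example: a 1,000 bill split across three processes
# by compute hours consumed.
shares = allocate_cost(1000.0, {"claims": 300, "billing": 100, "reporting": 100})
```

Proportional allocation is only as good as the usage measurements behind it, which is why visibility into which workloads serve which business process matters.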

Use Data Observability to Reduce Risk in a Modernization Strategy

Data observability can reduce risk in your modernization strategy by providing end-to-end visibility into your data, processing, and pipelines, giving you the insights you need to make informed decisions at every step of your modernization journey.


Data observability gives you the insight needed to put the right plan in place to get the maximum return on modernization initiatives. It enables you to:

  • Declutter your data with visibility into unnecessary redundancies, unused (“cold”) data and other utilization patterns. This will result in a future state that consumes fewer resources, is easier to manage and easier to use. A simpler target state definition will make the rest of the planning and implementation easier too.
  • Benchmark the current state in terms of performance, cost and other factors to ensure the future state meets the business requirements at a lower cost.
  • Optimize the design by comparing the price and performance of one architecture vs. another and the corresponding migration costs of each. This also helps you weigh the benefits of adding new technology against the downsides of tech sprawl, so you can right-size the short-term investment in modernization for the longer-term gain.
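Weighing a one-time migration cost against ongoing savings can be reduced to a payback calculation once the current and target states have been benchmarked. The sketch below is a deliberately simple version of that arithmetic; the function name and the flat monthly costs are assumptions, and it ignores factors such as growth, discounting, and risk.

```python
from typing import Optional


def payback_months(
    current_monthly_cost: float,
    target_monthly_cost: float,
    migration_cost: float,
) -> Optional[float]:
    """Months until cumulative savings cover the one-time migration cost.

    Returns None when the target state is not cheaper to run, meaning the
    migration never pays for itself on cost alone.
    """
    monthly_savings = current_monthly_cost - target_monthly_cost
    if monthly_savings <= 0:
        return None
    return migration_cost / monthly_savings
```

Running the same calculation for a lift-and-shift option and a re-architected option, each with its own benchmarked run cost and migration cost, gives a like-for-like basis for choosing between them.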


Beyond improving agility, lowering cost, and adding capabilities, modernization can provide an opportunity to improve data reliability, management and usability:

  • Reconciling data movement is becoming a key component of modern data architectures. Data observability platforms provide a fast, comprehensive way to verify data in motion. The initial benefit is in reducing the time and risk of migrating data but the long term benefit is greater data reliability overall.
  • Data catalogs can make data management, governance and consumption much easier, and a modernization initiative is a great time to implement one. A catalog will also accelerate adoption of the future state, providing a faster and greater return on investment.


“X”-as-a-Service means cost is not a one-time, fixed event but a meter that runs whenever the service is used. Cost efficiency can only be achieved by monitoring, analyzing and acting upon utilization. Data observability technologies that provide the following capabilities often yield the highest return on investment:

  • Turning off unused resources automatically
  • Disposing of old or unused data
  • Predicting and preventing data issues to avoid reprocessing data
  • Stopping or alerting on runaway queries and jobs
  • Trending analysis to flag rising cost or degrading performance
  • Recommendations such as query, data or configuration optimizations
  • Chargeback reports that align cost with business activity
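As an example of the trending analysis mentioned in the list above, the following sketch flags rising cost by comparing the average of the most recent days with the average of the period before it. The seven-day window and 20% threshold are arbitrary values chosen for illustration, and the function name is hypothetical.

```python
from statistics import mean
from typing import Sequence


def rising_cost(
    daily_costs: Sequence[float],
    window: int = 7,
    threshold: float = 0.20,
) -> bool:
    """Return True when the latest window's mean cost exceeds the previous
    window's mean by more than `threshold` (0.20 means 20%)."""
    if len(daily_costs) < 2 * window:
        return False  # not enough history to compare two windows
    recent = mean(daily_costs[-window:])
    previous = mean(daily_costs[-2 * window:-window])
    if previous == 0:
        return recent > 0
    return (recent - previous) / previous > threshold
```

The same comparison could be applied to job runtimes or query latency to flag degrading performance.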

While XaaS vendors do a great job of providing agility, they are not incentivized to provide efficiency. In fact, they profit from inefficiency. Data observability looks out for the consumer of XaaS.

Photo by Matteo Catanese on Unsplash