Azure Monitor’s Change Analysis Helps You Troubleshoot Problems Quickly


Image: PhotoGranary/Adobe Stock

Change management is key to running a mature IT organization. When problems arise, it’s important to know what has changed in your environment so you can quickly diagnose failures and troubleshoot issues. A fix might be as simple as backing out the last change, or it might require understanding the interactions between the services that make up your platform.

That’s as true in the cloud as it is on premises, and possibly more important, because cloud-native architectures depend on microservices that may be shared between multiple applications. A change in one service might affect several applications; for example, by suddenly consuming more resources than planned and blocking APIs.

Change management in the cloud

Traditional change management approaches don’t work at cloud scale. Processes designed to work in a manually operated data center are unlikely to be suited for automated infrastructures that scale on demand and operate across many cloud platform regions. With an automated environment, we need an automated way of understanding and managing change. Tools like Microsoft’s Azure Monitor provide that framework, instrumenting dynamic infrastructures and providing the tooling needed to build cloud operations dashboards and workbooks.

Much of what we use to monitor and manage cloud infrastructures is purely reactive, showing us what happened and when. Log files can be analyzed to trace the causes of an issue, but that’s only part of the story. We need to understand why the issue occurred: Was it a bug in code, or was it a problem with the virtual infrastructure we deployed? Or was it a problem with a platform service used by our code?

Introducing Azure Change Analysis

That’s where Azure Monitor’s Change Analysis tooling comes into play. It tracks infrastructure changes, using Azure resource properties to indicate what has changed and when. The approach takes advantage of the same tooling we use to build and manage our applications: the Azure Resource Manager (ARM) templates that describe everything we deploy. Microsoft’s choice to use a declarative language to define every aspect of an Azure deployment makes it possible to record changes to those properties, and to use Azure’s own data exploration and filtering tools to build a searchable timeline.
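
To make the idea of a property-based timeline concrete, here is a minimal sketch of how two snapshots of a resource’s declared properties could be compared to produce change records. It is not how Change Analysis is implemented; the `PropertyChange` type, the `diff_snapshots` helper and the sample property names are all hypothetical, chosen only for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Any


@dataclass(frozen=True)
class PropertyChange:
    """One entry in a change timeline (hypothetical shape)."""
    timestamp: datetime
    path: str
    before: Any  # None means the property was absent
    after: Any   # None means the property was removed


def flatten(props: dict, prefix: str = "") -> dict:
    """Flatten nested dicts into dotted property paths."""
    flat = {}
    for key, value in props.items():
        path = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten(value, path))
        else:
            flat[path] = value
    return flat


def diff_snapshots(old: dict, new: dict, timestamp: datetime) -> list[PropertyChange]:
    """Return one record per property whose value differs between snapshots."""
    old_flat, new_flat = flatten(old), flatten(new)
    changes = []
    for path in sorted(old_flat.keys() | new_flat.keys()):
        before, after = old_flat.get(path), new_flat.get(path)
        if before != after:
            changes.append(PropertyChange(timestamp, path, before, after))
    return changes


# Illustrative snapshots of a web app's declared properties (made-up values).
old = {"sku": {"name": "S1", "capacity": 1}, "properties": {"httpsOnly": True}}
new = {
    "sku": {"name": "P1v2", "capacity": 1},
    "properties": {"httpsOnly": True, "minTlsVersion": "1.2"},
}
changes = diff_snapshots(old, new, datetime(2023, 1, 10, 12, 0, tzinfo=timezone.utc))
for change in changes:
    print(f"{change.path}: {change.before!r} -> {change.after!r}")
```

Because every property lives in a declarative description, a comparison like this needs no knowledge of what the resource does, only of what was declared before and after.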

Under the hood is Azure Resource Graph, the service Azure uses to index resource properties so they can be queried across subscriptions. Because the service stores changes automatically, they’re available to Azure Monitor through a secure API. That allows it to track not only the changes you make, but also changes that come from the Azure platform itself. Where changes aren’t made directly through ARM, the service captures configuration properties every six hours for most user changes, and every 30 minutes for Azure Functions and Web Apps. Change snapshots are kept for 14 days, though that limit shouldn’t be significant, as problems are likely to surface relatively quickly.
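
The timing figures above translate into simple arithmetic. The sketch below assumes snapshots are taken on a regular schedule, so a change made outside ARM should be captured no later than one interval after it happens, and it checks whether a change still falls inside the 14-day window. The function names and the `"default"` and `"web_app"` keys are hypothetical labels, not Azure identifiers.

```python
from datetime import datetime, timedelta, timezone

# Figures taken from the text; the dictionary keys are illustrative labels.
RETENTION = timedelta(days=14)
SNAPSHOT_INTERVALS = {
    "default": timedelta(hours=6),     # most user changes
    "web_app": timedelta(minutes=30),  # Azure Functions and Web Apps
}


def latest_capture_time(changed_at: datetime, kind: str = "default") -> datetime:
    """Latest time a change should be captured, assuming regular snapshots."""
    interval = SNAPSHOT_INTERVALS.get(kind, SNAPSHOT_INTERVALS["default"])
    return changed_at + interval


def is_within_retention(changed_at: datetime, now: datetime) -> bool:
    """True if a change made at changed_at is still inside the 14-day window."""
    return timedelta(0) <= now - changed_at <= RETENTION


changed_at = datetime(2023, 3, 1, 9, 0, tzinfo=timezone.utc)
print(latest_capture_time(changed_at))             # six hours later
print(latest_capture_time(changed_at, "web_app"))  # thirty minutes later
print(is_within_retention(changed_at, changed_at + timedelta(days=15)))
```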

Change analysis in Azure Monitor

You can access the Change Analysis tooling from the Azure portal as part of Azure Monitor. This makes sense, as Azure Monitor is a key component of the Azure operations platform. It’s where you collect and analyze telemetry from across your various subscriptions and tenants, and even from on-premises System Center Operations Manager installations. It works across Azure APIs and resources, and offers tooling to bring in telemetry from your own code. It’s perhaps easiest to think of this as all part of Azure’s approach to observability.

Traditional monitoring and management tools aren’t designed to work at scale, and struggle when it comes to distributed systems built on top of service architectures. Telemetry helps, but that results in a flood of data that can be hard to analyze. Observability techniques allow us to use big data tooling to look for patterns in those logs that indicate where systems have failed or where we need to investigate possible issues, allowing us to understand the internal state of a complex system. There’s an added advantage in that you don’t need to add extra tools to your application that might consume additional resources, avoiding performance issues and cloud compute costs.

Azure Monitor is where all this information is gathered, giving you a one-stop shop for the information you need to manage your applications. It’s best thought of as an observability dashboard, where information is collated, processed and displayed. There are four key data types it uses: metrics, logs, traces, and now, changes.

Its data sources include feeds from the underlying Azure Platform, using the platform’s resource management features to track operational details of your services. This is where its change data is sourced and used to generate insights about your platform operations. All the various sources used by Azure Monitor are processed and used to provide insights, visualizations and analytics, ready to help diagnose issues. You can take that data and build it into automation tools, such as rolling back to a previous ARM template for a service if it persistently has problems.
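
As a rough illustration of the rollback idea, the following sketch decides whether failures have persisted since the most recent deployment and, if so, names the previous template version as the rollback target. It only covers the decision logic; the `Deployment` type, the `pick_rollback_target` function, the failure threshold and the version labels are assumptions made for this example, and the rollback itself would be carried out by whatever deployment tooling you already use.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    deployed_at: datetime
    template_version: str  # hypothetical label for a stored ARM template


def pick_rollback_target(
    deployments: list[Deployment],
    failures: list[datetime],
    min_failures: int = 3,
) -> Optional[Deployment]:
    """Return the previous deployment if failures persist after the latest one."""
    ordered = sorted(deployments, key=lambda d: d.deployed_at)
    if len(ordered) < 2:
        return None  # nothing to roll back to
    latest = ordered[-1]
    failures_since = [f for f in failures if f >= latest.deployed_at]
    if len(failures_since) >= min_failures:
        return ordered[-2]
    return None


deployments = [
    Deployment(datetime(2023, 3, 1, 8, 0), "v1"),
    Deployment(datetime(2023, 3, 2, 8, 0), "v2"),
]
failures = [
    datetime(2023, 3, 2, 8, 5),
    datetime(2023, 3, 2, 8, 20),
    datetime(2023, 3, 2, 9, 0),
]
target = pick_rollback_target(deployments, failures)
print(target.template_version if target else "no rollback needed")
```

Requiring several failures before acting is a deliberate choice here: a single error shortly after a deployment is weak evidence, while a run of them is what the text means by a service that persistently has problems.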

Debugging with Change Analysis

Change details can feed through the diagnostic tools built into Azure Monitor, giving you the extra information that may be needed to solve a problem. As details of networks are stored in ARM, being able to see if a route or an address has changed can show whether problems with a service are due to the service itself or any changes that have been made to your virtual networks and network appliances. This way you can see if rules added to Front Door affect your application, or if there are problems with caching in Azure CDN.
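
The kind of question this answers can be sketched in a few lines: given an incident time, which recorded changes happened shortly before it? The example below is a simplified stand-in for what the diagnostic tools do, using made-up change records; the `Change` type, the `suspect_changes` function, the 24-hour lookback and the resource names are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class Change:
    timestamp: datetime
    resource: str  # illustrative names, not real resource IDs
    summary: str


def suspect_changes(
    changes: list[Change],
    incident_at: datetime,
    lookback: timedelta = timedelta(hours=24),
) -> list[Change]:
    """Changes made in the lookback window before the incident, newest first."""
    window_start = incident_at - lookback
    recent = [c for c in changes if window_start <= c.timestamp <= incident_at]
    return sorted(recent, key=lambda c: c.timestamp, reverse=True)


incident_at = datetime(2023, 3, 5, 14, 0)
timeline = [
    Change(datetime(2023, 3, 1, 10, 0), "vm-app-01", "VM size changed"),
    Change(datetime(2023, 3, 4, 16, 0), "frontdoor-prod", "Routing rule added"),
    Change(datetime(2023, 3, 5, 13, 10), "vnet-prod-routes", "Route next hop changed"),
    Change(datetime(2023, 3, 5, 15, 0), "cdn-prod", "Caching rule edited"),
]
suspects = suspect_changes(timeline, incident_at)
for change in suspects:
    print(change.timestamp, change.resource, change.summary)
```

Sorting newest first puts the change closest to the incident at the top, which is usually the first place to look.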

Traditional change management tools are standalone, so any analysis has to be manual. Bringing change data into Azure Monitor makes it available to the service’s built-in analytics tools. Having it as an input to the Diagnose and Solve Problems service makes a lot of sense, as it can quickly isolate possible fixes. Azure Workbooks, meanwhile, gives you a place to compare and correlate data across various inputs, like application performance, to see how infrastructure changes have affected application operations short of causing failures. This approach allows you to determine whether a change needs to be repeated, like increasing the capabilities of a switch or using a different class of virtual machine.
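
A simple version of that correlation is to compare a performance metric before and after a change. The sketch below uses invented latency samples and a hypothetical `latency_shift` function; in practice the data would come from your own telemetry, and a workbook would offer far richer comparisons than a difference of means.

```python
from datetime import datetime
from statistics import mean
from typing import Optional


def latency_shift(
    samples: list[tuple[datetime, float]],
    changed_at: datetime,
) -> Optional[float]:
    """Mean latency after the change minus mean latency before it.

    Returns None when either side of the change has no samples.
    """
    before = [value for ts, value in samples if ts < changed_at]
    after = [value for ts, value in samples if ts >= changed_at]
    if not before or not after:
        return None
    return mean(after) - mean(before)


# Invented response times in milliseconds around a VM size change.
changed_at = datetime(2023, 3, 5, 12, 0)
samples = [
    (datetime(2023, 3, 5, 11, 0), 100.0),
    (datetime(2023, 3, 5, 11, 30), 110.0),
    (datetime(2023, 3, 5, 12, 30), 150.0),
    (datetime(2023, 3, 5, 13, 0), 160.0),
]
shift = latency_shift(samples, changed_at)
print(f"Latency shift after change: {shift:+.1f} ms")
```

A positive shift suggests the change made things slower without necessarily causing a failure; a negative one suggests the change is worth repeating elsewhere.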

Microsoft has gone a long way to make Azure Monitor your operations hub for all your Azure-hosted applications and services. Adding Change Analysis to the platform has given you another diagnostic tool that can speed up fixing problems, keeping sites and services running. With the public cloud hosting more and more customer-facing and business-critical applications, tools like this can help reduce downtime and keep your business afloat.

