Learn Monitor and maintain Azure resources (AZ-104) with Interactive Flashcards


Azure Monitor metrics and logs

Azure Monitor allows an Azure Administrator to maximize the availability and performance of applications and services by collecting, analyzing, and acting on telemetry using two fundamental data types: Metrics and Logs.

**Metrics** are numerical values that describe an aspect of a system at a specific point in time. Think of them as the lightweight 'vital signs' of your resources. Examples include CPU percentage, memory usage, or network throughput. Metrics are collected at regular intervals and stored in a time-series database. Because of their numerical nature and low latency, metrics are ideally suited for near real-time alerting and fast visualization on dashboards. They answer operational questions like 'Is the server response time high right now?' or 'Has the CPU load spiked?' Administrators primarily use metrics to trigger autoscaling events or fire alerts based on specific numeric thresholds.

**Logs** typically contain different kinds of data organized into records with distinct properties. They can vary from simple text messages to structured blobs of data, encompassing events, traces, and audit trails. Logs are stored in Log Analytics workspaces. Unlike metrics, logs provide deep context about *what* happened and *why*. For instance, while a metric tells you HTTP errors are increasing, a log tells you specific code exception details or which user initiated a change. To analyze logs, administrators use the Kusto Query Language (KQL), which enables complex joining, filtering, and aggregation across diverse data sources.

In summary, utilize Metrics for monitoring general health, creating visual dashboards, and setting up real-time alerts. Utilize Logs for deep troubleshooting, auditing, security analysis, and complex reporting. Mastering both is essential for effectively maintaining Azure resources.
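
The contrast between the two data types can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the sample values, the record fields (`minute`, `level`, `message`, `user`), and both helper functions are hypothetical and are not part of any Azure SDK. Metrics are modeled as timestamped numbers that can be aggregated cheaply, while logs are modeled as records whose properties are filtered to find out what happened.

```python
from statistics import mean

# Hypothetical metric samples: (minute offset, CPU percent) pairs forming a time series.
cpu_samples = [(0, 42.0), (1, 55.0), (2, 91.0), (3, 95.0), (4, 93.0)]

# Hypothetical log records: each record carries its own named properties.
log_records = [
    {"minute": 2, "level": "Error", "message": "NullReferenceException in CheckoutController", "user": "alice"},
    {"minute": 3, "level": "Info", "message": "Deployment completed", "user": "deploy-bot"},
    {"minute": 4, "level": "Error", "message": "TimeoutException calling payments API", "user": "bob"},
]


def average_metric(samples, since_minute):
    """Metric-style question: average the numeric values at or after since_minute."""
    values = [value for minute, value in samples if minute >= since_minute]
    return mean(values)


def errors_since(records, since_minute):
    """Log-style question: return the messages of error records at or after since_minute."""
    return [
        record["message"]
        for record in records
        if record["level"] == "Error" and record["minute"] >= since_minute
    ]


print(average_metric(cpu_samples, 2))  # how high is the CPU right now?
print(errors_since(log_records, 2))    # what actually went wrong?
```

In Azure itself the second kind of question would be written in KQL against a Log Analytics workspace; the Python filter above only mirrors the idea.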

Azure Monitor alerts and action groups

In the context of the Azure Administrator Associate certification, **Azure Monitor Alerts** serve as a critical proactive mechanism to ensure resource availability and performance. Rather than manually watching dashboards, administrators configure alert rules to monitor specific signals—such as metrics (numerical values like CPU usage) or logs—against defined thresholds. When the conditions of a rule are met (e.g., average CPU usage exceeds 90% for five minutes), the alert changes state to 'Fired.'
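
A static-threshold condition like the CPU example above can be sketched as follows. This is a simplified model, not how Azure Monitor is implemented: the function name, the one-sample-per-minute assumption, and the choice to treat too few samples as "not fired" are all assumptions made for illustration.

```python
from statistics import mean


def is_fired(samples, threshold, window):
    """Return True when the average of the last `window` samples exceeds `threshold`.

    Assumes one sample per minute, so window=5 models a five-minute evaluation period.
    With fewer than `window` samples the condition is treated as not met.
    """
    if len(samples) < window:
        return False
    recent = samples[-window:]
    return mean(recent) > threshold


# Average CPU over the last five minutes is 93%, above the 90% threshold.
print(is_fired([92, 95, 91, 94, 93], threshold=90, window=5))

# A short spike does not fire the alert because the five-minute average stays low.
print(is_fired([50, 60, 95, 99, 70], threshold=90, window=5))
```

Averaging over a window rather than reacting to a single sample is what keeps brief spikes from producing noisy alerts.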

To manage the response to these triggers effectively, Azure utilizes **Action Groups**. An Action Group is a named collection of notification preferences and remediation actions. It acts as a reusable object that can be linked to multiple alert rules, ensuring consistency and reducing administrative overhead. When an alert triggers, it invokes the specific Action Groups assigned to it.

Action Groups perform two primary functions:
1. **Notifications:** Alerting operations teams via email, SMS, push notifications to the Azure mobile app, or voice calls to ensure immediate awareness.
2. **Automation:** Triggering automated responses to fix the issue without human intervention. This includes calling Webhooks to integrate with ITSM tools, executing Azure Functions, triggering Logic Apps, or running Azure Automation Runbooks.

For example, an administrator can configure an alert for low disk space. The associated Action Group could send an email to the IT support team and simultaneously trigger an Automation Runbook to clear temporary files. By decoupling the detection logic (Alerts) from the response logic (Action Groups), Azure enables administrators to build scalable, maintainable, and responsive monitoring strategies essential for maintaining the health of cloud infrastructure.
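
The decoupling described above can be modeled with two small classes. This is a minimal sketch, assuming hypothetical names throughout (`ActionGroup`, `AlertRule`, `email_it_support`, `run_cleanup_runbook`); the actions only return strings instead of sending real notifications or starting real runbooks.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ActionGroup:
    """A named, reusable collection of notification and automation actions."""
    name: str
    actions: List[Callable[[str], str]] = field(default_factory=list)

    def invoke(self, alert_name: str) -> List[str]:
        return [action(alert_name) for action in self.actions]


@dataclass
class AlertRule:
    """Detection logic only: a condition plus references to action groups."""
    name: str
    condition: Callable[[float], bool]
    action_groups: List[ActionGroup]

    def evaluate(self, value: float) -> List[str]:
        if not self.condition(value):
            return []
        results: List[str] = []
        for group in self.action_groups:
            results.extend(group.invoke(self.name))
        return results


def email_it_support(alert_name: str) -> str:
    return f"email sent to IT support for {alert_name}"


def run_cleanup_runbook(alert_name: str) -> str:
    return f"cleanup runbook started for {alert_name}"


# One action group is shared by two different alert rules.
ops_group = ActionGroup("ops-response", [email_it_support, run_cleanup_runbook])
low_disk = AlertRule("low-disk-space", lambda free_pct: free_pct < 10, [ops_group])
high_cpu = AlertRule("high-cpu", lambda cpu_pct: cpu_pct > 90, [ops_group])

print(low_disk.evaluate(5.0))
print(high_cpu.evaluate(40.0))
```

Because both rules point at the same `ops_group`, changing who gets notified means editing one object rather than every rule.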

Azure Monitor Insights (VMs, Storage, Networks)

Azure Monitor Insights provides a curated, pre-configured monitoring experience for specific Azure resources, significantly simplifying the operational workload for an Azure Administrator. Instead of manually building queries and dashboards from raw logs and metrics, Insights offers out-of-the-box visualizations and health logic.

VM Insights (formerly Azure Monitor for VMs) monitors the performance and health of virtual machines and scale sets. It relies on the Azure Monitor Agent to collect data, and the 'Map' feature additionally requires the Dependency agent. Its standout feature is the 'Map' view, which visualizes dependencies between servers, processes, and external services, aiding in root cause analysis. It also provides a 'Performance' view to track CPU, memory, disk, and network trends, helping to identify bottlenecks quickly.

Storage Insights provides a unified view of your Azure Storage accounts' performance, capacity, and availability. It aggregates data across Blob, File, Queue, and Table services. Administrators can use this to quickly detect transaction latency, spot throttling issues, or analyze capacity growth without complex configuration.

Network Insights acts as a centralized console for network health. It visualizes the topology of your entire network, including Load Balancers, Application Gateways, and ExpressRoute circuits. By integrating with Network Watcher, it helps administrators understand resource connectivity, identify broken links, and verify that network traffic flows as intended through Network Security Groups (NSGs).

In summary, these Insights reduce the Time to Detect (TTD) and Time to Resolve (TTR) issues by transforming raw telemetry data into actionable, visual intelligence specific to the resource type.

Azure Network Watcher and Connection Monitor

Azure Network Watcher is a regional service designed to monitor, diagnose, and gain insights into network performance and health within Azure. Unlike standard monitoring that focuses on individual resource health, Network Watcher focuses on the Infrastructure-as-a-Service (IaaS) network layer. It provides a suite of diagnostic tools such as IP Flow Verify to check if traffic is allowed or denied by Network Security Groups (NSGs), Next Hop to identify routing issues, and Packet Capture to perform deep inspection of traffic anomalies directly on Virtual Machines.
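
The rule-matching logic that IP Flow Verify reports on can be sketched as a first-match search in priority order. This is a deliberately simplified model: the `NsgRule` class and `ip_flow_verify` function are hypothetical, and the sketch matches only on protocol and destination port, whereas real NSG evaluation also considers direction, source and destination addresses, and rules applied at both the subnet and the network interface.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class NsgRule:
    name: str
    priority: int          # lower number is evaluated first
    protocol: str          # "TCP", "UDP", or "*" for any
    port: Optional[int]    # None matches any destination port
    access: str            # "Allow" or "Deny"


def ip_flow_verify(rules: List[NsgRule], protocol: str, port: int) -> Tuple[str, str]:
    """Return (access, rule name) for the first matching rule in priority order."""
    for rule in sorted(rules, key=lambda r: r.priority):
        protocol_ok = rule.protocol in ("*", protocol)
        port_ok = rule.port is None or rule.port == port
        if protocol_ok and port_ok:
            return rule.access, rule.name
    return "Deny", "no matching rule"


inbound_rules = [
    NsgRule("DenyAllInBound", 65500, "*", None, "Deny"),   # lowest-precedence catch-all
    NsgRule("allow-https", 100, "TCP", 443, "Allow"),
    NsgRule("deny-rdp", 200, "TCP", 3389, "Deny"),
]

print(ip_flow_verify(inbound_rules, "TCP", 443))
print(ip_flow_verify(inbound_rules, "TCP", 3389))
print(ip_flow_verify(inbound_rules, "UDP", 53))
```

Returning the rule name alongside the verdict mirrors what makes the tool useful in practice: it tells you which rule to fix, not just that traffic is blocked.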

A critical component of this service is Connection Monitor. This unified tool provides end-to-end connectivity checking between a source (like an Azure VM or an on-premises machine with a monitoring agent installed, historically the Log Analytics agent and now the Azure Monitor Agent) and a destination (another VM, a URI, an FQDN, or an IP address). It supports TCP, ICMP, and HTTP protocols. Connection Monitor measures network performance metrics, specifically Round-Trip Time (RTT) and packet loss, across Azure regions, ExpressRoute connections, and VPNs.
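
The two headline measurements, RTT and packet loss, reduce to simple arithmetic over probe results. The sketch below assumes a hypothetical input format (one RTT in milliseconds per probe, with `None` for a probe that never came back) and a hypothetical function name; it is not the Connection Monitor API.

```python
from statistics import mean
from typing import Dict, List, Optional


def summarize_probes(rtts_ms: List[Optional[float]]) -> Dict[str, Optional[float]]:
    """Compute packet loss percentage and average RTT from a list of probe results.

    Each entry is the round-trip time of one probe in milliseconds,
    or None if the probe was lost.
    """
    sent = len(rtts_ms)
    received = [rtt for rtt in rtts_ms if rtt is not None]
    loss_pct = 100.0 * (sent - len(received)) / sent if sent else 0.0
    avg_rtt = mean(received) if received else None
    return {"loss_pct": loss_pct, "avg_rtt_ms": avg_rtt}


# Four probes sent, one lost: 25% loss, and the average RTT covers the three replies.
print(summarize_probes([20.0, 30.0, None, 40.0]))
```

An alert on connectivity would then compare `loss_pct` or `avg_rtt_ms` against a threshold, in the same way as any other metric alert.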

For an Azure Administrator, these tools are vital for maintaining network availability. Connection Monitor visualizes the network topology hop-by-hop, allowing you to pinpoint exactly where a handshake fails or where latency spikes occur. By integrating findings with Azure Monitor and Log Analytics, administrators can configure alerts to trigger when connectivity drops or performance thresholds are breached, ensuring proactive management of hybrid and cloud-native network infrastructures.

Recovery Services vault and Backup vault

In the context of the Azure Administrator Associate exam (AZ-104), protecting data is a core component of the 'Monitor and maintain Azure resources' domain. Azure utilizes two distinct logical containers to manage and store backup data: the **Recovery Services vault** and the **Backup vault**. While both serve the purpose of securing data, they support different sets of workloads.

The **Recovery Services vault** is the established standard. It supports Azure Virtual Machines, SQL Server and SAP HANA running on Azure VMs, Azure File Shares, and on-premises workloads (via the MARS agent or Azure Backup Server). Crucially, this vault is also the exclusive entity used for Azure Site Recovery (ASR) to manage disaster recovery replication. It offers critical security features like Soft Delete, immutability, and Cross-Region Restore.

The **Backup vault** is a newer entity designed specifically for cloud-native workloads. You must use a Backup vault to protect Azure Blobs, Azure Managed Disks, Azure Database for PostgreSQL, and Azure Kubernetes Service (AKS).

As an administrator, you do not choose between them based on preference; the resource type dictates the vault. If you are backing up a whole VM, you use a Recovery Services vault. If you are backing up a specific Managed Disk or Blob container, you use a Backup vault. Both provide a centralized interface to define backup policies (frequency and retention), configure storage redundancy (LRS, GRS, or ZRS), and monitor job success via Azure Monitor to ensure business continuity and compliance.
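
Since the workload type decides the vault, the choice is effectively a lookup table. The sketch below encodes only the workloads named in this section; the string keys (such as `azure-vm` or `managed-disk`) and the `vault_for` helper are hypothetical labels chosen for illustration.

```python
RECOVERY_SERVICES = "Recovery Services vault"
BACKUP_VAULT = "Backup vault"

# Workload labels are hypothetical; the mapping follows the lists in this section.
VAULT_BY_WORKLOAD = {
    "azure-vm": RECOVERY_SERVICES,
    "sql-server-in-azure-vm": RECOVERY_SERVICES,
    "sap-hana-in-azure-vm": RECOVERY_SERVICES,
    "azure-file-share": RECOVERY_SERVICES,
    "on-premises-mars": RECOVERY_SERVICES,
    "site-recovery": RECOVERY_SERVICES,
    "azure-blob": BACKUP_VAULT,
    "managed-disk": BACKUP_VAULT,
    "postgresql": BACKUP_VAULT,
    "aks": BACKUP_VAULT,
}


def vault_for(workload: str) -> str:
    """Return the vault type required for a workload, or raise if it is not listed."""
    try:
        return VAULT_BY_WORKLOAD[workload]
    except KeyError:
        raise ValueError(f"unknown workload: {workload}") from None


print(vault_for("azure-vm"))
print(vault_for("managed-disk"))
```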

Azure Backup policies and operations

In the context of the Azure Administrator Associate role, managing Azure Backup relies heavily on Recovery Services vaults and Backup Policies to ensure business continuity.

A **Backup Policy** is a rule set that governs the operational behavior of the backup service. It defines two critical parameters: the **Schedule** (how often and at what time backups run, such as daily at 11:00 PM) and the **Retention Range** (how long recovery points are stored). Administrators often use a Grandfather-Father-Son (GFS) rotation scheme, retaining daily backups for short periods while keeping monthly or yearly backups for years to satisfy compliance and audit requirements. Policies are associated with specific items, such as Azure VMs, SQL Server databases running on Azure VMs, or Azure Files shares.
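
The GFS idea can be sketched as a function that decides which retention tier, if any, still protects a given recovery point. The rules here are assumptions made for illustration, not Azure Backup's actual policy engine: the monthly point is assumed to be the one taken on the first day of a month, the yearly point the one taken on 1 January, and the default retention values (30 days, 12 months, 7 years) are arbitrary examples.

```python
from datetime import date
from typing import Optional


def retention_tier(
    point: date,
    today: date,
    daily_days: int = 30,
    monthly_months: int = 12,
    yearly_years: int = 7,
) -> Optional[str]:
    """Return the tier that keeps this recovery point, or None if it can be pruned."""
    age_days = (today - point).days
    age_months = (today.year - point.year) * 12 + (today.month - point.month)
    age_years = today.year - point.year

    if age_days <= daily_days:
        return "daily"                      # the "son": recent daily backups
    if point.day == 1 and age_months <= monthly_months:
        return "monthly"                    # the "father": first-of-month backups
    if point.month == 1 and point.day == 1 and age_years <= yearly_years:
        return "yearly"                     # the "grandfather": 1 January backups
    return None


today = date(2024, 6, 15)
for point in [date(2024, 6, 1), date(2024, 3, 1), date(2024, 3, 2), date(2020, 1, 1)]:
    print(point, retention_tier(point, today))
```

The effect is that recovery points thin out with age: every day is available for the last month, one per month for the last year, and one per year beyond that.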

**Backup Operations** involve the ongoing management of these resources:

1. **Configuration:** This involves creating vaults, defining policies, and enabling protection for resources. For Azure VMs, this triggers the installation of the Azure Backup VM extension.
2. **Restoration:** Administrators must be proficient in various restore methods, including creating a new VM from a restore point, performing 'File Recovery' to mount a snapshot and retrieve individual files, or restoring a disk.
3. **Monitoring and Reporting:** Using **Backup Center**, administrators track job success/failure, monitor storage consumption, and generate compliance reports across subscriptions.
4. **Security Operations:** Crucial for maintaining integrity, this includes managing **Soft Delete** (which retains deleted backup items for 14 days to protect against ransomware or accidental deletion) and configuring Multi-User Authorization (MUA) to prevent unauthorized critical actions.

Azure Site Recovery (ASR)

Azure Site Recovery (ASR) is a critical Disaster Recovery as a Service (DRaaS) solution that Azure Administrators use to ensure Business Continuity and Disaster Recovery (BCDR). It is designed to keep applications and workloads running during planned and unplanned outages by orchestrating replication, failover, and recovery processes.

In the context of monitoring and maintenance, ASR allows administrators to replicate workloads from a primary site to a secondary location. This includes replicating on-premises physical servers, VMware, and Hyper-V VMs to Azure, or replicating Azure VMs from one Azure region to another to protect against regional failures. The service helps maintain low Recovery Point Objectives (RPO) and Recovery Time Objectives (RTO).
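
To make the RPO concept concrete, the data-loss window for a given outage is the gap between the moment of failure and the most recent recovery point before it. The sketch below is a plain calculation with hypothetical timestamps and a hypothetical function name; it does not query ASR.

```python
from datetime import datetime, timedelta
from typing import List


def data_loss_window(recovery_points: List[datetime], outage_time: datetime) -> timedelta:
    """Return the time between the outage and the latest recovery point before it."""
    usable = [point for point in recovery_points if point <= outage_time]
    if not usable:
        raise ValueError("no recovery point exists before the outage")
    return outage_time - max(usable)


points = [
    datetime(2024, 6, 15, 10, 0),
    datetime(2024, 6, 15, 10, 5),
    datetime(2024, 6, 15, 10, 10),
]
outage = datetime(2024, 6, 15, 10, 12)

window = data_loss_window(points, outage)
print(window)

# The RPO target is met when the data-loss window stays within the agreed limit.
print(window <= timedelta(minutes=15))
```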

Key components include:

1. **Replication:** Data is continuously mirrored to the target site. Admins configure policies to define snapshot frequency and retention.
2. **Failover:** During an outage, the administrator triggers a failover, creating VMs in the target region based on replicated data to resume operations.
3. **Failback:** Once the primary site is restored, ASR synchronizes changes back to the original location.

Administrators also use **Recovery Plans** to group multi-tier applications, ensuring VMs start in a specific order (e.g., database before web server) and triggering Azure Automation runbooks for custom configuration. Furthermore, ASR supports **Test Failovers** (DR drills), which validate recovery strategies in an isolated network without impacting production environments. This capability is essential for compliance and for verifying that a failover will actually work before a real outage occurs.
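
The ordering behavior of a recovery plan can be sketched as a loop over groups, with optional actions that run after each group. Everything here is hypothetical and for illustration only: the `run_recovery_plan` function, the VM names, and the "runbook" (a plain Python callable standing in for an Azure Automation runbook).

```python
from typing import Callable, Dict, List, Optional


def run_recovery_plan(
    groups: List[List[str]],
    post_actions: Optional[Dict[int, List[Callable[[], str]]]] = None,
) -> List[str]:
    """Start VMs group by group and run any post-group actions before the next group.

    groups: ordered list of groups, each a list of VM names.
    post_actions: maps a group index to callables that run after that group starts.
    Returns a log of the steps in the order they were executed.
    """
    log: List[str] = []
    for index, vms in enumerate(groups):
        for vm in vms:
            log.append(f"start {vm}")
        for action in (post_actions or {}).get(index, []):
            log.append(action())
    return log


def update_connection_strings() -> str:
    return "runbook: update connection strings"


plan = [["sql-01"], ["web-01", "web-02"]]  # database tier first, then web tier
steps = run_recovery_plan(plan, post_actions={0: [update_connection_strings]})
for step in steps:
    print(step)
```

Running the same plan against an isolated network is, conceptually, what a test failover does: the sequence is exercised end to end without touching production.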
