Azure Guidance Report

Note

The Cost and Security recommendations in the Site24x7 Guidance Report will be available only in ManageEngine CloudSpend, as Recommendations Reports. If you use both Site24x7 and CloudSpend, you can continue to get these recommendations from CloudSpend > Reports > Recommendations Reports.

The Recommendations Report in CloudSpend helps you optimize cloud costs and improve the fault tolerance and performance of your cloud infrastructure across AWS, Azure, and GCP accounts. It provides tailored recommendations that can help you achieve significant savings and improve the overall efficiency of your cloud environment.

If you have not subscribed to CloudSpend and want to keep getting these recommendations, you can get started with CloudSpend now.

Get a set of availability best practice checks to increase the performance and reliability of your Azure services. These recommendations are grouped into three priority levels: High, Moderate, and Low.

Metrics-based practices are evaluated using the data gathered during the Azure monitor's data collection. For all other practices, Site24x7 makes on-demand Azure API calls and checks whether the returned data is in line with the practice.

Best Practice Checks

Azure Virtual Machine (VM)

1. Under-utilized VMs detected

Priority:

High

Baseline:

Site24x7 determines whether a VM is idle by analyzing its CPU utilization, memory usage, network in, network out, and disk usage patterns. An Azure VM is deemed under-utilized if it meets one or more of the following criteria:

  • The average daily CPU usage is less than 2% for the last seven days.
  • The average daily memory usage is less than 30% for the last seven days (applicable only if the agent extension is deployed on the Azure VM).
  • The average daily VM Uncached IOPS Consumed Percentage is less than 10% for the last seven days.
  • The average daily VM Uncached Bandwidth Consumed Percentage is less than 10% for the last seven days.
  • The total number of bytes transmitted and received on all network interfaces is less than the threshold, which is 1000 bytes by default.
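
The criteria above can be read as an "any of" rule over a VM's averaged metrics. The following is a minimal sketch of that rule, not Site24x7's implementation: the `VmWeeklyAverages` class, its field names, and the `is_under_utilized` function are hypothetical, and the values are assumed to be already averaged over the last seven days.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VmWeeklyAverages:
    """Hypothetical container for seven-day averages (not a Site24x7 type)."""
    cpu_percent: float
    uncached_iops_percent: float
    uncached_bandwidth_percent: float
    network_total_bytes: float
    # None when the agent extension is not deployed, so no memory data exists.
    memory_percent: Optional[float] = None


def is_under_utilized(vm: VmWeeklyAverages,
                      network_bytes_threshold: float = 1000) -> bool:
    """Return True if at least one of the listed criteria is met."""
    checks = [
        vm.cpu_percent < 2,
        vm.uncached_iops_percent < 10,
        vm.uncached_bandwidth_percent < 10,
        vm.network_total_bytes < network_bytes_threshold,
    ]
    # The memory criterion applies only when the agent reports memory usage.
    if vm.memory_percent is not None:
        checks.append(vm.memory_percent < 30)
    return any(checks)


busy = VmWeeklyAverages(cpu_percent=45, uncached_iops_percent=60,
                        uncached_bandwidth_percent=55,
                        network_total_bytes=5_000_000, memory_percent=70)
idle_cpu = VmWeeklyAverages(cpu_percent=1.2, uncached_iops_percent=60,
                            uncached_bandwidth_percent=55,
                            network_total_bytes=5_000_000)
print(is_under_utilized(busy))      # False
print(is_under_utilized(idle_cpu))  # True
```

Because a single criterion is enough, a VM with healthy disk and network activity is still flagged when its CPU average alone falls below 2%.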

Recommendation:

In Azure, you're billed even for the partial hours consumed by your idle VMs. To reduce the associated costs, consider stopping or terminating the VMs, or scaling down the VM size.

Site24x7 monitors for all the cases mentioned above and provides cost optimization suggestions in the Guidance Report so that you can identify and stop under-utilized instances. The Instance Type recommendations for Azure VMs display the Current Instance Type and a Suggested Instance Type that you can downgrade to for cost optimization.

2. High usage of VM

Priority:

High

Baseline:

An Azure VM is deemed over-utilized if it meets one or more of the following criteria:

  • The average daily CPU usage is more than 90% for the last seven days.
  • The average daily memory usage is more than 90% for the last seven days (applicable only if the agent extension is deployed on the Azure VM).
  • The average daily VM Uncached IOPS Consumed Percentage is more than 95% for the last seven days.
  • The average daily VM Uncached Bandwidth Consumed Percentage is more than 95% for the last seven days.
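
To see which of the criteria above triggered for a given VM, the thresholds can be kept in a small table and compared one by one. This is an illustrative sketch only: the metric key names and the `over_utilization_reasons` function are hypothetical, and the input values are assumed to be seven-day averages.

```python
from typing import Dict, List, Optional

# Thresholds taken from the criteria above; the metric keys are hypothetical.
OVER_UTILIZATION_THRESHOLDS: Dict[str, float] = {
    "cpu_percent": 90,
    "memory_percent": 90,
    "uncached_iops_percent": 95,
    "uncached_bandwidth_percent": 95,
}


def over_utilization_reasons(averages: Dict[str, Optional[float]]) -> List[str]:
    """List the metrics whose seven-day average exceeds its threshold."""
    reasons = []
    for metric, limit in OVER_UTILIZATION_THRESHOLDS.items():
        value = averages.get(metric)
        # Skip metrics with no data, e.g. memory when no agent is deployed.
        if value is not None and value > limit:
            reasons.append(f"{metric} {value}% > {limit}%")
    return reasons


sample = {
    "cpu_percent": 93.5,
    "memory_percent": None,  # agent extension not deployed
    "uncached_iops_percent": 40.0,
    "uncached_bandwidth_percent": 97.0,
}
print(over_utilization_reasons(sample))
```

A non-empty list means the VM meets at least one criterion and would be treated as over-utilized.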

Recommendation:

Change the VM size or add the VM to a VM Scale Set group.

Site24x7 monitors for all the cases mentioned above and provides suggestions for increasing efficiency and performance in the Guidance Report so that you can identify and address over-utilized instances. The Instance Type recommendations for Azure VMs display the Current Instance Type and a Suggested Instance Type that you can upgrade to for better performance and efficiency.

3. User-defined tags for VMs

Priority:

High

Baseline:

Assign metadata in the form of tags (key-value pair) to better track and manage instances, images, and VM Scale Set groups.

Recommendation:

Create a tagging strategy adhering to Azure best practices.

4. High I/O intensity VMs

Priority:

High

Baseline:

I/O-intensive workloads running on lower-tier disks will significantly affect VM performance.

Recommendation:

Migrate any VM disks requiring high IOPS to premium storage.

5. Under-utilized VMs

Priority:

Moderate

Baseline:

A VM is deemed under-utilized if its CPU usage has been less than 2% for the past 48 hours.

Recommendation:

In Azure, you are billed based on the instance type and the number of consumed hours. Lower costs by identifying and stopping under-utilized VMs.

6. Auto-shutdown resources with 'environment: testing, env: testing' tag

Priority:

Moderate

Baseline:

Delete VMs created for testing and other internal activities to reduce costs.

Recommendation:

Remove VMs that were added for testing and have been running for more than a week. You can also create Spot VMs for testing and other workloads.
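
As an illustration of this check, the sketch below flags VMs that carry one of the testing tags and are more than a week old. It is not Site24x7's logic: the `is_stale_test_vm` function is hypothetical, tag matching is assumed to be case-insensitive, and the VM's creation time is used as a stand-in for how long it has been running.

```python
from datetime import datetime, timedelta, timezone
from typing import Dict

# Tag pairs named in the check title; case-insensitive matching is an assumption.
TEST_TAGS = {("environment", "testing"), ("env", "testing")}


def is_stale_test_vm(tags: Dict[str, str], created_at: datetime, now: datetime,
                     max_age: timedelta = timedelta(days=7)) -> bool:
    """True if the VM has a testing tag and is older than max_age."""
    normalized = {(k.strip().lower(), v.strip().lower()) for k, v in tags.items()}
    has_test_tag = bool(normalized & TEST_TAGS)
    # Creation time approximates running time here (an assumption).
    return has_test_tag and (now - created_at) > max_age


now = datetime(2025, 1, 15, tzinfo=timezone.utc)
two_weeks_ago = datetime(2025, 1, 1, tzinfo=timezone.utc)
yesterday = datetime(2025, 1, 14, tzinfo=timezone.utc)

print(is_stale_test_vm({"env": "testing"}, two_weeks_ago, now))          # True
print(is_stale_test_vm({"Environment": "Testing"}, yesterday, now))      # False
print(is_stale_test_vm({"environment": "production"}, two_weeks_ago, now))  # False
```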

7. VMs not attached to Availability Set Group

Priority:

Low

Baseline:

Placing VMs in an availability set helps keep your application operational when a hardware or software failure happens, because only a subset of your VMs is impacted.

Recommendation:

Create an availability set for the VM.

8. Auto-delete test VMs

Priority:

Moderate

Baseline:

Delete VMs created for testing and other internal activities to reduce costs.

Recommendation:

Remove VMs that were added for testing and have been running for more than a week.

9. VMs with no tags

Priority:

High

Baseline:

Assign metadata in the form of tags (key-value pairs) to track and manage instances, images, and VM Scale Set groups.

Recommendation:

Create a tagging strategy adhering to Azure's best practices.

10. VMs not backed up

Priority:

High

Baseline:

Backing up VMs in Azure protects their data, ensures business continuity, enables point-in-time disaster recovery, and paves the way for centralized management and scalability.

Recommendation:

Back up Azure VMs for comprehensive data protection and to ensure that your data and applications are safe, compliant, and available when you need them.

11. Azure VM - Unavailable

Priority:

High

Baseline: 

Checks if the current resource health status of the VM is unavailable.

Description: 

Azure Resource Health reports the current and past health status of the VM by checking the operational reliability of multiple Azure components, such as the server hosting the VM, network connectivity between related components, scheduled maintenance, etc. Azure Resource Health also provides additional information to determine the source of the problem.

Recommendation: 

Set up automated actions to restart or replace the instance if the issue is persistent. Follow the actions recommended by Azure Resource Health to address the issue.

12. Azure VM - Accelerated networking

Priority:

Moderate

Baseline: 

Checks whether accelerated networking is enabled on the VM's network interfaces.

Description: 

Accelerated networking enables single root I/O virtualization to a VM, greatly improving its networking performance. This feature provides lower latency, reduced jitter, and decreased CPU utilization.

Recommendation: 

Enable accelerated networking for your VMs that run network-intensive workloads to maximize the benefits of network virtualization.

13. Azure VM - Unmanaged disk

Priority:

Moderate

Baseline: 

Checks if the VM has an unmanaged disk attached to it.

Description: 

Managed disks provide numerous benefits over unmanaged disks, such as eliminating management and maintenance overhead and offering better reliability and scalability.

Recommendation: 

Azure recommends using managed disks instead of unmanaged disks with a VM. On September 30, 2025, running VMs that use unmanaged disks will be stopped and deallocated, and stopped VMs that use unmanaged disks can no longer be started.

14. Azure VM - Data disk

Priority:

Low

Baseline: 

Checks if the VM is using data disks for storing application data.

Description: 

Using data disks for application data can improve performance and manageability. It separates the OS disk from application data, allowing for better I/O performance, easier backups, and preventing potential issues if both the OS and applications compete for disk resources.

Recommendation: 

Ensure that your Azure VMs are configured to use data disks for storing application data to enhance performance and manageability.

Azure Public IP Address

1. Unmapped Public IP Address

Priority:

High

Baseline:

Mask the failure of an instance or resource by disassociating the public IP address from it and remapping the address to a different resource in the same account.

Recommendation:

A small hourly fee is levied on unused public IP addresses, so either associate the address with an active instance or interface, or delete it.

Azure App Service Plan

1. Scale in less-used App Service Plan

Priority:

High

Baseline:

Stop paying more for under-used App Service Plans.

Recommendation:

Scale in the instances to reduce costs.

2. App Service consuming more than 80% average memory

Priority:

High

Baseline:

High memory usage may degrade the performance of applications running on the App Service Plan. Consider scaling up the plan to raise the memory limit.

Recommendation:

Scale up the plan to improve the performance.

3. App Service consuming more than 80% CPU time

Priority:

High

Baseline:

High CPU usage may degrade the performance of applications running on the App Service Plan. Consider scaling up the plan to raise the CPU limit.

Recommendation:

Scale up the plan to improve the performance.

4. Less than 5% site count usage for App Service Plan

Priority:

High

Baseline:

If the number of sites in use is less than 5% of the number of sites allowed, the App Service Plan is considered under-utilized.
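
The 5% rule is a simple ratio. A minimal sketch, assuming the site counts are already known (the function names here are hypothetical, not part of Site24x7 or Azure):

```python
def site_usage_percent(sites_in_use: int, sites_allowed: int) -> float:
    """Share of the plan's allowed sites that are in use, as a percentage."""
    if sites_allowed <= 0:
        raise ValueError("sites_allowed must be positive")
    return 100.0 * sites_in_use / sites_allowed


def is_plan_under_utilized(sites_in_use: int, sites_allowed: int,
                           threshold_percent: float = 5.0) -> bool:
    """True if site usage is strictly below the threshold (5% by default)."""
    return site_usage_percent(sites_in_use, sites_allowed) < threshold_percent


print(is_plan_under_utilized(1, 100))   # True: 1% usage
print(is_plan_under_utilized(10, 100))  # False: 10% usage
```

Note that a plan at exactly 5% usage is not flagged, because the rule says "less than 5%".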

Recommendation:

Move the apps to a different App Service Plan and remove this plan to save costs.

Azure App Services

1. App Services with high response time

Priority:

High

Baseline:

Slow is the new down. An App Service with a high response time will affect your business. Keep track of the App Services that have been responding slowly over the last week.

Recommendation:

Probe your application further using APM and find the modules/resources that are causing problems.

2. App Services with a high number of 5xx error codes

Priority:

High

Baseline:

An error-prone App Service indicates that some part or module is failing and affecting your business.

Recommendation:

Reduce error responses by implementing proper error handling and fixing the failing modules.

3. Auth-disabled App Services

Priority:

High

Baseline:

App Services with authentication disabled allow anonymous access, and users are not prompted to log in.

Recommendation:

Enable authentication to avoid anonymous access.

4. Backups are not enabled for some App Services

Priority:

High

Baseline:

Azure Backup will help to recover the App Services in case of any failure.

Recommendation:

Enable backup for the Azure App Service.

5. App Services with no tags

Priority:

High

Baseline:

Manage Azure resources more easily with tags. Untagged resources may sometimes go unnoticed and are difficult to manage.

Recommendation:

Tag the Azure resources with appropriate key-value pairs to ease management.

Azure Function App

1. Publicly accessible Azure Functions

Priority:

High

Baseline:

Azure Functions are charged based on the number of requests, and a request is any response to an event notification or invoke call. Allowing unauthorized executions can lead to unexpected charges on your subscriptions.

Recommendation:

Use Azure function login policies to manage invocation permissions.

Azure Logic Apps

1. Retry Policy not configured

Priority:

Moderate

Baseline:

Use a Retry Policy in any supported action or trigger. A retry policy specifies whether and how the action retries a request when the original request times out or fails.

Recommendation:

Set up a Retry Policy to automate error handling and recovery in your Logic Apps.

Azure Load Balancer

1. Add Health Probes

Priority:

Moderate

Baseline:

Health Probes are used to detect the health status of the backend endpoints.

Recommendation:

We recommend adding Health Probes to detect application failures and improve performance.

2. Basic load balancer 

Priority:

Moderate

Baseline: 

Checks for basic load balancers being used in your Azure environment.

Description: 

The basic load balancer will be retired on September 30, 2025. The standard load balancer provides significant improvements, including high performance, ultra-low latency, more resilient load balancing, security by default, and an SLA of 99.99% availability.

Recommendation: 

Consider migrating from basic to standard load balancers.

3. Zone redundancy

Priority:

High

Baseline: 

Checks if the front end public IP of the load balancer is zone redundant.

Description: 

Zone redundancy ensures that the load balancer is resilient to zone failures by distributing its resources across multiple availability zones. If the public IP in the load balancer's front end is zone-redundant, then the load balancer is also zone-redundant.

Recommendation: 

Consider migrating the load balancer to availability zone support.

4. Backend pool redundancy

Priority:

Moderate

Baseline: 

Checks if the backend pool contains at least two instances.

Description: 

If your backend pool only has one instance and it's unhealthy, all traffic sent to the backend pool fails due to a lack of redundancy. The Standard Load Balancer SLA is also only supported when there are at least two healthy backend pool instances per backend pool.

Recommendation: 

Ensure that the backend pool has at least two backend addresses to maintain redundancy and high availability.
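
A minimal sketch of this check, assuming each backend pool has already been reduced to a list of its backend addresses (the `pools_lacking_redundancy` function and the pool names are hypothetical):

```python
from typing import Dict, List


def pools_lacking_redundancy(pools: Dict[str, List[str]],
                             minimum: int = 2) -> List[str]:
    """Return the names of pools with fewer than `minimum` distinct addresses."""
    return sorted(
        name for name, addresses in pools.items()
        # Duplicated addresses do not add redundancy, so count distinct ones.
        if len(set(addresses)) < minimum
    )


pools = {
    "web-pool": ["10.0.0.4", "10.0.0.5"],
    "api-pool": ["10.0.1.4"],
    "empty-pool": [],
}
print(pools_lacking_redundancy(pools))  # ['api-pool', 'empty-pool']
```

This only counts configured addresses. The SLA condition described above also requires the instances to be healthy, which would need health probe data that this sketch does not model.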

5. Outbound rule default port allocation

Priority:

Moderate

Baseline: 

Checks if the load balancer outbound rules are using default port allocation.

Description: 

Default port allocation for outbound rules might not be optimal for all scenarios and can cause a higher risk of source network address translation (SNAT) port exhaustion and scalability issues. Manual port allocation can help maximize the number of SNAT ports made available for each of the instances in your backend pool, which can help prevent your connections from being impacted due to port reallocation. 
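
When sizing a manual allocation, a rough upper bound on ports per instance can be computed from the number of frontend IPs and backend instances. The sketch below is illustrative only. The figures it uses (64,000 SNAT ports per frontend public IP, and per-instance allocations in multiples of eight) are assumptions based on Azure's outbound rule documentation and are not stated on this page, so verify them before relying on the result. The function name is hypothetical.

```python
# Assumption (from Azure's outbound rule documentation, not this page):
# each frontend public IP contributes 64,000 SNAT ports.
PORTS_PER_FRONTEND_IP = 64_000
# Assumption: ports per instance must be a multiple of 8.
PORT_MULTIPLE = 8


def max_ports_per_instance(frontend_ips: int, backend_instances: int) -> int:
    """Largest even split of SNAT ports per backend instance, rounded down."""
    if frontend_ips < 1 or backend_instances < 1:
        raise ValueError("frontend_ips and backend_instances must be >= 1")
    raw = (frontend_ips * PORTS_PER_FRONTEND_IP) // backend_instances
    # Round down to the nearest allowed multiple.
    return raw - (raw % PORT_MULTIPLE)


print(max_ports_per_instance(1, 10))   # 6400
print(max_ports_per_instance(2, 100))  # 1280
print(max_ports_per_instance(1, 3))    # 21328
```

If the backend pool is expected to grow, compute the value for the maximum planned number of instances so that later scale-out does not exceed the available ports.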

Recommendation: 

Review and customize the port allocation instead of the default port allocation setting for outbound rules to suit your application needs better.

6. TCP reset

Priority:

Moderate

Baseline: 

Checks whether TCP Reset is enabled on outbound rules of the load balancer.

Description: 

TCP resets on your load balancer send bidirectional TCP reset packets to both client and server endpoints on idle time-out to inform your application endpoints that the connection timed out and is no longer usable. Enabling TCP Reset on the load balancer helps in quickly terminating idle or long-lived connections, improving the overall performance.

Recommendation: 

Enable TCP Reset on your load balancer for efficient connection management and to ensure the desired application behavior.

Azure Managed Disk

1. Shared LRS

Priority:

Low

Baseline:

Checks for shared locally-redundant storage (LRS) disks attached to multiple VMs.

Description:

LRS replicates your data three times within a single data center in a region. If the disk is shared between multiple VMs, the shared disk becomes a single point of failure for your clustered application. If the shared disk experiences an outage, all the VMs attached to it will experience downtime.

Recommendation:

Consider using zone-redundant storage for improved availability when using shared disk.

Azure Virtual Machine Scale Set

1. Automatic instance repairs

Priority:

Moderate

Baseline:

Checks whether automatic instance repairs are enabled on the VM scale set.

Description: 

Enabling automatic instance repairs for Azure VM scale sets helps achieve high availability for applications by maintaining a set of healthy instances. Automatic instance repair will attempt to recover an unhealthy instance by triggering repair actions such as deleting the unhealthy instance and creating a new one to replace it, reimaging the unhealthy instance, or restarting the unhealthy instance.

Recommendation:

Enable automatic instance repairs to ensure the health and availability of your VM instances in the VM scale set.

2. Zone redundancy

Priority:

High

Baseline:

Checks if the VM scale set is deployed across multiple availability zones.

Description:

A zone-redundant VM scale set distributes instances across multiple availability zones, providing higher availability and resilience against zone failures.

Recommendation:

Deploy your Virtual Machine Scale Sets across multiple availability zones to enhance availability and protect against data center-level failures.
