Alarms Engine
The Alarms Engine decides whether a network resource has a problem. It applies the conditions you define to the data collected through resource monitoring and marks the status of the resource (a monitor) as Down, Trouble, or Up. These checks can be configured in the Threshold and Availability profile and the Notification Profile.
Internet Services Monitoring:
Monitors such as Website, Web Application, DNS, and FTP are categorized as Internet Services Monitors. For these monitors, the Alarms Engine checks performance and availability from multiple locations. Site24x7 also eliminates false alarms by applying the "False Alarm Protector".
Whenever a downtime is detected, Site24x7 takes a screenshot via a Real Browser for website checks. To rule out network failures, Site24x7 looks for other monitored resources that were available during the same period. If any other monitor is up, it concludes that this particular monitor is down and triggers the alert. If no other monitor reports an Up status, Site24x7 checks the accessibility of known websites to determine the network status. Moreover, when a website downtime is reported through an error code thrown by the browser, the Alarms Engine rechecks it from other (secondary) global locations and then confirms whether the website is down or not. Once a website is marked Down, persistent monitoring is performed every minute to reduce the downtime period.
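The network check described above can be sketched roughly as follows. This is only an illustration, not Site24x7's actual implementation: the names `Verdict` and `classify_failure` are hypothetical, and the handling of the known-website probe (reachable means the network is fine, unreachable means a network problem) is an assumption drawn from the description. The recheck from secondary locations is left out.

```python
from enum import Enum
from typing import Sequence


class Verdict(Enum):
    MONITOR_DOWN = "monitor down, trigger alert"
    NETWORK_ISSUE = "network problem suspected"


def classify_failure(other_monitors_up: Sequence[bool],
                     known_sites_reachable: Sequence[bool]) -> Verdict:
    """Decide whether a failed check points at the monitor or at the network."""
    # Step 1: if any other monitored resource was up during the same period,
    # the network is working, so the failure belongs to this monitor.
    if any(other_monitors_up):
        return Verdict.MONITOR_DOWN
    # Step 2: no other monitor reported Up, so fall back to probing
    # well-known websites to judge the network status.
    # Assumption: one reachable known site is enough to clear the network.
    if any(known_sites_reachable):
        return Verdict.MONITOR_DOWN
    # Assumption: nothing was reachable, so the network is the likely cause.
    return Verdict.NETWORK_ISSUE
```

For example, `classify_failure([False, True], [])` returns `Verdict.MONITOR_DOWN`, while `classify_failure([], [False, False])` returns `Verdict.NETWORK_ISSUE`.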
Thresholds on Performance
Apart from uptime monitoring, Site24x7 also examines the performance of your resources, validates the response, and notifies you of any problem it detects by reporting a severity status such as Trouble or Down. The Alarms Engine validates the response data so that corrective action can be taken when a particular keyword is present in, or absent from, your web page. For example, keywords like "Exception", "Error", or "Page Not Found" trigger an alert when present in the web page. Site24x7 also checks for non-static keywords that are either generated by your scripts (JSP or ASP) or output by your back-end server, and triggers an alert when unauthorized changes are made to the web page.
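As a rough illustration of the keyword checks, the sketch below scans a page body for keywords that should not appear and for keywords that should. The function name `keyword_alerts` and its parameters are hypothetical, and case-insensitive substring matching is an assumption; the product's actual matching rules may differ.

```python
from typing import Iterable, List


def keyword_alerts(page_text: str,
                   must_not_contain: Iterable[str] = (),
                   must_contain: Iterable[str] = ()) -> List[str]:
    """Return one alert message per keyword rule that the page violates."""
    # Assumption: matching is a case-insensitive substring search.
    text = page_text.lower()
    alerts = []
    for keyword in must_not_contain:
        if keyword.lower() in text:
            alerts.append(f"unexpected keyword present: {keyword}")
    for keyword in must_contain:
        if keyword.lower() not in text:
            alerts.append(f"expected keyword missing: {keyword}")
    return alerts
```

Calling `keyword_alerts("<h1>Page Not Found</h1>", must_not_contain=["Exception", "Error", "Page Not Found"])` returns `["unexpected keyword present: Page Not Found"]`, and a page that satisfies every rule returns an empty list.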
Server Monitoring:
Site24x7 has built-in smart alerting for some metrics, such as Response Time for URLs and CPU and Memory Utilization for servers. The Trouble status is generated only if:
- The current value is higher than the configured threshold.
- The current value is greater than the average of the 6 polled values (current value + 5 last polled values).
- At least one other value in the last 5 polled values is higher than the configured threshold.
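The smart alerting conditions listed above can be expressed as a small check. This is a minimal sketch that assumes all three conditions must hold together; `is_trouble` and its parameter names are hypothetical and not part of any Site24x7 API.

```python
from typing import Sequence


def is_trouble(current: float, last_five: Sequence[float], threshold: float) -> bool:
    """Return True when the smart alerting conditions would raise Trouble."""
    if len(last_five) != 5:
        raise ValueError("expected exactly 5 previously polled values")
    # Average over the 6 polled values: the current one plus the last 5.
    window_average = (current + sum(last_five)) / 6
    return (
        current > threshold                                 # above the threshold
        and current > window_average                        # above the 6-poll average
        and any(value > threshold for value in last_five)   # one earlier breach
    )
```

With a threshold of 90, `is_trouble(95, [50, 60, 92, 55, 58], 90)` is `True`, whereas `is_trouble(95, [50, 60, 70, 55, 58], 90)` is `False` because no earlier value crossed the threshold, so a single spike does not raise Trouble.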
To learn how the Alarms Engine keeps an eye on your server uptime, refer here.
E-mail sample of the RCA report generated during a server downtime