Why monitoring sucks — for now
http://gigaom.com/2012/02/12/why-monitoring-sucks-for-now/
A new (old) model
I’d suggest that any well-designed monitoring tool can help automate the OODA (observe, orient, decide, act) loop for operations teams.
1. Deep integration
2. Contextual alerting and pattern recognition
3. Timeliness
4. High resolution
5. Dynamic configuration
What’s next for monitoring?
Why Alerts Suck and Monitoring Solutions need to become Smarter
http://www.appdynamics.com/blog/2012/01/23/why-alerts-suck-and-monitoring-solutions-need-to-become-smarter/
#1 Problem Identification – Do I have a problem?
#2 Problem Isolation – Where is my problem?
#3 Problem Resolution – How do I fix my problem?
My ideal monitoring system
http://forecastcloudy.net/2012/01/12/my-ideal-monitoring-system/
- Hosted (CloudKick, ServerDensity, CloudWatch, RevelCloud and others) vs Installed (Nagios, Munin, Ganglia, Cacti)
- Hosted solutions' pricing plans use varied parameters such as price/server, price/metric, retention policy, # of metrics tracked, realtime-ness, etc.
- Poll-based method (a collecting server polls the other servers/services) vs. push-based method (a client on each server pushes locally collected data to a logging/monitoring server)
- Allowing custom metrics – not all systems allow monitoring, plotting, and sending an alert on custom data (at least not in an easy manner)
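
The poll vs. push distinction and the custom-metrics point in the list above can be sketched in a few lines. This is only a minimal in-memory illustration, not how any of the tools named above work: the `Collector` and `Agent` classes, the host names, and the metric names are all hypothetical, and a real system would move the data over the network.

```python
from typing import Callable, Dict, List, Tuple

# (host, metric name, value); a hypothetical sample format for this sketch
Sample = Tuple[str, str, float]


class Collector:
    """Hypothetical central monitoring server that keeps samples in memory."""

    def __init__(self) -> None:
        self.samples: List[Sample] = []
        self.targets: Dict[str, Callable[[], Dict[str, float]]] = {}

    # Poll model: the collector knows its targets and asks each one for data.
    def register_target(self, host: str, read_metrics: Callable[[], Dict[str, float]]) -> None:
        self.targets[host] = read_metrics

    def poll_all(self) -> None:
        for host, read_metrics in self.targets.items():
            for name, value in read_metrics().items():
                self.samples.append((host, name, value))

    # Push model: the collector only accepts whatever the agents send.
    def receive(self, host: str, name: str, value: float) -> None:
        self.samples.append((host, name, value))


class Agent:
    """Hypothetical client running on a monitored server (push model)."""

    def __init__(self, host: str, collector: Collector) -> None:
        self.host = host
        self.collector = collector

    def push(self, name: str, value: float) -> None:
        # Any metric name is accepted, so custom metrics need no extra setup.
        self.collector.receive(self.host, name, value)


def run_demo() -> List[Sample]:
    collector = Collector()

    # Poll: the collector pulls a system metric from web-1.
    collector.register_target("web-1", lambda: {"load1": 0.42})
    collector.poll_all()

    # Push: an agent on db-1 sends a custom, application-level metric.
    Agent("db-1", collector).push("orders_per_minute", 17.0)

    return collector.samples
```

In this sketch the poll side requires the collector to be configured with every target, while the push side lets any server report any metric name, which is one reason push-style agents tend to make custom metrics easier.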