At a conference I attended, I recall a speaker sharing the idea that if your customer has to let you know about a problem, you have already failed. The WebSphere Health Management system is meant to close this gap, so that problems are detected and remediated before customers are impacted.
At a high level, two main components make up the Health Management system: the Health Controller and Health Policies. The Health Controller is used to enable or disable the Health Management system, to configure how often it is invoked, and to control how application server restarts are handled.
Each Health Policy lets you configure an action that is triggered on targets when a condition is met, in an attempt to detect and resolve problems systematically.
As an example, a health policy can be created to restart (action) a cluster of application servers (target) that has not been restarted for 7 days (condition).
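To make the condition, action, and target relationship concrete, here is a minimal Python sketch of the example above. It is only an illustration of the concept and not the WebSphere API: `Server`, `HealthPolicy`, `older_than`, and `restart` are hypothetical names invented for this sketch, and the "restart" simply resets a timestamp.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List


@dataclass
class Server:
    """Hypothetical stand-in for an application server in a cluster."""
    name: str
    last_restart: datetime


@dataclass
class HealthPolicy:
    """Hypothetical model: a condition, an action, and the targets they apply to."""
    name: str
    condition: Callable[[Server, datetime], bool]
    action: Callable[[Server, datetime], None]
    targets: List[Server]

    def evaluate(self, now: datetime) -> List[str]:
        """Run the action on each target that meets the condition; return their names."""
        triggered = []
        for server in self.targets:
            if self.condition(server, now):
                self.action(server, now)
                triggered.append(server.name)
        return triggered


def older_than(max_age: timedelta) -> Callable[[Server, datetime], bool]:
    """Condition: the server has not been restarted within max_age."""
    return lambda server, now: now - server.last_restart >= max_age


def restart(server: Server, now: datetime) -> None:
    """Action: simulate a restart by resetting the last-restart timestamp."""
    server.last_restart = now


# Two cluster members: one last restarted 8 days ago, one 2 days ago.
now = datetime(2024, 1, 15, 12, 0)
cluster = [
    Server("member1", now - timedelta(days=8)),
    Server("member2", now - timedelta(days=2)),
]

# Restart (action) the cluster members (target) not restarted for 7 days (condition).
policy = HealthPolicy("weekly-restart", older_than(timedelta(days=7)), restart, cluster)
restarted = policy.evaluate(now)
print(restarted)
```

In this sketch only `member1` meets the 7-day condition, so it is the only server the action runs against. Evaluating the policy a second time at the same moment triggers nothing, because the simulated restart reset its timestamp.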