This white paper explains how to create a high availability environment, with redundancy and failover capacity, for four components of the Tivoli monitoring infrastructure. Although the principles are the same for each component, the methods for applying them differ.
If you've deployed IBM Tivoli Monitoring (ITM), you are probably well aware of how great it is at finding out when things go wrong in an enterprise's IT environment. But when a component of ITM fails, it can leave your organization in the dark. Luckily, there are ways to prepare for and prevent ITM failures. Understanding the different components of ITM and how they interact can make all the difference in terms of keeping an ITM system up and running strong.
This paper addresses four components of the Tivoli monitoring infrastructure and offers tips for keeping those crucial ITM components up and running. The four components are:
1. Tivoli Enterprise Monitoring Server (TEMS)
2. Tivoli Enterprise Portal Server (TEPS)
3. Remote Tivoli Enterprise Monitoring Server (RTEMS)
4. Tivoli Enterprise Monitoring Agent (TEMA)
Your strategy for ensuring the availability of each of these components rests on the same principles: creating a high availability environment and providing redundancy and failover capacity. The methods for applying these principles, however, differ from component to component. This paper addresses each of the four components individually and provides examples of ways to ensure its availability.
TEMS: The Queen Bee of the Tivoli Monitoring Infrastructure
Tivoli Enterprise Monitoring Server (TEMS) is the most important component of an effective Tivoli monitoring system. TEMS is the check-in point for the entire Tivoli monitoring system: if it goes down, your monitoring system simply won't work. Measures must therefore be taken to ensure its availability. These measures can include monitoring the TEMS itself as well as creating high availability by providing redundancy and failover capacity.
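As a rough illustration of monitoring the TEMS from the outside, the sketch below checks whether a TCP connection to the server can be opened. It is a minimal example under stated assumptions, not an ITM facility: the function name is hypothetical, and port 1918 is assumed here to be the TEMS listening port, which should be confirmed against your own configuration. A successful connection shows only that something is listening on the port, not that the TEMS is healthy.

```python
import socket


def tems_reachable(host: str, port: int = 1918, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port opens within `timeout` seconds.

    The default port is an assumption; use the port your TEMS actually listens on.
    """
    try:
        # create_connection resolves the host name and attempts a TCP connect.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

A scheduler or an independent monitoring tool could call tems_reachable("tems.example.com") at a fixed interval (the host name is a placeholder) and raise an alert after several consecutive failures, so that a single dropped packet does not trigger a false alarm.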
How to Build Redundancy into the TEMS
There are two main ways to provide redundancy for the TEMS. One is to use a commercial clustering technology, such as HACMP software, to fail over components to a new system. This scenario requires a secondary server configured identically to the primary server. All of the monitored systems connect to a virtual IP address, and this address is active only on the server that is currently running the monitoring system.
Creating redundancy by using clustering software has the advantage of providing a seamless transition to the failover system, thus allowing uninterrupted availability of the monitoring system. When clustering software is used, the managed systems that are being monitored should be blissfully unaware of any problems: the transition to the failover system (ideally) occurs so quickly and smoothly that it barely disrupts the functioning of the system. Although this redundancy solution sounds almost effortless, keep in mind that it requires the purchase of commercial clustering technology as well as implementation of that purchased solution.
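The clustering arrangement described above can be pictured with a small toy model. This is only a sketch of the idea that a virtual IP is active on exactly one of two identically configured servers; it does not represent how HACMP or any other clustering product works internally, and the class names, server names, and IP address are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Node:
    """One server in the cluster (hypothetical model, not a clustering API)."""
    name: str
    healthy: bool = True


class ActivePassiveCluster:
    """Two identically configured nodes that share a single virtual IP."""

    def __init__(self, virtual_ip: str, primary: Node, secondary: Node):
        self.virtual_ip = virtual_ip
        self.nodes = [primary, secondary]
        self.active = primary  # the virtual IP starts out on the primary

    def check_and_fail_over(self) -> Node:
        """Move the virtual IP to the standby if the active node is unhealthy."""
        if not self.active.healthy:
            standby = next(n for n in self.nodes if n is not self.active)
            if standby.healthy:
                self.active = standby
        return self.active

    def resolve(self) -> str:
        """What a monitored system connects to: the virtual IP, whichever node holds it."""
        return f"{self.virtual_ip} -> {self.active.name}"


cluster = ActivePassiveCluster("10.0.0.50", Node("tems-a"), Node("tems-b"))
cluster.nodes[0].healthy = False   # simulate a failure of the primary
cluster.check_and_fail_over()
print(cluster.resolve())           # prints "10.0.0.50 -> tems-b"
```

The monitored systems in this model only ever see the virtual address, which is why they do not need to be reconfigured when the secondary server takes over.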