Warm Standby

Concept

To make the iAGENT system more resilient to failures, a second, identical iAGENT routing process is installed as a backup on a secondary node, that is, on a separate server in the same network. It runs in warm standby (inactive) mode against the same iAGENT database as the primary.

If the primary routing process fails, the secondary process automatically switches from standby to active by means of synchronization via the database. In the “warm standby” state, the routing process has already initialized some of its components, so the more time-consuming steps of the startup sequence may already be complete. Once the rest of the startup sequence has finished, the standby routing process is fully functional and takes over all tasks of the original primary node.

How is a failure of the primary process detected?

The active routing process writes a “heartbeat update” to the database every 5 seconds. If this cyclic update stops because of a process or hardware failure, the backup routing process detects the missing heartbeat and obtains additional confirmation by attempting a connection between the backup and primary nodes.
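The detection step can be illustrated with a minimal Python sketch. It is not the iAGENT implementation: the function names, the tolerance of three missed heartbeats, and the probe port are assumptions made for illustration. Only the 5-second interval comes from the description above. The text also does not say how the result of the connection attempt is evaluated, so the probe here only reports whether a TCP connection could be opened.

```python
import socket
from datetime import datetime, timedelta

HEARTBEAT_INTERVAL = timedelta(seconds=5)  # interval stated in this document
MISSED_BEATS_ALLOWED = 3                   # assumption: tolerance before reacting


def heartbeat_is_stale(last_heartbeat: datetime, now: datetime,
                       interval: timedelta = HEARTBEAT_INTERVAL,
                       missed: int = MISSED_BEATS_ALLOWED) -> bool:
    """Return True if the last heartbeat is older than `missed` intervals."""
    return now - last_heartbeat > interval * missed


def can_connect(host: str, port: int, timeout: float = 2.0) -> bool:
    """Try to open a TCP connection to the other node (hypothetical probe)."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

In this sketch, a heartbeat last written 20 seconds ago counts as stale, while one written 10 seconds ago does not. A real standby process would read the timestamp from the shared database rather than receive it as an argument.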

If the primary routing process is stopped properly, if the database is unavailable, or if the connection from either node to the database is broken, the backup node does not take over.
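The takeover rules can be summarized as a small decision function. This is a sketch under stated assumptions: the field names are hypothetical, and how the backup learns that the primary was stopped properly, or how it confirms a failure, is not specified here, so both are modelled as plain boolean inputs.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StandbyView:
    """What the backup node currently observes (hypothetical field names)."""
    database_reachable: bool       # backup can reach the iAGENT database
    primary_stopped_properly: bool # primary was shut down on purpose
    heartbeat_missing: bool        # cyclic heartbeat update has stopped
    failure_confirmed: bool        # additional confirmation was obtained


def should_take_over(view: StandbyView) -> bool:
    """Return True only when every condition for a takeover is met."""
    if not view.database_reachable:
        return False  # no takeover without database access
    if view.primary_stopped_properly:
        return False  # a deliberate stop is not a failure
    return view.heartbeat_missing and view.failure_confirmed
```

For example, `should_take_over(StandbyView(True, False, True, True))` returns `True`, whereas the same observation with `database_reachable=False` returns `False`.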

How does switching back to the primary process work?

The system ensures that only one active routing process runs against the same iAGENT database at any time. If the primary node becomes available again and resumes its cyclic update while the backup is still active, the primary shuts itself down again. To switch back after a failover, the backup must therefore first be terminated manually.
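The single-active rule can be sketched as a startup check. The node identifiers, the 15-second freshness threshold, and the idea of reading the currently active node from the database are assumptions for illustration, not documented iAGENT behaviour.

```python
from typing import Optional


def startup_decision(node_id: str,
                     active_node_id: Optional[str],
                     active_heartbeat_age_s: float,
                     stale_after_s: float = 15.0) -> str:
    """Decide what a starting routing process should do.

    Returns "shut_down" if another node is still actively writing
    heartbeats, otherwise "become_active".
    """
    other_node_active = (
        active_node_id is not None
        and active_node_id != node_id
        and active_heartbeat_age_s <= stale_after_s
    )
    return "shut_down" if other_node_active else "become_active"
```

With these assumptions, a restarted primary that finds a fresh heartbeat from the backup returns `"shut_down"`. After the backup has been terminated manually and its heartbeat has aged past the threshold, the same call returns `"become_active"`.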

Requirements

  • Second server for the backup process, with comparable hardware, in the same network
  • Apache web server (see iAGENT system requirements)
  • File system synchronization of certain files/directories with write permissions on both sides between primary node and backup (see iAGENT warm standby file system for documentation)
  • Access to iAGENT database from backup
  • TCP/IP connectivity between primary and backup, with the firewall opened accordingly
  • HTTP/HTTPS access to Apache on the backup for the clients