Important things to know about nodes and clusters.

  • IBM i cluster resource services provide mechanism and management services. These services group IBM i systems, partitions and resources so that they work together as a unified system.
  • Clusters provide application availability and highly available disk pools.
  • An IBM i cluster can have from one to a maximum of 128 nodes.
    A node can be a member of only one cluster.
  • During cluster configuration, resources (nodes, disk pool, takeover IP address, etc.) are defined as part of the cluster.

When the iASP is switched from SYSA to SYSB, it should keep the same iASP name, iASP number, disk unit numbers and virtual address. This is achieved by making the cluster nodes members of a construct called a device domain. The device domain also prevents conflicts that could cause an attempt to switch the iASP to fail. So, after creating the cluster and enrolling the nodes into it, you add the nodes to a device domain (ADDDEVDMNE). Note that a cluster node can belong to only one device domain.
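The "one device domain per node" rule can be illustrated with a small Python sketch. This is only a toy model of the constraint, not how cluster resource services implement it; the class name `DeviceDomainRegistry` and the domain name `DEVDMN1` are hypothetical.

```python
class DeviceDomainRegistry:
    """Toy model: maps each cluster node to at most one device domain."""

    def __init__(self):
        self._domain_of = {}  # node name -> device domain name

    def add_entry(self, node, domain):
        # Reject a node that is already enrolled in a different device domain.
        current = self._domain_of.get(node)
        if current is not None and current != domain:
            raise ValueError(f"{node} already belongs to device domain {current}")
        self._domain_of[node] = domain

    def domain_of(self, node):
        # Returns None for a node that has not been added to any device domain.
        return self._domain_of.get(node)


registry = DeviceDomainRegistry()
registry.add_entry("SYSA", "DEVDMN1")  # hypothetical device domain name
registry.add_entry("SYSB", "DEVDMN1")
```

In this model, trying to add SYSA to a second device domain raises an error, which mirrors the rule stated above.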

Let us now move on to Cluster Resource Groups. A Cluster Resource Group (CRG) is an IBM i system object that is used to manage the high availability of resources in a cluster.

IBM i cluster resource services provide the ability to create specific types of CRGs: application, data, device and peer. The differences are noted below.

  • The iASP holds data that should be available on SYSB after a switchover or failover. A Device CRG manages this availability.
  • A resilient application would use an Application CRG to restart the application on the backup node or to enable a takeover IP address.
  • When a cluster runs applications based on logical replication through cluster middleware, a Data CRG manages the availability of the data.
  • A Peer CRG is used to keep objects in *SYSBAS available and synchronized.

Each of the supported IBM i cluster CRG types has two common elements: a recovery domain and an exit program. The exit program invokes actions when the CRG detects cluster-wide events such as node failure or application errors. The exit program is optional for a resilient Device CRG but is required for the other CRG types.

The CRG manages resource availability across a subset of nodes within the cluster called a recovery domain. Within the recovery domain there is a primary node and one or more backup nodes. Primary and backup are roles within the recovery domain. The primary node has access to the switchable resources. The roles of the nodes change during a switchover or failover. All nodes that will be in the recovery domain of a device CRG must be in the same device domain.
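The recovery domain rules described above (one primary, one or more backups, all nodes in the same device domain) can be written down as a short check. This is a hedged Python sketch of the rules as stated in this article, not an IBM i API; the function name and the data layout are assumptions made for illustration.

```python
def validate_device_crg_recovery_domain(recovery_domain, device_domain_of):
    """Check a proposed recovery domain for a device CRG.

    recovery_domain: list of (node, role) pairs, role is "primary" or "backup".
    device_domain_of: dict mapping node name -> device domain name.
    """
    roles = [role for _, role in recovery_domain]
    if roles.count("primary") != 1:
        raise ValueError("recovery domain needs exactly one primary node")
    if roles.count("backup") < 1:
        raise ValueError("recovery domain needs at least one backup node")
    # Every node must be enrolled in a device domain, and it must be the same one.
    domains = {device_domain_of.get(node) for node, _ in recovery_domain}
    if None in domains or len(domains) != 1:
        raise ValueError("all nodes must be in the same device domain")
    return True


# Hypothetical device domain name, used only for this example.
device_domain_of = {"SYSA": "DEVDMN1", "SYSB": "DEVDMN1"}
recovery_domain = [("SYSA", "primary"), ("SYSB", "backup")]
is_valid = validate_device_crg_recovery_domain(recovery_domain, device_domain_of)
```

If SYSB were enrolled in a different device domain, the same call would raise an error instead of returning.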

Tasks to perform when creating a switchable cluster:

  1. Create a device CRG (CRTCRG) in the cluster defined earlier, with SYSA as the primary node and SYSB as the backup node. During creation of the device CRG, you associate the existing iASP on SYSA with the device CRG. The iASP is now defined at cluster level as a switchable device, and both nodes in the cluster are aware of it.
  2. Add SYSA and SYSB to the recovery domain of the device CRG.
  3. Start the device CRG (STRCRG) to enable iASP switchover. The switched disk configuration is now complete and the iASP can be switched over manually.
  4. Invoke a manual switchover with the CHGCRGPRI command. A switchover changes the roles of SYSA and SYSB: the current primary node (SYSA) is assigned the role of last active backup, and the current first backup is assigned the role of primary.
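The role change in step 4 amounts to rotating the recovery domain order. The following Python sketch models only that reordering and assumes every backup node is active; the function name is hypothetical and nothing here calls IBM i.

```python
def switchover(recovery_domain):
    """Return the node order after a switchover.

    recovery_domain is an ordered list of node names: index 0 is the primary,
    the remaining entries are the backups in order (first backup at index 1).
    """
    if len(recovery_domain) < 2:
        raise ValueError("switchover needs a primary and at least one backup")
    old_primary, *backups = recovery_domain
    # The first backup becomes primary; the old primary becomes the last backup.
    return backups + [old_primary]


after_two_node = switchover(["SYSA", "SYSB"])            # ["SYSB", "SYSA"]
after_three_node = switchover(["SYSA", "SYSB", "SYSC"])  # ["SYSB", "SYSC", "SYSA"]
```

With only SYSA and SYSB in the recovery domain, the two nodes simply trade roles; the three-node case (SYSC is a made-up extra node) shows why the old primary ends up as the last backup rather than the first.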

How does switching work?

If the iASP is varied on, cluster resource services will try to end the jobs that have the iASP in their namespace and then vary off the iASP on SYSA.

If the device CRG is set to vary on the iASP on SYSB, cluster resource services will submit a batch job, QCSTVRYDEV, on SYSB to vary on the iASP.

If the switchover is successful, the device CRG status changes from Switchover Pending to Active.
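The sequence just described can be summarized as a small decision flow. This Python sketch only restates the steps from the text as an ordered list; the function and parameter names are assumptions, and the real work is done by cluster resource services, not by code like this.

```python
def plan_switchover(iasp_varied_on, vary_on_at_backup,
                    old_primary="SYSA", new_primary="SYSB"):
    """List the steps of a successful switchover, in order, as described above."""
    steps = ["CRG status: Switchover Pending"]
    if iasp_varied_on:
        # The iASP is in use, so jobs must be ended before it can be varied off.
        steps.append(f"end jobs using the iASP on {old_primary}")
        steps.append(f"vary off the iASP on {old_primary}")
    if vary_on_at_backup:
        # Only happens when the device CRG is configured to vary on the iASP.
        steps.append(f"submit batch job QCSTVRYDEV on {new_primary} to vary on the iASP")
    steps.append("CRG status: Active")
    return steps


for step in plan_switchover(iasp_varied_on=True, vary_on_at_backup=True):
    print(step)
```

Calling it with `vary_on_at_backup=False` drops the QCSTVRYDEV step, which corresponds to a device CRG that is not set to vary on the iASP on the new primary.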

