Name	Name	Last commit message	Last commit date
parent directory ..
etcd	etcd
resource-agents-build	resource-agents-build
README.md	README.md
apply-rpm-patch.sh	apply-rpm-patch.sh
apply-rpm-patch.yml	apply-rpm-patch.yml
build-and-patch-resource-agents.yml	build-and-patch-resource-agents.yml
collect-tnf-logs.yml	collect-tnf-logs.yml
fencing_validator.sh	fencing_validator.sh
force-new-cluster.yml	force-new-cluster.yml
inventory_ocp_hosts.sample	inventory_ocp_hosts.sample
refresh-certificates.sh	refresh-certificates.sh

Helpers

Utilities for OpenShift cluster operations including package management and cluster validation.

Description

This directory contains multiple helper tools for various OpenShift cluster operations:

Resource Agent Patching: Scripts and playbooks (recommended usage) for installing RPM packages on cluster nodes using rpm-ostree's override functionality
Fencing Validation: Tools for validating two-node cluster fencing configuration and health
Custom OCP 5.x payload (optional): resource-agents-build/custom-payload.sh to build a custom RHCOS layer from the resource-agents RPM and publish a custom release image with oc adm release new

Requirements

For build-and-patch-resource-agents.yml and apply-rpm-patch.yml

Ansible playbooks. See below for specific prerequisites for each

For apply-rpm-patch.sh

oc CLI tool (logged into OpenShift cluster)
jq for JSON processing
SSH access to cluster nodes

For fencing_validator.sh

oc CLI tool (logged into OpenShift cluster)
For SSH transport: passwordless sudo access to cluster nodes
Two-node cluster with fencing topology

Available Tools

Force New Cluster

Automates etcd cluster recovery by configuring CIB (Cluster Information Base) attributes to force a new etcd cluster formation. This is useful when etcd quorum is lost and manual intervention is required to restore cluster functionality.

Features:

Automated etcd snapshot creation before recovery operations
CIB attribute management for force-new-cluster operations
Leader/follower node detection and verification
Etcd member list management
Automatic cleanup and resource recovery
STONITH management during operations

Usage:

# From helpers/ directory
ansible-playbook -i ../deploy/openshift-clusters/inventory.ini force-new-cluster.yml

Prerequisites:

Inventory file with exactly 2 nodes in cluster_vms group
SSH access to cluster VMs with sudo privileges
Running Pacemaker cluster with etcd resources

What it does:

Validates cluster has exactly 2 nodes
Disables STONITH temporarily for safety
Takes etcd snapshots on both nodes (if etcd is not running)
Clears existing CIB attributes (learner_node, standalone_node, force_new_cluster)
Sets force_new_cluster attribute on the leader node (first node in cluster_vms)
Verifies CIB attributes on both nodes
Removes follower from etcd member list
Performs pcs resource cleanup on both nodes
Re-enables STONITH after completion

Attribution: Original shell script by Carlo Lobrano

Log Collection

Collects etcd related logs from cluster VMs

Usage:

From deploy/ directory (recommended):

make get-tnf-logs

Using Ansible directly:

# From helpers/ directory
ansible-playbook -i ../deploy/openshift-clusters/inventory.ini collect-tnf-logs.yml

Prerequisites:

Inventory file with cluster_vms group
SSH access to cluster VMs via ProxyJump
oc CLI tool on cluster nodes

Fencing Validator

Validates fencing configuration and health for two-node OpenShift clusters with STONITH-enabled Pacemaker.

Features:

Non-disruptive validation (default): Checks STONITH presence/enabled status, node health, etcd quorum, and daemon status
Disruptive testing: Performs actual fencing of both nodes to verify recovery (optional with --disruptive)
Multiple transport methods: Auto-detection, SSH, or oc debug
IPv4/IPv6 support with automatic node discovery

Usage:

From outside the hypervisor (uses oc debug transport by default):

# Non-disruptive validation (recommended)
./fencing_validator.sh

# With custom hosts
./fencing_validator.sh --hosts "10.0.0.10,10.0.0.11"

From inside the hypervisor via ansible (requires hypervisor deployed via make deploy):

# Copy script to hypervisor and execute remotely
ansible all -i deploy/openshift-clusters/inventory.ini -m copy -a "src=helpers/fencing_validator.sh dest=~/fencing_validator.sh mode=0755"
ansible all -i deploy/openshift-clusters/inventory.ini -m shell -a "./fencing_validator.sh"

Disruptive testing options:

# Disruptive testing (NOTE: Not yet supported - under development)
./fencing_validator.sh --disruptive

# Dry run to see what would be tested
./fencing_validator.sh --disruptive --dry-run

Note: Disruptive testing functionality is not yet fully supported and should not be used in production environments.

Resource Agents Patching

The build-and-patch-resource-agents.yml playbook automates the entire workflow:

Builds the resource-agents RPM on the hypervisor
Copies the RPM back to your laptop
Automatically calls apply-rpm-patch.yml to patch cluster nodes

Usage

# From the deploy/ directory
# Simplest, no customization. Uses resource-agents repo, main branch, auto sets next version
make patch-nodes

Using Ansible Directly

# From the helpers/ directory

# Use defaults (ClusterLabs repo, main branch, version 4.11)
ansible-playbook -i ../deploy/openshift-clusters/inventory.ini \
  build-and-patch-resource-agents.yml

# Specify custom version
ansible-playbook -i ../deploy/openshift-clusters/inventory.ini \
  build-and-patch-resource-agents.yml \
  -e rpm_version=4.12

# Use custom repository and branch
ansible-playbook -i ../deploy/openshift-clusters/inventory.ini \
  build-and-patch-resource-agents.yml \
  -e repo_url=https://github.com/myorg/resource-agents \
  -e rpm_branch=my-feature-branch \
  -e rpm_version=5.0

Prerequisites:

Inventory file at ../deploy/openshift-clusters/inventory.ini with both metal_machine and cluster_vms groups
SSH access to hypervisor (metal_machine)
ProxyJump SSH configuration for cluster VMs (automatically configured by setup.yml)

What it does:

Validates inventory contains both [metal_machine] and [cluster_vms] groups
Installs build dependencies on hypervisor
Clones resource-agents repository on hypervisor
Builds RPM using make rpm VERSION=<version>
Fetches RPM back to helpers/ directory
Automatically patches cluster_vms group with the new RPM
Reboots cluster nodes one at a time with etcd health verification

Variables:

repo_url: Git repository URL (default: https://github.com/ClusterLabs/resource-agents)
rpm_branch: Git branch to checkout (default: main)
rpm_version: Version string for the RPM (default: 4.11)

apply-rpm-patch.yml playbook

If the RPM to be installed is already available to you, this Ansible playbook provides automated installation and rebooting with proper orchestration.

Option 1: From Your Laptop

Use with the automatically-generated inventory from the openshift-clusters deployment:

# Target the cluster_vms group
ansible-playbook -i /path/to/inventory.ini \
  apply-rpm-patch.yml \
  -l cluster_vms \
  -e rpm_full_path=/absolute/path/to/package.rpm

Prerequisites:

Inventory with cluster_vms group (created automatically by update-cluster-inventory.yml task)
ProxyJump SSH configuration through hypervisor (automatically configured in inventory)
Absolute path to RPM file on your laptop

Process:

Validates RPM file exists on localhost
Copies RPM to cluster VMs via ProxyJump
Installs using rpm-ostree override with privilege escalation
Reboots nodes one at a time
Verifies etcd health after reboot

Option 2: From the Hypervisor

Use with a custom inventory directly on the hypervisor:

# On the hypervisor, create a simple inventory file first
# See inventory_ocp_hosts.sample for reference
ansible-playbook -i inventory_ocp_hosts \
  apply-rpm-patch.yml \
  -e rpm_full_path=/path/to/package.rpm

Prerequisites:

Copy RPM file and apply-rpm-patch.yml playbook to hypervisor
Create inventory file listing cluster VM IPs (see inventory_ocp_hosts.sample)

Process:

Validates RPM file existence
Copies RPM to all nodes
Installs using rpm-ostree override with privilege escalation
Reboots nodes one at a time
Verifies etcd health after reboot

apply-rpm-patch.sh (not recommended)

If you don't want or are unable to use the previous Ansible playbooks, you can use this shell script .It should be inoked from within the hypervisor, as it requires direct access to the nodes via SSH and assumes the "core" user.

Usage

./apply-rpm-patch.sh /path/to/package.rpm

Process:

Validates required tools and RPM file
Discovers all node IPs via OpenShift API
Copies RPM to each node using SCP
Installs package with rpm-ostree override replace
Provides manual reboot commands

Note: The shell script does not handle reboots automatically. You must manually reboot nodes after installation. Follow the instructions provided at the end of the script execution

Containerized Build Validation

The resource-agents-build/ directory contains Dockerfiles and a script for validating that resource-agents compiles correctly on CentOS Stream 9 and 10, without needing a hypervisor or cluster. This is useful for quickly verifying a branch builds before running the full build-and-patch-resource-agents.yml playbook.

Usage:

cd helpers/resource-agents-build

# Run both builds — prompts for repo and ref, press Enter to use defaults
./local-build-test.sh

# Skip prompts by providing values via flags
./local-build-test.sh --repo https://github.com/myorg/resource-agents --ref my-feature-branch

# Build individually with podman
podman build -f Dockerfile.stream9 -t localhost/tnf-resource-agents-build:stream9 .
podman build -f Dockerfile.stream10 -t localhost/tnf-resource-agents-build:stream10 .

Script options:

Option	Description
`--repo URL`	Git repository URL (default: `https://github.com/ClusterLabs/resource-agents`)
`--ref REF`	Git branch, tag, or commit (default: `main`)
`-h`, `--help`	Show help

When no flags are provided, the script prompts for each value. Press Enter to use the default.

Extracting the built RPM from the container (Stream 9 only — Stream 10 skips make rpm):

# Build from a specific branch
./local-build-test.sh --ref my-feature-branch

# Copy the RPM out of the Stream 9 image
podman create --name ra-build localhost/tnf-resource-agents-build:stream9
podman cp ra-build:/tmp/resource-agents.rpm ./resource-agents.rpm
podman rm ra-build

# Then patch your cluster nodes with it
ansible-playbook -i ../../deploy/openshift-clusters/inventory.ini \
  ../apply-rpm-patch.yml \
  -l cluster_vms \
  -e rpm_full_path=$(pwd)/resource-agents.rpm

This is useful when you want to validate the RPM locally before patching.

Stream 10 limitation: libqb-devel is not yet available in EPEL 10. The Dockerfile builds libqb from source for configure/make validation, but skips make rpm since rpmbuild's BuildRequires: libqb-devel cannot be satisfied without the actual RPM package.

custom-payload.sh

custom-payload.sh ties the containerized build to an OpenShift 5.x release: it resolves a nightly payload, prints the base rhel-coreos / rhel-coreos-10 image from oc adm release info, and generates Dockerfile.custompayload by combining the contents of Dockerfile.stream9 or Dockerfile.stream10 (depending on --base-os) with an RPM-ostree override snippet—the script does not change Dockerfile.stream9 or Dockerfile.stream10 on disk. Then, unless you use dry-run-style flags, it runs podman to build and push the custom OS image and runs oc adm release new to publish a custom release payload image that points that OS layer at the correct component name (for example rhel-coreos-10=...).

When to use it: after you are happy with a resource-agents RPM from local-build-test.sh (or an equivalent build), and you want a full custom payload image to install or test on a 5.0 line cluster—without going through the hypervisor-based build-and-patch-resource-agents.yml path for node RPM overrides alone.

Requirements: curl, oc, podman, and jq or python3 (for --auto-release). Pull secret handling works the same way as other oc / podman registry operations (-a / PULL_SECRET_PATH, or the default ~/.docker/config.json / ~/.config/containers/auth.json as documented in the script's --help).

Usage (from helpers/resource-agents-build/):

cd helpers/resource-agents-build

# Use the newest Accepted 5.0.0-0.nightly tag from the release API
./custom-payload.sh --auto-release

# Or pin a full release pullspec
./custom-payload.sh --release registry.ci.openshift.org/ocp/release-5:5.0.0-0.nightly-...

# Explicit pull secret (optional if a default auth file exists)
./custom-payload.sh --auto-release -a /path/to/pull-secret

# Only print Dockerfile and commands; do not run oc / skip real builds (see --help)
./custom-payload.sh --auto-release --print-only
./custom-payload.sh --auto-release --no-build

Run ./custom-payload.sh --help for flags such as --base-os (rhel-coreos vs rhel-coreos-10), --to-os-image, and --to-payload-image (custom registry targets and tags).

Pull secret and the same registry twice: if you use different tokens on the same registry host—for example one credential for the general quay.io pull path and another for a specific org such as quay.io/rh-edge-enablement—put them in separate auths entries, for example one key quay.io and another quay.io/rh-edge-enablement. A single quay.io entry may not match both paths reliably; splitting them keeps pulls and pushes unambiguous for oc and podman.

ex:

{
        "auths": {
           ....
                "quay.io": {
                        "auth": "<secret>"
                },
                "quay.io/eggfoobar": {
                        "auth": "<secret_same>"
                },
                "quay.io/rh-edge-enablement": {
                        "auth": "<secret_same>"
                },
           ....
        }
}

Notes

Both tools use rpm-ostree override replace which is appropriate for updating existing packages
Node reboots are required to activate rpm-ostree changes
The Ansible playbooks handle rebooting automatically with proper orchestration; the shell script requires manual intervention
Plan reboots carefully to maintain cluster availability
Monitor cluster health during the patching process

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

README.md

Helpers

Description

Requirements

For build-and-patch-resource-agents.yml and apply-rpm-patch.yml

For apply-rpm-patch.sh

For fencing_validator.sh

Available Tools

Force New Cluster

Log Collection

Fencing Validator

Resource Agents Patching

Usage

Using Ansible Directly

apply-rpm-patch.yml playbook

Option 1: From Your Laptop

Option 2: From the Hypervisor

apply-rpm-patch.sh (not recommended)

Usage

Containerized Build Validation

custom-payload.sh

Notes

FilesExpand file tree

helpers

Directory actions

More options

Directory actions

More options

Latest commit

History

helpers

Folders and files

parent directory

README.md

Helpers

Description

Requirements

For build-and-patch-resource-agents.yml and apply-rpm-patch.yml

For apply-rpm-patch.sh

For fencing_validator.sh

Available Tools

Force New Cluster

Log Collection

Fencing Validator

Resource Agents Patching

Usage

Using Ansible Directly

apply-rpm-patch.yml playbook

Option 1: From Your Laptop

Option 2: From the Hypervisor

apply-rpm-patch.sh (not recommended)

Usage

Containerized Build Validation

custom-payload.sh

Notes