Chapter 6. Downgrading OpenShift
6.1. Overview
Following an OpenShift Container Platform upgrade, it may be desirable in extreme cases to downgrade your cluster to a previous version. The following sections outline the required steps for each system in a cluster to perform such a downgrade for the OpenShift Container Platform 3.9 to 3.7 downgrade path.
You can downgrade directly from 3.9 to 3.7, but you must restore from the etcd backup.
If the upgrade failed at the 3.8 step, then the same downgrade procedures apply.
These steps are currently only supported for RPM-based installations of OpenShift Container Platform and assume downtime of the entire cluster.
6.2. Verifying Backups
The Ansible playbook used during the upgrade process should have created a backup of the master-config.yaml file and the etcd data directory. Ensure these exist on your masters and etcd members:
/etc/origin/master/master-config.yaml.<timestamp>
/var/lib/etcd/openshift-backup-pre-upgrade-<timestamp>
/etc/origin/master/scheduler.json.<timestamp>
Also, back up the node-config.yaml file on each node (including masters, which have the node component on them) with a timestamp:
/etc/origin/node/node-config.yaml.<timestamp>
If you are using an external etcd cluster (versus the single embedded etcd), the backup is likely created on all etcd members, though only one is required for the recovery process.
Keep a copy of the .rpmsave backups of the following files:
/etc/sysconfig/atomic-openshift-master-api
/etc/sysconfig/atomic-openshift-master-controller
/etc/etcd/etcd.conf
Restore from the first pre-upgrade backup taken on the day of the upgrade.
6.3. Shutting Down the Cluster
On all masters, nodes, and etcd members (if using an external etcd cluster), ensure the relevant services are stopped.
# systemctl stop atomic-openshift-master-api atomic-openshift-master-controllers
On all master and node hosts:
# systemctl stop atomic-openshift-node
On any external etcd hosts:
# systemctl stop etcd
6.4. Removing RPMs
The *-excluder packages add entries to the exclude directive in the host’s /etc/yum.conf file when installed.
On all masters, nodes, and etcd members (if using a dedicated etcd cluster), remove the following packages:
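The exact list depends on how the cluster was originally installed; a typical removal for an RPM-based installation (verify against the packages actually present on your hosts) looks like the following:
# yum remove atomic-openshift \
    atomic-openshift-clients \
    atomic-openshift-node \
    atomic-openshift-master \
    openvswitch \
    atomic-openshift-sdn-ovs \
    tuned-profiles-atomic-openshift-node \
    atomic-openshift-excluder \
    atomic-openshift-docker-excluder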
If you are using external etcd, also remove the etcd package:
# yum remove etcd
6.5. Downgrading Docker
OpenShift Container Platform 3.9 requires Docker 1.13, while OpenShift Container Platform 3.7 requires Docker 1.12.
On any host running Docker 1.13, downgrade to Docker 1.12 using the following steps:
Remove all local containers and images on the host. Any pods backed by a replication controller will be recreated.
Warning: The following commands are destructive and should be used with caution.
Delete all containers:
# docker rm $(docker ps -a -q) -f
Delete all images:
# docker rmi $(docker images -q)
Use yum swap (instead of yum downgrade) to install Docker 1.12.6:
# yum swap docker-* docker-*1.12.6 -y
# sed -i 's/--storage-opt dm.use_deferred_deletion=true//' /etc/sysconfig/docker-storage
# systemctl restart docker
You should now have Docker 1.12.6 installed and running on the host. Verify with the following:
# docker version
# systemctl status docker
6.6. Reinstalling RPMs
Disable the OpenShift Container Platform 3.8 and 3.9 repositories, and re-enable the 3.7 repositories:
# subscription-manager repos \
--disable=rhel-7-server-ose-3.8-rpms \
--disable=rhel-7-server-ose-3.9-rpms \
--enable=rhel-7-server-ose-3.7-rpms
On each master, install the following packages:
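The master package set mirrors the packages removed earlier; for example (adjust to match your original installation):
# yum install atomic-openshift \
    atomic-openshift-clients \
    atomic-openshift-node \
    atomic-openshift-master \
    openvswitch \
    atomic-openshift-sdn-ovs \
    tuned-profiles-atomic-openshift-node \
    atomic-openshift-excluder \
    atomic-openshift-docker-excluder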
On each node, install the following packages:
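A node host typically needs a smaller subset, for example (again, match your original installation):
# yum install atomic-openshift \
    atomic-openshift-node \
    openvswitch \
    atomic-openshift-sdn-ovs \
    tuned-profiles-atomic-openshift-node \
    atomic-openshift-excluder \
    atomic-openshift-docker-excluder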
If using an external etcd cluster, install the following package on each etcd member:
# yum install etcd
6.7. Restoring etcd
The restore procedure for etcd configuration files replaces the appropriate files, then restarts the service.
If an etcd host has become corrupted and the /etc/etcd/etcd.conf file is lost, restore it using:
$ ssh master-0
# cp /backup/yesterday/master-0-files/etcd.conf /etc/etcd/etcd.conf
# restorecon -Rv /etc/etcd/etcd.conf
# systemctl restart etcd.service
In this example, the backup file is stored at /backup/yesterday/master-0-files/etcd.conf; this path could reside on an external NFS share, in an S3 bucket, or on another storage solution.
6.7.1. Restoring etcd v2 & v3 data
The following process restores healthy data files and starts the etcd cluster as a single node, then adds the rest of the nodes if an etcd cluster is required.
Procedure
Stop all etcd services:
# systemctl stop etcd.service
To ensure the proper backup is restored, delete the etcd directories:
To back up the current etcd data before you delete the directory, run the following command:
# mv /var/lib/etcd /var/lib/etcd.old
# mkdir /var/lib/etcd
# chown -R etcd.etcd /var/lib/etcd/
# restorecon -Rv /var/lib/etcd/
Or, to delete the directory and the etcd data, run the following command:
# rm -Rf /var/lib/etcd/*
Note: In an all-in-one cluster, the etcd data directory is located in the /var/lib/origin/openshift.local.etcd directory.
Restore a healthy backup data file to each of the etcd nodes. Perform this step on all etcd hosts, including master hosts collocated with etcd.
# cp -R /backup/etcd-xxx/* /var/lib/etcd/
# mv /var/lib/etcd/db /var/lib/etcd/member/snap/db
# chcon -R --reference /backup/etcd-xxx/* /var/lib/etcd/
# chown -R etcd:etcd /var/lib/etcd/
Run the etcd service on each host, forcing a new cluster. This creates a custom drop-in file for the etcd service that overrides the execution command, adding the --force-new-cluster option.
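One way to create that drop-in is sketched below; it assumes the stock unit file at /usr/lib/systemd/system/etcd.service wraps the etcd command in a quoted string (as the RHEL etcd package does), and it uses the temp.conf file name that is removed again in a later step:
# mkdir -p /etc/systemd/system/etcd.service.d/
# echo "[Service]" > /etc/systemd/system/etcd.service.d/temp.conf
# echo "ExecStart=" >> /etc/systemd/system/etcd.service.d/temp.conf
# sed -n '/ExecStart=/s/"$/ --force-new-cluster"/p' \
    /usr/lib/systemd/system/etcd.service \
    >> /etc/systemd/system/etcd.service.d/temp.conf
# systemctl daemon-reload
# systemctl restart etcd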
Check for error messages:
$ journalctl -fu etcd.service
Check for health status:
# etcdctl2 cluster-health
member 5ee217d17301 is healthy: got healthy result from https://192.168.55.8:2379
cluster is healthy
Restart the etcd service in cluster mode:
# rm -f /etc/systemd/system/etcd.service.d/temp.conf
# systemctl daemon-reload
# systemctl restart etcd
Copy to Clipboard Copied! Toggle word wrap Toggle overflow Check for health status and member list:
After the first instance is running, you can restore the rest of your etcd servers.
6.7.1.1. Fix the peerURLs parameter
After restoring the data and creating a new cluster, the peerURLs parameter shows localhost instead of the IP where etcd is listening for peer communication:
# etcdctl2 member list
5ee217d17301: name=master-0.example.com peerURLs=http://*localhost*:2380 clientURLs=https://192.168.55.8:2379 isLeader=true
6.7.1.1.1. Procedure
Get the member ID using the etcdctl member list command.
Get the IP where etcd listens for peer communication:
$ ss -l4n | grep 2380
Update the member information with that IP:
# etcdctl2 member update 5ee217d17301 https://192.168.55.8:2380
Updated member with ID 5ee217d17301 in cluster
To verify, check that the IP is in the member list:
$ etcdctl2 member list
5ee217d17301: name=master-0.example.com peerURLs=https://192.168.55.8:2380 clientURLs=https://192.168.55.8:2379 isLeader=true
6.7.2. Restoring etcd for v3
The restore procedure for v3 data is similar to the restore procedure for the v2 data.
Snapshot integrity may optionally be verified at restore time. If the snapshot is taken with etcdctl snapshot save, it will have an integrity hash that is checked by etcdctl snapshot restore. If the snapshot is copied from the data directory, there is no integrity hash, and it will only restore by using --skip-hash-check.
The procedure to restore only the v3 data must be performed on a single etcd host. You can then add the rest of the nodes to the cluster.
Procedure
Stop all etcd services:
# systemctl stop etcd.service
Clear all old data, because etcdctl recreates it on the node where the restore procedure is performed:
# rm -Rf /var/lib/etcd
Run the snapshot restore command, substituting the values from the /etc/etcd/etcd.conf file:
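A minimal sketch, assuming a v3 snapshot file at /backup/etcd-xxxxxx/backup.db (a placeholder path) and the example member name and IP address used earlier in this section:
# ETCDCTL_API=3 etcdctl snapshot restore /backup/etcd-xxxxxx/backup.db \
    --data-dir /var/lib/etcd \
    --name master-0.example.com \
    --initial-cluster "master-0.example.com=https://192.168.55.8:2380" \
    --initial-cluster-token "etcd-cluster-1" \
    --initial-advertise-peer-urls https://192.168.55.8:2380
The snapshot file name and the etcd-cluster-1 token above are placeholders; take the name, cluster, and URL values from your /etc/etcd/etcd.conf.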
Restore permissions and SELinux context to the restored files:
# chown -R etcd.etcd /var/lib/etcd/
# restorecon -Rv /var/lib/etcd
Start the etcd service:
# systemctl start etcd
Check for any error messages:
$ journalctl -fu etcd.service
6.8. Bringing OpenShift Container Platform services back online
After you finish your changes, bring OpenShift Container Platform back online.
Procedure
On each OpenShift Container Platform master, restore your master and node configuration from backup and enable and restart all relevant services:
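For example, using the timestamped backups listed in Verifying Backups and the services that were stopped earlier:
# cp /etc/origin/master/master-config.yaml.<timestamp> /etc/origin/master/master-config.yaml
# cp /etc/origin/master/scheduler.json.<timestamp> /etc/origin/master/scheduler.json
# cp /etc/origin/node/node-config.yaml.<timestamp> /etc/origin/node/node-config.yaml
# systemctl enable atomic-openshift-master-api atomic-openshift-master-controllers atomic-openshift-node
# systemctl start atomic-openshift-master-api atomic-openshift-master-controllers atomic-openshift-node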
On each OpenShift Container Platform node, restore your node-config.yaml file from backup and enable and restart the atomic-openshift-node service:
# cp /etc/origin/node/node-config.yaml.<timestamp> /etc/origin/node/node-config.yaml
# systemctl enable atomic-openshift-node
# systemctl start atomic-openshift-node
6.9. Verifying the Downgrade
To verify the downgrade, first check that all nodes are marked as Ready:
# oc get nodes
NAME                 STATUS                     AGE
master.example.com   Ready,SchedulingDisabled   165d
node1.example.com    Ready                      165d
node2.example.com    Ready                      165d
Then, verify that you are running the expected versions of the docker-registry and router images, if deployed:
# oc get -n default dc/docker-registry -o json | grep \"image\"
"image": "openshift3/ose-docker-registry:v3.7.23",
# oc get -n default dc/router -o json | grep \"image\"
"image": "openshift3/ose-haproxy-router:v3.7.23",
You can use the diagnostics tool on the master to look for common issues and provide suggestions:
# oc adm diagnostics
...
[Note] Summary of diagnostics execution:
[Note] Completed with no errors or warnings seen.