Table of Contents
Purpose
Cluster Re-installation through the Installer Script
Clean-up on the Analytic Personality Nodes
Clean-up on the Search Nodes
Re-executing the Installer Script
Purpose
This document explains the procedure for "cleaning up" the analytics nodes prior to re-running the analytics installer script (van_cluster_installer.py) to install an analytics cluster. When the installer script is executed again on the same nodes on which it was previously executed (to a point where one or more nodes were configured through the script), failure conditions can ensue due to existing associations/bindings, which conflict with the new installation.
The procedure detailed in the following sections ensures that the concerned bindings/associations are cleared on the analytic/search nodes before the script is re-run, so that the re-installation succeeds.
Cluster Re-installation through the Installer Script
Consider a four-node cluster running version 20.2, consisting of the below nodes (hostnames):
Analytic personality: can1 and can2
Search personality: csn1 and csn2
The above cluster was installed/created via the analytics installer script (van_cluster_installer.py), initiated from the master Versa Director.
Let’s imagine that there is a requirement to re-run the analytics installer script to re-install the nodes in the cluster. This requirement could arise for many reasons:
- As part of testing, where the installer script is to be exercised through multiple re-runs on the same setup
- In a scenario where the installation fails on one of the nodes (and succeeds on the others)
- A misconfiguration was identified after the installation script completed
- The script was terminated in the middle of execution through user intervention (such as pressing Ctrl-C)
- A requirement arises to add a few more nodes to the cluster after the installation has completed (these new nodes can be added by executing vansetup.py, with vansetup.conf, on the individual nodes, or they can be integrated through a re-installation of the entire cluster)
It’s not recommended to re-run the analytics installer script once it has been executed to a point where one or more nodes have been configured through it, as there would be existing bindings/associations that would interfere with the installation and cause it to fail during the second run.
The below "clean-up" procedure needs to be executed on the "analytic personality" nodes and on one of the "search personality" nodes.
Note: This clean-up procedure also has to be executed on the concerned node (separate procedures for analytic personality and search personality nodes are detailed below) when re-executing vansetup.py as part of re-installing an individual analytic node that is already part of a cluster.
Clean-up on the Analytic Personality Nodes
The requirement is to disassociate/delete the node bindings pertaining to Cassandra database syncing. These bindings appear in the output of "nodetool status" or "vsh dbstatus" (both outputs are valid in 20.2) on the analytic personality nodes.
Note: You cannot execute "nodetool status" on a search personality node in a 20.2 green-field deployment (or after DSE is migrated to Fusion following a 16.1R2-to-20.2 upgrade), as there is no Cassandra database running on search nodes in the Fusion model.
The analytic nodes in our example setup are “can1” and “can2”
The output of "vsh dbstatus" (or "nodetool status") on these nodes lists both peers along with their bindings.
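For reference, a healthy two-node ring lists both peers as "UN" (Up/Normal), together with the Host ID that is used in the removenode step later. The sketch below illustrates the standard nodetool output format only; it is not a capture from the actual setup, and the addresses and Host IDs are placeholders:

can1:~$ nodetool status
Datacenter: datacenter1
=======================
Status=Up/Down
|/ State=Normal/Leaving/Joining/Moving
--  Address        Load     Tokens  Owns    Host ID             Rack
UN  <ip-of-can1>   1.1 GiB  256     50.0%   <host-id-of-can1>   rack1
UN  <ip-of-can2>   1.0 GiB  256     50.0%   <host-id-of-can2>   rack1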
The requirement is to delete the peer binding on each of the above nodes. To do so, we follow the steps below.
Step 1:
Stop the "Cassandra" service on can1 along with "Monit" service.
Step 2:
Execute "nodetool removenode <node-id>" on the peer node(s), in this case on can2. You can see that can1 shows up as "DN" in the "vsh dbstatus" (or "nodetool status") output on can2.
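For example, on can2 (the Host ID is taken from the "DN" row of the status output; the value below is a placeholder):

# on can2 -- find can1's Host ID in the DN row, then remove the binding
nodetool status
nodetool removenode <host-id-of-can1>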
Step 3:
Check to confirm that the binding has indeed been deleted on can2, using “nodetool status” or “vsh dbstatus”
Step 4:
Stop the "Cassandra" service and "Monit" service on can2
Step 5:
Start the Cassandra and Monit services on can1, so that we can repeat the procedure and delete can2’s binding on can1. As you can see, can2 shows up as "DN", and it can be removed using "nodetool removenode <node-id>".
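A sketch of this step, with the same hedging on the service-control commands as in Step 1:

# on can1 -- bring Cassandra (and Monit) back up, then remove can2's binding
sudo service cassandra start
sudo service monit start
nodetool status                        # can2 should now show up as DN
nodetool removenode <host-id-of-can2>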
Step 6:
Check to confirm that can2’s binding has indeed been deleted on can1.
Step 7:
Start the Cassandra and Monit services on can2 and re-check the "nodetool status" output (pause for a few seconds after starting Cassandra before checking the nodetool status output).
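A sketch of this final step, again assuming standard service management:

# on can2 -- restart the services and verify the ring
sudo service cassandra start
sudo service monit start
sleep 30          # allow Cassandra a few seconds to join the ring
nodetool status   # both nodes should now show up as UN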
The above steps conclude the clean-up required on the analytic personality nodes.
Clean-up on the Search Nodes
The clean-up procedure on search nodes is fairly straightforward. You just need to execute the below command on one of the search nodes (provided that the search node installation completed during the previous run of the installer script).
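A hedged sketch of that command, using the standard Solr Collections API; the port (8983) and the collection name are illustrative placeholders rather than confirmed values, and the actual collection name should be taken from your own setup:

# on csn1 -- delete the analytics collection to disassociate the search nodes
# (port and collection name are placeholders)
curl "http://localhost:8983/solr/admin/collections?action=DELETE&name=<collection-name>"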
A "success" in the Solr deletion response means that the search nodes have been disassociated.
If you check "vsh dbstatus" on the search nodes after the above deletion, you will notice that the collection count shows up as 0.
Note: You just need to execute the above deletion on one of the search nodes; it’s not required to run it on all the search nodes. In fact, if you run the above deletion on a search node which does not have a Solr installation (i.e., the search node was not fully installed), or which was already disassociated (by running the deletion on its peer node), it would return an error.
The above steps conclude the clean-up procedure on the search nodes.
Re-executing the Installer Script
After the completion of the above steps, check the "vsh status" output on all the nodes; ideally, all the services should be in the "running" state. If you find any of the services in the "stopped" state, you can try "vsh restart" to rectify the status. If the status does not get rectified, you can still continue with the re-installation, as the script performs a restart of its own.
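For example:

# on each node -- verify the services before re-running the installer
vsh status
# if any service shows up as "stopped", try:
vsh restart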
You can edit clustersetup.conf to make alterations pertaining to the cluster size, IP addresses, or hostnames, if required, and save the file. Now you can go ahead with the re-run of the installer script from the master Director.
Note: This clean-up procedure is also required when re-executing vansetup.py on individual analytic nodes that are already part of a cluster.