This section describes the replacement procedure for a failed SATADOM disk on a host.
Before you begin
Ensure that the management or workload domain to which the faulty host belongs contains at least 4 hosts. If the domain has fewer than 4 hosts, add a host to it from the capacity pool, if possible.
Procedure
- Decommission the affected host.
- If you are decommissioning a qualified vSAN Ready Node (that is, if you did not purchase a fully integrated system from a partner), note the BMC password for the host by navigating to the /home/vrack/bin/ directory in the SDDC Manager VM and running the ./vrm-cli.sh lookup-password command.
- On the Dashboard page, click VIEW DETAILS for Workload Domain and click the affected domain.
- In the PHYSICAL RACKS column, click the physical rack that contains the affected server.
- Scroll down to the Hosts section.
- In the HOST column, click the host name that shows a critical status (for example, N1).
The Host Details page displays the details for this host.
- Note the IP addresses displayed in the NETWORK TWO and MANAGEMENT IP ADDRESS fields.
- Click Decommission.
If this host belongs to a workload domain, the domain must include at least 4 hosts. If the domain has fewer than 4 hosts, you must expand the domain before decommissioning the host. If the domain contains only 4 hosts and one of them is dead, click Force decommission to decommission the host.
- Click CONFIRM.
During the host decommissioning task, the host is removed from the workload domain to which it was allocated and the environment’s available capacity is updated to reflect the reduced capacity. The ports that were being used by the server are marked unused and the network configuration is updated.
- Monitor the progress of the decommissioning task.
- On the SDDC Manager Dashboard, click STATUS in the left navigation pane.
- In the Workflow Tasks section, click View Details.
- Look for the VI Resource Pool – Decommission of hosts task.
- After about 10 minutes, refresh this page and wait until the task status changes to Successful.
- For qualified vSAN Ready Nodes, change the password on the host to the common password for ESXi hosts. Log in to the BMC console using the BMC password that you noted earlier and change the OOB password to D3c0mm1ss10n3d!.
This step is automated for hosts in an integrated system.
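If the BMC on the host supports IPMI over LAN, one possible way to change the OOB password from a workstation is with ipmitool. This is only a hedged sketch: the BMC IP address and user name are placeholders, user ID 2 is an assumption (confirm it in the user list output), and your vendor's own utility may be required instead.
ipmitool -I lanplus -H <BMC IP address> -U <BMC user name> -P <current BMC password> user list 1
ipmitool -I lanplus -H <BMC IP address> -U <BMC user name> -P <current BMC password> user set password 2 'D3c0mm1ss10n3d!'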
- Power off the server.
- Turn on the chassis-identification LED on the host.
- In a web browser, navigate to the management IP address that you noted earlier.
- Log in with your BMC user name and password.
- Following the documentation from your vendor, turn on the chassis-identification LED.
The chassis-identification LED on the host starts to beacon (flashing on and off).
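As an alternative to the vendor web UI, if the BMC supports IPMI over LAN, the chassis-identification LED can usually be turned on with ipmitool. This is a hedged sketch; the address and credentials are the BMC values noted earlier, and the force keyword keeps the LED beaconing until you turn it off with chassis identify 0.
ipmitool -I lanplus -H <BMC IP address> -U <BMC user name> -P <BMC password> chassis identify force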
- Replace the faulty SATADOM on the server and power on the server.
- Install ESXi on the host. See Install ESXi VIBs on New Host.
- Select the SATADOM for installation.
- Set the root password on the host to EvoSddc!2016.
- Reboot the host.
- Log in to the server with the following credentials.
User name: root
Password: EvoSddc!2016
- Perform the following steps on the host. An esxcli alternative for these settings is sketched after this list.
- Assign a static IPv4 address in the range 192.168.100.50 – 192.168.100.73, with subnet mask 255.255.252.0 and gateway 192.168.100.1.
- Set the DNS IP to 192.168.1.254.
- Enable SSH.
- Restrict SSH connections on the host to the 192.168.100.0/22 subnet by running the following firewall commands:
esxcli network firewall ruleset set --ruleset-id=sshServer --allowed-all false
esxcli network firewall ruleset allowedip add --ip-address=192.168.100.0/22 --ruleset-id=sshServer
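The address, DNS, and SSH settings in this step are typically configured through the DCUI. As a hedged alternative, the following can be run from the ESXi Shell, assuming vmk0 is the management VMkernel interface and using 192.168.100.50 as an example address from the allowed range.
esxcli network ip interface ipv4 set -i vmk0 -I 192.168.100.50 -N 255.255.252.0 -t static
esxcli network ip route ipv4 add -g 192.168.100.1 -n default
esxcli network ip dns server add -s 192.168.1.254
vim-cmd hostsvc/enable_ssh
vim-cmd hostsvc/start_ssh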
- SSH to the host and clean the vSAN partitions by running the following commands.
esxcli vsan storage automode set --enabled=false
esxcli vsan storage list | grep "Is SSD: true" -C5 | grep "Display Name" | awk '{print $3}'
Note the SSD naa of each disk group.
esxcli vsan storage remove -s <SSD naa>
Run this command for each disk group.
esxcli vsan cluster leave
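If the host has more than one disk group, one way to remove them all before leaving the cluster is to loop over the same pipeline used above. This is a minimal sketch, assuming the Display Name field holds the device naa as in the previous command; run it before esxcli vsan cluster leave.
for ssd in $(esxcli vsan storage list | grep "Is SSD: true" -C5 | grep "Display Name" | awk '{print $3}'); do
  esxcli vsan storage remove -s "$ssd"
done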
- If you were unable to remove the vSAN naa, power cycle the host and retry the previous step.