WSFC and SQL 2016 FCI – SCSI-3 Persistent Reservation and FIPS


August 30, 2016

I recently did a Failover Cluster Instance installation and configuration on SQL Server 2016 and thought I would share two major issues I faced. The first was the following error while attempting Cluster Validation:

Failover Cluster Validation Report

Node <Node1 Name> successfully issued call to Persistent Reservation RESERVE for Test Disk 0 which is currently reserved by node <Node2 Name>. This call is expected to fail.
Test Disk 0 does not provide Persistent Reservations support for the mechanisms used by failover clusters. Some storage devices require specific firmware versions or settings to function properly with failover clusters. Please contact your storage administrator or storage vendor to check the configuration of the storage to allow it to function properly with failover clusters.
Stop: 8/29/2016 2:51:23 PM.
Test failed. Please look at the test log for more information.
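
If you have the validation report saved as text, a small script can pick out this particular failure. This is only a rough sketch: it assumes the report text has already been extracted from the report file, the pattern is based purely on the wording of the message above, and NODE1 and NODE2 are placeholder node names.

```python
import re

# Pattern for the line the validation report prints when a RESERVE call
# succeeds even though another node already holds the reservation.
UNEXPECTED_RESERVE = re.compile(
    r"Node (?P<caller>.+?) successfully issued call to Persistent Reservation "
    r"RESERVE for Test Disk (?P<disk>\d+) which is currently reserved by "
    r"node (?P<holder>.+?)\. This call is expected to fail\."
)


def find_reservation_failures(report_text):
    """Return one dict per unexpected successful RESERVE found in the text."""
    return [m.groupdict() for m in UNEXPECTED_RESERVE.finditer(report_text)]


# Placeholder sample that mirrors the message format shown above.
sample = (
    "Node NODE1 successfully issued call to Persistent Reservation RESERVE "
    "for Test Disk 0 which is currently reserved by node NODE2. "
    "This call is expected to fail."
)
print(find_reservation_failures(sample))
```

Each match reports the node that issued the call, the test disk number and the node that held the reservation, which makes it easier to see whether every shared disk is affected or only some of them.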


The first time I had this issue, some two years ago, it was related to the version of vSphere we were running back then and the manner in which the virtual disks were presented to the VMs I was working on. Back then I ended up installing a one-node cluster! What? Does that exist? Well, I just told you it does.
This time around my sysadmin assured me he had followed all the required steps for configuring the VMs for clustering according to best practices. I believed him because he is good with VMware. And when I say someone is good, he must be good.
In summary, after a lot of troubleshooting around enabling MPIO, applying patches and using Clear-ClusterDiskReservation, I ended up using plan B in this blog and it worked:


  1. Shut down all nodes except one, which stays active
  2. Restart that active node while the other nodes are shut down
  3. Verify that the disks are accessible in Disk Management on the active node, then start the other cluster nodes

Note that I excluded a step from the original sequence. This is because in my case the cluster was not yet set up, so there was no cluster service to stop or start.

Once I had overcome this major hurdle, I went on to install the SQL Server FCI using Advanced Preparation/Completion, which was cool. However, at the end of my installation a resource called SQL Server CEIP failed to start. These are the errors it threw:

From the Cluster Error Log:


Cluster resource ‘SQL Server CEIP (MSSQLSERVER)’ of type ‘Generic Service’ in clustered role ‘<VNO>’ failed.

Based on the failure policies for the resource and role, the cluster service may try to bring the resource online on this node or move the group to another node of the cluster and then restart it. Check the resource and group state using Failover Cluster Manager or the Get-ClusterResource Windows PowerShell cmdlet.



Generic service ‘SQL Server CEIP (MSSQLSERVER)’ failed with error ‘1067’. Please examine the application event log.


From Event Viewer Application Log:


Faulting application name: sqlceip.exe, version: 13.0.1601.5, time stamp: 0x57245244
Faulting module name: KERNELBASE.dll, version: 6.3.9600.18340, time stamp: 0x57366075
Exception code: 0xe0434352
Fault offset: 0x0000000000008a5c
Faulting process id: 0x1aec
Faulting application start time: 0x01d202bc3121dea7
Faulting application path: C:\Program Files\Microsoft SQL Server\MSSQL13.MSSQLSERVER\MSSQL\Binn\sqlceip.exe
Faulting module path: C:\Windows\system32\KERNELBASE.dll
Report Id: 6eecd790-6eaf-11e6-810d-005056915f0d
Faulting package full name:
Faulting package-relative application ID:



Fault bucket , type 0
Event Name: CLR20r3
Response: Not available
Cab Id: 0

Problem signature:
P1: sqlceip.exe
P2: 13.0.1601.5
P3: 57245244
P4: mscorlib
P5: 4.6.1055.0
P6: 563c113c
P7: 21fb
P8: 1c
P9: System.InvalidOperationException
P10:

Attached files:

These files may be available here:
C:\ProgramData\Microsoft\Windows\WER\ReportQueue\AppCrash_sqlceip.exe_f12de3a955d767e42364d059c15eb2f25930b475_1b763428_19f019e5

Analysis symbol:
Rechecking for solution: 0
Report Id: 6eecd790-6eaf-11e6-810d-005056915f0d
Report Status: 4100
Hashed bucket:
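
To make sense of an entry like this quickly, the "Name: value" lines can be parsed and the exception code looked up. Below is a minimal sketch, assuming the event text has been copied out of Event Viewer as plain text. The function names are my own. The two codes in the table are the generic codes that the .NET runtime raises for an unhandled managed exception, so they only tell you that the crash came from managed code; the P9 field carries the actual exception type.

```python
import re

# Generic exception codes raised by the .NET runtime when an unhandled
# managed exception takes a process down.
CLR_EXCEPTION_CODES = {
    0xE0434352: "unhandled .NET exception (CLR 4.x)",
    0xE0434F4D: "unhandled .NET exception (CLR 1.x/2.0)",
}


def parse_event_fields(text):
    """Turn 'Name: value' lines from an event log entry into a dict."""
    fields = {}
    for line in text.splitlines():
        match = re.match(r"^\s*([^:]+):\s*(.*)$", line)
        if match:
            fields[match.group(1).strip()] = match.group(2).strip()
    return fields


def summarize_crash(text):
    """Pull out the application, exception code and managed exception type."""
    fields = parse_event_fields(text)
    code = int(fields.get("Exception code", "0"), 16)
    return {
        "application": fields.get("Faulting application name", "").split(",")[0],
        "exception_code": hex(code),
        "meaning": CLR_EXCEPTION_CODES.get(code, "not a known CLR exception code"),
        "managed_exception": fields.get("P9", ""),
    }


# A few lines taken from the entries above.
sample = """\
Faulting application name: sqlceip.exe, version: 13.0.1601.5, time stamp: 0x57245244
Exception code: 0xe0434352
P9: System.InvalidOperationException
"""
print(summarize_crash(sample))
```

For the entries above this points at an unhandled System.InvalidOperationException inside sqlceip.exe rather than a problem in KERNELBASE.dll itself, which is why running the executable by hand is a reasonable next step.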


A few similar errors were logged. After trying things like disabling the antivirus, running Failover Cluster Manager as administrator and granting the SQL Server service account admin rights, I eventually decided to run sqlceip.exe manually to see what would happen, and I saw this:

 


… This implementation is not part of the Windows Platform FIPS validated cryptographic algorithms…


 

I had seen something similar the first time I configured Reporting Services on SQL Server 2012, so I knew a workaround. I asked the Enterprise Admin to disable FIPS on my servers, and he did so by moving the servers to a non-FIPS OU in Active Directory. I ran gpupdate /force and voila! SQL Server CEIP came online.
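
If you want to confirm whether the FIPS policy is actually in effect on a node before and after the OU move, you can read the policy flag from the registry. The sketch below assumes the flag is the Enabled value under HKLM\SYSTEM\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy, which is where the "System cryptography: Use FIPS compliant algorithms" security setting is normally stored. The function names are my own, and the reader is passed in as a parameter so the logic can be tried without touching a real registry.

```python
FIPS_KEY = r"SYSTEM\CurrentControlSet\Control\Lsa\FipsAlgorithmPolicy"


def read_fips_value():
    """Read the FIPS policy flag from the local registry (Windows only)."""
    import winreg  # imported here so the module still loads on other platforms

    with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, FIPS_KEY) as key:
        value, _value_type = winreg.QueryValueEx(key, "Enabled")
    return value


def fips_enabled(reader=read_fips_value):
    """Return True when the FIPS policy flag reads as 1."""
    try:
        return reader() == 1
    except (OSError, ImportError):
        # Key or value missing, or not running on Windows:
        # report the policy as not enabled.
        return False


# Demonstration with stand-in readers instead of the real registry.
print(fips_enabled(lambda: 1))
print(fips_enabled(lambda: 0))
```

On a cluster node you would call fips_enabled() with no arguments, once before the change and once after gpupdate /force has run, and expect the result to flip from True to False.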

