[BUG]: PowerMax node pods are crashing even though the second array is reachable #1769
Labels
area/csi-powermax
Issue pertains to the CSI Driver for Dell EMC PowerMax
type/bug
Something isn't working. This is the default label associated with a bug issue.
Bug Description
PowerMax node pods crash when the IP interfaces of the first array are not reachable. If the 2-minute timeout for the NodeGetInfo() call is reached while the driver is still working through the first array's interfaces, the port-group API call for the second (next) array fails with a context error ('context deadline exceeded' in the logs below).
This can be addressed by adding an IP reachability check similar to the one in the PowerStore driver. That would potentially avoid the node pod crash, and the topology map would contain at least the second array.
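To make the proposal concrete, here is a minimal sketch of what such a reachability check could look like. It is not the PowerStore driver's code and not a patch for this driver: `arrayTargets`, `tcpProbe`, and `filterReachable` are hypothetical names, port 3260 is assumed because it is the default iSCSI port, and the address for the second array is a placeholder since the issue does not list it.

```go
package main

import (
	"fmt"
	"net"
	"time"
)

// arrayTargets pairs an array ID with its target addresses (host:port).
// Hypothetical type for this sketch; not taken from the driver.
type arrayTargets struct {
	ID    string
	Addrs []string
}

// tcpProbe returns a probe that reports whether a TCP connection to addr
// can be opened within timeout.
func tcpProbe(timeout time.Duration) func(addr string) bool {
	return func(addr string) bool {
		conn, err := net.DialTimeout("tcp", addr, timeout)
		if err != nil {
			return false
		}
		_ = conn.Close()
		return true
	}
}

// filterReachable returns the IDs of arrays that have at least one address
// accepted by probe, preserving the input order.
func filterReachable(arrays []arrayTargets, probe func(addr string) bool) []string {
	var ids []string
	for _, a := range arrays {
		for _, addr := range a.Addrs {
			if probe(addr) {
				ids = append(ids, a.ID)
				break // one reachable interface is enough for this array
			}
		}
	}
	return ids
}

func main() {
	arrays := []arrayTargets{
		// IPs taken from the logs; port 3260 is an assumption (iSCSI default).
		{ID: "000120000001", Addrs: []string{"10.20.30.40:3260", "10.20.30.41:3260", "10.20.30.42:3260"}},
		// Placeholder address; the issue does not list this array's interfaces.
		{ID: "000120000002", Addrs: []string{"192.0.2.10:3260"}},
	}
	// A short per-address timeout bounds the total time spent on dead interfaces.
	fmt.Println("arrays with a reachable interface:", filterReachable(arrays, tcpProbe(time.Second)))
}
```

With a short per-address timeout, an unreachable first array costs at most a few seconds instead of most of the NodeGetInfo() budget, so only reachable arrays would go on to discovery and topology generation.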
Logs
time="2025-02-18T17:51:42Z" level=info msg="/csi.v1.Node/NodeGetInfo: REQ 0006: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
Error discovering 10.20.30.41: exit status 11
time="2025-02-18T17:52:42Z" level=error msg="Failed to connect to the IP interface(10.20.30.40) of array(000120000001)"
Error discovering 10.20.30.40: signal: killed
time="2025-02-18T17:53:02Z" level=error msg="Failed to connect to the IP interface(10.20.30.41) of array(000120000001)"
Error discovering 10.20.30.41: exit status 11
time="2025-02-18T17:53:07Z" level=error msg="Failed to connect to the IP interface(10.20.30.42) of array(000120000001)"
time="2025-02-18T17:53:24Z" level=error msg="Failed to connect to the IP interface(10.20.30.41) of array(000120000001)"
Error discovering 10.20.30.42: exit status 11
Error discovering 10.20.30.41: exit status 11
time="2025-02-18T17:53:42Z" level=error msg="Failed to connect to the IP interface(10.20.30.41) of array(000120000001)"
time="2025-02-18T17:53:42Z" level=error msg="GetPortGroupByID failed: Get \"https://csipowermax-reverseproxy:2222/univmax/restapi/100/sloprovisioning/symmetrix/000120000002/portgroup/iscsi-PG-sl\": context deadline exceeded"
time="2025-02-18T17:53:42Z" level=error msg="unable to fetch ip interfaces for 000120000002: Get \"https://csipowermax-reverseproxy:2222/univmax/restapi/100/sloprovisioning/symmetrix/000120000002/portgroup/iscsi-PG-sl\": context deadline exceeded"
time="2025-02-18T17:53:42Z" level=error msg="No topology keys could be generated"
time="2025-02-18T17:53:42Z" level=info msg="/csi.v1.Node/NodeGetInfo: REP 0006: rpc error: code = FailedPrecondition desc = no topology keys could be generated"
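The request is logged at 17:51:42 and the failure at 17:53:42, exactly two minutes apart. That fits a single deadline shared by all per-array work. The sketch below illustrates that failure mode in isolation. It assumes one shared context for the whole call, which the issue implies but does not show; `step`, `runSteps`, `simulate`, and `deadlineExceeded` are hypothetical names, and the durations are scaled down from minutes to milliseconds.

```go
package main

import (
	"context"
	"errors"
	"fmt"
	"time"
)

// step simulates one unit of work that takes d unless the shared context
// expires first, in which case it returns the context's error.
func step(ctx context.Context, d time.Duration) error {
	select {
	case <-time.After(d):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// runSteps runs the steps in order under one shared deadline and returns
// the error (or nil) from each step.
func runSteps(budget time.Duration, durations ...time.Duration) []error {
	ctx, cancel := context.WithTimeout(context.Background(), budget)
	defer cancel()
	errs := make([]error, 0, len(durations))
	for _, d := range durations {
		errs = append(errs, step(ctx, d))
	}
	return errs
}

// deadlineExceeded reports whether err is, or wraps, context.DeadlineExceeded.
func deadlineExceeded(err error) bool {
	return errors.Is(err, context.DeadlineExceeded)
}

// simulate models two arrays: the first needs more time than the whole
// budget, the second would finish almost immediately.
func simulate() []error {
	return runSteps(200*time.Millisecond, 400*time.Millisecond, 10*time.Millisecond)
}

func main() {
	for i, err := range simulate() {
		fmt.Printf("array %d: err=%v deadlineExceeded=%t\n", i+1, err, deadlineExceeded(err))
	}
}
```

In this model the second step fails even though it needs only 10ms, because the first step has already used up the shared budget. This is the situation a cheap up-front reachability check is meant to avoid.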
Screenshots
No response
Additional Environment Information
No response
Steps to Reproduce
Add two arrays to the secret with iSCSI as the protocol, where the first array's network is not reachable from the nodes
Deploy the driver
Expected Behavior
Node pod shouldn't crash
CSM Driver(s)
CSI PowerMax v2.12
Installation Type
Operator
Container Storage Modules Enabled
None
Container Orchestrator
K8s
Operating System
RHEL