[BUG]: PowerMax node pods are crashing even though the second array is reachable #1769

Open
santhoshatdell opened this issue Feb 21, 2025 · 0 comments
Labels: area/csi-powermax (Issue pertains to the CSI Driver for Dell EMC PowerMax), type/bug (Something isn't working. This is the default label associated with a bug issue.)

Comments

@santhoshatdell
Contributor

Bug Description

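PowerMax node pods crash when the IP interfaces of the first array are not reachable. The driver spends most of the NodeGetInfo() call trying to connect to those interfaces, so by the time it processes the second (or next) array the 2-minute timeout for NodeGetInfo() is reached and the port-group API call fails with a context error ('context deadline exceeded' in the logs below).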

This can be addressed by adding an IP reachability check similar to the one in the PowerStore driver. That would potentially avoid the node pod crash, and the topology map would contain at least the second array.

Logs

time="2025-02-18T17:51:42Z" level=info msg="/csi.v1.Node/NodeGetInfo: REQ 0006: XXX_NoUnkeyedLiteral={}, XXX_sizecache=0"
Error discovering 10.20.30.41: exit status 11
time="2025-02-18T17:52:42Z" level=error msg="Failed to connect to the IP interface(10.20.30.40) of array(000120000001)"
Error discovering 10.20.30.40: signal: killed
time="2025-02-18T17:53:02Z" level=error msg="Failed to connect to the IP interface(10.20.30.41) of array(000120000001)"
Error discovering 10.20.30.41: exit status 11
time="2025-02-18T17:53:07Z" level=error msg="Failed to connect to the IP interface(10.20.30.42) of array(000120000001)"
time="2025-02-18T17:53:24Z" level=error msg="Failed to connect to the IP interface(10.20.30.41) of array(000120000001)"
Error discovering 10.20.30.42: exit status 11
Error discovering 10.20.30.41: exit status 11
time="2025-02-18T17:53:42Z" level=error msg="Failed to connect to the IP interface(10.20.30.41) of array(000120000001)"
time="2025-02-18T17:53:42Z" level=error msg="GetPortGroupByID failed: Get \"https://csipowermax-reverseproxy:2222/univmax/restapi/100/sloprovisioning/symmetrix/000120000002/portgroup/iscsi-PG-sl\": context deadline exceeded"
time="2025-02-18T17:53:42Z" level=error msg="unable to fetch ip interfaces for 000120000002: Get \"https://csipowermax-reverseproxy:2222/univmax/restapi/100/sloprovisioning/symmetrix/000120000002/portgroup/iscsi-PG-sl\": context deadline exceeded"
time="2025-02-18T17:53:42Z" level=error msg="No topology keys could be generated"
time="2025-02-18T17:53:42Z" level=info msg="/csi.v1.Node/NodeGetInfo: REP 0006: rpc error: code = FailedPrecondition desc = no topology keys could be generated"

Screenshots

No response

Additional Environment Information

No response

Steps to Reproduce

Add two arrays to the secret with iSCSI as the protocol, where the first array's network is not reachable from the nodes.
Deploy the driver.

Expected Behavior

Node pod shouldn't crash

CSM Driver(s)

CSI PowerMax v2.12

Installation Type

Operator

Container Storage Modules Enabled

None

Container Orchestrator

K8s

Operating System

RHEL

@santhoshatdell added the area/csi-powermax and type/bug labels on Feb 21, 2025
@santhoshatdell added this to the v1.14.0 milestone on Feb 21, 2025
@santhoshatdell self-assigned this on Feb 21, 2025
@santhoshatdell changed the title from "[BUG]: PMAX node pods are crashing even though the second array is reachable" to "[BUG]: PowerMax node pods are crashing even though the second array is reachable" on Feb 21, 2025