catch errors during secret refreshes #2438

Open
wants to merge 3 commits into base: main

Conversation

rtpascual
Contributor

Problem

For long-running Lambda functions, our SSM shim that fetches secret values may be mid-request when the Lambda starts shutting down, which causes errors like:

ERROR	Unhandled Promise Rejection 	
{
    "errorType": "Runtime.UnhandledPromiseRejection",
    "errorMessage": "TimeoutError: Client network socket disconnected before secure TLS connection was established",
    "reason": {
        "errorType": "TimeoutError",
        "errorMessage": "Client network socket disconnected before secure TLS connection was established",
        "code": "ECONNRESET",
        "path": null,
        "host": "ssm.us-west-1.amazonaws.com",
        "port": 443,
        "name": "TimeoutError",
        "$metadata": {
            "attempts": 3,
            "totalRetryDelay": 182
        },
        "stack": [
            "Error: Client network socket disconnected before secure TLS connection was established",
            "    at connResetException (node:internal/errors:720:14)",
            "    at TLSSocket.onConnectEnd (node:_tls_wrap:1714:19)",
            "    at TLSSocket.emit (node:events:529:35)",
            "    at endReadableNT (node:internal/streams/readable:1400:12)",
            "    at process.processTicksAndRejections (node:internal/process/task_queues:82:21)"
        ]
    },
    "promise": {},
    "stack": [
        "Runtime.UnhandledPromiseRejection: TimeoutError: Client network socket disconnected before secure TLS connection was established",
        "    at process.<anonymous> (file:///var/runtime/index.mjs:1276:17)",
        "    at process.emit (node:events:517:28)",
        "    at emit (node:internal/process/promises:149:20)",
        "    at processPromiseRejections (node:internal/process/promises:283:27)",
        "    at process.processTicksAndRejections (node:internal/process/task_queues:96:32)"
    ]
}

Issue number, if available:

Changes

Catch errors on subsequent calls to get secrets and ignore them so the Lambda function can shut down gracefully.

Corresponding docs PR, if applicable:

Validation

A Lambda deployed to my account, running on a schedule every minute:
Screenshot 2025-01-21 at 12 43 13 PM
The error spikes correspond to the error message shown above.

Checklist

  • If this PR includes a functional change to the runtime behavior of the code, I have added or updated automated test coverage for this change.
  • If this PR requires a change to the Project Architecture README, I have included that update in this PR.
  • If this PR requires a docs update, I have linked to that docs PR above.
  • If this PR modifies E2E tests, makes changes to resource provisioning, or makes SDK calls, I have run the PR checks with the run-e2e label set.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

@rtpascual rtpascual requested a review from a team as a code owner January 21, 2025 20:58

changeset-bot bot commented Jan 21, 2025

🦋 Changeset detected

Latest commit: e985e64

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
@aws-amplify/backend-function Patch

Amplifiyer previously approved these changes Jan 21, 2025
ShadowCat567 previously approved these changes Jan 21, 2025
Comment on lines 12 to 14
// Catch errors and do nothing in the case we are retrieving secrets when the Lambda starts shutting down
// eslint-disable-next-line promise/prefer-await-to-then
void internalAmplifyFunctionResolveSsmParams().catch(() => {});
Member

Two things.

  1. Can we try-catch inside internalAmplifyFunctionResolveSsmParams?
  2. Should we at least console.debug this?

Member

Re 1. Perhaps we don't, because we want the Lambda to crash at cold start.

But maybe there's some way to have a normal try-catch here?

Contributor Author

For 1, yeah, we would want it to crash at cold start, just not for subsequent SSM calls. I'll look into making this a normal try-catch so we can remove the eslint comment.

For 2, any logging would cause an EPIPE error in the scenario that triggers this error.
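
The behaviour described for point 1 (fail loudly at cold start, ignore failures on later refreshes) could be sketched roughly as below. This is only an illustration: `startRefreshing`, its parameters, and the returned stop function are hypothetical names that are not part of this PR, and `refresh` stands in for `internalAmplifyFunctionResolveSsmParams`.

```typescript
// Hypothetical helper (not from the PR): runs `refresh` once so that a failure
// propagates to the caller, then schedules later refreshes whose failures are ignored.
const startRefreshing = async (
  refresh: () => Promise<void>,
  intervalMs: number
): Promise<() => void> => {
  // First call: let any error reject, so initialization fails loudly.
  await refresh();

  const timer = setInterval(() => {
    // Later calls: swallow rejections, e.g. when the environment is shutting down.
    refresh().catch(() => {});
  }, intervalMs);

  // Return a function that stops the refresh loop.
  return () => clearInterval(timer);
};
```

The trade-off raised elsewhere in this thread still applies: the empty `catch` hides every later failure, not only the ones caused by shutdown.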

Member

> For 2, any logging would cause an EPIPE error in the scenario for this error.

Unless... we try-catch the logger as well.

What worries me is that an empty catch block will swallow legitimate errors that are not due to shutdown.
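
One way to picture "try-catch the logger as well" is a small wrapper like the one below. It is a sketch under assumptions: `safeDebug` and its `log` parameter are hypothetical names, not code from this PR. It only guards against failures that the logger throws synchronously; whether the EPIPE mentioned above surfaces that way is not established in this thread.

```typescript
// Hypothetical helper: try to log the error and ignore any failure of the logger itself.
const safeDebug = (
  error: unknown,
  log: (value: unknown) => void = (value) => console.debug(value)
): boolean => {
  try {
    log(error);
    return true;
  } catch {
    // Logging failed (for example because the output stream is closed); give up quietly.
    return false;
  }
};
```

With a wrapper like this, errors that occur outside of shutdown would still be reported whenever logging works, which addresses part of the concern about an empty catch block.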

Contributor Author

@rtpascual rtpascual Jan 21, 2025

> But maybe there's some way to have a normal try-catch here?

It doesn't seem to be possible. I tried:

setInterval(() => {
  try {
    void internalAmplifyFunctionResolveSsmParams();
  } catch (error) {
    // Catch errors and do nothing in the case we are retrieving secrets when the Lambda starts shutting down
  }
}, SSM_PARAMETER_REFRESH_MS);

and Lambda doesn't seem to handle the top-level try-catch gracefully; it still gives us the original error.
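
A minimal sketch of why the snippet above still surfaces the error, assuming a hypothetical `failingRefresh` that stands in for `internalAmplifyFunctionResolveSsmParams` and always rejects: a synchronous `try` block only sees errors thrown before the promise is returned, so the rejection reaches `catch` only when the promise is awaited.

```typescript
// Hypothetical stand-in for internalAmplifyFunctionResolveSsmParams: always rejects.
const failingRefresh = async (): Promise<void> => {
  throw new Error('socket disconnected');
};

// Synchronous try/catch: the promise is created but never awaited,
// so the catch block never sees the rejection.
const syncAttempt = (): { caught: boolean; pending: Promise<void> } => {
  let caught = false;
  let pending: Promise<void> = Promise.resolve();
  try {
    pending = failingRefresh();
  } catch {
    caught = true;
  }
  return { caught, pending };
};

// Awaiting inside try/catch: the rejection is rethrown at the await and caught.
const awaitedAttempt = async (): Promise<boolean> => {
  try {
    await failingRefresh();
    return false;
  } catch {
    return true;
  }
};
```

In `syncAttempt`, the returned `pending` promise is still rejected; unless something handles it, the runtime reports it as an unhandled rejection.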

Contributor

You would need this:

setInterval(async () => {
  try {
    await internalAmplifyFunctionResolveSsmParams();
  } catch (error) {
    // Catch errors and do nothing in the case we are retrieving secrets when the Lambda starts shutting down
  }
}, SSM_PARAMETER_REFRESH_MS);

Member

setInterval(() => void (async () => {
  try {
    await internalAmplifyFunctionResolveSsmParams();
  } catch (error) {
    // Catch errors and do nothing in the case we are retrieving secrets when the Lambda starts shutting down
  }
})(), SSM_PARAMETER_REFRESH_MS);

Contributor Author

...I may still be in weekend mode, of course it has to be like this.

@rtpascual rtpascual dismissed stale reviews from ShadowCat567 and Amplifiyer via 306fd62 January 22, 2025 00:33
Comment on lines 11 to 24
// eslint-disable-next-line @typescript-eslint/no-misused-promises
setInterval(async () => {
  try {
    await internalAmplifyFunctionResolveSsmParams();
  } catch (error) {
    try {
      // Attempt to log error
      // eslint-disable-next-line no-console
      console.debug(error);
      // eslint-disable-next-line amplify-backend-rules/no-empty-catch
    } catch (error) {
      // Do nothing if logging fails
    }
  }
Contributor Author

It feels bad to add this many eslint-disable-next-line comments (especially the first one; I tried several things but couldn't get them to work). Are there any alternatives?

Member

@rtpascual have you tried this one: #2438 (comment)? (My suggestion differs from @Amplifiyer's a bit: void and an extra bracket. The misused promise is expected in this case, and that syntax expresses it.)

For the second one, we should add a file like this to the lambda-shims folder: https://github.com/aws-amplify/amplify-backend/blob/main/packages/backend-auth/src/lambda/.eslintrc.json

The last one we can't solve, so that one remains.

Contributor Author

Missed the differences in your suggestion, updated!

4 participants