Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RFC: Error Backtraces v2 #17056

Open
wants to merge 10 commits into
base: master
Choose a base branch
from

Conversation

ericnorris
Copy link
Contributor

This pull request seeks to continue the work from #5642, but using an implementation more in line with what nikic suggested in https://externals.io/message/110302#113264:

To put my feedback into more actionable form: Rather than adding an ini
setting that enables error backtraces for all diagnostics, I'd make it a
mask of error types for which a backtrace should be captured, and default
it to fatal errors only. So error_reporting=E_ALL,
error_backtrace=E_ERROR|E_CORE_ERROR|E_COMPILE_ERROR|E_USER_ERROR|E_RECOVERABLE_ERROR|E_PARSE.
It might be handy to expose the internal E_FATAL_ERRORS constant we have
for that.

I think this should give a very sensible default behavior, and people who
want to capture backtraces for all errors still have an option to do so.

Finally, I'm wondering if the backtrace in error_get_last() shouldn't be in
array form, rather than string form.

The RFC will be at https://wiki.php.net/rfc/error_backtraces_v2.

I've created a new INI setting, error_backtrace_recording, which users can set to an error mask to enable backtraces for those errors. It defaults to E_FATAL_ERRORS, which I've also exposed as a userland constant.

Considering the default, any fatal error will now have a backtrace associated with it. For example, a script timeout will now look like:

Fatal error: Maximum execution time of 1 second exceeded in example.php on line 23
Stack trace:
#0 example.php(23): usleep(10000)
#1 example.php(24): recurse()
#2 example.php(24): recurse()
...

This would have been useful to us very recently: we had an issue in production that presented as "Maximum execution time of X second exceeded" errors, and it was difficult for us to identity the underlying culprit. We eventually identified it as an infinite loop, which would have been incredibly obvious in a trace if it was available.

The implementation respects the zend.exception_ignore_args INI setting and the SensitiveParameter attributes, so users can ensure that sensitive arguments do not end up in the backtrace.

I have some open questions about this implementation:

  • Why is zend_startup responsible for setting EG(error_reporting), and not executor_globals_ctor? I've done the same for EG(error_backtrace_recording) purely for consistency.
  • Is there any appetite for modifying the signature of zend_error_cb? If so, I could pass the backtrace as a nullable argument instead of setting an executor global.

I've updated all tests that are now failing due to the stack trace in the second commit, to keep the diff reviewable.

@ericnorris
Copy link
Contributor Author

Apologies for the additional notifications, there are understandably quite a lot of tests that trigger fatal errors.

Zend/zend.c Outdated Show resolved Hide resolved
Zend/zend.c Outdated Show resolved Hide resolved
@bukka
Copy link
Member

bukka commented Dec 8, 2024

I really don't like the amount of changed files. The default for test runner should be to not display it IMHO.

Zend/zend.c Outdated Show resolved Hide resolved
@NattyNarwhal
Copy link
Member

I really don't like the amount of changed files. The default for test runner should be to not display it IMHO.

It is unfortunate; I have a similar feature I was working on that changed error output and the amount of tests it involved touching was really annoying, both to update tests and to review.

@bwoebi
Copy link
Member

bwoebi commented Dec 11, 2024

Can you please add error_backtrace_recording=0 to $ini_overwrites in run-tests.php (and revert most of the phpt changes) - it's generally just noise in all tests.

@ericnorris
Copy link
Contributor Author

Can you please add error_backtrace_recording=0 to $ini_overwrites in run-tests.php (and revert most of the phpt changes) - it's generally just noise in all tests.

I'll take that into consideration - I'm waiting on the RFC discussion and potential vote before making any further changes at this time. Please chime in there if you have any other thoughts on this proposal, I'd appreciate it!

That said, I've kept the changes to a separate commit, which should allow people to review the first commit in isolation without needing to go through the changed tests. I don't really feel strongly here, however, so if people either want this to default to off (which is a voting option), or just to default to off for tests (which we could decide in the implementation), I'm fine with making the change once we've moved further along in the discussion.

Copy link
Member

@derickr derickr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll add that I am also not in favour of changes to alllll the tests. I would also add, that the PHP project does not prefix their commit messages with feat: etc.

NattyNarwhal added a commit to NattyNarwhal/php-src that referenced this pull request Dec 12, 2024
This is a useful feature, but enabling it by default requires rewriting
every PHPT file's output section. Since that would be a hellish diff to
make and to review, I think the best option is unfortunately, another
INI option. We can enable this for prod/dev recommended INIs, but make
sure it's disabled for the test runner.

This takes some inspiration from the discussion in phpGH-17056, which has
similar problems to this PR.
Copy link
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge fan of the code churn. IMO, the stack trace should be avoided for compile errors, where the trace is really just noise.

main/main.c Outdated Show resolved Hide resolved
Zend/zend.c Outdated Show resolved Hide resolved
This commit adds an INI setting, 'fatal_error_backtraces', which users
can enable or disable to control whether or not PHP produces backtraces
for fatal errors.

It defaults to enabled, meaning that any non-recoverable error will now
have a backtrace associated with it. For example, a script timeout will
now look like:

  Fatal error: Maximum execution time of 1 second exceeded in example.php on line 23
  Stack trace:
  #0 example.php(23): usleep(10000)
  php#1 example.php(24): recurse()
  php#2 example.php(24): recurse()
  ...

It respects the `zend.exception_ignore_args` INI setting and the
SensitiveParameter attributes, so users can ensure that sensitive
arguments do not end up in the backtrace.
@ericnorris ericnorris force-pushed the feat-error-backtrace-v2 branch from 2abc87a to 3b9bd2f Compare January 7, 2025 19:57
@TimWolla TimWolla added the RFC label Jan 7, 2025
Zend/zend.c Outdated Show resolved Hide resolved
Zend/zend.c Outdated Show resolved Hide resolved
Zend/zend_globals.h Outdated Show resolved Hide resolved
@ericnorris ericnorris force-pushed the feat-error-backtrace-v2 branch from 619ee0e to 8db4221 Compare January 8, 2025 03:49
Without this, the ext/soap tests would trigger a double stack trace:

```
  Fatal error: Uncaught SoapFault exception: [WSDL] SOAP-ERROR: Parsing Schema: can't import schema from '/Users/enorris/workspace/php-src/ex\
t/soap/tests/bugs/bug62900.xsd', unexpected 'targetNamespace'='http://www.w3.org/XML/1998/namespacex', expected 'http://www.w3.org/XML/1998\
/namespace' in /Users/enorris/workspace/php-src/ext/soap/tests/bugs/bug62900_run:2
  #0 /Users/enorris/workspace/php-src/ext/soap/tests/bugs/bug62900_run(2): SoapClient->__construct('/Users/enorris/...')
  php#1 {main}
    thrown in /Users/enorris/workspace/php-src/ext/soap/tests/bugs/bug62900_run on line 2
  Stack trace:
  #0 /Users/enorris/workspace/php-src/ext/soap/tests/bugs/bug62900_run(2): SoapClient->__construct('/Users/enorris/...')
  php#1 {main}
```
@ericnorris ericnorris force-pushed the feat-error-backtrace-v2 branch from 8db4221 to e843e06 Compare January 8, 2025 04:02
Comment on lines 908 to 909
zval_ptr_dtor(&EG(error_backtrace));
ZVAL_UNDEF(&EG(error_backtrace));
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% confident in this being the right fix, though it is similar to the EG(exception) = NULL on L906 above. Thoughts?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems correct, this will clear the message for the next error. Do you have a test for this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have a test for this?

In a sense, yes, since the existing ext/soap tests triggered the need for this, and once it was fixed the tests passed. On the other hand, it was only the ext/soap code that needs this. For example, throwing an exception that doesn't get caught in normal PHP won't trigger a double stack trace.

I'm concerned this (clearing the trace when throwing an exception) may not be enough and there may be other "weird" error handling things in extensions that may also end up having a double stack trace.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to understand the problem exactly, it would be best to replicate in a new test.

Copy link
Contributor Author

@ericnorris ericnorris Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not exactly sure how to replicate it in a test, since it has something to do with how ext/soap manipulates zend_error_cb in C. I linked the ext/soap test that demonstrates this behavior in the commit message for this change: e843e06

When I was debugging this, I noticed that ext/soap changes the zend_error_cb here:

zend_error_cb = soap_error_handler;

It's then kind of hard to follow, but roughly speaking it ends up in here:

php-src/ext/soap/soap.c

Lines 1868 to 1871 in a091e52

add_soap_fault_ex(&fault, &SOAP_GLOBAL(error_object), code, ZSTR_VAL(message), NULL, NULL);
Z_ADDREF(fault);
zend_throw_exception_object(&fault);
zend_bailout();

If I remember correctly, the backtrace exists at this point, and then calling zend_throw_exception_object(&fault); and zend_bailout(); end up triggering the original zend_error_cb which prints out the exception and the backtrace, thus producing a double "Stack trace:" output.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having dealt a tiny bit with ext/soap my guess is the following:

SOAP Uses the bailout mechanisms (i.e. fatal errors) as its "exception" mechanism (as it predates exceptions existing in the language), and it now catches those to throw them as proper exceptions (although it sometimes also throws normal exceptions) which is probably the cause of this.

Refactoring ext/soap so it stops doing this is a long term goal of mine, but the extension is quite complex.

Copy link
Contributor Author

@ericnorris ericnorris Jan 8, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, that makes sense. Do you think the above is an appropriate fix that might apply to other extensions doing stuff like this? I'm repeating myself, but I'm concerned that there might be other complex flows involving zend_error_cb that might run into related issues.

Would it make more sense to do backtrace fetching only in php_error_cb? Perhaps that would solve the issue since presumably EG(exception) would be set and could be checked before fetching? Edit: though that would mean other extensions couldn't rely on the backtrace existing.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Frankly, the only other extensions I see overloading zend_error_cb are debuggers and profilers. ext/soap is very special in this instance.

@ericnorris
Copy link
Contributor Author

As you may be able to tell, I've been working on this pull request under the assumption that (a) the RFC will pass, (b) defaulting to on will win, and (c) defaulting to off in run-tests.php will win.

I'm still working through one final test failure, but I believe this should be mostly ready for review. I had to edit some tests that execute PHP within the test, as the inner invocation does not preserve the run-tests.php INI override.

Zend/zend.c Outdated Show resolved Hide resolved
Zend/tests/new_oom.phpt Outdated Show resolved Hide resolved
Zend/tests/fatal_error_backtraces_003.phpt Outdated Show resolved Hide resolved
Zend/zend_globals.h Outdated Show resolved Hide resolved
Comment on lines 908 to 909
zval_ptr_dtor(&EG(error_backtrace));
ZVAL_UNDEF(&EG(error_backtrace));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems correct, this will clear the message for the next error. Do you have a test for this?

@ericnorris
Copy link
Contributor Author

ericnorris commented Jan 8, 2025

Anyone have an idea as to why bug73037.phpt is failing in CI? It's not failing on my arm64 Macbook.

@nielsdos
Copy link
Member

nielsdos commented Jan 8, 2025

@ericnorris If I had to take a guess, it's a crash somewhere in the server. I'll make a checkout of your branch and see what's going on.

@nielsdos
Copy link
Member

nielsdos commented Jan 8, 2025

@ericnorris I can't reproduce this, but note that you don't initialize error_backtrace on NTS, executor_globals_ctor is only executed on ZTS and that's the only place where error_backtrace gets initialized. So I think that's the problem.
Please try the following:

From c571e797e3f974315f1514cda3fed50cda2a0b70 Mon Sep 17 00:00:00 2001
From: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
Date: Wed, 8 Jan 2025 21:28:30 +0100
Subject: [PATCH] Move initialization of error_backtrace

---
 Zend/zend.c             | 1 -
 Zend/zend_execute_API.c | 2 ++
 2 files changed, 2 insertions(+), 1 deletion(-)

diff --git a/Zend/zend.c b/Zend/zend.c
index 9c8ab08a0eb..a85336f2515 100644
--- a/Zend/zend.c
+++ b/Zend/zend.c
@@ -812,7 +812,6 @@ static void executor_globals_ctor(zend_executor_globals *executor_globals) /* {{
 	executor_globals->in_autoload = NULL;
 	executor_globals->current_execute_data = NULL;
 	executor_globals->current_module = NULL;
-	ZVAL_UNDEF(&executor_globals->error_backtrace);
 	executor_globals->exit_status = 0;
 #if XPFPA_HAVE_CW
 	executor_globals->saved_fpu_cw = 0;
diff --git a/Zend/zend_execute_API.c b/Zend/zend_execute_API.c
index 6f4afa77341..bf96d5e58d2 100644
--- a/Zend/zend_execute_API.c
+++ b/Zend/zend_execute_API.c
@@ -140,6 +140,8 @@ void init_executor(void) /* {{{ */
 	original_sigsegv_handler = signal(SIGSEGV, zend_handle_sigsegv);
 #endif
 
+	ZVAL_UNDEF(&EG(error_backtrace));
+
 	EG(symtable_cache_ptr) = EG(symtable_cache);
 	EG(symtable_cache_limit) = EG(symtable_cache) + SYMTABLE_CACHE_SIZE;
 	EG(no_extensions) = 0;
-- 
2.47.1

Co-authored-by: Niels Dossche <7771979+nielsdos@users.noreply.github.com>
@ericnorris ericnorris force-pushed the feat-error-backtrace-v2 branch from 2b8ca41 to aaa2443 Compare January 8, 2025 20:38
@ericnorris
Copy link
Contributor Author

Nice catch @nielsdos, that did in fact appear to be the issue. It wasn't super clear to me where the various initialization bits should go (see the "Why is zend_startup responsible for setting..." part of my PR message), and it seems like I was off the mark. Thank you!

@nielsdos
Copy link
Member

nielsdos commented Jan 8, 2025

executor_globals_ctor basically decouples the globals from the startup thread where necessary and initializes the thread local variants.
fatal_error_backtraces does not need explicit initialization because it's handled by the ini handler.

@TimWolla TimWolla changed the title feat: error backtrace v2 RFC: Error Backtraces v2 Jan 10, 2025
@@ -1436,6 +1436,11 @@ PHP_FUNCTION(error_get_last)

ZVAL_LONG(&tmp, PG(last_error_lineno));
zend_hash_update(Z_ARR_P(return_value), ZSTR_KNOWN(ZEND_STR_LINE), &tmp);

if (!Z_ISUNDEF(EG(error_backtrace))) {
ZVAL_COPY(&tmp, &EG(error_backtrace));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looking at this function context: The global should probably be renamed to last_error_backtrace for consistency with the other globals.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what about fatal_error_backtrace? since it won't be populated with anything other than fatal errors now.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've speculatively went ahead with renaming it fatal_error_backtrace, and as a consequence I've renamed the fatal_error_backtraces global (not the INI) to fatal_error_backtrace_on. Let me know if this change is disagreeable.

My concern with last_error_backtrace is that it may not be populated for most "last errors", but I don't know that I feel strongly here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No strong opinion here. last_fatal_error_backtrace might be another option to include both the common “last” prefix and the “fatal” indicator. Not a blocker in any case.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

last_fatal_error_backtrace might be another option to include both the common “last” prefix and the “fatal” indicator.

done!

ext/standard/tests/general_functions/error_get_last.phpt Outdated Show resolved Hide resolved
@ericnorris ericnorris force-pushed the feat-error-backtrace-v2 branch from ebe1983 to b1db022 Compare January 13, 2025 15:37
@ericnorris
Copy link
Contributor Author

ericnorris commented Jan 16, 2025

I hope folks don't mind a friendly nudge here - would someone mind taking a final look when they can and letting me know if there are any necessary changes, or otherwise merging this? I'd like to backport this to our internal PHP package so we can start using it, but I'd prefer to wait until it's merged.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't really see significant problems since I last looked at this. Please take a look at my comment.

@@ -1407,7 +1412,7 @@ static ZEND_COLD void php_error_cb(int orig_type, zend_string *error_filename, c
php_printf("%s<br />\n<b>%s</b>: %s in <b>%s</b> on line <b>%" PRIu32 "</b><br />\n%s", STR_PRINT(prepend_string), error_type_str, ZSTR_VAL(buf), ZSTR_VAL(error_filename), error_lineno, STR_PRINT(append_string));
zend_string_free(buf);
} else {
php_printf_unchecked("%s<br />\n<b>%s</b>: %S in <b>%s</b> on line <b>%" PRIu32 "</b><br />\n%s", STR_PRINT(prepend_string), error_type_str, message, ZSTR_VAL(error_filename), error_lineno, STR_PRINT(append_string));
php_printf_unchecked("%s<br />\n<b>%s</b>: %S in <b>%s</b> on line <b>%" PRIu32 "</b><br />%s%s\n%s", STR_PRINT(prepend_string), error_type_str, message, ZSTR_VAL(error_filename), error_lineno, ZSTR_LEN(backtrace) ? "\nStack trace:\n" : "", ZSTR_VAL(backtrace), STR_PRINT(append_string));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is supposed to be readable in HTML, but I doubt the backtrace will be displayed in a readable manner. Perhaps, this should be wrapped in a <pre> tag or smth alike?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch, I'll look into that.

Semi-related tangent so feel free to disregard: Is it on anyone's radar to refactor this function at all? I saw "Deprecate the error_prepend_string and error_append_string INI directives" in https://wiki.php.net/rfc/deprecations_php_8_5, which would help make this a little more readable, but I wonder if there's more we could do simplify this logic. I acknowledge that is probably easier said than done, however.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I doubt refactoring this function is on someone's TODO list. There are indeed some outdated pieces (e.g. the ones you mentioned but also xmlrpc_errors imo). However, the question with refactoring is always how much it's worth spending time on this. This function likely is not on any hot code path and refactoring also brings in a risk of breaking things 🤷

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would be also nice to break up those huge long lines when you are in it...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nielsdos apologies, but what's the best way to wrap a zend_string in <pre>? strpprintf? Or would it be the worst thing to just change this to %s%s%s and then pass in ZSTR_LEN(backtrace) ? "<pre>Stack trace:\n" : "", ZSTR_VAL(backtrace), ZSTR_LEN(backtrace) ? "</pre>" : ""? We could also drop adding the pre at all since we already have some fatal errors with stack traces that do not have pre: /~https://github.com/search?q=repo%3Aphp%2Fphp-src+%22%3Cb%3EFatal+error%3C%2Fb%3E%22&type=code, e.g.

<br />
<b>Fatal error</b>:  Uncaught Error: Cannot access private property test::$pri in %s004.test.php:8
Stack trace:
#0 {main}
  thrown in <b>%s004.test.php</b> on line <b>8</b><br />

@bukka I received feedback elsewhere to not break up long lines, though maybe here is a good counterpoint. I'd personally like to err on keeping it as-is since I'd like to be able to merge this relatively soon, but if people feel strongly I could break this up. This is kind of what I was getting at with my earlier comment - this function could probably be simplified to make it more readable, besides just the long line length.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wrap it with pre it should be done consistently.
If I were to write the code, I would go for the ZSTR_LEN() ternary trick.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we wrap it with pre it should be done consistently.

I'm not sure I follow - what I was saying was that we currently do not wrap it with <pre> (for errors that already have backtraces before my change), so if we wrap it with <pre> here we'd be inconsistent with existing errors like:

<b>Fatal error</b>: Uncaught Error: Cannot access private property test::$pri in %s004.test.php:8
Stack trace:
#0 {main}
thrown in <b>%s004.test.php</b> on line <b>8</b><br />

If I were to write the code, I would go for the ZSTR_LEN() ternary trick.

Thanks, I was leaning that way but I wanted your opinion. I'm still not convinced we should add the <pre> based on the above, but if you think we should I'll go the ZSTR_LEN route.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I received feedback elsewhere to not break up long lines,

In case you are referring to my remark here: #16937 (comment) - this case of a printf-style function is quite different and splitting it across multiple lines would be reasonable, because it accurately represents the complexity of the statement.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok let's not do the <pre> conversion.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@TimWolla I was! and when I said "maybe here is a good counterpoint", I meant that I could see that you might agree with splitting this up. My preference was to not do this myself though, since for consistency I should probably do that for all of them, and I was concerned with making it a topic for bikeshedding.

Copy link
Member

@nielsdos nielsdos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

barring any coding style changes, this looks good to me now.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants