Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve NDK stack-unwinder data #1731

Merged
merged 1 commit into from
Aug 11, 2022
Merged

Conversation

lemnik
Copy link
Contributor

@lemnik lemnik commented Aug 9, 2022

Goal

Improve the NDK stack-unwinder availability to symbol data.

Design

Introduced the ability to refresh the cached symbol / maps data in the native stack-unwinder:

  • a new bugsnag_refresh_symbol_table function for native code
  • a new BugsnagNDK.refreshSymbolTable() as part of the bugsnag-plugin-android-ndk module for Java / Kotlin code using System.loadLibrary after startup
  • when the app startup is marked as complete, the native layer will refresh to handle any libraries that were loaded between Bugsnag.start and startup completion (for example in the main activity, or similar)

Testing

Moved the CXXExternalStackElementScenario into the cxx-scenarios-bugsnag project, so that it can use the new API and ensure it behaves as expected.

@lemnik lemnik requested a review from kattrali August 9, 2022 13:15
@lemnik lemnik force-pushed the PLAT-8777/refresh_symbol_table branch from afc24f3 to b7baf8d Compare August 9, 2022 13:15
@bugsnagbot
Copy link
Collaborator

bugsnagbot commented Aug 9, 2022

Android notifier sizes

Format Size impact of Bugsnag (kB) Size impact of Bugsnag when Minified (kB)
APK 1886.17 1643.46
arm64_v8a 663.95 422.28
armeabi_v7a 598.42 356.75
x86 737.66 495.99
x86_64 704.9 467.33

Generated by 🚫 Danger

@lemnik lemnik force-pushed the PLAT-8777/refresh_symbol_table branch from b7baf8d to f5ffb10 Compare August 10, 2022 08:37
@lemnik lemnik force-pushed the PLAT-8777/refresh_symbol_table branch from f5ffb10 to 3ca097b Compare August 10, 2022 09:51
@lemnik lemnik merged commit 1e076ed into next Aug 11, 2022
@lemnik lemnik deleted the PLAT-8777/refresh_symbol_table branch August 11, 2022 08:37
@lemnik lemnik mentioned this pull request Aug 17, 2022
Comment on lines +9 to +12
if (Bugsnag.isStarted()) {
val client = Bugsnag.getClient()
client.ndkPlugin?.nativeBridge?.refreshSymbolTable()
}

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isStarted() isn't thread safe - #1795

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even if isStarted() were thread safe, bugsnag could be started straight after isStarted() has returned false so a refresh and therefore symbols could be lost.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @jgreen210, thanks very much for the input and suggestions. We did consider these edge cases when we implemented
the isStarted method, but felt that given its niche use-case the small risk of stale-reads was acceptable.

It’s unlikely that a false-negative isStarted will lead to “missing symbols” here since the native layer will load the
symbol table during Bugsnag.start. Once the app is marked as "started" the symbol table is refreshed in the background. While there are possible races between these points, there are equally possible races between every dlopen and a native crash. Given the nature of the symbol table and its loading requirements, it is deadlock prone guarantee we will have every possible symbol from every possible library load (we would need to refresh the table during a native crash, which risks deadlock or heap corruption). In most (almost all) Android apps the native libraries are loaded before Bugsnag.start is called, and refreshSymbolTables exists entirely for niche cases where native libraries are loaded late in the app execution (for example when starting an Activity or Service).

While isStarted is lacking any memory barriers in its current state, it is extremely unlikely that
isStarted will cache the null client reference in a normal application. We have experimented with several contrived situations to test this, and even a strong compareAndSet on a completely unrelated variable
is enough to flush the caches and trigger the next load to retrieve the correct value (CPU cores don't have huge
caches after all).

isStarted is intended for situations where you are willing to lose data, otherwise there is no good reason
to use it as your app would be structured such that start is always called before calling other Bugsnag APIs.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants