Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

No deps json #29

Merged
merged 28 commits into from
Mar 24, 2023
Merged

No deps json #29

merged 28 commits into from
Mar 24, 2023

Conversation

hkupty
Copy link
Owner

@hkupty hkupty commented Mar 13, 2023

This PR grew larger than what I anticipated, but it contains:

  • a new sink that doesn't use any underlying library;
  • better stacktrace and exception rendering;
  • a bloom filter implementation for avoiding logging long shared stacktraces;
  • property tests and performance tests;
  • bug fixing.

hkupty added 5 commits March 13, 2023 23:06
If this is correct, which I believe it is, we should be able to achieve ~10% performance improvement (being conservative),
just by replacing jackson with this implementation.

Additionally, this one has a ByteBuffer that is reused through runtime, so we are able to achieve steady mem usage
hkupty added 22 commits March 17, 2023 23:53
This is crucial to make sure we are able to write valid json messages.
The result of the tests are so far promising, but more should be added to
ensure our implementation is compliant.
Do not use regex for escaping
Also, hidden in this commit, is a bloom filter implementation to ensure we're not
writing the same stacktrace multiple times. A bloom filter should be OK here because,
even though we don't want false-positives, the filter is relatively small (~1kb), which is
more than enough to have a minimum collision rate.

In fact, for m=8192 and k=6, we have 2% collision rate if we reach maximum stack size (1024, usually).
At a fairly more common, though still very big stack of size 256, we have p=0.0000249 (1 in 40.000)
and at a stack size of mere 128 frames we're almost at 1 in every 2 million.

So, in other words, the bloom filter implementation using the native hashCode with a Swamidass & Baldi
algorithm for k-hashes, will be space-efficient (~1kb) and fast enough for us to have a decent cache for our
stack-trace dedupe.
There are a few worth noting here:
  - k has been reduced to 2, because 6 is overkill;
  - the hash algorithms have been adapted (to be tested for uniform
    distribution still)
  - The loop has been unrolled since position 0 is effectively h1 only
    while position 1 can pretty much be h2 only then;
  - Since the usage was check-then-mark always, it did not make sense to
    re-calculate the hashes twice so close together, so the positions
    array is exposed and expected to be passed in as argument for the
    respective `check` and `mark` functions, saving some time;

Now, the API is a bit confusing and deserves some refactoring for
maintainability, but the commit will go as-is to avoid piling up too
much.
This should make it a bit faster to log throwables
A few things to note:
 - The logstash encoder test was not correct, because the encoder wasn't
   started;
 - The exception has been moved away from the function because throwing
   the exception skews the result in terms of mem usage. Given the idea
   is to try to isolate the loggers, creating a new exception instance
   at every function call becomes unnecessarily noisy.
Also, since we're sticking to a reusable instance in the logger,
we no longer need to make it ThreadLocal.
This should improve hash distribution and it seems to have improved performance also
We should, even for large (1024) stacktraces (that are still not supported by the logger btw) strive
to have a <1% collision rate.
@hkupty hkupty marked this pull request as ready for review March 24, 2023 09:42
@sonarqubecloud
Copy link

Kudos, SonarCloud Quality Gate passed!    Quality Gate passed

Bug A 0 Bugs
Vulnerability A 0 Vulnerabilities
Security Hotspot A 0 Security Hotspots
Code Smell A 8 Code Smells

No Coverage information No Coverage information
2.0% 2.0% Duplication

@hkupty hkupty merged commit 78cb697 into main Mar 24, 2023
@hkupty hkupty deleted the no-deps-json branch March 30, 2023 20:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant