No deps json #29

hkupty · 2023-03-13T00:21:28Z

This PR grew larger than what I anticipated, but it contains:

a new sink that doesn't use any underlying library;
better stacktrace and exception rendering;
a bloom filter implementation for avoiding logging long shared stacktraces;
property tests and performance tests;
bug fixing.

If this is correct, which I believe it is, we should be able to achieve a ~10% performance improvement (a conservative estimate) just by replacing Jackson with this implementation. Additionally, this implementation has a ByteBuffer that is reused throughout the runtime, so memory usage should stay steady.
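To illustrate the reused-buffer idea, here is a minimal sketch, not the PR's actual code: the class and method names (`ReusableBuffer`, `reset`, `write`) are hypothetical. It assumes a heap `ByteBuffer` that is cleared between messages and only reallocated when a message does not fit, in the spirit of the "Allow buffer to resize if log is too big" commit.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

/** Hypothetical sketch; names are illustrative and not taken from the PR. */
final class ReusableBuffer {
    private ByteBuffer buffer;

    ReusableBuffer(int initialCapacity) {
        this.buffer = ByteBuffer.allocate(initialCapacity);
    }

    /** Resets the write position for a new message without allocating. */
    void reset() {
        buffer.clear();
    }

    /** Appends bytes, growing the buffer only when the message does not fit. */
    void write(byte[] bytes) {
        if (buffer.remaining() < bytes.length) {
            int needed = buffer.position() + bytes.length;
            int newCapacity = Math.max(buffer.capacity() * 2, needed);
            ByteBuffer bigger = ByteBuffer.allocate(newCapacity);
            buffer.flip();      // switch to read mode so the written bytes can be copied
            bigger.put(buffer); // copy what has been written so far
            buffer = bigger;
        }
        buffer.put(bytes);
    }

    int capacity() {
        return buffer.capacity();
    }

    /** Returns the bytes written since the last reset, decoded as UTF-8. */
    String contents() {
        return new String(buffer.array(), 0, buffer.position(), StandardCharsets.UTF_8);
    }
}
```

Once the buffer has grown to fit the largest message seen, later messages reuse the same backing array, which is what would keep memory usage steady.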

This is crucial to make sure we are able to write valid JSON messages. The test results are promising so far, but more tests should be added to ensure our implementation is compliant.

Do not use regex for escaping
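As a rough illustration of escaping without regex, the sketch below walks the string one character at a time and emits the JSON escape sequences. `JsonEscaper` and `escape` are hypothetical names, and the PR's writer may well target its ByteBuffer directly instead of a `StringBuilder`.

```java
/** Hypothetical sketch of manual JSON string escaping; not the PR's implementation. */
final class JsonEscaper {
    private static final char[] HEX = "0123456789abcdef".toCharArray();

    /** Escapes quotes, backslashes and control characters for use inside a JSON string. */
    static String escape(String input) {
        StringBuilder out = new StringBuilder(input.length() + 16);
        for (int i = 0; i < input.length(); i++) {
            char c = input.charAt(i);
            switch (c) {
                case '"':  out.append("\\\""); break;
                case '\\': out.append("\\\\"); break;
                case '\b': out.append("\\b"); break;
                case '\f': out.append("\\f"); break;
                case '\n': out.append("\\n"); break;
                case '\r': out.append("\\r"); break;
                case '\t': out.append("\\t"); break;
                default:
                    if (c < 0x20) {
                        // Remaining control characters use the six-character hex escape form.
                        out.append("\\u00").append(HEX[c >> 4]).append(HEX[c & 0xF]);
                    } else {
                        out.append(c);
                    }
            }
        }
        return out.toString();
    }
}
```

A single pass like this avoids compiling and running a pattern on every log message, which is presumably the motivation for dropping regex.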

Also, hidden in this commit, is a bloom filter implementation to ensure we're not writing the same stacktrace multiple times. A bloom filter should be OK here because, even though we don't want false positives, the filter is relatively small (~1 KB), which is more than enough to keep the collision rate low. In fact, for m=8192 and k=6, we have a ~2% collision rate if we reach the maximum stack size (1024, usually). At a more common, though still very big, stack of 256 frames, we have p=0.0000249 (1 in 40,000), and at a stack of a mere 128 frames we're at almost 1 in every 2 million. In other words, the bloom filter implementation, using the native hashCode with a Swamidass & Baldi algorithm for the k hashes, will be space-efficient (~1 KB) and fast enough to give us a decent cache for our stack-trace dedupe.
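The collision figures quoted above can be checked with the standard bloom filter approximation p = (1 - e^(-kn/m))^k, where m is the number of bits, k the number of hashes and n the number of inserted elements. The snippet below is only a sanity check of that arithmetic; `BloomMath` is a hypothetical name and is not part of the PR.

```java
/** Hypothetical helper that evaluates the usual false-positive approximation. */
final class BloomMath {
    /** p = (1 - e^(-k*n/m))^k for m bits, k hash functions and n inserted elements. */
    static double falsePositiveRate(int m, int k, int n) {
        return Math.pow(1.0 - Math.exp(-(double) k * n / m), k);
    }

    public static void main(String[] args) {
        // Parameters from the description: m=8192 bits, k=6 hashes.
        for (int n : new int[] {128, 256, 1024}) {
            System.out.printf("n=%d p=%.3e%n", n, falsePositiveRate(8192, 6, n));
        }
    }
}
```

With m=8192 and k=6 this gives roughly 5.1e-7 for 128 frames, 2.5e-5 for 256 frames and 2.2e-2 for 1024 frames, consistent with the numbers in the description.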

There are a few things worth noting here:
- k has been reduced to 2, because 6 is overkill;
- the hash algorithms have been adapted (still to be tested for uniform distribution);
- the loop has been unrolled, since position 0 is effectively h1 only, while position 1 can pretty much be h2 only;
- since the usage was always check-then-mark, it did not make sense to calculate the hashes twice so close together, so the positions array is exposed and expected to be passed in as an argument to the respective `check` and `mark` functions, saving some time.

Now, the API is a bit confusing and deserves some refactoring for maintainability, but the commit will go as-is to avoid piling up too much.
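The check-then-mark shape described in this commit message could look roughly like the sketch below. Everything in it is an assumption made for illustration: the class name `TwoHashFilter`, the `positions` helper and especially the second hash (a simple multiplicative mix of `hashCode`) are not the PR's actual functions. Only the overall shape follows the text: k=2, an unrolled pair of positions, and a positions array computed once and passed to both `check` and `mark`.

```java
import java.util.Arrays;

/** Hypothetical k=2 bloom filter sketch; hash functions are placeholders, not the PR's. */
final class TwoHashFilter {
    private static final int BITS = 8192;              // m, as in the description
    private final long[] words = new long[BITS / 64];  // 1 KiB of state

    /** Computes both bit positions once so that check and mark can share them. */
    static void positions(Object element, int[] out) {
        int h1 = element.hashCode();
        // Assumed second hash: a multiplicative mix of h1, chosen only for illustration.
        int h2 = (h1 ^ (h1 >>> 16)) * 0x9E3779B9;
        out[0] = Math.floorMod(h1, BITS); // position 0 depends on h1 only
        out[1] = Math.floorMod(h2, BITS); // position 1 depends on h2 only
    }

    /** True if both bits are set, i.e. the element was probably seen before. */
    boolean check(int[] pos) {
        return isSet(pos[0]) && isSet(pos[1]);
    }

    /** Sets both bits for the element whose positions were computed earlier. */
    void mark(int[] pos) {
        set(pos[0]);
        set(pos[1]);
    }

    /** Clears the filter so the same instance can be reused for the next throwable. */
    void clear() {
        Arrays.fill(words, 0L);
    }

    private boolean isSet(int bit) {
        return (words[bit >>> 6] & (1L << (bit & 63))) != 0;
    }

    private void set(int bit) {
        words[bit >>> 6] |= 1L << (bit & 63);
    }
}
```

A caller would compute `positions(frame, pos)` once per stack frame, skip the frame if `check(pos)` returns true, and otherwise call `mark(pos)` and write the frame. Exposing the array trades some API clarity for fewer hash computations, which matches the trade-off the commit message acknowledges.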

This should make it a bit faster to log throwables

A few things to note:
- The logstash encoder test was not correct, because the encoder wasn't started;
- The exception has been moved out of the benchmark function, because creating it there skews the memory-usage results. Given that the idea is to isolate the loggers, creating a new exception instance on every function call becomes unnecessarily noisy.

Also, since we're sticking to a reusable instance in the logger, we no longer need to make it ThreadLocal.

This should improve hash distribution and it seems to have improved performance also

We should, even for large (1024) stacktraces (that are still not supported by the logger btw) strive to have a <1% collision rate.

sonarqubecloud · 2023-03-24T09:42:43Z

Kudos, SonarCloud Quality Gate passed!

0 Bugs
0 Vulnerabilities
0 Security Hotspots
8 Code Smells

No Coverage information
2.0% Duplication

hkupty marked this pull request as draft March 13, 2023 00:21

hkupty mentioned this pull request Mar 13, 2023

Missing synchronization and multiline exception messages #28

Closed

hkupty added 5 commits March 13, 2023 23:06

test: Better tests

d56b8ea

fix: Address bugs in json writer

b3474a6

feat: Add more methods to write json

27db421

fix: Add separators to fields

5efd67b

hkupty force-pushed the no-deps-json branch from a8895be to 5efd67b Compare March 13, 2023 22:08

hkupty added 22 commits March 17, 2023 23:53

test: Add property-based testing

38e579e

This is crucial to make sure we are able to write valid JSON messages. The test results are promising so far, but more tests should be added to ensure our implementation is compliant.

refactor: Manually escape characters

42f27ac

Do not use regex for escaping

fix: Ensure we don't overflow if value is 0

fe38082

test: Don't break on mount

644ed58

test: Compare using more data

45dcc50

fix(sonar): Address sonarcloud issues

e099d24

test(jmh): Add performance tests

b25ea01

feat: Allow buffer to resize if log is too big

e871e65

fix: Use the correct name for the thread

fed4199

refactor: Throwable logging trimming

1a092a0

This should make it a bit faster to log throwables

refactor: Ensure the buffer can resize

2ea1715

test: Rename var

d03b3fa

docs: Add docs on perfromance

e172a1e

refactor: drop the array in StackTraceFilter obj

793a312

Also, since we're sticking to a reusable instance in the logger, we no longer need to make it ThreadLocal.

test(jmh): Remove object creation from benchmark fns

b51d23e

refactor: Use Objects.hash for better hashes

669df67

This should improve hash distribution and it seems to have improved performance also

test(property): Ensure collision rate is < 1%

ac9aaf6

We should, even for large (1024) stacktraces (that are still not supported by the logger btw) strive to have a <1% collision rate.

test: unit tests should have default visibility

4bf824c

style(pmd): Address minor issues

536ee49

hkupty marked this pull request as ready for review March 24, 2023 09:42

Merge branch 'main' into no-deps-json

c26af30

hkupty merged commit 78cb697 into main Mar 24, 2023

hkupty deleted the no-deps-json branch March 30, 2023 20:09
