data: Replace custom formats with msgpack #374

fstachura · 2024-12-29T21:34:09Z

I previously attempted to refactor the data converters used in data.py in this branch. In the end I wasn't happy with the result, because I believe that writing a custom parser for each database is the wrong approach.

This PR replaces all data.py parsers with msgpack. Plain Python objects are serialized when written to the databases and deserialized when read back.
The main advantage is convenience: values can be manipulated like normal Python objects, and no string parsing is required anywhere in the codebase that interacts with the database.
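
To illustrate the round trip described above, here is a minimal sketch using the `msgpack` Python package. The helper names `to_db_value` and `from_db_value` and the record layout are hypothetical, chosen only for illustration; they are not the schema or API used by data.py.

```python
import msgpack  # third-party package: pip install msgpack


def to_db_value(obj):
    """Serialize a plain Python object into bytes for storage (hypothetical helper)."""
    return msgpack.packb(obj, use_bin_type=True)


def from_db_value(data):
    """Deserialize bytes read from storage back into plain Python objects (hypothetical helper)."""
    return msgpack.unpackb(data, raw=False)


# Hypothetical record: field names and layout are illustrative only.
record = {
    "name": "example_symbol",
    "entries": [[1024, "function", 42], [2048, "prototype", 7]],
}

stored = to_db_value(record)
restored = from_db_value(stored)

# The restored value is an ordinary dict of lists, so it can be
# edited directly without any string parsing.
restored["entries"].append([4096, "function", 99])
```

Note that msgpack has no tuple type, so tuples come back as lists by default.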

From what I remember, larger databases were also a bit smaller, mainly because large ints take less space in msgpack than in base-10 representation. To be fair, other datatypes carry some storage overhead.
I also wouldn't be surprised if average serialization/deserialization times were a bit lower, although I don't have numbers on that and I doubt it's a major bottleneck anywhere.
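
The size difference for integers can be checked directly. The sketch below compares the byte length of a base-10 text encoding with the msgpack encoding for a few sample values. The values are arbitrary examples, not measurements from a real database, and `sizes` is a hypothetical helper.

```python
import msgpack  # third-party package: pip install msgpack


def sizes(value):
    """Return (base-10 text size, msgpack size) in bytes for one integer."""
    text_size = len(str(value).encode("ascii"))
    packed_size = len(msgpack.packb(value))
    return text_size, packed_size


# Arbitrary sample values covering the msgpack integer widths.
for value in (7, 1000, 4_000_000_000, 2**63):
    text_size, packed_size = sizes(value)
    print(f"{value}: {text_size} bytes as text, {packed_size} bytes as msgpack")

# Strings go the other way: a short string carries a one-byte header.
short_string_size = len(msgpack.packb("abc"))  # 4 bytes for 3 characters
```

Small integers cost the same in both encodings, and the saving grows with the number of digits. How this balances against per-value headers on strings and containers depends on the actual data.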

Leaving this as a draft - I tested it only a little bit.

data: Replace custom formats with msgpack

7eaa77b
