Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve handling of unsigned and decimal types in JSON #2404

Merged
merged 14 commits into from
Mar 19, 2024

Conversation

nicktobey
Copy link
Contributor

Fixes #2391

MySQL's JSON type differs from standard JSON in some important ways. It supports types not supported in standard JSON, such as separate types for integers and floats, an unsigned int type, and a decimal type.

Prior to this PR, we would convert values to JSON by using the encodings/json package to marshall the value to a JSON string and then unmarshall it to a go map. This is not only slow, but it's incorrect for these additional types.

The main purpose of this PR is to add special handling for these types that allow them to be stored in JSON documents. We also avoid generating and parsing JSON in places where it's not actually necessary, and fix bugs where decimals get incorrectly converted into strings, or unsigned ints get converted into signed ints.

Finally, this fixes an issue where we send incorrect bytes for JSON-wrapped decimal values along the wire.

nicktobey and others added 7 commits March 18, 2024 21:26
…converting to JSON. This is both faster and also prevents us from dropping information when writing to a table and reading it back. (Particularly about whether a value is unsigned, and whether it's a decimal.)
We still return "INTEGER" for a wrapped float that represents a whole number, because the `encoding/json` package always emits floats when parsing strings into JSON. It's not currently possible for us to distinguish between a "1" and "1.0" in a parsed JSON string, so we assume that it's an integer.
@nicktobey nicktobey requested a review from max-hoffman March 19, 2024 15:41
Copy link
Contributor

@max-hoffman max-hoffman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm! couple comments

@@ -67,6 +68,24 @@ func (t JsonType) Convert(v interface{}) (doc interface{}, inRange sql.ConvertIn
if err != nil {
return nil, sql.OutOfRange, sql.ErrInvalidJson.New(err.Error())
}
case int8:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

floats?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think a float case here matters for correctness, because in the default case floats and doubles will both get serialized and then parsed back as doubles, which is the correct behavior.

But adding a case here avoids doing more unnecessary serialization though, so I added it.

Since the observed behavior should be the same, I didn't add a test.


var InvalidJsonArgument = errors.NewKind("invalid data type for JSON data in argument 1 to function json_value; a JSON string or JSON type is required")

func GetJSONOrCoercibleString(js interface{}) (jsonData interface{}, err error) {
Copy link
Contributor

@max-hoffman max-hoffman Mar 19, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the naming implies the argument shape but doesn't give expectations of the return value. MapFromJSONOrCoercibleString? Maybe docstring?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's not necessarily a map, since it could also any other value that can be represented in JSON.

I renamed it to GetJSONFromWrapperOrCoercibleString and added a docstring.

@@ -314,6 +316,7 @@ func writeMarshalledValue(writer io.Writer, val interface{}) error {
writer.Write([]byte{'"'})
return nil
case json.Marshaler:
decimal.MarshalJSONWithoutQuotes = true
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

would we want to do this on init? seems like we'd want the lifecycle to be predictable

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We probably do want to do it on init, but I wasn't sure where the best place to do that would be. This is the only place where we marshal decimals (I think...) so it should be predictable, but confusing and easy to make unpredictable in the future.

Do you have a suggestion for where I should do this?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

my first thought was top of file in the startup routine func init() {}?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah. Not sure how I missed that. Fixed.

},
},
{
Query: `select JSON_TYPE(CAST("\"321.4\"" AS JSON))`,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we've had weird type edge cases with exponentials before, ex 1e2, "1e2", 1.0e2, ...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added additional tests.

@nicktobey nicktobey merged commit 734c46b into main Mar 19, 2024
7 checks passed
@nicktobey nicktobey deleted the nicktobey/json-wire branch March 19, 2024 17:38
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Potential regression: number cast to JSON no longer read as float
3 participants