-
Notifications
You must be signed in to change notification settings - Fork 597
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[CORE-5093] Data Transforms SDK: Schema Registry Support for JavaScript #21491
[CORE-5093] Data Transforms SDK: Schema Registry Support for JavaScript #21491
Conversation
8accf27
to
ee358eb
Compare
fa6b583
to
058b065
Compare
92ba82f
to
f777ce2
Compare
6c0ddec
to
c218201
Compare
@rockwotj - marking this ready for review in case you have time. going to be working on adding |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
First look is good - I will look more in-depth Monday.
70b0b70
to
52a3364
Compare
Otherwise any nontrivial JS transform will blow our stack allocation pretty consistently. Since the project is not very large, build time is not much of an issue. Qualitatively, any difference is not noticeable. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Adds `is_error` to distinguish between JavaScript errors and exceptions. Also adds a stack trace to the debug string for exceptions and errors both. Note that traversing the stack trace is a recursive operation, but generally we're going to be hitting this code when an exception escapes a transform function, killing the VM. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
As an alias for 'global' itself. This usage is (anectdotally) pretty common for code that expects to find itself in a nodeJS env and is not covered by our usual polyfills. Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Take double representation of the value, round to the nearest long, and cast to int32.
52a3364
to
7d1cbde
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good - a few smaller things, then we can ship this
src/transform-sdk/js/js_vm.cc
Outdated
@@ -132,12 +133,26 @@ value value::uint8_array(JSContext* ctx, std::span<uint8_t> data) { | |||
JSFreeArrayBufferDataFunc* func = [](JSRuntime*, void* opaque, void* ptr) { | |||
// Nothing to do | |||
}; | |||
|
|||
// /~https://github.com/quickjs-ng/quickjs/blob/da5b95dcaf372dcc206019e171a0b08983683bf5/quickjs.c#L51945-L51953 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some context here would be nice :)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah I had a comment inline that I must have deleted when I separated the two functions...
* ] | ||
* }; | ||
* | ||
* const subj_schema = sr_client.createSchema( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JS usually uses camelCase
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
i think i got them all
auto orig = value_as_bytes_view(buf); | ||
redpanda::bytes encoded = redpanda::sr::encode_schema_id( | ||
redpanda::sr::schema_id{sid.as_integer()}, orig); | ||
return qjs::value::uint8_array_copy(ctx, encoded); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why make a full copy? Can't we pass the original bytes and have a dtor cleanup the bytes when the JS object is GC'd? I guess this is fine now. Let's leave a TODO(perf) and not prematurely optimize.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can't we pass the original bytes
not sure how this would work mechanically. we get a vector back from encode_schema_id
, so I'm fairly certain you have to copy those out into an unmanaged buffer anyway.
could return a unique_ptr<uint8_t*>
from encode_schema_id
or something else that allows releasing the underlying memory. am I missing something? maybe overload encode_schema_id
to take a view into memory allocated at the call site?
generally agree that we can punt on this though. will add the TODO.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You would have to "move" the vector to a heap allocated pointer (no copies of the data) and make that the opaque value in the JSRuntime that is deleted when it's done. No need to do now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah, i tried that... though you know what, i may have been going through the wrong pointer param 🤦
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
probably fast follow if I have time
src/transform-sdk/js/main.cc
Outdated
// NOTE(oren): returning a view back into the target buffer might | ||
// be a terrible idea. Needs a good think. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's make a copy for now and optimize later. We should be able to keep a reference to the old data and make a subview: https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/TypedArray/subarray
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Introduces '@redpanda-data/transform-sdk-sr' module and export for creating a client instance Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
7d1cbde
to
e4e646e
Compare
value::uint8_array_copy: Use Uint8Array constructor that copies input buffer into the object. Useful for returning arbitrary buffers to user code (e.g. the result of `transform-sdk-sr.encodeSchemaID`). value::array_buffer_copy: similar reasoning
e4e646e
to
0d02c00
Compare
force pushes: CR comments and fix mangled push from yesterday |
Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
Returns an owning string/ArrayBuffer/Uint8Array depending on the input buffer type.
Includes TypeScript typings in sr-module/index.d.ts CORE-5577 Signed-off-by: Oren Leiman <oren.leiman@redpanda.com>
0d02c00
to
80a858a
Compare
force push adds perf TODOs |
This PR introduces
@redpanda-data/sr
, a C++-backed node module offering a client and various related types for accessing Redpanda Schema Registry from JavaScript or TypeScript transforms.Includes TypeScript type descriptions, primarily for developer QoL (LSP support).
Also Closes CORE-5577
Backports Required
Release Notes
Features