-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow hosts to reuse an existing Realm Intrinsics record #1420
base: main
Are you sure you want to change the base?
Conversation
What kind of security guarantees are you looking for? Putting things in the same agent has a number of vulnerabilities by definition given current hardware. |
@annevk The read vulnerability that keeps showing up due to timing does continue to occur in process, but was also cross process for some time before patches went out. It is not just Node that has in-process concerns when you look at other environments like CloudFlare Workers which also are doing things in-process it seems. My main investigation is into limiting references to powerful primitives. I have several calls with various people advocating for multi-process architecture due to the read vulnerability after releasing one path forward for Node. Due to the nature of and size of the Node.js ecosystem the requirements such as C++ addons running in the same process, as JS, multi-process mitigations are not necessarily safer. This leads to being concerns with another type of attack that recurs just like the in-process reads on current architectures with the confused deputy that applies even to multi-process architectures. The ability to limit the communication of powerful primitives and even to isolate them is beneficial even if it does not save us from arbitrary reads. However, if someone has knowledge of arbitrary execution any attempt to have communication with powerful primitives, isolated or not, would be vulnerable; I do not know of any currently though. Mitigations of speculative timing attacks are not likely to roll out into hardware soon, and timers are somewhat difficult to prevent from being obtained; however, that does not mean it is not fruitful to look into the discussions of isolating primitives entirely. @erights and @dtribble have done some work towards this mode of thinking that does follow my own desires; however, they are seeking to put code within an in-process sandbox rather than the root of the process which Node.js would be restricted to. My investigations is towards the benefits and faults of these approaches rather than using Spectre as just what its name implies. The hope is that this PR can help with that investigation, and that of other companies and projects that also are doing in-process multi-tenancy. Right now the vast majority of Node.js applications do have multiple (HTTP) users per process and that is unlikely to change to per process per HTTP authorization, would have some new and interesting concerns for process management, and would break some usage of C++ addons. I am hopeful to get people to allow these investigations to continue and this PR would help. |
How does this PR relate to implementations? Do the JavaScript implementations used in Node.js support this usage mode, or is this PR a way to make the feature request to them? Are they interested in implementing this feature if the PR lands? |
@@ -6163,7 +6163,7 @@ <h1>CreateRealm ( )</h1> | |||
<p>The abstract operation CreateRealm with no arguments performs the following steps:</p> | |||
<emu-alg> | |||
1. Let _realmRec_ be a new Realm Record. | |||
1. Perform CreateIntrinsics(_realmRec_). | |||
1. If the host requires use of an existing Realm Record's intrinsics to serve as _realmRec_'s intrinsics, let _realmRec_.[[Intrinsics]] be assigned in an implementation-defined manner to an existing Realm Record's intrinsics record. Otherwise, perform CreateIntrinsics(_realmRec_). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
let foo be assigned
should probably be set foo
.
And actually, I don't think the imp-defness applies to the manner in which the set
happens, but rather (presumably) to the host's option.
So I suggest something like:
1. If the host, for implementation-defined reasons, requires that _realmRec_ reuse the intrinsics of an existing Realm Record _rr_, then
1. Set _realmRec_.[[Intrinsics]] to _rr_.[[Intrinsics]].
1. Else
1. Perform CreateIntrinsics(_realmRec_).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe this change is not necessary since we do not plan to reuse intrinsics between realm records. Mostly because this was meant to solve the identity discontinuity issue that can be solved via a membrane, or the upcoming parameterized evaluator (which we don't know yet how that will work).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@caridy The identity discontinuity issue is not solved by the membrane. It is hidden well enough for some practical purposes but not others.
The parameterized evaluator is exactly where this pull request is relevant. We all agreed to move that portion of the realms proposal into the SES / parameterized evaluator proposal. But the new understanding needs new names. The thing we're talking about should no longer be called a Realm record, because it is no longer per Realm. It is per evaluation context.
Attn @jfparadis @michaelfig
To my knowledge VMs do not support sharing the intrinsics record amongst Realms. V8 does allow reusing a global if it is not in use with We see multiple talks about Realms either for JS APIs for creating virtual hosts, relating to potential isolation of JS references, and frozen Realms. From the perspective of Node.js ongoing work is being looked at isolation and freezing of intrinsics that do related to those talks. A common part of all those discussions is the ability to create Realms and intricacies of sharing intrinsics between Realms (often due to what we call the "Identity Discontinuity"). I would like to attempt implementing features using the ideas of those talks, but doing so requires us to make changes to allow reusing intrinsics. |
OK, I think it's good to have this PR written to be concrete, but I would encourage us to hold back on landing it until it we get implementation interest/experience (rather than just interest/consensus in committee). |
To explain an example use case here, the new loader work in Node.js has the loaders running in their own Realm, kind of like service workers. These threaded loaders spin up with not-insignificant startup time partially due to spinning up the intrinsics. If we could use the existing intrinsics securely (ie if they were frozen), then this would provide a great way to get the necessary isolation with memory and CPU benefits. Node.js is kind of exploring this space ahead of browsers at this point due to these use cases being more specific to this platform. @littledan would you suggest working more closely with v8 on these kinds of investigations then first? |
in V8 at least (relevent for node), using the same intrinsics will provide negligible cpu/memory benefits because it already uses the same ones internally: https://v8.dev/blog/embedded-builtins |
@devsnek that does give memory savings, but isn't improving startup time https://v8.dev/blog/embedded-builtins#optimizing-for-performance ? |
@bmeck in the section you link they mention reworking most of the indirect calls into single lookups relative to the root I'm sure this still has good security reasons to exist, I'm just saying this isn't a blocker performance wise for node's use cases |
XS has implemented something akin to this PR via "Compartments", you can read about it in their blog https://blog.moddable.com/blog/secureprivate/ |
@bmeck, this change is clever, thanks. Questions:
|
|
I think we can start from Moddable's implementation in XS as their compartment API already handles a module loader and a module map. CC @phoddie
On the other hand, in the original compartment shim, we shim compartments using a "with" statement inside a function, which acts as a function who's lexical environment is the global environment. In the new compartment shim, there is no relation to multiple realm records, a choice we made because XS has no support for multiple realms, and that simplified everything. |
I fear that both the title of this PR and some of the recent discussion concerning it reflects a fundamental misunderstanding of the nature and mechanisms of the ES specification and how they are used to define the semantics of ECMAScript programs. Apparently by "Realm Intrinsics record" what is meant is the Record that is the value of the [[Intrinsics]] fields of a Realm Record. That Record (like all Records in the specification) is a value of an ECMAScript Specification Type:
In other words, the ES specification and its specification types exist to define the intended (by TC39) static and runtime semantics of ECMAScript language elements. They are not intended to describe how those semantics should or could be implemented. Specification type values are not required runtime data structures. It is completely up to an implementation to decide on the internal data structures (if any) that are used to actually implement the required semantics. In the case of a Realm's [[Intrinsics] Record. It primarily exists to support a specification shorthand. It provides an indirection mechanism that allows %Foo% to be be used, without additional verbiage, within ES pseudo-code to reference intrinsics objects that are specific to the current realm. In most cases (given the current ES semantics), such an indirection will be unnecessary at runtime within an implementation because the current realm (and its intrinsics) can be determined prior to evaluation of (most?) ECMAScript code. From that perspective, "Allow hosts to reuse an existing Realm intrinsics record" seems like a nonsense statement. A "host" presumably corresponds to an ECMAScript implementation and such implementation are allowed to represent the semantics of the %Foo% mechanism in any manner that correctly implements the language level semantics described by the specification. It may or may not have a per realm data structure for accessing the realm intrinsics. If it does have such a data structure it may or may not be shared (in whole or part) by multiple realms. The only requirement is observable conformance to the specified language semantics. The current specification does not talk about concepts such as shared [[Intrinsic]] Records because there are no current language semantics that require such sharing. Presumably what this PR is really about is "Define a semantics for multiple realms that share (some) common intrinsics". In that case, the proper starting point is to actually state informally the full semantics of such sharing. What, at the observable language level, will it mean if, for example, two realms map %Object% to the same object value. Note that isn't just about the [[Instrinsic]] Record? There may be deep invariants in the current specification that will be invalidated by that. Has anybody analyzed that possibility? After there is documented agreement on the semantics of shared intrinsics, only then it will be appropriate to talk about how to tweak the specification abstractions to express that semantics. |
@allenwb We're simply evaluating this PR, not defining the specs for realms and compartments. In other words, Bradley found a clever way to express what we need with appears to be a minimal change to the specs, so we are using this PR as a probe to explore how far we can go with it. Its one set of lenses to explore and understand the situation. As you say, the realm record is a concept of the specs. It combines everything related to a realm: [[intrinsics]], [[GlobalObject]], etc. We're all in agreement here, implementers might or might not have the concept of a realm record. However, the realm record implies more than that: it assumes a one-to-one relationship, that there is only one [[GlobalObject]] per [[intrinsics]]. That assumptions is visible in many aspects of the specs like GetGlobalObject(). The compartment API breaks that assumption. Something needs to change. Bradley's PR basically mean that a compartment is like a "child realm": it has everything a realm has, except that it delegates its intrinsic record. It allows specifications like abstract operation GetGlobalObject() to go without any change. However, it means that the definition of realm does change to include compartments. Regarding you question about two realms shearing the same intrinsics, yes it's been explored in the realm-shim, and even implemented in XS where the realm record has no intrinsics slot (there is only one set of intrinsics, and they are all global defines). |
3d0c24c
to
7a79833
Compare
This PR builds upon previous talks we have had about Realms and Frozen Realms in JS. After having several discussions in meetings about these topics, it seems like there is still division on the direction that could be taken regarding Realms. While working and attend both the Frozen Realms calls and Node.js Security WG issues, it seems like investigation into this area would be good for Node.js to get a better view of what it would be like to use Realms.
This allows a host to reuse an existing Realm's Intrinsics while exposing a different set of globals. This PR would enable hosts to create Realms with the same set of intrinsics such as Frozen Realms and Compartments that have been talked about in the past. This is intended only to be exposed at the host level and not be a required behavior for hosts not wishing to expose the behavior. Node.js has a specific interest in allowing this at the host level while investigating security policies of modules and the ability to run code in isolated spaces. This PR may be built upon after further investigation, but it intended to be minimal while that investigation continues.
Prior to this PR hosts are already able to supply their own global object and global this value; this PR expands the options of what a host may supply to include intrinsics from another existing realm. If there is need for having modified intrinsics such as having differing values for
Function
orFunction.prototype.constructor
, I feel that can be addressed in a separate PR.We could refactor Realm creation a bit to make the whole process much more centralized and easier to read, but I left the PR as minimal as possible for now to allow easier reading and expression of the host determining this choice. If it is desirable to refactor the Realm creation text, we can do so hopefully separately from this PR, but it seems like it should be done sometime given some odd wording while reading things.