Strawman ideas for vat secondary storage #1846
@FUDCo and I just got off a long phonecall to discuss this, unfortunately before I had a chance to read the text above, so I received a mix of those ideas and ones that we came up with during the call. I'm going to write down my thoughts first, before I forget them, and then go back and read that text and see what I got wrong. The form of his approach that we settled on uses a mutable The syscalls would be the same as described in #1831, although there was some discussion about whether these should be non-syscall functions that maintain the same atomicity/transaction boundaries as the rest of the vat's state. Using syscalls is an easy way to start out, for sure: everything goes through the crank buffer so it automatically gets committed/reverted in the same way as all other vat state. But as we move vats out into child processes, syscalls that must traverse the pipe back to the kernel might feel too expensive, and we might want to route them to a local DB (encouraged by the fact that this is per-vat secondary storage, and never needs to be accessed by more than one worker process at a time; whichever worker is hosting the vat at that moment). The requirement, though, is that one crank's changes to this local DB are committed atomically along with changes to the kernelDB, which is more difficult if they are in two separate databases. There are tricks to accomplish that (recording a generation number along with each change in the vat DB, committing the generation number into the kernelDB when the crank is committed, then on reload you discard any vatDB changes whose generation number is higher than what you find in the kernelDB), but I don't really want to engineer that stuff yet. 
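The generation-number trick mentioned above could look roughly like the sketch below. It is only an illustration under assumptions: `makeKernelDB`, `makeVatDB`, and all of their methods are hypothetical names, and both stores are in-memory stand-ins rather than real databases.

```javascript
// Hypothetical in-memory stand-in for the kernelDB: it only remembers the
// last generation (crank) that was committed.
function makeKernelDB() {
  let committedGeneration = 0;
  return {
    getGeneration: () => committedGeneration,
    commitCrank(generation) {
      // A real kernelDB would write this atomically with the rest of the crank.
      committedGeneration = generation;
    },
  };
}

// Hypothetical in-memory stand-in for the per-vat DB: an append-only log in
// which every change is tagged with the generation that produced it.
function makeVatDB() {
  const log = [];
  return {
    write(generation, key, value) {
      log.push({ generation, key, value });
    },
    // On reload, discard every change newer than the committed generation.
    recover(committedGeneration) {
      for (let i = log.length - 1; i >= 0; i -= 1) {
        if (log[i].generation > committedGeneration) {
          log.splice(i, 1);
        }
      }
    },
    // Replay the surviving changes in order; the last write to a key wins.
    get(key) {
      let result;
      for (const entry of log) {
        if (entry.key === key) {
          result = entry.value;
        }
      }
      return result;
    },
  };
}

const kernelDB = makeKernelDB();
const vatDB = makeVatDB();

// Crank 1 completes, so its vatDB change is covered by the committed generation.
vatDB.write(1, 'balance', 10);
kernelDB.commitCrank(1);

// Crank 2 writes to the vatDB but "crashes" before the kernelDB commit.
vatDB.write(2, 'balance', 99);

// On restart, the uncommitted change is discarded.
vatDB.recover(kernelDB.getGeneration());
const recoveredBalance = vatDB.get('balance');
```

The point of the design is that the kernelDB commit stays the single source of truth: the vatDB may run ahead, but never counts as committed on its own.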
In this system, liveslots would maintain a fixed-size list of (strong references to) On top of that, we have some number of ephemeral state objects (I don't know what to call them, let's use Liveslots creates a new pair of (ephemeral behavior object, ephemeral state object) each time deserialization encounters a virtual object ID. We (liveslots) have no way to tell when these go away, but we must maintain a WeakMap from the behavior object to the object ID so they can be serialized properly. The ephemeral behavior object has methods like The user-constructor-defined object uses The RAM usage is driven by the number of behavior/EphState objects held by user code (hopefully for short-lived operations), and the size of the LiveState table. The number of syscalls is driven by the size of the table (therefore the lifetime of a LiveState object) and the churn on its data. In this model, user code (the constructor function) is written to use a simple const purseConstructor = {
getBalance(state) {
return state.balance;
},
burn(state) {
state.balance = 0;
},
};
const c = liveslots.createContainer(purseConstructor);
function mint(initialBalance) {
return c.create(initialBalance);
} (I haven't thought about how The benefit of using The If user code hangs on to the behavior object for a while, that will keep the
We might also use the I think we have a range of potential approaches (this
|
Jessie doesn't have mutable properties. In our style we express all mutation by either
With only fixed properties, there's no need for proxies. Whatever you need to do to mark dirtiness, it wouldn't be by mutating an unknown property, and thus you'd need that same logic whether there's a proxy in the way or not. For ERTP, all mutation is only in the shared ledger stores, shared among instances which jointly encapsulate it. For Zoe, most mutations are in shared stores. Some are in lexical variables, and these would need to be transformed to track dirtiness. (See @michaelfig's https://github.com/Agoric/agoric-sdk/pull/1843/files/989030259c768410e9bb662058986e799fb4ec72..d0dc6ee6fc022673decb7b64095de458834d3378 ). None are even in accessor properties that give the appearance of assignable properties. We don't do that. That's a good thing. API surface is about requests. State representation is internal to the abstraction. |
I agree that state representation is internal to the abstraction, but that internal state is the state we need to store. I don't see how Jessie can be made to work in this context without modifying the JS engine or doing some fairly complex form of code transformation, neither of which seems plausible to me in the near or moderately near term. Mutable state has to live somewhere, and while my preferred answer would also be lexical variables, I'm not aware of anything in stock JS that would enable an automatable mechanism to capture their values for transfer to secondary storage. |
@erights @michaelfig I was recapping our discussion this morning to Brian and he harkened back to the July discussion with Dean that's summarized in #455 (comment) linked above, which, in particular yielded the following snippet which I'll just copy here for convenience: const purse = (me, c) => ({
deposit(other) {
const myBalance = c.get(me);
const otherBalance = c.get(other);
c.update(me, myBalance + otherBalance);
c.update(other, 0);
}
}); in this case, |
That's not a problem: let purseToBalance;
const { make: makePurse, makeWeakStore: makePurseWeakStore } = makeCollection(() => {
const me = {
deposit(other) {
const myBalance = purseToBalance.get(me);
const otherBalance = purseToBalance.get(other);
purseToBalance.set(me, myBalance + otherBalance);
purseToBalance.set(other, 0);
},
};
purseToBalance.init(me, 0);
return me;
});
purseToBalance = makePurseWeakStore(); The key insight from the discussion with @erights is that the "collection" is not the same as the "weakStore" that uses objects from the collection as keys. You have to make such stores explicitly in this model. |
I haven't updated my example code to implement that, but I'm working on FWIW, here is the proposed rewrite of the above code. let purseToBalance;
const { make: makePurse, makeWeakStore: makePurseWeakStore } = makeVatCollection(() => ({}), $hinit => $hdata => {
const me = {
deposit(other) {
const myBalance = purseToBalance.get(me);
const otherBalance = purseToBalance.get(other);
purseToBalance.set(me, myBalance + otherBalance);
purseToBalance.set(other, 0);
},
};
$hinit && purseToBalance.init(me, 0);
return me;
});
purseToBalance = makePurseWeakStore(); |
In this example the balance is not kept as part of the purse's closed over state, it's stored external to the purse in the weakStore. This solves the rights amplification problem at the cost of separating an object's state into two parts: one part that is private to the object (which the above example has none of and thus doesn't benefit from the code transformation) and a second part that is potentially shareable amongst others of its ilk that needs some kind of additional, separate storage and get/store mechanism that I don't think we've discussed. I can't tell if I'm still confused or if I just don't care for this design. |
Internal state is only one reason for the code transform. The other reason is to separate initialization from reconstitution, which is a critical separation to make the system work. We already have the separation of private vs. public data in our source code. Developers are familiar with it. |
The current Jessie implementation of the purse ledger system does exactly the same thing for the same reason. One purse needs to be able to amplify another purse into access to the other purse's balance. It's what we've come to call "class private per-instance state". Closure-captured variables and Smalltalk instance variables are both "instance private per-instance state" because only the instance can see its own state. Java private instance fields and JavaScript private fields are "class private per-instance state" because it is private to code in the class, but one purse can see into another's class-private state. The purse ledger system uses weakStore to get the same effect for object-as-closure objects, where the weakStore as a whole is held in a shared closure-captured variable.
As @michaelfig said, we did effectively discuss it in making the distinction between the "collection" as (large)hidden backing store vs (large)weakStore as explicit ocap-based collection abstraction exposed to the application programmer. Both of these are encapsulated by the same objects. But they're encapsulated at different levels of abstraction.
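To make the "class private per-instance state" idea concrete, here is a minimal sketch. It assumes a `makeWeakStore` stand-in built on a plain `WeakMap`; the real weakStore may behave differently, and `makeMint`/`makePurse` are simplified illustrations rather than the actual ERTP code.

```javascript
// Stand-in for a weakStore, mirroring only the init/get/set/has calls used
// in this thread. This is an assumption, not the real implementation.
function makeWeakStore() {
  const wm = new WeakMap();
  const assertKnown = key => {
    if (!wm.has(key)) throw new Error('key not found');
  };
  return {
    init(key, value) {
      if (wm.has(key)) throw new Error('key already registered');
      wm.set(key, value);
    },
    get(key) {
      assertKnown(key);
      return wm.get(key);
    },
    set(key, value) {
      assertKnown(key);
      wm.set(key, value);
    },
    has: key => wm.has(key),
  };
}

function makeMint() {
  // Held in a closure shared by every purse of this mint, invisible to others.
  const purseToBalance = makeWeakStore();
  function makePurse(initialBalance = 0) {
    const me = {
      getBalance: () => purseToBalance.get(me),
      deposit(other) {
        // Rights amplification: only real purses of this mint are keys, so a
        // forged "purse" object is rejected by the lookup.
        const otherBalance = purseToBalance.get(other);
        purseToBalance.set(me, purseToBalance.get(me) + otherBalance);
        purseToBalance.set(other, 0);
      },
    };
    purseToBalance.init(me, initialBalance);
    return me;
  }
  return makePurse;
}

const makePurseA = makeMint();
const p1 = makePurseA(5);
const p2 = makePurseA(7);
p1.deposit(p2);
```

One purse can reach another purse's balance through the shared store, while code outside `makeMint` cannot reach the store at all.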
"don't care for" sounds like a judgement. Even with your manual code, how do you solve it? You can't use |
As @FUDCo and I were noodling about this today, I remembered that @dtribble 's example is incomplete: it uses a simplified Purse model that we stopped using maybe a year ago (it lacks So we need a way to reify this "allowed to manipulate state" authority into something that can be closed over by both the Purse methods and the Payment methods. In the WeakMap expression, this is exactly the I think @michaelfig 's example uses the return value from The approach that I sketched out with Chip might be equivalent to Michael's, but I'll write it here just so I don't forget: function makeMint() {
const { open: openPurse, create: createPurse } = liveslots.createContainer();
const { open: openPayment, create: createPayment } = liveslots.createContainer();
function mint(initialBalance) {
const purseBehavior = {
deposit(me, other) {
openPurse(me).balance += openPayment(other).balance;
openPayment(other).balance = 0;
},
};
return createPurse(purseBehavior);
}
return mint;
} (this is incomplete, of course). The idea was that the In @dtribble 's original writeup, the behavior was defined early, and passed into the In mine, In @michaelfig 's code, I think the return value is the Unsealer, but it uses a mutable |
Good! It also makes the distinction between "collection" backing storage for implementing a huge number of instances vs the weakStore for giving them access to each other's class-private state, and only that state. |
My "not a problem" example is intended to be all hand-written. The "FWIW" example is what the hand-written example would be automatically rewritten into (preserving line numbers), by a transform in SwingSet automatically detecting the use of Names will be bikeshodden. |
Here, most of the code is exactly the same. This will often be the case, but will often not be the case. It is the case here because none of the per-instance state was in captured lexical variables. If it were, the rewrite would still be pleasant, readably related to the original, and line number preserving. But variable accesses would turn into similar looking property access. |
Brian's example is illuminating because it makes clear that access comes not from being of a type but from having the capability for access -- in this case an object of type A is reaching "into" an object of type B. In particular, it goes beyond the distinction that MarkM usefully labelled above as "class private per-instance state" vs. "instance private per-instance state", into something else more general where you can see the need for the capability to be explicitly wielded (actually this seems like a nice reification of another of MarkM's long standing notions, the systems-of-status vs. systems-of-property idea). |
Notes from today's meeting, where we walked through @michaelfig 's example in #1857 :
I'd like to see how it feels to retain instance state using only the Once we define these three functions, the next step is to rewrite @michaelfig 's ERTP example with it. We'd also discussed how We're thinking that We're not overly concerned about people mistakenly taking a Representative and using it as the key in a All of this is predicated on the scheme described up in the earlier comment, in which we keep a finite number of "data holder objects", they're dirtied by WeakStore |
Here's one approach to the Issuer I wrote based upon the notes above, using weak-stores exclusively for the state management, rather than an additional non-rights-amplifying instance store object. I think we could make this code work with a secondary-storage -based /* global makeExternalStore, sameRepresentative, makeWeakStore */
// these three come from (and share private tables with) liveslots
import { Remotable } from '@agoric/marshal';
import { makeInterface, ERTPKind } from './interfaces';
export function makeIssuerKit(allegedName) {
const brand = Remotable(makeInterface(allegedName, ERTPKind.BRAND));
const purseBalanceLedger = makeWeakStore(); // value: balance
const purseDepositLedger = makeWeakStore(); // value: deposit
const paymentLedger = makeWeakStore(); // value: balance
const depositLedger = makeWeakStore(); // value: Purse
const purseInterface = makeInterface(allegedName, ERTPKind.PURSE);
const paymentInterface = makeInterface(allegedName, ERTPKind.PAYMENT);
const depositInterface = makeInterface(allegedName, ERTPKind.DEPOSIT_FACET);
function buildPurse() {
const purse = harden({
deposit(srcPayment) {
const purseBalance = purse.getCurrentAmount();
const srcPaymentBalance = paymentLedger.get(srcPayment);
const newPurseBalance = purseBalance + srcPaymentBalance;
paymentLedger.delete(srcPayment);
purseBalanceLedger.set(purse, newPurseBalance);
return srcPaymentBalance;
},
withdraw(amount) {
const purseBalance = purse.getCurrentAmount();
const newPurseBalance = purseBalance - amount;
const payment = makePayment();
purseBalanceLedger.set(purse, newPurseBalance);
paymentLedger.init(payment, amount);
return payment;
},
getCurrentAmount() {
return purseBalanceLedger.get(purse);
},
getAllegedBrand() {
return brand;
},
getDepositFacet() {
return purseDepositLedger.get(purse);
},
});
// note: we cannot use 'purse' in a weak store yet, because it is not yet
// associated with a virtual object
return purse;
}
function initializePurse(purse, initialBalance = 0) {
purseBalanceLedger.init(purse, initialBalance);
const depositFacet = makeDeposit(purse);
purseDepositLedger.init(purse, depositFacet); // key: purse, value: deposit facet
}
const makePurse = makeExternalStore('purse', purseInterface, buildPurse, initializePurse);
function buildPayment() { // simpler because it doesn't reference self
return harden({
getAllegedBrand: () => brand,
});
}
function initializePayment(payment, amount) {
paymentLedger.init(payment, amount);
}
const makePayment = makeExternalStore('payment', paymentInterface, buildPayment, initializePayment);
function buildDeposit() {
const deposit = harden({
receive(payment) {
const purse = depositLedger.get(deposit);
return purse.deposit(payment);
},
});
return deposit;
}
function initializeDeposit(deposit, purse) {
depositLedger.init(deposit, purse);
}
const makeDeposit = makeExternalStore('deposit', depositInterface, buildDeposit, initializeDeposit);
return harden({ makePurse });
}
harden(makeIssuerKit); The reason I had to split up So internally, the external store does something like: const representatives = new WeakMap();
function makePurse(...args) {
const vobjid = allocateID();
const purse = buildPurse(); // 'purse' is the initial Representative
representatives.set(purse, vobjid);
initializePurse(purse, ...args);
return purse;
}
function hydratePurse(vobjid) {
const purse = buildPurse();
representatives.set(purse, vobjid);
return purse;
} The const foo = {
deposit(srcPayment) {
// stuff that references foo
},
};
return foo; Another pattern would be to define the methods with an extra initial "me" argument (us Python programmers spell this return {
deposit(me, srcPayment) { ... },
}; I can't help but wonder if we should resurrect Another option is to make the |
Please change the variadic This gives us extensibility if we ever want to pass different arguments to the So, we'd destructure the arguments explicitly in the initialize function, like: function initializePurse(purse, [initialBalance = 0]) {
purseBalanceLedger.init(purse, initialBalance);
const depositFacet = makeDeposit(purse);
purseDepositLedger.init(purse, depositFacet); // key: purse, value: deposit facet
} and its caller would be variadic, but pass the arguments without expansion: function makePurse(...makerArgs) {
const vobjid = allocateID();
const purse = buildPurse(); // 'purse' is the initial Representative
representatives.set(purse, vobjid);
initializePurse(purse, makerArgs);
return purse;
} |
Ok here's an updated version, which also changes /* global makeExternalStore, sameRepresentative, makeWeakStore */
// these three come from (and share private tables with) liveslots
import { Remotable } from '@agoric/marshal';
import { makeInterface, ERTPKind } from './interfaces';
export function makeIssuerKit(allegedName) {
const brand = Remotable(makeInterface(allegedName, ERTPKind.BRAND));
const purseBalanceLedger = makeWeakStore(); // value: balance
const purseDepositLedger = makeWeakStore(); // value: deposit
const paymentLedger = makeWeakStore(); // value: balance
const depositLedger = makeWeakStore(); // value: Purse
const purseInterface = makeInterface(allegedName, ERTPKind.PURSE);
const paymentInterface = makeInterface(allegedName, ERTPKind.PAYMENT);
const depositInterface = makeInterface(allegedName, ERTPKind.DEPOSIT_FACET);
function representPurse() {
const purse = harden({
deposit(srcPayment) {
const purseBalance = purse.getCurrentAmount();
const srcPaymentBalance = paymentLedger.get(srcPayment);
const newPurseBalance = purseBalance + srcPaymentBalance;
paymentLedger.delete(srcPayment);
purseBalanceLedger.set(purse, newPurseBalance);
return srcPaymentBalance;
},
withdraw(amount) {
const purseBalance = purse.getCurrentAmount();
const newPurseBalance = purseBalance - amount;
const payment = makePayment();
purseBalanceLedger.set(purse, newPurseBalance);
paymentLedger.init(payment, amount);
return payment;
},
getCurrentAmount() {
return purseBalanceLedger.get(purse);
},
getAllegedBrand() {
return brand;
},
getDepositFacet() {
return purseDepositLedger.get(purse);
},
});
// note: we cannot use 'purse' in a weak store yet, because it is not yet
// associated with a virtual object
return purse;
}
function initializePurse(purse, [initialBalance = 0]) {
purseBalanceLedger.init(purse, initialBalance);
const depositFacet = makeDeposit(purse);
purseDepositLedger.init(purse, depositFacet); // key: purse, value: deposit facet
}
const makePurse = makeExternalStore('purse', purseInterface, representPurse, initializePurse);
function representPayment() { // simpler because it doesn't reference 'self'
return harden({
getAllegedBrand: () => brand,
});
}
function initializePayment(payment, [amount]) {
paymentLedger.init(payment, amount);
}
const makePayment = makeExternalStore('payment', paymentInterface, representPayment, initializePayment);
function representDeposit() {
const deposit = harden({
receive(payment) {
const purse = depositLedger.get(deposit);
return purse.deposit(payment);
},
});
return deposit;
}
function initializeDeposit(deposit, [purse]) {
depositLedger.init(deposit, purse);
}
const makeDeposit = makeExternalStore('deposit', depositInterface, representDeposit, initializeDeposit);
return harden({ makePurse });
}
harden(makeIssuerKit); Is this good enough for contract developers to use? And/or is it likely we could automatically translate into this format from something that is good enough for developers? And/or should we just get started with this and experiment our way into a more ergonomic API as we go? |
The following is just my opinion: Good enough for us to use to get something working with @FUDCo's backend, but not other contract developers. It is likely that we can translate our way from something good enough for developers. And we should just get started with it, use it for our own issuer.js, replace its uses with the translation mechanism when we have it (but leave your API intact for tests, etc), and then explain how to use the ergonomic API to a wider audience. Other people's opinions are solicited. |
Naming bikeshed: I suggest
|
My take on externalizable storage for vats: I've now implemented the API I had in mind. The implementation is not yet integrated with the kernel but it does do crude serialization to a store, manages a limited size cache of in-memory object state with handling of faulting state in when needed, and has support for serializing and deserializing representatives by their internal IDs (though this isn't yet actually integrated with There's basically one function:
If there is a method named So you'd write something like this: function representThing(state) {
return {
initialize(label = 'thing', counter = 0) {
state.counter = counter;
state.label = label;
state.resetCounter = 0;
},
inc() {
state.counter += 1;
},
reset(newStart) {
state.counter = newStart;
state.resetCounter += 1;
},
relabel(newLabel) {
state.label = newLabel;
},
get() {
return state.counter;
},
describe() {
return `${state.label} counter has been reset ${state.resetCounter} times and is now ${state.counter}`;
},
};
}
const thingMaker = makeKind(representThing);
const thing1 = thingMaker('thing-1');
const thing2 = thingMaker('thing-2', 100); then treat Also, before you ask, via some sleight of hand the The existing weak store would get extended to be indexable by these representative objects using the Via another bit of sleight of hand, the value you return from the I rewrote Brian's rewriting of purse based on this API. It's a little simpler because he was keeping all the state in weak stores, but we only need the weak stores for rights amplification, so this reduces the number of weak stores from 4 to 1 and lets objects that just want to keep their own state do so naturally. It looks like this: /* global makeExternalStore, sameRepresentative, makeWeakStore */
// these three come from (and share private tables with) liveslots
import { Remotable } from '@agoric/marshal';
import { makeInterface, ERTPKind } from './interfaces';
export function makeIssuerKit(allegedName) {
const brand = Remotable(makeInterface(allegedName, ERTPKind.BRAND));
const paymentLedger = makeWeakStore(); // value: balance
function representPurse(state) {
const purse = {
initialize(initialBalance = 0) {
state.balance = initialBalance;
state.deposit = makeDeposit(purse);
},
deposit(srcPayment) {
const srcPaymentBalance = paymentLedger.get(srcPayment);
paymentLedger.delete(srcPayment);
state.balance += srcPaymentBalance;
return srcPaymentBalance;
},
withdraw(amount) {
state.balance -= amount;
const payment = makePayment();
paymentLedger.init(payment, amount);
return payment;
},
getCurrentAmount() {
return state.balance;
},
getAllegedBrand() {
return brand;
},
getDepositFacet() {
return state.deposit;
},
};
return purse;
}
const makePurse = makeKind(representPurse);
function representPayment() {
const payment = {
initialize(amount) {
paymentLedger.init(payment, amount);
},
getAllegedBrand: () => brand,
};
return payment;
}
const makePayment = makeKind(representPayment);
function representDeposit(state) {
return {
initialize(purse) {
state.purse = purse;
},
receive(payment) {
return state.purse.deposit(payment);
},
};
}
const makeDeposit = makeKind(representDeposit);
return harden({ makePurse });
}
harden(makeIssuerKit); |
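For readers trying to picture the one-function API, here is a purely in-memory sketch of what a `makeKind` with this calling convention might do. It is an assumption-laden illustration, not the implementation described above: there is no secondary storage, cache, or serialization, and hiding `initialize` by deleting it is a guess at the "sleight of hand".

```javascript
// In-memory sketch only. A real version would keep `state` in a bounded
// cache backed by secondary storage.
function makeKind(represent) {
  return function maker(...args) {
    const state = {}; // stand-in for the managed state object
    const representative = represent(state);
    if (typeof representative.initialize === 'function') {
      // Call initialize as a method, so code inside it can still refer to
      // the representative by its own name (as the purse example does).
      representative.initialize(...args);
      // Assumption: callers should never see or re-run initialize.
      delete representative.initialize;
    }
    return Object.freeze(representative);
  };
}

function representCounter(state) {
  return {
    initialize(label = 'thing', counter = 0) {
      state.label = label;
      state.counter = counter;
    },
    inc() {
      state.counter += 1;
    },
    get() {
      return state.counter;
    },
  };
}

const counterMaker = makeKind(representCounter);
const counter1 = counterMaker('thing-1', 100);
counter1.inc();
```

Because the representative exists before `initialize` runs, the initializer can use it as a weak store key, which is what the Payment example relies on.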
The const makeThingState = (label = 'thing', counter = 0) => ({
counter,
label,
resetCounter: 0,
});
const representThing = state => harden({
inc() {
state.counter += 1;
},
reset(newStart) {
state.counter = newStart;
state.resetCounter += 1;
},
relabel(newLabel) {
state.label = newLabel;
},
get() {
return state.counter;
},
describe() {
return `${state.label} counter has been reset ${state.resetCounter} times and is now ${state.counter}`;
},
});
const thingMaker = makeKind(makeThingState, representThing); This avoids doing surgery based on treating a property name like |
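A two-function `makeKind` along these lines might be sketched as follows. This is a hypothetical in-memory version (named `makeKind2` here only to distinguish it), with no secondary storage behind it.

```javascript
// Hypothetical two-function variant: one function builds the initial state,
// the other builds the representative around an existing state.
function makeKind2(makeState, represent) {
  return function maker(...args) {
    // The state is created before any representative exists.
    const state = makeState(...args);
    return Object.freeze(represent(state));
  };
}

const makeTallyState = (label = 'tally', count = 0) => ({ label, count });

const representTally = state => ({
  inc() {
    state.count += 1;
  },
  describe() {
    return `${state.label} is now ${state.count}`;
  },
});

const tallyMaker = makeKind2(makeTallyState, representTally);
const tally = tallyMaker('votes', 2);
tally.inc();
```

Note the trade-off this shape implies: `makeState` cannot refer to the representative, because the representative has not been built yet when it runs.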
Your code did lead me to a new insight about ERTP: Purses need to rights-amplify payments. But nothing needs to rights-amplify purses. Hence the existing ERTP should just use a lexical balance for purses and we should get rid of the |
Hmm, can you explain further? I don't think the |
Hi @katelynsills see #1889 |
Actually, what you propose was the way I originally did it, based on a similar misgiving about magic (which concern does linger in my sensibilities still). I tried out the One way to frame it is that with the |
I agree with all of that analysis. Between
and
I strongly prefer the latter. |
I think you might be able to achieve some kind of unification effect by passing the state initializer and the representative maker inline to const thingMaker = makeKind(
(label = 'thing', counter = 0) => ({
counter,
label,
resetCounter: 0,
}),
state => ({
inc() {
state.counter += 1;
},
reset(newStart) {
state.counter = newStart;
state.resetCounter += 1;
},
relabel(newLabel) {
state.label = newLabel;
},
get() {
return state.counter;
},
describe() {
return `${state.label} counter has been reset ${state.resetCounter} times and is now ${state.counter}`;
},
}),
); It's mainly just a code formatting trick and the added burden of getting all the nested punctuation right might not pay its way. And of course it's also nice for things to have names. But it's something to consider. (BTW, note that you don't need the harden as |
An unrelated issue that came up as I was working on implementing the above stuff was how |
Further thought: this is yet another thing that would be aided by having some kind of reliable brand-check mechanism. |
@FUDCo I like it. Question 1: in the Purse withdraw(amount) {
state.balance -= amount;
const payment = makePayment();
paymentLedger.init(payment, amount);
return payment;
}, it calls withdraw(amount) {
state.balance -= amount;
const payment = makePayment(amount);
return payment;
}, It raises a minor ergonomics issue, summarized as "There's More Than One Way To Do It (pejorative form)", but I don't think it'd be a big deal in practice: if you have the right weakstore amplifier, you can mess around with somebody else's state directly, but it's still nicer to ask them politely when they have an API for the particular thing you're doing. Point 2
Point 3
Question 4: @erights could you try writing the Purse example using your alternative two-function API? The problem I think you'll run into is that the Payment initializer needs to keep its amount in the rights-amplified weakstore (so a Purse method can get to it), which means it needs a key to use in that store, which is of course the Payment object/representative itself. But if the initializer is creating that object itself, then the object is not yet known to the In @FUDCo 's one-function API, the object is created first, then its In your proposed two-function version,
Concern 5: @FUDCo and I were talking about the implementation of the There will be multiple representatives for any given virtual object: we can't prevent that because we don't have WeakRefs at this level. We also don't get to know which representatives are still around, for the same reason. Those representatives must all interact with the same state. One way to achieve that is for them all to share the same In my all-state-is-rights-amplification approach, the user-provided code is doing obvious (and painful, I'll admit) The Proxy approach doesn't reveal anything to the user that would suggest So that's my concern with the magic |
Liveslots does not yet provide any `vatGlobals`, but this ensures that any ones it does provide in the future will be added to the `endowments` on the vat's Compartment. We'll use this to populate `makeExternalStore` -type functions, which want to be globals (because threading them from `vatPowers` into modules that need them would be too annoying), but must also be in cahoots with closely-held WeakMaps and "Representative" state management code from liveslots, as described in #455 and #1846. closes #1867
(my attempt to capture pieces of our 21-Oct-2020 kernel meeting, in no particular order) @dtribble wants to avoid "chained transitive reads": loading one virtual object whose state points to a second one should not need to load the second object's data right away. To imagine a worst-case, think about a linked list made of virtual objects: loading one element should not cause the entire rest of the list to be loaded into RAM. If userspace code compares a stashed Representative for object equality against a new Representative, and the answer reveals information about how many representatives have been created during the interval, that opens up a communication channel between otherwise independent object graphs. We would consider this "EQ representative" channel to be a fatal flaw. An equally fatal "EQ oldvalue" communication channel could result if userspace code uses a Representative to read data from secondary storage and stashes the object it gets back, and compares that for object equality against a new object read the same way but later. A system might have exactly one Representative per virtual object, or many. To achieve exactly one representative would require using WeakRefs, and would almost certainly cause the DB read/write calls to not be a deterministic function of vat operations. The "EQ Representatives" channel would be closed by having exactly one Representative per virtual object, or by having each message delivery create a new Representative. The state API made available to the userspace-provided behavior function (e.g.
The shallow API is the least magical. The deeper APIs are more ergonomic. I think we agreed that one-level deep is a reasonable compromise. The data we put into secondary storage will be marshalled using the same functions we use for After some thought, I have a set of three proposals. The simplest is:
Brian's Proposal A: no cache
Each The new Representative for each message delivery closes the "EQ representative" channel. The Downsides: if the behavior code reads the data a lot, we do lots of DB reads. Same for writes.
Proposal B: read cache, write-through cache
We still use new Representatives and new Each Upside: each read costs a deserialization, but not a new DB read. Lots of writes cause lots of DB writes.
Proposal C: write-back cache
|
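As an illustration of Proposal B above (read cache, write-through), here is a rough sketch. Everything in it is assumed for the example: `makeCountingDB` is a fake DB that counts operations, `makeWriteThroughState` is a hypothetical name, the key format is invented, and JSON stands in for the real marshalling functions.

```javascript
// Hypothetical DB stand-in that counts reads and writes.
function makeCountingDB() {
  const data = new Map();
  const counts = { reads: 0, writes: 0 };
  return {
    counts,
    get(key) {
      counts.reads += 1;
      return data.get(key);
    },
    set(key, value) {
      counts.writes += 1;
      data.set(key, value);
    },
  };
}

// One-level-deep state object: each named property has a getter and setter.
function makeWriteThroughState(db, vobjid, propNames) {
  const cache = new Map(); // property name -> serialized string
  const state = {};
  for (const name of propNames) {
    const key = `${vobjid}.${name}`;
    Object.defineProperty(state, name, {
      enumerable: true,
      get() {
        if (!cache.has(name)) {
          cache.set(name, db.get(key)); // only the first read touches the DB
        }
        // Deserialize on every read, so callers never get the same object
        // twice (this is what keeps the "EQ oldvalue" channel closed).
        return JSON.parse(cache.get(name));
      },
      set(value) {
        const serialized = JSON.stringify(value);
        cache.set(name, serialized);
        db.set(key, serialized); // write-through: every write reaches the DB
      },
    });
  }
  return Object.freeze(state);
}

const db = makeCountingDB();
db.set('o+1.balance', JSON.stringify({ amount: 10 })); // seed the store
const state = makeWriteThroughState(db, 'o+1', ['balance']);
const first = state.balance;
const second = state.balance;
state.balance = { amount: 11 };
```

Under these assumptions, two reads cost one DB read and two deserializations, while each write costs one DB write, matching the upside and downside listed for Proposal B.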
Closed by #1967 |
Thoughts on externalizable objects and secondary storage for vats
LiveSlotsStore
LiveSlotsStore is the low level store, used by liveslots but not exposed
directly to vat code. In principle it could be implemented on top of the same
store that the kernel db uses now, merely adding some conventions about
prefixing keys (with the vat ID and some constant) to avoid collisions.
However, it might also be helpful to put each vat's data in a separate store to
allow vats to be moved around, e.g., if a worker is run on a different machine
from the kernel.
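The key-prefixing convention could be as small as the sketch below. The function name `makeLiveSlotsStore`, the `.vs.` constant, and the use of a `Map` as the underlying kernel store are all assumptions made for illustration.

```javascript
// Wrap a shared key/value store so that each vat only sees its own keys.
function makeLiveSlotsStore(kernelStore, vatID) {
  // 'vs' is an arbitrary constant chosen for this sketch.
  const prefix = `${vatID}.vs.`;
  return {
    has: key => kernelStore.has(prefix + key),
    get: key => kernelStore.get(prefix + key),
    set: (key, value) => {
      kernelStore.set(prefix + key, value);
    },
    delete: key => kernelStore.delete(prefix + key),
  };
}

const kernelStore = new Map(); // stand-in for the kernel db's store
const storeA = makeLiveSlotsStore(kernelStore, 'v1');
const storeB = makeLiveSlotsStore(kernelStore, 'v2');
storeA.set('counter', '1');
storeB.set('counter', '2');
```

Both vats use the key `counter`, but the entries land under different prefixes in the shared store and cannot collide.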
EOSlot
EOSlot is just a pattern for identifying an externalizable object, in the same
manner as we do for kernel or vat objects/promises.
type EOSlot = 'eo$NN'; // yes, I know this isn't a real TypeScript type
EOCapData
EOCapData represents an externalizable object in its serialized form. It's like
CapData, but adds a type field so that data can be reunited with its behavior
when deserialized from the database.
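A small sketch of how EOSlot and EOCapData might fit together is shown below. It assumes that `NN` in `eo$NN` is a decimal number and that EOCapData is CapData's `{ body, slots }` plus a `type` string; the helper names (`parseEOSlot`, `registerType`, `serializeEO`, `deserializeEO`) are hypothetical, and JSON stands in for the real marshalling.

```javascript
// Assumption: an EOSlot is 'eo$' followed by a decimal number.
const EO_SLOT_PATTERN = /^eo\$(\d+)$/;

function parseEOSlot(slot) {
  const match = EO_SLOT_PATTERN.exec(slot);
  if (!match) throw new Error(`not an EOSlot: ${slot}`);
  return Number(match[1]);
}

// Registry that maps a type name to the function that supplies behavior.
const behaviors = new Map();

function registerType(type, makeMethods) {
  behaviors.set(type, makeMethods);
}

// Assumed EOCapData shape: { type, body, slots }.
function serializeEO(type, state) {
  return { type, body: JSON.stringify(state), slots: [] };
}

// The type field is what lets the data find its behavior again.
function deserializeEO(eoCapData) {
  const makeMethods = behaviors.get(eoCapData.type);
  if (!makeMethods) throw new Error(`unknown type: ${eoCapData.type}`);
  return makeMethods(JSON.parse(eoCapData.body));
}

registerType('counter', state => ({
  get: () => state.count,
  inc() {
    state.count += 1;
  },
}));

const stored = serializeEO('counter', { count: 3 });
const revived = deserializeEO(stored);
revived.inc();
```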
ExternalizableObject
ExternalizableObject
is an object that can be stored in the per-vat store. Itconsists of two parts: a hunk of mutable data (which must be serializeable using
our marshaling framework) and a deeply immutable collection of methods that
realize its behavior. The latter may be shared by multiple externalizable
objects, making it a kind of type or class. We require the code to be immutable
both so that it can be shared in this way and so that its meaning cannot change
over time (e.g., it can't close over any mutable state, which could allow it to
sense if it's been reloaded from secondary storage). The idea here is that the
methods are the only things (aside from the runtime) that have direct access to
the state. [Q: though there would seem to be a significant role for pure data
objects with no methods, which would require them to have public state if
they're to be useful; perhaps this could be realized by some conventional type
marker, like `null` or `undefined`?]
What I'd like to do is something like this:
This won't work, of course, since the passed-in methods carry their lexical
scope with them. I don't believe there's any graceful way to inject a bundle of
methods into a different lexical scope. Possibly there are some games we
can play with `Function.prototype.bind` or prototypal inheritance (e.g., set the prototype
of the `state` object to be the methods object), which would be a little weird
for us but might work -- this is a question to call in the real JS gurus on.
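To make the prototype idea concrete, here is a minimal sketch. The `attachBehavior` helper and the counter example are hypothetical, and the sketch only shows the mechanics: the state stays publicly readable on the resulting object, and nothing here checks that the methods avoid closing over other variables.

```typescript
// Hypothetical state and behavior shapes for this example.
interface CounterState {
  count: number;
}
interface CounterMethods {
  increment(this: CounterState): number;
  read(this: CounterState): number;
}

// One frozen methods object can be shared by many state objects.
const counterMethods: CounterMethods = Object.freeze({
  increment(this: CounterState) {
    this.count += 1;
    return this.count;
  },
  read(this: CounterState) {
    return this.count;
  },
});

// Make the methods object the prototype of the state object, so the methods
// reach the state through `this` instead of through their lexical scope.
function attachBehavior<S extends object, M extends object>(state: S, methods: M): S & M {
  return Object.setPrototypeOf(state, methods) as S & M;
}

const counter = attachBehavior({ count: 0 }, counterMethods);
counter.increment();
counter.increment();
```

Because the methods live on the prototype, `JSON.stringify(counter)` sees only the own `count` property, which fits the split between serializable data and shared behavior.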
We also need some way to validate that the implementation doesn't close over any
other variables, which requires some kind of parsing and analysis, but I'm
assuming we have tools in our toolbox for that which could hardly be more
involved than what we do for metering. He said.
The type Data denotes arbitrary serializable data. [Q: Do we have a name for that
abstraction? It's not CapData, which is already serialized... Also, can it be
expressed as a TypeScript type?]
It would be nice to have a clean way to detect if any values reachable from
`state` have been modified, so we can know if the state is dirty when we're
deciding about writing it to disk. Proxy?
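A minimal sketch of what the `Proxy` approach might look like, assuming plain objects and arrays only. The `trackDirty` helper is hypothetical. It wraps nested objects on each read, so object identity is not stable across reads, and writes made through references obtained before wrapping would go unnoticed.

```typescript
// Hypothetical helper: returns a view of `state` that calls `markDirty`
// whenever a property at any depth is set or deleted through the view.
function trackDirty<T extends object>(state: T, markDirty: () => void): T {
  const handler: ProxyHandler<any> = {
    get(target, prop, receiver) {
      const value = Reflect.get(target, prop, receiver);
      // Wrap nested objects lazily so writes at any depth are observed.
      return typeof value === 'object' && value !== null
        ? new Proxy(value, handler)
        : value;
    },
    set(target, prop, value, receiver) {
      markDirty();
      return Reflect.set(target, prop, value, receiver);
    },
    deleteProperty(target, prop) {
      markDirty();
      return Reflect.deleteProperty(target, prop);
    },
  };
  return new Proxy(state, handler);
}

let writes = 0;
const tracked = trackDirty({ name: 'a', tags: ['x'] }, () => {
  writes += 1;
});
const nameSeen = tracked.name; // a read does not count as a modification
const writesAfterRead = writes;
tracked.tags.push('y'); // a nested write is observed
const writesAfterPush = writes;
```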
VatStore
`VatStore` is the high-level store, the thing that's made available to vat code.
It's implemented on top of `LiveSlotsStore` but hides the mechanics of
serialization and deserialization. The vat is given a single `VatStore` as one
of its vat powers; this store can in turn be used to create object containers
for storing externalizable objects in.
`VatStoreContainer` allows objects to actually be stored and retrieved.
The idea here is that once an object has been associated with a container (via
the `enroll` method), it can be automatically saved and fetched by a
virtual-memory-like mechanism to be described shortly.
The `VatStore` maintains a table mapping external object type identifier strings
(`et$NN`) to the method objects passed to `makeExternalizableObject`. Each
unique method object gets its own type ID, but objects with the same method
object get the same type ID. This is used to reassociate an object's behavior
with its state when the latter is read from the database and instantiated in
memory.
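The type table could be sketched roughly as follows. The `makeTypeRegistry` helper and its method names are hypothetical, and this in-memory version ignores how the table would be re-established after a vat is reloaded.

```typescript
type Methods = Readonly<Record<string, (...args: any[]) => unknown>>;

// Hypothetical registry: maps method objects to `et$NN` IDs and back.
function makeTypeRegistry() {
  const idByMethods = new Map<Methods, string>();
  const methodsById = new Map<string, Methods>();
  let nextID = 1;
  return {
    // Return the existing type ID for a known methods object, or allocate one.
    typeIDFor(methods: Methods): string {
      let id = idByMethods.get(methods);
      if (id === undefined) {
        id = `et$${nextID}`;
        nextID += 1;
        idByMethods.set(methods, id);
        methodsById.set(id, methods);
      }
      return id;
    },
    // Used when deserializing, to reunite stored state with its behavior.
    methodsFor(id: string): Methods | undefined {
      return methodsById.get(id);
    },
  };
}

const registry = makeTypeRegistry();
const counterBehavior: Methods = { read: () => 0 };
const purseBehavior: Methods = { balance: () => 0 };
const idA = registry.typeIDFor(counterBehavior);
const idB = registry.typeIDFor(counterBehavior); // same object, same ID
const idC = registry.typeIDFor(purseBehavior); // different object, new ID
```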
EOPresence
`EOPresence` enables you to interact with the object.
An `EOPresence` encapsulates an `EOSlot` (an `eo$NN` slot string) and possibly a
pointer to an instantiated `ExternalizableObject` in memory. We define an
operator that I'm calling `O` here (we can bikeshed this later), analogous to
our existing `E` and `D` operators. You invoke a method on an externalizable
object with something like:
O(eopresence).method(args...)
which looks at what's encapsulated: if there's an instantiated
`ExternalizableObject` in memory, it synchronously invokes the method on the
object and returns the result it produced or throws any exception that it
throws. If the object is not in memory, it uses the encapsulated `EOSlot` to
fetch the corresponding `EOCapData` from the `LiveSlotsStore` and uses this to
instantiate an `ExternalizableObject`, which it memoizes in the `EOPresence`,
and then invokes the method. It is up to the liveslots layer to manage
a cache of instantiated objects, and to automatically commit any modified
objects to the container at the end of each crank.
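The lookup-then-invoke behavior of `O` could be sketched as below. Everything here is an assumption made for illustration: the `makeO` factory, the `Map` stand-ins for `LiveSlotsStore` and the type table, the `instance` field on the presence, and the convention that each method receives the state as its first argument. Writing modified state back at the end of the crank is not shown.

```typescript
interface EOCapData {
  body: string;
  slots: string[];
  type: string;
}
type Methods = Record<string, (state: any, ...args: any[]) => unknown>;

interface EOPresence {
  slot: string; // the encapsulated `eo$NN` slot string
  instance?: { state: unknown; methods: Methods }; // memoized when loaded
}

function makeO(
  store: Map<string, EOCapData>, // stand-in for LiveSlotsStore
  methodsByType: Map<string, Methods>, // stand-in for the type table
) {
  return function O(presence: EOPresence) {
    return new Proxy({}, {
      get(_target, name) {
        return (...args: unknown[]) => {
          let instance = presence.instance;
          if (instance === undefined) {
            // Not in memory: fetch the serialized form and instantiate it.
            const data = store.get(presence.slot);
            if (!data) throw new Error(`unknown slot ${presence.slot}`);
            const methods = methodsByType.get(data.type);
            if (!methods) throw new Error(`unknown type ${data.type}`);
            instance = { state: JSON.parse(data.body), methods };
            presence.instance = instance;
          }
          const method = instance.methods[String(name)];
          if (!method) throw new Error(`no such method ${String(name)}`);
          // Synchronous call: results and exceptions propagate directly.
          return method(instance.state, ...args);
        };
      },
    }) as Record<string, (...args: unknown[]) => unknown>;
  };
}

const store = new Map<string, EOCapData>([
  ['eo$1', { body: '{"count":41}', slots: [], type: 'et$1' }],
]);
const methodsByType = new Map<string, Methods>([
  ['et$1', { increment: (state: { count: number }) => (state.count += 1) }],
]);
const O = makeO(store, methodsByType);
const presence: EOPresence = { slot: 'eo$1' };
const firstResult = O(presence).increment();
```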
EOIndex
`EOIndex` is an indexed view onto the contents of a `VatStoreContainer`,
generated by calling the container's `makeIndex` method. The caller provides two
functions that define the operation of the index, a key mapper and a key
comparator:
type KeyMapper = (obj: EOPresence) => Key;
type Key = string;
A `KeyMapper` provides an application-domain translation between the object's
internal state and a key to be used for locating it in the index being created.
The key mapper will be called whenever an object is inserted into the index and
whenever a modified version of that object is written to the container. The
produced key values are strings, but no further semantics is imposed by the
indexing mechanism itself.
type KeyComparator = (k1: Key, k2: Key) => number;
A `KeyComparator` is used by the indexing mechanism to determine the ordering of
keys according to application-domain-specific logic. It returns the usual less
than zero, zero, greater than zero indicator used by comparator functions the
world over.
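As an illustration of why the comparator is supplied separately from the mapper, here is a small sketch in which keys are numeric strings. The `Account` type stands in for whatever the mapper would read through an `EOPresence`, and the generic `KeyMapper<T>` is a simplification made for this example.

```typescript
type Key = string;
type KeyMapper<T> = (obj: T) => Key;
type KeyComparator = (k1: Key, k2: Key) => number;

// Hypothetical application object for this example.
interface Account {
  owner: string;
  balance: number;
}

// Keys are strings, so the mapper stringifies the balance...
const byBalance: KeyMapper<Account> = (acct) => String(acct.balance);
// ...and the comparator restores numeric ordering ('9' sorts before '100').
const numericOrder: KeyComparator = (k1, k2) => Number(k1) - Number(k2);

const accounts: Account[] = [
  { owner: 'alice', balance: 100 },
  { owner: 'bob', balance: 9 },
  { owner: 'carol', balance: 25 },
];
const sortedOwners = [...accounts]
  .sort((a, b) => numericOrder(byBalance(a), byBalance(b)))
  .map((acct) => acct.owner);
```

With plain string comparison the same keys would order as `'100'`, `'25'`, `'9'`, which is the kind of application-specific difference the comparator exists to resolve.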
The `index` method instructs the index to keep an entry for the indicated
object.
The `lookup` method performs a query and returns an array of the objects that
match.
A strawman query scheme: