-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Introduce GetRangeAndFlatMap to push computations down to FDB #5609
Conversation
AWS CodeBuild CI Report for macOS Catalina 10.15
|
AWS CodeBuild CI Report for Linux CentOS 7
|
fdbserver/storageserver.actor.cpp
Outdated
state Transaction tr(data->cx); | ||
tr.setVersion(version); | ||
// TODO: is DefaultPromiseEndpoint the best priority for this? | ||
tr.info.taskID = TaskPriority::DefaultPromiseEndpoint; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this means the second read go through the read path front door.
Can we craft something like how you get the first read?
GetValueRequest req(Span().context, key, version, Optional<TagSet>(), Optional<UID>());
data->actors.add(data->readGuard(req, getValueQ));
GetValueReply reply = wait(req.reply.getFuture());
If the second read is also on the same SS, you will save the overhead of getKeyLocation() and the serialization and deserialization on the SS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is in the same SS, it should have returned here /~https://github.com/apple/foundationdb/pull/5609/files#diff-c4dbfae809f757f2061099ee5021ce2cbbf0f4dbc2f0f18480a4856d63d033ceR1623. Note /~https://github.com/apple/foundationdb/pull/5609/files#diff-c4dbfae809f757f2061099ee5021ce2cbbf0f4dbc2f0f18480a4856d63d033ceR1630-R1637 is a fallback when the data is not in current SS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you meant quickGetValue
= the current two-hop solution?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The first query's code is at getKeyValuesAndHopQ
which is effectively same as a normal range scan aka getKeyValuesQ
.
hop()
issues the second queries from the results returned by the first query.
quickGetValue
is the second query.
fdbserver/storageserver.actor.cpp
Outdated
// TODO: is DefaultPromiseEndpoint the best priority for this? | ||
tr.info.taskID = TaskPriority::DefaultPromiseEndpoint; | ||
Future<Optional<Value>> valueFuture = tr.get(key, Snapshot::True); | ||
// TODO: async in case it needs to read from other servers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we reply error if it needs to hop to another server and let client side retry?
Doing like this means the first SS is doing the same amount of work client does if the second-hop key is not colocated with the first-hop key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that's another design choice. See the discussion in the design doc.
I'm currently using this one because it has the simplest API and least modification in client.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@xumengpanda I would prefer first SS "do the work". If we can bypass JNI and not have to serde index records, it is a win
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah, I forgot the JNI overhead when I wrote the comment.
Letting the first SS forward the requests and do the work can avoid JNI and is likely better.
I'd like to point out its tradeoff as well:
- Suppose the data flow is client->SS1 (to find index) -> SS2 (to find the value) -> SS2 directly reply to client, which means the first SS forwards the request to SS2
- SS1 will need to cache the shard-to-SS mapping, which causes extra memory. That means we use some extra memory on storage node to save some cpu time on compute node.
We probably don't have to decide the corner case now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three possibilities, I believe:
- first SS does all the proxy calls to wherever
- C client does any proxy calls when hopping within SS is not possible
- API client does any proxy calls
3 has the JNI overhead, but 2 doesn't. It's more modification, as @nblintao says, but I think it might be the most robust. Particularly once we undertake to do range scans of the hopped-to place.
fdbserver/storageserver.actor.cpp
Outdated
state Transaction tr(data->cx); | ||
tr.setVersion(version); | ||
// TODO: is DefaultPromiseEndpoint the best priority for this? | ||
tr.info.taskID = TaskPriority::DefaultPromiseEndpoint; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe you meant quickGetValue
= the current two-hop solution?
fdbserver/storageserver.actor.cpp
Outdated
GetKeyValuesReply _r = | ||
wait(readRange(data, version, KeyRangeRef(begin, end), req.limit, &remainingLimitBytes, span.context)); | ||
|
||
std::cout << "read range done, start hopping" << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TraceEvent is better because it can be ingested into our log search sys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I will change to TraceEvent when I finalize the code.
changeCounter, | ||
KeyRangeRef(std::min<KeyRef>(begin, std::min<KeyRef>(req.begin.getKey(), req.end.getKey())), | ||
std::max<KeyRef>(end, std::max<KeyRef>(req.begin.getKey(), req.end.getKey())))); | ||
if (EXPENSIVE_VALIDATION) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
making it knob will make your perf & correctness evaluation life on real clusters better
fdbserver/storageserver.actor.cpp
Outdated
result.arena.dependsOn(hopKey.arena()); | ||
|
||
std::cout << "quickGetValue start key: " << hopKey.toString() << std::endl; | ||
Optional<Value> valueOption = wait(quickGetValue(data, hopKey, input.version)); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we can make sure the input.data indexed data is still on the same SS, quickGetValue() will not need the transaction inside it. right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, see /~https://github.com/apple/foundationdb/pull/5609/files#diff-c4dbfae809f757f2061099ee5021ce2cbbf0f4dbc2f0f18480a4856d63d033ceR1619-R1622. Most of the queries should hit this and return, without transactions.
@@ -0,0 +1,171 @@ | |||
/* | |||
* IndexPrefetchDemo.actor.cpp |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it means you can run this in multi-tester for perf tests
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't understand. Could you elaborate it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
multi-tester is a load generator framework FDB uses to generate load.
Each fdbserver can start as a tester role, which will pick up the specified workloads
, like the one you wrote.
These testers will behave as "clients generate traffic to the test cluster" .
Let's sync offline on how to set it up in our internal infra.
47c7f2a
to
33ad45e
Compare
AWS CodeBuild CI Report for macOS Catalina 10.15
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
fdbclient/StorageServerInterface.h
Outdated
@@ -278,11 +283,35 @@ struct GetKeyValuesReply : public LoadBalancedReply { | |||
} | |||
}; | |||
|
|||
// Describe how to hop. Assume the keys are encoded as Tuple, hence there are concept of "elements". | |||
struct HopInfo { | |||
// Do GetKeyValues, extract the last suffixLen elements from each keys. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we could make this a bit more efficient with modest additional cost in request size by adding a number of bytes to skip before starting to parse a tuple. This would also cover cases where, for some reason, the entire key isn't a well-formed tuple, but the same secondary index pattern is followed after some point.
fdbclient/StorageServerInterface.h
Outdated
int suffixLen; | ||
// Form hop keys by appending the fetched suffixes to hopPrefix. Do GetValue using the hop keys. | ||
// TODO: Support getting ranges using the hop keys as prefixes. | ||
KeyRef hopPrefix; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this covers the important cases, but to make the scope explicit.
The simplest case is where the scanned keys are
dir, 'I', value, id1, id2
and the key-values to be fetched are
dir, 'R', id1, id2, n
(n
might be a split record counter or an exploded per-field key)
This also handles cases where the scanned keys are
dir, 'I', id1, value, id2
and the key-values to be fetched are still
dir, 'R', id1, id2, n
provided that the scan fixes id1
so that it can be included in the skip prefix and the hop prefix. That is,
WHERE id1 = ? AND value BETWEEN ? AND ? ORDER BY value
It is not sufficient for cases where id1
needs to be extracted for the hop. Such as,
WHERE id1 > ? ORDER BY id1, value
Nor for the case where the index is
dir, 'I', id2, id1
for scans like
id2 > ? ORDER BY id2
Again, I am not sure more complicated tuple element shuffling descriptors are worth the trouble. The two supported cases match the ones I have seen in real life.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can update int suffixLen
to a vector of picked item indexes in the expected order.
To make it more expressive (Is there a use case of interleaving the fetched items with the constant items?), and probably more elegant, I'm also thinking about using a single tuple to express both suffixLen
and hopPrefix
. It could be something like:
dir, "R", "%2", "%4", "%.*"
"%.*" is representing doing a range query rather than point query.
Escaping on the strings are needed, of course.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a use case of interleaving the fetched items with the constant items?
We might distinguish three kinds of hop elements:
- variable, fetched from the scanned keys
- constant, occurring in both the scan range and the hop prefix
- constant, occurring only in the hop prefix
The scanned range might arbitrarily permute its elements for the sake of ordering or inequality comparisons. So the first two interleave arbitrarily, too.
Placing the last someplace other than at the front is maybe a little more contrived, but suppose you have
dir, "I", value, group, id
and
dir, "R", group, "widget", id
where the constant type (widget) is implied for the original scan -- it's an index on that type only -- and so not part of its keys, but the typing is normalized within some form of group.
ACTOR Future<Optional<Value>> quickGetValue(StorageServer* data, StringRef key, Version version) { | ||
if (data->shards[key]->isReadable()) { | ||
try { | ||
// TODO: Use a lower level API may be better? Or tweak priorities? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also think this needs to be a range request, to get all the key-value pairs starting with the hop prefix.
fdbserver/storageserver.actor.cpp
Outdated
// TODO: is DefaultPromiseEndpoint the best priority for this? | ||
tr.info.taskID = TaskPriority::DefaultPromiseEndpoint; | ||
Future<Optional<Value>> valueFuture = tr.get(key, Snapshot::True); | ||
// TODO: async in case it needs to read from other servers. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are three possibilities, I believe:
- first SS does all the proxy calls to wherever
- C client does any proxy calls when hopping within SS is not possible
- API client does any proxy calls
3 has the JNI overhead, but 2 doesn't. It's more modification, as @nblintao says, but I think it might be the most robust. Particularly once we undertake to do range scans of the hopped-to place.
626e517
to
aec1782
Compare
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for macOS Catalina 10.15
|
aec1782
to
3ef815c
Compare
AWS CodeBuild CI Report for Linux CentOS 7
|
3ef815c
to
d9a6aa3
Compare
AWS CodeBuild CI Report for macOS Catalina 10.15
|
AWS CodeBuild CI Report for Linux CentOS 7
|
d9a6aa3
to
99f99a0
Compare
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for macOS Catalina 10.15
|
99f99a0
to
bed03c6
Compare
AWS CodeBuild CI Report for macOS Catalina 10.15
|
AWS CodeBuild CI Report for Linux CentOS 7
|
bed03c6
to
36091b9
Compare
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
AWS CodeBuild CI Report for Linux CentOS 7
|
fdbserver/storageserver.actor.cpp
Outdated
@@ -2880,9 +2870,8 @@ ACTOR Future<Void> getKeyValuesAndHopQ(StorageServer* data, GetKeyValuesAndHopRe | |||
GetKeyValuesReply _r = wait( | |||
readRange(data, version, KeyRangeRef(begin, end), req.limit, &remainingLimitBytes, span.context, type)); | |||
|
|||
// std::cout << "read range done, start hopping" << std::endl; | |||
// Hop!!! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I feel directly changing the std::cout to comment is more reader-friendly than "Hop":
read range done, start hopping
++data->counters.quickGetKeyValuesHit; | ||
|
||
// Convert GetKeyValuesReply to RangeResult. | ||
return RangeResult(RangeResultRef(reply.data, reply.more), reply.arena); | ||
} catch (Error& e) { | ||
// std::cout << "quickGetValue fallback because of exception or not managed by the shard " << e.name() | ||
//<< std::endl; | ||
// Fallback. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: your cout
message body is a better comment line.
fdbserver/storageserver.actor.cpp
Outdated
++data->counters.quickGetValueHit; | ||
return reply.value; | ||
} catch (Error& e) { | ||
// std::cout << "quickGetValue fallback because of exception " << e.name() << std::endl; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: your cout
message body is a better comment line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very complex change. Still don't fully get the implementation detail.
// than the prefix of the key. So we don't use key() or create_data() in this test. | ||
std::map<std::string, std::string> data; | ||
for (int i = 0; i < 3; i++) { | ||
data[indexEntryKey(i)] = EMPTY; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How do you link index with the actual key? Looks like in this example, as long as the primary key is the suffix of the index key, it will work. It is not necessary to have the index key be in the record of the value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
When users actually use it, the indexed field is likely to be stored in the value of the record, and encoded in someway (e.g. Protobuf in Record Layer). But that is not necessary for using this API.
fdbclient/NativeAPI.actor.cpp
Outdated
Reverse reverse, | ||
TransactionInfo info, | ||
TagSet tags) { | ||
template <class GetKeyValuesMaybeHopRequest> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The client API has getRange
and getRangeAndHop
. Storage server also has getKeyValues
and getKeyValuesAndHop
. Why merging the two code path into GetKeyValuesMaybeHop
in the middle? Seems unnecessary.
If the main idea is to reuse getRange
implementation for both getRange and getRangeAndHop, is there a way we can tell whether a getRange
is which one inside the getRange
function?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good obervation. Ideally, we should "reuse getRange implementation for both getRange and getRangeAndHop" to make the code clean and avoid duplicated code. But it turns out very hard (if not impossible) to make templates work with Flow compiler. This is one of a few places that we are able to merge.
I guess it makes sense to duplicated here as well, to keep things consistent and avoid confusions.
fdbclient/NativeAPI.actor.h
Outdated
@@ -289,6 +289,21 @@ class Transaction : NonCopyable { | |||
reverse); | |||
} | |||
|
|||
[[nodiscard]] Future<RangeResult> getRangeAndHop(const KeySelector& begin, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add some documentation how to use this API? Especially how to set hopInfo
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, there should be a document. But I'm not going to add it for now because the interface is going to be changed very soon in next PR.
@@ -296,6 +301,9 @@ struct GetKeyValuesRequest : TimedRequest { | |||
SpanID spanContext; | |||
Arena arena; | |||
KeySelectorRef begin, end; | |||
// This is a dummy field there has never been used. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because of this, I think maybe we should consider sharing the implementation of getRange with different interface rather than an unified interface (the ...MaybeHop...).
referenceTuple = &valueTuple.get(); | ||
} else { | ||
ASSERT(false); | ||
throw internal_error(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we throw an error with more info here? So that we can quickly identify where is the internal_error()
thrown.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This piece of code will be removed in next PR, so I wish to no change it for now. nblintao@2b627b9
fdbserver/storageserver.actor.cpp
Outdated
Tuple hopInfoTuple = Tuple() | ||
.append("normal"_sr) | ||
.append("{{escaped}}"_sr) | ||
.append("{K[2]}"_sr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This K[], V[], is a little hard to read, still don't know how is the hopInfo is constucted and used.
Some comments coming up during the discussion:
|
I think the ideal implementation of this feature would be that the storage server only returns to the client the results for secondary lookups it was able to resolve locally. Then, the FDB client can resolve the rest of the lookups with parallelism before the C API returns. Internally, the return type to this request to the StorageServer interface could be logically equivalent to
and then the client can scan the vector and dispatch any |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I left one inline comment about naming, but I have a few questions as well around the semantics of this (which I did try to find in the code itself, but figured it might be easier to just ask):
- Regarding continuations: I was originally a bit concerned that if a get-range-and-hop scan was terminated prematurely (e.g., to split a range scan across multiple transactions to avoid the 5 second limit) that it would be difficult
- Does the key info language support returning a range of kv-pairs (per source key)? The way that the Record Layer stores records, it will both need to be able to concatenate various values together (from a range read that is currently done per index entry) as well as differentiate which record key is which (in order to differentiate the record version from the protobuf record data). (This request is fairly record layer specific, but you could imagine other layers with slightly different data models still wanting similar control. For example, you could imagine a different layer that stored each field in a record in its own key-value pair, grouped per record, so a query of the form
SELECT field_1, field_2 FROM type WHERE field_3 > x
might be satisfied by scanning an index onfield_3
and then looking up only the values forfield_1
andfield_2
for each entry, which live on separate keys.) - How does the key info language handle escape characters? For example, if I legitimately had the value
{K[4]}
in my query string, I wouldn't want that to be accidentally interpreted as looking for value4
in theK
tuple. (If there's a spec for the language, that would probably answer both this question and the preceding.) - What's the behavior of empty records (e.g., as might result from there being a dangling index entry that points to a record that has since been deleted)? In particular, I could see users wanting to sometimes error on those situations or sometimes ignore them, but that implies being able to distinguish an empty value from a missing value.
bindings/c/foundationdb/fdb_c.h
Outdated
@@ -244,6 +244,24 @@ DLLEXPORT WARN_UNUSED_RESULT FDBFuture* fdb_transaction_get_range(FDBTransaction | |||
fdb_bool_t reverse); | |||
#endif | |||
|
|||
DLLEXPORT WARN_UNUSED_RESULT FDBFuture* fdb_transaction_get_range_and_hop(FDBTransaction* tr, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
minor: I don't want to bike shed too much about names, but this is in the public API, so it would be difficult to change in the future. I'm not sure "hop" really captures what this does. Maybe "resolve" or even "lookup"--in particular, when I hear "hop", I assume it's going to make a network hop, which (ideally) it won't have to when used optimally.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks so much for pointing out. Changed to "flat map"after discussion.
@alecgrieser Good questions.
The current version doesn't have well support or tests for continuation. It will fail. Clients should have a fallback using the old way. It is planned to be improved in the future version.
Yes, it is supported (apparently I simplified this in the presentation). If you add a special element
In the current version, it doesn't raise any problem. I had a TODO here /~https://github.com/apple/foundationdb/pull/5609/files#diff-c4dbfae809f757f2061099ee5021ce2cbbf0f4dbc2f0f18480a4856d63d033ceR2763. We could consider add this as an option in |
Okay, I see. I think that sounds good to me, with a few more responses below.
What does that look like? Does the user get some kind of error (somehow), or is it more like if the user knows that they won't be able to guarantee that they read everything in one transaction, they have to pick a different way to read the data.
A more structured A lot of this might actually come down to how comfortable we are with essentially releasing this as a "beta" feature that we intend to make backwards incompatible changes to that users have to use "at their own risk". |
@sfc-gh-satherton, thanks for pointing out. @MMcM has also mentioned this as method 2 at #5609 (comment). That's exactly what I'm planning to do in the road map. This can avoid deteriorating SS throughput when fallback is needed (which is also summarized in @jzhou77's comment). In addition to that, I can imagine that users should be able to define how many requests a FDB client is able to dispatch simultaneously during fallback (like "pipeline size" concept in Record Layer), which adds additional complexity. The current implementation handles fallback at SS because it was easier to implement at the beginning. SS is still protected from deteriorated performance by having a knob to fail rather than fallback when the key is not in the same SS (The that the it can fallback at application level). Thanks @jzhou77 for summarizing. I wish I was in the discussion to explain: This is an initial PR of a large feature that is going to be evolved in months if not years. Before I pushed out the "finalized" version of this PR for review, I complied a list of more than a dozen items I wanted. Surprisingly, I see every major items in the comments had been included in my list (So probably we'll have a very good consensus on the ultimate design). For many of them, it would be messy and tiring to change in the future rather than get it right in the first place (for example, a new HopInfo schema @alecgrieser). But I had to choose a set things that can be done before 7.0 release, and safe and usable for beta users. I'll later post a document about the road map and the ultimate design. I hope to have more discussions in a meeting or offline. I'm still on the way of getting comfortable with this get-features-in-first-and-improve-later thing, and trying to balance it with my desire of quality and good design. It's not easy. |
fbb116f
to
3f15686
Compare
AWS CodeBuild CI Report for Linux CentOS 7
|
3f15686
to
6c98e35
Compare
AWS CodeBuild CI Report for macOS Catalina 10.15
|
AWS CodeBuild CI Report for Linux CentOS 7
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SS logic LGTM. Please make sure you make the API clear before merging to 7.0 (documentation, making contract clear, etc.).
GetRangeLimits limits, | ||
Snapshot snapshot, | ||
Reverse reverse) { | ||
if (getDatabase()->apiVersionAtLeast(630)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is 630 true? Not 700?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the 630 copied from getRange
. It is 630 because it handles special keys differently before that. Since the new API is added in 700, getDatabase()->apiVersionAtLeast(630)
should aways be true. We could simplify the code by removing the else, but I would prefer keep it consistent with getRange
so we can deduplicate in the future.
@@ -310,6 +318,43 @@ struct GetKeyValuesRequest : TimedRequest { | |||
} | |||
}; | |||
|
|||
struct GetKeyValuesAndFlatMapReply : public LoadBalancedReply { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it worth noting that if one changes GetKeyValuesAndFlatMapRequest/Reply, it has to change GetKeyValuesRequest/Reply as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. Will add a comment in a different PR. And I really wish I can find out a way to make templates work better with Flow.
return Void(); | ||
} | ||
|
||
ACTOR Future<GetKeyValuesAndFlatMapReply> flatMap(StorageServer* data, GetKeyValuesReply input, StringRef mapper) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Make sure you add documentations to these functions at some point.
It introduces
GetRangeAndHop
semantic to FDB. With this, SS is able to generate the keys in the queries based on another query. In another word, we can push some computations at the upper layer down to FDB. For example, querying an index and fetch records can be done in a single query! This is expected to improve latency and bandwidth when reading FDB.See the tests to have a taste how this feature could be used.
This change doesn't affect existing code pass as long as you don't use the new APIs.
Code-Reviewer Section
The general guidelines can be found here.
Please check each of the following things and check all boxes before accepting a PR.
For Release-Branches
If this PR is made against a release-branch, please also check the following:
release-branch
ormaster
if this is the youngest branch)