-
-
Notifications
You must be signed in to change notification settings - Fork 183
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New SortAndBind operator #878
Conversation
target.RemoveAt(currentIndex); | ||
} | ||
break; | ||
case ChangeReason.Refresh: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TODO: I need to add a test case for refresh
|
||
[MemoryDiagnoser] | ||
[MarkdownExporterAttribute.GitHub] | ||
public class BindAndSortInitial: IDisposable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
BindAndSort_Initial
to match BindAndSort_Change
?
|
||
[MemoryDiagnoser] | ||
[MarkdownExporterAttribute.GitHub] | ||
public class BindAndSortChange: IDisposable |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Class name doesn't match file name.
|
||
// check initial data set is sorted | ||
_boundList.Count.Should().Be(100); | ||
people.OrderBy(p => p, _comparer).SequenceEqual(_boundList).Should().BeTrue(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proper assertion here, in my mind, is...
_bountList.Should().BeEquivalentTo(peopoe.OrderBy(p => p, _comparer), options => options.WithStrictOrdering());
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right.
I cloned there test cases from the original sorted fixture and adapted accordingly. This also meant a very very old hang up from the days that I used NUnit.Assert which did the expectation and actual backwards to the usual conventions. I'll fix this throughout the class.
_boundList.Count.Should().Be(101); | ||
var firstItem = _boundList[0]; | ||
|
||
firstItem.Should().Be(insert); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The proper assertion here, in my mind, is the same as above: just assert the bound list is the source items, sorted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is fine to check that a known item has been inserted exactly where we expect it.
With regard to asserting the bound list to be the same as the source.Items sorted, that too would have some merit but I don't think it completely replaces the need for an exact assertion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Technically, if you're asserting that the whole list exactly matches the desired sorting, that should mean that asserting a single item's position is superfluous, but I'm also a big fan of clarified intent, and I'm not gonna knock a tiny inefficiency in testing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I like to combine both. The tests should also not force a cognitive load for the person running or debugging them. Where possible a declarative assertion tells fresh eyes / a newbie exactly what is happening and not just to assert correctness.
} | ||
|
||
[Fact] | ||
public void AppendInMiddle() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nitpick: InsertInMiddle
?
return index; | ||
} | ||
|
||
private int GetInsertPosition(TObject item) => options.UseBinarySearch ? GetInsertPositionBinary(item) : GetInsertPositionLinear(item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably, this is a meaningless micro-optimization, but what I would do is either separate this logic into different subclasses, or make this a Func<TObject, int>
field that gets assigned to one method or the other, upon subscription. That way, the if check here isn't getting repeated for every inserted item, it only gets done once, since the options don't ever change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's probably a micro-optimization too far as a simple bool check is inexpensive.
/// <param name="ResetOnFirstTimeLoad"> Should a reset be fired for a first time load.This option is due to historic reasons where a reset would be fired for the first time load regardless of the number of changes.</param> | ||
/// <param name="UseBinarySearch"> Use binary search when the result of the comparer is a pure function.</param> | ||
/// <param name="InitialCapacity"> Set the initial capacity of the readonly observable collection.</param> | ||
public record struct BindAndSortOptions( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This definitely should not be implemented with a constructor, but just plain get/init properties.
A) The goal of any of these Options
types should be to contain all the "optional" options for an operator, ergo, a caller should be able to construct a set of options with only the options they care about. A constructor like this forces the caller to specify every single one. Plain object-initializer syntax would not.
B) If we want to add new options in the future, changing the constructor would be a breaking change, just adding a property to the struct is not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right. I just have not got into the habit of using init properties on records.
It will be in my next push.
/// <summary> | ||
/// Default bind and sort options. | ||
/// </summary> | ||
public static readonly BindAndSortOptions Default = new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would very much like us to move away from using non-obvious "special values". E.G. from context in the operator, it looks like you're using 0
as the value for ResetThreshold
to mean "don't ever convert updates to resets". I think it's way clearer to just use null
for that scenario. Similarly, null
for InitialCapacity
instead of -1
, when the intention is to not set an initial capacity.
With those changes, I would say that a Default
field here becomes superfluous, as all of the default values for these options then become the default value for the type itself. Unless you want to make the Default
field non-readonly, so consumers can set their own defaults, as they see fit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed Default, but included a system wide one which consumers can configure
public static IObservable<IChangeSet<TObject, TKey>> BindAndSort<TObject, TKey>( | ||
this IObservable<IChangeSet<TObject, TKey>> source, | ||
IList<TObject> targetList, | ||
IComparer<TObject> comparer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would have comparer
on all of these overloads be rolled into BindAndSortOptions
, with the default being null
and resolved to Comparer<T>.Default
within the operator.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Surely that would throw if the target object does not implement IComparable. I think this would confuse most people and end up with a load of support issues being raised.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would indeed throw, yes. But it follows the pattern established by Microsoft in System
namespaces. List<T>.Sort()
, Array.Sort()
, and IEnumerable<T>.OrderBy()
all support being called without an IComparer<T>
being given. If Microsoft thinks it's a reasonable compromise, that's good enough for me.
On a separate note, there's probably also room for an overload that takes a Comparison<T>
, instead of an IComparer<T>
which is also something all of the above system APIs have.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's do that for Dynamic Data 2. I do not mind making various changes for this version, but I think a change of concepts and limitations is part of the re-design
/// <summary> | ||
/// ObservableCache extensions for BindAndSort. | ||
/// </summary> | ||
public static partial class ObservableCacheEx |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we cool with having some operators be organized like this, with a partial class, and some not? Would it be better to just reorganize ALL of them in one big PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have been meaning to split these up for years but I have always been way too daunted to tackle in one hit. So let's leave this here as a placeholder. The we can either move gradually, or if anyone is brave enough to do it in one hit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm down either way, just want to be clear. I'll go ahead and plan to do a big PR to reorganize this, once my current plate is clear. It should be a really simple PR, just.... very large... and tedious.
They look solid to me. I questioned, at first, the idea of reusing the same subscriptions/bindings over and over, in the |
/// <param name="targetList">The list to bind to.</param> | ||
/// <param name="comparer">The comparer to order the resulting dataset.</param> | ||
/// <returns>An observable which will emit change sets.</returns> | ||
public static IObservable<IChangeSet<TObject, TKey>> BindAndSort<TObject, TKey>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it makes more sense to consider this a "terminating" operator, I.E. have it perform a subscription and return IDisposable
.
The big code smell in my mind, with leaving subscription management up to the downstream consumer, is that the operator will break if the consumer makes multiple downstream subscription. The operator will end up running for each subscription and trying to apply every changeset to the target list multiple times.
If a consumer NEEDS to be able to listen to downstream changes, I think that really necessitates that they split .BindAndSort()
into separate .Sort()
and .Bind()
operators, so that they can share and publish the .Sort()
downstream, for both .Bind()
and whatever else they need.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In the very earliest version of Dynamic Data I implemented terminating operators. But they are problematic because:
- I often end up adding stuff after Bind() in real world apps.
- If we were to force users to use Publish(), then it's more code for everyone without any real benefit.
- It's better to let the consumer decide what is terminating.
Besides once SortAndBind has settled I will make Sort obsolete (for cache). Sort is a terrible operator and I have always been somewhat ashamed of how bad it's performance is and dearly want to get rid of if. My proposal is to complete SortAndBind asap, then in the next major release mark Sort as obsolete then remove it entirely in the following major release.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Standard practice is 2 major releases before removing obsolete APIs, from what I've read.
- I'll defer to you, on that point, since I haven't built too many real-world apps with DD specifically. It still seems really weird to me.
- I think extra clarity in API surface is a "real benefit", but how much is definitely subjective.
- I see it as the consumer having to do extra work, either way, it's just that one way is potentially dangerous, and the other is just slightly more tedious. Like, the consumer doesn't get to "decide" whether
.Bind()
is terminating it ALWAYS is, if you don't ensure only one subscription, it breaks. Would it be overkill to have it be terminating by returning a custom subscription object, instead of justIDisposable
? Something like...
public class Binding<TObject, TKey>
: IObservable<IChangeSet<TObject, TKey>>,
IDisposable
{ }
This solves the problem of preventing multiple subscriptions, but still allows additional downstream listeners.
Also, I would kinda like a chance to see if we (or just I) can rework .Sort()
for caches into something reasonable, before comitting to dropping it. It seems far too limiting to me to say "you can only sort as the very last step", surely there's a way to make it not terrible. Maybe we can workshop it in DD2 and then pull what works into DD.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding terminating operators, surely it is something to be considered for Dynamic Data 2 and not the current version.
For context I have historically done stuff like this:
mySource
.Sort(...)
.Bind(....)
.Subscribe(_> 'Do some ui related action which requires the bound list to be updated first' )
My preference is not to have them but I am willing to be consider it - you make a good point about subscribing twice, but in practice I don't think anyone has ever complained about that being an issue.
Honestly, do not bother with Sort. I spent more time on it, and had more trouble than on any other operator. Also the sorted changeset is fatally flawed as it need to maintain a copy of sorted state after ever change, which is transmitted on the sorted change set. This leads to huge memory allocations and the CPU goes crazy, and for what? The only operators which use sort are Bind, Virtualise and Page. Without sort it would mean we could also get rid of this pesky sorted change set. Also without sort in the cache we do not need index on the changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
... for list however, sort is perfectly legit.
/// <param name="targetList">The list to bind to.</param> | ||
/// <param name="comparer">The comparer to order the resulting dataset.</param> | ||
/// <returns>An observable which will emit change sets.</returns> | ||
public static IObservable<IChangeSet<TObject, TKey>> BindAndSort<TObject, TKey>( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Slight nitpick: I think SortAndBind()
makes more sense, since that's logically what the operator is doing, sorting first, then binding. Put differently, what this operator replaces is .Sort().Bind()
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed
@JakenVeina I have responded to most of the feedback and the majority of what you say I agree with, and have applied the changes to the code. Also I added the binary sort option as an extra measure for single change benchmarks. The results clearly show that it should be used whenever possible as it makes a significant difference. However for a small changeset, it is actually slower, so perhaps I should work out the sweet spot and optimize for this case. I propose doing that in another PR |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some comments and some questions (which might result in more comments).
/// <summary> | ||
/// Use binary search when the result of the comparer is a pure function. | ||
/// </summary> | ||
public bool UseBinarySearch { get; init; } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there a reason to NOT use binary search? Maybe it should be the default.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the comparer is an impure function, exceptions are throw , or worse the item ends up in the wrong position.
I've fallen for that issue, and if I do I know a huge number of users would so so too.
_cache.Clone(changes); | ||
|
||
// apply sorted changes to the target collection | ||
if (target.Count == 0 || (options.ResetThreshold != 0 && options.ResetThreshold < changes.Count)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So ResetThreshold == 0
means to never trigger the logic that just redoes the whole list? Which means it will always just apply the individual changes?
It seems like whether or not to update the current or just throw it out and redo it is going to depend on the size of the current list and the number of changes being applied. If the current list has 5 items and you get a changeset of 50 items, then it will make sense to do the Reset
. However, if the current list has 1000 items and you get a changeset of 50 items, it might make more sense to do the individual updates.
Maybe I am wrong about this... Maybe if there is 50 items being added to 1000, then it always makes sense to do the Reset. My point is that maybe size of the changeset isn't enough to make a good decision and we should test that to make sure. If it isn't, then maybe this can be a ratio or something.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Its been that way since day one, and very few people have ever even noticed that behaviour exists so I suggest why change a proven formula.
However, that does not mean that we can't look into these extra options as a later improvement. Personally I do not want that do extra at this stage because
- I need to apply the same treatment to visualise and to page as they are the only other operators that rely on sort
- The improvements are already missive and solved a serious problem which DD currently has.
I suggest one this is merged, you can raise an issue to make these suggestions
|
||
public IObservable<IChangeSet<TObject, TKey>> Run() => | ||
source | ||
.Synchronize(_locker) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think you need this because the Rx rules should ensure only one event at a time. However, if you want to keep it anyway, since it is only locked in one place, you can just use:
.Synchronize()
and skip the extra explicit allocation (there will still be an implicit one).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I theory no, but I am sure I have seen issues before with the order that changes arrived on sort after an observe on. That however is a little hazy is my memory and I can't remember the exact cause.
The easy test would be to plug it into dynamic trader. That soon highlights threading issues especially if its speed is removed up
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm with Darrin, if not synchronizing here causes a problem, that indicates a fault in the upstream operator. It's not something we should have to deal with here, even with ObserveOn
and SubscribeOn
in the mix. Testing this with the dynamic trader app sounds like a good idea.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removed. Let's hammer it in Dynamic Trader
break; | ||
case ChangeReason.Refresh: | ||
{ | ||
var currentIndex = target.IndexOf(item); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this use GetCurrentPosition
so that it can benefit from the BinarySearch when enabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
No because refresh implicitly implies we have an immutable object
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"mutable object" I think you mean.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"mutable object" I think you mean.
Yep a typo. I was typing that on my phone at night in the country side, at a climbing crag, in between climbs (I am a boulderer) using my head torch, and without my glasses on 😃
break; | ||
case ChangeReason.Update: | ||
{ | ||
var currentIndex = GetCurrentPosition(change.Previous.Value); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This might benefit a lot of cases to avoid copying all the items left to remove and then copying them right back to insert. It is at the the expense of an extra check, but I think it is a good trade-off.
var currentIndex = GetCurrentPosition(change.Previous.Value);
var index = GetInsertPosition(item);
if (currentIndex != index)
{
target.RemoveAt(currentIndex);
target.Insert(index, item);
}
else
{
target[currentIndex] = item;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great point, I meant to account for this case but forgot. However with anything to do with binding and sorting it is seriously complicated and there's a heap of hidden gotchas!
Some control suites and platforms do not support replace, whiles others do. Remove and insert was the original behaviour in Dynamic Data and I later added the opt in option to use replace instead. I'll include this change in this PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
/// Should a reset be fired for a first time load | ||
/// This option is due to historic reasons where a reset would be fired for the first time load regardless of the number of changes. | ||
/// </summary> | ||
public bool ResetOnFirstTimeLoad { get; init; } = BindingOptions.DefaultResetOnFirstTimeLoad; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need a historic setting on a new operator? Or are you trying to make it easy for people who relied on the old behavior to migrate?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In most cases you want a reset. Some user controls out there do not support reset, or maybe there's a requirement to notify individual item changes after bind. Only then is there a need to opt in.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I thought about this one, and have gone with you suggestion. Done and checked in.
…for reset thresholds
…for reset thresholds
This pull request has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs. |
Introduce a new bind and sort operator which combine the current Bind and Sort operators.
In the old way of doing it, the sort operator had to produce a sorted set every time there was a change, which in simple terms meant the sort operator had to:
Then the Bind operator had to replay the differences, or if there was a large change set reload the list from the cloned / sorted set.
This created serious performance issues especially for larges collections. So I have created a BindAndSort operator which combines these operations and I can entirely get rid of cloning the sorted set. Also I have made it such that you can now bind to any collection that implements
IList<T>
The one slight drawback is that the bind and sort would have to be done on the UI thread, but I don't think this is such a big deal with huge perf gains.
Now for the benchmarks. @JakenVeina can you check over the benchmarks just to be sure I have set them up well.
Initial population
For a single change
The optimized version includes using BinarySearch