Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Overhaul IsPlistMatrixRep and IsPlistVectorRep #2973

Open
wants to merge 5 commits into
base: master
Choose a base branch
from

Conversation

fingolfin
Copy link
Member

@fingolfin fingolfin commented Nov 7, 2018

Progress towards issue #2148 : This PR is an incomplete proof of concept (see #2776 for its genesis), I am putting it out here to make sure people are aware of this possible direction, and in the vague hope that somebody else might help out with it.

The idea is to tune IsPlistMatrixRep as much as possible so that it becomes a viable replacement for lists-of-lists (= lol) "matrices". Indeed, with this hackish POC, I already benchmarked some cases where the IsPlistMatrixRep code is faster than lists-of-lists, probably in parts because of avoided method selection overhead; another part is that the optimized code benefits from kernel optimizations for multiplication of lol matrices

To give an idea about the performance of IsPlistMatrixRep before and after this PR, I run some mini benchmarks, using the benchmarking code in this gist, plus the following code:

M1:=IdentityMatrix(GF(4),10);
M2:=IdentityMat(10)*Z(2); # "decompress" the matrix
M3:=IdentityMatrix(IsPlistMatrixRep, GF(4),10);
I1:=IdentityMat(10);
I2:=IdentityMatrix(IsPlistMatrixRep,Integers,10);

inputs := [
   [ M1, GF(4), Is8BitMatrixRep ],
   [ M2, GF(4), IsPlistRep ],
   [ M3, GF(4), IsPlistMatrixRep ],
   [ I1, Integers, IsPlistRep ],
   [ I2, Integers, IsPlistMatrixRep ],
];

f := function(mat) local i, x; for i in [1..10000] do x:=mat*mat; od; end;;
g := function(mat) local i, x; for i in [1..10000] do x:=mat^5+mat; od; end;;
opts := rec(mintime:=1000);

for test in [ f, g ] do
    Print("-----\n");
    Print("Testing function ", NameFunction(test), "\n");
    Print("-----\n");

    for input in inputs do
        Print("... on ", NameFunction(input[3]), " matrix over ", input[2], "...\n");
        Assert(0, input[3](input[1])); # verify filter matches
        Benchmark(function() test(input[1]); end, opts);;
        Print("\n");
    od;

    Print("\n");
od;

The results of this are as follows with master (using commit 5b7263c from 2019-07-11), on a Ubuntu server in Gießen:

-----
Testing function f
-----
... on Is8BitMatrixRep matrix over GF(2^2)...
Performed 128 iterations, taking 1008. milliseconds.
average: 7.87 +/- 0.02 (+/- 0.%)
median: 7.870849609375

... on IsPlistRep matrix over GF(2^2)...
Performed 5 iterations, taking 1517. milliseconds.
average: 303.45 +/- 0.12 (+/- 0.%)
median: 303.48095703125

... on IsPlistMatrixRep matrix over GF(2^2)...
Performed 5 iterations, taking 5801. milliseconds.
average: 1160.23 +/- 0.89 (+/- 0.%)
median: 1159.994140625

... on IsPlistRep matrix over Integers...
Performed 18 iterations, taking 1001. milliseconds.
average: 55.62 +/- 0.07000000000000001 (+/- 0.%)
median: 55.6290283203125

... on IsPlistMatrixRep matrix over Integers...
Performed 5 iterations, taking 1900. milliseconds.
average: 380.09 +/- 0.33 (+/- 0.%)
median: 380.011962890625


-----
Testing function g
-----
... on Is8BitMatrixRep matrix over GF(2^2)...
Performed 29 iterations, taking 1010. milliseconds.
average: 34.84 +/- 0.02 (+/- 0.%)
median: 34.8369140625

... on IsPlistRep matrix over GF(2^2)...
Performed 5 iterations, taking 4797. milliseconds.
average: 959.4400000000001 +/- 0.21 (+/- 0.%)
median: 959.5048828125

... on IsPlistMatrixRep matrix over GF(2^2)...
Performed 5 iterations, taking 15796. milliseconds.
average: 3159.28 +/- 1.58 (+/- 0.%)
median: 3159.17724609375

... on IsPlistRep matrix over Integers...
Performed 5 iterations, taking 1014. milliseconds.
average: 202.89 +/- 0.09 (+/- 0.%)
median: 202.889892578125

... on IsPlistMatrixRep matrix over Integers...
Performed 5 iterations, taking 6131. milliseconds.
average: 1226.19 +/- 0.49 (+/- 0.%)
median: 1226.10791015625

With this PR (commit cb79cdf), the numbers for Is8BitMatrixRep and IsPlistRep stay essentially the same, but for IsPlistMatrixRep improve to more or less match those for IsPlistRep (they are a bit slower, but not that much):

-----
Testing function f
-----
... on Is8BitMatrixRep matrix over GF(2^2)...
Performed 127 iterations, taking 1001. milliseconds.
average: 7.88 +/- 0.06 (+/- 1.%)
median: 7.870849609375

... on IsPlistRep matrix over GF(2^2)...
Performed 5 iterations, taking 1513. milliseconds.
average: 302.69 +/- 0.09 (+/- 0.%)
median: 302.649169921875

... on IsPlistMatrixRep matrix over GF(2^2)...
Performed 5 iterations, taking 1582. milliseconds.
average: 316.49 +/- 0.06 (+/- 0.%)
median: 316.454833984375

... on IsPlistRep matrix over Integers...
Performed 18 iterations, taking 1004. milliseconds.
average: 55.77 +/- 0.06 (+/- 0.%)
median: 55.7733154296875

... on IsPlistMatrixRep matrix over Integers...
Performed 15 iterations, taking 1008. milliseconds.
average: 67.19 +/- 0.16 (+/- 0.%)
median: 67.14599609375


-----
Testing function g
-----
... on Is8BitMatrixRep matrix over GF(2^2)...
Performed 29 iterations, taking 1003. milliseconds.
average: 34.6 +/- 0.03 (+/- 0.%)
median: 34.587158203125

... on IsPlistRep matrix over GF(2^2)...
Performed 5 iterations, taking 4775. milliseconds.
average: 955. +/- 0.5600000000000001 (+/- 0.%)
median: 955.198974609375

... on IsPlistMatrixRep matrix over GF(2^2)...
Performed 5 iterations, taking 4858. milliseconds.
average: 971.5700000000001 +/- 0.7000000000000001 (+/- 0.%)
median: 971.367919921875

... on IsPlistRep matrix over Integers...
Performed 5 iterations, taking 1015. milliseconds.
average: 202.91 +/- 0.36 (+/- 0.%)
median: 202.7880859375

... on IsPlistMatrixRep matrix over Integers...
Performed 5 iterations, taking 1140. milliseconds.
average: 228.01 +/- 0.09 (+/- 0.%)
median: 227.99169921875

Of course proper benchmarks would include a range of matrix sizes and ground fields; more interesting / realistic operations; and graph all of that nicely...

@fingolfin fingolfin added kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) labels Nov 7, 2018
lib/matobjplist.gi Outdated Show resolved Hide resolved
lib/matobjplist.gi Outdated Show resolved Hide resolved
lib/matobjplist.gi Outdated Show resolved Hide resolved
lib/matobjplist.gi Outdated Show resolved Hide resolved
lib/matobjplist.gi Outdated Show resolved Hide resolved
# [ IsPlistMatrixRep, IsFunction ],
# function( m, f )
# return List(m![ROWSPOS],f);
# end );
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MatrixObj are not lists, so remove these. If one wants to extract the rows or columns as VectorObj, then we should add GetColumnVectors / GetRowVectors operations (better names pending...)

lib/matobjplist.gi Outdated Show resolved Hide resolved
fi;
od;
l := a![ROWSPOS]*b![ROWSPOS];
#l := PROD_LIST_SCL_DEFAULT(a![ROWSPOS],b![ROWSPOS]);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we use the optimized matrix multiplication functions from the kernel, which is part of the reason for the performance improvements.

lib/matobjplist.gi Outdated Show resolved Hide resolved
lib/matrix.gi Outdated Show resolved Hide resolved
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch from 68c1172 to e1a1fb6 Compare January 6, 2019 22:28
@codecov
Copy link

codecov bot commented Jan 6, 2019

Codecov Report

Merging #2973 into master will decrease coverage by 12.35%.
The diff coverage is 48.76%.

@@             Coverage Diff             @@
##           master    #2973       +/-   ##
===========================================
- Coverage   84.65%   72.29%   -12.36%     
===========================================
  Files         698      635       -63     
  Lines      345626   308665    -36961     
===========================================
- Hits       292582   223150    -69432     
- Misses      53044    85515    +32471
Impacted Files Coverage Δ
lib/grpmat.gi 66.56% <ø> (-7.44%) ⬇️
lib/matobj2.gd 100% <ø> (ø) ⬆️
lib/matobjplist.gd 100% <100%> (ø) ⬆️
lib/matrix.gi 74.76% <100%> (-12.98%) ⬇️
lib/vecmat.gi 83.2% <100%> (-3.56%) ⬇️
lib/vecmat.gd 100% <100%> (ø) ⬆️
lib/grpffmat.gi 91.88% <100%> (-4.04%) ⬇️
lib/matobjplist.gi 56.17% <35.63%> (+5.69%) ⬆️
lib/matobj.gi 65.66% <53.84%> (-0.54%) ⬇️
lib/teachm2.g 15.38% <0%> (-84.62%) ⬇️
... and 395 more

@coveralls
Copy link

coveralls commented Jan 6, 2019

Coverage Status

Coverage increased (+0.03%) to 84.699% when pulling 0e7599b on fingolfin:mh/IsPlistMatrixRep into d2db1bf on gap-system:master.

@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch from e1a1fb6 to b04f4b6 Compare January 25, 2019 16:11
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch 3 times, most recently from dcf9ac2 to cb79cdf Compare July 12, 2019 09:58
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch from a323d50 to 05693a1 Compare August 22, 2019 06:41
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch from 05693a1 to 3772086 Compare August 30, 2019 17:29
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch from 0e7599b to f8f3ea1 Compare March 2, 2020 01:08
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch from f8f3ea1 to f34dc7b Compare March 31, 2021 22:27
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch from f34dc7b to b52529a Compare April 14, 2021 12:13
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch 2 times, most recently from 738c096 to 8a085dd Compare April 21, 2021 23:40
@fingolfin fingolfin mentioned this pull request May 19, 2021
@fingolfin fingolfin force-pushed the mh/IsPlistMatrixRep branch 3 times, most recently from 7fa02fd to fe334ea Compare May 27, 2021 11:52
@fingolfin fingolfin changed the title WIP: Improve IsPlistMatrixRep Overhaul IsPlistMatrixRep and IsPlistVectorRep May 27, 2021
@fingolfin fingolfin requested review from ThomasBreuer and ssiccha May 27, 2021 11:53
@fingolfin fingolfin added release notes: to be added PRs introducing changes that should be (but have not yet been) mentioned in the release notes and removed do not merge PRs which are not yet ready to be merged (e.g. submitted for discussion, or test results) labels May 27, 2021
@fingolfin
Copy link
Member Author

I think this is getting close to being mergeable. I am sure there are bugs in my changes, but to flush those out, we should write more and better systematic tests for the interfaces.

Copy link
Contributor

@ssiccha ssiccha left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've had a look up until line 711 and left a few comments.

m := [];
elif IsVectorObj(l[1]) then
# list of vectors
# TODO: convert each IsVectorObj to a plist
Copy link
Contributor

@ssiccha ssiccha May 27, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That sounds to me like you want to turn each entry of l into IsPlistRep. That can't be right, or am I misunderstanding something? The documentation states, that a plist of IsPlistVectorRep objects should be stored.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A primary point of this PR is to change that, i.e.: do not store a list of vector objects, but rather store an old style matrix. This avoids a ton of overhead in both memory usage and performance. Before this PR, IsPlistMatrixRep was hopelessly slow compared to old-style matrices.

# FIXME/TODO: should the following test be always performed
# or only at a higher assertion level?
Assert(0, ForAll(m, row -> Length(row) = ncols));
Assert(0, ForAll(m, row -> ForAll(row, x -> x in basedomain)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentations of IsPlist*Rep do not state that the entries must be in basedomain. So IMO either we remove this Assert or add corresponding statements into the docs.

elif Is8BitVectorRep(l) then
PLAIN_VEC8BIT(v);
fi;
v := PlainListCopy(l);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I take it PlainListCopy respects the GF2 and 8Bit vector reps? (This isn't tested accordingn to codecov)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What do you mean with "respects" them? It returns a new plain list with the content of the given list, and it works for arbitrary lists, including GF2 and 8bit vector reps.

l := List([1..rows],i->ZeroVector(rows,t));
function( rows, m )
local i,l,o,res;
l := List([1..rows],i->ListWithIdenticalEntries(rows, Zero(m![BDPOS])));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then the entries of l would be plain lists although I thought they should be in IsPlistVectorRep.

Copy link
Contributor

@ThomasBreuer ThomasBreuer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added some specific comments in the code.
Two more general remarks:
Several v![BDPOS] have been changed to BaseDomain(v). If this is a general suggestion then should it be applied everywhere?
The documentation of IsPlistMatrixRep states that a plain list of IsPlistVectorRep objects is stored, should this be changed?

@@ -45,7 +45,7 @@
## <#/GAPDoc>
##
DeclareRepresentation( "IsPlistVectorRep",
IsVectorObj and IsPositionalObjectRep, [] );
IsVectorObj and IsPositionalObjectRep and IsNoImmediateMethodsObject, [] );
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsNoImmediateMethodsObject is currently undocumented. I expect that it will be useful also for other new kinds of vector and matrix objects. Thus it should become documented, and the documentation of vector and matrix objects should mention it, in a section on "performance hints".

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, why not. Open an issue for that idea?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See #4529.

gap> NewMatrix(IsPlistMatrixRep, Integers, 3, [ ] ); Display(last);
<0x3-matrix over Integers>
<0x3-matrix over Integers:
]>
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The opening square bracket seems to be missing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, fixed now by revised printing.

# FIXME/TODO: should the following test be always performed
# or only at a higher assertion level?
Assert(0, ForAll(m, row -> Length(row) = ncols));
Assert(0, ForAll(m, row -> ForAll(row, x -> x in basedomain)));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this a more general question about the operation? Do we perhaps need two operations, a NC variant and one that performs the checks?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, this is ultimately a general question. As are many of the other TODOs. Note that similar questions arise e.g. for \+ for matrices (should we check and enforce that the dimensions agree? or do we want to allow one matrix to be "smaller" than the other? Or do we not want to check at all and rely on the user?

For other binary operations, should we check that the base domains are identical? Or possibly equal (slower, but more permissive)? Or perhaps we should allow adding matrices with different base domains as long as there is a common parent of the base domains (so adding a GF(4) and a GF(8) matrix might produce a GF(64) matrix). If so, what mechanism should be used to determine that "parent" domain?

And if we do perform any such checks, do we need an NC version of "+"? And how would one name that?

There's more. I am also pretty sure we already listed some of these somewhere.

od;
# TODO: should we verify that Length(v) = NrRows(m)?
res := ZeroVector(NrCols(m),v);
res![ELSPOS] := v![ELSPOS] * m![ROWSPOS];
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would say it depends on the definition of arithmetic operations for vector and matrix objects:
If we say that the result is undefined if the dimensions do not match then no check is necessary.
If we promise an error message if the dimensions do not match then we have to check.
(Without check, the result object is at least "internally consistent", that is, the claimed number of columns equals the length of the stored list.)

Append(m![ROWSPOS],n![ROWSPOS]);
end );

# FIXME: does ShallowCopy even make sense for a matrixobj???
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In matobj2.gd, we define ShallowCopy for objects in IsRowListMatrix.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Spot on. Thing is, in my mind, IsPlistMatrixRep should not have been in IsRowListMatrix, but of course it still was when you wrote that review remark. I simply forgot to make that change... now done, and IsRowPlistMatrixRep added. And then I changed this ShallowCopy method to require IsRowPlistMatrixRep.

trans := TransposedMatMutable(m![ROWSPOS]);


# FIXME: using TypeObj(m) below is probably wrong, as it
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think there are many places in this file where a TypeObj value of the argument is used in Objectify, and where this type may be wrong.

# FIXME: if the base domain is not a field, e.g. Integers,
# then the inverse we just computed may not be defined over
# that base domain. Should we detect this here? Or how else
# will we deal with this in general?
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is first of all a question about what the operation NewMatrix promises. Perhaps we need NewMatrix and NewMatrixNC? (And then the question is whether we call NewMatrix or NewMatrixNC here.)

@fingolfin
Copy link
Member Author

@ssiccha @ThomasBreuer thank you for your feedback

@ThomasBreuer you wrote:

The documentation of IsPlistMatrixRep states that a plain list of IsPlistVectorRep objects is stored, should this be changed?

Yes, thank you. In fact, there is more in it that needs to be changed: it implies that IsPlistMatrixRep is a row list matrix, which was the case before, but now isn't / shouldn't be... I'll try to explain my train of thought here (and this definitely should be explained in the commit message(s), too): for optimal performance (and I run a lot of benchmarks back then), we really don't want to store a IsPlistMatrixRep as a list of IsPlistVectorRep; this just adds a ton of overhead. It's much better to simply wrap an old-style IsMatrix and IsPlistRep matrix, which this PR does. Also, for testing purposes, it is good to have an IsMatrixObj implementation which is not an IsRowListMatrix (@danielrademacher asked for this, too). So that's a useful application for IsPlistMatrixRep.

HOWEVER another useful application for IsPlistMatrixRep would be to use it as drop-in replacement for any old-style IsMatrix and IsPlistRep; i.e., one may want to change code producing these to instead produce IsPlistMatrixRep (not all at once, but one at a time, if/when it seems useful and safe). But for this application, one wants those matrices to work as closely as possible to the old-style matrices; so in particular, here we want an IsRowListMatrix.

So here we have a clash of goal. So my thought here now is to introduce IsRowPlistMatrixRep (or something like that) which specializes IsPlistMatrixRep and is also in IsRowListMatrix. I think all that is needed to make that happen is to install suitable [] and []:= methods for IsRowPlistMatrixRep (and perhaps a few more, dunno). Then mat[i] could simply return an IsVectorObj which wraps the list mat![ROWSPOS][i] (and not a shallow copy; it really must point to exactly that list). This way, mat[i][j] := 1 will still work. Of course this incurs a performance penalty; I think that at least the [] method should contain a Info(InfoPerformance, ...) message. If we want to optimize it a bit, we could add a WeakPointerObj as cache for these row wrapper objects (some care needs to be done: replacing a row via []:= should invalidate the respective cache entry).

@fingolfin
Copy link
Member Author

Err, I've started to add IsRowPlistMatrixRep, but completing it will be tedious, as I need to once again implement NewZeroMatrix, NewIdentityMatrix, ... -- we ought to resolve #4366

- change `IsPlistMatrixRep` to *not* be a row list type; so instead
  of storing a list of `IsPlistVectorRep` instances, they now store an
  old-style matrix in `IsMatrix and IsPlistRep`. This in turn allows
  to simplify and optimize a lot of code, e.g. matrix arithmetic can
  now just dispatch to that for plain lists.
- use `BindConstant` on `BDPOS` etc. for best performance
- set `IsNoImmediateMethodsObject` filter for both for better performance
- rename `EMPOS` and `RLPOS` to `NUM_ROWS_POS` and `NUM_COLS_POS`
- allow semirings as basedomain
- fix various vector/matrix constructors to copy their inputs lists
  instead of reusing them inside (that can lead to bad bugs)
- removed a bunch of (now) redundant methods that didn't seem to
  provide any benefits (such as improved performance)
- ...
This fixes printing of 0 x n matrices
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind: enhancement Label for issues suggesting enhancements; and for pull requests implementing enhancements release notes: to be added PRs introducing changes that should be (but have not yet been) mentioned in the release notes
Projects
No open projects
Status: No status
Development

Successfully merging this pull request may close these issues.

4 participants