Feature: branch pruning #844
(pruning). We tag parsed tokens to associate a branch identifier to them
…atibility reasons
This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
Did a merge with master to avoid merge conflicts.
@MarkBaker this PR still looks quite promising. Should we merge it now, or would it conflict with your work on the new calculation engine?
Side note: we have been using it in production for a few weeks now on some pretty advanced sheets without observing any bugs.
Thank you for your work and patience. I finally squashed and merged it as 0b387e7.
1.9.0

### Added

- When `<br>` appears in a table cell, set the cell to wrap [#1071](#1071) and [#1070](#1070)
- Add MAXIFS, MINIFS, COUNTIFS and Remove MINIF, MAXIF [#1056](#1056)
- HLookup needs an ordered list even if range_lookup is set to false [#1055](#1055) and [#1076](#1076)
- Improve performance of IF function calls via branch pruning to avoid resolving every branch [#844](#844)
- MATCH function supports `*?~` Excel functionality, when match_type=0 [#1116](#1116)
- Allow HTML Reader to accept HTML as a string [#1136](#1136)

### Fixed

- Fix to AVERAGEIF() function when called with a third argument
- Eliminate duplicate fill none style entries [#1066](#1066)
- Fix number format masks containing literal (non-decimal point) dots [#1079](#1079)
- Fix number format masks containing named colours that were being misinterpreted as date formats; and add support for masks that fully replace the value with a full text string [#1009](#1009)
- Stricter-typed comparison testing in COUNTIF() and COUNTIFS() evaluation [#1046](#1046)
- COUPNUM should not return zero when settlement is in the last period [#1020](#1020) and [#1021](#1021)
- Fix handling of named ranges referencing sheets with spaces or "!" in their title
- Cover `getSheetByName()` with tests for name with quote and spaces [#739](#739)
- Best effort to support invalid colspan values in HTML reader - [#878](#878)
- Fixes incorrect rows deletion [#868](#868)
- MATCH function fix (value search by type, stop search when match_type=-1 and unordered element encountered) [#1116](#1116)
- Fix `getCalculatedValue()` error with more than two INDIRECT [#1115](#1115)
- Writer\Html did not hide columns [#985](#985)
Repost of #818 after a rebase on `master`.

This is:
Checklist:
Why this change is needed?
The calculation engine resolved every function by first resolving all of its arguments, including the arguments of IF. This caused significant over-evaluation whenever IFs were used, because every branch was evaluated regardless of the condition.
I have tested against 5 files made by 4 different people to check that this does not introduce regressions, and I have observed none. The speed improvement on those files ranges from 0% to 80%.
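To make the over-evaluation concrete, here is a minimal Python sketch of the difference between eager argument resolution and branch pruning. It is only an illustration of the idea, not the calculation engine's code: `eager_if`, `pruned_if`, and `expensive` are hypothetical names introduced for this example.

```python
from typing import Callable, List

calls: List[str] = []  # records which branches were actually evaluated


def expensive(label: str, value: int) -> int:
    """Stand-in for a costly sub-formula; logs each evaluation."""
    calls.append(label)
    return value


def eager_if(condition: bool, when_true: int, when_false: int) -> int:
    """Eager style: both branch values were computed before this call."""
    return when_true if condition else when_false


def pruned_if(
    condition: Callable[[], bool],
    when_true: Callable[[], int],
    when_false: Callable[[], int],
) -> int:
    """Pruned style: only the branch selected by the condition is evaluated."""
    return when_true() if condition() else when_false()


# Eager evaluation resolves both branches even though only one is used.
calls.clear()
eager_result = eager_if(True, expensive("then", 1), expensive("else", 2))
eager_calls = list(calls)

# Pruned evaluation resolves the condition first, then a single branch.
calls.clear()
pruned_result = pruned_if(
    lambda: True,
    lambda: expensive("then", 1),
    lambda: expensive("else", 2),
)
pruned_calls = list(calls)
```

Both variants return the same value, but the eager one evaluates two branches where the pruned one evaluates one. With nested IFs the wasted work compounds, which would be consistent with the gain depending heavily on the file.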
EDIT: Completing from the discussion I had with @PowerKiKi:
Yes, this touches the core of the calculation engine, so it would not be too surprising if it introduced regressions. I did my best to test it thoroughly; if you see extra tests to perform, let me know.
As per the extra public methods:

- `Stack::getStackItem()`: enables code factorization, as `Calculation::_parseFormula()` doesn't manipulate tokens only through a stack and was also manually creating the arrays representing tokens.
- `Stack::__toString()`: was really convenient for debugging. I would recommend leaving it, but I could truncate it.
- `Calculation`: `setBranchPruningEnabled()`, `enableBranchPruning()`, `disableBranchPruning()` and `clearBranchStore()`.
- `processTokenStack()`: I usually prefer to use the reflection API in my unit tests.

A side point about `CalculationTest::testBranchPruningFormulaParsing`: this test was pretty thick. I could not use a data provider, as checking the expected result is far from trivial, so I split the test into many different functions.

As per the code splitting in the calculation engine itself, the `_parseFormula()` and `processTokenStack()` functions do not use context objects that could be passed around to subroutines. I could spend some time splitting those huge methods, as I think I have gained some understanding of the calculation engine's inner workings, but I think this should be for another pull request. I don't think it would deeply impact execution speed.
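The commit description earlier on this page mentions tagging parsed tokens with a branch identifier. The Python sketch below shows one way such tagging could let a postfix token evaluator skip a branch. It rests on assumptions: the `Token` fields (`store_key`, `only_if`, `only_if_not`) and the `evaluate` function are hypothetical names for this example, it handles only a single non-nested IF, and it is not the actual PhpSpreadsheet implementation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Tuple


@dataclass
class Token:
    kind: str                          # "value", "call" or "if"
    value: object = None               # literal, or (name, arity) for a call
    store_key: Optional[str] = None    # save this token's result under the key
    only_if: Optional[str] = None      # run only if the stored result is truthy
    only_if_not: Optional[str] = None  # run only if the stored result is falsy


def evaluate(
    tokens: List[Token], functions: Dict[str, Callable]
) -> Tuple[object, List[str]]:
    """Evaluate postfix tokens, skipping work for tokens on a pruned branch."""
    stack: List[object] = []
    store: Dict[str, object] = {}
    called: List[str] = []  # functions that were really invoked
    for token in tokens:
        pruned = (
            token.only_if is not None and not store[token.only_if]
        ) or (
            token.only_if_not is not None and bool(store[token.only_if_not])
        )
        if token.kind == "value":
            result = None if pruned else token.value
        elif token.kind == "call":
            name, arity = token.value
            # Always pop the operands so the stack keeps its shape.
            args = [stack.pop() for _ in range(arity)][::-1]
            if pruned:
                result = None  # placeholder, the function is never called
            else:
                called.append(name)
                result = functions[name](*args)
        elif token.kind == "if":
            when_false, when_true, condition = stack.pop(), stack.pop(), stack.pop()
            result = when_true if condition else when_false
        else:
            raise ValueError(f"unknown token kind: {token.kind}")
        if token.store_key is not None and not pruned:
            store[token.store_key] = result
        stack.append(result)
    return stack.pop(), called


# Postfix tokens for IF(condition, SLOW(10), SLOW(20)).
def build(condition: int) -> List[Token]:
    return [
        Token("value", condition, store_key="c1"),
        Token("value", 10, only_if="c1"),
        Token("call", ("SLOW", 1), only_if="c1"),
        Token("value", 20, only_if_not="c1"),
        Token("call", ("SLOW", 1), only_if_not="c1"),
        Token("if"),
    ]


functions = {"SLOW": lambda x: x * 2}
true_result, true_calls = evaluate(build(1), functions)
false_result, false_calls = evaluate(build(0), functions)
```

In both runs `SLOW` is invoked once instead of twice. Pruned tokens still push a placeholder, so the final `if` token finds its three operands on the stack.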