Feature: branch pruning #844

Closed

frantzmiccoli
Contributor

Repost of #818 after a rebase on master.

This is:

- [ ] a bugfix
- [x] a new feature

Checklist:

Why this change is needed?

The calculation engine resolved every function by first resolving all of its arguments, including those of IF. This caused significant over-evaluation whenever IFs were used, because every branch was evaluated regardless of the condition.
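
To illustrate the over-evaluation described above, here is a minimal, language-neutral sketch in Python. It is not the PhpSpreadsheet implementation: the `Const`, `Costly`, and `If` node types and the `evaluate()` function are hypothetical names invented for this example. It only contrasts resolving every argument of an IF before choosing a result with resolving the condition first and then only the selected branch.

```python
from dataclasses import dataclass
from typing import Union


# Hypothetical node types for illustration only; none of these names
# exist in PhpSpreadsheet.
@dataclass
class Const:
    value: object


@dataclass
class Costly:
    """A leaf whose evaluation we want to count (e.g. a slow lookup)."""
    label: str
    value: object


@dataclass
class If:
    condition: "Node"
    when_true: "Node"
    when_false: "Node"


Node = Union[Const, Costly, If]


def evaluate(node: Node, prune: bool, log: list) -> object:
    """Evaluate a node, appending the label of every Costly leaf touched."""
    if isinstance(node, Const):
        return node.value
    if isinstance(node, Costly):
        log.append(node.label)
        return node.value
    # node is an If: the condition is always needed.
    condition = evaluate(node.condition, prune, log)
    if prune:
        # Branch pruning: only the selected branch is resolved.
        chosen = node.when_true if condition else node.when_false
        return evaluate(chosen, prune, log)
    # Eager strategy: resolve both branches first, then pick one.
    true_value = evaluate(node.when_true, prune, log)
    false_value = evaluate(node.when_false, prune, log)
    return true_value if condition else false_value


# Roughly IF(TRUE, A, IF(FALSE, B, C)) with three costly leaves.
formula = If(
    Const(True),
    Costly("A", 1),
    If(Const(False), Costly("B", 2), Costly("C", 3)),
)

eager_log: list = []
eager_result = evaluate(formula, prune=False, log=eager_log)

pruned_log: list = []
pruned_result = evaluate(formula, prune=True, log=pruned_log)
```

Both strategies return the same value, but the eager one touches all three costly leaves while the pruned one touches only `A`. The gap grows with the nesting depth of the IFs, and disappears entirely in a formula with no IF, which fits a speed-up that varies from file to file.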

I have tested against 5 files made by 4 different people to ensure that this does not introduce regressions, and I have observed none. The speed improvement on those files ranges from 0% to 80%.


EDIT: additions following the discussion I had with @PowerKiKi.

Yes, this touches the core of the calculation engine, so it would not be too surprising if it introduced regressions. I did my best to test it thoroughly; if you see extra tests to perform, let me know.

Regarding the extra public methods:

  • Stack::getStackItem(): enables code factorization, as Calculation::_parseFormula() does not manipulate tokens only through a stack and was also manually creating the arrays that represent tokens.
  • Stack::__toString(): was really convenient for debugging. I would recommend leaving it in, but I could truncate its output.
  • I followed the result caching logic, which introduced a few extra methods in Calculation: setBranchPruningEnabled(), enableBranchPruning(), disableBranchPruning() and clearBranchStore().
  • Unrelated to this pull request, but I wonder whether some functions, like processTokenStack(), are public only to enable testing. I usually prefer to use the reflection API in my unit tests.

A side point about CalculationTest::testBranchPruningFormulaParsing: this test was pretty thick. I could not use a data provider, as checking the expected result is far from trivial, so I split the test into many different functions.

As for splitting the code in the calculation engine itself: the _parseFormula() and processTokenStack() functions do not use context objects that could be passed around to subroutines. I could spend some time splitting those huge methods, as I think I have gained some understanding of the calculation engine's inner workings, but I think this should be for another pull request. I don't think it would deeply impact execution speed.

@stale

stale bot commented Mar 8, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.
If this is still an issue for you, please try to help by debugging it further and sharing your results.
Thank you for your contributions.

@stale stale bot added the stale label Mar 8, 2019
@MarkBaker MarkBaker removed the stale label Mar 8, 2019
@frantzmiccoli frantzmiccoli force-pushed the feature/branch-pruning branch 2 times, most recently from aa38599 to c7e3e8b Compare March 11, 2019 15:21
@frantzmiccoli
Contributor Author

frantzmiccoli commented Mar 11, 2019

Did a merge with master to avoid merge conflicts

@frantzmiccoli frantzmiccoli force-pushed the feature/branch-pruning branch from c7e3e8b to da19445 Compare April 3, 2019 09:26
@frantzmiccoli frantzmiccoli force-pushed the feature/branch-pruning branch from da19445 to 4d8ca2a Compare April 3, 2019 09:27
@PowerKiKi PowerKiKi added the pinned pinned issue to avoid them becoming stale label May 26, 2019
@PowerKiKi
Member

@MarkBaker this PR still looks quite promising. Should we merge it now, or would it conflict with your work on the new calculation engine?

@frantzmiccoli
Contributor Author

Side note: we have been using this in production for a few weeks now on some pretty advanced sheets without observing any bugs.

@PowerKiKi PowerKiKi closed this in 0b387e7 Aug 12, 2019
@PowerKiKi
Member

Thank you for your work and patience. I finally squashed and merged it as 0b387e7

PowerKiKi added a commit that referenced this pull request Aug 17, 2019
1.9.0

### Added

- When `<br>` appears in a table cell, set the cell to wrap [#1071](#1071) and [#1070](#1070)
- Add MAXIFS, MINIFS, COUNTIFS and Remove MINIF, MAXIF [#1056](#1056)
- HLookup needs an ordered list even if range_lookup is set to false [#1055](#1055) and [#1076](#1076)
- Improve performance of IF function calls via branch pruning to avoid resolution of every branch [#844](#844)
- MATCH function supports `*?~` Excel functionality, when match_type=0 [#1116](#1116)
- Allow HTML Reader to accept HTML as a string [#1136](#1136)

### Fixed

- Fix to AVERAGEIF() function when called with a third argument
- Eliminate duplicate fill none style entries [#1066](#1066)
- Fix number format masks containing literal (non-decimal point) dots [#1079](#1079)
- Fix number format masks containing named colours that were being misinterpreted as date formats; and add support for masks that fully replace the value with a full text string [#1009](#1009)
- Stricter-typed comparison testing in COUNTIF() and COUNTIFS() evaluation [#1046](#1046)
- COUPNUM should not return zero when settlement is in the last period [#1020](#1020) and [#1021](#1021)
- Fix handling of named ranges referencing sheets with spaces or "!" in their title
- Cover `getSheetByName()` with tests for name with quote and spaces [#739](#739)
- Best effort to support invalid colspan values in HTML reader - [#878](#878)
- Fixes incorrect rows deletion [#868](#868)
- MATCH function fix (value search by type, stop search when match_type=-1 and unordered element encountered) [#1116](#1116)
- Fix `getCalculatedValue()` error with more than two INDIRECT [#1115](#1115)
- Writer\Html did not hide columns [#985](#985)
Labels
pinned pinned issue to avoid them becoming stale

3 participants