Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve efficiency of the dependency graph (56% speed-up) #1293

Merged
merged 33 commits into from
Sep 7, 2023
Merged
Show file tree
Hide file tree
Changes from 13 commits
Commits
Show all changes
33 commits
Select commit Hold shift + click to select a range
69878c7
Refactor src/DateTimeDefault
sequba Jul 26, 2023
d38f047
Tmp changes to package.json
sequba Jul 26, 2023
cfbc55a
Reduce the number of calls to adjacentNodes()
sequba Jul 26, 2023
4952638
Move topological sorting algorithm to separate class
sequba Jul 27, 2023
f0f591c
Partition topsort function into smaller parts
sequba Jul 27, 2023
7e9737b
Rewrite topological sorting to use ids and simple arrays instead of M…
sequba Aug 1, 2023
ca5cb70
Add a date/time parsing quick check to escape early if the string doe…
sequba Aug 14, 2023
a1b8ffa
Memoize parsing result of date and time formats for better performance
sequba Aug 14, 2023
48a1d42
Improve documentation of the date/time parsing functions in DateTimeD…
sequba Aug 14, 2023
6f7625f
Fix regexp escape character
sequba Aug 14, 2023
62d876a
Remove unused method in Graph class
sequba Aug 16, 2023
e6cc8ca
Make Graph.nodes private
sequba Aug 16, 2023
0bc660d
Fix removeEdge issue
sequba Aug 25, 2023
ed2aad8
Merge branch 'develop' of github.com:handsontable/hyperformula into f…
sequba Aug 29, 2023
39e770a
Add changelog entry
sequba Aug 29, 2023
fd9f3c0
Remove unnecessary comments
sequba Aug 29, 2023
ea1f448
Remove graph.nodesCount() function (YAGNI)
sequba Aug 29, 2023
294aaf4
Remove graph.edgesCount() function (YAGNI)
sequba Aug 29, 2023
f42c368
Remove graph.getDependencies() function (YAGNI)
sequba Aug 29, 2023
30bdcb2
Add typedocs to all methods of Graph.ts
sequba Aug 29, 2023
0db23ea
Refactor tests for Graph class
sequba Aug 29, 2023
0fd2c24
Improve code coverage for the Graph class
sequba Aug 29, 2023
1a99297
Improve code formatting in Graph.ts and TopSort.ts
sequba Aug 29, 2023
e76fa43
Convert recently changed nodes set to map in Graph class
sequba Aug 30, 2023
3019ac6
Convert volatile nodes set to array of numbers in Graph class
sequba Aug 30, 2023
c234ef0
Convert all sets of nodes to arrays of numbers in Graph class
sequba Aug 30, 2023
c7e3bdb
Improve documentation for Graph class
sequba Aug 30, 2023
004ac3d
Introduce ProcessableValue class for better performance
sequba Sep 5, 2023
ce5e10f
Try a few more optimizations
sequba Sep 6, 2023
c135a48
Change graph.infiniteRangeIds to be a Set of numbers (ids)
sequba Sep 6, 2023
65d4505
Adjust test descriptions
sequba Sep 6, 2023
e3bc190
Fix browser tests
sequba Sep 6, 2023
69095fc
Merge branch 'develop' into feature/issue-876
sequba Sep 7, 2023
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion package.json
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@
"url": "/~https://github.com/handsontable/hyperformula/issues"
},
"author": "Handsoncode <hello@handsontable.com>",
"version": "2.5.0",
"version": "3.0.0-perf",
"keywords": [
"formula",
"spreadsheet",
Expand Down
2 changes: 1 addition & 1 deletion src/BuildEngineFactory.ts
Original file line number Diff line number Diff line change
Expand Up @@ -86,7 +86,7 @@ export class BuildEngineFactory {
throw new SheetSizeLimitExceededError()
}
const sheetId = sheetMapping.addSheet(sheetName)
addressMapping.autoAddSheet(sheetId, sheet, boundaries)
addressMapping.autoAddSheet(sheetId, boundaries)
}
}

Expand Down
189 changes: 141 additions & 48 deletions src/DateTimeDefault.ts
Original file line number Diff line number Diff line change
Expand Up @@ -6,27 +6,57 @@
import {DateTime, SimpleDate, SimpleTime} from './DateTimeHelper'
import {Maybe} from './Maybe'

export function defaultParseToDateTime(dateTimeString: string, dateFormat?: string, timeFormat?: string): Maybe<DateTime> {
dateTimeString = dateTimeString.replace(/\s\s+/g, ' ').trim().toLowerCase()
let ampmtoken: Maybe<string> = dateTimeString.substring(dateTimeString.length - 2)
if (ampmtoken === 'am' || ampmtoken === 'pm') {
export const TIME_FORMAT_SECONDS_ITEM_REGEXP = new RegExp('^ss(\\.(s+|0+))?$')

const QUICK_CHECK_REGEXP = new RegExp('^[0-9/.\\-: ]+[ap]?m?$')
const WHITESPACE_REGEXP = new RegExp('\\s+')
const DATE_SEPARATOR_REGEXP = new RegExp('[ /.-]')
const TIME_SEPARATOR = ':'
const SECONDS_PRECISION = 1000
const memoizedParseTimeFormat = memoize(parseTimeFormat)
const memoizedParseDateFormat = memoize(parseDateFormat)

/**
* Parses a DateTime value from a string if the string matches the given date format and time format.
*
* Idea for more readable implementation:
* - divide string into parts by a regexp [date_regexp]? [time_regexp]? [ampm_regexp]?
* - start by finding the time part, because it is unambiguous '([0-9]+:[0-9:.]+ ?[ap]?m?)$', before it is the date part
* - OR split by spaces - last segment is ampm token, second to last is time (with or without ampm), rest is date
* If applied:
* - date parsing might work differently after these changes but still according to the docs
* - make sure to test edge cases like timeFormats: ['hh', 'ss.ss'] etc, string: '01-01-2019 AM', 'PM'
*/
export function defaultParseToDateTime(text: string, dateFormat: Maybe<string>, timeFormat: Maybe<string>): Maybe<DateTime> {
if (dateFormat === undefined && timeFormat === undefined) {
return undefined
}

let dateTimeString = text.replace(WHITESPACE_REGEXP, ' ').trim().toLowerCase()

if (!doesItLookLikeADateTimeQuickCheck(dateTimeString)) {
return undefined
}

let ampmToken: Maybe<string> = dateTimeString.substring(dateTimeString.length - 2)
if (ampmToken === 'am' || ampmToken === 'pm') {
dateTimeString = dateTimeString.substring(0, dateTimeString.length - 2).trim()
} else {
ampmtoken = dateTimeString.substring(dateTimeString.length - 1)
if (ampmtoken === 'a' || ampmtoken === 'p') {
ampmToken = dateTimeString.substring(dateTimeString.length - 1)
if (ampmToken === 'a' || ampmToken === 'p') {
dateTimeString = dateTimeString.substring(0, dateTimeString.length - 1).trim()
} else {
ampmtoken = undefined
ampmToken = undefined
}
}
const dateItems = dateTimeString.split(/[ /.-]/g)
if (dateItems.length >= 2 && dateItems[dateItems.length - 2].includes(':')) {
const dateItems = dateTimeString.split(DATE_SEPARATOR_REGEXP)
if (dateItems.length >= 2 && dateItems[dateItems.length - 2].includes(TIME_SEPARATOR)) {
dateItems[dateItems.length - 2] = dateItems[dateItems.length - 2] + '.' + dateItems[dateItems.length - 1]
dateItems.pop()
}
const timeItems = dateItems[dateItems.length - 1].split(':')
if (ampmtoken !== undefined) {
timeItems.push(ampmtoken)
const timeItems = dateItems[dateItems.length - 1].split(TIME_SEPARATOR)
if (ampmToken !== undefined) {
timeItems.push(ampmToken)
}

if (dateItems.length === 1) {
Expand All @@ -46,21 +76,21 @@ export function defaultParseToDateTime(dateTimeString: string, dateFormat?: stri
}
}

export const secondsExtendedRegexp = /^ss(\.(s+|0+))?$/

/**
* Parses a time value from a string if the string matches the given time format.
*/
function defaultParseToTime(timeItems: string[], timeFormat: Maybe<string>): Maybe<SimpleTime> {
const precision = 1000

if (timeFormat === undefined) {
return undefined
}
timeFormat = timeFormat.toLowerCase()
if (timeFormat.endsWith('am/pm')) {
timeFormat = timeFormat.substring(0, timeFormat.length - 5).trim()
} else if (timeFormat.endsWith('a/p')) {
timeFormat = timeFormat.substring(0, timeFormat.length - 3).trim()
}
const formatItems = timeFormat.split(':')

const {
itemsCount,
hourItem,
minuteItem,
secondItem,
} = memoizedParseTimeFormat(timeFormat)

let ampm = undefined
if (timeItems[timeItems.length - 1] === 'am' || timeItems[timeItems.length - 1] === 'a') {
ampm = false
Expand All @@ -70,15 +100,11 @@ function defaultParseToTime(timeItems: string[], timeFormat: Maybe<string>): May
timeItems.pop()
}

if (timeItems.length !== formatItems.length) {
if (timeItems.length !== itemsCount) {
return undefined
}

const hourIndex = formatItems.indexOf('hh')
const minuteIndex = formatItems.indexOf('mm')
const secondIndex = formatItems.findIndex(item => secondsExtendedRegexp.test(item))

const hourString = hourIndex !== -1 ? timeItems[hourIndex] : '0'
const hourString = hourItem !== -1 ? timeItems[hourItem] : '0'
if (!/^\d+$/.test(hourString)) {
return undefined
}
Expand All @@ -93,44 +119,49 @@ function defaultParseToTime(timeItems: string[], timeFormat: Maybe<string>): May
}
}

const minuteString = minuteIndex !== -1 ? timeItems[minuteIndex] : '0'
const minuteString = minuteItem !== -1 ? timeItems[minuteItem] : '0'
if (!/^\d+$/.test(minuteString)) {
return undefined
}
const minutes = Number(minuteString)

const secondString = secondIndex !== -1 ? timeItems[secondIndex] : '0'
const secondString = secondItem !== -1 ? timeItems[secondItem] : '0'
if (!/^\d+(\.\d+)?$/.test(secondString)) {
return undefined
}
const seconds = Math.round(Number(secondString) * SECONDS_PRECISION) / SECONDS_PRECISION

const seconds = Math.round(Number(secondString) * precision) / precision

return {hours, minutes, seconds}
return { hours, minutes, seconds }
}

/**
* Parses a date value from a string if the string matches the given date format.
*/
function defaultParseToDate(dateItems: string[], dateFormat: Maybe<string>): Maybe<SimpleDate> {
if (dateFormat === undefined) {
return undefined
}
const formatItems = dateFormat.toLowerCase().split(/[ /.-]/g)
if (dateItems.length !== formatItems.length) {
const {
itemsCount,
dayItem,
monthItem,
shortYearItem,
longYearItem,
} = memoizedParseDateFormat(dateFormat)

if (dateItems.length !== itemsCount) {
return undefined
}
const monthIndex = formatItems.indexOf('mm')
const dayIndex = formatItems.indexOf('dd')
const yearIndexLong = formatItems.indexOf('yyyy')
const yearIndexShort = formatItems.indexOf('yy')
if (!(monthIndex in dateItems) || !(dayIndex in dateItems) ||
(!(yearIndexLong in dateItems) && !(yearIndexShort in dateItems))) {
if (!(monthItem in dateItems) || !(dayItem in dateItems) ||
(!(longYearItem in dateItems) && !(shortYearItem in dateItems))) {
return undefined
}
if (yearIndexLong in dateItems && yearIndexShort in dateItems) {
if (longYearItem in dateItems && shortYearItem in dateItems) {
return undefined
}
let year
if (yearIndexLong in dateItems) {
const yearString = dateItems[yearIndexLong]
if (longYearItem in dateItems) {
const yearString = dateItems[longYearItem]
if (/^\d+$/.test(yearString)) {
year = Number(yearString)
if (year < 1000 || year > 9999) {
Expand All @@ -140,7 +171,7 @@ function defaultParseToDate(dateItems: string[], dateFormat: Maybe<string>): May
return undefined
}
} else {
const yearString = dateItems[yearIndexShort]
const yearString = dateItems[shortYearItem]
if (/^\d+$/.test(yearString)) {
year = Number(yearString)
if (year < 0 || year > 99) {
Expand All @@ -150,15 +181,77 @@ function defaultParseToDate(dateItems: string[], dateFormat: Maybe<string>): May
return undefined
}
}
const monthString = dateItems[monthIndex]
const monthString = dateItems[monthItem]
if (!/^\d+$/.test(monthString)) {
return undefined
}
const month = Number(monthString)
const dayString = dateItems[dayIndex]
const dayString = dateItems[dayItem]
if (!/^\d+$/.test(dayString)) {
return undefined
}
const day = Number(dayString)
return {year, month, day}
}

/**
* Parses a time format string into a format object.
*/
function parseTimeFormat(timeFormat: string): { itemsCount: number, hourItem: number, minuteItem: number, secondItem: number } {
const formatLowercase = timeFormat.toLowerCase().trim()
const formatWithoutAmPmItem = formatLowercase.endsWith('am/pm')
? formatLowercase.substring(0, formatLowercase.length - 5)
: (formatLowercase.endsWith('a/p')
? formatLowercase.substring(0, timeFormat.length - 3)
: formatLowercase)

const items = formatWithoutAmPmItem.trim().split(TIME_SEPARATOR)
return {
itemsCount: items.length,
hourItem: items.indexOf('hh'),
minuteItem: items.indexOf('mm'),
secondItem: items.findIndex(item => TIME_FORMAT_SECONDS_ITEM_REGEXP.test(item)),
}
}

/**
* Parses a date format string into a format object.
*/
function parseDateFormat(dateFormat: string): { itemsCount: number, dayItem: number, monthItem: number, shortYearItem: number, longYearItem: number } {
const items = dateFormat.toLowerCase().trim().split(DATE_SEPARATOR_REGEXP)

return {
itemsCount: items.length,
dayItem: items.indexOf('dd'),
monthItem: items.indexOf('mm'),
shortYearItem: items.indexOf('yy'),
longYearItem: items.indexOf('yyyy'),
}
}

/**
* If this function returns false, the string is not parsable as a date time. Otherwise, it might be.
* This is a quick check that is used to avoid running the more expensive parsing operations.
*/
function doesItLookLikeADateTimeQuickCheck(text: string): boolean {
return QUICK_CHECK_REGEXP.test(text)
}


/**
* Function memoization for improved performance.
*/
function memoize<T>(fn: (arg: string) => T) {
const memoizedResults: {[key: string]: T} = {}

return (arg: string) => {
const memoizedResult = memoizedResults[arg]
if (memoizedResult !== undefined) {
return memoizedResult
}

const result = fn(arg)
memoizedResults[arg] = result
return result
}
}
9 changes: 4 additions & 5 deletions src/DateTimeHelper.ts
Original file line number Diff line number Diff line change
Expand Up @@ -229,10 +229,10 @@ export class DateTimeHelper {
}

private parseDateTimeFromFormats(dateTimeString: string, dateFormats: string[], timeFormats: string[]): Partial<{ dateTime: DateTime, dateFormat: string, timeFormat: string }> {
const dateFormatsIterate = dateFormats.length === 0 ? [undefined] : dateFormats
const timeFormatsIterate = timeFormats.length === 0 ? [undefined] : timeFormats
for (const dateFormat of dateFormatsIterate) {
for (const timeFormat of timeFormatsIterate) {
const dateFormatsArray = dateFormats.length === 0 ? [undefined] : dateFormats
const timeFormatsArray = timeFormats.length === 0 ? [undefined] : timeFormats
for (const dateFormat of dateFormatsArray) {
for (const timeFormat of timeFormatsArray) {
const dateTime = this.parseSingleFormat(dateTimeString, dateFormat, timeFormat)
if (dateTime !== undefined) {
return {dateTime, timeFormat, dateFormat}
Expand Down Expand Up @@ -319,4 +319,3 @@ export function timeToNumber(time: SimpleTime): number {
export function toBasisEU(date: SimpleDate): SimpleDate {
return {year: date.year, month: date.month, day: Math.min(30, date.day)}
}

2 changes: 1 addition & 1 deletion src/DependencyGraph/AddressMapping/AddressMapping.ts
Original file line number Diff line number Diff line change
Expand Up @@ -61,7 +61,7 @@ export class AddressMapping {
this.mapping.set(sheetId, strategy)
}

public autoAddSheet(sheetId: number, sheet: Sheet, sheetBoundaries: SheetBoundaries) {
public autoAddSheet(sheetId: number, sheetBoundaries: SheetBoundaries) {
const {height, width, fill} = sheetBoundaries
const strategyConstructor = this.policy.call(fill)
this.addSheet(sheetId, new strategyConstructor(width, height))
Expand Down
10 changes: 6 additions & 4 deletions src/DependencyGraph/DependencyGraph.ts
Original file line number Diff line number Diff line change
Expand Up @@ -41,10 +41,11 @@ import {AddressMapping} from './AddressMapping/AddressMapping'
import {ArrayMapping} from './ArrayMapping'
import {collectAddressesDependentToRange} from './collectAddressesDependentToRange'
import {FormulaVertex} from './FormulaCellVertex'
import {DependencyQuery, Graph, TopSortResult} from './Graph'
import {DependencyQuery, Graph} from './Graph'
import {RangeMapping} from './RangeMapping'
import {SheetMapping} from './SheetMapping'
import {RawAndParsedValue} from './ValueCellVertex'
import {TopSortResult} from './TopSort'

export class DependencyGraph {
public readonly graph: Graph<Vertex>
Expand Down Expand Up @@ -111,6 +112,7 @@ export class DependencyGraph {
if (vertex instanceof ArrayVertex) {
this.arrayMapping.removeArray(vertex.getRange())
}

if (vertex instanceof ValueCellVertex) {
const oldValues = vertex.getValues()
if (oldValues.rawValue !== value.rawValue) {
Expand Down Expand Up @@ -527,7 +529,7 @@ export class DependencyGraph {
}

public* arrayFormulaNodes(): IterableIterator<ArrayVertex> {
for (const vertex of this.graph.nodes) {
for (const vertex of this.graph.getNodes()) {
if (vertex instanceof ArrayVertex) {
yield vertex
}
Expand Down Expand Up @@ -611,7 +613,7 @@ export class DependencyGraph {
}

public forceApplyPostponedTransformations() {
for (const vertex of this.graph.nodes.values()) {
for (const vertex of this.graph.getNodes()) {
if (vertex instanceof FormulaCellVertex) {
vertex.ensureRecentData(this.lazilyTransformingAstService)
}
Expand Down Expand Up @@ -1090,7 +1092,7 @@ export class DependencyGraph {
const adjNodesStored = this.graph.adjacentNodes(newVertex)

this.removeVertexAndCleanupDependencies(newVertex)
this.graph.softRemoveEdge(existingVertex, newVertex)
this.graph.removeEdgeIfExists(existingVertex, newVertex)
adjNodesStored.forEach((adjacentNode) => {
if (this.graph.hasNode(adjacentNode)) {
this.graph.addEdge(existingVertex, adjacentNode)
Expand Down
Loading