Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

(perf)node-read: filter before map to candidate #2168

Merged

Conversation

russmatney
Copy link
Contributor

@russmatney russmatney commented Apr 22, 2022

Moves the optional node filter in org-roam-node-read--completions before
the mapcar org-roam-node-read--to-candidate.

This saves time for completion on org-roam-node-find and
org-roam-node-insert (but only when a filter-fn is passed).


Motivation for this change

I have experienced slowness (several seconds) on every node-find and node-insert
command for a while, but finally made the time to dig into why today.

I have about 1k files and 5k nodes in my roam db.

I realized while digging that my 'archived' org nodes are being included, which
could perhaps be filtered to speed things up. Alternatively, i mostly
only care to link to files anyway. Fortunately, filter-fn exists as a way to
filter some nodes.

I tried:

;;;###autoload
(defun russ/org-roam-insert-node-level-0 ()
  "`org-roam-node-insert' but filtering for level 0 (files)"
  (interactive)
  (org-roam-node-insert (lambda (node)
                          (= 0 (org-roam-node-level node)))))

This 'works', but for whatever reason takes just as long (still several seconds
with a mouse-spinner before ivy pops up).

Digging into why, i traced org-roam-node-insert to org-roam-node-list. The
SQL returns more or less immediately in a sqlite shell, and via elisp:

(comment
 (cl-first
  (org-roam-node-list))

 (cl-first
  (reverse
   (org-roam-node-list))))

Then I noticed we're mapping everything via org-roam-node-read--to-candidate
before filtering. Sure enough, moving the filtering ahead of this immediately
speeds up commands that include filters to more or less match the sql-speed:

(comment
 (cl-first
  (org-roam-node-read--completions
   (lambda (node)
     (= 0 (org-roam-node-level node))))))

I don't think this will break any existing usage - I believe the same node is
being passed to the filter function, but a quick read of
org-roam-node-read--to-candidate to be sure it's actually passing the node
through unaltered is probably worth it:

(defun org-roam-node-read--to-candidate (node template)
  "Return a minibuffer completion candidate given NODE.
TEMPLATE is the processed template used to format the entry."
  (let ((candidate-main (org-roam-node--format-entry
                         template
                         node
                         (1- (if (bufferp (current-buffer))
                                 (window-width) (frame-width))))))
    (cons (propertize candidate-main 'node node) node)))

Seems fine to me, and with this change i can overwrite the insert/find bindings
with filters, leading to much better performance. Huzzah!

Other thoughts

Because sqlite is likely to scale well performance-wise, we might consider
including the work of --to-candidate in there, if it's possible. Although, this
seems unlikely, given that it seems to depend on the current buffer, width, and
frame.

Otherwise, org-roam-node-read--to-candidate could probably be refactored to
re-use the (if (bufferp (current-buffer)) (window-width) (frame-width)) bit.
In fact, is that being called for every candidate?? I'm not sure what 1- does.
I'll explore and report back.

@russmatney russmatney force-pushed the perf/filter-before-map-to-candidate branch from cf6d39f to 5ec671b Compare April 22, 2022 13:54
@russmatney
Copy link
Contributor Author

Updating to calc the width once did not seem to make much difference:

(defun org-roam-node-read--completions (&optional filter-fn sort-fn)
  "Return an alist for node completion.
The car is the displayed title or alias for the node, and the cdr
is the `org-roam-node'.
FILTER-FN is a function to filter out nodes: it takes an `org-roam-node',
and when nil is returned the node will be filtered out.
SORT-FN is a function to sort nodes. See `org-roam-node-read-sort-by-file-mtime'
for an example sort function.
The displayed title is formatted according to `org-roam-node-display-template'."
  (let* ((template (org-roam-node--process-display-format org-roam-node-display-template))
         (nodes (org-roam-node-list))
         (nodes (if filter-fn
                    (cl-remove-if-not
                     (lambda (n) (funcall filter-fn n))
                     nodes)
                  nodes))
         ;; map to candidates after filtering
         (nodes (mapcar (lambda (node)
                          (org-roam-node-read--to-candidate
                           node template
                           (1- (if (bufferp (current-buffer))
                                   (window-width) (frame-width))))) nodes))
         (sort-fn (or sort-fn
                      (when org-roam-node-default-sort
                        (intern (concat "org-roam-node-read-sort-by-"
                                        (symbol-name org-roam-node-default-sort))))))
         (nodes (if sort-fn (seq-sort sort-fn nodes)
                  nodes)))
    nodes))

(defun org-roam-node-read--to-candidate (node template &optional width)
  "Return a minibuffer completion candidate given NODE.
TEMPLATE is the processed template used to format the entry."
  (let ((candidate-main (org-roam-node--format-entry
                         template
                         node
                         (if width width
                           (1- (if (bufferp (current-buffer))
                                   (window-width) (frame-width)))))))
    (cons (propertize candidate-main 'node node) node)))

Likely the slowness is the work done in --format-entry. Maybe that could be pre-calced and stored in the db, or cached per template, but I'm sure that's more than a few lines 😛

@russmatney
Copy link
Contributor Author

russmatney commented Apr 22, 2022

For anyone looking, I ended up with helpers like these:

(defun russ/org-roam-old-p (node)
  (or
   (string-match-p
    "/old-nov-2020/\\|/old/\\|/drafts-journal/\\|/journal/\\|/archive/"
    (org-roam-node-file node))
   (member "reviewed" (org-roam-node-tags node))))

;;;###autoload
(defun russ/org-roam-insert-node-relevant ()
  "`org-roam-node-insert' but filtering out misc old notes"
  (interactive)
  (org-roam-node-insert
   (lambda (node)
     (not (russ/org-roam-old-p node)))))

;;;###autoload
(defun russ/org-roam-insert-file ()
  "`org-roam-node-insert' but filtering for level 0 (files)"
  (interactive)
  (org-roam-node-insert
   (lambda (node)
     (= 0 (org-roam-node-level node)))))

;;;###autoload
(defun russ/org-roam-find-node-relevant ()
  "`org-roam-node-insert' but filtering out misc old notes"
  (interactive)
  (org-roam-node-find
   nil nil
   (lambda (node)
     (not (russ/org-roam-old-p node)))))

;;;###autoload
(defun russ/org-roam-find-file ()
  "`org-roam-node-insert' but filtering out misc old notes"
  (interactive)
  (org-roam-node-find
   nil nil (lambda (node)
             (= 0 (org-roam-node-level node)))))

This filters my nodes from ~5k to ~2k and ~1k, which feels like it takes about half and a quarter as long. I'd expect things to continue to slow down as more nodes are added.

One (distracting) thought was that it'd be nice pass filters into the sqlite query itself - i.e. filtering on tags or on filename at the db level - but i think the real slowness comes from is the org-roam-node--format-entry function.

russmatney and others added 2 commits April 24, 2022 17:09
Moves the optional node filter in org-roam-node-read--completions before
the mapcar `org-roam-node-read--to-candidate`.

This saves time for completion on org-roam-node-find and
org-roam-node-insert (but only when a filter-fn is passed).
@jethrokuan jethrokuan force-pushed the perf/filter-before-map-to-candidate branch from 5ec671b to d95b1b4 Compare April 25, 2022 00:10
@jethrokuan
Copy link
Member

Nice catch @russmatney ! Missed this PR in the release, but gonna merge it anyway.

@jethrokuan jethrokuan merged commit e8b4822 into org-roam:main Apr 25, 2022
@@ -14,6 +22,8 @@
- [#2152](/~https://github.com/org-roam/org-roam/pull/2152) org-roam-preview-default-function: doesn't copy copy content of next heading node when current node's content is empty
- [#2159](/~https://github.com/org-roam/org-roam/pull/2159) db: fix db syncs on narrowed buffers
- [#2156](/~https://github.com/org-roam/org-roam/pull/2157) capture: templates with functions are handled correctly to avoid signaling `char-or-string-p`
- [#2168](/~https://github.com/org-roam/org-roam/pull/2168) (perf)node-read: filter nodes before mapping --to-candidate
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@jethrokuan sorry, i'd included a changelog line already - you may want to remove this if the change isn't in this version

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

urgh I thought I had

@russmatney
Copy link
Contributor Author

Nice catch @russmatney ! Missed this PR in the release, but gonna merge it anyway.

excellent, sounds good - thanks for the merge!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants