LuaSnip conditions memoization

first draft of memoization and condition objects as suggested in #592.

Just a draft PR so we have a place for discussion. If you've another idea of implementing, we cen delete/overwrite this one, no problem ; )

Collection of TODOs:

[ ] advise the users to put the heavier conditions at the end of the chain
[ ] more logic operators (currently: and, or, not) -> xnor
[ ] Prebuilt conditions (end of line, start of line, ...)
[ ] Prebuilt invalidation predicates
[ ] pass cursor position (etc.. ?) to the condition as well

Sep 15 '22 19:09 atticus-sullivan

This is pretty much what I had in mind as well, very nice :+1:

The overhead of using memoization would be interesting, should be measurable by adding a bunch of snippets with the same trigger, but different conditions.

Sep 15 '22 19:09 L3MON4D3

Ah, we should consider passing the cursor-position through from expand to check it in invalidate, querying that each time sounds horrible.

Sep 15 '22 19:09 L3MON4D3

Ah, we should consider passing the cursor-position through from expand to check it in invalidate, querying that each time sounds horrible.

Oh yes, sounds reasonable. Then we'll need some changes in he code after all (and think about if we want to make these args accessible for normal conditions as well (but I've got no real idea how to hide them tbh))

Sep 16 '22 14:09 atticus-sullivan

Then we'll need some changes in he code after all

True, but as long as they don't explicitly accomodate the memoization, I'm happy :D

It should be alright to just pass the cursor behind the other stuff, and it could actually come in useful for regular conditions, so there's no reason to hide it

Sep 16 '22 14:09 L3MON4D3

Also, remember that lua boolean operators don't do short-circuit, so this is basically doing all the calls all the time, I think it will be a good idea to do the short-circuit ourselves and advise the users to put the heavier conditions at the end of the chain.

Forget my comment, it does shortcut, I'm just speaking nonsense.

Sep 16 '22 14:09 leiserfg

Are you sure they don't short-circuit? Did a quick test and :lua =((function() print(1) return true end)() or (function() print(2) return true end)()) and 3 doesn't evaluate the second function. (Or I'm misunderstanding something, does anything in our construction prevent short-circuiting?)

Sep 16 '22 15:09 L3MON4D3

Are you sure they don't short-circuit?

Forget my comment, I'm definitely mistaking languages features.

Sep 16 '22 17:09 leiserfg

Just collected some TODOs in the initial PR message (let me know if I'm missing something).

What other logic operators do you think are usefull? (xnor since it is simple equality of course, but what else?) Remember that we need an operator to be overloaded with it (see https://www.lua.org/manual/5.1/manual.html#2.8 for what is possible)

What parameters for invalidation do you think are useful? (I still have to read about things like change ticks as my vim-lib background is pretty weak)

Should we make the changes to the luaSnip core (add additional parameters to condition function) in another PR?

EDIT: And we can already think of some useful conditions besides end_of_line and start_of_line (I personally don't use other ones tbh)

Sep 19 '22 19:09 atticus-sullivan

I'm using == for xnor. What do you think of ^ as the xor operator (inspired by C)?

Sep 19 '22 19:09 atticus-sullivan

All the _ are a bit annoying in the invalidate function, maybe we should discuss what of the possible arguments really are useful for that (on the other side it's always the question if we want to constrain the arguments any make it less flexible)

Sep 19 '22 19:09 atticus-sullivan

And I just though about invalidation conditions like same line. Not quite sure if it's done by tracking the number of the line. In most cases one would have some condition like line_start. Just checking the line number wouldn't be enough in that case, as the condition relays on the line's content didn't change (checking if the line didn't change would be no problem (string interning -> simple pointer equality check) but storing all the strings would eat quite some memory in larger files). But maybe it's possible to "ask" vim if the lines content has changed? :shrug: (we just have to keep in mind that the invalidation check has to be really fast as this is the part which will be executed every time without memoization)

Sep 19 '22 20:09 atticus-sullivan

What other logic operators do you think are usefull? (xnor since it is simple equality of course, but what else?) Remember that we need an operator to be overloaded with it (see https://www.lua.org/manual/5.1/manual.html#2.8 for what is possible)

Can't think of any others :/ But I I think and,or,xor,xnor,not should be pretty expressive already (also +1 for ^ as xor)

What parameters for invalidation do you think are useful? (I still have to read about things like change ticks as my vim-lib background is pretty weak)

Should we make the changes to the luaSnip core (add additional parameters to condition function) in another PR?

Mhmmmm would be a tad cleaner, but I won't complain if you tack it on here, it originates here after all

EDIT: And we can already think of some useful conditions besides end_of_line and start_of_line (I personally don't use other ones tbh)

I started looking into treesitter-conditions in treesitter-ideas, basically I want to add

Am I currently inside some node? (immediately or all parents)
Does my cursor-position match some query?

Those would have to be invalidated on pretty much any change, unfortunately.

Sep 19 '22 22:09 L3MON4D3

And I just though about invalidation conditions like same line. Not quite sure if it's done by tracking the number of the line. In most cases one would have some condition like line_start. Just checking the line number wouldn't be enough in that case, as the condition relays on the line's content didn't change (checking if the line didn't change would be no problem (string interning -> simple pointer equality check) but storing all the strings would eat quite some memory in larger files). But maybe it's possible to "ask" vim if the lines content has changed? shrug (we just have to keep in mind that the invalidation check has to be really fast as this is the part which will be executed every time without memoization)

line_start would also depend on the cursor-position inside the line :/ I can't really think of uses for same_line, but having the invalidation as a callback sounds very flexible (though we can only optimize predefined invalidations, so there might be a reason for accepting both)

In general, having nvim push the invalidations to us (via nvim_buf_attach) seems very wasteful, there will be many more changes than snippet-expansions, so we should definitely query changedtick (basically pull- instead of push-model).

We should really start measuring the impact of condition-evaluations before focusing this much on the memoization-options, it will most likely only be worth it if multiple snippets have the same condition and trigger (because only then will a condition be evaluated twice for a single expand-call)

For example, via

diff --git a/lua/luasnip/init.lua b/lua/luasnip/init.lua
index a79ddc5..390ba47 100644
--- a/lua/luasnip/init.lua
+++ b/lua/luasnip/init.lua
@@ -27,11 +27,15 @@ end
 -- parameters(trigger and captures). params are returned here because there's
 -- no need to recalculate them.
 local function match_snippet(line, type)
-	return snippet_collection.match_snippet(
+	require("jit.p").start("10,i1,s,m0,G", "match")
+	local match, expand_params = snippet_collection.match_snippet(
 		line,
 		util.get_snippet_filetypes(),
 		type
 	)
+	require("jit.p").stop()
+
+	return match, expand_params
 end
 
 -- ft:

(I get no samples at all, but I also don't have snippets with conditions) (It would be very useful to provide an easy-to-use way to profile the match_snippet-sections exclusively, then we could try collecting some data from users. Also important for them to decide which functions to memoize, and measure performance-impact of those decisions)

Sep 19 '22 22:09 L3MON4D3

All the _ are a bit annoying in the invalidate function, maybe we should discuss what of the possible arguments really are useful for that (on the other side it's always the question if we want to constrain the arguments any make it less flexible)

I'd say pass them in a table, that would at least be comfortable. (maybe create one table per condition and reuse it (don't create a table each call), but I'd say we measure first)

Sep 19 '22 22:09 L3MON4D3

I started looking into treesitter-conditions in treesitter-ideas, basically I want to add

Can't find treesitter-ideas, is this a github repo?

Sep 19 '22 23:09 atticus-sullivan

so we should definitely query changedtick (basically pull- instead of push-model).

:+1:

having the invalidation as a callback sounds very flexible (though we can only optimize predefined invalidations, so there might be a reason for accepting both)

I agree, why close this interface for the user (providing custom invalidate functions). Not quite sure what you want to optimize tough.

We should really start measuring the impact of condition-evaluations

Do you refer to switching trigger checking and condition checking?

we could try collecting some data from users

I guess we should make a testing branch which every user can checkout if they want to provide us with data. Any ideas on how to get to the users? (making an announcement) But I'm not sure how much the condition option is even used (I for instance only use it for checking for beginning of line to restrict some automatically triggered snippets to these occurrences). Without users using conditions we won't get a good result of this profiling.

Sep 19 '22 23:09 atticus-sullivan

Having all this discussion on memoization/invalidation, maybe we should outsource the implementation of condition objects and the logic operators to another (new) PR and handle memoization separately (still in this PR). What do you think?

Sep 19 '22 23:09 atticus-sullivan

I'd say pass them in a table, that would at least be comfortable. (maybe create one table per condition and reuse it (don't create a table each call), but I'd say we measure first)

No idea how you want to share a table with is simply used to pass arguments :shrug:, but that's for later

Sep 20 '22 00:09 atticus-sullivan

Just not sure about the vim stuff. But do you think there is any way a condition could change without setting another changedtick? (thinking about using this as the smallest unit of measure for some sort of invalidation caching)

But indeed we should run some benchmarks to check if memoization even has the potential to give some performance benefit (tbh I doubt it based on our current discussion, memoization sounds great but I don't think it's worth it. Maybe it's just a think to keep in the back of our mind to come back if someday this becomes an issue).

Sep 20 '22 00:09 atticus-sullivan

I started looking into treesitter-conditions in treesitter-ideas, basically I want to add

Can't find treesitter-ideas, is this a github repo?

Ah, my bad, it's a branch that I thought I'd pushed already :sweat_smile:

Sep 20 '22 05:09 L3MON4D3

I agree, why close this interface for the user (providing custom invalidate functions). Not quite sure what you want to optimize tough.

We could invalidate on "buffer" by actually clearing the value of conditions instead of checking the condition-function (otoh, this might also clean to often, maybe there won't be any expansions in the new buffere, and then we'd cleaned it for naught). But this is the only one where I can actually think of an optimization

What do you think?

Yeah, that makes sense, the logic-operators are pretty much complete, so no reason to wait with merging them. Good call :+1:

No idea how you want to share a table with is simply used to pass arguments :shrug:

:D I meant share across calls, so regularly we'd have

...
invalidate_args = {
  line = ...,
  cursor = ...
  ...
}
if invalidate(invalidate_args) then ... end

so we'd create a new table for each of those calls, but we could instead store this table in the condition-object, and just set the new values


...
self.invalidate_args.line = ...
self.invalidate_args.cursor = ...
...

if invalidate(self.invalidate_args) then ... end

Not sure if this has any performance-impact at all, I just thought of it and wanted to share :D

Just not sure about the vim stuff. But do you think there is any way a condition could change without setting another changedtick?

Well, maybe, depends on the condition. For example if filteype changes (oh, or the cursor moves ofc), I don't think that's covered by changedtick. We could handle any uncertainty here by calling that memoization just "changedtick" (otoh, "buffer" is clearer)

But indeed we should run some benchmarks to check if memoization even has the potential to give some performance benefit (tbh I doubt it based on our current discussion, memoization sounds great but I don't think it's worth it. Maybe it's just a think to keep in the back of our mind to come back if someday this becomes an issue).

Yup :/

Sep 20 '22 06:09 L3MON4D3

Not sure if this has any performance-impact at all

Me neither (basically the question is if allocate + init is faster than assigning elements)

oh, or the cursor moves ofc

Oh yes of course.

Sep 20 '22 10:09 atticus-sullivan

I'd suggest we keep this in the back of our mind, if someone reports bad performance on conditions (I see no point in doing memoization if there is no real benefit (ofc would be cool to simply have it, but for that it's just too much work I think)). We'd need user support for benchmarking anyways.

So I'd close this PR, but maybe we could add a label (maybe something like poss_performance_gain) to not forget this.

Oct 01 '22 17:10 atticus-sullivan

I'd suggest we keep this in the back of our mind, if someone reports bad performance on conditions (I see no point in doing memoization if there is no real benefit (ofc would be cool to simply have it, but for that it's just too much work I think)). We'd need user support for benchmarking anyways.

Agreed :+1:

I'd close it just in spirit, leaving it as draft seems the perfect thing to do, since we'll need to do more investigation to do this properly (or, to find out if this needs doing properly :D)

Oct 01 '22 20:10 L3MON4D3

Fine with that. Just thought properly close this might avoid cluttering the PR section. But you're also right, closed PRs might get forgetten

Oct 01 '22 21:10 atticus-sullivan