smartparens More general pairs in LaTeX

Hi,

This is a suggestion/question rather than a bug.

In LaTeX, \left( ... \right. is syntactically correct, and it is commonly used to insert an automatically sized opening parenthesis with no closing parenthesis. A \left without a matching \right is a syntax error, which is why the \right. placeholder is needed.

On the other hand, standard parentheses can also carry a size specification that is not itself paired, like \big( ... \big).

So essentially there are three kinds of pairs in that mode:

1. (...) which can be prefixed by \big, \bigg, etc. (only the paren is paired, the size command is the same on both sides)
2. \bigl(...\bigr) (both the size and the paren are paired)
3. \left ... \right, where \left should be followed by one of (, [, <, \{, | and \right by one of ), ], >, \}, | (only the "size" is paired, the paren is free-ish)

Case 2. is currently handled in smartparens, but the specification has quadratic size, going over all the pairs of sizes and all the pairs of parentheses.
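To make the quadratic growth concrete, here is a minimal sketch that enumerates every opener/closer string for the \bigl(...\bigr) case. It is written in Python purely as neutral notation and is not smartparens code. The lists `SIZES` and `DELIMS` are assumptions chosen for illustration; the lists smartparens actually ships may differ.

```python
from itertools import product

# Hypothetical size commands and delimiter pairs, for illustration only.
SIZES = ["big", "Big", "bigg", "Bigg"]
DELIMS = [("(", ")"), ("[", "]"), ("\\{", "\\}")]


def sized_pairs(sizes, delims):
    """Return one explicit (opener, closer) pair per size/delimiter combination."""
    pairs = []
    for size, (left, right) in product(sizes, delims):
        opener = "\\" + size + "l" + left   # e.g. \bigl(
        closer = "\\" + size + "r" + right  # e.g. \bigr)
        pairs.append((opener, closer))
    return pairs


pairs = sized_pairs(SIZES, DELIMS)
# The explicit specification has len(SIZES) * len(DELIMS) entries,
# while the two input lists together have only len(SIZES) + len(DELIMS).
print(len(pairs), len(SIZES) + len(DELIMS))
```

The compact description is the two short lists; the explicit one is their product, which is what grows quickly as sizes or delimiters are added.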

Case 1. is not currently handled by smartparens; adding it would require another quadratic size specification. (Note: these forms are apparently not considered good-practice LaTeX, so they were removed: https://github.com/Fuco1/smartparens/issues/184. But a lot of people use them regardless.)

Case 3. is only handled in the case of paired parens.

Is there currently a good way, in smartparens, to specify these kinds of pairs (in a more compact way for 2.), or should we play with :pre-handler and :post-handler to achieve it, or is it hopeless?

Besides a more complete support of the language, my hope (in relation to https://github.com/Fuco1/smartparens/issues/1013 ) is that a more compact description would also allow smartparens to process those pairs faster.

Mar 13 '20 15:03 ThibautVerron

Internally a large part of the search is handled by regexps, so 10 pairs of the form blaX, blaY, blaZ, ... will create a pretty efficient regexp, something like bla[XYZ]. The current "redundant" definition is there only to make the open-close pairing explicit.
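As a rough illustration of why many similar delimiters need not mean a slow regexp, the sketch below collapses words that differ only in their last character into a single character class. It is written in Python as neutral notation; `factor_common_prefix` is a hypothetical helper, and it is not how smartparens builds its regexps.

```python
import re


def factor_common_prefix(words):
    """Collapse a non-empty list of words that differ only in their last
    character into prefix[chars]; otherwise fall back to an alternation."""
    prefix = words[0][:-1]
    same_shape = all(
        len(w) == len(prefix) + 1 and w.startswith(prefix) for w in words
    )
    if not same_shape:
        return "|".join(re.escape(w) for w in words)
    chars = "".join(re.escape(w[-1]) for w in words)
    return re.escape(prefix) + "[" + chars + "]"


words = ["blaX", "blaY", "blaZ"]
pattern = factor_common_prefix(words)
print(pattern)
# The factored pattern accepts exactly the original words.
print(all(re.fullmatch(pattern, w) for w in words))
```

The explicit list of words stays readable, while the matcher only has to test one prefix followed by one character class.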

I've played with the idea of making the pair definitions themselves regexps but this introduces a ton of problems with escaping and combining them in efficient ways with other pairs (where we generate the regexps).

I think the easiest way would be to define them with a macro / loop and let smartparens internally use whatever amount of pairs is necessary.

Mar 21 '20 18:03 Fuco1

I had a look at the code when trying to investigate #1013, and it really looks like having more pairs harms efficiency. The generated regexps are pretty efficient for inspecting the text, but it seems that a lot of the machinery after that involves looping over the list of pairs, e.g. to check the actions. Of course I can't claim to have as much understanding of the package as you do.

The problem with the current macros is that the same string would be opener or closer for different pairs: it looks like \left[ ... \right. and \left(...\right. (same closer) can coexist without a problem, but \left( ... \right. and \left( ... \right) (same opener) can't.
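The asymmetry between shared closers and shared openers can be pictured with a toy model in which pairs are stored in a table keyed by the opener. This is an assumed model for illustration only, written in Python; the thread does not describe how smartparens actually stores pairs or what happens on a clash.

```python
def register(registry, opener, closer):
    """Toy registry keyed by opener: a later pair with the same opener
    replaces the earlier one."""
    registry[opener] = closer
    return registry


registry = {}
register(registry, "\\left[", "\\right.")
register(registry, "\\left(", "\\right.")  # same closer, new opener: both kept
register(registry, "\\left(", "\\right)")  # same opener: replaces the entry above

for opener, closer in registry.items():
    print(opener, "...", closer)
```

In this model two pairs survive: the closer \right. may be shared freely, but only one closer can be attached to \left(.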

> I've played with the idea of making the pair definitions themselves regexps but this introduces a ton of problems with escaping and combining them in efficient ways with other pairs (where we generate the regexps).

Yes, I agree. But maybe something ad hoc, in the spirit of how XML tags are handled, could work here?

Mar 24 '20 14:03 ThibautVerron

Personally I would consider the looping impact negligible. When I profiled SP a couple times it never came up as a bottleneck.

So unless there is some other reason I would go ahead and assume it would have no impact. In dash.el we did some benchmarks with for example looping vs hash tables and looping was faster until about 200 items (of course depends on CPU etc). But the total time spent there is something like 2 microseconds -> 1 microsecond if you use more efficient lookup. So I wouldn't bother with performance at first.

The point about the openers is valid. And frankly, it's a very arbitrary reason SP requires the openers to be unique (to make the pair management simple). But it could probably be updated, not with a ton of effort, to support non-unique openers as well (but still unique open-close pairs would be required).
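A minimal sketch of what the relaxed rule could look like, in Python as neutral notation: openers may repeat, but each (opener, closer) combination may be registered only once. `PairTable` and its methods are hypothetical names for illustration and do not correspond to anything in smartparens.

```python
from collections import defaultdict


class PairTable:
    """Hypothetical pair table: openers may repeat, (opener, closer) pairs may not."""

    def __init__(self):
        self._closers = defaultdict(list)

    def add(self, opener, closer):
        """Register a pair, rejecting an exact duplicate."""
        if closer in self._closers[opener]:
            raise ValueError("pair already defined: " + opener + " " + closer)
        self._closers[opener].append(closer)

    def closers_for(self, opener):
        """Return all closers registered for this opener, in insertion order."""
        return list(self._closers.get(opener, []))


table = PairTable()
table.add("\\left(", "\\right)")
table.add("\\left(", "\\right.")  # same opener, different closer: accepted
print(table.closers_for("\\left("))
```

Once an opener maps to several closers, something still has to decide which one is inserted automatically; this sketch only keeps them in registration order and leaves that choice open.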

Mar 24 '20 20:03 Fuco1

I see, thanks for the explanation. But it certainly depends on what the loop does. For example, calls to texmathp are not particularly cheap.

Anyway, regardless of which structure is faster (and I certainly believe that 200-element lists are short enough that a direct list implementation is faster than fancy structures), a short list should always be faster than a long one.

In any case, maybe you prefer that we continue this discussion in #1013 or in another issue about performance?

> unique open-close pairs would be required

That's a natural expectation.

> But it could probably be updated, not with a ton of effort, to support non-unique openers as well

That sounds reasonable, maybe even something I could try to do myself if you don't have time.

But it would also require some thought about the UI, regarding auto-insertion and wrapping.

Mar 25 '20 06:03 ThibautVerron