explain-pause-mode icon indicating copy to clipboard operation
explain-pause-mode copied to clipboard

Why does `xterm--report-background-handler` and `version-handler` take 2 seconds to return on at least iterm2 and alacritty? Investigate emacs-bug. (was Doesn't detect pause when starting "emacsclient -nw")

Open nolanl opened this issue 5 years ago • 17 comments

I'm trying to track down a 2 second pause where running "emacsclient -nw SOMEFILE" shows emacs immediately (either on the scratch buffer or whatever buffer I was last interacting with in another window/tty) before it shows SOMEFILE.

I start explain-pause-mode in init.el at "emacs --daemon" start time. I tried to open a file 6 times in a row, but explain-pause-top just showed some normal ~70ms GC pauses, nothing close to 2000ms.

Dunno if this use-case is intended or even possible...

nolanl avatar Jul 04 '20 22:07 nolanl

Yes, that is the intended use case. But not all the file IO code or the buffer init code OR the xterm init code is captured yet. Also, there’s a lot of native frames that are currently hidden because I don’t have a good way to display them, and probably your 2 secs might be found there IF it’s covered.

Undocumented feature exists where you can grab the event steam live and that might help.

Do

(explain-pause-log-to-socket "./some-socket-file-you-know-where-it-is")
(explain-pause-mode t) ;; turn the mode on

Before doing that, somewhere else in bash, run

socat -u UNIX-RECV:./that-file-you-named-earlier STDOUT

This will print a live stream of every command that your Emacs is running, even before executing the command.

If you can’t find the time there, it must be somewhere where I haven’t captured yet. I too get slow file opens sometimes and I really want this use case to work so I’m happy to investigate w you if you’re game!

lastquestion avatar Jul 04 '20 22:07 lastquestion

Ran "socat -u UNIX-RECV:/tmp/sock STDOUT | ts | tee ~/slow.log" in one terminal, and "emacsclient -e '(explain-pause-log-to-socket "/tmp/sock")'; e customize.el ; killall socat", then did a C-x C-c as fast as I could once the file showed up.

Jul 04 16:06:42 ("enter" "server-sentinel" "server-process-filter" "process-sentinel" nil 0 nil nil nil 3)
Jul 04 16:06:42 ("exit" "server-sentinel" "process-sentinel" 0 nil)
Jul 04 16:06:42 ("exit" "server-process-filter" "process-filter" 2 nil)
Jul 04 16:06:42 ("enter" "server-sentinel" "root-emacs" "process-sentinel" nil 0 nil nil nil 3)
Jul 04 16:06:42 ("exit" "server-sentinel" "process-sentinel" 0 nil)
Jul 04 16:06:42 ("enter" "server-process-filter" "root-emacs" "process-filter" nil 0 nil nil nil 3)
Jul 04 16:06:42 ("enter" "#<subr read-event>" "server-process-filter" "server-process-filter" t 0 nil nil nil 4)
Jul 04 16:06:43 ("enter" "auto-revert-buffers" "#<subr read-event>" "timer" nil 0 nil nil nil 2)
Jul 04 16:06:43 ("exit" "auto-revert-buffers" "timer" 0 nil)
Jul 04 16:06:43 ("enter" "undo-auto--boundary-timer" "#<subr read-event>" "timer" nil 0 nil nil nil 2)
Jul 04 16:06:43 ("exit" "undo-auto--boundary-timer" "timer" 0 nil)
Jul 04 16:06:44 ("exit" "#<subr read-event>" "server-process-filter" 1999 nil)
Jul 04 16:06:44 ("exit" "server-process-filter" "process-filter" 88 nil)
Jul 04 16:06:44 ("enter" "<bytecode> (references: (eldoc-mode global-eldoc-mode eldoc--supported-p eldoc-print-current-symbol-info))" "root-emacs" "idle-timer" nil 0 nil nil nil 5)
Jul 04 16:06:44 ("exit" "<bytecode> (references: (eldoc-mode global-eldoc-mode eldoc--supported-p eldoc-print-current-symbol-info))" "idle-timer" 0 nil)
Jul 04 16:06:44 ("enter" "show-paren-function" "root-emacs" "idle-timer" nil 0 nil nil nil 2)
Jul 04 16:06:44 ("exit" "show-paren-function" "idle-timer" 0 nil)
Jul 04 16:06:44 ("enter" "save-buffers-kill-terminal" "root-emacs" "root-emacs" nil 0 nil nil nil 1)
Jul 04 16:06:44 ("enter" "server-sentinel" "save-buffers-kill-terminal" "process-sentinel" nil 0 nil nil nil 3)
Jul 04 16:06:44 ("exit" "server-sentinel" "process-sentinel" 0 nil)
Jul 04 16:06:44 ("exit" "save-buffers-kill-terminal" "root-emacs" 0 nil)

Nothing jumps out at me.

nolanl avatar Jul 04 '20 23:07 nolanl

Check it out, you spent two seconds in read-event:

Jul 04 16:06:44 ("exit" "#<subr read-event>" "server-process-filter" 1999 nil)

1999ms exactly, that's the timeout firing.

read-event documentation:

(read-event &optional PROMPT INHERIT-INPUT-METHOD SECONDS) Read an event object from the input stream. If the optional argument PROMPT is non-nil, display that as a prompt. If the optional argument INHERIT-INPUT-METHOD is non-nil and some input method is turned on in the current buffer, that input method is used for reading a character. If the optional argument SECONDS is non-nil, it should be a number specifying the maximum number of seconds to wait for input. If no input arrives in that time, return nil. SECONDS may be a floating-point value.

Conclusion: somewhere in someone's code, it's literally waiting for some kind of input for 2 seconds. It might be trying to read keyboard input, or it might be trying to read mouse input, or maybe - and this is my suspicion - it is the read-event that is in xterm.el.

That happens here-ish, at least in emacs HEAD: https://github.com/emacs-mirror/emacs/blob/master/lisp/term/xterm.el#L764

Check out that the default value of xterm-query-timeout is 2 seconds exactly, so I suspect it's here. Before we dig further, you need to tell me which emacs version you're running 😀

But certainly, for this package, there needs to be some way to see "native frames" somehow, and especially "native frames that timeout". I'm not sure yet how to show that :( I suspect I need to implement #48 or something similar so I can generate a tree view in top before I can show native frames.

But I should also productionalize this secret feature (which I intend to do) by providing a way for a separate emacs process to read it and highlight things and stuff. You can see I intended to do that since it spits out elisp that is parseable.

lastquestion avatar Jul 04 '20 23:07 lastquestion

GNU Emacs 26.1 from debian buster: 1:26.1+1-3.2+deb10u1

nolanl avatar Jul 04 '20 23:07 nolanl

This commit in 27+ decreases the xterm wait https://github.com/emacs-mirror/emacs/commit/c67befd662571a822f3183c114ce3b1ad02e0ed9

Before 27, there was a hardcoded wait of 2 seconds, and it was not controllable.

Suggestion:

  1. Try 27+, and at least you have configurable values for the xterm init timeout
  2. What terminal driver are you running? From initial browsing really fast, I can see it's doing fancy stuff trying to ask how many colors it supports and term caps. Also check what your TERM is and what it's advertising. Run infocmp

lastquestion avatar Jul 04 '20 23:07 lastquestion

  1. Thanks, I'll try that and see if changing it helps.
  2. Alacritty
$ echo $TERM
xterm-256color
$ infocmp
#	Reconstructed via infocmp from file: /lib/terminfo/x/xterm-256color
xterm-256color|xterm with 256 colors,
	am, bce, ccc, km, mc5i, mir, msgr, npc, xenl,
	colors#0x100, cols#80, it#8, lines#24, pairs#0x10000,
	acsc=``aaffggiijjkkllmmnnooppqqrrssttuuvvwwxxyyzz{{||}}~~,
	bel=^G, blink=\E[5m, bold=\E[1m, cbt=\E[Z, civis=\E[?25l,
	clear=\E[H\E[2J, cnorm=\E[?12l\E[?25h, cr=\r,
	csr=\E[%i%p1%d;%p2%dr, cub=\E[%p1%dD, cub1=^H,
	cud=\E[%p1%dB, cud1=\n, cuf=\E[%p1%dC, cuf1=\E[C,
	cup=\E[%i%p1%d;%p2%dH, cuu=\E[%p1%dA, cuu1=\E[A,
	cvvis=\E[?12;25h, dch=\E[%p1%dP, dch1=\E[P, dim=\E[2m,
	dl=\E[%p1%dM, dl1=\E[M, ech=\E[%p1%dX, ed=\E[J, el=\E[K,
	el1=\E[1K, flash=\E[?5h$<100/>\E[?5l, home=\E[H,
	hpa=\E[%i%p1%dG, ht=^I, hts=\EH, ich=\E[%p1%d@,
	il=\E[%p1%dL, il1=\E[L, ind=\n, indn=\E[%p1%dS,
	initc=\E]4;%p1%d;rgb\:%p2%{255}%*%{1000}%/%2.2X/%p3%{255}%*%{1000}%/%2.2X/%p4%{255}%*%{1000}%/%2.2X\E\\,
	invis=\E[8m, is2=\E[!p\E[?3;4l\E[4l\E>, kDC=\E[3;2~,
	kEND=\E[1;2F, kHOM=\E[1;2H, kIC=\E[2;2~, kLFT=\E[1;2D,
	kNXT=\E[6;2~, kPRV=\E[5;2~, kRIT=\E[1;2C, kb2=\EOE, kbs=^?,
	kcbt=\E[Z, kcub1=\EOD, kcud1=\EOB, kcuf1=\EOC, kcuu1=\EOA,
	kdch1=\E[3~, kend=\EOF, kent=\EOM, kf1=\EOP, kf10=\E[21~,
	kf11=\E[23~, kf12=\E[24~, kf13=\E[1;2P, kf14=\E[1;2Q,
	kf15=\E[1;2R, kf16=\E[1;2S, kf17=\E[15;2~, kf18=\E[17;2~,
	kf19=\E[18;2~, kf2=\EOQ, kf20=\E[19;2~, kf21=\E[20;2~,
	kf22=\E[21;2~, kf23=\E[23;2~, kf24=\E[24;2~,
	kf25=\E[1;5P, kf26=\E[1;5Q, kf27=\E[1;5R, kf28=\E[1;5S,
	kf29=\E[15;5~, kf3=\EOR, kf30=\E[17;5~, kf31=\E[18;5~,
	kf32=\E[19;5~, kf33=\E[20;5~, kf34=\E[21;5~,
	kf35=\E[23;5~, kf36=\E[24;5~, kf37=\E[1;6P, kf38=\E[1;6Q,
	kf39=\E[1;6R, kf4=\EOS, kf40=\E[1;6S, kf41=\E[15;6~,
	kf42=\E[17;6~, kf43=\E[18;6~, kf44=\E[19;6~,
	kf45=\E[20;6~, kf46=\E[21;6~, kf47=\E[23;6~,
	kf48=\E[24;6~, kf49=\E[1;3P, kf5=\E[15~, kf50=\E[1;3Q,
	kf51=\E[1;3R, kf52=\E[1;3S, kf53=\E[15;3~, kf54=\E[17;3~,
	kf55=\E[18;3~, kf56=\E[19;3~, kf57=\E[20;3~,
	kf58=\E[21;3~, kf59=\E[23;3~, kf6=\E[17~, kf60=\E[24;3~,
	kf61=\E[1;4P, kf62=\E[1;4Q, kf63=\E[1;4R, kf7=\E[18~,
	kf8=\E[19~, kf9=\E[20~, khome=\EOH, kich1=\E[2~,
	kind=\E[1;2B, kmous=\E[M, knp=\E[6~, kpp=\E[5~,
	kri=\E[1;2A, mc0=\E[i, mc4=\E[4i, mc5=\E[5i, meml=\El,
	memu=\Em, oc=\E]104\007, op=\E[39;49m, rc=\E8, rev=\E[7m,
	ri=\EM, rin=\E[%p1%dT, ritm=\E[23m, rmacs=\E(B,
	rmam=\E[?7l, rmcup=\E[?1049l\E[23;0;0t, rmir=\E[4l,
	rmkx=\E[?1l\E>, rmm=\E[?1034l, rmso=\E[27m, rmul=\E[24m,
	rs1=\Ec\E]104\007, rs2=\E[!p\E[?3;4l\E[4l\E>, sc=\E7,
	setab=\E[%?%p1%{8}%<%t4%p1%d%e%p1%{16}%<%t10%p1%{8}%-%d%e48;5;%p1%d%;m,
	setaf=\E[%?%p1%{8}%<%t3%p1%d%e%p1%{16}%<%t9%p1%{8}%-%d%e38;5;%p1%d%;m,
	sgr=%?%p9%t\E(0%e\E(B%;\E[0%?%p6%t;1%;%?%p5%t;2%;%?%p2%t;4%;%?%p1%p3%|%t;7%;%?%p4%t;5%;%?%p7%t;8%;m,
	sgr0=\E(B\E[m, sitm=\E[3m, smacs=\E(0, smam=\E[?7h,
	smcup=\E[?1049h\E[22;0;0t, smir=\E[4h, smkx=\E[?1h\E=,
	smm=\E[?1034h, smso=\E[7m, smul=\E[4m, tbc=\E[3g,
	u6=\E[%i%d;%dR, u7=\E[6n, u8=\E[?%[;0123456789]c,
	u9=\E[c, vpa=\E[%i%p1%dd,

nolanl avatar Jul 04 '20 23:07 nolanl

It's either xterm--report-background-handler or xterm--version-handler. Directly run it in *Scratch* and see which one it is. On my machine on 26.4, (xterm--report-background-handler) takes 2 seconds and xterm--version-handler is faster but still slow as molasses. This feels like possibly a bug in Emacs which that commit "hides" - or both iTerm and Alacritty have implemented the same behavior.

It's also possible emacs is querying something that no modern term implements as that code has been there for a long time.

lastquestion avatar Jul 05 '20 00:07 lastquestion

OK, I lied, both are timing out:

(require  'benchmark)
benchmark

(benchmark-elapse (xterm--report-background-handler))
2.000109
(benchmark-elapse (xterm--version-handler))
2.000795

lastquestion avatar Jul 05 '20 00:07 lastquestion

on 26.1:

(benchmark-elapse (xterm--report-background-handler))
2.001629141
(benchmark-elapse (xterm--version-handler))
2.00161142

on emacs mainline:

(benchmark-elapse (xterm--report-background-handler))
2.001123153
(benchmark-elapse (xterm--version-handler))
2.001618543

nolanl avatar Jul 05 '20 00:07 nolanl

xterm is very quick.

nolanl avatar Jul 05 '20 00:07 nolanl

That's crazy. I get that too. Does this PR even work?! https://github.com/emacs-mirror/emacs/commit/c67befd662571a822f3183c114ce3b1ad02e0ed9#diff-89047cdd70633a563eda4e02a5fb2e13L674

Very strange:

(require 'benchmark)
benchmark
(benchmark-elapse (xterm--report-background-handler))
2.005282

xterm-query-redisplay-timeout
0.2
xterm-query-timeout
2
(setq xterm-query-timeout 0.1)
0.1
(benchmark-elapse (xterm--report-background-handler))
0.206694

I conclude there is probably an emacs-lisp bug in that PR above. This is on 28 HEAD.

However, I think you can set both values very low and see what happens. I believe it will fix it assuming that PR even half works.

Also, I was thinking how annoying it is to recompile emacs for 28+, if you want to stay on 26 I'd honestly just define xterm--read-event-for-query to just return nil without trying to read. It's what it does anyway when it timesout.

I am very curious what exactly it is querying for, though, and I will take a look after finishing #56 ...

lastquestion avatar Jul 05 '20 00:07 lastquestion

Wait, I understand that code now. The total time it waits is the combination of both values. So the PR does work. Just set both values to 0.01. But still doesn't explain what the hell it's querying for...and why all these modern term emulators don't reply to it.

lastquestion avatar Jul 05 '20 00:07 lastquestion

(defun xterm--read-event-for-query () nil) works for me, thanks for taking the time to help me debug this, it has been annoying me for months. Weird terminal interactions were definitely not where I thought we'd end up here.

nolanl avatar Jul 05 '20 00:07 nolanl

This comment is highly suggestive: https://github.com/emacs-mirror/emacs/blob/c67befd662571a822f3183c114ce3b1ad02e0ed9/lisp/term/xterm.el#L677

nolanl avatar Jul 05 '20 00:07 nolanl

I saw that too. And this has been hitting me too, but I've been thinking that I've been doing some GC during startup and I was going to get "to it" once I did #27 and track GC pauses. LOL. Well thank you because it has been also frustrating me.

I am probably gonna see if I can figure out what the root cause is, and try to report a bug to emacs core... I suspect this is likely hitting lots and lots of people and it wouldn't count as part of init-time either. I see lots of posts on reddit saying "oh my init code was 0.1 seconds but I spent many seconds before emacs started up".

Maybe make a post in reddit about this and help the community? There are at least dozens of us still using -nw :p HAHA

lastquestion avatar Jul 05 '20 00:07 lastquestion

I'm going to open new enhancement issues for the two things I outlined so it makes it easier to find and debug these issues. They were things I wanted to do already, but it's good to get affirmation they are useful features.

Thank you for trying explain-pause out and let me know if you find other bugs! I'm going to leave this open until I get time to discover with upstream what's going on....

lastquestion avatar Jul 05 '20 01:07 lastquestion

Awesome, thanks again for the debugging help and for making explain-pause-mode!

nolanl avatar Jul 05 '20 01:07 nolanl