Wishlist-for-R

WISH: Multi-Line Comments

Open mikemahoney218 opened this issue 6 years ago • 17 comments

Many languages have syntax specifically for entering multi-line comments (for instance, C's /* comment */ syntax). A similar syntax in R would allow cleaner code documentation, with longer comments not needing to have each line prefixed with #.

mikemahoney218 avatar Feb 16 '20 03:02 mikemahoney218

Many people think that multi-line comments are not a great idea because it is easy to make mistakes with them.

Every modern editor supports commenting and uncommenting a block of code using single-line comments.

gaborcsardi avatar Feb 16 '20 08:02 gaborcsardi

I would prefer multi-line strings to multi-line comments.

nfultz avatar Feb 16 '20 19:02 nfultz

@nfultz multi-line strings, depending on what you mean, are already supported and have been for as long as I've been aware:

> x = "hi how
+ are you doing
+ today"
> x
[1] "hi how\nare you doing\ntoday"

The same works in script code.

As for multi-line comments, AFAIK it's as @gaborcsardi suggests. From what I know second+ hand, this was considered long ago and may even have been implemented in some form briefly in devel, but was then taken back out. My understanding is that, inasmuch as R-core has opinions as a group (it doesn't really), its opinion is, has been for a while, and is likely to stay that multi-line comments aren't desirable, making this issue likely a non-starter. That said, no one in this thread, including me, can speak for R-core, so it's not completely impossible, I guess.

gmbecker avatar Feb 16 '20 21:02 gmbecker

I was thinking more of the triple-quote strings in Python, which serve as its multi-line comments. You can't (generally) wrap a chunk of code in single (character) quotes to not run it, because it may have quotes on the inside that would have to be escaped.

nfultz avatar Feb 16 '20 21:02 nfultz

I feel that "every modern editor supports commenting and uncommenting a block of code using single-line comments" is a decent indicator that a more efficient syntax would be useful :smile:. I can understand the concern about them causing issues in code, but I personally feel like their utility outweighs the risk. I won't hold out too much hope if R-core has already decided against this idea, though.

mikemahoney218 avatar Feb 16 '20 21:02 mikemahoney218

And @nfultz, I wonder if the new r"(...)" raw string syntax could be used for something similar, though I don't know how Python treats these strings. My guess (I haven't played with the newest R devel) is that doing so in R would print out each of your comments in sequence as you run your code.

mikemahoney218 avatar Feb 16 '20 22:02 mikemahoney218
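A minimal sketch of the raw string idea raised above, assuming R 4.0.0 or later, where the r"(...)" syntax is available. The names `disabled_code` and `f` are made up for illustration. It shows that a raw string can hold text containing quotes and backslashes without escaping, and that an unassigned string inside a function body is evaluated and discarded rather than returned. At the interactive prompt, an unassigned string at top level would be auto-printed, which is the behaviour guessed at in the comment above.

```r
# Requires R >= 4.0.0 for raw strings.
# A chunk of code "disabled" by wrapping it in a raw string; the inner
# quotes and backslashes are kept literally and need no escaping.
disabled_code <- r"(
x <- "a string with quotes"
y <- 'single quotes and a \backslash'
)"

# Inside a function, a bare string is just an expression whose value is
# thrown away; only the last expression is returned.
f <- function() {
  r"(note: "quotes" and \backslashes need no escaping here)"
  42
}

is.character(disabled_code)
f()
```

Unlike a real comment, the string is still parsed and evaluated, so this is a workaround rather than a comment syntax.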

> their utility outweighs the risk

So if you can easily add multi-line comments in any editor (correctly, unlike adding a */, which might be incorrect!), then what is their utility?

The new raw string syntax is exactly what the triple quote syntax is in Python, AFAICT.

gaborcsardi avatar Feb 16 '20 22:02 gaborcsardi

> A similar syntax in R would allow cleaner code documentation, with longer comments not needing to have each line prefixed with #

Do not forget that some R users don't use colors to highlight syntax. For us, having a "#" at the start of each line is a cleaner way to comment out code, even in languages that allow multi-line comments.

karoliskoncevicius avatar Feb 16 '20 22:02 karoliskoncevicius

I don't know that I buy a correct/incorrect distinction here. It feels to me like */ would be "correct" if added to the syntax, and I've certainly made mistakes from including # in the wrong location (or outside of quotes), if that's what you mean by "might be incorrect". The utility is clearer documentation (and faster, less finicky editing when dealing with line lengths). I hadn't even considered commenting out blocks of code with this syntax when originally creating the post; that's why you delete things and commit often :smile:.

mikemahoney218 avatar Feb 16 '20 22:02 mikemahoney218

"Incorrect" means that if you comment out (say) three lines of code in the middle of a file, you need to check whether those three lines are already included in a longer comment or not. It is very easy to make mistakes this way.

Also, if you look at code in the middle of a file, it is impossible to tell if you are looking at code that was commented out, or not.

gaborcsardi avatar Feb 16 '20 22:02 gaborcsardi

@nfultz In that case I don't agree that that's the right way to put comments in code in the first place. Comments shouldn't need to be executed, in my opinion. It should be a parser distinction, which formal comments are, but just putting a huge string in the middle of your code and not assigning it to anything is not.

There is also the issue I thought @gaborcsardi was talking about, which is that if you already have /* blabla */ in your code and then try to wrap a larger chunk that includes it in a new /* */ pair, it will not actually work: the new comment's start will be terminated by the old comment's end, like so:

/* new start
x = some_code()
/* y = 5 */ XXX this is the old comment but it's where the new comment closes now!
more_code()  XXX this is not commented
*/ XXX this is an error now

It is impossible for single-line comments applied in batch to have this problem.

gmbecker avatar Feb 17 '20 01:02 gmbecker
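The nesting problem described above can be sketched in R. This is an illustration only: R has no /* */ syntax, so the sketch emulates the usual C rule (a block comment ends at the first */) with a non-greedy regular expression, and the variable names are made up. It then contrasts that with prefixing every line with #, which cannot leave code partially uncommented.

```r
# Source text using a hypothetical C-style block comment that wraps an
# older block comment.
src <- "/* new start\nx = some_code()\n/* y = 5 */\nmore_code()\n*/"

# Emulate non-nesting block comments: each comment runs from "/*" to the
# FIRST "*/" that follows. (?s) lets "." match newlines; ".*?" is non-greedy.
stripped <- gsub("(?s)/\\*.*?\\*/", "", src, perl = TRUE)

# more_code() and a stray "*/" survive: the outer comment closed early.
cat(stripped, "\n")

# Batch single-line commenting: every line gets its own "# " prefix, even
# lines that were already comments.
r_src <- c("x <- some_code()", "# y <- 5", "more_code()")
commented <- paste0("# ", r_src)

# Nothing is left for the parser to evaluate.
length(parse(text = commented))
```

Removing one "# " prefix per line restores the original lines exactly, including the line that was already a comment.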

> @nfultz In that case I don't agree that that's the right way to put comments in code in the first place. Comments shouldn't need to be executed, in my opinion. It should be a parser distinction, which formal comments are, but just putting a huge string in the middle of your code and not assigning it to anything is not.

I guess that's more Pure, but I'd rather see scripts where the author left a comment the Wrong Way instead of not writing it at all just because they didn't know / couldn't remember an Emacs shortcut.

Also, wouldn't the byte compiler optimize away unused constants anyway?

nfultz avatar Feb 17 '20 02:02 nfultz

I mean sure, probably. Except that you don't byte compile script code in general. Also I have to say, the concept of people in any real numbers just giving up on commenting their code for the lack of multi-line comments doesn't really ring true to me at all.

Also, you didn't address the other, much more important part of my (and Gabor's) comments at all, i.e., that multi-line comments can quite easily introduce errors that are impossible to commit with batch single-line comments.

gmbecker avatar Feb 17 '20 04:02 gmbecker

> left a comment the Wrong Way instead of not writing it at all just because they didn't know / couldn't remember an Emacs shortcut.

Well, when you write comments, you don't really need the editor's shortcut, because you can just type in the comment character, together with the rest of the text, no?

I think multi-line comments are only potentially useful when temporarily commenting out a chunk of code. And even then they are a pain to use, because you need to basically read the whole file for embedding or embedded comments.

> Also, wouldn't the byte compiler optimize away unused constants anyway?

No, it can't. R is a very dynamic language, and the byte compiler does not know if some other code will inspect the byte-compiled code to compute on it. To make sure that computation on the language works the same way on non-byte-compiled and byte-compiled code, you need to keep the dummy strings. Some packages use dummy strings to annotate code, e.g. https://github.com/r-lib/debugme

gaborcsardi avatar Feb 17 '20 08:02 gaborcsardi

> I mean sure, probably. Except that you don't byte compile script code in general. Also I have to say, the concept of people in any real numbers just giving up on commenting their code for the lack of multi-line comments doesn't really ring true to me at all.

True, it would probably only affect 1-2% of the non-commenters. But over a few thousand students over a few years, it could add up and save some grading time, for example. If I were making a trade, I'd prefer 100 shallow bugs that can be trivially fixed vs. one deep one where the author didn't document what they wrote.

> Well, when you write comments, you don't really need the editor's shortcut, because you can just type in the comment character, together with the rest of the text, no?

A lot of comments I've seen aren't original, they're copy pasted into the file from other sources. Email, stackoverflow, github, jira being the most common, but also articles, software licenses, etc.

> No, it can't. R is a very dynamic language, and the byte compiler does not know if some other code will inspect the byte-compiled code to compute on it. To make sure that computation on the language works the same way on non-byte-compiled and byte-compiled code, you need to keep the dummy strings. Some packages use dummy strings to annotate code, e.g. https://github.com/r-lib/debugme

Not to get too off topic, but doesn't that require the original expression tree and not the actual byte code (e.g., a vector of ints)? Or is the expression -> compile -> bytecode -> disassemble -> expression conversion actually lossless?

In DeclareDesign, we had tried to annotate functions using actual comments, but ran into issues with the compiler and the keep.source option.

nfultz avatar Feb 17 '20 16:02 nfultz

> A lot of comments I've seen aren't original, they're copy pasted into the file from other sources. Email, stackoverflow, github, jira being the most common, but also articles, software licenses, etc.

Right. So basically the use case is copying code/text from an external source, by people who cannot find the "Comment code" item in the menu. :)

> In DeclareDesign, we had tried to annotate functions using actual comments, but ran into issues with the compiler and the keep.source option.

AFAICT comments are dropped, string constants are not.

gaborcsardi avatar Feb 17 '20 17:02 gaborcsardi
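A small sketch of the "comments are dropped, string constants are not" point, using a made-up function `f`. It inspects the parsed body of a function and, as a hedged extra, checks that `body()` still exposes the string after `compiler::cmpfun()`. This only shows what `body()` reports in current R; it does not establish that retaining such constants is a documented guarantee of the byte compiler.

```r
f <- function(x) {
  # a real comment: it is not part of the parsed expression
  "a dummy string used as an annotation"
  x + 1
}

# The body is a call to `{` with two arguments: the string and x + 1.
# The comment does not appear as an element.
parts <- as.list(body(f))
length(parts)
parts[[2]]

# body() on the byte-compiled closure returns the original expression,
# so the string is still visible to code that computes on the language.
g <- compiler::cmpfun(f)
identical(body(g)[[2]], parts[[2]])
```

When `keep.source` is enabled, comments survive only in the source references attached to the function, which is separate from the expression tree shown here.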

> > In DeclareDesign, we had tried to annotate functions using actual comments, but ran into issues with the compiler and the keep.source option.

> AFAICT comments are dropped, string constants are not.

I mean, this really seems to be an undefined-behavior implementation detail to me. I wouldn't rely on constants not assigned to anything remaining available in compiled versions of functions, unless @ltierney and/or @kalibera have indicated that retaining things like that is part of the contract the compiler implements (possible, but it would surprise me).

gmbecker avatar Feb 18 '20 04:02 gmbecker