Fix #3921 & #2342: inline function without parentheses used in condition, inline 'else' #4838

zdenko · 2017-12-31T15:42:09Z

Fixes #3921 and #2342.
Compile fails when inline function is used in condition (control flow): if, unless, while, until, switch.

Example cases:

# [1]
if f (a) -> a is 1
  'ok'

# [2]
res = if f (a) -> a is 1 then 'ok' else 'ko'

# [3]
while do (s = i) -> a -= s
  foo(a)

# [4]
i = 1
a = switch do (m = 2) -> i * m
  when 2 then "two"
  when 1 then "one"
  else "none"

[1], [2], [3], [4]

The main cause of the issue is incorrect positions of INDENT and OUTDENT tag pair inserted in Rewriter::normalizeLines.
Example:

if f -> a
  'ok'

###
IF/if IDENTIFIER/f ->/-> INDENT/2 IDENTIFIER/a INDENT/2 STRING/'ok' OUTDENT/2 OUTDENT/2
                         ^                                                    ^
                         *                                                    *
###

The second OUTDENT should be placed between a and INDENT.
Besides that, @tagPostfixConditionals also converts IF into POST_IF tag in such cases.

At first, I’ve tried to adjust the conditions in @normalizeLines to correct the INDENT/OUTDENT position. I managed to get it work for cases where newline follows inline function, but I couldn’t solve cases with inline condition (if -> a then) which are too ambiguous.
So, at the end, I decided to add new method @addParensToConditions which wraps the function in parentheses.

The better solution would be controlling the output in grammar and nodes, but currently I don’t see it as an option, since Expression used in grammar is too broad, which means that rules for If, While and Switch nodes would have to be split into more detailed rules, e.g. for Value, Code, Assign, …

…tion

GeoffreyBooth · 2017-12-31T19:49:32Z

Why wouldn’t extending the grammar be an option? When I look at Expression, I see only this:

  Expression: [
    o 'Value'
    o 'Code'
    o 'Operation'
    o 'Assign'
    o 'If'
    o 'Try'
    o 'While'
    o 'For'
    o 'Switch'
    o 'Class'
    o 'Throw'
    o 'Yield'
  ]

Several of these don’t make much sense to allow inside an if conditional (throw, yield, maybe if, class). But even if we allowed all of them, couldn’t the grammar distinguish between items that require being wrapped in parentheses versus items that don’t? So something like:

IfBlock: [
  o 'IF SimpleConditional Block',                    -> new If $2, $3, type: $1
  o 'IfBlock ELSE IF SimpleConditional Block',       -> $1.addElse LOC(3,5) new If $4, $5, type: $3
  o 'IF ( ComplexConditional ) Block',               -> new If $3, $4, type: $1
  o 'IfBlock ELSE IF ( ComplexConditional ) Block',  -> $1.addElse LOC(3,7) new If $5, $7, type: $3
]

And then we’d copy Expression into two new blocks, SimpleConditional and ComplexConditional, based on whether the conditional needs to be wrapped in parentheses or not. Code, for example, would need to be wrapped in parentheses. I don’t think it’s a requirement that ambiguous things like inline functions need to be allowed in conditionals without being wrapped in parentheses.

GeoffreyBooth · 2017-12-31T19:52:38Z

Also I wonder if we could fix any of the other if-related bugs at the same time: #3933, #3921, #3441, #2342.

vendethiel · 2017-12-31T20:34:08Z

Fixing this in the grammar would be much better yeah.

WRT other if bugs, see satyr/coco#127 as it has quite a bit of discussion

zdenko · 2018-01-04T12:17:27Z

OK, I bite the bullet, discarded the @addParensToConditions method from the Rewriter and implement new rules in the grammar instead.
There are quite a few changes, so I'll try to explain the problem and solution.

As said before, the problem with the inline functions being used in conditional and control flow (If, While, For, Switch, ...), was an improper implicit placement of the INDENT and OUTDENT tags in the Rewriter::normalizeLines.

For example: f -> a would be rewritten to

f ->
  a
### Tokens
IDENTIFIER -> INDENT IDENTIFIER OUTDENT 
###

And the final output from the rewriter is

f (-> 
   a
)
### Tokens
IDENTIFIER CALL_START -> INDENT IDENTIFIER OUTDENT CALL_END
###

Grammar breaks this expression into

Value
 Invocation 
   Value 
     Identifier "f"
   Arguments
     ArgList
       Code
         FuncGlyph "->"
         Block "INDENT a OUTDENT"

And, in this example if f -> a then b, the code will be broken into

if f ( ->
      a
      b
   )

If
  Value
   Invocation 
     Value 
       Identifier "f"
     Arguments
       ArgList
         Code
           FuncGlyph "->"
           Block "INDENT a INDENT b OUTDENT OUTDENT"

Essentially, the problem is misinterpreted inline function (-> a) when used as an argument to function call.

The initial idea for solving this, was adjusting @normalizeLine and place INDENT and OUTDENT in proper places, or adding the new method which will wrap f -> a in braces.

The former would mean adding more condition rules when scanning tokens, the latter added just another unnecessary loop over the tokens.

By looking closer at grammar it's clear that the main reason @normalizeLine inserts INDENT after the ->, is because the grammar rule for the Code expects Block which is basically indented Body.
So, to solve this we need a grammar rule which will also accept -> Expression.
This will enable writing inline functions without braces.

if do -> a then b

if do -> 'wrong' then do -> 'right'

c if a = f -> b

for a in do -> b
  a
  
while do (a = 1) -> f a
  c   
 
while b = f -> a when a isnt 3
  foo b
 
c = switch b = f -> a
  when 2 then "two"
  when 1 then "one"
  else "none"

c = switch
  when do -> a 
      'ok'
  when do -> b
      'ko'

Besides control flow, the same issue also appears when used in ranges, slices and array definition

a = [do -> b...5]
a = arr[do -> b...5]
arr = [1, 2, 3, do -> a, 5, 6]

Note that examples above works in current master when inline function is the last element:

a = [1...do -> b]
a = arr[1...do -> b]
arr = [1, 2, 3, do -> a]

Besides new grammar rules, I also changed existing @normalizeLines in order to avoid inserting INDENT / OUTDENT when the inline function is detected with control flow tags or inside the array, range, and slice.

This PR is still WIP as there are still some cases (especially Operation) which need refinements and testing.

zdenko · 2018-01-04T12:25:22Z

@GeoffreyBooth
#3933 and #3441 are not part of this PR. I'll have a PR for the #3441 ready soon.
However, #2342 might be fixed once I finished all changes in grammar.

vendethiel · 2018-01-04T12:53:51Z

Would I be right saying this could potentially fix a -> b, c to mean a (-> b), c then? As per the previously linked ticket.

zdenko · 2018-01-04T13:23:25Z

@vendethiel which ticket exactly? In your example a -> b, c is changed in Rewriter::addImplicitBracesAndParens, which I also intend to refactor later.

I just did a quick test locally (not committed yet) and a -> b, c compiles to

a(function() {
  return b;
}, c);

and this case f a -> b, c to

f(a(function() {
  return b;
}, c));

Is this expected?

vendethiel · 2018-01-04T13:27:35Z

This ticket: satyr/coco#127

I'd say the compilation is correct. It's "unexpected", because we currently parse it as a (-> b, c) which is an error.

The question I have then is: did this change the meaning of these cases:

if true then if false
   true

Does this code now compile?

switch
when a then if b then c else d

vendethiel · 2018-01-04T13:31:56Z

I know, it's a bit broader than what the ticket attempts to do, but it's one big can of worms..

#2342

zdenko · 2018-01-04T16:10:11Z

Yeah, I was looking at those also and I think this PR could also cover such cases.

GeoffreyBooth · 2018-01-07T06:54:35Z

test/control_flow.coffee

@@ -485,6 +485,293 @@ test "#4267: lots of for-loops in the same scope", ->
  """
  ok CoffeeScript.eval(code)

+# Test for issue #2342: Lexer: Inline `else` binds to wrong `if`/`switch`
+test "issue 2343: if / then / if / then / else", ->


This is very minor, but elsewhere in the tests when we reference an issue we do it as test "#2343: ....

zdenko · 2018-01-07T07:37:38Z

The last commit includes the fix for #2342.

 # if a then if b then c else d

if (a) {
  if (b) {
    c;
  } else {
    d;
  }
}

# switch e
#   when f then if g then h else i

switch (e) {
  case f:
    if (g) {
      h;
    } else {
      i;
    }
}

I've tried to fix this in grammar and nodes by having a separate set of rules for inline if: if a then b else c. The main idea was (is) to keep then, i.e. not replacing it with INDENT/OUTDENT.
It worked to some extent, but at the end, I was walking in circles of ambiguous cases.
From my understanding, the grammar solution would require having rules for IF Block and IF Line, which ultimately means a major reworking of the Expression.
So, at least for now, I solved this by adjusting Rewriter::normalizeLines where INDENT/OUTDENT are inserted.

GeoffreyBooth · 2018-01-07T07:49:34Z

The more I learn about this codebase, the more I feel like rewriter.coffee really never should’ve existed in the first place; that it’s just a long series of hacks to get around not achieving what we want in the grammar. Alas, it’s more than a little late now. I think that’s what CoffeeScriptRedux was aiming to fix.

Re if a then if b then c else d, why is this not ambiguous? Like, how would I know that the else goes with if b and not if a? I surmise that you’re introducing a rule that else always binds with the closest prior then, but will this potentially break old code that expected otherwise?

Inve1951 · 2018-01-07T08:31:43Z

binding to closest if is how other languages handle dangling else.
i'd go as far as calling it the expected and most logical behavior.

in cases where you want if a then (if b then c) else d you can use parens or use multiple lines and indent appropriately. imo that's sufficient.

another possibility could be using a semicolon: if a then if b then c; else d
i don't think that's a good idea tho as it's easy to miss and harder to grasp the meaning.

GeoffreyBooth · 2018-01-07T08:41:14Z

I think the behavior proposed here (binding to closest then) is the best option; second-best would be throwing an error that it’s ambiguous. I’m just asking adopting this rule now is a breaking change.

zdenko · 2018-01-07T11:43:47Z

Binding else to closest then is the way to go IMHO.
Note that this PR fixes cases like if a then if b then d else c, but we should also take care of this case if a then if b then c else d else e.
I would expect this output

if (a) {
  if (b) {
    return c
  } else {
    return d
  }
} else {
  return e
}

The more I learn about this codebase, the more I feel like rewriter.coffee really never should’ve existed in the first place

Count me in. I'm working on improved grammar (not part of this PR), and I think it will be possible to get rid of most of the current logic in the rewriter.

Inve1951 · 2018-01-07T20:48:48Z

i assume you flipped c and d by accident.
besides that i agree with the output.

GeoffreyBooth · 2018-01-07T21:27:57Z

Okay, but is this a breaking change?

zdenko · 2018-01-07T22:05:17Z

@Inve1951 yes, c and d were flipped. I corrected the comment.
@GeoffreyBooth I would say this is a breaking change.
Any existing code with if false then if false then 1 else 2 that expects the result to be 2 will now get undefined.

vendethiel · 2018-01-07T22:29:48Z

The more I learn about this codebase, the more I feel like rewriter.coffee really never should’ve existed in the first place; that it’s just a long series of hacks to get around not achieving what we want in the grammar. Alas, it’s more than a little late now. I think that’s what CoffeeScriptRedux was aiming to fix.

Yes, pretty much everyone agrees on this. Do note, tho, that CSR has some minor rewriting (notably regexes).

Re if a then if b then c else d, why is this not ambiguous? Like, how would I know that the else goes with if b and not if a? I surmise that you’re introducing a rule that else always binds with the closest prior then, but will this potentially break old code that expected otherwise?

It's not ambiguous, but i'm afraid this PR changing its meaning (tho for the best) is a huge breaking change.

GeoffreyBooth · 2018-01-07T22:54:48Z

I’m okay with a breaking change that fixes a bug, like when we changed chaining in 1.12.7 to go from the start of a line rather than the last function in the line. But when we make such a change, we need to be sure that the prior behavior is a bug, and clearly seems to be such. I think this qualifies; what do you all think?

I would disagree that this is a “huge” breaking change; to me, relying on multiple thens in the same line smacks of ambiguity, so I would hope it wouldn’t be common (or that people would add parentheses to make their intentions clear). It’s only one data point, but we didn’t need to change anything across the CoffeeScript codebase as a result of this change, did we?

GeoffreyBooth · 2018-01-16T05:25:21Z

Survey of the crowd (including @jashkenas, @lydell, @connec, others not already on this thread):

A bunch of bugfix PRs have been merged in recently, some of them substantial changes, so I’m thinking the next release should be 2.2.0. That should give people enough of a hint that it’s not just a small patch. In addition, this PR is technically a breaking change: else previously bound to the earliest then but now binds to the latest then, which makes much more sense. So code like these are changing:

if no then if yes then alert 1 else alert 2

Old	New
if (false) { if (true) { alert(1); } } else { alert(2); }	if (false) { if (true) { alert(1); } else { alert(2); } }
Alerted 2	Nothing alerted

switch 1
  when 1 then if yes then alert 1 else alert 2

Old	New
switch (1) { case 1: if (false) { alert(1); } break; default: alert(2); }	switch (1) { case 1: if (false) { alert(1); } else { alert(2); } }
Nothing alerted	Alerts 2

As you can see in the output, in the first example the else binds to the first if instead of the second. The latter makes more sense. I don’t think the docs discuss what to expect in a case like this; if anything, I would’ve expected the compiler to throw an error complaining about ambiguity.

In the second example, the else previously bound to the when (creating a default: block) whereas now it binds to the if. Again, I think the new behavior is more what users would expect.

So the question is, what do we think of this change?

It’s tiny. Slip it into 2.2.0.
It’s big enough to be worth splitting into its own release; make 2.2.1 just this PR.
It’s big enough to be worth splitting into its own release, and it’s not a patch; make 2.3.0 just this PR.
It’s drastic! Wait for 3.0.0!

I’m inclined for option 2, releasing 2.2.1 as just this PR. This change feels to me a lot like when we fixed chaining between 1.12.6 and 1.12.7; technically it was a breaking change, but it corrected a bug. Though at least in that case the intended behavior was in the docs; this is changing undocumented behavior. On the other hand, 2.x is a lot less settled than 1.12.6 was, so changing undocumented behavior at this point should bother a lot fewer people.

zdenko · 2018-01-19T18:21:38Z

The last commit adds support for nested else in a single line.

// if a then if b then c else d  else e

if (a) {
  if (b) {
    c;
  } else {
    d;
  }
} else {
  e;
}

// if a then if b then if c then d else e else f

if (a) {
  if (b) {
    if (c) {
      d;
    } else {
      e;
    }
  } else {
    f;
  }
}

// if a then if b then if c then d else if e then f else if g then h else i else j else k

if (a) {
  if (b) {
    if (c) {
      d;
    } else if (e) {
      f;
    } else if (g) {
      h;
    } else {
      i;
    }
  } else {
    j;
  }
} else {
  k;
}

GeoffreyBooth · 2018-01-28T10:05:11Z

I’m thinking of just merging this in and releasing a big 2.2.0. Any objections? The minor (as opposed to batch) version bump should tell people that there might be meaningful changes in the release, as there very well could be for some people.

@zdenko This looks good to me. Anything else you have planned for this PR, or is it ready? And any other PRs you’re working on that should be slipped into 2.2.0?

zdenko · 2018-01-28T18:32:59Z

@GeoffreyBooth I think it's ready and can be merged. I'm still working on improved grammar which should take care of then and else (and some other) tags without Rewriter, but it's not yet ready for PR.
I have also other PRs with enhancements and new features proposals on the way, but those should go in the next release.

GeoffreyBooth · 2018-01-29T01:52:07Z

Great, thanks for all your hard work!

fix jashkenas#3921: inline function without parentheses used in condi…

e55f4a3

…tion

add grammar rules

bcf9538

fix issue jashkenas#2343

9db587d

zdenko changed the title ~~Fix #3921: inline function without parentheses used in condition~~ Fix #3921 & #2342: inline function without parentheses used in condition, inline 'else' Jan 7, 2018

GeoffreyBooth reviewed Jan 7, 2018

View reviewed changes

typos

ea6f46a

multiple 'else' in line

e9c4cb6

zdenko force-pushed the issue3921 branch from a3e57b8 to e9c4cb6 Compare January 19, 2018 14:41

close 'else if'

13d02e9

merge with master

8b80fb6

GeoffreyBooth merged commit c804087 into jashkenas:master Jan 29, 2018

GeoffreyBooth mentioned this pull request Feb 1, 2018

2.2.0 #4873

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Fix #3921 & #2342: inline function without parentheses used in condition, inline 'else' #4838

Fix #3921 & #2342: inline function without parentheses used in condition, inline 'else' #4838

zdenko commented Dec 31, 2017 •

edited by GeoffreyBooth

Loading

GeoffreyBooth commented Dec 31, 2017

GeoffreyBooth commented Dec 31, 2017

vendethiel commented Dec 31, 2017

zdenko commented Jan 4, 2018

zdenko commented Jan 4, 2018

vendethiel commented Jan 4, 2018

zdenko commented Jan 4, 2018

vendethiel commented Jan 4, 2018

vendethiel commented Jan 4, 2018

zdenko commented Jan 4, 2018

GeoffreyBooth Jan 7, 2018

zdenko commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

Inve1951 commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

zdenko commented Jan 7, 2018 •

edited

Loading

Inve1951 commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

zdenko commented Jan 7, 2018

vendethiel commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

GeoffreyBooth commented Jan 16, 2018

zdenko commented Jan 19, 2018 •

edited

Loading

GeoffreyBooth commented Jan 28, 2018

zdenko commented Jan 28, 2018

GeoffreyBooth commented Jan 29, 2018

Fix #3921 & #2342: inline function without parentheses used in condition, inline 'else' #4838

Fix #3921 & #2342: inline function without parentheses used in condition, inline 'else' #4838

Conversation

zdenko commented Dec 31, 2017 • edited by GeoffreyBooth Loading

GeoffreyBooth commented Dec 31, 2017

GeoffreyBooth commented Dec 31, 2017

vendethiel commented Dec 31, 2017

zdenko commented Jan 4, 2018

zdenko commented Jan 4, 2018

vendethiel commented Jan 4, 2018

zdenko commented Jan 4, 2018

vendethiel commented Jan 4, 2018

vendethiel commented Jan 4, 2018

zdenko commented Jan 4, 2018

GeoffreyBooth Jan 7, 2018

Choose a reason for hiding this comment

zdenko commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

Inve1951 commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

zdenko commented Jan 7, 2018 • edited Loading

Inve1951 commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

zdenko commented Jan 7, 2018

vendethiel commented Jan 7, 2018

GeoffreyBooth commented Jan 7, 2018

GeoffreyBooth commented Jan 16, 2018

zdenko commented Jan 19, 2018 • edited Loading

GeoffreyBooth commented Jan 28, 2018

zdenko commented Jan 28, 2018

GeoffreyBooth commented Jan 29, 2018

zdenko commented Dec 31, 2017 •

edited by GeoffreyBooth

Loading

zdenko commented Jan 7, 2018 •

edited

Loading

zdenko commented Jan 19, 2018 •

edited

Loading