[Parse] Statement typo corrections #28510

kitaisreal · 2019-11-30T17:27:35Z

Current implementation steps:

If consecutive statement error occurred we are getting previous identifier, then trying to typo correct it to one of the statements keywords.
Updated tests

Current implementation is limited to only identifiers that start from the beginning of the line. Like

gaurd x>5 else { return }

The limitation exists because it is ambigous how to handle case where

x=1+i i=i+1

Should we correct it to if statement, or provide insert of ';'.

https://bugs.swift.org/browse/SR-8510

kitaisreal · 2019-11-30T18:09:23Z

lib/Parse/ParseStmt.cpp

+///// \returns the score, if it is good enough to even consider this a match.
+///// Otherwise, an empty optional.
+/////
+static Optional<unsigned> scoreIdentifierAndStmtNameTypo(StringRef identifier,


There are identical function used for matching stringRefs in CSSimplify.cpp. Can it be moved to StringExtras?

I think you can get away with a table of keywords and a loop checking edit_distance.

Agree. Fixed.

CodaFi · 2019-11-30T20:42:34Z

include/swift/AST/DiagnosticsParse.def

@@ -925,6 +925,8 @@ ERROR(statement_begins_with_closure,none,
      "top-level statement cannot begin with a closure expression", ())
 ERROR(statement_same_line_without_semi,none,
      "consecutive statements on a line must be separated by ';'", ())
+ERROR(statement_misspell,none,
+    "did you misspell '%0'", (StringRef))


Some nits:

The phrasing here is of a note, but the diagnostic is an error. Take a look at how we do typo correction diagnostics in libSema.

Rhetorical questions need punctuation

Agree. Fixed.

CodaFi · 2019-11-30T20:45:56Z

test/Parse/statement_typo.swift

+  let test = 5
+  gaurd test > 5 else { return } // expected-error {{did you misspell 'guard'}} {{3-8=guard}} expected-error {{use of unresolved identifier 'gaurd'}}
+  i test > 5 { return } // expected-error {{did you misspell 'if'}} {{3-4=if}} expected-error {{use of unresolved identifier 'i'}}
+  whle test > 5 { return } // expected-error {{did you misspell 'while'}} {{3-7=while}} expected-error {{use of unresolved identifier 'whle'}}


Can we recover better here? (Would it make sense to use the nearest-neighbor match to start parsing the corresponding statement kind?)

Yes. Fixed. Please check my comment below.

CodaFi · 2019-11-30T20:52:12Z

lib/Parse/ParseStmt.cpp

+      }
+      if (stmtTokMatch) {
+        ParserPosition parserPosition = ParserPosition(L->getStateForBeginningOfToken(previousToken), PreviousLoc);
+        backtrackToPosition(parserPosition);


I don’t think you have to backtrack. You only need the location of the typo’d keyword.

Currently I used backtrack to update previous token with our corrected token kind and continue parsing as there were no error.
Maybe there are better way to recover there for example just skip tokens until we met end of statement.
Do we have any other options @rintaro ?

Unfortunately,setKind() to a keyword kind is not not an option because we have an invariant in libSyntax where keyword tokens have always have the fixed text.

An option (which @CodaFi is suggesting) is to make parseStmt*** functions receive keyword location like

ParserResult<Stmt> Parser::parseStmtReturn(SourceLoc tryLoc, SourceLoc ReturnLoc) { if (ReturnLoc.isInvalid()) ReturnLoc = consumeToken(tok::kw_return); ... }

Then directly call them from this recovery logic with the location. But that's probably not simple changes considering TopLevelCodeDecl handling.

So, I recommend to skip to the next statement, decl, or '}' using skipUntilDeclStmtRBrace(). Downside of this is that we loose the diagnostics and semantic syntax highlighting until the next statement. But I think that's better than lousy diagnostics.

Hi, totally agree. I also thought that approach with backtracking and setting kind is too unnatural. Related to skipUntilDeclStmtRBrace(). If we start using such diagnostics there are case like this.

func bar() { let i = 5 gaurd i > 5 else { return } gaurd i > 5 else { return } }

After first missed guard I want to skip until rBrace or next stmt. But currently in skipSingle I find out that we are skipping to rBrace when we find out lBrace without actually going through checks in skipUntilDeclStmtRBrace().

void Parser::skipSingle() { switch (Tok.getKind()) { ... case tok::l_brace: consumeToken(); skipUntil(tok::r_brace); consumeIf(tok::r_brace); break;

void Parser::skipUntilDeclStmtRBrace(tok T1) { /// here Tok could not be '}' because we are skipping it in skipSingle() while (Tok.isNot(T1, tok::eof, tok::r_brace, tok::pound_endif, tok::pound_else, tok::pound_elseif, tok::code_complete) && !isStartOfStmt() && !isStartOfDecl()) { skipSingle(); } }

And in case like I posted from above gaurd is not a startOfStmt and we are just skipping to the end of the function.
We can make 2 variants:

Introduce function possibleStartOfStmt() and update skipUntilDeclStmtRBrace.

Just create other skip function with correct handling rBrace.
What approach is more suitable @rintaro ?

I think the implementation of skipSingle() is not wrong. This is for example:

func foo() { if /*HERE*/ foo { return 1 } return 2 }

When skipping from /*HERE*/ to the next statement, we don't care about the contents of the if statement. We just want to skip to return 2 at line 5.

Introduce function possibleStartOfStmt() and update skipUntilDeclStmtRBrace.

Could you elaborate on this? How will possibleStartOfStmt() be different from isStartOfStmt()?

Ok, agree. Lets keep it simple.

Hi, @rintaro after investigation I find out there simple usage of skipUntilDeclStmtRBrace() is not sufficient for cases like

// SR-8510 func foo() -> Int? { return 0 } func bar() { gaurd /*HERE*/ let x = foo() else { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'guard'}} {{3-8=guard}} print(x) }

We will stop here and then emit a bunch of unrelated errors. Are there any workarounds around this, if we are trying to use skipUntilDeclStmtRBrace() ?
Should we consider introducing another skipping function ?

void Parser::skipUntilStmtRBrace() { while (Tok.isNot(tok::eof, tok::r_brace, tok::pound_endif, tok::pound_else, tok::pound_elseif, tok::code_complete) && !isStartOfStmt()) skipSingle(); }

Ahh, that's unfortunate. I think that is a general skipUntilDeclStmtRBrace() problem where it cannot skip statement conditions with optional binding syntax. I noticed that actually causes spurious diagnostics:

var foo: Int? if a == ], let b = foo { }

test.swift:4:8: error: expected expression after operator a == ], ^ test.swift:5:11: error: cannot call value of non-function type 'Int?' let b = foo { ^

Hmm...

For now, how about just skipping the first Tok.isAny(tok::kw_let, tok::kw_var) in this recovery logic? like:

if (Tok.isAny(tok::kw_let, tok::kw_var)) consumeToken(); skipUntilDeclStmtRBrace()

It's not great for something like:

gurad let a = someOpt1, let b = someOpt2 else { }

But I think supporting single let/var is good enough because this is just a recovery.

Hi, @rintaro updated pull request. Currently recovery logic will work only if next token is let/var and wrong typed statement is on the beginning of the line. Also updated tests.

CodaFi · 2019-11-30T20:53:49Z

I left some high-level comments about the approach here. I think @rintaro is more qualified than I am to get into the finer details.

Thanks for tackling this. It’s a tough nut to crack.

CodaFi · 2019-11-30T21:01:11Z

test/Parse/statement_typo.swift

@@ -0,0 +1,13 @@
+// RUN: %target-typecheck-verify-swift
+
+func statementMisspelledAtTheBegginingOfTheLine() {


Can you add the functions from the SR too?

Sure. Added.

CodaFi · 2019-11-30T21:01:27Z

test/Parse/statement_typo.swift

+func statementMisspelledNotAtTheBegginingOfTheLine() {
+  let i = 5
+  var x = i x=x+1 // expected-error {{consecutive statements}}
+}


Nit: Trailing newline please.

kitaisreal · 2019-12-01T12:14:16Z

lib/Parse/ParseStmt.cpp

+        }
+      }
+      if (stmtTokMatch) {
+        Entries.pop_back();


We are removing latest entry, because it is unresolved identifier expression from parse expression. If we will not remove it undefined identifier error will be emitted by Sema.

rintaro · 2019-12-03T07:23:12Z

lib/Parse/ParseStmt.cpp

+      }
+      if (stmtTokMatch) {
+        ParserPosition parserPosition = ParserPosition(L->getStateForBeginningOfToken(previousToken), PreviousLoc);
+        backtrackToPosition(parserPosition);


Unfortunately,setKind() to a keyword kind is not not an option because we have an invariant in libSyntax where keyword tokens have always have the fixed text.

An option (which @CodaFi is suggesting) is to make parseStmt*** functions receive keyword location like

ParserResult<Stmt> Parser::parseStmtReturn(SourceLoc tryLoc, SourceLoc ReturnLoc) { if (ReturnLoc.isInvalid()) ReturnLoc = consumeToken(tok::kw_return); ... }

Then directly call them from this recovery logic with the location. But that's probably not simple changes considering TopLevelCodeDecl handling.

So, I recommend to skip to the next statement, decl, or '}' using skipUntilDeclStmtRBrace(). Downside of this is that we loose the diagnostics and semantic syntax highlighting until the next statement. But I think that's better than lousy diagnostics.

rintaro · 2019-12-03T07:24:11Z

lib/Parse/ParseStmt.cpp

@@ -2529,3 +2548,28 @@ ParserResult<Stmt> Parser::parseStmtPoundAssert() {
  return makeParserResult<Stmt>(new (Context) PoundAssertStmt(
      SourceRange(startLoc, endLoc), conditionExprResult.get(), message));
 }
+
+Optional<tok> Parser::closestMatchingStmtTokenForIdentifier(StringRef identifier) {


Do we need to make it a Parser method? If not, please make this a static function.

Sure, fixed

lib/Parse/ParseStmt.cpp

rintaro · 2020-01-14T18:36:13Z

lib/Parse/ParseStmt.cpp

+      if (stmtTokMatch && Tok.isAny(tok::kw_let, tok::kw_var)) {
+        StringRef stmtTokenText = getTokenText(*stmtTokMatch);
+        diagnose(PreviousLoc, diag::statement_misspell, stmtTokenText)
+          .fixItReplace(PreviousLoc, stmtTokenText);
+        consumeToken();
+        skipUntilDeclStmtRBrace(tok::r_brace);


Ah, I meant something like:

if (stmtTokMatch) { ... diagnose ... if (Tok.isAny(tok::kw_let, tok::kw_var)) consumeToken(); skipUntilDeclStmtRBrace(tok::r_brance); }

We still can recover non let/var binding cases.

@rintaro Sure. Fixed.

rintaro · 2020-01-17T05:12:19Z

Sorry for the delay. Could you add back test cases you deleted in previous commit?

func statementMisspelledAtTheBegginingOfTheLine() {
   let test = 5
   gaurd test > 5 else { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'guard'}} {{3-8=guard}}
   i test > 5 { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'if'}} {{3-4=if}}
   whle test > 5 { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'while'}} {{3-7=while}}
 }

 func statementMisspelledNotAtTheBegginingOfTheLine() {
   let i = 5
   var x = i x=x+1 // expected-error {{consecutive statements}}
 }

These should work now.

rintaro

Implementation looks good!
For code-formatting, I recommend clang-format with git-clang-format.

rintaro · 2020-01-17T05:12:28Z

include/swift/Parse/Parser.h

@@ -870,7 +870,7 @@ class Parser {
                           IsFollowingGuard);
  }
  ParserResult<BraceStmt> parseBraceItemList(Diag<> ID);
-  
+


Unrelated change?

It seems like there was trailing white space and xCode automatically trimmed it.

rintaro · 2020-01-17T05:21:50Z

test/Parse/statement_typo.swift

+  gaurd let firstWithLet = foo() else { return } // expected-error {{consecutive statements}}
+  // expected-note@-1 {{did you misspell 'guard'}} {{3-8=guard}}
+  // expected-error@-2 {{use of unresolved identifier 'gaurd'}}
+  i let secondWithLet = foo() { return } // expected-error {{consecutive statements}}


I suspect the previous recovery skips i here. Because, use of unresolved identifier 'I' is not emitted.
Could you split test cases to their own functions? like:

func stmtMisspelledAtTheBegginingOfTheLineWithLet1() { gaurd let firstWithLet = foo() else { return } } func stmtMisspelledAtTheBegginingOfTheLineWithLet2() { i let secondWithLet = foo() { return } }

Sure. Fixed.

rintaro · 2020-01-17T05:22:18Z

lib/Parse/ParseStmt.cpp

+    std::pair<StringRef, tok> pair = stmtKeywords[i];
+    StringRef potentialStatement = pair.first;
+
+    unsigned dist = potentialStatement.edit_distance(identifier, /*AllowReplacements=*/true, /*MaxEditDistance=*/bestScore);


rintaro · 2020-01-17T05:23:18Z

lib/Parse/ParseStmt.cpp

+        SourceLoc startOfLineLocation = L->getLocForStartOfLine(SourceMgr, previousToken.getLoc());
+        SourceLoc locationOfTokenAtStartOfLine = Lexer::getTokenAtLocation(SourceMgr, startOfLineLocation).getLoc();
+        if (previousToken.getLoc() == locationOfTokenAtStartOfLine) {
+          stmtTokMatch = closestMatchingStmtTokenForIdentifier(previousToken.getText());


rintaro · 2020-01-17T05:23:52Z

lib/Parse/ParseStmt.cpp

@@ -194,6 +194,32 @@ static bool isAtStartOfSwitchCase(Parser &parser,
  return parser.Tok.isAny(tok::kw_case, tok::kw_default);
 }

+
+static Optional<tok> closestMatchingStmtTokenForIdentifier(StringRef identifier) {


Updated formatting. Split tests for different statements.

kitaisreal · 2020-03-12T16:49:04Z

Hi, @rintaro, what is status of this ticket ? Should be consider some different implementation ?

shahmishal · 2020-10-01T06:57:22Z

Please update the base branch to main by Oct 5th otherwise the pull request will be closed automatically.

How to change the base branch: (Link)
More detail about the branch update: (Link)

kitaisreal changed the title ~~SR-8610: Parser statement typo corrections~~ [PARSER] Statement typo corrections Nov 30, 2019

kitaisreal changed the title ~~[PARSER] Statement typo corrections~~ [Parse] Statement typo corrections Nov 30, 2019

kitaisreal commented Nov 30, 2019

View reviewed changes

CodaFi reviewed Nov 30, 2019

View reviewed changes

CodaFi requested a review from rintaro November 30, 2019 20:54

CodaFi reviewed Nov 30, 2019

View reviewed changes

kitaisreal force-pushed the parser-statement-typo-corrections branch from 84a18c6 to 9870352 Compare December 1, 2019 11:53

kitaisreal commented Dec 1, 2019

View reviewed changes

rintaro reviewed Dec 3, 2019

View reviewed changes

kitaisreal requested a review from rintaro December 3, 2019 08:35

kitaisreal commented Dec 5, 2019

View reviewed changes

lib/Parse/ParseStmt.cpp Outdated Show resolved Hide resolved

kitaisreal added 3 commits January 11, 2020 17:52

SR-8610: Parser statement typo corrections

5df7322

SR-8610: Fixed code review issues

b5d73ea

SR-8510: Fixed code review issues

346bf2a

kitaisreal force-pushed the parser-statement-typo-corrections branch from 916866b to 346bf2a Compare January 11, 2020 15:42

SR-8510: Fixed code review issues

0a9f97b

rintaro reviewed Jan 14, 2020

View reviewed changes

SR-8510: Fixed code review issues

c1767b6

kitaisreal requested a review from rintaro January 15, 2020 08:05

rintaro reviewed Jan 17, 2020

View reviewed changes

SR-8510: Fixed code review issues

1f3b9ec

Updated formatting. Split tests for different statements.

kitaisreal requested a review from rintaro January 19, 2020 09:52

kitaisreal requested a review from CodaFi March 23, 2020 11:58

shahmishal closed this Oct 5, 2020

		@@ -0,0 +1,13 @@
		// RUN: %target-typecheck-verify-swift

		func statementMisspelledAtTheBegginingOfTheLine() {

[Parse] Statement typo corrections #28510

[Parse] Statement typo corrections #28510

Conversation

kitaisreal commented Nov 30, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rintaro Dec 3, 2019 • edited

Choose a reason for hiding this comment

kitaisreal Dec 3, 2019 • edited

Choose a reason for hiding this comment

rintaro Dec 3, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

CodaFi commented Nov 30, 2019

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rintaro Dec 3, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

rintaro commented Jan 17, 2020

rintaro left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

kitaisreal commented Mar 12, 2020

shahmishal commented Oct 1, 2020

kitaisreal commented Nov 30, 2019 •

edited

rintaro Dec 3, 2019 •

edited

kitaisreal Dec 3, 2019 •

edited

rintaro Dec 3, 2019 •

edited

rintaro Dec 3, 2019 •

edited