New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Parse] Statement typo corrections #28510
[Parse] Statement typo corrections #28510
Conversation
lib/Parse/ParseStmt.cpp
Outdated
///// \returns the score, if it is good enough to even consider this a match. | ||
///// Otherwise, an empty optional. | ||
///// | ||
static Optional<unsigned> scoreIdentifierAndStmtNameTypo(StringRef identifier, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are identical function used for matching stringRefs in CSSimplify.cpp. Can it be moved to StringExtras?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can get away with a table of keywords and a loop checking edit_distance
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Fixed.
@@ -925,6 +925,8 @@ ERROR(statement_begins_with_closure,none, | |||
"top-level statement cannot begin with a closure expression", ()) | |||
ERROR(statement_same_line_without_semi,none, | |||
"consecutive statements on a line must be separated by ';'", ()) | |||
ERROR(statement_misspell,none, | |||
"did you misspell '%0'", (StringRef)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some nits:
- The phrasing here is of a note, but the diagnostic is an error. Take a look at how we do typo correction diagnostics in libSema.
- Rhetorical questions need punctuation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree. Fixed.
test/Parse/statement_typo.swift
Outdated
let test = 5 | ||
gaurd test > 5 else { return } // expected-error {{did you misspell 'guard'}} {{3-8=guard}} expected-error {{use of unresolved identifier 'gaurd'}} | ||
i test > 5 { return } // expected-error {{did you misspell 'if'}} {{3-4=if}} expected-error {{use of unresolved identifier 'i'}} | ||
whle test > 5 { return } // expected-error {{did you misspell 'while'}} {{3-7=while}} expected-error {{use of unresolved identifier 'whle'}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we recover better here? (Would it make sense to use the nearest-neighbor match to start parsing the corresponding statement kind?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. Fixed. Please check my comment below.
lib/Parse/ParseStmt.cpp
Outdated
} | ||
if (stmtTokMatch) { | ||
ParserPosition parserPosition = ParserPosition(L->getStateForBeginningOfToken(previousToken), PreviousLoc); | ||
backtrackToPosition(parserPosition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don’t think you have to backtrack. You only need the location of the typo’d keyword.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently I used backtrack to update previous token with our corrected token kind and continue parsing as there were no error.
Maybe there are better way to recover there for example just skip tokens until we met end of statement.
Do we have any other options @rintaro ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately,setKind()
to a keyword kind is not not an option because we have an invariant in libSyntax where keyword tokens have always have the fixed text.
An option (which @CodaFi is suggesting) is to make parseStmt***
functions receive keyword location like
ParserResult<Stmt> Parser::parseStmtReturn(SourceLoc tryLoc, SourceLoc ReturnLoc) {
if (ReturnLoc.isInvalid())
ReturnLoc = consumeToken(tok::kw_return);
...
}
Then directly call them from this recovery logic with the location. But that's probably not simple changes considering TopLevelCodeDecl
handling.
So, I recommend to skip to the next statement, decl, or '}' using skipUntilDeclStmtRBrace()
. Downside of this is that we loose the diagnostics and semantic syntax highlighting until the next statement. But I think that's better than lousy diagnostics.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, totally agree. I also thought that approach with backtracking and setting kind is too unnatural. Related to skipUntilDeclStmtRBrace(). If we start using such diagnostics there are case like this.
func bar() {
let i = 5
gaurd i > 5 else { return }
gaurd i > 5 else { return }
}
After first missed guard I want to skip until rBrace or next stmt. But currently in skipSingle I find out that we are skipping to rBrace when we find out lBrace without actually going through checks in skipUntilDeclStmtRBrace().
void Parser::skipSingle() {
switch (Tok.getKind()) {
...
case tok::l_brace:
consumeToken();
skipUntil(tok::r_brace);
consumeIf(tok::r_brace);
break;
void Parser::skipUntilDeclStmtRBrace(tok T1) {
/// here Tok could not be '}' because we are skipping it in skipSingle()
while (Tok.isNot(T1, tok::eof, tok::r_brace, tok::pound_endif,
tok::pound_else, tok::pound_elseif,
tok::code_complete) &&
!isStartOfStmt() && !isStartOfDecl()) {
skipSingle();
}
}
And in case like I posted from above gaurd is not a startOfStmt and we are just skipping to the end of the function.
We can make 2 variants:
- Introduce function possibleStartOfStmt() and update skipUntilDeclStmtRBrace.
- Just create other skip function with correct handling rBrace.
What approach is more suitable @rintaro ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the implementation of skipSingle()
is not wrong. This is for example:
func foo() {
if /*HERE*/ foo {
return 1
}
return 2
}
When skipping from /*HERE*/
to the next statement, we don't care about the contents of the if
statement. We just want to skip to return 2
at line 5.
Introduce function possibleStartOfStmt() and update skipUntilDeclStmtRBrace.
Could you elaborate on this? How will possibleStartOfStmt()
be different from isStartOfStmt()
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, agree. Lets keep it simple.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @rintaro after investigation I find out there simple usage of skipUntilDeclStmtRBrace() is not sufficient for cases like
// SR-8510
func foo() -> Int? { return 0 }
func bar() {
gaurd /*HERE*/ let x = foo() else { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'guard'}} {{3-8=guard}}
print(x)
}
We will stop here and then emit a bunch of unrelated errors. Are there any workarounds around this, if we are trying to use skipUntilDeclStmtRBrace() ?
Should we consider introducing another skipping function ?
void Parser::skipUntilStmtRBrace() {
while (Tok.isNot(tok::eof, tok::r_brace, tok::pound_endif,
tok::pound_else, tok::pound_elseif,
tok::code_complete) &&
!isStartOfStmt())
skipSingle();
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ahh, that's unfortunate. I think that is a general skipUntilDeclStmtRBrace()
problem where it cannot skip statement conditions with optional binding syntax. I noticed that actually causes spurious diagnostics:
var foo: Int?
if
a == ],
let b = foo {
}
test.swift:4:8: error: expected expression after operator
a == ],
^
test.swift:5:11: error: cannot call value of non-function type 'Int?'
let b = foo {
^
Hmm...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For now, how about just skipping the first Tok.isAny(tok::kw_let, tok::kw_var)
in this recovery logic? like:
if (Tok.isAny(tok::kw_let, tok::kw_var))
consumeToken();
skipUntilDeclStmtRBrace()
It's not great for something like:
gurad
let a = someOpt1,
let b = someOpt2
else {
}
But I think supporting single let
/var
is good enough because this is just a recovery.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi, @rintaro updated pull request. Currently recovery logic will work only if next token is let/var and wrong typed statement is on the beginning of the line. Also updated tests.
I left some high-level comments about the approach here. I think @rintaro is more qualified than I am to get into the finer details. Thanks for tackling this. It’s a tough nut to crack. |
test/Parse/statement_typo.swift
Outdated
@@ -0,0 +1,13 @@ | |||
// RUN: %target-typecheck-verify-swift | |||
|
|||
func statementMisspelledAtTheBegginingOfTheLine() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you add the functions from the SR too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Added.
test/Parse/statement_typo.swift
Outdated
func statementMisspelledNotAtTheBegginingOfTheLine() { | ||
let i = 5 | ||
var x = i x=x+1 // expected-error {{consecutive statements}} | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: Trailing newline please.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
84a18c6
to
9870352
Compare
lib/Parse/ParseStmt.cpp
Outdated
} | ||
} | ||
if (stmtTokMatch) { | ||
Entries.pop_back(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We are removing latest entry, because it is unresolved identifier
expression from parse expression. If we will not remove it undefined identifier
error will be emitted by Sema.
lib/Parse/ParseStmt.cpp
Outdated
} | ||
if (stmtTokMatch) { | ||
ParserPosition parserPosition = ParserPosition(L->getStateForBeginningOfToken(previousToken), PreviousLoc); | ||
backtrackToPosition(parserPosition); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately,setKind()
to a keyword kind is not not an option because we have an invariant in libSyntax where keyword tokens have always have the fixed text.
An option (which @CodaFi is suggesting) is to make parseStmt***
functions receive keyword location like
ParserResult<Stmt> Parser::parseStmtReturn(SourceLoc tryLoc, SourceLoc ReturnLoc) {
if (ReturnLoc.isInvalid())
ReturnLoc = consumeToken(tok::kw_return);
...
}
Then directly call them from this recovery logic with the location. But that's probably not simple changes considering TopLevelCodeDecl
handling.
So, I recommend to skip to the next statement, decl, or '}' using skipUntilDeclStmtRBrace()
. Downside of this is that we loose the diagnostics and semantic syntax highlighting until the next statement. But I think that's better than lousy diagnostics.
lib/Parse/ParseStmt.cpp
Outdated
@@ -2529,3 +2548,28 @@ ParserResult<Stmt> Parser::parseStmtPoundAssert() { | |||
return makeParserResult<Stmt>(new (Context) PoundAssertStmt( | |||
SourceRange(startLoc, endLoc), conditionExprResult.get(), message)); | |||
} | |||
|
|||
Optional<tok> Parser::closestMatchingStmtTokenForIdentifier(StringRef identifier) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we need to make it a Parser method? If not, please make this a static
function.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, fixed
916866b
to
346bf2a
Compare
lib/Parse/ParseStmt.cpp
Outdated
if (stmtTokMatch && Tok.isAny(tok::kw_let, tok::kw_var)) { | ||
StringRef stmtTokenText = getTokenText(*stmtTokMatch); | ||
diagnose(PreviousLoc, diag::statement_misspell, stmtTokenText) | ||
.fixItReplace(PreviousLoc, stmtTokenText); | ||
consumeToken(); | ||
skipUntilDeclStmtRBrace(tok::r_brace); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, I meant something like:
if (stmtTokMatch) {
... diagnose ...
if (Tok.isAny(tok::kw_let, tok::kw_var))
consumeToken();
skipUntilDeclStmtRBrace(tok::r_brance);
}
We still can recover non let
/var
binding cases.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@rintaro Sure. Fixed.
Sorry for the delay. Could you add back test cases you deleted in previous commit? func statementMisspelledAtTheBegginingOfTheLine() {
let test = 5
gaurd test > 5 else { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'guard'}} {{3-8=guard}}
i test > 5 { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'if'}} {{3-4=if}}
whle test > 5 { return } // expected-error {{consecutive statements}} // expected-note {{did you misspell 'while'}} {{3-7=while}}
}
func statementMisspelledNotAtTheBegginingOfTheLine() {
let i = 5
var x = i x=x+1 // expected-error {{consecutive statements}}
} These should work now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Implementation looks good!
For code-formatting, I recommend clang-format
with git-clang-format
.
@@ -870,7 +870,7 @@ class Parser { | |||
IsFollowingGuard); | |||
} | |||
ParserResult<BraceStmt> parseBraceItemList(Diag<> ID); | |||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unrelated change?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It seems like there was trailing white space and xCode automatically trimmed it.
gaurd let firstWithLet = foo() else { return } // expected-error {{consecutive statements}} | ||
// expected-note@-1 {{did you misspell 'guard'}} {{3-8=guard}} | ||
// expected-error@-2 {{use of unresolved identifier 'gaurd'}} | ||
i let secondWithLet = foo() { return } // expected-error {{consecutive statements}} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I suspect the previous recovery skips i
here. Because, use of unresolved identifier 'I'
is not emitted.
Could you split test cases to their own functions? like:
func stmtMisspelledAtTheBegginingOfTheLineWithLet1() {
gaurd let firstWithLet = foo() else { return }
}
func stmtMisspelledAtTheBegginingOfTheLineWithLet2() {
i let secondWithLet = foo() { return }
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure. Fixed.
lib/Parse/ParseStmt.cpp
Outdated
std::pair<StringRef, tok> pair = stmtKeywords[i]; | ||
StringRef potentialStatement = pair.first; | ||
|
||
unsigned dist = potentialStatement.edit_distance(identifier, /*AllowReplacements=*/true, /*MaxEditDistance=*/bestScore); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
80 columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
lib/Parse/ParseStmt.cpp
Outdated
SourceLoc startOfLineLocation = L->getLocForStartOfLine(SourceMgr, previousToken.getLoc()); | ||
SourceLoc locationOfTokenAtStartOfLine = Lexer::getTokenAtLocation(SourceMgr, startOfLineLocation).getLoc(); | ||
if (previousToken.getLoc() == locationOfTokenAtStartOfLine) { | ||
stmtTokMatch = closestMatchingStmtTokenForIdentifier(previousToken.getText()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
80 columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
lib/Parse/ParseStmt.cpp
Outdated
@@ -194,6 +194,32 @@ static bool isAtStartOfSwitchCase(Parser &parser, | |||
return parser.Tok.isAny(tok::kw_case, tok::kw_default); | |||
} | |||
|
|||
|
|||
static Optional<tok> closestMatchingStmtTokenForIdentifier(StringRef identifier) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
80 columns
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed.
Updated formatting. Split tests for different statements.
Hi, @rintaro, what is status of this ticket ? Should be consider some different implementation ? |
Current implementation steps:
Current implementation is limited to only identifiers that start from the beginning of the line. Like
The limitation exists because it is ambigous how to handle case where
Should we correct it to
if
statement, or provide insert of';'
.https://bugs.swift.org/browse/SR-8510