
Commit 9fc4db6

committed Mar 8, 2011
Merge branch 'master' into recursive-elseif
Conflicts:
    src/Makefile
    src/comp/front/ast.rs
    src/comp/front/parser.rs
    src/comp/middle/fold.rs
    src/comp/middle/trans.rs
·
release-0.70.1
2 parents 3fedb18 + 6ed226c commit 9fc4db6

66 files changed, +8203 -2368 lines changed
 

‎AUTHORS.txt

Lines changed: 1 addition & 0 deletions
@@ -15,6 +15,7 @@ Jason Orendorff <jorendorff@mozilla.com>
 Jeff Balogh <jbalogh@mozilla.com>
 Jeff Mulzelaar <jmuizelaar@mozilla.com>
 Jeffrey Yasskin <jyasskin@gmail.com>
+Marijn Haverbeke <marijnh@gmail.com>
 Matt Brubeck <mbrubeck@limpet.net>
 Michael Bebenita <mbebenita@mozilla.com>
 Or Brostovski <tohava@gmail.com>

‎doc/rust.texi

Lines changed: 27 additions & 21 deletions
@@ -592,10 +592,12 @@ or interrupted by ignored characters.

 Most tokens in Rust follow rules similar to the C family.

-Most tokens (including identifiers, whitespace, keywords, operators and
-structural symbols) are drawn from the ASCII-compatible range of
-Unicode. String and character literals, however, may include the full range of
-Unicode characters.
+Most tokens (including whitespace, keywords, operators and structural symbols)
+are drawn from the ASCII-compatible range of Unicode. Identifiers are drawn
+from Unicode characters specified by the @code{XID_start} and
+@code{XID_continue} rules given by UAX #31@footnote{Unicode Standard Annex
+#31: Unicode Identifier and Pattern Syntax}. String and character literals may
+include the full range of Unicode characters.

 @emph{TODO: formalize this section much more}.

@@ -638,18 +640,22 @@ token or a syntactic extension token. Multi-line comments may be nested.
 @c * Ref.Lex.Ident:: Identifier tokens.
 @cindex Identifier token

-Identifiers follow the pattern of C identifiers: they begin with a
-@emph{letter} or @emph{underscore}, and continue with any combination of
-@emph{letters}, @emph{decimal digits} and underscores, and must not be equal
-to any keyword or reserved token. @xref{Ref.Lex.Key}. @xref{Ref.Lex.Res}.
+Identifiers follow the rules given by Unicode Standard Annex #31, in the form
+closed under NFKC normalization, @emph{excluding} those tokens that are
+otherwise defined as keywords or reserved
+tokens. @xref{Ref.Lex.Key}. @xref{Ref.Lex.Res}.

-A @emph{letter} is a Unicode character in the ranges U+0061-U+007A and
-U+0041-U+005A (@code{'a'}-@code{'z'} and @code{'A'}-@code{'Z'}).
+That is: an identifier starts with any character having derived property
+@code{XID_Start} and continues with zero or more characters having derived
+property @code{XID_Continue}; and such an identifier is NFKC-normalized during
+lexing, such that all subsequent comparison of identifiers is performed on the
+NFKC-normalized forms.

-An @dfn{underscore} is the character U+005F ('_').
+@emph{TODO: define relationship between Unicode and Rust versions}.

-A @dfn{decimal digit} is a character in the range U+0030-U+0039
-(@code{'0'}-@code{'9'}).
+@footnote{This identifier syntax is a superset of the identifier syntaxes of C
+and Java, and is modeled on Python PEP #3131, which formed the definition of
+identifiers in Python 3.0 and later.}

 @node Ref.Lex.Key
 @subsection Ref.Lex.Key
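
Note on the identifier hunk above: the new rule (an XID_Start character, then zero or more XID_Continue characters, with NFKC normalization before comparison) can be sketched in Python, whose own identifier syntax follows PEP 3131. This is only an illustration and not the lexer changed in this commit. The function names are hypothetical, keyword and reserved-token exclusion is left out, and rejecting a leading underscore is an assumption based on reading the new text literally (Python's `str.isidentifier()` accepts one).

```python
import unicodedata


def is_identifier(token: str) -> bool:
    """Approximate the XID_Start XID_Continue* rule from the doc text."""
    # str.isidentifier() tests XID_Start / XID_Continue, but it also accepts
    # a leading underscore. Rejecting that case is an assumption made here
    # to follow the new wording literally.
    return token.isidentifier() and not token.startswith("_")


def normalize_identifier(token: str) -> str:
    """Return the NFKC-normalized form used for all later comparisons."""
    if not is_identifier(token):
        raise ValueError(f"not an identifier: {token!r}")
    return unicodedata.normalize("NFKC", token)


def same_identifier(a: str, b: str) -> bool:
    """Compare two identifiers on their NFKC-normalized forms."""
    return normalize_identifier(a) == normalize_identifier(b)


if __name__ == "__main__":
    # Precomposed U+00E9 versus 'e' followed by combining U+0301.
    print(same_identifier("caf\u00e9", "cafe\u0301"))
    print(is_identifier("1abc"))
```

Under this sketch, two spellings that differ only in composed versus decomposed characters name the same identifier, which is the effect the "NFKC-normalized during lexing" sentence describes.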
@@ -1984,22 +1990,22 @@ module system).
 An example of a @code{tag} item and its use:
 @example
 tag animal @{
-dog();
-cat();
+dog;
+cat;
 @}

-let animal a = dog();
-a = cat();
+let animal a = dog;
+a = cat;
 @end example

 An example of a @emph{recursive} @code{tag} item and its use:
 @example
 tag list[T] @{
-nil();
+nil;
 cons(T, @@list[T]);
 @}

-let list[int] a = cons(7, cons(13, nil()));
+let list[int] a = cons(7, cons(13, nil));
 @end example

@@ -3395,9 +3401,9 @@ control enters the block.
 An example of a pattern @code{alt} statement:

 @example
-type list[X] = tag(nil(), cons(X, @@list[X]));
+type list[X] = tag(nil, cons(X, @@list[X]));

-let list[int] x = cons(10, cons(11, nil()));
+let list[int] x = cons(10, cons(11, nil));

 alt (x) @{
 case (cons(a, cons(b, _))) @{
