Tracking issue for RFC 446 - ES6-style unicode escapes #19739

pnkfelix · 2014-12-11T16:51:23Z

Remove \u203D and \U0001F4A9 unicode string escapes, and add ECMAScript 6-style \u{1F4A9} escapes instead.
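For illustration, a minimal sketch of the brace-delimited form this issue tracks, assuming a current Rust toolchain. The helper name `code_point` is made up for this example and is not part of the RFC.

```rust
/// Hypothetical helper: returns the Unicode scalar value of a character.
fn code_point(c: char) -> u32 {
    c as u32
}

fn main() {
    // ES6-style escapes: hex digits between braces, no fixed width.
    let interrobang = '\u{203D}';
    let pile_of_poo = '\u{1F4A9}';

    assert_eq!(code_point(interrobang), 0x203D);
    assert_eq!(code_point(pile_of_poo), 0x1F4A9);

    // The same escape works inside string literals.
    let s = "What\u{203D}";
    assert_eq!(s.chars().last(), Some(interrobang));

    // The older forms '\u203D' and '\U0001F4A9' are the ones this RFC removes.
    println!("{} {}", interrobang, pile_of_poo);
}
```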

Text: https://github.com/rust-lang/rfcs/blob/master/text/0446-es6-unicode-escapes.md
RFC PR: rust-lang/rfcs#446

Migration strategy: https://github.com/rust-lang/rfcs/blob/master/text/0446-es6-unicode-escapes.md#migration-strategy

pnkfelix · 2014-12-11T17:02:23Z

Note that stage 1 and the foundation for stage 2 in the migration strategy have been added by PR #19480

emk · 2014-12-13T13:54:48Z

I know this is probably on your to-do list already, but I found an interesting corner-case in syntax::print, and traced it back to std::char::Char::escape_unicode: #19811
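As a point of comparison, here is a small sketch of what the escaping method produces on a current toolchain, where it lives on `char` directly as `char::escape_unicode` (the `std::char::Char` trait path above reflects the 2014 standard library). The wrapper name `escape` is made up for this example.

```rust
/// Hypothetical wrapper: collects the escape iterator into a String.
fn escape(c: char) -> String {
    c.escape_unicode().collect()
}

fn main() {
    // The output uses the brace-delimited form, with lowercase hex digits
    // and no zero padding.
    assert_eq!(escape('\u{1F4A9}'), "\\u{1f4a9}");
    assert_eq!(escape('a'), "\\u{61}");

    println!("{}", escape('\u{203D}'));
}
```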

aochagavia · 2014-12-13T19:48:48Z

Though not mentioned in the RFC, we still have \x.. to escape hexadecimal characters. Is this explicitly intended or should it also be removed?

steveklabnik · 2014-12-13T19:51:51Z

Hexadecimal doesn't suffer the same issues, because the full range is expressible, right?

aochagavia · 2014-12-13T20:06:02Z

It seems weird to me that you can do this: assert_eq!('\u{7f}', '\x7f'). But maybe it is just me.

I thought that \x should be deprecated as it is just a limited version of \u{...}.
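For reference, a short sketch of the overlap being discussed, assuming current Rust rules: in `char` and `&str` literals `\xHH` only covers the ASCII range (up to `\x7F`), while byte literals accept the full `\x00` to `\xFF`. The helper `first_byte` is made up for this example.

```rust
/// Hypothetical helper: first byte of a string's UTF-8 encoding.
fn first_byte(s: &str) -> u8 {
    s.as_bytes()[0]
}

fn main() {
    // Within the ASCII range, both escapes denote the same character.
    assert_eq!('\u{7f}', '\x7f');
    assert_eq!(first_byte("\x41"), first_byte("\u{41}"));

    // In a byte literal, \xHH is a raw byte and may go up to 0xFF.
    let byte: u8 = b'\xFF';
    assert_eq!(byte, 0xFF);

    // The code point U+00FF has to be written with \u{...} in a string,
    // and it occupies two bytes once encoded as UTF-8.
    assert_eq!("\u{FF}".len(), 2);
}
```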

aochagavia · 2014-12-13T20:11:24Z

cc @SimonSapin what do you think?

SimonSapin · 2014-12-13T20:28:49Z

I suggest leaving \xHH unchanged.

The reasoning behind the {} delimiters in \u{1F4A9} is to allow a variable number of digits, without initial zeros. Unicode goes up to U+10FFFF with 6 digits, but the majority of assigned code points still only needs 4 significant digits or less.

\xHH is fixed to two digits, but the \x00 to \x0F range is all control characters, most of which are very rarely used. The less rare ones, \x00 (nul), \x09 (tab), \x0A (newline), and \x0D (carriage return) all have shorter syntax already: \0, \t, \n, and \r, respectively. So there is no point in having \x{HH} and \x{H}.
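To make the variable-width point concrete, here is a simplified sketch of a `\u{H+}` parser. It is not rustc's lexer: `parse_braced_escape` is a hypothetical helper, and it assumes one to six hex digits with no other characters between the braces.

```rust
/// Hypothetical helper: parses text of the form `\u{H+}` (1 to 6 hex digits)
/// into a char. Returns None for malformed input or non-scalar values.
fn parse_braced_escape(s: &str) -> Option<char> {
    let digits = s.strip_prefix("\\u{")?.strip_suffix('}')?;
    let well_formed = !digits.is_empty()
        && digits.len() <= 6
        && digits.chars().all(|c| c.is_ascii_hexdigit());
    if !well_formed {
        return None;
    }
    let value = u32::from_str_radix(digits, 16).ok()?;
    // Rejects surrogates and anything above U+10FFFF.
    char::from_u32(value)
}

fn main() {
    // Leading zeros are optional, so short code points stay short.
    assert_eq!(parse_braced_escape(r"\u{41}"), Some('A'));
    assert_eq!(parse_braced_escape(r"\u{0041}"), Some('A'));
    assert_eq!(parse_braced_escape(r"\u{10FFFF}"), Some('\u{10FFFF}'));
    assert_eq!(parse_braced_escape(r"\u{110000}"), None);
    assert_eq!(parse_braced_escape(r"\u{}"), None);

    // The common control characters already have shorter spellings.
    assert_eq!('\x00', '\0');
    assert_eq!('\x09', '\t');
    assert_eq!('\x0A', '\n');
    assert_eq!('\x0D', '\r');
}
```

The braces are what let the parser know where the digits end, which is why the fixed-width `\xHH` form gains nothing from them.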

SimonSapin · 2014-12-13T20:35:30Z

Oh, looks like I completely misread you, sorry. Yes, removing \xHH entirely when the replacement was \u00HH was annoying, but now it might not be as bad. I don’t mind either way, but it’d probably have to be a separate RFC. See rust-lang/rfcs#326 for prior discussion, including a decision not to remove it. (But that was before \u{H+})

genbattle · 2015-02-01T04:02:42Z

I have opened a PR here to update the documentation in relation to this change.

alexcrichton · 2015-02-16T23:35:20Z

This has been completed.

In Rust, it's recommended to use short (non-zero-padded) code points inside ES6-style escape sequences (`\u{...}`): this shortens the literal and is easier on the eyes in typical use, while mechanical parsing remains fairly easy. See examples in the Rust RFC and related discussion:

* https://github.com/rust-lang/rfcs/blob/master/text/0446-es6-unicode-escapes.md
* rust-lang/rfcs#446
* rust-lang/rust#19739

emk mentioned this issue Dec 13, 2014

std::chars::Char::escape_unicode generates old-style \u and \U escapes, breaking syntax::print #19811

Closed

kmcallister added the A-parser Area: The parsing of Rust source code to an AST label Jan 17, 2015

alexcrichton closed this as completed Feb 16, 2015

behnam mentioned this issue Apr 25, 2017

[conversion/rust] Remove zero-padding from ES6-style escaping r12a/r12a.github.io#17

Closed
