More precise parseInt and toString #86
Comments
That's a very good point. There is a related source of ambiguity in https://tc39.github.io/ecma262/#sec-runtime-semantics-mv-s
I think it'd be a good idea to make a uniform decision and fix all of these, for both Number and BigInt. I don't think I'll be able to do so by the September meeting, though. For BigInt in particular, there should never be such a 20-digit limitation. The exact answer should be given regardless of the number of digits and the radix. I guess that's a separate spec text fix which will be easier than the Number one.
Not sure if this is directly related, but parseInt also has some funny behavior with respect to radix prefixes: it allows "0x" prefixes if radix is either not passed or 16, but disallows them (or rather, returns 0) otherwise. It's not clear to me that this legacy behavior is something we want to bring forward, unless the intention of this particular method is to match the behavior of the legacy parseInt.
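For readers who want to see the prefix behavior described above, here is a small sketch using the legacy global parseInt. The `prefixResults` name is only an illustrative helper, not something from the proposal.

```javascript
// Legacy parseInt honors a "0x" prefix only when the radix is omitted or 16.
// For any other radix it parses the leading "0" and stops at the "x".
const prefixResults = {
  noRadix: parseInt("0x1f"),     // prefix stripped, parsed as hex
  radix16: parseInt("0x1f", 16), // prefix stripped, parsed as hex
  radix10: parseInt("0x1f", 10), // only "0" is consumed
  radix8: parseInt("0x1f", 8),   // only "0" is consumed
};

console.log(prefixResults);
// { noRadix: 31, radix16: 31, radix10: 0, radix8: 0 }
```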
@ajklein Thanks for raising that (rather separate) question. I don't see a big harm coming from that stripPrefix behavior, but maybe it'd make sense to clean it up. @cxielarko, you implemented this functionality and tests for it; do you have an opinion here?
The previous specification for BigInt.parseInt was based on referencing and making small changes to Number.parseInt. This patch inlines the definition for more clarity, and to remove some unnecessary implementation-defined parts. Addresses part of #86
I pushed a patch to fix @thejoshwolfe's issue, but the issue from @ajklein might have different points of view, so I'll do that as a pull request to get more feedback. As long as we're cleaning up parseInt, we could also do a few more things beyond what @ajklein is suggesting:
I'll put together all three of these changes into a PR and make a matching test262 PR.
- Throw a RangeError rather than a SyntaxError when an out-of-bounds radix is passed in. (Thanks, @cxielarko)
- Throw a SyntaxError on "0x" prefixes. (Thanks, @ajklein)
- Throw an exception if there is more non-whitespace text after the recognized numeric part, rather than ignoring it.

Addresses #86
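The three changes listed above can be sketched in userland roughly as follows. This is only an illustration of the proposed error behavior, not the spec text from the PR: the name `strictParseBigInt`, the whitespace trimming, the sign handling, and the case-insensitivity are all assumptions made for the example.

```javascript
const DIGITS = "0123456789abcdefghijklmnopqrstuvwxyz";

// Hypothetical strict parser: every failure mode throws instead of
// returning a partial result.
function strictParseBigInt(input, radix = 10) {
  // Out-of-bounds radix is a RangeError, not a SyntaxError.
  if (!Number.isInteger(radix) || radix < 2 || radix > 36) {
    throw new RangeError("radix must be an integer between 2 and 36");
  }
  // Assumption: surrounding whitespace is allowed and casing is ignored.
  const text = String(input).trim().toLowerCase();
  const negative = text.startsWith("-");
  const body = negative || text.startsWith("+") ? text.slice(1) : text;
  if (body.length === 0) {
    throw new SyntaxError("no digits found");
  }
  const bigRadix = BigInt(radix);
  let value = 0n;
  for (const ch of body) {
    const digit = DIGITS.indexOf(ch);
    // Rejects "0x" prefixes (for radix <= 33) and any trailing text.
    if (digit < 0 || digit >= radix) {
      throw new SyntaxError(`unexpected character "${ch}" for radix ${radix}`);
    }
    value = value * bigRadix + BigInt(digit);
  }
  return negative ? -value : value;
}

console.log(strictParseBigInt("ff", 16)); // 255n
```

Note that with a radix above 33 the letter "x" is itself a valid digit, so a "0x" prefix can only be rejected as a prefix for smaller radixes in a design like this one.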
For proposed changes from tc39/proposal-bigint#86 in the PR at tc39/proposal-bigint#97
We settled on removing the parseInt function at the November 2017 TC39 meeting.
What?! How is one supposed to parse BigInts with custom radixes? Or is that simply not a supported use case any more?
    const alphabet = '0123456789abcdefghijklmnopqrstuvwxyz'.split('');
    function parseBigInt(str, radix = 10) {
      if (radix < 2 || radix > alphabet.length || Math.floor(radix) !== radix) {
        throw new RangeError('radix out of range');
      }
      // Convert the radix once: BigInt and Number operands cannot be mixed.
      const bigRadix = BigInt(radix);
      let val = 0n;
      for (const c of ('' + str).split('')) {
        const index = alphabet.indexOf(c);
        if (index < 0 || index >= radix) {
          throw new RangeError('character out of range');
        }
        val = val * bigRadix + BigInt(index);
      }
      return val;
    }

(Modulo casing.) Something like that can be added to the language later, of course, but I don't think it's that big of a deal to leave it out initially, given that it's doable in userland.
The idea would be to leave it for user libraries. This proposal already leaves several other things to user libraries, such as bitcasting Numbers to 64-bit ints, writing and reading arbitrarily large BigInts from ArrayBuffers, and bit instructions such as popcount, leftmost/rightmost set/clear bit, etc. All of those could be implemented by user code or be faster as built-ins, but are left out in the interest of minimalism; this proposal could join the club.
Of course it is; with TypedArrays and In a quick benchmark based on our unit test's inputs, @bakkot 's polyfill (with an added IMHO it's weird to have |
You can get the native implementation to do the parsing for the 4 common radixes:

    function parseBigInt(str, radix = 10) {
      switch (radix) {
        case 2: return BigInt("0b" + str);
        case 8: return BigInt("0o" + str);
        case 10: return BigInt(str);
        case 16: return BigInt("0x" + str);
      }
      // fallback to a userland implementation...
    }

This neglects lots of corner cases like empty input and leading whitespace. Python has a similar asymmetry between parsing and stringification, but amusingly it's the reverse situation.
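To make the corner cases mentioned above concrete, here is a small sketch of how the BigInt constructor treats a few awkward inputs once a radix prefix has been glued on. The `tryBigInt` helper and the `cornerCases` object are hypothetical names used only for this illustration.

```javascript
// Returns the parsed BigInt, or the name of the error that was thrown.
function tryBigInt(text) {
  try {
    return BigInt(text);
  } catch (err) {
    return err.name;
  }
}

const cornerCases = {
  emptyDecimal: tryBigInt(""),          // empty string parses as zero
  emptyHex: tryBigInt("0x" + ""),       // a bare prefix is rejected
  paddedDecimal: tryBigInt(" 12 "),     // surrounding whitespace is fine
  paddedHex: tryBigInt("0x" + " ff"),   // whitespace after the prefix is not
  negativeHex: tryBigInt("0x" + "-ff"), // neither is a sign after the prefix
};

console.log(cornerCases);
```

So the same input string can be accepted for radix 10 and rejected for radix 16, which a wrapper along these lines would have to normalize (for example by trimming and handling the sign itself) before prepending the prefix.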
IIRC @ajklein was pushing more for removal for now; maybe you could clarify.
I discussed this offline with @jakobkummerow, filling him in on the discussion around a hypothetical |
Yes, I think there should either be both (I find none of "it can be polyfilled in userland", "in the interest of minimalism", or "the use-case is unclear" particularly convincing arguments, because they could be used to shoot down the entire proposal.) |
OK, the committee didn't seem necessarily opposed to adding fromString, except that it might be weird to have fromString on BigInt and not Number. Would it be OK to add a matching Number.fromString in this proposal? Cc @ljharb
I think it would be excellent to add both; that would address any consistency argument against adding only one.
After discussing further with @jakobkummerow and @ajklein offline, we decided to stick with the November 2017 committee decision and pursue BigInt.fromString together with Number.fromString as a follow-on proposal, and remove Decimal.parseInt from this proposal.
Decimal, or BigInt?
BigInt (edited the comment above); sorry, silly typo. If we have a follow-on Decimal proposal, it will follow the pattern established by this fromString proposal.
After some discussion about various edge cases in BigInt.parseInt, TC39 decided in the November 2017 meeting to remove this feature in favor of pursuing a follow-on proposal to add Number.fromString and BigInt.fromString, as new, cleaner functions. Closes #86
BigInt seems to be mimicking the parseInt and toString semantics of Number, but I don't understand why so much implementation-defined behavior is allowed for BigInt. For Number there are practical issues with loss of precision and exponential notation, but these are not concerns for BigInt.
This particular excerpt is concerning (from here relevant to BigInt through here):
It seems to defeat the purpose of arbitrary-precision BigInt if implementations can discard precision during parseInt.
And here's a concerning excerpt regarding toString (from here when radix is not 10):
Why are we not precisely defining an abstract algorithm?
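As a rough illustration of the precision concern raised above, the sketch below contrasts Number-based parsing and stringification with BigInt for values beyond 2^53. The variable names are only illustrative, and the hex round trip relies on the BigInt constructor accepting a "0x" prefix.

```javascript
// 2^53 + 1 is the first integer a Number cannot represent exactly.
const digits = "9007199254740993";
const asNumber = parseInt(digits, 10); // rounds to the nearest double
const asBigInt = BigInt(digits);       // keeps every digit

console.log(String(asNumber));    // "9007199254740992"
console.log(asBigInt.toString()); // "9007199254740993"

// Stringification shows the same split for 2^64.
const numberText = (2 ** 64).toString(10);   // "18446744073709552000"
const bigIntText = (2n ** 64n).toString(10); // "18446744073709551616"
console.log(numberText, bigIntText);

// An exact toString in a non-decimal radix allows a lossless round trip.
const big = 2n ** 64n + 1n;
const roundTripped = BigInt("0x" + big.toString(16));
console.log(roundTripped === big); // true
```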