Add forEach extensions methods to iterate over codepoints #38
By using For example:
charSequence.codePointSequence().forEach { codePoint -> /* do something */ }
charSequence.codePointSequence().forEachIndexed { index, codePoint -> /* do something */ }
charSequence.codePointIterator().forEach { codePoint -> /* do something */ }
charSequence.codePointIterator().withIndex().forEach { (index, codePoint) -> /* do something */ } |
Yes, you are absolutely right. However, in order to use them you need to create an instance of iterator that adds additional overhead. I believe, for the same reason |
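To make the overhead argument concrete, here is a minimal sketch contrasting the two styles. It assumes a JVM target and uses java.lang.Character in place of the library's own helpers; CodePointIteratorSketch, forEachCodePointSketch and the two count functions are hypothetical names written for this illustration and are not the library's implementation.

```kotlin
// Hypothetical iterator written only for this illustration; NOT the library's implementation.
// Assumption: JVM target, so java.lang.Character is used to decode code points.
class CodePointIteratorSketch(private val text: CharSequence) : Iterator<Int> {
    private var index = 0

    override fun hasNext(): Boolean = index < text.length

    override fun next(): Int {
        if (!hasNext()) throw NoSuchElementException()
        val codePoint = Character.codePointAt(text, index)
        index += Character.charCount(codePoint)
        return codePoint
    }
}

// Hypothetical inline loop: the lambda is inlined at the call site and no iterator object is created.
inline fun CharSequence.forEachCodePointSketch(action: (codePoint: Int) -> Unit) {
    var index = 0
    while (index < length) {
        val codePoint = Character.codePointAt(this, index)
        action(codePoint)
        index += Character.charCount(codePoint)
    }
}

// Counts code points through the iterator (one extra object per call, values pass through Iterator<Int>).
fun countViaIterator(text: CharSequence): Int {
    var count = 0
    CodePointIteratorSketch(text).forEach { _ -> count += 1 }
    return count
}

// Counts code points through the inline loop.
fun countViaInlineLoop(text: CharSequence): Int {
    var count = 0
    text.forEachCodePointSketch { _ -> count += 1 }
    return count
}

fun main() {
    val text = "a\uD83D\uDE00b" // 'a', U+1F600, 'b'
    println(countViaIterator(text))
    println(countViaInlineLoop(text))
}
```

Both functions return the same count; the difference is only in how the iteration is carried out. How much that matters in practice depends on the platform and has to be measured.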
These are the results of local benchmarking. On the JVM, the difference is about 30% for small strings and about 70% for long strings (10561 codepoints).
For native targets (I don't have a Mac to measure iOS targets) the difference is much bigger (5-10x).
Also, the |
Thank you for the pull request 👍 Sorry for not properly reading your description before posting the first comment. I was sitting on this pull request for a while because I was contemplating the question in #39. For now I think the functions to be added should not have support for providing a start or end index. But I think the
The Without having to support a start and end index, the implementation could also be simplified to:
inline fun CharSequence.forEachCodePointIndexed(action: (index: Int, codePoint: Int) -> Unit) {
    var index = 0
    while (index < length) {
        val codePoint = codePointAt(index)
        action(index, codePoint)
        index += CodePoints.charCount(codePoint)
    }
}
Do you have time to make the proposed changes? Otherwise I'll do it myself. |
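As a rough illustration of how the simplified loop above behaves, here is a self-contained JVM-only sketch. It assumes that Character.codePointAt and Character.charCount can stand in for the library's codePointAt and CodePoints.charCount; collectCodePoints is a hypothetical helper added only for the example.

```kotlin
// Assumption: JVM-only stand-ins (java.lang.Character) replace the library's
// codePointAt and CodePoints.charCount so that the snippet compiles on its own.
inline fun CharSequence.forEachCodePointIndexed(action: (index: Int, codePoint: Int) -> Unit) {
    var index = 0
    while (index < length) {
        val codePoint = Character.codePointAt(this, index)
        action(index, codePoint)
        // Advance by 1 char for BMP code points, by 2 for supplementary ones.
        index += Character.charCount(codePoint)
    }
}

// Hypothetical helper: records the (index, codePoint) pairs passed to the action.
fun collectCodePoints(text: CharSequence): List<Pair<Int, Int>> {
    val result = mutableListOf<Pair<Int, Int>>()
    text.forEachCodePointIndexed { index, codePoint -> result.add(index to codePoint) }
    return result
}

fun main() {
    // 'a', U+1F600 (encoded as a surrogate pair), 'b'
    for ((index, codePoint) in collectCodePoints("a\uD83D\uDE00b")) {
        println("$index: U+${codePoint.toString(16).uppercase()}")
    }
    // prints:
    // 0: U+61
    // 1: U+1F600
    // 3: U+62
}
```

Note that in this version the index passed to the action is the char offset in the CharSequence, not the ordinal of the code point, which is why the last line reports 3 rather than 2.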
Hi, @cketti. Thank you for taking a look at the PR. Initially, I was thinking about adding those functions to both But if you think this won't be a problem I am happy to make proposed changes. Will do it tomorrow, I think |
Having two |
Hi, @cketti. I made the proposed changes. Could you please take a look? However, I noticed that using this approach for iterating over codepoints
inline fun CharSequence.forEachCodePointIndexed(action: (index: Int, codePoint: Int) -> Unit) {
    var index = 0
    while (index < length) {
        val codePoint = codePointAt(index)
        action(index, codePoint)
        index += CodePoints.charCount(codePoint)
    }
}
introduces significant overhead for native targets (~30-70% depending on string length). Please find the benchmark results below (for the JVM target both methods perform more or less the same).
Because of that, I think we should keep the current version of the method without using |
4b09bc3 to f6b0a7c
Thanks 👍 |
The CharSequence has forEach and forEachIndexed extension methods in Kotlin that allow iterating over every char in the CharSequence.
I think it would be handy to have such methods to iterate over codepoints in CharSequence without the overhead of using Iterator or Sequence for that task.
Please let me know what you think about that.
If you have any suggestions on what should be changed in the current implementation I will be glad to hear them.
Thank you!
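A small sketch of the motivation, using only the Kotlin standard library: the existing forEachIndexed on CharSequence visits UTF-16 chars, so a character outside the Basic Multilingual Plane shows up as two surrogate chars rather than one codepoint. The helper name charCodes is hypothetical and exists only for this example.

```kotlin
// Hypothetical helper: collects the numeric value of every char visited by the
// standard library's CharSequence.forEachIndexed.
fun charCodes(text: CharSequence): List<Int> {
    val codes = mutableListOf<Int>()
    text.forEachIndexed { _, char -> codes.add(char.code) }
    return codes
}

fun main() {
    // 'a' followed by U+1F600, which needs a surrogate pair in UTF-16.
    val text = "a\uD83D\uDE00"
    text.forEachIndexed { index, char ->
        println("$index: ${char.code.toString(16)} surrogate=${char.isSurrogate()}")
    }
    // prints:
    // 0: 61 surrogate=false
    // 1: d83d surrogate=true
    // 2: de00 surrogate=true
}
```

A codepoint-aware forEach would instead report two elements for this string: 0x61 and 0x1F600.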