-
Notifications
You must be signed in to change notification settings - Fork 337
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Use rb_enc_interned_str
if available
#451
Conversation
ruby/ruby#3786 was merged which means @hsbt are you interested in this PR? |
@hsbt could I interest you in this PR, or @marcandre maybe? |
Looks like a great PR, but I don't have write access. I wish we move this to |
Yeah I know :/ Who else could I try to pick their interest on this? |
I'm not sure what advantage with this change. @nurse Do you interest in this? |
It should be more efficient. Just to be sure: interned strings may be garbage collected like symbols? |
It will be garbage collected when there's not references to the fstring: /~https://github.com/ruby/ruby/blob/55e52c19e74d5df90560ea1cc4f2a2b9f5d7a5c4/string.c#L1410-L1419
How large this improve the performance? If it improves enough performance, see review comments. |
As discussed in https://bugs.ruby-lang.org/issues/16029, we have large configuration files we parse on boot and keep in memory forever. They contains a lot of repeated keys, that end up allocated and deduplicated later. This change would reduce our allocations on boot by over 20%, and since we spend 25-30% of our boot time in the GC, I'd expect a relatively nice speedup for us here. Concretely, the following JSON: [
{"foo": 1},
{"foo": 2},
] Will currently allocate
Yes. It's similar to
If the numbers I posted above aren't enough I can try to craft a benchmark. |
Unless you meant the |
Yes, and I know. But as far as I understand these days just creating Ruby object is not so heavy while they are collected by Generational GC. I want to ensure the refactoring is worth to add more code complexity. Micro bench mark is ok, but also want one which tests alloca and xmalloc (shorter strings and longer strings). |
Ok. I'll try to produce theses soon. |
Ok, so first micro benchmark, to see if allocating on the stack is worth it: # frozen_string_literals: true
require 'benchmark/ips'
require 'json'
puts "Ruby: #{RUBY_VERSION}"
json = JSON.dump("\n\r\t\\")
Benchmark.ips do |x|
x.report('parse escaped') { JSON.parse(json) }
end This PR:
With same but with the condition commented out (so always using xmalloc):
So the difference doesn't seem that huge, but seem consistent. If you think it is a useless optimization, I'm fine with removing it. |
Oh, I forgot you asked for larger strings as well: # frozen_string_literals: true
require 'benchmark/ips'
require 'json'
puts "Ruby: #{RUBY_VERSION}"
json = JSON.dump("\n\r\t\\")
json2 = JSON.dump("\n\r\t\\" * 16)
Benchmark.ips do |x|
x.report('parse escaped(10B)') { JSON.parse(json) }
x.report('parse escaped(130B)') { JSON.parse(json2) }
end This PR:
With alloca removed:
|
Another benchmark, so see the need for this feature overall: # frozen_string_literal: true
require 'benchmark/ips'
require 'json'
puts "Ruby: #{RUBY_VERSION}"
json = JSON.dump(
'foo' => 'bar',
'plopplopplopplopplopplopplopplopplopplopplopplopplopplopplop' => 12,
'bar' => 'foo',
'egg' => 'spam',
'spam' => 'egg',
)
Benchmark.ips do |x|
x.report('parse(freeze: true)') { JSON.parse(json, freeze: true) }
end This time on ruby 3.0.0 (to have
This PR:
|
@nurse if these benchmark are good enough for you I can work on rebasing and solving your comments. |
a137e56
to
bed9dee
Compare
@casperisfine Thank you for the benchmark. It looks this PR improves so much and alloca is also worth to introduce additional complexity. I'll merge this when you fix this. |
492bb70
to
33bcc78
Compare
@nurse I moved the |
Oops sorry indeed. Please ignore that. |
I just noticed travis isn't testing against Ruby 3.0. I can push that if you'd like. |
33bcc78
to
471cf62
Compare
@nurse is there anything else I could do? |
Some CIs are failed and how about 38e8477#r553682398? |
Indeed, but I don't see any output. I don't know why they failed. I'll see if I can retry the jobs.
Not sure what this links to. It point to and old commit but I see no comment. |
Seems like I don't have the permission to retry the jobs, and they are all very specific versions of OSX, test fully pass on ubuntu. I suspect some kind of flakiness, but no way to tell without the output. |
44f2b55
to
1982070
Compare
Ok, based on the comment id I think you are referring to I edited my last commit message to try to restart CI, we'll see how it goes. |
So all the |
@casperisfine thanks! I merged this! Merged commit's CI works fine. |
Thank you! |
…cation. Fixes ohler55#275 - Background info: https://en.wikipedia.org/wiki/String_interning - Available to C extensions since MRI Ruby 3.0: - https://bugs.ruby-lang.org/issues/13381 - https://bugs.ruby-lang.org/issues/16029 This change makes `Ox` use "interned" (frozen and deduplicated) `String`s anywhere possible, the same effect as calling `String#-@` except without the heap churn of `RVALUE` allocation: https://ruby-doc.org/core/String.html#method-i-2B-40 - Uses the `HAVE_<funcname>` preprocessor macros to detect this functionality and avoid breaking older Rubies (confirmed with at least MRI 2.7.2). - I explicitly use interned `String`s anywhere an allocated `String` seemed entirely internal to `Ox`, e.g. in `sax_value_as_time` when allocating the `String` argument to `ox_time_class`. - Adds a new user-facing option `intern_strings` to control the behavior of `String` return values from user-facing functions, e.g. from `sax_as_s`. - Results in a ~25% reduction in startup GC pressure in my library! :D Inspired by similar changes to `json` and `msgpack`: - ruby/json#451 - msgpack/msgpack-ruby#196 My goal is `Object` allocation reduction in the `Ox::Sax` handler of my MIME Type file-identification library caused by the large number of duplicate `String`s in the `shared-mime-info`-format XML it uses as a data source. I was already calling `#-@` (or using the `-some_str_variable` syntax) everywhere possible in my handler, but that only freezes and deduplicates an already-allocated `String`: irb(main):068:0> oid = proc { puts "#{_1} is object_id #{_1.object_id}" } => #<Proc:0x0000564d53eb6850 (irb):66> irb(main):069:0> lol = 'lol'.tap(&oid).-@.tap(&oid) lol is object_id 19800 lol is object_id 19740 => "lol" irb(main):070:0> -lol.tap(&oid).-@.tap(&oid) lol is object_id 19740 lol is object_id 19740 => "lol" That avoids duplicate retained `String`s but still causes my library to take a big GC hit right at startup when it loads its data. All the stats below were collected using SamSaffron's `memory_profiler`: /~https://github.com/SamSaffron/memory_profiler First, a sanity-check to make sure I didn't break MRI < 3.0, using MRI 2.7 + latest official `Ox` from RubyGems (2.14.5): [okeeblow@emi#CHECKING-YOU-OUT] ruby -v ruby 2.7.2p137 (2020-10-01 revision 5445e04352) [x86_64-linux] [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 20561351 bytes (401372 objects) Total retained: 1680971 bytes (22102 objects) versus same MRI 2.7 but with my patched `Ox`: [okeeblow@emi#CHECKING-YOU-OUT] gem install ../../ox/ox-2.14.5.gem [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 20561111 bytes (401366 objects) Total retained: 1680971 bytes (22102 objects) No difference between the two, showing that this patch is a no-op for pre-3.0. Sorry I did not test older versions of MRI or other Rubies like J/Truffle/etc since I don't have them on my system. Now the real gainz can be seen when comparing MRI 3.0 with unpatched `Ox`: [okeeblow@emi#CHECKING-YOU-OUT] bundle install Fetching gem metadata from https://rubygems.org/...... … Installing ox 2.14.5 with native extensions Using checking-you-out 0.7.0 from source at `.` Bundle complete! 11 Gemfile dependencies, 19 gems now installed. [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 20081080 bytes (390133 objects) Total retained: 1713209 bytes (22095 objects) against same MRI 3.0 with my patched `Ox`: [okeeblow@emi#CHECKING-YOU-OUT] gem install ../../ox/ox-2.14.5.gem Building native extensions. This could take a while... Successfully installed ox-2.14.5 Parsing documentation for ox-2.14.5 unknown encoding name ""UTF-8"" for README.md, skipping Installing ri documentation for ox-2.14.5 Done installing documentation for ox after 0 seconds 1 gem installed [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 17298804 bytes (322860 objects) Total retained: 1713202 bytes (22073 objects) against same MRI 3.0, my patched `Ox`, and `Ox.parse_sax(intern_strings: true)`: [okeeblow@emi#CHECKING-YOU-OUT] ./bin/are-we-unallocated-yet|grep Total Total allocated: 16414938 bytes (301382 objects) Total retained: 1713226 bytes (22073 objects) This shows that just having Ruby 3.0 available results in a huge win, and opting into immutable `String`s from `Value#as_s` helps me even more! Unit tests pass: [okeeblow@emi#test] ruby tests.rb Loaded suite tests Started ................. 16 tests, 40 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications 100% passed ..... 18 tests, 42 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications 100% passed ............................................................................................................................... 151 tests, 249 assertions, 0 failures, 0 errors, 0 pendings, 0 omissions, 0 notifications 100% passed Apologies for possible indentation issues in this patch. There seem to be lots of existing lines with a mix of tabs and spaces — nbd, it happens — so I tried to match the surrounding lines everywhere I made an addition to make sure things line up and look okay in `git diff`. I only use `Ox` for Sax parsing, not for marshalling, so please scrutinize my changes in `gen_load.c` and friends extra hard in case I omitted any `intern_strings` option checks that could result in an unwary user getting an immutable `String` where there wasn't one before. The one change I had to make to a `String#force_encoding`-using test case is an example of what I want to avoid surprising anyone with.
Ref: https://bugs.ruby-lang.org/issues/13381
So if all goes well, Ruby 3.0 should expose
rb_enc_interned_str(char* ptr, long len, rb_encoding* enc)
. This function returns an interned string likeString#-@
, except that if the string already exist, it can return it without any allocation.So for documents with a lot of repeated object keys, or using the
freeze => true
option, this has the potential to very heavily cut down on the number of allocation.NB: the function doesn't work well on current ruby master, but hopefully ruby/ruby#3786 will make it usable. But I'd like to start the discussion now if possible.