-
Notifications
You must be signed in to change notification settings - Fork 19
Define most of Pathname in Ruby #53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
ef3c4ab
to
9e3c777
Compare
@eregon out of curiosity, do you have benchmarks for this change? Or does it stay mostly the same performance? |
It's significantly faster (first line is this branch, second line is
|
A bit of a nitpick, but I think Even better, if you require |
Nevermind, I missed the |
* This is just before methods started to be moved from Ruby code to the C extension. * BTW, in the ruby/pathname repository there was no pathname.rb before that commit.
* This means it's only additions in lib/pathname.rb and zero removals.
9e3c777
to
b868d69
Compare
* The <=> implementation in the extension is much faster, so is kept. * The other methods are actually faster in Ruby than in C, because rb_funcall() and rb_ivar_get() in C code have no inline cache, but method calls and `@path` have inline caches in Ruby code. https://railsatscale.com/2023-08-29-ruby-outperforms-c/ is an explanation of that (though it was known well before that).
* Avoids a MatchData allocation.
b868d69
to
c96b559
Compare
I am merging this, I don't want this work to go to waste, the code is clearly more maintainable and readable as Ruby instead of C and it's even faster. |
In the last meeting it has been decided to make Pathname a builtin, hence this repo/gem will become a noop. IMO this change is good, but I think it has to happen in core? |
Once upon a time, Pathname was pure-Ruby: https://github.com/ruby/ruby/blob/95bc02237635d3fe42532bfe53038257575cee75/lib/pathname.rb
This PR goes back to that, but keeps the C extension implementation of
<=>
as that one is significantly faster.The other Pathname methods are actually faster in Ruby than in C, because all these methods just do
rb_funcall()
andrb_ivar_get()
and those in C code have no inline cache, but the corresponding method calls and@path
have inline caches in Ruby code.https://railsatscale.com/2023-08-29-ruby-outperforms-c/ is an explanation of that (though it was known well before that).
I have discussed this with @akr several times (notably in https://bugs.ruby-lang.org/issues/17473) and the last time he said it was OK to do this change.
The main goals are:
From my discussions with @akr, IIRC, the original motivation to rewrite pathname.rb to C, besides the optimization for
<=>
, was apparently to use*at
functions likeopenat
(seeman openat
,Rationale for openat() and other directory file descriptor APIs
) but these are not portable, it did not happen, and is only useful in very rare edge cases.The Ruby
Dir
class could potentially support some of that, but it seems it has never been important enough for someone to implement it.The API of Pathname would anyway also need to change to take advantage of a working directory different than the process CWD, e.g. Pathname methods would need to take an extra "Pathname to use as working directory" argument.
(because if one just uses
Pathname("relative/path").open(...)
there is no point to use*at()
functions).I worked to make the diff really clean, it only adds lines in
lib/pathname.rb
and only removes lines inext/pathname/pathname.c
. That way it should be easy to review it.I restored the Ruby implementation of the methods from ed9270a, the commit just before methods started being migrated to the C extension.
I then fixed things to make the test suite pass and implemented the few missing methods based on their C definition.
The individual commits and their messages make it clear what exactly happened, so I would recommend to review commit-by-commit.