-
Notifications
You must be signed in to change notification settings - Fork 173
Use posix-style paths internally, only convert for IO #922
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you so much for exploring this pathway. To me this feels much better than #881.
It leaves open the question of tracking which path is a url or a file, but this could be done as a separate PR, and maybe is just an issue of how we name variables rather than typescripting it (??).
The part that feels redundant is the special handling of \r\n. In my opinion (and it's just a weak opinion at this point), we should not change anything here in the code and the data assets; and instead, make tests smarter (for instance, we don't care if a python data loader returns "python\r\n" or "python\n" — instead let's test that the trimmed output is "python").
# Ask git to automatically ensure that text files have posix-style line endings. | ||
* text eol=lf | ||
|
||
# Git can usually detect binary files, but lets mark the binary files we have |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# Git can usually detect binary files, but lets mark the binary files we have | |
# Git can usually detect binary files, but let’s mark the binary files we have |
*.tgz binary | ||
*.webp binary | ||
*.xlsx binary | ||
*.zip binary |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't plan to run the repo under Windows, except for unit tests—and in that case I'd rather see a failure if something changes. Can we just remove the gitattributes file and live with whatever this results in?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I didn't add this to support local development; it is required for Windows CI. I could remove the extra binary attributes and just trust Git to handle binary detection.
Note that the "Waiting for status to be reported" state of CI is because I changed the names of the expected CI jobs. We'll need to have a sort of sea-change to merge this, where we update the CI to expect the new jobs. One thing we might consider is to introduce a "Capstone" job which itself depends on all the needed jobs. That gives us an option to change CI without having to change Github settings in the future. |
Co-authored-by: Philippe Rivière <[email protected]>
3e21af1
to
0c33d22
Compare
Closing in favor of #944. |
This is an alternate approach to Windows support (#90), in contrast to #881. The two PRs share a lot of the common cross platform fixes, but differe in how paths are handled.
In #881, every path was "branded" as either a
FilePath
or aUrlPath
, and there were helpers that helped convert to, from, and between these types. In this PR, every path is kept as a "posix" style path (with forward slashes), until we have to interact with an external system like writing to a file or calling rollup. As late as possible the path is converted to use the OS path format, which is a no-op on macOS and Linux.Comparing the two, #881 changed 87 files, with a line change of +2020/-1264. This PR touched 68 files, with a line change of +1186/-646. All other things being equal, I like smaller changes. The line count of #881 is inflated, due to it often making lines longer and causing extra wrapping.
From a subjective point of view, the two approaches feel quite different to work with. When working with the branding path types, the issue of OS vs URL paths is felt everywhere, and needs to be constantly thought about. However, it is almost always an easy thing to figure out. With this PR, the internal consistency meant that (as long as you use the right imports) the thought about different paths is almost completely invisible. However, when it is felt, it is a lot harder to figure out, since there aren't any guard rails from the type system.
In this PR I also had to figure out to deal with Windows' driver letters, which #881 got to pretty much ignore. The approach I went with is to turn a Windows path like
C:\Users\Amaya\
to/C:/Users/Amaya/
. The file system handlers then have to know how to recognize that prefix and handle it appropriately. Getting that working correctly was about 1/4 of the time I spent on this project. I'm only confident I got it working due to good test coverage.Assuming that CI passes here, my vote would be for this PR. I like the pattern of concentrating "the hard parts" into well defined interfaces instead of spreading it lighly over the whole code base. This is a leaky abstraction due to drive letters, but in the spirit of worse is better I think that it is the right trade off.