Skip to content

FilePath syntactic operations #14

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Feb 5, 2021
Merged

Conversation

milseman
Copy link
Contributor

@milseman milseman commented Nov 5, 2020

Syntactic operations for FilePath, inspired by Rust and C++17. Includes Windows support.

Review thread

@abertelrud
Copy link

Overall this looks great! Really excited to see this API!

@apple apple deleted a comment Jan 31, 2021
@apple apple deleted a comment Jan 31, 2021
Add CInterop namespace and PlatformChar/PlatformEncoding typealiases for
narrow/wide characters.

Change Windows syscalls to use wide-character variants. Add full Windows root
parser to error-correct malformed roots and correctly normalize roots and
post-root portions of the path.

Add platformString APIs as replacements for cString APIs.

Add internal SystemString as backing storage type for FilePath.

Add Windows path mocking support.

This change includes API/ABI additions and deprecations.
Add the proposed non-file-system-interacting FilePath operations, Component,
Root, and ComponentView.

Proposal: https://forums.swift.org/t/api-review-filepath-syntactic-apis-version-2/44197

This change includes API/ABI additions and deprecations.
This API is fully redundant with .kind == .regular and could be a source of
confusion and namespace pollution. Drop it for now, and we can add something
equivalent in the future when we have more component-analysis APIs.
@milseman milseman marked this pull request as ready for review February 2, 2021 17:29
@milseman milseman changed the title [Draft]: FilePath syntactic operations FilePath syntactic operations Feb 2, 2021

public subscript(position: Index) -> FilePath.Component {
let end = _path._parseComponent(startingAt: position._storage).componentEnd
return FilePath.Component(_path, position._storage ..< end)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth making ComponentView.Index store a Range so subscript is O(1)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is amortized constant, as in a subscript over every index will be a total of O(n). We could store the range, but that would complicate startIndex and/or the initializer for ComponentView. If we have an unknown representation for the length, that further complicates other logic. subscript is not the only operation to optimize for, as many operations (especially RRC operations) only advance indices or slice the collection.

This "parse" is just scanning bytes looking for the / byte (or \ on Windows). I don't think it's worth spending the complexity budget on, since these types are resilient, these operations are not inlinable, and the scanned byte range is loaded at the end (with a retain, at least until we do small-form for small components).

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's good we have the flexibility to add this in the future.

Computing the Range in the ComponentView initializer seems like the right direction to me. It's the direction we've taken with the ChunkedByCount and SlidingWindows collections from the Algorithms package.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

By "resilient" I wasn't referring to the ability to add it later (though that's true), I was referring to the performance characteristics of the code. Inside the call, subscript does something computers are really good at (linear scan within a cache line for data that will be loaded anyways), after having been called indirectly and returning its result indirectly.

From a performance perspective, I don't think ChunkedByCount and SlidingWindows are analogous, as their collection status is primarily for the purpose of yielding elements. They're not RRC, for example.

Copy link

@kylemacomber kylemacomber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! I only had minor additional comments (and didn't have time to carefully read through the tests).

return self[..<idx]
}

internal mutating func _eatThrough(_ idx: Index) -> Slice {

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if these should be eat(upTo:) and eat(through:), taking their cue from prefix(upTo:) and prefix(through:)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They could be here. It depends on what the total overload set is, as argument labels don't play as well with trailing closures. Either way, these are temporary helpers that I can hopefully drop in the future.

self.formIndex(after: &readIdx)
}
}
self.removeLast(self.distance(from: writeIdx, to: readIdx))

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting so this is basically:

uniqueAdjacentDuplicates { isSeparator($0) }

Maybe we need to add that algorithm after all 🤔

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now that we have a prepass to convert / to \ on Windows, it is. It used to also do the separator change here, but that makes root parsing significantly more complex.


extension SystemString {
internal init() {
self.nullTerminatedStorage = [.null]

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A few times in the code you mention "Intern[ing] the empty SystemString," is this as simple as declaring a global or static variable to store this [.null]?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can be, but IIRC how you declare a global matters and we don't want to unnecessarily dirty memory. I never understood all the differences (especially as they interact with @usableFromInline, which wouldn't be relevant here yet).

Fix bug in _modify and add better testing. Change lexicallyNormal to
lexicallyNormalized() to match "ed" rule.

Fix up some comments and remove some dead code

Co-authored-by: Kyle Macomber <[email protected]>
@milseman milseman merged commit 5bea53a into apple:main Feb 5, 2021
@milseman milseman deleted the route_morphology branch February 5, 2021 00:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants