WIP: correction cache refactor init #715
base: dev
Conversation
// NOTE: this is a hack for fuzzy_search only.
// The algorithm iterates over all unique_paths.
// I'm sure we can find a better way to implement it.
unique_paths: HashSet<Vec<usize>>,
These unique paths for fuzzy search are, I think, meant to be like the previous cache_shortened. To do fuzzy search over them, we'll need them to be AT LEAST the length relative to the workspace folder. For example, if I have opened my project in /home/user/work/ and I have a file /home/user/work/dir1/file.ext, it should never be shortened to just file.ext, but to dir1/file.ext.
And to implement it, I guess we could just mark some nodes with a boolean flag that is true when the node is the end of one of those shortened paths, like dir1/file.ext. We can mark them after the build by walking the trie until the count is 1 and we're deep enough that we crop no more than the workspace folder. Iterating over those paths would then be a matter of walking the trie and retrieving the marked nodes.
There may be a simpler implementation for this, but this one will work and is not super complex, I guess.
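A minimal sketch of that marking pass, assuming a hypothetical trie keyed by reversed path components (file name first); `TrieNode`, `count`, `is_shortened_end` and `mark_one_path` are illustrative names, not the actual types in this PR:

```rust
// Hedged sketch: names and structure are assumptions, not the PR's real code.
use std::collections::HashMap;

#[derive(Default)]
struct TrieNode {
    children: HashMap<String, TrieNode>,
    count: usize,           // how many indexed paths pass through this node
    is_shortened_end: bool, // ends a shortened path such as "dir1/file.ext"
}

fn insert(root: &mut TrieNode, rev_components: &[String]) {
    let mut node = root;
    for comp in rev_components {
        node = node.children.entry(comp.clone()).or_default();
        node.count += 1;
    }
}

/// After the build, walk each path again and mark the first node where the
/// shortened form is both unique (count == 1) and long enough that we crop
/// no more than the workspace folder (depth >= workspace-relative length).
fn mark_one_path(root: &mut TrieNode, rev_components: &[String], min_depth: usize) {
    let mut node = root;
    for (depth, comp) in rev_components.iter().enumerate() {
        node = node.children.get_mut(comp).expect("path was inserted before marking");
        if node.count == 1 && depth + 1 >= min_depth {
            node.is_shortened_end = true;
            return;
        }
    }
}

fn main() {
    let mut root = TrieNode::default();
    // /home/user/work/dir1/file.ext, workspace folder /home/user/work/
    let rev: Vec<String> = ["file.ext", "dir1", "work", "user", "home"]
        .iter().map(|s| s.to_string()).collect();
    insert(&mut root, &rev);
    // dir1/file.ext has 2 components relative to the workspace folder,
    // so "file.ext" alone can never become the shortened form
    mark_one_path(&mut root, &rev, 2);
}
```

Marking the first node that satisfies both conditions keeps each shortened path as short as the workspace-folder rule allows.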
    .map(|comp| comp.as_os_str().to_string_lossy().to_string())
    .collect();

for i in (0..components.len()).rev() {
Same thing here: we shouldn't crop more than the workspace folder. I'm not sure this handles that well.
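To illustrate the bound this asks for, a hedged sketch, assuming a workspace folder path is available at this point; the helper names here are hypothetical, not the PR's actual code:

```rust
use std::path::Path;

// Hypothetical helper: the number of trailing components we must always keep,
// i.e. the length of the path relative to the workspace folder.
fn min_kept_components(path: &Path, workspace_folder: &Path) -> usize {
    path.strip_prefix(workspace_folder)
        .map(|rel| rel.components().count())
        .unwrap_or(1) // a path outside the workspace keeps at least the file name
}

fn shortened_suffixes(components: &[String], min_kept: usize) -> Vec<String> {
    let mut out = Vec::new();
    // same loop shape as the diff above, but suffixes that would crop past the
    // workspace folder (fewer than min_kept components) are skipped
    for i in (0..components.len()).rev() {
        if components.len() - i < min_kept {
            continue;
        }
        out.push(components[i..].join("/"));
    }
    out
}
```

With /home/user/work/dir1/file.ext and min_kept = 2, this would yield dir1/file.ext and longer suffixes, but never file.ext alone.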
// it's dangerous to use cache_correction_arc without a mutex, but should be fine as long as it's read-only
// (another thread never writes to the map itself, it can only replace the arc with a different map)

if let Some(fixed) = (*cache_correction_arc).get(&correction_candidate.clone()) {
    return fixed.into_iter().cloned().collect::<Vec<String>>();
    // NOTE: do we need top_n here?
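For context on the read-only remark above, a minimal sketch of the snapshot pattern being described: readers clone the Arc and then read without holding a lock, and a writer only ever swaps in a whole new map. The names here are illustrative, not the PR's actual fields:

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

type CorrectionMap = HashMap<String, Vec<String>>;

// Readers take a cheap Arc clone and then read lock-free; this is safe
// because the map behind any given Arc is never mutated in place.
fn lookup(cache: &Mutex<Arc<CorrectionMap>>, key: &str) -> Option<Vec<String>> {
    let snapshot: Arc<CorrectionMap> = cache.lock().unwrap().clone();
    snapshot.get(key).cloned()
}

// The writer builds a fresh map and replaces the Arc; concurrent readers
// holding the old Arc keep their consistent snapshot until they drop it.
fn rebuild(cache: &Mutex<Arc<CorrectionMap>>, new_map: CorrectionMap) {
    *cache.lock().unwrap() = Arc::new(new_map);
}
```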
top_n is mostly a limit for fuzzy search, I guess. We could assume that not many files will match in the non-fuzzy case, but maybe they will, so I'm not sure; maybe we should apply top_n to both? I think we already handle "... and n files more" somewhere, though I'm not sure it applies here.
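If top_n were applied to the exact-match branch too, it could look something like this hedged sketch; `top_n` and the "... and n files more" message format are assumptions based on the comment above, not confirmed PR behavior:

```rust
// Hypothetical truncation for the exact-match branch, mirroring the
// "... and n files more" handling the comment mentions.
fn truncate_results(mut results: Vec<String>, top_n: usize) -> Vec<String> {
    if results.len() > top_n {
        let hidden = results.len() - top_n;
        results.truncate(top_n);
        results.push(format!("... and {} files more", hidden));
    }
    results
}
```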
Files and dirs caches based on a trie data structure.
todo: