-
Notifications
You must be signed in to change notification settings - Fork 13.3k
rustdoc: Fix generics generation in search index #88268
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,7 +1,7 @@ | ||
use std::collections::hash_map::Entry; | ||
use std::collections::BTreeMap; | ||
|
||
use rustc_data_structures::fx::{FxHashMap, FxHashSet}; | ||
use rustc_data_structures::fx::FxHashMap; | ||
use rustc_middle::ty::TyCtxt; | ||
use rustc_span::symbol::Symbol; | ||
use serde::ser::{Serialize, SerializeStruct, Serializer}; | ||
|
@@ -192,32 +192,24 @@ crate fn get_index_search_type<'tcx>( | |
item: &clean::Item, | ||
tcx: TyCtxt<'tcx>, | ||
) -> Option<IndexItemFunctionType> { | ||
let (all_types, ret_types) = match *item.kind { | ||
let (mut inputs, mut output) = match *item.kind { | ||
clean::FunctionItem(ref f) => get_all_types(&f.generics, &f.decl, tcx), | ||
clean::MethodItem(ref m, _) => get_all_types(&m.generics, &m.decl, tcx), | ||
clean::TyMethodItem(ref m) => get_all_types(&m.generics, &m.decl, tcx), | ||
_ => return None, | ||
}; | ||
|
||
let inputs = all_types | ||
.iter() | ||
.map(|(ty, kind)| TypeWithKind::from((get_index_type(ty), *kind))) | ||
.filter(|a| a.ty.name.is_some()) | ||
.collect(); | ||
let output = ret_types | ||
.iter() | ||
.map(|(ty, kind)| TypeWithKind::from((get_index_type(ty), *kind))) | ||
.filter(|a| a.ty.name.is_some()) | ||
.collect::<Vec<_>>(); | ||
inputs.retain(|a| a.ty.name.is_some()); | ||
output.retain(|a| a.ty.name.is_some()); | ||
let output = if output.is_empty() { None } else { Some(output) }; | ||
|
||
Some(IndexItemFunctionType { inputs, output }) | ||
} | ||
|
||
fn get_index_type(clean_type: &clean::Type) -> RenderType { | ||
fn get_index_type(clean_type: &clean::Type, generics: Vec<TypeWithKind>) -> RenderType { | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why did you remove There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because otherwise you need to set it into the returned type. With this, it cannot be forgotten. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure what you mean. I'd like this change to be as small and simple as possible since I haven't interacted with this part of rustdoc before. I'd really appreciate it if you could revert all changes in this PR other than the part that inlines bounds better, because it's really hard for me to understand what the changes are doing when they are all mixed together. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. All the changes are about that, including this one... |
||
RenderType { | ||
name: get_index_type_name(clean_type, true).map(|s| s.as_str().to_ascii_lowercase()), | ||
generics: get_generics(clean_type), | ||
generics: if generics.is_empty() { None } else { Some(generics) }, | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why store an option for this? The caller can check if the Vec is empty, this means it has to check for both None and empty. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because if it's not an option, it'll generate There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll add a comment to make it clear. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm confused though - can't you just do this same check wherever it would generate an empty Vec? Why does it have to be specifically here? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'm not sure about the other places... In this case, it's contained inside a vec, so if the vec length is 1 or 2, it's not really a problem. For the other places, it might be different. |
||
} | ||
} | ||
|
||
|
@@ -246,23 +238,6 @@ fn get_index_type_name(clean_type: &clean::Type, accept_generic: bool) -> Option | |
} | ||
} | ||
|
||
/// Return a list of generic parameters for use in the search index. | ||
/// | ||
/// This function replaces bounds with types, so that `T where T: Debug` just becomes `Debug`. | ||
/// It does return duplicates, and that's intentional, since search queries like `Result<usize, usize>` | ||
/// are supposed to match only results where both parameters are `usize`. | ||
fn get_generics(clean_type: &clean::Type) -> Option<Vec<String>> { | ||
clean_type.generics().and_then(|types| { | ||
let r = types | ||
.iter() | ||
.filter_map(|t| { | ||
get_index_type_name(t, false).map(|name| name.as_str().to_ascii_lowercase()) | ||
}) | ||
.collect::<Vec<_>>(); | ||
if r.is_empty() { None } else { Some(r) } | ||
}) | ||
} | ||
|
||
/// The point of this function is to replace bounds with types. | ||
/// | ||
/// i.e. `[T, U]` when you have the following bounds: `T: Display, U: Option<T>` will return | ||
|
@@ -272,27 +247,77 @@ crate fn get_real_types<'tcx>( | |
generics: &Generics, | ||
arg: &Type, | ||
tcx: TyCtxt<'tcx>, | ||
recurse: i32, | ||
res: &mut FxHashSet<(Type, ItemType)>, | ||
) -> usize { | ||
fn insert(res: &mut FxHashSet<(Type, ItemType)>, tcx: TyCtxt<'_>, ty: Type) -> usize { | ||
if let Some(kind) = ty.def_id_no_primitives().map(|did| tcx.def_kind(did).into()) { | ||
res.insert((ty, kind)); | ||
1 | ||
recurse: usize, | ||
res: &mut Vec<TypeWithKind>, | ||
) { | ||
fn insert_ty( | ||
res: &mut Vec<TypeWithKind>, | ||
tcx: TyCtxt<'_>, | ||
ty: Type, | ||
mut generics: Vec<TypeWithKind>, | ||
) { | ||
let is_full_generic = ty.is_full_generic(); | ||
|
||
if is_full_generic && generics.len() == 1 { | ||
// In this case, no need to go through an intermediate state if the generics | ||
// contains only one element. | ||
// | ||
// For example: | ||
// | ||
// fn foo<T: Display>(r: Option<T>) {} | ||
// | ||
// In this case, it would contain: | ||
// | ||
// ``` | ||
// [{ | ||
// name: "option", | ||
// generics: [{ | ||
// name: "", | ||
// generics: [ | ||
// name: "Display", | ||
// generics: [] | ||
// }] | ||
// }] | ||
// }] | ||
// ``` | ||
// | ||
// After removing the intermediate (unnecessary) full generic, it'll become: | ||
// | ||
// ``` | ||
// [{ | ||
// name: "option", | ||
// generics: [{ | ||
// name: "Display", | ||
// generics: [] | ||
// }] | ||
// }] | ||
// ``` | ||
// | ||
// To be noted that it can work if there is ONLY ONE generic, otherwise we still | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why does it only work if there's a single generic argument? It seems weird to have an outer object that just has There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. It's what I tried to explain but I failed I guess haha. So basically, arguments can only have a type and generics. What happens in case the argument is a full generic which is There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I just discussed this with Guillaume, and I think I can explain it now. For purposes of search-indexing, However, There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Exactly. :) |
||
// need to keep it as is! | ||
res.push(generics.pop().unwrap()); | ||
return; | ||
} | ||
let mut index_ty = get_index_type(&ty, generics); | ||
if index_ty.name.as_ref().map(|s| s.is_empty()).unwrap_or(true) { | ||
return; | ||
} | ||
if is_full_generic { | ||
// We remove the name of the full generic because we have no use for it. | ||
index_ty.name = Some(String::new()); | ||
res.push(TypeWithKind::from((index_ty, ItemType::Generic))); | ||
} else if let Some(kind) = ty.def_id_no_primitives().map(|did| tcx.def_kind(did).into()) { | ||
res.push(TypeWithKind::from((index_ty, kind))); | ||
} else if ty.is_primitive() { | ||
// This is a primitive, let's store it as such. | ||
res.insert((ty, ItemType::Primitive)); | ||
1 | ||
} else { | ||
0 | ||
res.push(TypeWithKind::from((index_ty, ItemType::Primitive))); | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why did you change this bit? How do you know it's a primitive? Can you add an assertion? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There is a There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Oh, I misread the code. Why did you remove the else branch? What did 0 represent? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Before, it returned the number of items inserted to know if something needed to be added into the HashSet. Now we add all arguments (as long as they have a name). So basically, the |
||
} | ||
} | ||
|
||
if recurse >= 10 { | ||
// FIXME: remove this whole recurse thing when the recursion bug is fixed | ||
return 0; | ||
return; | ||
} | ||
let mut nb_added = 0; | ||
|
||
if let Type::Generic(arg_s) = *arg { | ||
if let Some(where_pred) = generics.where_predicates.iter().find(|g| match g { | ||
|
@@ -301,6 +326,7 @@ crate fn get_real_types<'tcx>( | |
} | ||
_ => false, | ||
}) { | ||
let mut ty_generics = Vec::new(); | ||
let bounds = where_pred.get_bounds().unwrap_or_else(|| &[]); | ||
for bound in bounds.iter() { | ||
if let GenericBound::TraitBound(poly_trait, _) = bound { | ||
|
@@ -309,41 +335,32 @@ crate fn get_real_types<'tcx>( | |
continue; | ||
} | ||
if let Some(ty) = x.get_type() { | ||
let adds = get_real_types(generics, &ty, tcx, recurse + 1, res); | ||
nb_added += adds; | ||
if adds == 0 && !ty.is_full_generic() { | ||
nb_added += insert(res, tcx, ty); | ||
} | ||
get_real_types(generics, &ty, tcx, recurse + 1, &mut ty_generics); | ||
} | ||
} | ||
} | ||
} | ||
insert_ty(res, tcx, arg.clone(), ty_generics); | ||
} | ||
if let Some(bound) = generics.params.iter().find(|g| g.is_type() && g.name == arg_s) { | ||
let mut ty_generics = Vec::new(); | ||
for bound in bound.get_bounds().unwrap_or(&[]) { | ||
if let Some(path) = bound.get_trait_path() { | ||
let ty = Type::ResolvedPath { did: path.def_id(), path }; | ||
let adds = get_real_types(generics, &ty, tcx, recurse + 1, res); | ||
nb_added += adds; | ||
if adds == 0 && !ty.is_full_generic() { | ||
nb_added += insert(res, tcx, ty); | ||
} | ||
get_real_types(generics, &ty, tcx, recurse + 1, &mut ty_generics); | ||
} | ||
} | ||
insert_ty(res, tcx, arg.clone(), ty_generics); | ||
} | ||
} else { | ||
nb_added += insert(res, tcx, arg.clone()); | ||
if let Some(gens) = arg.generics() { | ||
for gen in gens.iter() { | ||
if gen.is_full_generic() { | ||
nb_added += get_real_types(generics, gen, tcx, recurse + 1, res); | ||
} else { | ||
nb_added += insert(res, tcx, (*gen).clone()); | ||
} | ||
let mut ty_generics = Vec::new(); | ||
if let Some(arg_generics) = arg.generics() { | ||
for gen in arg_generics.iter() { | ||
get_real_types(generics, gen, tcx, recurse + 1, &mut ty_generics); | ||
} | ||
} | ||
insert_ty(res, tcx, arg.clone(), ty_generics); | ||
} | ||
nb_added | ||
} | ||
|
||
/// Return the full list of types when bounds have been resolved. | ||
|
@@ -354,38 +371,41 @@ crate fn get_all_types<'tcx>( | |
generics: &Generics, | ||
decl: &FnDecl, | ||
tcx: TyCtxt<'tcx>, | ||
) -> (Vec<(Type, ItemType)>, Vec<(Type, ItemType)>) { | ||
let mut all_types = FxHashSet::default(); | ||
) -> (Vec<TypeWithKind>, Vec<TypeWithKind>) { | ||
let mut all_types = Vec::new(); | ||
for arg in decl.inputs.values.iter() { | ||
if arg.type_.is_self_type() { | ||
continue; | ||
} | ||
let mut args = FxHashSet::default(); | ||
// FIXME: performance wise, it'd be much better to move `args` declaration outside of the | ||
// loop and replace this line with `args.clear()`. | ||
let mut args = Vec::new(); | ||
get_real_types(generics, &arg.type_, tcx, 0, &mut args); | ||
if !args.is_empty() { | ||
// FIXME: once back to performance improvements, replace this line with: | ||
// `all_types.extend(args.drain(..));`. | ||
all_types.extend(args); | ||
} else { | ||
if let Some(kind) = arg.type_.def_id_no_primitives().map(|did| tcx.def_kind(did).into()) | ||
{ | ||
all_types.insert((arg.type_.clone(), kind)); | ||
all_types.push(TypeWithKind::from((get_index_type(&arg.type_, vec![]), kind))); | ||
} | ||
} | ||
} | ||
|
||
let ret_types = match decl.output { | ||
let mut ret_types = Vec::new(); | ||
match decl.output { | ||
FnRetTy::Return(ref return_type) => { | ||
let mut ret = FxHashSet::default(); | ||
get_real_types(generics, return_type, tcx, 0, &mut ret); | ||
if ret.is_empty() { | ||
get_real_types(generics, return_type, tcx, 0, &mut ret_types); | ||
if ret_types.is_empty() { | ||
if let Some(kind) = | ||
return_type.def_id_no_primitives().map(|did| tcx.def_kind(did).into()) | ||
{ | ||
ret.insert((return_type.clone(), kind)); | ||
ret_types.push(TypeWithKind::from((get_index_type(return_type, vec![]), kind))); | ||
} | ||
} | ||
ret.into_iter().collect() | ||
} | ||
_ => Vec::new(), | ||
_ => {} | ||
}; | ||
(all_types.into_iter().collect(), ret_types) | ||
(all_types, ret_types) | ||
Comment on lines
-390
to
+410
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Is this the reason you changed the There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. If by "this" you mean the fact that a same type can be present multiple times into generics, then yes. ;) |
||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -299,10 +299,10 @@ window.initSearch = function(rawSearchIndex) { | |
var elems = Object.create(null); | ||
var elength = obj[GENERICS_DATA].length; | ||
for (var x = 0; x < elength; ++x) { | ||
if (!elems[obj[GENERICS_DATA][x]]) { | ||
elems[obj[GENERICS_DATA][x]] = 0; | ||
if (!elems[obj[GENERICS_DATA][x][NAME]]) { | ||
elems[obj[GENERICS_DATA][x][NAME]] = 0; | ||
} | ||
elems[obj[GENERICS_DATA][x]] += 1; | ||
elems[obj[GENERICS_DATA][x][NAME]] += 1; | ||
} | ||
var total = 0; | ||
var done = 0; | ||
|
@@ -345,6 +345,7 @@ window.initSearch = function(rawSearchIndex) { | |
// Check for type name and type generics (if any). | ||
function checkType(obj, val, literalSearch) { | ||
var lev_distance = MAX_LEV_DISTANCE + 1; | ||
var tmp_lev = MAX_LEV_DISTANCE + 1; | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Why this There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. True, I'll revert this one. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I don't really understand why the other changes were made either ;) Like with #88268 (comment), can you please have someone who's more knowledgeable about rustdoc's JS review this part? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I'll ask @Nemo157. |
||
var len, x, firstGeneric; | ||
if (obj[NAME] === val.name) { | ||
if (literalSearch) { | ||
|
@@ -354,10 +355,10 @@ window.initSearch = function(rawSearchIndex) { | |
var elems = Object.create(null); | ||
len = obj[GENERICS_DATA].length; | ||
for (x = 0; x < len; ++x) { | ||
if (!elems[obj[GENERICS_DATA][x]]) { | ||
elems[obj[GENERICS_DATA][x]] = 0; | ||
if (!elems[obj[GENERICS_DATA][x][NAME]]) { | ||
elems[obj[GENERICS_DATA][x][NAME]] = 0; | ||
} | ||
elems[obj[GENERICS_DATA][x]] += 1; | ||
elems[obj[GENERICS_DATA][x][NAME]] += 1; | ||
} | ||
|
||
var allFound = true; | ||
|
@@ -382,7 +383,7 @@ window.initSearch = function(rawSearchIndex) { | |
// If the type has generics but don't match, then it won't return at this point. | ||
// Otherwise, `checkGenerics` will return 0 and it'll return. | ||
if (obj.length > GENERICS_DATA && obj[GENERICS_DATA].length !== 0) { | ||
var tmp_lev = checkGenerics(obj, val); | ||
tmp_lev = checkGenerics(obj, val); | ||
if (tmp_lev <= MAX_LEV_DISTANCE) { | ||
return tmp_lev; | ||
} | ||
|
@@ -392,8 +393,8 @@ window.initSearch = function(rawSearchIndex) { | |
if ((!val.generics || val.generics.length === 0) && | ||
obj.length > GENERICS_DATA && obj[GENERICS_DATA].length > 0) { | ||
return obj[GENERICS_DATA].some( | ||
function(name) { | ||
return name === val.name; | ||
function(gen) { | ||
return gen[NAME] === val.name; | ||
}); | ||
} | ||
return false; | ||
|
@@ -404,17 +405,27 @@ window.initSearch = function(rawSearchIndex) { | |
// a levenshtein distance value that isn't *this* good so it goes | ||
// into the search results but not too high. | ||
lev_distance = Math.ceil((checkGenerics(obj, val) + lev_distance) / 2); | ||
} else if (obj.length > GENERICS_DATA && obj[GENERICS_DATA].length > 0) { | ||
} | ||
if (obj.length > GENERICS_DATA && obj[GENERICS_DATA].length > 0) { | ||
Comment on lines
-407
to
+409
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Same here. Why is this necessary? I'm not comfortable reviewing changes to the JS search code other than adding There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Because before, if we found that the item had a "close enough" name to what we're looking for, it then ran checks on generics as well, otherwise it was going to look into the generics. Now, in both cases we go look into generics and then we pick the best match out of the two. And yes, sorry for the JS changes... If that can make you feel better, you can always confirm it's working as expected by running the JS search tests. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. The way I interpret your description is that it's not necessary to bundle the JS change with the
Rustdoc doesn't currently have very exhaustive test coverage, so I'd rather have someone more knowledgeable look over the JS change. |
||
// We can check if the type we're looking for is inside the generics! | ||
var olength = obj[GENERICS_DATA].length; | ||
for (x = 0; x < olength; ++x) { | ||
lev_distance = Math.min(levenshtein(obj[GENERICS_DATA][x], val.name), | ||
lev_distance); | ||
tmp_lev = Math.min(levenshtein(obj[GENERICS_DATA][x][NAME], val.name), tmp_lev); | ||
GuillaumeGomez marked this conversation as resolved.
Show resolved
Hide resolved
|
||
} | ||
if (tmp_lev !== 0) { | ||
// If we didn't find a good enough result, we go check inside the generics of | ||
// the generics. | ||
for (x = 0; x < olength && tmp_lev !== 0; ++x) { | ||
tmp_lev = Math.min( | ||
checkType(obj[GENERICS_DATA][x], val, literalSearch), | ||
tmp_lev | ||
); | ||
} | ||
} | ||
} | ||
// Now whatever happens, the returned distance is "less good" so we should mark it | ||
// as such, and so we add 1 to the distance to make it "less good". | ||
return lev_distance + 1; | ||
return Math.min(lev_distance, tmp_lev) + 1; | ||
} | ||
|
||
function findArg(obj, val, literalSearch, typeFilter) { | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why are generic parameters treated as items?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The pure generic ones, like
fn foo<T: Display + Debug>(a: T)
, in here,a
cannot be both "Display" and "Debug", it needs to go through a level:So in here, the argument with name
""
is a "pure" generic, hence creating a newItemType
for it.There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
But value parameters like
a: T
don't have item types (at least from looking at theItemType
enum definition), so how are they represented?There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just with an empty name. If they are still present, might be interesting to remove them since they're useless for the search index.