Skip to content

Change in module structure in the upstream kernel breaks drgn #73

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
sdimitro opened this issue Oct 9, 2020 · 2 comments
Closed

Change in module structure in the upstream kernel breaks drgn #73

sdimitro opened this issue Oct 9, 2020 · 2 comments

Comments

@sdimitro
Copy link
Contributor

sdimitro commented Oct 9, 2020

Root Cause

Before version 5.8 the module structure looked like this:

struct module_sect_attr {
	struct module_attribute mattr;
	char *name;
	unsigned long address;
};

From that version, onwards it looks like this:

struct module_sect_attr {
	struct bin_attribute battr;
	unsigned long address;
};

Unfortunately drgn makes the assumption that the name field is part of the module_sect_attr structure within kernel_module_section_iterator_next_offline() when caching the kernel module sections:

static struct drgn_error *
kernel_module_section_iterator_next_offline(struct kernel_module_section_iterator *it,
					    const char **name_ret,
					    uint64_t *address_ret)
{
...
	err = drgn_object_member(&kmod_it->tmp3, &kmod_it->tmp2, "address");
	if (err)
		return err;
...
	err = drgn_object_member(&kmod_it->tmp3, &kmod_it->tmp2, "name"); // <---- Assumption here
	if (err)
		return err;
...
	return NULL;
}

An alternative way of getting the name field in these newer kernels starting from the same struct could be the following:

struct module_sect_attr -> battr -> attr -> name

There should be some conditional logic somewhere to deal with this, ideally based on the fields within the struct and not based on the version of the kernel - this way older kernels that had this patch backported can still avoid this bug.

Symptoms

In crash dumps that ran a 5.8 and later kernel (and sometimes earlier for distributions like Ubuntu - we hit this on their 5.4 kernel, so it seems like the have backported that patch) and we experience an infinite loop because of the above logic.

The specifics of the infinite loop are the following:

1] report_loaded_kernel_module() calls cache_kernel_module_sections() which in turn calls kernel_module_section_iterator_next{,_offline}.
2] The latter returns DRGN_ERROR_LOOKUP here:

static struct drgn_error *
kernel_module_section_iterator_next_offline(struct kernel_module_section_iterator *it,
					    const char **name_ret,
					    uint64_t *address_ret)
{
...
	err = drgn_object_member(&kmod_it->tmp3, &kmod_it->tmp2, "name"); // <---- DRGN_ERROR_LOOKUP
	if (err)
		return err;
...
}

3] This gets us back all the to report_loaded_kernel_module() where try to report that error:

report_loaded_kernel_module(struct drgn_debug_info_load_state *load,
			    struct kernel_module_iterator *kmod_it,
			    struct kernel_module_table *kmod_table)
{
...
	do {
		uint64_t start, end;
		err = cache_kernel_module_sections(kmod_it, kmod->elf, &start, // <---- DRGN_ERROR_LOOKUP returned
						   &end);
		if (err) {
			err = drgn_debug_info_report_error(load, kmod->path, // <---- err variable set to 0 here
							   "could not get section addresses",
							   err);
			if (err)
				return err;
			continue;          // <---- since the err variable we repeat steps 1 to 3 over and over on the *same* module - there is no progress
		}
...

This infinite loop may hint at a separate bug but nevertheless I wanted to report them together

@osandov
Copy link
Owner

osandov commented Oct 13, 2020

Thanks for the detailed report! I was able to reproduce both issues (and another related one). I'll fix those this week, it should be simple.

@osandov
Copy link
Owner

osandov commented Oct 13, 2020

I just pushed the fixes and created an issue to make sure this is tested automatically in the future. Let me know if you get the chance to test it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants