
Commit f2a07f4

Hugh Dickins authored and torvalds committed
tmpfs mempolicy: fix /proc/mounts corrupting memory
Recently I suggested using "mount -o remount,mpol=local /tmp" in NUMA mempolicy testing. Very nasty. Reading /proc/mounts, /proc/pid/mounts or /proc/pid/mountinfo may then corrupt one bit of kernel memory, often in a page table (causing "Bad swap" or "Bad page map" warning or "Bad pagetable" oops), sometimes in a vm_area_struct or rbnode or somewhere worse. "mpol=prefer" and "mpol=prefer:Node" are equally toxic.

Recent NUMA enhancements are not to blame: this dates back to 2.6.35, when commit e17f74a "mempolicy: don't call mpol_set_nodemask() when no_context" skipped mpol_parse_str()'s call to mpol_set_nodemask(), which used to initialize v.preferred_node, or set MPOL_F_LOCAL in flags. With slab poisoning, you can then rely on mpol_to_str() to set the bit for node 0x6b6b, probably in the next page above the caller's stack.

mpol_parse_str() is only called from shmem_parse_options(): no_context is always true, so call it unused for now, and remove !no_context code. Set v.nodes or v.preferred_node or MPOL_F_LOCAL as mpol_to_str() might expect.

Then mpol_to_str() can ignore its no_context argument also, the mpol being appropriately initialized whether contextualized or not. Rename its no_context unused too, and let subsequent patch remove them (that's not needed for stable backporting, which would involve rejects).

I don't understand why MPOL_LOCAL is described as a pseudo-policy: it's a reasonable policy which suffers from a confusing implementation in terms of MPOL_PREFERRED with MPOL_F_LOCAL. I believe this would be much more robust if MPOL_LOCAL were recognized in switch statements throughout, MPOL_F_LOCAL deleted, and MPOL_PREFERRED use the (possibly empty) nodes mask like everyone else, instead of its preferred_node variant (I presume an optimization from the days before MPOL_LOCAL). But that would take me too long to get right and fully tested.

Signed-off-by: Hugh Dickins <[email protected]>
Cc: [email protected]
Signed-off-by: Linus Torvalds <[email protected]>
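
To make the "node 0x6b6b" remark concrete, here is a minimal user-space sketch of the arithmetic behind an out-of-range bit set. It is not kernel code: the `sketch_` names, the choice of 1024 possible nodes, and the 64-bit mask words are all illustrative assumptions. The only value taken from the message above is 0x6b6b, which is what a 16-bit field reads as when it is filled with the slab poison byte 0x6b.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumptions for illustration only: 1024 possible nodes, 64-bit mask words. */
#define SKETCH_MAX_NODES  1024u
#define SKETCH_WORD_BITS  64u
#define SKETCH_MASK_WORDS (SKETCH_MAX_NODES / SKETCH_WORD_BITS)

/* A 16-bit preferred_node left holding the slab poison byte 0x6b twice. */
#define SKETCH_POISONED_NODE 0x6b6bu

/* Index of the mask word that setting bit `node` would write to. */
static size_t sketch_word_index(unsigned int node)
{
	return node / SKETCH_WORD_BITS;
}

/* Bit position inside that word. */
static unsigned int sketch_bit_in_word(unsigned int node)
{
	return node % SKETCH_WORD_BITS;
}

/* Nonzero if setting bit `node` would stay inside the mask. */
static int sketch_node_in_mask(unsigned int node)
{
	return sketch_word_index(node) < SKETCH_MASK_WORDS;
}
```

Under these assumptions the mask occupies 16 words, while the poisoned node number selects word 429, far past the end of an on-stack mask. That is consistent with the single stray bit landing well outside the caller's local variables.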
1 parent 128dd17 commit f2a07f4

1 file changed: +26 -38 lines changed


mm/mempolicy.c

Lines changed: 26 additions & 38 deletions
@@ -2595,8 +2595,7 @@ void numa_default_policy(void)
  */
 
 /*
- * "local" is pseudo-policy: MPOL_PREFERRED with MPOL_F_LOCAL flag
- * Used only for mpol_parse_str() and mpol_to_str()
+ * "local" is implemented internally by MPOL_PREFERRED with MPOL_F_LOCAL flag.
  */
 static const char * const policy_modes[] =
 {
@@ -2610,28 +2609,21 @@ static const char * const policy_modes[] =
 
 #ifdef CONFIG_TMPFS
 /**
- * mpol_parse_str - parse string to mempolicy
+ * mpol_parse_str - parse string to mempolicy, for tmpfs mpol mount option.
  * @str: string containing mempolicy to parse
  * @mpol: pointer to struct mempolicy pointer, returned on success.
- * @no_context: flag whether to "contextualize" the mempolicy
+ * @unused: redundant argument, to be removed later.
  *
  * Format of input:
  * <mode>[=<flags>][:<nodelist>]
  *
- * if @no_context is true, save the input nodemask in w.user_nodemask in
- * the returned mempolicy. This will be used to "clone" the mempolicy in
- * a specific context [cpuset] at a later time. Used to parse tmpfs mpol
- * mount option. Note that if 'static' or 'relative' mode flags were
- * specified, the input nodemask will already have been saved. Saving
- * it again is redundant, but safe.
- *
  * On success, returns 0, else 1
  */
-int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
+int mpol_parse_str(char *str, struct mempolicy **mpol, int unused)
 {
 	struct mempolicy *new = NULL;
 	unsigned short mode;
-	unsigned short uninitialized_var(mode_flags);
+	unsigned short mode_flags;
 	nodemask_t nodes;
 	char *nodelist = strchr(str, ':');
 	char *flags = strchr(str, '=');
@@ -2719,24 +2711,23 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
 	if (IS_ERR(new))
 		goto out;
 
-	if (no_context) {
-		/* save for contextualization */
-		new->w.user_nodemask = nodes;
-	} else {
-		int ret;
-		NODEMASK_SCRATCH(scratch);
-		if (scratch) {
-			task_lock(current);
-			ret = mpol_set_nodemask(new, &nodes, scratch);
-			task_unlock(current);
-		} else
-			ret = -ENOMEM;
-		NODEMASK_SCRATCH_FREE(scratch);
-		if (ret) {
-			mpol_put(new);
-			goto out;
-		}
-	}
+	/*
+	 * Save nodes for mpol_to_str() to show the tmpfs mount options
+	 * for /proc/mounts, /proc/pid/mounts and /proc/pid/mountinfo.
+	 */
+	if (mode != MPOL_PREFERRED)
+		new->v.nodes = nodes;
+	else if (nodelist)
+		new->v.preferred_node = first_node(nodes);
+	else
+		new->flags |= MPOL_F_LOCAL;
+
+	/*
+	 * Save nodes for contextualization: this will be used to "clone"
+	 * the mempolicy in a specific context [cpuset] at a later time.
+	 */
+	new->w.user_nodemask = nodes;
+
 	err = 0;
 
 out:
@@ -2756,13 +2747,13 @@ int mpol_parse_str(char *str, struct mempolicy **mpol, int no_context)
  * @buffer: to contain formatted mempolicy string
  * @maxlen: length of @buffer
  * @pol: pointer to mempolicy to be formatted
- * @no_context: "context free" mempolicy - use nodemask in w.user_nodemask
+ * @unused: redundant argument, to be removed later.
  *
  * Convert a mempolicy into a string.
  * Returns the number of characters in buffer (if positive)
  * or an error (negative)
  */
-int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol, int no_context)
+int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol, int unused)
 {
 	char *p = buffer;
 	int l;
@@ -2788,18 +2779,15 @@ int mpol_to_str(char *buffer, int maxlen, struct mempolicy *pol, int no_context)
 	case MPOL_PREFERRED:
 		nodes_clear(nodes);
 		if (flags & MPOL_F_LOCAL)
-			mode = MPOL_LOCAL;	/* pseudo-policy */
+			mode = MPOL_LOCAL;
 		else
 			node_set(pol->v.preferred_node, nodes);
 		break;
 
 	case MPOL_BIND:
 		/* Fall through */
 	case MPOL_INTERLEAVE:
-		if (no_context)
-			nodes = pol->w.user_nodemask;
-		else
-			nodes = pol->v.nodes;
+		nodes = pol->v.nodes;
 		break;
 
 	default:
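
The three-way initialization that this patch adds to mpol_parse_str() can be modeled outside the kernel roughly as follows. This is a hedged sketch, not the kernel's data structures: every `sketch_` and `SKETCH_` name is hypothetical, the node mask is shrunk to a single 64-bit word, and the flag value is arbitrary. It only illustrates that each parsed policy leaves with exactly the field that mpol_to_str() will later read already set.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the kernel's policy modes and internal flag. */
enum sketch_mode { SKETCH_DEFAULT, SKETCH_PREFERRED, SKETCH_BIND, SKETCH_INTERLEAVE };
#define SKETCH_F_LOCAL 0x1u

/* Simplified policy: a one-word node mask instead of nodemask_t. */
struct sketch_policy {
	enum sketch_mode mode;
	unsigned int flags;
	union {
		short preferred_node;	/* used by SKETCH_PREFERRED */
		uint64_t nodes;		/* used by every other mode */
	} v;
	uint64_t user_nodemask;		/* saved for later contextualization */
};

/* Lowest set bit in the mask, or 64 if the mask is empty. */
static short sketch_first_node(uint64_t nodes)
{
	short n;

	for (n = 0; n < 64; n++)
		if (nodes & (UINT64_C(1) << n))
			return n;
	return 64;
}

/* Mirror of the patch's logic: always initialize what the formatter reads. */
static void sketch_init_policy(struct sketch_policy *pol, enum sketch_mode mode,
			       int has_nodelist, uint64_t nodes)
{
	pol->mode = mode;
	pol->flags = 0;
	pol->v.nodes = 0;

	if (mode != SKETCH_PREFERRED)
		pol->v.nodes = nodes;
	else if (has_nodelist)
		pol->v.preferred_node = sketch_first_node(nodes);
	else
		pol->flags |= SKETCH_F_LOCAL;

	pol->user_nodemask = nodes;
}
```

In this model, a preferred policy without a node list ends up flagged local, one with a node list records its first node, and every other mode copies the mask. No path leaves the union holding whatever the allocator happened to put there.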
