Re: [PATCH -v2] rmap: make anon_vma_prepare link in all the anon_vmas of a mergeable VMA

Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]
From: Linus Torvalds
Date: Monday, April 12, 2010 - 9:26 am

On Mon, 12 Apr 2010, Linus Torvalds wrote:

I have a new theory. And this new theory is completely different from all 
the other things we've been looking at.

The new theory is really simple: 'page->mapping' has been re-set to the 
wrong mapping. 

Now, there is one case where we reset page->mapping _intentionally_, 
namely in the COW-breaking case of having the last user 
("page_move_anon_rmap"). And that looks fine, and happens under normal 
loads all the time. We _want_ to do it there.

But there is a _much_ more subtle case that involved swapping.

So guys, here's my fairly simple theory on what happens:

 - page gets allocated/mapped by process A. Let's call the anon_vma we 
   associate the page with 'A' to keep it easy to track.

 - Process A forks, creating process B. The anon_vma in B is 'B', and has 
   a chain that looks like 'B' -> 'A'. Everything is fine.

 - Swapping happens. The page (with mapping pointing to 'A') gets swapped 
   out (perhaps not to disk - it's enough to assume that it's just not 
   mapped any more, and lives entirely in the swap-cache)

 - Process B pages it in, which goes like this:

	do_swap_page ->
	  page = lookup_swap_cache(entry);
	 ...
	  set_pte_at(mm, address, page_table, pte);
	  page_add_anon_rmap(page, vma, address);

   And think about what happens here!

In particular, what happens is that this will now be the "first" mapping 
of that page, so page_add_anon_rmap() will do

	if (first)
		__page_set_anon_rmap(page, vma, address);

and notice what anon_vma it will use? It will use the anon_vma for process 
B!

So now page->mapping actually points to anon_vma 'B', not 'A' like it used 
to.

What happens then? Trivial: process 'A' also pages it in (nothing happens, 
it's not the first mapping), and then process 'B' execve's or exits or 
unmaps, making anon_vma B go away.

End result: process A has a page that points to anon_vma B, but anon_vma B 
does not exist any more. This can go on forever. Forget about RCU grace 
periods, forget about locking, forget anything like that. The bug is 
simply that page->mapping points to an anon_vma that was correct at one 
point, but was _not_ the one that was shared by all users of that possible 
mapping.

The patch below is my largely mindless try at fixing this. It's untested. 
I'm not entirely sure that it actually works. But it makes some amount of 
conceptual sense. No?

			Linus

---
 mm/rmap.c |   15 +++++++++++++--
 1 files changed, 13 insertions(+), 2 deletions(-)

diff --git a/mm/rmap.c b/mm/rmap.c
index ee97d38..4bad326 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -734,9 +734,20 @@ void page_move_anon_rmap(struct page *page,
 static void __page_set_anon_rmap(struct page *page,
 	struct vm_area_struct *vma, unsigned long address)
 {
-	struct anon_vma *anon_vma = vma->anon_vma;
+	struct anon_vma_chain *avc;
+	struct anon_vma *anon_vma;
+
+	BUG_ON(!vma->anon_vma);
+
+	/*
+	 * We must use the _oldest_ possible anon_vma for the page mapping!
+	 *
+	 * So take the last AVC chain entry in the vma, which is the deepest
+	 * ancestor, and use the anon_vma from that.
+	 */
+	avc = list_entry(vma->anon_vma_chain.prev, struct anon_vma_chain, same_vma);
+	anon_vma = avc->anon_vma;
 
-	BUG_ON(!anon_vma);
 	anon_vma = (void *) anon_vma + PAGE_MAPPING_ANON;
 	page->mapping = (struct address_space *) anon_vma;
 	page->index = linear_page_index(vma, address);
--
Previous message: [thread] [date] [author]
Next message: [thread] [date] [author]

Messages in current thread:
Linux 2.6.34-rc3, Linus Torvalds, (Tue Mar 30, 10:50 am)
[Regression, post-rc2] Commit a5ee4eb7541 breaks OpenGL on ..., Rafael J. Wysocki, (Tue Mar 30, 2:16 pm)
Re: [Regression, post-rc2] Commit a5ee4eb7541 breaks OpenG ..., Rafael J. Wysocki, (Wed Mar 31, 6:13 pm)
Re: [Regression, post-rc2] Commit a5ee4eb7541 breaks OpenG ..., Rafael J. Wysocki, (Thu Apr 1, 12:46 pm)
Re: [Regression, post-rc2] Commit a5ee4eb7541 breaks OpenG ..., Rafael J. Wysocki, (Sat Apr 3, 12:33 pm)
[PATCH] rmap: fix anon_vma_fork() memory leak, Rik van Riel, (Sun Apr 4, 4:09 pm)
Re: [PATCH] rmap: fix anon_vma_fork() memory leak, Minchan Kim, (Sun Apr 4, 4:56 pm)
Re: [PATCH] rmap: fix anon_vma_fork() memory leak, Linus Torvalds, (Mon Apr 5, 8:37 am)
Re: [PATCH] rmap: fix anon_vma_fork() memory leak, Minchan Kim, (Mon Apr 5, 8:48 am)
Re: [PATCH] rmap: fix anon_vma_fork() memory leak, Rik van Riel, (Mon Apr 5, 9:04 am)
[PATCH -v2] rmap: fix anon_vma_fork() memory leak, Rik van Riel, (Mon Apr 5, 9:13 am)
[No subject], Rik van Riel, (Tue Apr 6, 7:34 am)
[No subject], Rik van Riel, (Tue Apr 6, 7:38 am)
[No subject], Minchan Kim, (Tue Apr 6, 8:34 am)
[No subject], Rik van Riel, (Tue Apr 6, 8:40 am)
[No subject], Linus Torvalds, (Tue Apr 6, 8:55 am)
[No subject], Minchan Kim, (Tue Apr 6, 8:58 am)
[No subject], Minchan Kim, (Tue Apr 6, 9:23 am)
[No subject], Linus Torvalds, (Tue Apr 6, 9:28 am)
[No subject], Linus Torvalds, (Tue Apr 6, 9:32 am)
[No subject], Minchan Kim, (Tue Apr 6, 9:45 am)
[No subject], Linus Torvalds, (Tue Apr 6, 9:53 am)
[No subject], Minchan Kim, (Tue Apr 6, 9:54 am)
[No subject], Rik van Riel, (Tue Apr 6, 10:04 am)
[No subject], Borislav Petkov, (Tue Apr 6, 10:05 am)
Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linu ..., Steinar H. Gunderson, (Tue Apr 6, 12:10 pm)
Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linu ..., Steinar H. Gunderson, (Tue Apr 6, 1:46 pm)
Re: Ugly rmap NULL ptr deref oopsie on hibernate (was Linu ..., Steinar H. Gunderson, (Tue Apr 6, 2:05 pm)
[PATCH 1/3] mm: make page freeing path RCU-safe, Borislav Petkov, (Sun Apr 11, 6:19 am)
[PATCH 2/3] mm: cleanup find_mergeable_anon_vma complexity, Borislav Petkov, (Sun Apr 11, 6:19 am)
[PATCH 3/3] mm: fixup vma_adjust, Borislav Petkov, (Sun Apr 11, 6:19 am)
[PATCH 2/3] mm: cleanup find_mergeable_anon_vma complexity, Borislav Petkov, (Sun Apr 11, 6:25 am)
Re: [PATCH -v2] rmap: make anon_vma_prepare link in all th ..., Linus Torvalds, (Mon Apr 12, 9:26 am)
[PATCH 2/4] vma_adjust: fix the copying of anon_vma chains, Linus Torvalds, (Mon Apr 12, 1:23 pm)