From: Hans-Peter Jansen
Subject:
Date: Tuesday, April 6, 2010 - 7:52 am

Hi Dave,

On Tuesday 06 April 2010, 01:06:00 Dave Chinner wrote:

With all due respect, I disagree. See below.


Sure, but for compatibility with a customer setup that I'm fully
responsible for and that we strongly depend on, it is still i586. (It's also
a system that I have full access to for only a few hours on Sundays, which
punishes my family.)

Dave, I really don't want to disappoint you, but a lengthy bisection session 
points to:

57817c68229984818fea9e614d6f95249c3fb098 is the first bad commit
commit 57817c68229984818fea9e614d6f95249c3fb098
Author: Dave Chinner <david@fromorbit.com>
Date:   Sun Jan 10 23:51:47 2010 +0000

    xfs: reclaim all inodes by background tree walks
    
    We cannot do direct inode reclaim without taking the flush lock to
    ensure that we do not reclaim an inode under IO. We check the inode
    is clean before doing direct reclaim, but this is not good enough
    because the inode flush code marks the inode clean once it has
    copied the in-core dirty state to the backing buffer.
    
    It is the flush lock that determines whether the inode is still
    under IO, even though it is marked clean, and the inode is still
    required at IO completion so we can't reclaim it even though it is
    clean in core. Hence the requirement that we need to take the flush
    lock even on clean inodes because this guarantees that the inode
    writeback IO has completed and it is safe to reclaim the inode.
    
    With delayed write inode flushing, we could end up waiting a long
    time on the flush lock even for a clean inode. The background
    reclaim already handles this efficiently, so avoid all the problems
    by killing the direct reclaim path altogether.
    
    Signed-off-by: Dave Chinner <david@fromorbit.com>
    Reviewed-by: Christoph Hellwig <hch@lst.de>
    Signed-off-by: Alex Elder <aelder@sgi.com>

:040000 040000 9cada5739037ecd59afb358cf5ed6186b82d5236 8e6b6febccba69bc4cdbfd1886d545c369d64c41 M	fs
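
To make the commit message's argument concrete, here is a minimal userspace sketch of why a "clean" check alone is not enough. It is only a toy model: `toy_inode` and all `toy_*` functions are hypothetical names invented for illustration, not XFS kernel symbols, and the real locking is considerably more involved.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical userspace model; none of these names exist in the kernel. */
struct toy_inode {
	bool dirty;        /* in-core state not yet copied to the backing buffer */
	bool flush_locked; /* held from flush start until writeback IO completes */
};

/* Start a flush: take the flush lock, copy state out, mark the inode clean. */
static inline void toy_flush_start(struct toy_inode *ip)
{
	assert(!ip->flush_locked);
	ip->flush_locked = true;
	ip->dirty = false; /* clean in core, but the IO is still in flight */
}

/* IO completion still needs the inode; only now is the flush lock dropped. */
static inline void toy_io_done(struct toy_inode *ip)
{
	assert(ip->flush_locked);
	ip->flush_locked = false;
}

/* The check the commit message describes as "not good enough". */
static inline bool toy_clean_check_only(const struct toy_inode *ip)
{
	return !ip->dirty;
}

/* Reclaim is safe only when the inode is clean AND the flush lock is free. */
static inline bool toy_safe_to_reclaim(const struct toy_inode *ip)
{
	return !ip->dirty && !ip->flush_locked;
}
```

Between `toy_flush_start()` and `toy_io_done()` the first check already says "reclaimable" while the second does not, which is the window the commit message is concerned with.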

I will try to prove this by reverting this commit on a 2.6.33.2 build, but
that's going to take another day or so.


Hmm, thanks for the warning. Will resort to 2.6.33.2 for now on my servers
and keep an eye on the xfs commit logs...

Cheers && greetings to the orbit ;-),
Pete

For the sake of completeness, here's the revert:

---
commit dfe0d292280ad21c9cf3f240bb415913715d8980
Author: Hans-Peter Jansen <hpj@urpla.net>
Date:   Tue Apr 6 16:05:47 2010 +0200

    Revert "xfs: reclaim all inodes by background tree walks"
    
    This reverts commit 57817c68229984818fea9e614d6f95249c3fb098.
    
    Avoid triggering the oom killer with a simple du on a big xfs tree on i586.
    
    Signed-off-by: Hans-Peter Jansen <hpj@urpla.net>

:100644 100644 52e06b4... a76fc01... M	fs/xfs/linux-2.6/xfs_super.c

diff --git a/fs/xfs/linux-2.6/xfs_super.c b/fs/xfs/linux-2.6/xfs_super.c
index 52e06b4..a76fc01 100644
--- a/fs/xfs/linux-2.6/xfs_super.c
+++ b/fs/xfs/linux-2.6/xfs_super.c
@@ -954,14 +954,16 @@ xfs_fs_destroy_inode(
 	ASSERT_ALWAYS(!xfs_iflags_test(ip, XFS_IRECLAIM));
 
 	/*
-	 * We always use background reclaim here because even if the
-	 * inode is clean, it still may be under IO and hence we have
-	 * to take the flush lock. The background reclaim path handles
-	 * this more efficiently than we can here, so simply let background
-	 * reclaim tear down all inodes.
+	 * If we have nothing to flush with this inode then complete the
+	 * teardown now, otherwise delay the flush operation.
 	 */
+	if (!xfs_inode_clean(ip)) {
+		xfs_inode_set_reclaim_tag(ip);
+		return;
+	}
+
 out_reclaim:
-	xfs_inode_set_reclaim_tag(ip);
+	xfs_ireclaim(ip);
 }
 
 /*

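The behavioural difference between the two versions of xfs_fs_destroy_inode() in the diff above can be sketched in userspace roughly as follows. This is a simplified model under stated assumptions: `toy_policy`, `toy_stats` and the `toy_*` functions are hypothetical names, the other paths that jump to `out_reclaim` are ignored, and the counts say nothing about how fast background reclaim actually drains tagged inodes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical names for illustration only; not kernel symbols. */
enum toy_policy {
	TOY_BACKGROUND_ONLY, /* with commit 57817c68 applied: always tag */
	TOY_DIRECT_IF_CLEAN  /* after the revert: reclaim clean inodes at once */
};

struct toy_stats {
	size_t reclaimed_now; /* torn down immediately in destroy_inode */
	size_t tagged;        /* left for the background reclaim walk */
};

/* Model one destroy_inode call under the given policy. */
static inline void toy_destroy_inode(enum toy_policy policy, bool clean,
				     struct toy_stats *st)
{
	if (policy == TOY_DIRECT_IF_CLEAN && clean)
		st->reclaimed_now++;
	else
		st->tagged++;
}

/* Destroy n_clean clean and n_dirty dirty inodes and report the outcome. */
static inline struct toy_stats toy_destroy_many(enum toy_policy policy,
						size_t n_clean, size_t n_dirty)
{
	struct toy_stats st = { 0, 0 };
	size_t i;

	for (i = 0; i < n_clean; i++)
		toy_destroy_inode(policy, true, &st);
	for (i = 0; i < n_dirty; i++)
		toy_destroy_inode(policy, false, &st);

	/* Every inode ends up in exactly one of the two buckets. */
	assert(st.reclaimed_now + st.tagged == n_clean + n_dirty);
	return st;
}
```

For a read-mostly workload such as du, nearly all inodes are clean, so in this model the background-only policy leaves every one of them waiting for the reclaim walk, while the reverted code frees them on the spot. Whether that backlog is what actually drives the oom killer on i586 is the hypothesis the bisection suggests, not something the sketch demonstrates.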
