Linux: Data corrupting ext3 bug in 2.4.20

December 2, 2002 - 12:11am
Submitted by Anonymous on December 2, 2002 - 12:11am.
Linux news

Andrew Morton posted on the lkml, "In 2.4.20-pre5 an optimisation was made to the ext3 fsync function which can very easily cause file data corruption at unmount time". This bug only affects people using ext3 in the uncommon "data=journal" mode, or files operating under "chattr -j", and does not affect the 2.5 series of kernels.

Andrew went on to say that "The symptoms are that any file data which was written within the thirty seconds prior to the unmount may not make it to disk. A workaround is to run `sync' before unmounting". He also posted a patch to fix the problem. However, soon thereafter, he posted saying that "that 'fix' didn't fix it. Sorry about that". Until a proper fix can be developed, he recommends that people "please avoid ext3/data=journal". Since "data=journal" is not the default ext3 mode, it is unlikely most people running ext3 will be affected by this. However, it is a data corruption bug so you should double-check that you use either "data=ordered" or "data=writeback" as your ext3 mode of operation.
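
As a rough aid for the double-check suggested above, here is a minimal C sketch that inspects one mount-table line in the usual "device mountpoint fstype options" layout (as found in /etc/fstab or /proc/mounts) and reports whether it is an ext3 entry explicitly requesting data=journal. The function names are hypothetical, and the sketch assumes the mode appears as an explicit option; an entry with no data= option is simply reported as not journaling data, and whether a given kernel lists the active mode in /proc/mounts is not something this article confirms.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if the comma-separated option string `opts` contains
 * exactly the option `wanted` (for example "data=journal").
 * Hypothetical helper, not part of any kernel or libc API. */
static bool has_mount_option(const char *opts, const char *wanted)
{
    assert(opts != NULL && wanted != NULL);
    size_t wanted_len = strlen(wanted);
    const char *p = opts;

    while (*p != '\0') {
        size_t len = strcspn(p, ",");   /* length of the current option */
        if (len == wanted_len && strncmp(p, wanted, len) == 0)
            return true;
        p += len;
        if (*p == ',')
            p++;                        /* skip the separator */
    }
    return false;
}

/* Inspect one line laid out as:
 *     device mountpoint fstype options [dump pass]
 * Return true only for an ext3 entry that explicitly asks for
 * data=journal. Blank, malformed and comment lines return false. */
static bool line_is_ext3_data_journal(const char *line)
{
    char dev[256], dir[256], type[64], opts[512];

    assert(line != NULL);
    if (sscanf(line, "%255s %255s %63s %511s", dev, dir, type, opts) != 4)
        return false;                   /* blank or malformed line */
    if (dev[0] == '#')
        return false;                   /* fstab-style comment */
    return strcmp(type, "ext3") == 0 &&
           has_mount_option(opts, "data=journal");
}
```

Feeding each line of the mount table through line_is_ext3_data_journal() would flag the entries worth changing to data=ordered or data=writeback. It does not cover files placed in journaled-data mode individually via chattr, nor a root filesystem whose options are set on the kernel command line.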


From: Andrew Morton
To: linux-kernel Mailing List
Subject: data corrupting bug in 2.4.20 ext3, data=journal
Date: Sun Dec 01 2002 - 03:11:41 EST

In 2.4.20-pre5 an optimisation was made to the ext3 fsync function
which can very easily cause file data corruption at unmount time. This
was first reported by Nick Piggin on November 29th (one day after 2.4.20 was
released, and three months after the bug was merged. Unfortunate timing)

This only affects filesystems which were mounted with the `data=journal'
option. Or files which are operating under `chattr -j'. So most people
are unaffected. The problem is not present in 2.5 kernels.
The symptoms are that any file data which was written within the thirty
seconds prior to the unmount may not make it to disk. A workaround
is to run `sync' before unmounting.

The optimisation was intended to avoid writing out and waiting on the
inode's buffers when the subsequent commit would do that anyway. This
optimisation was applied to both data=journal and data=ordered modes.
But it is only valid for data=ordered mode.

In data=journal mode the data is left dirty in memory and the unmount
will silently discard it.

The fix is to only apply the optimisation to inodes
which are operating under data=ordered.

--- linux-akpm/fs/ext3/fsync.c~ext3-fsync-fix Sat Nov 30 23:37:33 2002
+++ linux-akpm-akpm/fs/ext3/fsync.c Sat Nov 30 23:39:30 2002
@@ -63,10 +63,12 @@ int ext3_sync_file(struct file * file, s
          */
         ret = fsync_inode_buffers(inode);
 
-        /* In writeback mode, we need to force out data buffers too. In
-         * the other modes, ext3_force_commit takes care of forcing out
-         * just the right data blocks. */
-        if (test_opt(inode->i_sb, DATA_FLAGS) == EXT3_MOUNT_WRITEBACK_DATA)
+        /*
+         * If the inode is under ordered-data writeback it is not necessary to
+         * sync its data buffers here - commit will do that, with potentially
+         * better IO merging
+         */
+        if (!ext3_should_order_data(inode))
                 ret |= fsync_inode_data_buffers(inode);
 
         ext3_force_commit(inode->i_sb);

_

From: Andrew Morton
To: linux-kernel Mailing List
Subject: Re: data corrupting bug in 2.4.20 ext3, data=journal
Date: Sun Dec 01 2002 - 03:52:23 EST

Andrew Morton wrote:
>
> ...
> The fix is to only apply the optimisation to inodes which are operating
> under data=ordered.
>

That "fix" didn't fix it. Sorry about that.

Please avoid ext3/data=journal until it is sorted out.
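
For readers following the first patch above, the change to the condition in ext3_sync_file() can be modelled with a small C sketch. This is only an illustration of the two predicates as they appear in the diff: the enum and function names are made up for the example, the real ext3_should_order_data() works on an inode rather than a plain mode value, and, as the follow-up message says, this change on its own did not resolve the bug.

```c
#include <assert.h>
#include <stdbool.h>

/* Simplified stand-in for the data mode an inode operates under.
 * Hypothetical type for illustration; the kernel derives this from
 * mount options and per-inode flags. */
enum data_mode { DATA_JOURNAL, DATA_ORDERED, DATA_WRITEBACK };

/* Condition removed by the patch: fsync writes out the inode's data
 * buffers only when the filesystem is in writeback mode. */
static bool flushes_data_before_patch(enum data_mode mode)
{
    return mode == DATA_WRITEBACK;
}

/* Condition added by the patch: fsync writes out the inode's data
 * buffers unless the inode is under ordered-data mode. */
static bool flushes_data_after_patch(enum data_mode mode)
{
    return mode != DATA_ORDERED;
}

/* Usage example: the only mode whose behaviour the patch changes
 * is data=journal. */
static void check_patch_model(void)
{
    assert(flushes_data_before_patch(DATA_JOURNAL) !=
           flushes_data_after_patch(DATA_JOURNAL));
    assert(flushes_data_before_patch(DATA_ORDERED) ==
           flushes_data_after_patch(DATA_ORDERED));
    assert(flushes_data_before_patch(DATA_WRITEBACK) ==
           flushes_data_after_patch(DATA_WRITEBACK));
}
```

Seen this way, the patch restricts the skipped write-out to ordered mode, which matches the stated intent that the optimisation "is only valid for data=ordered mode".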

Most people are unaffected?

December 2, 2002 - 12:48am

One of the biggest selling points of ext3 was that it journaled both data and metadata. That's why I use it. To downplay this like no one uses that mode is a big mistake. Surely the release should be pulled or the patch rolled back or SOMETHING.

My honest opinion

December 2, 2002 - 1:00am

Marcelo screwed up with the last 2 releases. I was joyous when he got the job, fresh blood and all; now I've learned to fear his decision making skills.

2.4.19 introduced the horrible lag bug, 2.4.20 didn't fix it for what I know.. and now this horrible horrible corruption bug..

-edit-
It gets worse
http://www.uwsg.indiana.edu/hypermail/linux/kernel/0212.0/0028.html

According to the original bug reporter the bug predates 2.4.19-final...

Look on the bright side

December 2, 2002 - 1:34am
Anonymous

This corruption bug is about the least horrible a corruption bug can possibly be.

Give Marcelo a Break

December 2, 2002 - 11:17am
Anonymous

The last two release diffs were huge even when bzipped. Saying that he "screwed up" one out of maybe a thousand patch decisions is really nit picking.

Yeah well

December 2, 2002 - 12:35pm
Anonymous

Considering he is using a release model (release candidates) lightyears ahead of those before him (for stable releases), he really is making a hash running it. Look, how many rcs come out each release? 5? What you really have to do is keep stability foremost in mind the ENTIRE way through the release and plan for 1 (ONE) release candidate, not use it so you can be lazy and have it (the release candidate model) cover your bum. It just breaks down and is no better than releasing un (widely) tested patches like previous management. A lot more people will test it if there is a good chance it will be the same as the release version. Look at most any other (respected) software project using this model.

RE: Yeah well

December 2, 2002 - 12:46pm
Anonymous

And that is not to say he shouldn't, say, merge a big IDE update from Alan at the start of a release, especially when it has been in Alan's tree for x months, just that he should _plan_ for 1 release candidate.

WHAT?

December 2, 2002 - 3:02pm

I personally believe that the IDE merge should have stayed in Alan's kernel, but that's beside the point.

With 2.4.19 two serious bugs were introduced, and they were not fixed in 2.4.20. The lag bug alone has forced me to use the 2.4.18 based WOLK kernel until there was a fix available. I never used ext3 much, since I prefer ReiserFS (ever since it got stable): it's faster, and I've experienced fewer corruption problems with ReiserFS than I ever did with ext3.

I'm not saying that Marcelo has to go, but I don't see the reason for big merges like the IDE one in a stable kernel - I didn't see it fixing any problems nor adding some critically needed feature - why risk that in a stable kernel?

There has to be a defining line for stable kernels. Leave the dangerous merges for Alan and all the other patchset creators; nearly nobody uses a vanilla kernel after all, thus the stable kernel should be a dead stable base for vendors.

And seriously, Marcelo's RCs ain't new. Linus did the same thing - the later pres would be bugfixes only.

Unfortunately

December 3, 2002 - 3:39am
Anonymous

The old IDE stuff did have its share of problems and it wasn't like a total rewrite or anything. You still need to add capability for new devices as they come out in the stable kernel. I have never had problems with ext3 personally except that bug (which I reported - I started this whole mess!). Anyway, the RCs are new, not so much for putting less stuff into later patches, but having no changes between the last rc and release.

That said, the IDE merge should have waited a couple of revisions, and some vm stuff should probably have gone in instead for 2.4.20.

lag bug???

December 2, 2002 - 11:58pm

Any url or reference on this? This is the first I've heard of it.

Ok then

December 3, 2002 - 12:54am

It's a bug that causes pausing under load. I'm not good at explaining it, but when using X and this happens it's like really bad lagging in an online game (hence I call it the lag bug - I've heard it called the "pauses from hell" bug, etc.)

I don't know what's causing this bug, but fact is that it's in 2.4.19 and not in 2.4.18. Might be IDE, might be some VM shit, might be something else... end result, laggy behavior on some machines.

I know Con Kolivas has had reports on pausing on his ck patchset featuring the compressed cache patch. It seems to be the same bug but in a different context, I don't think it's CC that causes this, because I haven't seen this bug with CC on 2.4.18 (WOLK)

Here's a link that seems to be about a possible fix for it, by the grand master himself... Marc-Christian Petersen.

http://www.uwsg.indiana.edu/hypermail/linux/kernel/0211.2/0066.html

IO scheduler

December 3, 2002 - 3:42am
Anonymous

The problem is reads being starved I think. The problem is being solved in 2.5 with the deadline IO scheduler. The scheduler in 2.4 starves reads really badly. There hasn't been a (quick) fix for it which is agreeable to all parties. read-latency2 works well and is in ac, but Andrea doesn't like it, neither does Jens!

maybe

December 3, 2002 - 4:54pm

Then why have I yet to see this "bug" in 2.4.18 if it's an IO scheduler bug - even when I use the backported scheduler in 2.4.19 (JP and CK amongst others) the bug is still present. (Maybe I'm thinking of a different scheduler, I'm really not much of a kernel developer)

I agree that read starvation seems to be a factor in this problem, but I doubt that it's the whole and entire cause. Still, I'm glad that there's a possible fix out, be it an ugly fix.

yes

December 4, 2002 - 4:58am
Anonymous

The IO scheduler, not the process scheduler. The fixes (read-latency2 and Andrea's fix) both change the IO scheduler so reads don't get starved for as long.

lag bug

December 3, 2002 - 1:02am
Anonymous

I am not sure what the bug actually is but I seem to have it. Whenever I run 2.4.x kernels and do large cvs updates, or compiles, my system starts lagging so badly that mouse movement is even jerky/nonexistent. I started running 2.5.x to get away from the issue.

Chris Cheney
ccheney@cheney.cx

Rollback

December 2, 2002 - 5:41am
Anonymous

You could roll back to an earlier kernel version, you know.

This sort of thing is why I treat recently released kernels as 'unstable'. There have been a number of cases where the so-called stable branch has had gotchas in the code.

Not recent

June 14, 2003 - 5:44pm
Anonymous

Define "Recently released kernels". Take a look at how long it was since 2.4.20 was released, for an example.

Incorrect patch

December 2, 2002 - 1:37am
Anonymous

Oh and by the way, the patch does NOT fix the problem.

Uh huh

December 2, 2002 - 2:42am

Oh, so that's what "That "fix" didn't fix it. Sorry about that." means. ;-)

ext3?

December 2, 2002 - 5:00am
Anonymous

I didn't think anyone still used ext* except perhaps those upgrading legacy servers. ext3 is really slow in my experience compared to the more modern file systems like ReiserFS (which had its fair share of issues, but a long time ago). ReiserFS seems to be the default now in most distros anyway. When I have to pay through the nose for my storage hardware, I like to know that my file system provides stability and performance to match it. Can't wait for Reiser4!

Reiser Default!?

December 2, 2002 - 5:16am
Anonymous

I really don't think so! Maybe in Mandrake, but not Debian, RedHat or Gentoo. ReiserFS has its own history and baggage. If you like it - I won't argue the point. ext3 is demonstrably faster on any number of real-world tasks, and is getting 300% perf boost in the newest 2.5 series.

I use the SGI XFS, knowing it to be Beta-quality. This means backups via Duplicity/rsync and EVMS/OpenAFS snapshots. I'm an old Irix-er, and I have my homedirs in /usr/people, too...

Backup

December 2, 2002 - 8:47am
Anonymous

xfsdump couldn't do the job? Won't anything like Backup Exec, ARCServe do the trick? Sorry for being off topic, but I'm looking into backing up all permissions and ACLs for Samba share on XFS.

I use it

December 2, 2002 - 8:53am
Anonymous

I use it because it is backwards compatible with ext2, offers data journaling, journal on a separate device, and is very stable and robust IMO regardless of this bug.

ext3 - journaling upgrade on the cheap :)

December 5, 2002 - 7:02pm
Anonymous

I wanted to go with Reiser, but I've got a ton of data I don't want to lose and I can afford neither a backup drive nor another hard drive to dump the data before converting. Ah well, I guess Christmas is coming up... ;)

Well

July 22, 2003 - 3:12pm
Anonymous

I don't know what sort of system you're running, but here, ext3+htree performs very nearly as well as reiser, and it's a lot easier to support :)

sync before unmount

December 3, 2002 - 1:05am
Anonymous

I'm 'sync'ing before 'unmount'ing on Linux every time. Maybe I'm a bit overparanoid, but it seems to be a good idea in general...

I've done this since I first booted into FreeBSD and saw this behaviour, so the sync is in place in my BSD init script on Linux, too.

Besides, it's usual that even stable kernels (whatever OS) still contain a lot of risks that surface soon after the release, so people should always update fast and, more importantly, back up every piece of important data they have. (And no: no .tar.gz on the same partition ;-)

Eugene

Congratulations

December 3, 2002 - 8:58am
Anonymous

Posted to Slashdot. Twice. Maybe they forgot to sync the first one to disk.

Bug Stomping

December 3, 2002 - 9:59am
Anonymous

I'm curious to hear if anybody has any specific ideas on how this bug got out in a 'stable' release. Is it a case of too many patches merged, not enough time spent testing, etc.? I think it's important that we consider any procedural changes to ensure that point releases are as stable as possible. .20 is pretty late in the game to have this type of error IMO and I'd like to know what could be done to prevent further flubs.

Nah

December 3, 2002 - 4:11pm
Anonymous

It's more a matter of nobody much getting affected by it. It had been reported on lkml before 2.4.19 but nobody really noticed. Mostly you'd only be unmounting an ext3 disk before a reboot and I think most init scripts sync beforehand.

Correct patch

December 5, 2002 - 7:09pm
Anonymous

Has the bug already been fixed? If not, what's taking them so long? I want to upgrade to 2.4.20!

Fix

December 6, 2002 - 8:57pm

Andrew Morton posted a new fix.

How good is that patch? Does

December 7, 2002 - 8:02pm
Anonymous

How good is that patch? Does it solve the problem completely? I'm not sure after reading his email.

Help: I've suffered this Data corrupting ext3 bug in 2.4.20

March 24, 2003 - 10:53am
Anonymous

Everything was going smooth... until I rebooted. nooooo.......

This is the error message I got:
fsck.ext3: Invalid argument : couldn't load ext3 journal for /dev/hda3

So how am I meant to repair this problem? How can I boot Linux?
