Matthew Dillon sent out a series of updates [1] about the HAMMER filesystem he is developing, noting that he is currently focusing on the reblocking and pruning code and tracking down a number of bugs that result in B-Tree corruption. He also noted that HAMMER previously consisted of three components: B-Tree nodes, records, and data. In his latest cleanups he has removed the record structure entirely, explaining, "this will seriously improve the performance of directory and inode access." The change required an on-media format change [2]: "I know I have said this before, but there's a very good chance that no more on-media changes will be made after this point. The official freeze of the on-media format will not occur until the 2.0 release, however."
Matt added [3], "HAMMER is stable enough now that I am able to run it on my LAN backup box. I'm using it to test that the snapshots work as expected as well as to test the long term effects of reblocking and pruning." He then cautioned:
"Please note that HAMMER is not ready for production use yet, there is still the filesystem-full handling to implement and much more serious testing of the reblocking and pruning code is required, not to mention the crash recovery code. I expect to find a few more bugs, but I'm really happy with the results so far."
From: Matthew Dillon <dillon@...>
Subject: HAMMER update 12-May-2008
Date: May 12, 1:01 pm 2008
I'm holding off the filesystem-full handling work for another week
and instead I am going to focus on the reblocking and pruning code.
There are still numerous bugs in the reblocking and pruning code
that are resulting in a small amount of corruption of the B-Tree.
I am also going to do one more major change to the on-media format.
As I test the reblocking and pruning code more and more, and also
test HAMMER's performance, it has become apparent that the record
abstraction is creating a bigger problem then it is solving.
HAMMER is broken down into three major components: B-Tree nodes, records,
and data. B-Tree nodes reference both records and data and also
duplicate a big chunk of the information found in records. In fact,
the ONLY information in a record that is not found in a B-Tree node
exists for inode records and directory entries, and only a few fields.
What I am going to do is move the remaining information found in the
record structure into the data, and get rid of the record structure
entirely so HAMMER only has B-Tree nodes and data. This will seriously
improve the performance of directory and inode access.
These changes are actually fairly minor in the larger scheme of things.
The records are barely accessed as it stands now, so removing them will
only take a day.
-Matt
From: Matthew Dillon <dillon@...>
Subject: HEADS UP - HAMMER on-media format changed 12-May-2008
Date: May 12, 5:33 pm 2008
For those people testing HAMMER, the HAMMER on-media format has
changed so you will have to newfs any HAMMER filesystems.
I know I have said this before, but there's a very good chance that no
more on-media changes will be made after this point. The official
freeze of the on-media format will not occur until the 2.0 release,
however.
The testing of the reblocking and pruning code continues. There
are still a handful of bugs related to parallel operations while
reblocking and pruning which I expect to be worked out this week.
-Matt
From: Matthew Dillon <dillon@...>
Subject: Backup statistics - using HAMMER on my LAN backup box
Date: May 11, 2:13 pm 2008
HAMMER is stable enough now that I am able to run it on my
LAN backup box. I'm using it to test that the snapshots work
as expected as well as to test the long term effects of reblocking
and pruning. The LAN backup box NFS mounts all the other boxes primary
partitions and uses that to create a daily snapshots from 5 machines
(apollo, crater, leaf, pkgbox, and my office workstation), covering
around 90G of backed-up data. The box mirrors the latest daily
snapshot off-site once a week so I can afford to lose the data if
I hit a bug.
With UFS I had to use the hardlink trick (w/ cpdup) to generate backups.
It took 4-6 hours every day for the backup box to create the snapshots,
and I couldn't use more then half the 700G of backup space because
using more resulted in having too many inodes (> 40 million) for UFS's
fsck to be able to fsck without running out of memory.
STARTING MIRRORS Mon Mar 31 01:15:00 PDT 2008 level 2
DONE MIRRORING Mon Mar 31 05:01:51 PDT 2008 ~4 hrs
STARTING MIRRORS Wed Apr 2 01:15:00 PDT 2008 level 2
DONE MIRRORING Wed Apr 2 07:02:00 PDT 2008 ~6 hrs
STARTING MIRRORS Fri Apr 4 01:15:00 PDT 2008 level 2
DONE MIRRORING Fri Apr 4 07:32:33 PDT 2008 ~6 hrs
STARTING MIRRORS Sat Apr 5 01:15:00 PDT 2008 level 2
DONE MIRRORING Sat Apr 5 05:09:13 PDT 2008 ~4 hrs
With HAMMER I don't have to use the hardlink trick. I can just
cpdup straight out and then create a @@ softlink to the snapshot.
It takes less then an hour to do a daily backup that way.
STARTING MIRRORS Tue May 6 01:15:01 PDT 2008 level 2
DONE MIRRORING Tue May 6 02:11:17 PDT 2008 ~56 min
STARTING MIRRORS Sat May 10 01:15:00 PDT 2008 level 2
DONE MIRRORING Sat May 10 02:17:36 PDT 2008 ~62 min
STARTING MIRRORS Sun May 11 01:15:01 PDT 2008 level 2
DONE MIRRORING Sun May 11 02:09:02 PDT 2008 ~54 min
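The elapsed times annotated on the log lines above can be recomputed from the timestamps themselves. The following is a minimal sketch, assuming the log format shown in the message; the names `parse_stamp` and `backup_minutes` are hypothetical and not part of any backup tooling Matt describes.

```python
from datetime import datetime

# HAMMER-era log lines copied from the message above (trailing "~NN min"
# annotations dropped; they are what this sketch recomputes).
LOG = """\
STARTING MIRRORS Tue May 6 01:15:01 PDT 2008 level 2
DONE MIRRORING Tue May 6 02:11:17 PDT 2008
STARTING MIRRORS Sat May 10 01:15:00 PDT 2008 level 2
DONE MIRRORING Sat May 10 02:17:36 PDT 2008
STARTING MIRRORS Sun May 11 01:15:01 PDT 2008 level 2
DONE MIRRORING Sun May 11 02:09:02 PDT 2008
"""


def parse_stamp(fields):
    """Turn ['Tue', 'May', '6', '01:15:01', 'PDT', '2008'] into a datetime.

    The weekday and zone name are ignored; both ends of a run share a zone.
    """
    _weekday, month, day, clock, _zone, year = fields
    return datetime.strptime(f"{month} {day} {clock} {year}", "%b %d %H:%M:%S %Y")


def backup_minutes(log):
    """Return the whole-minute duration of each STARTING/DONE pair."""
    durations = []
    start = None
    for line in log.splitlines():
        words = line.split()
        if line.startswith("STARTING MIRRORS"):
            start = parse_stamp(words[2:8])
        elif line.startswith("DONE MIRRORING") and start is not None:
            end = parse_stamp(words[2:8])
            durations.append(int((end - start).total_seconds() // 60))
            start = None
    return durations


if __name__ == "__main__":
    print(backup_minutes(LOG))
```

Truncating to whole minutes reproduces the approximate figures quoted next to each run.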
So far the integrity of the snapshots is good. I am doing a
tar cf - <softlink>/. | md5 on each fixed snapshot and will check
to see if the value changes over time. I already see that I might
want to create a mount option to update mtime as a record update
instead of as an in-place update, to guarantee that it does not
change from the point of view of a snapshot. Being able to
integrity-check a snapshot will likely become an important aspect of
the filesystem.
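The snapshot check described above pipes a tar stream through md5. A rough Python analogue is sketched below, assuming only that a snapshot is reachable as an ordinary directory path; the function name `tree_digest` is hypothetical. Unlike a tar stream, this sketch hashes only relative paths and file contents, so it would not notice the mtime changes mentioned above, and its digest is not comparable with the output of `tar cf - | md5`.

```python
import hashlib
import os


def tree_digest(root):
    """Return an MD5 hex digest over the file names and contents under root.

    Directories and files are visited in sorted order so that repeated runs
    over an unchanged tree produce the same digest.
    """
    digest = hashlib.md5()
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # fixes the order in which os.walk descends
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            # Skip symlinks and special files; only regular files are hashed.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            rel = os.path.relpath(path, root)
            digest.update(rel.encode("utf-8", "surrogateescape"))
            digest.update(b"\0")  # separator between name and contents
            with open(path, "rb") as fh:
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    digest.update(chunk)
    return digest.hexdigest()
```

Recording the digest when a snapshot is taken and recomputing it later gives the same kind of "did this snapshot change" signal, with the caveats noted above.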
I am seeing a certain degree of fragmentation, particularly when
listing directories. It will be interesting to see what kind of
effect reblocking has on that.
Please note that HAMMER is not ready for production use yet, there
is still the filesystem-full handling to implement and much more serious
testing of the reblocking and pruning code is required, not to mention
the crash recovery code. I expect to find a few more bugs, but I'm
really happy with the results so far.
-Matt
From: Matthew Dillon <dillon@...>
Subject: Blogbench results for HAMMER
Date: May 10, 5:21 pm 2008
I ran blockbench on a HAMMER partition and on a UFS partition and
got some rather interesting results.
I fully expected HAMMER's write performance to be bad compared to UFS,
because HAMMER is still double-buffering its data. Indeed, as the
test began UFS seemed to be outdoing HAMMER. But as the number of files
grew and the kernel started to have to recycle vnodes and buffers, UFS's
performance went completely to hell while HAMMER was able to maintain good
throughput. Ths basic blog benchmark creates, reads, and writes around
20,000 files and goes for a lot of parallelism.
I don't know why UFS's write performance went to hell.. it pretty much
died completely after a very promising start. But even ignoring that
as some sort of implementation fluke the read performance numbers speak
for themselves.
I haven't run bonnie++ yet. I think UFS still does very well vs HAMMER
on saturated single-file I/O.
-Matt
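The two blogbench runs quoted after this message can be compared with a few lines of Python. This is only an illustrative sketch: the excerpt holds the first eight UFS iterations copied from the output below, the final scores are the ones blogbench printed, and the names `parse_rows` and `stalled_iterations` are hypothetical helpers, not part of blogbench.

```python
# First eight iterations of the UFS + softupdates run, copied from the output.
UFS_EXCERPT = """\
21 41840 1138 38517 1168 26698 5056
33 71941 800 52572 668 52682 4824
46 60342 625 40127 512 39293 3301
53 70209 709 44394 479 51812 3297
65 53748 689 35700 491 36768 3025
65 18636 0 12765 0 13882 2
65 19329 0 13227 0 13244 0
67 34110 199 23500 164 23938 1109
"""

COLUMNS = (
    "blogs", "r_articles", "w_articles",
    "r_pictures", "w_pictures", "r_comments", "w_comments",
)

# Final scores as printed at the end of each run.
FINAL = {
    "HAMMER": {"writes": 96, "reads": 40279},
    "UFS": {"writes": 67, "reads": 16022},
}


def parse_rows(text):
    """Parse blogbench iteration rows (seven integers per line) into dicts."""
    rows = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) == len(COLUMNS) and all(f.isdigit() for f in fields):
            rows.append(dict(zip(COLUMNS, map(int, fields))))
    return rows


def stalled_iterations(rows):
    """Count iterations in which no articles and no pictures were written."""
    return sum(1 for r in rows if r["w_articles"] == 0 and r["w_pictures"] == 0)


read_ratio = FINAL["HAMMER"]["reads"] / FINAL["UFS"]["reads"]
write_ratio = FINAL["HAMMER"]["writes"] / FINAL["UFS"]["writes"]

if __name__ == "__main__":
    rows = parse_rows(UFS_EXCERPT)
    print(f"stalled UFS iterations in excerpt: {stalled_iterations(rows)}")
    print(f"HAMMER/UFS read score ratio:  {read_ratio:.2f}")
    print(f"HAMMER/UFS write score ratio: {write_ratio:.2f}")
```

Feeding the full tables through `parse_rows` instead of the excerpt would show how long the UFS write stall lasts.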
test29# blogbench -d /usr/obj/bench (HAMMER MOUNT)
Frequency = 10 secs
Scratch dir = [/usr/obj/bench]
Direct I/O: disabled
Spawning 3 writers...
Spawning 1 rewriters...
Spawning 5 commenters...
Spawning 100 readers...
Benchmarking for 30 iterations.
The test will run during 5 minutes.
Nb blogs  R articles  W articles  R pictures  W pictures  R comments  W comments
      17       90598         894       64890         945       44719        2268
      22       82772         362       63112         348       52860        1002
      32       75915         537       53145         484       49000        1482
      34       86616         188       58819         213       54302         542
      38       85506         179       60253         195       51557         474
      43       73030         441       51141         390       43208        1582
      45       72860         156       51320         226       40755         634
      48       63925         262       47448          87       37990         578
      53       65461         338       48538         370       37215        1199
      55       60703         189       44439          97       37096         487
      55       61601         111       44742         110       34605         401
      60       60006         497       45219         232       34962        1413
      61       62211          66       43301         104       36553         394
      62       61530          47       43645         123       34151         381
      70       59738         380       43176         286       34783        1567
      70       60988          70       42115         132       36931         407
      71       61319          76       42675          90       35336         323
      75       62402         398       44539         224       37923        1132
      75       60812          66       43790         116       34839         373
      77       62885          82       45267          72       35848         310
      80       60077         154       44181         393       32197        1513
      81       60118          35       46024          59       39169         190
      83       61791         115       46716          44       39592         295
      87       57090         181       43096         244       35117        1229
      87       62665          84       45634          44       41626         296
      89       59524          91       44228          52       37435         264
      92       57822         121       43098          81       37357         622
      94       62745         194       46117         248       43361        1280
      96       61023          58       46515          45       39916         202
      96       64832          49       47852          29       44019         166
Final score for writes: 96
Final score for reads : 40279
test29# blogbench -d /usr/obj/bench (UFS + softupdates)
Frequency = 10 secs
Scratch dir = [/usr/obj/bench]
Direct I/O: disabled
Spawning 3 writers...
Spawning 1 rewriters...
Spawning 5 commenters...
Spawning 100 readers...
Benchmarking for 30 iterations.
The test will run during 5 minutes.
Nb blogs  R articles  W articles  R pictures  W pictures  R comments  W comments
      21       41840        1138       38517        1168       26698        5056
      33       71941         800       52572         668       52682        4824
      46       60342         625       40127         512       39293        3301
      53       70209         709       44394         479       51812        3297
      65       53748         689       35700         491       36768        3025
      65       18636           0       12765           0       13882           2
      65       19329           0       13227           0       13244           0
      67       34110         199       23500         164       23938        1109
      67       19850           0       13136           4       12844          10
      67       19394           0       12692           0       13231           0
      67       19452           0       12909           0       13442           0
      67       19523           0       13231           0       13644           0
      67       19941           0       13162           0       12295           0
      67       20134           0       12781           0       13061           0
      67       19832           0       13066           0       13343           0
      67       19672           0       12471           0       12996           0
      67       19353           0       12842           0       13634           0
      67       19516           0       12775           1       13401           1
      67       19399           0       12927           0       13596           0
      67       20434           0       12915           0       13345           0
      67       19534           0       12528           0       14222           0
      67       20034           0       12667           0       13535           0
      67       19090           0       12707           0       14163           0
      67       20591           0       13061           0       12392           0
      67       19419           0       12702           0       13495           1
      67       18881           0       12697           0       13638           0
      67       19308           0       12430           0       13816           0
      67       18406           0       12878           0       14477           0
      67       18697           0       12448           0       14445           0
      67       19444           0       12322           0       13796           0
Final score for writes: 67
Final score for reads : 16022
Related links:
- Archive of above thread [3]