"After another round of performance tuning HAMMER all my benchmarks show HAMMER within 10% of UFS's performance, and it beats the shit out of UFS in certain tests such as file creation and random write performance," noted DragonFly BSD creator Matthew Dillon, providing an update on his new clustering filesystem. He continued, "read performance is good but drops more than UFS under heavy write loads (but write performance is much better at the same time)." He then referred to the blogbench benchmark, noting, "now when UFS gets past blog #300 and blows out the system caches, UFS's write performance goes completely to hell but it is able to maintain good read performance." Matthew then compared this to HAMMER:
"HAMMER is the opposite. It can maintain fairly good write performance long after the system caches have been blown out, but read performance drops to about the same as its write performance (remember, this is blogbench doing reads from random files). Here HAMMER's read performance drops significantly but it is able to maintain write performance. UFS's write performance basically comes to a dead halt. However, HAMMER's performance numbers become 'unstable' once the system caches are blown out."
From: Matthew Dillon <dillon@...>
Subject: HAMMER update 11-June-2008
Date: Jun 11, 8:27 pm 2008
After another round of performance tuning HAMMER all my benchmarks
show HAMMER within 10% of UFS's performance, and it beats the shit
out of UFS in certain tests such as file creation and random write
performance. Read performance is good but drops more than UFS under
heavy write loads (but write performance is much better at the same
time).
I am making progress with blogbench. It turns out that HAMMER isn't
quite as horrible as it first appeared. What was happening was simply
that the test under UFS never got past blog #200, and thus never wrote
enough data to blow out the system caches. The blogbench test builds
up an ever-increasing sized dataset as it progresses.
HAMMER has superior random write performance and because of that it
easily got into the blog #350-400 range in the default test, or about
double the size of the data-set. Plus it was writing more
during the first half of the test so that of course depressed the
read performance a bit relative to UFS.
When I increase the number of iterations sufficiently for both UFS and
HAMMER to blow out the system caches, the results wind up being very
different.
blogbench --iterations=100 -d /mnt/bench2
Now when UFS gets past blog #300 and blows out the system caches, UFS's
write performance goes completely to hell but it is able to maintain
good read performance:
Nb blogs  R articles  W articles  R pictures  W pictures  R comments  W comments
     322       72232          67       55654          88       45740         194
     323       83711          81       64882          81       53844         204
     325       57380          62       43314          62       36603         196
     ...
     345       17494          40       12866          50       12226         137
     347       21895          42       16655          41       13002         128
     347       22803          68       17517          12       14247         122
     348       16976          52       13397          29       12113         119
     348       20068          34       15668          52       13569         135
HAMMER is the opposite. It can maintain fairly good write performance
long after the system caches have been blown out, but read performance
drops to about the same as its write performance (remember, this is
blogbench doing reads from random files). Here HAMMER's read performance
drops significantly but it is able to maintain write performance.
UFS's write performance basically comes to a dead halt. However, HAMMER's
performance numbers become 'unstable' once the system caches are blown
out.
Here is HAMMER:
Nb blogs  R articles  W articles  R pictures  W pictures  R comments  W comments
     297        3904         972        3111         720        3228        3966
     310        2653        1104        1605         751        1740        3936
     325        2703         962        2082         708        1894        2914
     346        3637        1123        2537        1138        2204        4761
     ...
     477        1375         597        1005         700         572        2548
     496        1507        1307         995         900         825        3735
     515        1423        1068         907        1008         569        3877
     ...
     751        1221        1445         817        1086         557        3296
     761        1204         508         719         664         719        1398
     771        1352         438         824         685         525        1856
Performance TODO
I am going to continue to work on random read and write performance,
particularly inconsistencies in HAMMER's performance numbers.
There are some performance issues when running blogbench on a directory
that it has already been run on, with a normal HAMMER mount which is
retaining a full history of all changes. I believe the problem is
related to fragmentation of the directory entries.
I may make two additional media changes:
* I may give directory entries their own blockmap zone or their own
localization parameter (so they can be reblocked separately). I haven't
decided for sure yet.
* I will probably increase the data block size from 16K to 64K for
files larger than 1MB. This will cut the number of B-Tree elements
needed to index large files by a factor of 4.
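The factor of 4 follows from the ratio of the two block sizes. A minimal sketch of the arithmetic in Python, assuming one B-Tree element per data block and ignoring any other per-file records; `btree_elements` is a hypothetical helper for illustration, not a HAMMER function:

```python
KB = 1024
MB = 1024 * KB


def btree_elements(file_size: int, block_size: int) -> int:
    """Data-block elements needed to index a file.

    Assumption: one B-Tree element per data block, rounded up.
    """
    return -(-file_size // block_size)  # ceiling division


if __name__ == "__main__":
    size = 100 * MB  # arbitrary example file, larger than 1MB
    small_blocks = btree_elements(size, 16 * KB)
    large_blocks = btree_elements(size, 64 * KB)
    print(small_blocks, large_blocks, small_blocks / large_blocks)
```

The ratio stays at 4 for any file size that is a multiple of 64K; for other sizes the rounding of the last block makes it slightly different.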
-Matt
Matthew Dillon
<dillon@backplane.com>
From: Matthew Dillon <dillon@...>
Subject: HAMMER UPDATE 10-Jun-2008 (HEADS UP, MEDIA CHANGE!)
Date: Jun 10, 8:15 pm 2008
I have made another change to the HAMMER media structures. I
determined that the B-Tree was using too small a radix so I bumped
it up from 16 to 64. A full recompile of the HAMMER filesystem and
its utilities, including newfs_hammer, is required, and any HAMMER
filesystems must be re-new-FS'd (sorry, that's the way it goes, it's
still under development). I pick up my foot :-)
WARNING! Another media change will occur in the next day or two as
well!
As of commit 53H I believe I have fixed all remaining bugs. BUT (always
a but!)... I added an optimization to the B-Tree code that needs
testing, so you may see some follow-up commits if it turns out I
blew the optimization. The optimization is to not do a linear scan
of a B-Tree node's elements. That was fine when there were 16 elements
but now that there are 64 I changed it to do a power-of-2 narrowing
scan.
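The difference between the two scans can be sketched as follows. This is not HAMMER's code: it is an ordinary binary search that halves the candidate range and finishes with a short linear scan, which is one plausible reading of "power-of-2 narrowing scan". The function names and the cutoff of 4 elements are assumptions made for illustration.

```python
def linear_scan(keys, target):
    """Index of the first key >= target, checking elements in order."""
    for i, key in enumerate(keys):
        if key >= target:
            return i
    return len(keys)


def narrowing_scan(keys, target):
    """Same result as linear_scan on sorted keys, with fewer comparisons.

    Halve the candidate range until at most 4 elements remain
    (an arbitrary cutoff), then scan that small remainder linearly.
    """
    lo, hi = 0, len(keys)
    while hi - lo > 4:
        mid = lo + (hi - lo) // 2
        if keys[mid] < target:
            lo = mid + 1   # answer lies strictly to the right of mid
        else:
            hi = mid       # answer is mid or to its left
    for i in range(lo, hi):
        if keys[i] >= target:
            return i
    return hi


if __name__ == "__main__":
    node = list(range(0, 128, 2))  # 64 sorted keys, like a 64-element node
    print(linear_scan(node, 77), narrowing_scan(node, 77))
```

With 64 elements the linear scan may compare against every key, while the narrowing version needs only a handful of comparisons per lookup.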
Performance is coming along nicely. I've made some progress and tests
such as blogbench show tantalizing possibilities. HAMMER currently has
an issue with a backlog of dirty inodes building up and screwing up
performance for long-running tests. I would have posted this message
before but I screwed up the gpt partition on my raid-1 (it wasn't
aligned to the stripe size), so all my tests blew up in my face.
I will post a follow-up tonight once I whack a few more performance
issues.
I will be fixing sequential write performance issues tonight sometime.
That turned out to be two issues. First, the record limit is set
absurdly low and causing unnecessary flushes. Second, when the write
sees that the records have hit their limit it does a complete flush of
the inode before letting more writes through, when it should only need
to wait for the record count to fall below the limit.
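The second issue can be illustrated with a toy model. The sketch below is not based on HAMMER's code: it assumes a writer that queues a fixed number of records per tick, a flusher that retires a fixed number per tick, and a writer that blocks once the pending count reaches the limit. All names and numbers are made up for illustration. It compares resuming only after a complete flush with resuming as soon as the count falls below the limit.

```python
def longest_stall(limit, produce, drain, ticks, full_flush):
    """Longest run of consecutive ticks the writer spends blocked.

    full_flush=True  : a blocked writer resumes only when nothing is pending.
    full_flush=False : a blocked writer resumes once pending < limit.
    """
    pending = 0
    blocked = False
    stall = longest = 0
    for _ in range(ticks):
        pending = max(0, pending - drain)      # flusher retires records
        if blocked:
            resume_at = 0 if full_flush else limit - 1
            if pending <= resume_at:
                blocked = False
        if blocked:
            stall += 1
            longest = max(longest, stall)
        else:
            stall = 0
            pending += produce                 # writer queues new records
            if pending >= limit:
                blocked = True
    return longest


if __name__ == "__main__":
    for policy in (True, False):
        print(policy, longest_stall(100, 10, 5, 1000, full_flush=policy))
```

In this model the long-run throughput is bounded by the flusher either way; what changes is how long each individual write can be held up.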
-Matt
From: Matthew Dillon <dillon@...>
Subject: HAMMER update 09-June-2008
Date: Jun 9, 9:08 pm 2008
With the 53D commit HAMMER has stabilized again. I will again
recommend that people testing HAMMER update and newfs_hammer your
filesystems. 53D located and corrected a data overwrite bug and
53C corrected a freemap bug that could cause pruning/reblocking panics.
I am continuing to work on various performance issues. In particular,
when I run blogbench I can make the system become extremely inefficient
and writes initiated by the flusher, which are not subject to kernel
restrictions, can 'take over' the system and cause everything else
trying to write to a file to block for very long periods of time.
I am considering making another change to the on-media format to
increase the size of the B-Tree node from 16 elements to 64 elements.
B-Tree operations appear to be HAMMER's only major hangup right now.
Blogbench has shown that B-Tree updates can wind up being extremely
disk-inefficient. I have not made a final decision on the matter yet,
I need to play with the flusher's B-Tree updates for a few days first.
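One general effect of a larger node is a shallower tree, which means fewer nodes are touched per lookup or update. The email does not give this as the reason for the change, so the following is only a rough sketch of the idealized arithmetic, assuming every node is completely full; `min_levels` is a hypothetical helper.

```python
def min_levels(records: int, fanout: int) -> int:
    """Fewest B-Tree levels able to index `records` elements when
    every node holds up to `fanout` elements (idealized, full nodes)."""
    levels = 1
    capacity = fanout
    while capacity < records:
        capacity *= fanout
        levels += 1
    return levels


if __name__ == "__main__":
    for n in (10_000, 1_000_000, 100_000_000):
        print(n, min_levels(n, 16), min_levels(n, 64))
```

Real trees are rarely full, so actual depths would be somewhat higher for both node sizes.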
-Matt