Sage Weil wrote:

For critical metadata which is needed to access a lot of data, it's done: even ext3 replicates superblocks. These days there are content and search indexes, and journals. They aren't replication but are related in some ways, since parts of the data are duplicated and voting protocols can feed into that.

There's also RAID6 and similar parity/coding. The data is not fully replicated, saving space, but the coordination is similar to N>=3 way replication. Now apply that over a network. Or even local disks, if you were looking to boost RAID write-commit performance.

(Generalising to any "quorum" (majority vote) protocol). That's true if you require that all results are guaranteed consistent or blocked, in the event of any kind of network failure. But if you prefer incoherent results in the event of a network split (and those are often mergeable later), and only want to protect against media/node failures to the best extent possible at any given time, then quorum protocols can gracefully degrade so you still have access without a majority of working nodes.

That is a very useful property. (I think it more closely mimics the way some human organisations work too: we try to coordinate, but when communications are down, we do the best we can and sync up later.)

In that model, neighbour sensing is used to find the largest coherency domains fitting a set of parameters (such as "replicate datum X to N nodes with maximum comms latency T"). If the parameters can be met, quorum gives you the desired robustness in the event of node/network failures. While the coherency parameters cannot be met, the robustness reduces to the best it can do temporarily, and recovers later when possible. As a bonus, you have some timing guarantees if they are more important.

This is pretty much the same as RAID durability. You have robustness against failures, still have access in the event of disk failures, and degraded robustness (and performance) temporarily while awaiting a new disk and resynchronising it.

-- Jamie
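
To make the degrading-quorum idea above a bit more concrete, here is a rough Python sketch. It is only an illustration under my own assumptions, not code from any existing system: the names `Node`, `coherency_domain` and `plan_write` are hypothetical, neighbour sensing is reduced to a per-node latency figure and an up/down flag, and "quorum" is taken to mean a simple majority of whichever replica targets can currently be reached within the latency bound T.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    name: str
    latency_ms: float   # assumed result of neighbour sensing
    alive: bool = True  # assumed result of failure detection


def coherency_domain(nodes, max_latency_ms):
    """Nodes that are up and reachable within the latency bound T."""
    return [n for n in nodes if n.alive and n.latency_ms <= max_latency_ms]


def plan_write(nodes, want_replicas, max_latency_ms):
    """Pick replica targets for one datum and the acks needed to commit.

    If fewer than `want_replicas` nodes fit the parameters, the plan is
    marked degraded and the majority is taken over the nodes we do have,
    rather than blocking the write.
    """
    domain = sorted(coherency_domain(nodes, max_latency_ms),
                    key=lambda n: n.latency_ms)
    targets = domain[:want_replicas]
    # Majority of the targets actually in use (0 if nothing is reachable).
    acks_needed = len(targets) // 2 + 1 if targets else 0
    return {
        "targets": [n.name for n in targets],
        "acks_needed": acks_needed,
        "degraded": len(targets) < want_replicas,
    }


if __name__ == "__main__":
    cluster = [
        Node("a", 5.0),
        Node("b", 10.0),
        Node("c", 50.0),
        Node("d", 8.0, alive=False),
    ]
    # Parameters met: three nodes within 100 ms.
    print(plan_write(cluster, want_replicas=3, max_latency_ms=100.0))
    # Parameters not met: only two nodes within 20 ms, so we degrade.
    print(plan_write(cluster, want_replicas=3, max_latency_ms=20.0))
```

A real implementation would also need to record which writes were made while degraded, so they can be resynchronised (or merged) once the full domain is reachable again, much as a RAID array resyncs a replacement disk. The sketch leaves that out.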
