Re: RFC: Flat directory for notes, or fan-out? Both!

From: Shawn O. Pearce
Date: Tuesday, February 10, 2009 - 12:09 pm

Junio C Hamano <gitster@pobox.com> wrote:

Don't forget a version number.  Waste 4 bytes now and it's easier
to change the format in the future if we need to.


Yup.  Sort of my thoughts when I was thinking about that external
index for a "git database".

I was considering a much more complex file layout though; one that
would permit editing without completely recopying the file every
time something changes.

More or less a traditional block oriented on-disk M-tree, with
copy-on-write semantics for the blocks.  This would permit us to
quickly append onto the end of the file with new updates, and then
periodically copy and flatten out the file as necessary to
reclaim the prior dead space.

E.g.:

  magic number
  version
  [intermediate blocks ...]
  [leaf blocks...]
  root block

Writers would append modified leaf and intermediate blocks as
necessary to the end of the file, then append a new root block.
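
The append-only write path above can be sketched in a few lines.
This is only an illustration under assumptions: the magic value
"NIDX", the 4-byte big-endian version field, and the helper names
new_file and append_update are all hypothetical, and blocks are
treated as opaque byte strings.

```python
import struct

# Hypothetical header: 4-byte magic number followed by a 4-byte version.
MAGIC = b"NIDX"
VERSION = 1
HEADER = struct.Struct(">4sI")


def new_file():
    """Return an empty index holding only the magic number and version."""
    return bytearray(HEADER.pack(MAGIC, VERSION))


def append_update(buf, blocks, root):
    """Append the modified blocks, then the new root block last.

    Nothing already in the file is overwritten, so older roots and
    the blocks they point to stay readable. Returns the root's offset.
    """
    for block in blocks:
        buf += block
    root_offset = len(buf)
    buf += root
    return root_offset
```

Writing the root last is what lets a reader treat the file tail as
the entry point into the tree.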

Readers would read the file tail and verify it is a root, then scan
with a traditional M-tree search algorithm.

If the root block has a "magic block header" and a strong checksum
at the tail of the block, readers can concurrently read while a
writer is appending.  Any invalid root block just means the reader
is seeing the middle of a write, or an aborted write, and should
scan backwards to locate the prior valid root.
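
A rough sketch of that backwards scan, under assumptions not stated
above: the root's magic header is the 4 bytes "ROOT", it is followed
by a 4-byte payload length, and the strong checksum is a SHA-1 over
header plus payload.  The names make_root and find_valid_root are
made up for the example.

```python
import hashlib
import struct

ROOT_MAGIC = b"ROOT"  # hypothetical "magic block header"


def make_root(payload):
    """Build a root block: magic, payload length, payload, SHA-1 at the tail."""
    body = ROOT_MAGIC + struct.pack(">I", len(payload)) + payload
    return body + hashlib.sha1(body).digest()


def find_valid_root(data):
    """Return (offset, payload) of the last valid root in data, or None."""
    pos = len(data)
    while True:
        # Look for a magic header that lies entirely before pos.
        pos = data.rfind(ROOT_MAGIC, 0, pos)
        if pos < 0:
            return None
        head_end = pos + 8
        if head_end <= len(data):
            (length,) = struct.unpack(">I", data[pos + 4:head_end])
            body_end = head_end + length
            digest = data[body_end:body_end + 20]
            body = data[pos:body_end]
            if len(digest) == 20 and hashlib.sha1(body).digest() == digest:
                return pos, data[head_end:body_end]
        # Checksum mismatch or truncated block: a write in progress or
        # an aborted write. Keep scanning backwards for an older root.
```

A real format would more likely use fixed-size, aligned blocks so the
reader can step back one block at a time instead of searching for the
magic bytes.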

If the root block also has a commit SHA-1 indicating which commit
that root became valid under, a reader can decide if that root
might give it answers which aren't correct for the current value of
the notes history it is reading, and scan backwards for some older
root block.  We could accelerate that by including the file offset
of the prior root block in each new root.
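
Following the prior-root offsets might look like the sketch below.
The record layout (20-byte commit SHA-1 plus an 8-byte offset, with 0
meaning "no prior root") and the names append_root and pick_root are
assumptions for illustration.  How a reader decides that a commit is
acceptable is left open above, so here it is just a set passed in by
the caller.

```python
import struct

# Hypothetical root record: commit SHA-1, then the offset of the prior
# root. Offset 0 means "none", since the file header lives at offset 0.
ROOT = struct.Struct(">20sQ")


def append_root(buf, commit, prev_offset):
    """Append a root record to buf and return the offset it was written at."""
    offset = len(buf)
    buf += ROOT.pack(commit, prev_offset)
    return offset


def pick_root(buf, newest_offset, acceptable):
    """Walk back through prior roots until one has an acceptable commit.

    Returns the offset of that root, or None if the chain runs out.
    """
    offset = newest_offset
    while offset:
        commit, prev_offset = ROOT.unpack_from(buf, offset)
        if commit in acceptable:
            return offset
        offset = prev_offset
    return None
```

With the offsets in place the reader jumps straight from root to root
and never has to scan the blocks in between.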

GC compacting the file is just a matter of write-locking the file
to block out a new writer, then traversing the current root and
copying all blocks that are reachable.
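
The copying step could be sketched like this.  It models the file as
a dict from block offset to (child offsets, payload), which is an
assumption made to keep the example short; the name compact is
hypothetical, and the write lock is left out.

```python
def compact(blocks, root):
    """Copy every block reachable from root into a new list, root last.

    blocks maps an old offset to (child_offsets, payload).
    Returns (new_blocks, new_root_offset); child offsets in the copy
    are rewritten to point at the new positions.
    """
    new_blocks = []
    moved = {}  # old offset -> new offset

    def copy(old):
        if old in moved:
            return moved[old]
        children, payload = blocks[old]
        # Copy children first so they land before their parent,
        # matching the "leaves, then intermediates, then root" layout.
        new_children = [copy(child) for child in children]
        new_blocks.append((new_children, payload))
        moved[old] = len(new_blocks) - 1
        return moved[old]

    new_root = copy(root)
    return new_blocks, new_root
```

Blocks that only older roots point to are never visited, so the dead
space falls away without any separate bookkeeping.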

</end-hand-waving>


Ooh, great idea.  If we could toss rerere data into something that
can be transported around, and efficiently accessed.  I like it.

-- 
Shawn.
