Re: [2/3] POHMELFS: Documentation.

To: Evgeniy Polyakov <johnpol@...>
Cc: Jamie Lokier <jamie@...>, <linux-kernel@...>, <netdev@...>, <linux-fsdevel@...>
Date: Sunday, June 15, 2008 - 12:27 am

Hi Evgeniy,

On Sat, 14 Jun 2008, Evgeniy Polyakov wrote:

By synchronous/asynchronous, are you talking about whether writepages() 
blocks until the write is acked by the server?  (Really, any FS that does 
writeback is writing asynchronously...)


Well... Ceph writes synchronously (i.e. waits for ack in write()) only 
when write-sharing on a single file between multiple clients, when it is 
needed to preserve proper write ordering semantics.  The rest of the time, 
it generates nice big writes via writepages().  The main performance issue 
is with small files... the fact that writepages() waits for an ack and is 
usually called from only a handful of threads limits overall throughput.  
If the writeback path were asynchronous as well, that would definitely help 
(provided writeback is still appropriately throttled).  Is that what 
you're doing in POHMELFS?
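
To make the throttling point concrete, here is a minimal user-space sketch in Python of an asynchronous writeback queue. It is not Ceph or POHMELFS code, and every name in it (ThrottledWriteback, submit, max_in_flight, send) is hypothetical. The idea it illustrates: the submitter does not wait for the server's ack, but it does block once a fixed number of writes are outstanding, so writeback stays bounded.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class ThrottledWriteback:
    """Hypothetical async writeback queue (illustration only).

    submit() returns before the write is acked, but blocks once
    max_in_flight writes are still unacknowledged.
    """

    def __init__(self, send, max_in_flight):
        self._send = send  # assumed: callable that blocks until the server acks
        self._slots = threading.BoundedSemaphore(max_in_flight)
        self._pool = ThreadPoolExecutor(max_workers=max_in_flight)
        self._lock = threading.Lock()
        self.in_flight = 0
        self.peak_in_flight = 0
        self.acked = 0

    def submit(self, page):
        self._slots.acquire()  # throttle: wait here until a slot is free
        with self._lock:
            self.in_flight += 1
            self.peak_in_flight = max(self.peak_in_flight, self.in_flight)
        self._pool.submit(self._write, page)  # do not wait for the ack

    def _write(self, page):
        try:
            self._send(page)
        finally:
            with self._lock:
                self.in_flight -= 1
                self.acked += 1
            self._slots.release()  # the ack (or failure) frees the slot

    def flush(self):
        # Wait for every outstanding write to complete.
        self._pool.shutdown(wait=True)
```

With max_in_flight=1 this degenerates into the synchronous case (each write waits for the previous ack); larger values let a handful of submitting threads keep many small writes in flight at once.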


Your meaning of "transaction" confused me as well.  It sounds like you 
just mean that the read/write operation is retried (asynchronously), and 
may be redirected at another server if need be.  And that writes can be 
directed at multiple servers, waiting for an ack from both.  Is that 
right?
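
Spelled out as code, that reading of "transaction" would look roughly like the sketch below. This is only an illustration of the interpretation in the paragraph above, not the POHMELFS implementation: run_transaction, ServerDown, copies and max_attempts are made-up names, and the sketch runs synchronously for simplicity even though the retries described above happen asynchronously.

```python
class ServerDown(Exception):
    """Hypothetical error raised when a server does not ack an operation."""


def run_transaction(op, servers, copies=1, max_attempts=3):
    """Complete op on `copies` distinct servers and return those servers.

    Each server is retried up to max_attempts times; if it never acks,
    the operation is redirected to the next server in the list.
    """
    acked = []
    for server in servers:
        if len(acked) == copies:
            break
        for _ in range(max_attempts):
            try:
                op(server)  # assumed to return only once the server has acked
            except ServerDown:
                continue  # retry the same server
            acked.append(server)
            break  # acked; move on to the next copy
    if len(acked) < copies:
        raise ServerDown("only %d of %d copies acked" % (len(acked), copies))
    return acked
```

A read would use copies=1 (any one server will do), while a mirrored write would use copies=2 and complete only after both servers have acked.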

In my view the writeback metadata cache is definitely the most exciting 
part about this project.  Is there a document that describes where the 
design ended up?  I seem to remember a string of posts describing your 
experiments with client-side inode number assignment and how that is 
reconciled with the server.  Keeping things consistent between clients is 
definitely the tricky part, although I suspect that even something with 
very coarse granularity (e.g., directory/subtree-based locking/leasing) 
will capture most of the performance benefits for most workloads.
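
As a rough picture of what coarse, directory-based leasing could mean, here is a small Python sketch. It is an assumption-laden toy, not how Ceph or POHMELFS does it: SubtreeLeases and its methods are hypothetical, paths are plain strings, and lease expiry and revocation are left out entirely. A client holding a lease on a directory may cache metadata changes for anything beneath it, and no other client may lease an overlapping subtree.

```python
class SubtreeLeases:
    """Hypothetical table of exclusive per-directory leases (illustration only)."""

    def __init__(self):
        self._holders = {}  # directory path -> client id

    @staticmethod
    def _covers(root, path):
        # True if path is root itself or lies somewhere beneath it.
        return root == "/" or path == root or path.startswith(root + "/")

    def acquire(self, client, directory):
        """Grant the lease unless another client holds an overlapping subtree."""
        for root, holder in self._holders.items():
            overlaps = self._covers(root, directory) or self._covers(directory, root)
            if overlaps and holder != client:
                return False
        self._holders[directory] = client
        return True

    def may_cache(self, client, path):
        """A client may cache metadata writes under any directory it has leased."""
        return any(
            holder == client and self._covers(root, path)
            for root, holder in self._holders.items()
        )

    def release(self, client, directory):
        if self._holders.get(directory) == client:
            del self._holders[directory]
```

For example, once client "c1" holds "/home/a", a request from "c2" for "/home/a/src" or for "/home" is refused, while "/home/b" is granted; that is the sense in which most workloads, where clients mostly work in separate directories, would rarely conflict.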

Cheers-
sage