On Thu, 2008-01-31 at 18:50 +0300, Vladislav Bolkhovitin wrote:
I think the really interesting numbers are the difference for bulk I/O
between kernel and userspace on both traditional iSCSI and the RDMA
enabled flavours. I have not been able to determine anything earth
shattering from the current run of kernel vs. userspace tests, nor which
implementation method for iSER, SRP, and the generic Storage Engine is
'more effective' for that case. Performance and latency numbers against
real storage would make a lot more sense for the kernel vs. user case.
Workloads against software LVM and Linux MD block devices would also be
of interest, as these are some of the more typical deployments in the
field, and are what Linux-iSCSI.org uses for our production cluster
storage today.
Having implemented my own iSCSI and SCSI Target mode Storage Engine
leads me to believe that putting logic in userspace is probably a good
idea in the long term. If this means putting the entire data I/O path
into userspace for Linux/iSCSI, then there needs to be a good reason why
this will not scale to multi-port 10 Gb/sec engines in traditional
and RDMA mode if we need to take this codepath back into the kernel.
The end goal is to have the most polished and complete storage engine
and iSCSI stack designs go upstream, which is something I think we can
all agree on.
Also, with STGT being a pretty new design which has not undergone a lot
of optimization, perhaps profiling both pieces of code against similar
tests would give us a better idea of where userspace bottlenecks reside.
The overhead involved with traditional iSCSI for bulk I/O from
kernel / userspace would also be a key concern for a much larger set of
users, as iSER and SRP on IB have a pretty small userbase that will
probably remain small for the near future.
Yes, people like to claim their stacks are the fastest with RAM disk
benchmarks. But hooking up their fast network silicon to existing
storage hardware and OS storage subsystems and software is where the
real game is.
Being able to have a best case baseline with disktest for kernel vs.
user would be of interest for both transport protocol and SCSI Target
mode Storage Engine profiling. The first run of tests looked pretty
bandwidth oriented, so disktest works well to determine maximum bandwidth.
Disktest is also nice for getting reads from cache on hardware RAID
controllers because disktest only generates requests with LBAs from 0 ->
disktest BLOCKSIZE.
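To illustrate the access pattern described above (every request landing in
the same small leading LBA range), here is a rough Python sketch. It is not
disktest and does not use any disktest options; the function name
reread_bandwidth and its parameters are made up for illustration. It also
goes through the local page cache unless the device is opened with
O_DIRECT, so treat it as a picture of the pattern rather than a benchmark.

```python
import os
import tempfile
import time


def reread_bandwidth(path, block_size=64 * 1024, passes=1000):
    """Read the same leading block of `path` repeatedly.

    Returns (total_bytes_read, approximate MB/s). Every request targets
    offset 0, so after the first pass the data is normally served from
    some cache (host page cache here; controller cache if the host cache
    is bypassed) rather than from the medium.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        start = time.perf_counter()
        for _ in range(passes):
            # pread() takes an explicit offset, so the file position
            # never advances and the same block is requested each time.
            total += len(os.pread(fd, block_size, 0))
        elapsed = time.perf_counter() - start
    finally:
        os.close(fd)
    return total, total / (1024 * 1024) / max(elapsed, 1e-9)


if __name__ == "__main__":
    # Hypothetical stand-in for a block device: a small temporary file.
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"\0" * 64 * 1024)
        name = tmp.name
    try:
        nbytes, mbps = reread_bandwidth(name)
        print(f"read {nbytes} bytes, ~{mbps:.1f} MB/s (cache-served)")
    finally:
        os.unlink(name)
```

Pointing something like this at a real LUN would show a best case number
only; it says nothing about behaviour once the working set exceeds the
cache.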
--nab
--