On Wed, 28 May 2008, Benjamin Herrenschmidt wrote:
Well..
There are really two different issues:
(a) x86 and the fact that we have thousands of drivers
which in turn conflicts with
(b) non-x86 and the fact that other architectures tend to be absolute
pieces of cr*p when it comes to ordering, _especially_ across IO.
and the thing about (b) is that the number of drivers involved is a hell
of a lot smaller. For example, ia64 and the big SGI machines probably
really only care about roughly five drivers (number taken out of my nether
regions).
So practically speaking, I suspect that the right approach is to just say
"ok, x86 will continue to be pretty darn ordered, and the barriers won't
really matter (*)" but at the same time also saying "we wish reality was
different, and well-maintained drivers should probably try to work in the
presence of re-ordering".
In *practice*, that probably means that most architectures will be better
off if they emulate x86 closely, just because that means that they won't
rely on drivers always getting things right, but I think we can leave the
door open for the odd machines. We should just realize that they will
never get a lot of testing, but on the other hand, their usage scenarios
will generally also be very limited (very specific loads, and _very_
specific hardware).
And the patch I sent out actually made "__[raw_]readl()" different from
"readl()" on x86 too, in that the assembly _allows_ a bit more
re-ordering, even though I doubt it will be visible in practice. So if you
use the "__" versions, you'd better have barriers even on x86!
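
To make that last point concrete, here is a rough user-space sketch of the distinction. It is not the patch itself: the sketch_* names, the struct, and the choice of a compiler-only barrier are all made up for illustration, and it assumes GCC/Clang inline asm syntax. Real driver code would use the kernel's own accessors and barriers (rmb() and friends) instead.

```c
#include <stdint.h>

/* Hypothetical compiler-only barrier (GCC/Clang syntax). It stops the
 * compiler from moving memory accesses across it; it emits no CPU fence. */
#define sketch_barrier() __asm__ __volatile__("" ::: "memory")

/* Hypothetical relaxed accessor: a bare volatile load. Nothing here stops
 * the compiler from moving ordinary (non-volatile) accesses around it. */
static inline uint32_t sketch_raw_readl(const volatile void *addr)
{
	return *(const volatile uint32_t *)addr;
}

/* Hypothetical ordered accessor: the same load, fenced on both sides so
 * the compiler keeps surrounding memory accesses where they were written. */
static inline uint32_t sketch_readl(const volatile void *addr)
{
	uint32_t v;

	sketch_barrier();
	v = sketch_raw_readl(addr);
	sketch_barrier();
	return v;
}

/* Made-up device layout: one status register plus a plain memory buffer
 * that the "device" is assumed to fill before setting bit 0 of status. */
struct sketch_dev {
	volatile uint32_t status;
	uint32_t dma_buf[4];
};

/* With the relaxed accessor, the caller supplies the barrier between the
 * register read and the dependent access to ordinary memory. */
static inline uint32_t sketch_poll_then_read(struct sketch_dev *dev)
{
	uint32_t status = sketch_raw_readl(&dev->status);

	sketch_barrier();	/* needed because the raw read implies none */
	return (status & 1u) ? dev->dma_buf[0] : 0;
}
```

A caller that used sketch_readl() in sketch_poll_then_read() could drop the explicit barrier; with the raw variant it has to stay, which is the whole point of the "__" naming.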
Linus
(*) With the possible but unlikely exception of some big machines with
separate IO networks, but if they happen they will fall into the 'ia64'
case of just having a few relevant drivers.