See the links I posted and quoted in an earlier message up the thread if you
don't remember what you wrote yourself.
I originally only addressed the fragmentation argument (or rather only
argued against it), until I was corrected by both Ingo and you in the
earlier thread, where you both insisted that 50k threads were the real
raison d'etre for 4k stacks.
Are you saying that was wrong and the fragmentation issue was actually the
real reason for 4k stacks? If both you and Ingo can agree on that
I would be happy to forget the 50k threads :)
On a 32-bit kernel?
My estimate is that you need around 32k for a functional blocked thread
in a network server (8k + 2*4k for poll with a large fd table and wait queues +
some pinned dentries and inodes + misc other stuff). With 20k threads you're
625MB into your lowmem, which leaves about 200MB on a 3:1 system with 16GB
(and ~128MB mem_map). That might work for some time, but I expect it will fall
over at some point because there is just too much pinned lowmem
and not enough left for other stuff (like networking buffers etc.).
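
As a rough sketch of the arithmetic above (Python, purely illustrative): the
32k per thread and the ~128MB mem_map are the estimates from this mail; the
function names and the lowmem size passed in at the bottom are assumptions
made up for the example, not measured values.

```python
KIB = 1024
MIB = 1024 * KIB

PER_THREAD = 32 * KIB   # estimate from the mail: ~32k pinned per blocked thread
MEM_MAP = 128 * MIB     # estimate from the mail: ~128MB mem_map with 16GB RAM


def pinned_mib(threads, per_thread=PER_THREAD):
    """Lowmem pinned by `threads` blocked threads, in MiB."""
    return threads * per_thread / MIB


def lowmem_left_mib(threads, lowmem_mib, mem_map=MEM_MAP):
    """Lowmem left after mem_map and per-thread overhead, in MiB.

    lowmem_mib is supplied by the caller: it is an assumption about the
    kernel configuration, not a number taken from the mail.
    """
    return lowmem_mib - mem_map / MIB - pinned_mib(threads)


if __name__ == "__main__":
    # ASSUMPTION: a commonly quoted lowmem size for a 3:1 split; adjust to taste.
    assumed_lowmem_mib = 896
    for n in (10_000, 20_000, 50_000):
        print(n, "threads:",
              pinned_mib(n), "MiB pinned,",
              lowmem_left_mib(n, assumed_lowmem_mib), "MiB left")
```

The 20k case reproduces the 625MB figure. The "left over" number moves
directly with whatever lowmem size you plug in, so treat it as an
order-of-magnitude check only.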
10k sounds more doable. But again, does 4k more or less make
a big difference with the other thread overhead? I don't think so.
And trading reliability (and functionality -- you basically have to
cut off XFS) just for 4k/thread doesn't seem like a good bargain to
me. Especially with kernel code getting more complicated all the time.
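
To put a number on "4k more or less", here is a small Python sketch. It
assumes the 8k term in my ~32k per-thread estimate is the kernel stack, so
that a 4k stack would bring the footprint down to ~28k; the function names
are made up for the example.

```python
KIB = 1024
MIB = 1024 * KIB

PER_THREAD_8K_STACK = 32 * KIB  # estimate from the mail, with an 8k stack
STACK_SAVING = 4 * KIB          # ASSUMPTION: 4k stacks save exactly 4k/thread


def saving_fraction(per_thread=PER_THREAD_8K_STACK, saving=STACK_SAVING):
    """Share of the per-thread footprint that the smaller stack removes."""
    return saving / per_thread


def saved_mib(threads, saving=STACK_SAVING):
    """Total lowmem saved across `threads` threads, in MiB."""
    return threads * saving / MIB


if __name__ == "__main__":
    print("per-thread saving:", saving_fraction() * 100, "%")
    for n in (10_000, 20_000):
        print(n, "threads save", saved_mib(n), "MiB")
```

Under these assumptions the smaller stack trims one eighth of the per-thread
footprint; the other seven eighths stay pinned either way.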
Well, if it is that serious a problem, surely it will have hit some public
bugzillas or mailing lists? Arguing against something secret is also not
very useful.
Also, I find it important to always reevaluate assumptions when new
facts come up. In this case we should reevaluate a decision that made
sense[1] in 2.4 in light of the new facts of 2.6 (e.g. the new VM with much
better reclaim).
[1] referring to the fragmentation argument, not the 50k threads, which
were always unrealistic.
-Andi