On Fri, Apr 23, 2010 at 11:06:32AM +1000, Dave Chinner wrote:

2G RAM, AMD Phenom with 4 cores.

Compile-based loads that fill memory and put the system under heavy memory pressure while also dirtying pages. While they are running, a kernel module is loaded that starts allocating huge pages one at a time so that accurate timing and the state of the system can be gathered at allocation time. The number of allocation attempts is 90% of the number of huge pages that exist in the system.

Yes, but unfortunately they are not in a publishable state. Parts of them depend on an automation harness that I don't hold the copyright to.

Unfortunately, I don't know what the effect on the underlying load is, as it takes longer than the huge page allocation attempts do. The test's objective is to check how well lumpy reclaim works under memory pressure.

However, the time it takes to allocate a huge page increases with direct reclaim disabled (i.e. your patch) early in the test, up until about 40% of memory was allocated as huge pages. After that, the latencies with disable-directreclaim are lower until it gives up, while the latencies with enable-directreclaim increase. In other words, with direct reclaim writing back pages, lumpy reclaim is a lot more determined to get the pages cleaned and to wait on them if necessary.

A compromise patch might be to wait for the page's dirty flag to be cleared (a wait_on_page_dirty, say) instead of queueing the IO and calling wait_on_page_writeback? How long it stalled would depend heavily on the rate at which pages were getting cleaned in the background.

You are using the nr_hugepages interface and writing a large number to it, so you are also triggering the hugetlbfs retry logic and have little control over how many times the allocator gets called on each attempt. How many huge pages it allocates depends on how much progress it is able to make during lumpy reclaim. It's why the tests I run allocate huge pages one at a time and measure the latencies as they go. The results tend to be quite reproducible.
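For illustration only, a rough user-space sketch of allocating one page at a time through the nr_hugepages interface and timing each attempt. This is not the kernel module described above; the function names are made up for the example, it assumes the write to /proc/sys/vm/nr_hugepages returns only once the kernel has finished trying, and it needs root when pointed at the real file.

```python
import time
from pathlib import Path

# Real sysctl file; writing to it requires root.
NR_HUGEPAGES = Path("/proc/sys/vm/nr_hugepages")


def read_count(path):
    """Return the current huge page pool size reported by `path`."""
    return int(path.read_text().split()[0])


def allocate_one_at_a_time(attempts, path=NR_HUGEPAGES):
    """Ask for one extra huge page per attempt and time each request.

    Returns a list of (latency_seconds, succeeded) tuples. `path` is a
    parameter only so the loop can be exercised against an ordinary
    file; that is a convenience of this sketch, not part of the
    interface.
    """
    results = []
    for _ in range(attempts):
        before = read_count(path)
        start = time.monotonic()
        # Request exactly one more page than currently exists, so each
        # attempt asks the allocator for a single additional huge page.
        path.write_text(f"{before + 1}\n")
        latency = time.monotonic() - start
        # The kernel may not satisfy the request; re-read to find out.
        succeeded = read_count(path) > before
        results.append((latency, succeeded))
    return results
```

Calling allocate_one_at_a_time(n) as root while the background load is running gives a per-attempt latency and a success flag, which is enough to plot latency against the fraction of memory already held as huge pages.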
Success figures would be the same between runs and the rate of allocation success would generally be comparable as well. Your test could do something similar by only ever requesting one additional page. It will be good enough to measure allocation latency. Gathering other system state at the time of failure is not very important here (whereas it was important during anti-frag development, hence the use of a kernel module).

Typically, there is not much variance between tests. Maybe 1-2% in allocation success rates.

With writeback, lumpy reclaim takes a range of pages, cleans them, and waits for the IO before moving on. This causes a seeky IO pattern and takes time. It also causes a fair amount of thrashing.

With your patch, lumpy reclaim would just skip over ranges with dirty pages until it found clean pages in a suitable range. When there is plenty of usable memory early in the test, it probably scans more but causes less IO, so it would appear faster. Later in the test, it scans more but eventually encounters too many dirty pages and gives up. Hence, its success rates will be more random because they depend on where exactly the dirty pages were.

If this is accurate, it will always be the case that your patch causes less disruption in the system and will appear faster due to the lack of IO, but it will be less predictable and give up more easily, so it will have lower success rates when there are dirty pages in the system.

The underlying workload is only important in how many pages it is dirtying at any given time. Heck, at one point my test workload was a single process that created a mapping the size of physical memory and in test a) would constantly read it and in test b) would constantly write it. Lumpy reclaim with dirty-page-writeback was always more predictable and had higher success rates.

-- 
Mel Gorman
Part-time Phd Student, University of Limerick
Linux Technology Center, IBM Dublin Software Lab
