This is the listing of the open bugs that are relatively new, around 2.6.22
and up. They are vaguely classified by specific area.
(not a full list, there are more :)

The good part is that reporters of the bugs below are still around and
haven't dissipated, or disposed of their hardware, so it is a good time to
get the bugs.

Those bugzillas that have been started as regressions on Rafael's list are
not mentioned here so far, since they are being tracked as new regressions
already.

It would be appreciated if the corresponding maintenance team could take a
look, close off any which are fixed and see if they can fix any which
aren't.

NOTE: when replying to this email, please add the bug number to the Subject
in the form [Bug 1234] so that bugzilla will capture the discussion. Thanks.

ACPI====================================================================

System does not load without acpi=off ide=nodma noapic
http://bugzilla.kernel.org/show_bug.cgi?id=9358
Kernel: 2.6.23.1

ACPI Error attaching device data
http://bugzilla.kernel.org/show_bug.cgi?id=9354
Kernel: 2.6.24-rc2

/proc/acpi/battery displays Incorrect voltages
http://bugzilla.kernel.org/show_bug.cgi?id=9341
Kernel: 2.6.23.1

PATA scan: ACPI Exception AE_AML_PACKAGE_LIMIT... is beyond end of object
http://bugzilla.kernel.org/show_bug.cgi?id=9320
Kernel: 2.6.24-rc2
(Tejun: calling _GTF without calling _STM first. _GTM doesn't have any
prerequisite (it can't). Can someone familiar with ACPI tell me why the
method is failing? At any rate, libata should work fine regardless of ACPI
failures. Maybe it's time to start blacklist to skip ATA-ACPI for some
boards to avoid those annoying messages during boot)

ACPI Battery Info in /sys but not /proc/acpi
http://bugzilla.kernel.org/show_bug.cgi?id=9183
Kernel: 2.6.23-rc8-mm2

When using ACPI on a Compaq Presario V6221EU the laptop goes into deadlock
after a random amount of time
http://bugzilla.kernel.org/show_bug.cgi?id=9118
Kernel: 2.6.23-rc6

ACPI video ...
No response from developers

So I count around seven reports which people are doing something with and
twenty seven which have been just ignored.

Three of these reports have been identified as regressions. All three of
those remain unresponded to.
-
Urm, well, if no-one ever tells the SCSI list it's unrealistic to expect
anyone to be working on it.

As far as I can tell, email was sent to Andrew Vasquez only on 31 October.
However, the fault looks to be generic, so he probably just dropped it.

This seems to be the significant line from the trace:

Oct 7 23:35:07 t-host kernel: ISP2422: PCI-X Mode 1 (133 MHz) @ 0000:01:03.0 hdma-, host#=1, fw=4.00.27 [IP]
Oct 7 23:35:07 t-host kernel: ACPI: PCI Interrupt 0000:01:03.1[B] -> GSI 29 (level, low) -> IRQ 22
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Found an ISP2422, irq 22, iobase 0xf8cf4000
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Configuring PCI space...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Configure NVRAM parameters...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Verifying loaded RISC code...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Allocated (64 KB) for EFT...
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1: Allocated (1413 KB) for firmware dump...
Oct 7 23:35:07 t-host kernel: scsi2 : qla2xxx
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.1:
Oct 7 23:35:07 t-host kernel: QLogic Fibre Channel HBA Driver: 8.01.07-k7
Oct 7 23:35:07 t-host kernel: QLogic QLA2462 - PCI-X 2.0 to 4Gb FC, Dual Channel
Oct 7 23:35:07 t-host kernel: ISP2422: PCI-X Mode 1 (133 MHz) @ 0000:01:03.1 hdma-, host#=2, fw=4.00.27 [IP]
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.0: LIP reset occured (f8f7).
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.0: LIP occured (f8f7).
Oct 7 23:35:07 t-host kernel: qla2xxx 0000:01:03.0: LOOP UP detected (4 Gbps).
Oct 7 23:35:07 t-host kernel: ohci_hcd 0000:03:00.0: auto-stop root hub
Oct 7 23:35:07 t-host kernel: ohci_hcd 0000:03:00.1: auto-stop root hub
Oct 7 23:35:07 t-host kernel: scsi 1:0:0:0: Direct-Access transtec PV610F16R1C 348B PQ: 0 ANSI: 4
Oct 7 23:35:07 t-host kernel: kobject_add failed for 1:0:0:0 with -EEXIST, don't try to register things with ...
It seems that new SCSI bugs need to be sent to linux-scsi@vger.kernel.org. Martin, can you arrange that to happen automatically instead of Andrew having to do it manually? --- ~Randy -
This is a technical issue with vger.kernel.org mailing lists that I've tried addressing before - maybe davem can help fix it? What I've tried doing is bouncing relevant postings from my procmail filters to the list, but it seems to drop bounces (probably as spam). Is there any way around this? (like can I get an exception to be allowed to bounce stuff or mark it with some magic X-secret-knock: header?) -
From: "Martin Bligh" <mbligh@google.com> I think the problem is that certain mail headers show up multiple times and this makes it look like a looping email so we toss it. I suspect the one you really need to block out is X-MailingList or similar. Why don't you do a few tries and I'll try to remember to keep an eye out for the bounces? Thanks. -
I looked at that one and decided not to forward it to anyone because it was Please let me know asap if/when this starts working so I don't start forwarding duplicates everywhere. -
From: Andrew Morton <akpm@linux-foundation.org>

That's funny, then how come there was a proper patch fix posted and it's
now in my tree ready to go to Linus?

I think you like just saying "No response from developers" over and over
again to make some kind of point about how developers are ignoring lots of
bugs.

That's fine, but at least be accurate about it :-)
-
From: Andrew Morton <akpm@linux-foundation.org>

Do you feel that making us feel and look like shit helps?

I guess I'm just masturbating here all night long with the 46 bug fixes
I've reviewed fully and queued up into my tree. Along with all the 10 or so
-stable submissions I did tonight as well.

When someone like me is bug fixing full time, I take massive offense to the
impression you're trying to give especially when it's directed at the
networking.

So turn it down a notch Andrew.

I bet if you did things like list explicitly by name every single person
who adds a bug fix (however trivial) to an -mm release instead of a new
feature, you'll better achieve your goal than what you're doing here.
-
That doesn't answer my question.

See, first we need to work out whether we have a problem. If we do this,
then we can then have a think about what to do about it.

I tried to convince the 2006 KS attendees that we have a problem and I
resoundingly failed. People seemed to think that we're doing OK.

But it appears that data such as this contradicts that belief.

This is not a minor matter. If the kernel _is_ slowly deteriorating then
this won't become readily apparent until it has been happening for a number
of years. By that stage there will be so much work to do to get us back to
an acceptable level that it will take a huge effort. And it will take a
long time after that for the kernel to get its reputation back.

So it is important that we catch deterioration *early* if it is happening.
-
From: Andrew Morton <akpm@linux-foundation.org> You tell me what I should spend my time working on, and I promise to do it OK? :-) For example, if I have a choice between a TCP crash just about anyone can hit and some obscure issue only reported with some device nearly nobody has, which one should I analyze and work on? That's the problem. All of us prioritize and it means the chaff collects at the bottom. You cannot fix that except by getting more bug fixers so that the chaff pile has a chance to get smaller. Luckily if the report being ignored isn't chaff, it will show up again (and again and again) and this triggers a reprioritization because not only is the bug no longer chaff, it also now got a lot of information tagged to it so it's a double worthwhile investment to work on the problem. I think a lot of bugs that "aren't getting looked at" are simply sitting in some early stage of this process. -
Can't we wait until all regressions[0] are fixed before releasing a new
2.6.x? I'd consider regressions a *literal* show stopper, and with this
policy they just have to be fixed, nothing would "slide"...

my 2 cents,
Christian.

[0] preferably only reproducible regressions, with responsive reporters.
--
BOFH excuse #380:

Operators killed when huge stack of backup tapes fell over.
-
Problem is that everyone would then sit around pumping shiny new features
into their trees waiting until someone else fixes the regressions.

There are a number of process things we _could_ do. Like

- have bugfix-only kernel releases

- Just refuse to merge any non-bugfix patches for a subsystem when it is
  determined that the subsystem has "too many" regressions.

- Create an "if you added a regression, you should fix some other bug too"
  rule.

- probably other stuff.

But we can't/shouldn't do any of that until it is generally agreed that we
have a problem and that the problem is of sufficient magnitude that process
changes are needed to address it. We aren't at that stage yet.

Here's an important point: developers have a fixed amount of development
time. They spend some of that time fixing bugs and the rest of that time on
<otherstuff>. And while one could cook up all sorts of wonderful process
changes, they all would be aimed at a single thing: shifting some of the
developers' time away from <otherstuff> and onto bugfixing.

At this stage the only tool which is being deployed to attempt to bring
about that reprioritisation is suasion. aka "lkml flamewar".
-
There is another possible solution:
Finding more maintainers.
The problem seems to be that there are many people who want to write
drivers for cool shiny new hardware, but not many people willing to
learn to know and maintain existing code.
cu
Adrian
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
rant on :) ... These aren't directed specifically at Andrew, but everyone
who merges patches or is involved in the release process.

Not if you said their regression causing patches will get reverted unless
it can be fixed for release. For everyone not involved, they'll be doing
their

I don't know why this would be better than the above. If you are worried
about perception of released kernels, then it would actually be worse,
because the non-bugfix-only releases will get out there, and the numbering
scheme gets yet another level of complexity.

Make 2.6.24-rc3 your bugfix only release. If you argue that would make

If you don't allow regressions to build up over releases in the first

That's pretty annoying. Everybody tries not to introduce regressions. It's
just a natural part of development. You just have to make the system work
well within that constraint.

It's easy: "if you add a regression, you fix it. If it doesn't get fixed,
we either don't release or we revert your patch."

That takes care of regressions. Now for actually improving quality. I think
to do that you have to encourage code review and bug fixing.

- Take reviewers seriously, and don't allow patches to be merged if they
  have outstanding unaddressed comments (unaddressed doesn't have to mean
  changed completely to reviewers taste, but at least answered questions
  and provided rationale). Don't merge unreviewed patches.

- Prioritise bugfixes and regression fixes. I realise they can be very
  complex changes, but I am disappointed sometimes at how some of my bugfix
  attempts are received. Things like silent and un-reproduceable pagecache
  corruption, memory ordering bugs which will likewise cause silent and
  un-reproduceable data corruption are totally unacceptable IMO; worse than
  most simple (eg. fail-stop, performance) regressions. To see them
  struggle because the patch isn't perfect right off the bat, they're
  deemed too hard, or they clash with some pending "feature" is a pity
  to ...
Actually, I'm pretty happy reverting patches that cause regressions even if
it *can* be "fixed for release". If there isn't a fix available within a
day or two, it should get reverted.

The "fix" can then be re-applying the *fixed* patch - and at that point we
should strive to require the person who re-submits the patch (with fixes)
having to have an Ack from the person who found the problem in the first
place, so that it's verified to actually fix things!

So I really would encourage people to send me emails like

	Please revert commit xyz, because it breaks abc, and there is no
	fix available even though this was reported x days ago. I have
	verified that reverting just that change fixes the issue.

and just make the "because it breaks abc" be specific and clear enough that
I go "Ahh, ok, I'd better revert it".

Also, please notice the latter part of the suggestion above: even if
somebody has bisected down their problem to a specific commit, I really
*do* want to hear that actually undoing the commit on top of the current
tree actually fixes it again, because sometimes that just isn't the case -
sometimes you end up having various interactions that means that reverting
a commit might simply not even work.

I have no trouble at all with reverting commits in general. I think
regressions are serious. So the problematic cases are the cases where:

 - the commit no longer reverts cleanly, or just otherwise introduces other
   infrastructure that other commits that already got merged depend on. So
   sometimes you actually need a patch along with the revert (Andrew does
   that kind of thing anyway, since he works with patches regardless, so
   the "needs a patch" case is obviously not limited to just the problem
   cases)

 - more commonly: it's not entirely clear which commit actually caused the
   problem.

but I *do* want to encourage people to revert (and ask other people to
revert) much more aggressively. I personally try to revert ...
Ok. drivers/net/skge has been broken for several weeks. I have manually fixed the driver at each rc* release since then. Please revert skge changes, the commit that broke driver is 7fb7ac241162dc51ec0f7644d4a97b2855213c32 See http://bugzilla.kernel.org/show_bug.cgi?id=9321 for more information. I think Stephen Hemminger is working on the skge fix, but it has been several days since I've heard anything from him. -- Heikki Orsila Barbie's law: heikki.orsila@iki.fi "Math is hard, let's go shopping!" http://www.iki.fi/shd -
Adrian Bunk does (did?) this with 2.6.16.x, although it always seemed to me like an unrewarded one man show. AFAIK not even the big distros are begging for bugfix-only versions, as they too want to have (sell) new features. Mission critical systems might want to require such versions, Naah, I'm not really in favour of blaming someone. The kernel doesn't have Keeping track of the (number of) regressions / bugs each release seems to True. Implementing "only bugfixes from now on" (i.e. a longer freeze-window) would perhaps speed up the shifting a bit: $developer can still do $otherstuff all day long, but it won't get merged anyway, because True. But I just noticed that I have to distinguish between "flamewars" and "fierce discussions": if I'd imagine a room with ~50 developers/bystanders brainstorming on an issue like this (at the same time, without the wonderful delay of writing/sending an email), it'd feel much more uncomfortable. Christian. -- BOFH excuse #433: error: one bad user found in front of screen -
And congratulations to him for that. We almost entirely dropped 2.6.16, but there's a regression some time since then that makes large MMAPed files a major pain (specifically the dcc database clean takes about 5 minutes on 2.6.16 and about 12 hours on 2.6.20 or 2.6.23 series kernels) But we keep putting off writing a small testcase that can repeat the issue so we can bisect it - because it's working fine with 2.6.16 on that machine. Bron. -
Heh. I suspect you don't even need to bisect it.

The big difference with large mmap'ed files is that later kernels will
actually track dirty ratios for dirty mmap'ed pages. Earlier kernels never
did.

So in older kernels, you can dirty as much memory as you want, and the
kernel will never try to write it back (well - "never" here means one of
either (a) you ask it to with msync or (b) you run out of memory, when the
kernel then totally falls down and the machine is essentially unusable).

So *if* the symptom seems to be that the later kernels do a lot more IO,
then try to change

	/proc/sys/vm/dirty_[background_]ratio

which is just a percentage of memory (defaults to 5% for background and 10%
for foreground dirtying). Turn them both up a lot (say to 50 and 80 percent
respectively) and see if that makes a difference.

If so, you'll be the first one to officially even notice this change, I
think.

		Linus
-
From our sysctl.conf:
# This should help reduce flushing on Cache::FastMmap files
vm.dirty_background_ratio = 50
vm.dirty_expire_centisecs = 9000
vm.dirty_ratio = 80
vm.dirty_writeback_centisecs = 3000
So we've already been running those settings for a while. They didn't
help.
We also gave this thing its very own dedicated ServeRAID card and
associated RAID1 set of high speed SCSI drives (mainly because they
were just sitting there already attached to the machine and unused,
we don't love DCC that much) and it didn't help. Helped the rest of
the machine now that the system drive wasn't being pegged 100% for
12 hours a day, but it didn't speed things up any.
It was making some pretty random little scattered changes all through
that file. Hmm.. here's what the developers said about it:
First dbclean creates a new dcc_db file by copying from the old file.
As it copies, it decides whether each record is worth keeping.
That involves looking up the checksums in the old hash table. This
is almost as fast as a simple /bin/cp if the old dcc_db and dcc_db.hash
files fit in RAM.
Then dbclean creates a new dcc_db.hash file. This starts with
creating an empty new dcc_db.hash file. Then the new dcc_db and
dcc_db.hash files are mapped into memory, and dbclean creates pointers
to each checksum in the dcc_db file in the dcc_db.hash file.
While dbclean is running, dccd unmaps everything and tries to stay out
Yay for us. Thankfully it doesn't affect Cyrus's MMAP usage (read only
with direct seek and write calls to change anything, then remap) or we
would have suffered pretty badly!
Guess we'd better get on to building a simple test app. The
mmap file that DCC uses is about 2Gb if that makes any difference:
-rw-r--r-- 1 dcc dcc 2035138560 Nov 15 00:15 dcc_db
-rw-r--r-- 1 dcc dcc 516612096 Nov 14 06:27 dcc_db.hash
The machine has 6Gb of memory and should be able to fit these
files fine:
[root@out1 hm]$ free
total ...

Ok, so something else is up. If the mmap file is 2G, and you have 6G of
RAM, you shouldn't be hitting the dirty limits with those setups.

Of course, it may still be that some accounting thing is simply off, and

Yeah, if you have something that others can see in action, that is sure
going to get more people to look at it.

That said - I'm sincerely hoping that you're not running on a 32-bit
kernel. Because if so, those percentages are percentages of *normal*
memory, not highmem (that got changed at one point after people ran out of
lowmem). So even at 100% dirty limits, it won't let you dirty more than 1GB
on the default 32-bit setup.

		Linus
-
Side note: all of these are obviously still just heuristics. If you really *do* run on a 32-bit kernel, and you want to have the pain, I'm sure you can just disable the dirty limits with a one-liner kernel mod. And if it's useful enough, we can certainly expose flags like that.. Not that I expect that much anybody else will ever care, but it's not like it's wrong to expose the silly heuristics the kernel has to users that have very specific loads. That said, I still do hope you aren't actually using HIGHMEM64G. I was really hoping that the people who had enough moolah to buy >4GB of RAM had long since also upgraded to a 64-bit machine ;) Linus -
I'm afraid we are, which probably explains it. We have a bunch of 64 bit machines, but this particular machine is one of our somewhat more ancient IBM x235 machines. It's got stacks of fast SCSI drives and a couple of hyperthreading Xeons in it. Very nice machine in its day, and very reliable which is why we have kept them, even though at 6RU it chews through disk space. Unfortunately none of the 64 bit machines are world facing, and we're running HIGHMEM64G on a bunch of machines both for consistency value and because we only have one machine left with only 2Gb. I guess we'll be doing the one-liner kernel mod and testing that then. I'd certainly like to build a test case anyway so I'm not spending too much time rebooting that machine, it's also our outbound SMTP gateway. And I'll keep in mind finding a 64 bit capable machine for the role when I can. Thanks for the feedback on this - I'll come back with more details once we've done some testing, but this sounds likely, and I don't think DCC is going to change how it works, so we're stuck supporting it. Bron. -
The thing to look at is "get_dirty_limits()" in mm/page-writeback.c, and
in this particular case it's the
unsigned long available_memory = determine_dirtyable_memory();
that's going to bite you. In particular, note the
x -= highmem_dirtyable_memory(x);
that we do in determine_dirtyable_memory().
So in this case, if you basically remove that line, it will allow all of
memory to be dirtied (including highmem), and then the background_ratio
will work on the whole 6GB.
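
To put rough numbers on this, here is a minimal user-space sketch in C of
how excluding highmem shrinks the dirty limit. It is not the kernel code:
the helper names dirtyable_pages(), dirty_limit_pages() and mb_to_pages()
are made up for illustration, it treats all RAM as dirtyable rather than
summing page-state counters, it ignores the "+ 1" that
determine_dirtyable_memory() adds, and the 4KB page size and the 896MB
lowmem figure used below are assumptions about a default 32-bit setup.

```c
#include <stdbool.h>

#define PAGES_PER_MB 256UL	/* assumption: 4KB pages */

/* Hypothetical helper: convert megabytes to pages. */
static unsigned long mb_to_pages(unsigned long mb)
{
	return mb * PAGES_PER_MB;
}

/* Hypothetical helper: pages that count toward the dirty limits. */
static unsigned long dirtyable_pages(unsigned long total_pages,
				     unsigned long highmem_pages,
				     bool count_highmem)
{
	unsigned long x = total_pages;

	if (!count_highmem)
		x -= highmem_pages;	/* models the "x -= highmem..." line */
	return x;
}

/* Hypothetical helper: apply a percentage such as vm.dirty_ratio. */
static unsigned long dirty_limit_pages(unsigned long dirtyable,
				       int ratio_percent)
{
	return (dirtyable * (unsigned long)ratio_percent) / 100;
}
```

With 6GB of RAM (1572864 pages) of which 896MB (229376 pages) is lowmem,
an 80% ratio gives 183500 pages (roughly 717MB) when highmem is excluded
and 1258291 pages (roughly 4.8GB) when it is counted.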
HOWEVER! It's worth noting that we also have some other old legacy cruft
there that may interfere with your code. In particular, if you look at the
top of "get_dirty_limits()", it *also* does a
unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
global_page_state(NR_ANON_PAGES)) * 100) /
available_memory;
dirty_ratio = vm_dirty_ratio;
if (dirty_ratio > unmapped_ratio / 2)
dirty_ratio = unmapped_ratio / 2;
and that whole "unmapped_ratio" comparison is probably bogus these days,
since we now take the mapped dirty pages into account. That code harks
back to the days before we did that, and dirty ratios only affected
non-mapped pages.
And in particular, now that I look at it, I wonder if it can even go
negative (because "available_memory" may be *smaller* than the
NR_FILE_MAPPED|ANON_PAGES sum!).
We'll fix up a negative value anyway (because of the clamping of
dirty_ratio to no less than 5), but the point is that the whole
"unmapped_ratio" thing probably doesn't make sense any more, and may well
make the dirty_ratio not work for you, because you may have a very small
unmapped_ratio that effectively makes all dirty limits always clamp to a
very small value.
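
The clamping described here can be modelled in user space. The following is
a rough sketch, assuming only the logic quoted above plus the floor of 5
mentioned in the text; effective_dirty_ratio() is a hypothetical name, page
counts are passed in as plain arguments instead of being read from
global_page_state(), and signed arithmetic is used so the negative case is
easy to see.

```c
/*
 * Hypothetical user-space model of the unmapped_ratio clamp.
 * mapped_pages stands in for NR_FILE_MAPPED + NR_ANON_PAGES.
 * available_pages must be non-zero.
 */
static int effective_dirty_ratio(int vm_dirty_ratio,
				 unsigned long mapped_pages,
				 unsigned long available_pages)
{
	long unmapped_ratio;
	int dirty_ratio = vm_dirty_ratio;

	/* Goes negative when mapped_pages exceeds available_pages. */
	unmapped_ratio = 100 - (long)((mapped_pages * 100UL) / available_pages);

	/* Never allow more than half of the unmapped percentage. */
	if (dirty_ratio > unmapped_ratio / 2)
		dirty_ratio = (int)(unmapped_ratio / 2);

	/* The floor the text refers to: never less than 5. */
	if (dirty_ratio < 5)
		dirty_ratio = 5;

	return dirty_ratio;
}
```

In this model a configured ratio of 80 comes out as 50 when nothing is
mapped, and as 5 once 95% or more of the available pages are mapped, which
is the "80% back down to 5%" effect. A default-sized ratio of 10 is
unaffected until most of memory is mapped.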
So regardless, I think you may want to try the appended patch *first*.
If this patch makes a difference, please holler. I think it's the correct
thing to do, but I'm not going to actually commit it ...

I wondered about that part the other day when I went through the BDI
dirty code due to that iozone thing..
The initial commit states:
commit d90e4590519d196004efbb308d0d47596ee4befe
Author: akpm <akpm>
Date: Sun Oct 13 16:33:20 2002 +0000
[PATCH] reduce the dirty threshold when there's a lot of mapped
Dirty memory thresholds are currently set by /proc/sys/vm/dirty_ratio.
Background writeout levels are controlled by
/proc/sys/vm/dirty_background_ratio.
Problem is that these levels are hard to get right - they are too
static. If there is a lot of mapped memory around then the 40%
clamping level causes too much dirty data. We do lots of scanning in
page reclaim, and the VM generally starts getting into distress. Extra
swapping, extra page unmapping.
It would be much better to simply tell the caller of write(2) to slow
down - to write out their dirty data sooner, to make those written
pages trivially reclaimable. Penalise the offender, not the innocent
page allocators.
This patch changes the writer throttling code so that we clamp down
much harder on writers if there is a lot of mapped memory in the
machine. We only permit memory dirtiers to dirty up to 50% of unmapped
memory before forcing them to clean their own pagecache.
BKrev: 3da9a050Mz7H6VkAR9xo6ongavTMrw
But because dirty mapped pages are no longer special, I'd say the reason
for its existence is gone. So,
Acked-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
As for the highmem part, that was due to buffer cache, and unfortunately
that is still true. Although maybe we can do something smart with the
-
Something like this ought to do I guess. Although my
mapping_is_buffercache() is the ugliest thing. I'm sure that can be done
better.
Uncompiled, untested
Not-Signed-off-by: Peter Zijlstra <a.p.zijlstra@chello.nl>
---
mm/page-writeback.c | 28 ++++++++++++++++++++--------
1 file changed, 20 insertions(+), 8 deletions(-)
Index: linux-2.6/mm/page-writeback.c
===================================================================
--- linux-2.6.orig/mm/page-writeback.c
+++ linux-2.6/mm/page-writeback.c
@@ -280,27 +280,28 @@ static unsigned long highmem_dirtyable_m
#endif
}
-static unsigned long determine_dirtyable_memory(void)
+static unsigned long determine_dirtyable_memory(int highmem)
{
unsigned long x;
x = global_page_state(NR_FREE_PAGES)
+ global_page_state(NR_INACTIVE)
+ global_page_state(NR_ACTIVE);
- x -= highmem_dirtyable_memory(x);
+ if (!highmem)
+ x -= highmem_dirtyable_memory(x);
return x + 1; /* Ensure that we never return 0 */
}
static void
get_dirty_limits(long *pbackground, long *pdirty, long *pbdi_dirty,
- struct backing_dev_info *bdi)
+ struct backing_dev_info *bdi, int highmem)
{
int background_ratio; /* Percentages */
int dirty_ratio;
int unmapped_ratio;
long background;
long dirty;
- unsigned long available_memory = determine_dirtyable_memory();
+ unsigned long available_memory = determine_dirtyable_memory(highmem);
struct task_struct *tsk;
unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
@@ -346,6 +347,16 @@ get_dirty_limits(long *pbackground, long
}
}
+static inline int mapping_is_buffercache(struct address_space *mapping)
+{
+ struct super_block *sb = mapping->host->i_sb;
+
+ if (sb && sb->s_bdev && sb->s_bdev->bd_inode->i_mapping != mapping)
+ return 0;
+
+ return 1;
+}
+
/*
* balance_dirty_pages() must be called by processes which are generating dirty
* data. It looks at the number of dirty pages in the machine and will force
@@ -364,6 +375,7 @@ ...

No, this absolutely sucks.

Why?

It's totally unacceptable to have per-mapping notions of how much memory we
have. We used to do *exactly* that, and it's idiocy.

The reason it's unacceptable idiocy is that it means that two processes
that access different files will then have *TOTALLY*DIFFERENT* notions of
what the "dirty limit" is. And as a result, one process will happily write
lots and lots of dirty stuff and never throttle, and the other process will
have to throttle all the time - and clean up after the process that didn't!

See?

The fact is, because we count dirty pages as one resource, we must also
have *one* limit.

So this patch is a huge regression. You might not notice it, because if
everybody writes to the same kind of mapping, nobody will be hurt (they all
have effectively the same global limit anyway), but you *will* notice if
you ever have two different values of "highmem".

Unacceptable. We used to do exactly what your patch does, and it got fixed
once. We're not introducing that fundamentally broken concept again.

		Linus
-
Agreed, I was just about to send out an email saying that.. -
Say all buffer cache users were against default_backing_dev_info, and we'd give default_backing_dev_info less, that should work out, right? ( I'm not yet clear on if buffer cache already uses default_backing_dev_info or not, bdget() seems to suggest it does ) -
Examples of non-broken solutions:
(a) always use lowmem sizes (what we do now)
(b) always use total mem sizes (sane but potentially dangerous: but the
VM pressure should work! It has serious bounce-buffer issues, though,
which is why I think it's crazy even if it's otherwise consistent)
(c) make all dirty counting be *purely* per-bdi, so that everybody can
disagree on what the limits are, but at least they also then use
different counters
So it's just the "different writers look at the same dirty counts but then
interpret it to mean totally different things" that I think is so
fundamentally bogus. I'm not claiming that what we do now is the only way
to do things, I just don't think your approach is tenable.
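
The objection can be made concrete with a small sketch. This is a
user-space toy in C, not kernel code: struct dirty_state and
must_throttle() are hypothetical names, and the limits in the note that
follows are made-up figures standing in for a highmem-inclusive and a
lowmem-only threshold.

```c
#include <stdbool.h>

/* One shared count of dirty pages, visible to every writer. */
struct dirty_state {
	unsigned long nr_dirty;
};

/*
 * Hypothetical check: a writer has to throttle once the shared count
 * exceeds whatever limit that writer believes applies to it.
 */
static bool must_throttle(const struct dirty_state *s, unsigned long limit)
{
	return s->nr_dirty > limit;
}
```

With 500000 pages dirty in the shared counter, a writer using a limit of
1258291 pages is never throttled, while a writer using a limit of 183500
pages is throttled on every check, even if the first writer dirtied all of
those pages.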
Btw, I actually suspect that while (a) is what we do now, for the specific
case that Bron has, we could have a /proc/sys/vm option to just enable
(b). So we don't have to have just one consistent model, we can allow odd
users (and Bron sounds like one - sorry Bron ;) to just force other, odd,
but consistent models.
I'd also like to point out that while the "bounce buffer" issue is not so
much a HIGHMEM issue on its own (it's really about the device DMA limits,
which are _independent_ of HIGHMEM, of course), the reason HIGHMEM is
special is that without HIGHMEM the bounce buffers generally work
perfectly fine.
The problem with HIGHMEM is that it causes various metadata (dentries,
inodes, page struct tables etc) to eat up memory "prime real estate" under
the same kind of conditions that also dirty a lot of memory. So the reason
we disallow HIGHMEM from dirty limits is only *partly* the per-device or
mapping DMA limits, and to a large degree the fact that non-highmem memory
is special in general, and it is usually the non-highmem areas that are
constrained - and need to be protected.
Linus
-
Final note on this (promise):

I'd really be very interested to hear if the patch I *do* think makes sense
(ie the removal of the old "unmapped_ratio" logic) actually already solves
most of Bron's problems.

It may well be that that unmapped_ratio logic effectively undid the system
configuration changes that Bron has done. It doesn't matter if Bron has

	From our sysctl.conf:

	# This should help reduce flushing on Cache::FastMmap files
	vm.dirty_background_ratio = 50
	vm.dirty_expire_centisecs = 9000
	vm.dirty_ratio = 80
	vm.dirty_writeback_centisecs = 3000

if it turns out that the "unmapped_ratio" logic turns the 80% back down to
5%.

It may well be that 80% of the non-highmem memory is plenty good enough!
Sure, older kernels allowed even more of memory to be dirty (since they
didn't count dirty mappings at all), but we may have a case where the fact
that we discount the HIGHMEM stuff isn't the major problem in itself, and
that the dirty_ratio sysctl should be ok - but just gets screwed over by
that unmapped_ratio logic.

So Bron, if you can test that patch, I'd love to hear if it matters. It may
not make any difference (maybe you don't actually trigger the
unmapped_ratio logic at all), but I think it has the potential for being
totally broken for you.

People that don't change the dirty_ratio from the default values would
generally never care, because the default dirty-ratio is *already* so low
that even if the unmapped_ratio logic triggers, it won't much matter!

		Linus
-
I think that (c) is doable. If it's worth the effort, who knows, apparently
there still are people using 32bit kernels on boxen with

But this problem is already an issue, Anton recently had a case where a
12GB highmem box locked up due to NTFS running out of lowmem - or something
like that.

And I think that with the targeted slab reclaim (or slab defrag as it's
apparently still called) we can properly fix this side of the problem. I
think Rik was looking into doing so.
-
Yeah. I always considered HIGHMEM to just be unusable. It's ok for extending to 2-4GB (ie HIGHMEM4G, not 64G), and it's probably borderline usable for 4-8G if you are careful. But quite frankly, I refuse to even care about anything past that. If you have 12G (or heaven forbid, even more) in your machine, and you can't be bothered to just upgrade to a 64-bit CPU, then quite frankly, *I* personally can't be bothered to care. That's my personal opinion, and I realize that some of the commercial vendors may care about their insane customers' satisfaction, but I'm simply not interested in insane users. If they have that much RAM (and bought it a few years ago when a 64-bit CPU wasn't an option), they can't be poor. So the _only_ explanation today for 12GB on a 32-bit machine is (a) insanity or (b) being so lazy as to not bother to upgrade and in either case, my personal reaction is "I'm *not* crazy, and yes, I'm lazy too, and I can't give a rats *ss about those problems". HIGHMEM was a mistake in the first place. It's one that we can live with, but I refuse to support it more than it needs to be supported. And 12GB is *way* past the end of what is worth supporting. Linus -
How about... c) they bought it at the beginning of a project and are stuck with it because they aren't getting any more money for hardware d) they've shipped it to the field and have to support it We've got some 32-bit 8GB boxes for which both of these would hold true. Chris -
Still not enough of a reason for me to care. Remember - I'm the guy who refused to merge RH's 4G:4G patches because I thought they were an unsupportable nightmare. I care a lot about future supportability, and HIGHMEM is there purely as a temporary wart and blip on the screen. I did acknowledge that others may care more, but the fact is, I suspect that it's going to be cheaper to literally buy and ship a new machine to a customer than to really "support" it in any other form. Side note: HIGHMEM64G works perfectly fine with 12GB of RAM under *limited*loads*. If your customer does certain well-defined and simple things that don't put huge and varied loads on the VFS or VM layer, then 12GB+ is probably fine regardless. Linus -
Just around the corner...

$ ftp ftp
Connected to ftp.gwdg.de.
220-====================================================================
220-Gesellschaft fuer wissenschaftliche Datenverarbeitung mbH Goettingen
220-====================================================================
220-This is a Linux PC (Dell PE-2650, 2 CPUs P4/2800, 12 GB RAM)
220-running SuSE-Linux-8.2 with SuSE kernel 2.4.20-64GB-SMP.

There is no reason to upgrade the hardware - if it works, hey good then. And I am pretty sure that a few 2 GB sticks are cheaper than a big opteron (if you only go by that). It sure is now - and probably even back then.
-
On Thu, Nov 15, 2007 at 01:14:32PM -0800, Linus Torvalds wrote: Sorry about not replying to this earlier. I actually got a weekend away from the computer pretty much last weekend - took the kids swimming, helped a friend clear dead wood from around her house Hey, if Andrew Morton can tell us we find all the interesting bugs, you can call me odd. I've been called worse! We also run ReiserFS (3 of course, I tried 4 and it et my laptop disk) on all our production IMAP servers. Tried ext3 and the performance was so horrible that our users hated us (and I hated being woken in the night by things timing out and paging me). And I'm spending far too long still writing C thanks to Cyrus having enough bugs to keep me busy for the rest of my natural life if I don't break and go write my own I'm going to finish off writing a decent test case so I can reliably reproduce the problem first, and then go compile a small set of kernels with the various patches that have been thrown around here and see if they solve the problems for me. Thankfully I don't have the same problem you do Linus - I don't care if any particular patch isn't consistent - isn't fair in the general sense - even "doesn't work for anyone else". So long as it's stable and it works on this machine I'm happy to support it through the next couple of years until we either get a world facing 64 bit machine with the spare capacity to run DCC or we drop DCC. The only reason to upgrade the kernel there at all is keeping up-to-date with security patches, and the relative tradeoffs of backporting (or expecting Adrian Bunk to keep doing it for us) rather than maintaining a small patch to keep the behaviour of one thing we like. And to all of you in this thread (especially Linus and Peter) - thanks heaps for grabbing on to a throw away line in an unrelated discussion and putting the work in to: a) explain the problem and the cause to me before I put in heaps of work tracking it down; and b) putting together some ...
A 32 bit machine with HIGHMEM64 enabled running DCC has an MMAPed file
of approximately 2Gb size which contains a hash format that is written
"randomly" by the dbclean process. On 2.6.16 this process took a few
minutes. With lowmem only accounting of dirty ratios, this takes about
12 hours of 100% disk IO, all random writes.
This patch includes some code cleanup from Linus and a toggle in
/proc/sys/vm/dirty_highmem which can be set to 1 to add the highmem
back to the total available memory count.
Signed-off-by: Bron Gondwana <brong@fastmail.fm>
Index: linux-2.6.23.8-reiserfix-fai-vmdirty/mm/page-writeback.c
===================================================================
--- linux-2.6.23.8-reiserfix-fai-vmdirty.orig/mm/page-writeback.c 2007-11-22 01:48:20.000000000 +0000
+++ linux-2.6.23.8-reiserfix-fai-vmdirty/mm/page-writeback.c 2007-11-22 02:42:04.000000000 +0000
@@ -70,6 +70,12 @@ static inline long sync_writeback_pages(
int dirty_background_ratio = 5;
/*
+ * free highmem will not be subtracted from the total free memory
+ * for calculating free ratios if vm_dirty_highmem is true
+ */
+int vm_dirty_highmem;
+
+/*
* The generator of dirty data starts writeback at this percentage
*/
int vm_dirty_ratio = 10;
@@ -153,7 +159,8 @@ static unsigned long determine_dirtyable
x = global_page_state(NR_FREE_PAGES)
+ global_page_state(NR_INACTIVE)
+ global_page_state(NR_ACTIVE);
- x -= highmem_dirtyable_memory(x);
+ if (!vm_dirty_highmem)
+ x -= highmem_dirtyable_memory(x);
return x + 1; /* Ensure that we never return 0 */
}
@@ -163,20 +170,12 @@ get_dirty_limits(long *pbackground, long
{
int background_ratio; /* Percentages */
int dirty_ratio;
- int unmapped_ratio;
long background;
long dirty;
unsigned long available_memory = determine_dirtyable_memory();
struct task_struct *tsk;
- unmapped_ratio = 100 - ((global_page_state(NR_FILE_MAPPED) +
- global_page_state(NR_ANON_PAGES)) * 100) /
- available_memory;
-
...Just to verify - can you confirm that this "just fixes it" for you? I think this is the right approach to take, and seems very safe (ie people who know that their loads are ok can just set the flag), but I do want to verify that there was nothing else going on, and that you now see the same performance as you did in 2.6.16? The other alternative, of course, would be to simply allow the dirty percentages to be > 100%, but that's just *odd* ;) Linus -
Yes, toggling dirty_highmem "just fixes it" in all our tests. I hadn't tested it on the production machine yet - but I'm just installing it there now since it's been running fine on a less important machine for a few days now. I did wonder about allowing the dirty percentage to go way up, but that would have caused "this one goes up to 110%" comments in the sysctl limits code and people would have thought I was childish. Can't have that. Much better to have "int one = 1" instead. Bron. -
Actually, I'm confused now. Maybe I chose a bad name to begin with. Does it mean "I am allowed to dirty high memory" or "my high memory will be dirty if this is on"? Hmm... I'm even having trouble articulating what's odd about it. I guess my internal model was: "if this flag is set then you are allowed to make high memory dirty without needing to flush it immediately", which is why I made it that way around. No - you're wrong. My patch _did_ include high memory in the dirty This removes the high memory from the total count. I think I got it right. If dirty_highmem is set to true, then don't subtract highmem from the total memory count before calculating the percentages. That's what I meant, and that's what the toggle did. Removed the subtraction. -- Bron Gondwana brong@fastmail.fm -
But we're always allowed to dirty highmem - there'd be no point in having it otherwise. Hence the term dirty_highmem is confusing. umm, really you want /proc/sys/vm/dont-account-highmem-in-dirty-memory-calculations, only shorter. Do you agree? If so, then it's still not a very pleasing interface - setting something to "true" to disable a particular piece of kernel behaviour implies a single negation which we don't really need. It would be simpler to have /proc/sys/vm/do-account-highmem-in-dirty-memory-calculations, defaulting to "true" - this has no negations. So... how about /proc/sys/vm/, umm. <looks at inbox, brain explodes> OK, I give up. Please see if you can think of something less confusing which involves no negations? Thanks. -
I still read dirty_highmem as:
/proc/sys/vm/do-account-highmem-in-dirty-memory-calculations
Well, the particular piece of kernel behaviour is already a negative:
"decrease the amount of memory allowed to get dirty so we never dirty
more than a percentage of available lowmem"
So what this flag is saying is:
"DON'T decrease the amount of memory allowed to get dirty down to just
the lowmem - dirty a percentage of total available including highmem"
As Linus said - the alternative of allowing more than 100% of lowmem
to be dirty is just plain too weird, hence this approach of allowing
No, that's not true. The whole point is that between 2.6.16 and
2.6.20 the kernel behaviour changed. It currently doesn't count
highmem in dirty memory calculations, which is why the memory pressure
appears to be so great when actually there's still 4Gb of unused
memory in the box.
/proc/sys/vm/do-account-highmem-in-dirty-memory-calculations would
default to "false" to get the current behaviour post-2.6.16 kernels.
Setting a flag to make it true would stop the kernel subtracting
highmem from the available count, giving the "old" behaviour of
allowing a percentage of all the memory in the system to be dirty
I've spent a while thinking about this, and looking at the code.
I think this might be slightly clearer:
/proc/sys/vm/highmem_is_dirtyable - defaults to false
Here's how it would look in the code:
static unsigned long determine_dirtyable_memory(void)
{
unsigned long x;
x = global_page_state(NR_FREE_PAGES)
+ global_page_state(NR_INACTIVE)
+ global_page_state(NR_ACTIVE);
if (!vm_highmem_is_dirtyable)
x -= highmem_dirtyable_memory(x);
return x + 1; /* Ensure that we never return 0 */
}
I think that's very clear.
"Unless highmem is dirtyable, subtract the otherwise dirtyable
pages from the total dirtyable memory count if they are in
highmem"
Or without ...Well, it was quick enough to just do - here's the patch. I've also updated the documentation a bit to clarify the intention and the reasons why you might want to use it (based in part on the comments to the original change that made highmem uncountable for dirtyness purposes) Tested and applied against 2.6.23.9 (our build script makes Debian packages from a clean unpack of kernel major plus patch minor plus svn checkout of out quilt series and apply regardless, so it was just as easy to bump the version number while I was at it). Builds, boots, passes a quick run of the test program I used last time around. Bron. Add vm.highmem_is_dirtyable toggle A 32 bit machine with HIGHMEM64 enabled running DCC has an MMAPed file of approximately 2Gb size which contains a hash format that is written randomly by the dbclean process. On 2.6.16 this process took a few minutes. With lowmem only accounting of dirty ratios, this takes about 12 hours of 100% disk IO, all random writes. This patch includes some code cleanup from Linus and a toggle in /proc/sys/vm/highmem_is_dirtyable which can be set to 1 to add the highmem back to the total available memory count. Signed-off-by: Bron Gondwana <brong@fastmail.fm> Index: linux-2.6.23.8-reiserfix-fai-vmdirty/mm/page-writeback.c =================================================================== --- linux-2.6.23.8-reiserfix-fai-vmdirty.orig/mm/page-writeback.c 2007-11-21 21:58:20.000000000 -0500 +++ linux-2.6.23.8-reiserfix-fai-vmdirty/mm/page-writeback.c 2007-11-27 07:27:51.000000000 -0500 @@ -70,6 +70,12 @@ int dirty_background_ratio = 5; /* + * free highmem will not be subtracted from the total free memory + * for calculating free ratios if vm_highmem_is_dirtyable is true + */ +int vm_highmem_is_dirtyable; + +/* * The generator of dirty data starts writeback at this percentage */ int vm_dirty_ratio = 10; @@ -153,7 +159,10 @@ x = global_page_state(NR_FREE_PAGES) + global_page_state(NR_INACTIVE) + ...
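The effect of the vm.highmem_is_dirtyable toggle on the dirtyable memory count can be modelled in user space. The sketch below is only an illustration: the function name and the page counts in the note are invented, and capping the highmem figure at the running total is an assumption about what highmem_dirtyable_memory() does rather than something shown in the patch text here.

```c
#include <assert.h>

/*
 * User-space model of determine_dirtyable_memory() with the toggle.
 * All arguments are page counts. highmem_pages stands in for the
 * dirtyable pages that live in highmem (a hypothetical input here;
 * the kernel derives it from per-zone counters).
 */
unsigned long model_dirtyable_memory(unsigned long free_pages,
                                     unsigned long inactive_pages,
                                     unsigned long active_pages,
                                     unsigned long highmem_pages,
                                     int highmem_is_dirtyable)
{
    unsigned long x = free_pages + inactive_pages + active_pages;

    /* Assumed: never subtract more than the total we started with. */
    if (highmem_pages > x)
        highmem_pages = x;

    /* Default (toggle off): only lowmem counts towards the limits. */
    if (!highmem_is_dirtyable)
        x -= highmem_pages;

    return x + 1; /* Ensure that we never return 0, as in the patch */
}
```

For a made-up box with 100000 free, 400000 inactive and 500000 active pages, 800000 of them in highmem, the model returns 200001 with the toggle off and 1000001 with it on. The dirty_ratio percentage is then applied to whichever figure is in effect, which is why the same vm.dirty_ratio allows far fewer dirty pages on a highmem machine when the toggle is off.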
mmap: mmap call failed: errno: 12 errmsg: Cannot allocate memory

Yep, that's "fixed" the problem alright! No way this puppy is dirtying 2Gb of memory any more.

http://linux.brong.fastmail.fm/2007-11-22/bmtest.pl

That said, pushing the size down to 1700 rather than 2000 in that file makes it run, and the behaviour matches the 2000 Mb case on 2.6.16.55 rather than 2.6.20.20 or 2.6.23.1 (my other test case kernels that happened to be pre-built on that machine)

[root@lb1 ~]$ free
total used free shared buffers cached
Mem: 4149836 2073056 2076780 0 22036 1846096
-/+ buffers/cache: 204924 3944912
Swap: 2096472 0 2096472

That's after running the 1700Mb version. You can see this machine is our one remaining 4Gb machine (it's not running any production services unlike the 6Gb machine, so it's better for testing)

Anyway - looks like this may be a "good enough" solution for out1 if it can manage an ~2Gb file with 6Gb of memory available. I'll test that later today - but I should drag myself into the office now...

Bron.
-
Alternatively perhaps I'm just a moron who used a config file with:
CONFIG_PAGE_OFFSET=0x80000000 set to build the new kernel (I hadn't
committed it because it turned out not to solve the issue it was
there for). That would explain a few things.
[root@lb1 perl]$ free
total used free shared buffers cached
Mem: 4150620 2272284 1878336 0 11212 2066536
-/+ buffers/cache: 194536 3956084
Swap: 2096472 0 2096472
That's more the usage I would expect to see.
Now for the downside. It works again, but it still runs slow. Seems to
hit (and this is totally unscientific, I'm just watching the numbers
scroll by) at about 120000 writes rather than 70000 writes, but that's
still not fitting the whole file dirty.
I notice that PF_LESS_THROTTLE gets set by nfsd to get an extra 25%
bonus free space allocated. Potentially dcc could use similar tricks
to claim extra space if that knob is available up in userspace. I'm
happy to patch dcc as well if I have to, I'm already backporting it,
so adding another little quilt directory and applying it is pretty
trivial (must try guilt/stgit one of these days)
Bron.
-
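The "extra 25%" for PF_LESS_THROTTLE tasks mentioned above is simple arithmetic on the thresholds. A minimal sketch, assuming the bonus is applied as limit + limit / 4 once the limits have been computed; the function name is hypothetical and the flag is passed as a plain int here because PF_LESS_THROTTLE is a per-task kernel flag, not something a user-space program can set on itself.

```c
#include <assert.h>
#include <limits.h>

/*
 * Model of the bonus given to less-throttled tasks (nfsd is the
 * example named in this thread). limit_pages is a dirty or background
 * threshold in pages; less_throttle is nonzero if the task is assumed
 * to carry the flag.
 */
unsigned long model_less_throttle_limit(unsigned long limit_pages,
                                        int less_throttle)
{
    /* Guard against wrap-around in this user-space model. */
    assert(limit_pages <= ULONG_MAX - limit_pages / 4);

    if (less_throttle)
        limit_pages += limit_pages / 4;
    return limit_pages;
}
```

For example, a threshold of 80000 pages becomes 100000 pages for a flagged task and stays at 80000 otherwise.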
Strongly agree. This is exactly what happened to that ARM NO_HZ bug report. The report in bugzilla was rather lacking (and wrong) in ways that have already been described. HPET on ARM? 8) Then on the morning of 6th November, someone reported on the mailing list that "pxa270 doesn't work with oneshot timer" and that was the trigger to getting the bug resolved - because it was a narrowly defined bug report. Since it was a narrowly defined bug report, it became very easy to investigate and resolve. About half an hour of time for an initial patch. There's another issue I want to raise concerning bugzilla. We have the classic case of "not enough people reading bugzilla bugs" - which is one of the biggest problems with bugzilla. Virtually no one in the ARM community looks for ARM bugs in bugzilla. Let's not forget that it would be a waste of time for people to manually check bugzilla for ARM bugs. There are so few people reporting ARM bugs into bugzilla that a weekly manual check by every maintainer would just return the same old boring results for months and months at a time. It would be far more productive if the ARM category was deleted from bugzilla and the few people who use bugzilla reported their bugs on the mailing list. We've a couple of thousand people on the ARM kernel mailing list at the moment - that's 3 orders of magnitude more eyes than look at bugzilla. (I'm not saying that the ARM NO_HZ bug as reported in bugzilla would've been solved earlier had it been reported on the correct mailing list; I doubt there'd be much difference. However, the probability of a question being asked of the reporter would've been much higher, and _that_ might have led to an earlier resolution.) -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
I screen all bugzilla reports. 100% of them.

- I'll try to establish whether it is a regression
- I'll solicit any extra information which I believe the developer will need
- I'll ensure that an appropriate developer has seen the report

Is that linux-arm-kernel@lists.arm.linux.org.uk? If so, MAINTAINERS claims that it is subscribers-only. That would cause some bug reporters to give up and go away.
-
that just because you do this everyone in a select clique, who you include me in, should be doing this as well. Find some other mailing list; I'm not hosting *nor* am I willing to run a non-subscribers only mailing list. Period. Not negotiable, so don't even try to change my mind. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
No, I don't mean that at all and this was very plainly obvious from my very clearly written email. Let me try again. No, no subsystem developer needs to monitor new bugzilla reports. This is because *I do it for them*. I will actively make them aware of new reports which I believe are legitimate and which contain sufficient information for Making a list subscribers-only will cause some bug reports to be lost. Tradeoffs are involved, against which decisions must be made. You have made yours. -
From: Andrew Morton <akpm@linux-foundation.org> Russell doesn't have to worry any more, he doesn't have to host it, and he doesn't have to be willing to run a non-subscribers-only mailing list. Because I am. I've created linux-arm@vger.kernel.org Enjoy. -
Let me just say - I'm astonished at how little spam gets through the vger lists. Considering how many times those email addresses must have been added to spam databases. It must be a lot of work, and whoever is doing it does it well. I don't even know. Is it Matti? You? <contemplates linux-kernel@lists.sourceforge.net. Shudders.> -
From: Andrew Morton <akpm@linux-foundation.org> Matti gets all the credit for setting up the bayesian et al. Yes, sourceforge is a complete joke. -
Martin's changed the owner for ARM bugs last night to the mailing list so the whole issue is now redundant. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
By doing so you've just said (implicitly) that you can not tolerate someone having a different opinion from your own. While I accept *your* right to run *your* lists how you please, you are unable to accept *my* right to run *my* lists how I see fit. Time will tell which lists will survive. Whatever, I suspect that by doing what you've just done, you're going to create more confusion and problems. Instead of having one focused place for discussions and bug reports, they're going to be spread more thinly, meaning less people looking at such things, meaning more bugs get ignored. Thus making the issue worse. So, when are you creating a replacement alsa-devel mailing list on vger? That's also subscribers-only. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
From: Russell King <rmk+lkml@arm.linux.org.uk> I created a mailing list on a machine where I provide such services. I didn't tell you to take your list down or to run it in some other way. I didn't tell you to unsubscribe everyone and move them over to the new list either. I've provided an alternative, and people can pick and choose how they see fit. I'm letting natural selection run its course. Are you able to cope with the fact that people might not want to use your list any longer? Perhaps that is what bugs you so much about my The operative term is "alternative" rather than "replacement". Perhaps this misunderstanding is what you're so upset about. And yes, that alsa list bugs the crap out of me too. I'm more than happy to provide an alternative for that one as well. In fact, *poof*, there it is, linux-alsa@vger.kernel.org is there and available for anyone who wants to use it. Have a nice day Russell. -
On 14-11-07 11:07, David Miller wrote: alsa-devel@alsa-project.org is not subscriber-only. Same as that arm list, it's _moderated_ for non-subscribers and given that I and other moderators have been doing our best to moderate quickly (I tend to stay logged in to the moderation interface all day for example) what specifically bugged the crap out of you? It's not something a poster needs to concern himself with. Also for alsa-devel the moderators tend to add any valid non-subscribers to a whitelist after landing in the queue the first time meaning even a delay is just a one-time thing normally. So what's the trouble? Basically, no one Not that I think that moving alsa-devel over to vger wouldn't be a good idea mind you; when the list moved from sourceforge, asking you to host it was my preferred option. I do somewhat suspect that Jaroslav would like to keep the alsa-devel@ name (and I'd like to ask you to then also host alsa-user@) and would then rewrite mail to those lists @alsa-project.org to vger. But what is the problem you speak of with the alsa-devel list? While I would not mind losing it, moderation hasn't been overly laborious and I'm not aware of any serious problems. Rene. -
From: Rene Herman <rene.herman@keyaccess.nl> The fact that it farts at me every time I post to this thread. That's That sucks for new people taking part in the conversation. There is no reason for moderation at all, it isn't necessary for spam prevention and it does nothing but annoy new posters and make work for the moderator. -
From: David Miller <davem@davemloft.net> See? I got another one and I have received at least 10 of the following over the past 2 days. That's ridiculous. And because a human adds the whitelist this is always going to happen to someone when they start posting to the alsa list for the first time. /me gets ready for the 11th copy in response to this one...
--------------------
Subject: Your message to Alsa-devel awaits moderator approval
From: alsa-devel-bounces@alsa-project.org
To: davem@davemloft.net
Date: Wed, 14 Nov 2007 12:57:06 +0100
Sender: alsa-devel-bounces@alsa-project.org

Your mail to 'Alsa-devel' with the subject

Re: [alsa-devel] [BUG] New Kernel Bugs

Is being held until the list moderator can review it for approval.

The reason it is being held:

Too many recipients to the message

Either the message will get posted to the list, or you will receive notification of the moderator's decision. If you would like to cancel this posting, please visit the following URL:

http://mailman.alsa-project.org/mailman/confirm/alsa-devel/12dd3bd077bbf9cd142f214beae...
Nah, in this case you are not even getting them due to being a non-subscriber but due to too many CCs. I got one as well. That just needs to be disabled; it does not have anything to do with non-subscribers (and you're in the white list) but is just a retarded bit of list configuration... (no, I can't personally change it, needs Jaroslav Kysela) Rene. -
At Wed, 14 Nov 2007 04:01:31 -0800 (PST), ... if you give too many recipients in your post. That is often a really annoying thing to me, together with keeping the unrelated subject line ;) I personally don't care whether it's a moderated or open list. We chose it simply due to too bad S/N ratio at that time. So, if the current list annoys you or many others and the list management on vger is so good, it'd be basically a good move, of course. I'll appreciate it. The only confusion would be the change of ML address, but we can do it slowly, too. Takashi -
I'd love the lists at vger. Amazing spam-filtering. I'd like to request the name alsa-devel@vger.kernel.org (and alsa-user@vger.kernel.org if at all possible so we can open that one up as well) though. There wouldn't need to be a forced ML address change if Jaroslav would then just rewrite alsa-{devel,user}@alsa-project.org to vger.kernel.org same as he did for alsa-devel and does for alsa-user to @lists.sf.net. Rene. -
At Wed, 14 Nov 2007 13:21:30 +0100, I think alsa-user can stay as is. It's no place for dragging many other addresses like alsa-devel. If it works, then I'm for it, too. thanks, Takashi -
From: Takashi Iwai <tiwai@suse.de> That's fine with me, I've changed it alsa-devel@vger.kernel.org -
It certainly is. I only experienced that now due to the "too many recipients to message" moderation notice that I got from my own message. Jaroslav -- please disable that junk or if possible, make it a "at most once Yes there is. It's necessary for lists that do not have the human and other resources behind them that vger does. alsa-devel was drowning in spam and dying as a result back when it was at sourceforge. Upon moving, my preference was to ask the lists to be hosted at vger but given that (it seems) Jaroslav wanted to keep them locally, moderation was very necessary. I moderate out quite a bit of spam every day. vger is doing an amazing job at spam filtering -- if it's an option to move to vger, then sure, no need. But otherwise, the "no need" needs a list admin with enough bandwidth and skill. As to the "new people": it's not optimal, but (up to this thread I'll admit -- I woke up to a huge number of posts in the queue) it's not been a _real_ problem. alsa-devel is not high-volume enough for it to be. Rene. -
Totally unrelated - I sent something to the kolab mailing list a couple of days ago (it's moderated for non subscribers) informing them that I had found the cause of some Cyrus bugs that they had problems with in the past and providing a link to my post to the cyrus list with the patches attached. It sat in the moderation queue and then was rejected with "non subscriber post to subscription only list". Not only was the response a day later when I had moved on to other things, but it got me really pissed off that I had put some effort into providing a good quality post that outlined the specific issues and how they applied to their project, and had been summarily dismissed, probably without the effort being put in. There's no way for a non-subscriber to know in advance if the list they are trying to post to will do that to them, completely negating the effort put in to writing something worthwhile to inform that community. It's insular, and it sucks. So yeah, my attitude now is that the Kolab folks can go screw themselves and track down the fix on their own or wait until I've convinced upstream to accept the fixes (likely) and they have moved to the new version (unlikely for a long time, and meanwhile they're missing out on the performance increases that having a more stable skiplist library would give them) I'm sure if I had something that I considered worth informing the ALSA project of, I'd be wary of spending the same effort writing a good post knowing it may be dropped by a list moderator just selecting all and bouncing them. Bron. -
Totally unrelated indeed so why are you spouting crap? If the Kolab list has a problem take it up with them but keep ALSA out of it. alsa-devel has only ever moderated out spam -- nothing else. Rene. -
As an outsider to the list, how do I know what your policy will be other than "I've been rejected out of hand by someone else's list, so my experience is that member only lists aren't willing to listen to something I have to say unless I make the effort to sign up and have yet another folder accumulating unread messages". I don't. Well, ok - maybe I do here since I've let myself be dragged in to the debate. Oops. I get the same information from both project websites: "moderated for non-members, public archives" - no way of knowing that ALSA will accept me informing them of something they would be interested without committing to reading or bit-bucketing their list. The alternative is to subscribe just long enough to send something and then unsubscribe again.... or cold-email a member and ask them to pass a message along. Or post and hope it doesn't get rejected, not even knowing for a day or so. Bron. -
Can you please just shelve this crap? You have a way of knowing that "ALSA will accept you" and that is knowing or assuming that the ALSA project doesn't consist of drooling retards. When a project list goes to the difficulty of moderating non-subscribers it has made the explicit choice to _not_ become subscriber only. Then refusing valid non-subscribers after all makes no sense whatsoever. I'm sorry you got your feelings hurt by that other list but it was no doubt an accident; take it up with them. Rene. -
Well, my experience with moderation has been that moderated mails are stuck in some queue for weeks. Two separate lists, neither of them was alsa. If alsa is doing a better job, great. But it still has to live Been there, done that. In spite of people not being drooling retards, the amount of time and effort they invest into either moderation or improving the ruleset is quite limited. Problems persist. And even without mails being held hostage for weeks, every single moderation mail is annoying. Like the one I'm sure to receive after sending this out. Jörn -- Joern's library part 5: http://www.faqs.org/faqs/compression-faq/part2/section-9.html -
Certainly. Up to this thread I wasn't actually aware the list was doing that. While it might be informative once, getting it each time quickly gets old. Don't know if mailman can do anything like it but I'd suggest anyone running a non-subscriber-moderation list configure it to send such messages at most once a <time-period> per address or some such. And just disable the message if it cannot do that. Fortunately, alsa-devel is (almost) no longer such a list anyway as it's moving to vger. Hurrah. David -- thanks. Rene. -
That is incorrect. Hopefully it is the case now though, since my experience of the subject was years ago. OG. -
At Thu, 15 Nov 2007 14:17:27 +0100, Yeah, it was really years ago that we once switched to the open list. Funny that people never forget such a thing :) Takashi -
Hi Dave, * David Miller <davem@davemloft.net> [071114 02:09]: Can you please use your *poof* trick one more time to set up linux-omap@vger.kernel.org? We (as in the linux-omap community) would like to move from the subscriber-only list at linux-omap-open-source@linux.omap.com to vger as we're starting to get more patches and comments on LKML. For related discussion on linux-omap-open-source@linux.omap.com, see [1]. Regards, Tony [1] http://linux.omap.com/pipermail/linux-omap-open-source/2007-November/011980.html -
Thanks! Tony -
If you screen all bugzilla reports then you'll know that bug #9356 arrived at about 1400 GMT yesterday. It's hardly surprising then that your utterly crappy responses to Natalie's message (which, incidentally, wasn't copied On the whole you do an excellent job with feeding the bug reports to people, and while I recognise that you're only human, things do occasionally go wrong. For instance, sending clearly marked Samsung S3C bugs to me rather than Ben Dooks (who's in MAINTAINERS for those So how are they lost when they're held in a moderation queue and are either accepted, a useful response given to the original poster, or are forwarded to someone who can deal with the issue. I don't think "subscribers only" describes my lists - we don't devnull stuff just because the poster is not a subscriber. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
Well whatever, sorry. But this is in the noise floor. Point is: many bug Oh, OK, as long as there really is a human paying attention to those things then that's fine. When one is on the sending end of these things one never knows how long it will take, nor whether it will even happen. -
The postmasters at vger are pretty good at running mailing lists. For linux-kbuild my effort so far has been to request it. That's not a big deal. So if they accept it you could have linux-arm@vger.kernel.org for zero overhead for you. Sam -
From: Sam Ravnborg <sam@ravnborg.org> I already did, get a little deeper in your mailbox before replying :-) -
What about having all ARM bugs in Bugzilla by default assigned to
cu
Adrian
[1] Either directly or through a pseudo address, but that's just a
technical detail.
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
That would also work, probably much better than setting up yet another list. My experience of trying to get mbligh to do this when I stopped looking after PCMCIA stuff was *extremely* painful. Wonder if it's become any easier of late? -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
cpufreq (at least) does it this way. I don't know how well it is turning out in practice. It's useful if the initial report makes it clear (ie; to me) that the report has already gone to a mailing list so I don't go and forward a duplicate. He's a bad, bad man ;) But he's been turning these things around pretty rapidly lately. -
yes, yes, yes, and i agree with you that there is a problem. I tried to make this point at the 2007 KS: not only is degradation in quality not apparent for years, slow degradation in quality can give kernel developers the exact _opposite_ perception! (Fewer testers means fewer bugreports and that results in apparent "improved" quality and fewer reported regressions - while exactly the opposite is happening and testers are leaving us without giving us any indication that this is happening. We just don't notice.) I'm not moaning about bugs that slip through - those are unavoidable facts of a high flux codebase. I'm moaning about recurring, avoidable bugs, i'm moaning about hostility towards testers, i'm moaning about hostility towards automated testing, i'm moaning about unnecessary hoops a willing (but unskilled) tester has to go through to help us out. I tried to make the point that the only good approach is to remove our current subjective bias from quality metrics and to at least realize what a cavalier attitude we still have to QA. The moment we are able to _measure_ how bad we are, kernel developers will adapt in a second and will improve those metrics. Let's use more debug tools, both static and dynamic ones. Let's measure the tester base, and we need to measure _lost_ early adopters and the reasons why they are lost. Regression metrics are a very important first step too and i'm very happy about the increasing effort that is being spent on this. This is all QA-101 that _cannot be argued against on a rational basis_, it's just that these sorts of things have been largely ignored for years, in favor of the all-too-easy "open source means many eyeballs and that is our QA" answer, which is a _good_ answer but by far not the most intelligent answer! Today "many eyeballs" is simply not good enough and nature (and other OS projects) will route us around if we don't change. We kernel developers have been spoiled by years of abundance in testing ...
but here I disagree. LKML is already too busy and noisy. Major subsystems need their own discussion areas. --- ~Randy -
That's a stupid argument. We lose much more by forced isolation of discussion than what we win by having less traffic! It's _MUCH_ easier to narrow down information (by filter by threads, by topics, by people, etc.) than it is to gobble information together from various fractured sources. We learned it _again and again_ that isolation of kernel discussions causes bad things. In fact this thread is the very example: David points out that on netdev some of those bugs were already discussed and resolved. Had it been all on lkml we'd all be aware of it. this is a single kernel project that is released together as one codebase, so a central place of discussion is obvious and common-sense. so please stop this "too busy and too noisy" nonsense already. It was nonsense 10 years ago and it's nonsense today. In 10 years the kernel grew from a 1 million lines codebase to an 8 million lines codebase, so what? Deal with it and be intelligent about filtering your information influx instead of imposing a hard pre-filtering criteria that restricts intelligent processing of information. Ingo -
From: Ingo Molnar <mingo@elte.hu> That's a ridiculous argument. One other reason these bugs are resolved is that the networking developers only need to subscribe to netdev and not have to listen to all the noise on lkml. People who want to manage bugs know what list to look on and contact about problems. Dumping even more crap on lkml is not the answer. -
I agree totally with David, and this goes for SCSI too. If it's not reported on linux-scsi, there's a significant chance of us missing the bug report. The fact that some people notice bugs go past on LKML and forward them to linux-scsi is a happy accident and not necessarily something to rely on. LKML has 10-20x the traffic of linux-scsi and a much lower signal-to-noise ratio. Having a specialist list where all the experts in the field hang out actually enhances our ability to fix bugs. James -
you are actually proving my point. People have to scan lkml for SCSI regressions _anyway_, because otherwise _you_ would miss them. In the case a user is fortunate enough to realize that a regression is SCSI related, and he is lucky enough to pre-select the SCSI mailing list in the first go, he might get a fix from you. That already reduces the number of useful bugreports by about an order of magnitude. Ingo -
what noise? If someone really wants networking discussions only, use
this procmail rule:
:0 HBc
* .*net: *
sched-patches
to separate it into an extra folder and use "net: " as an agreed upon
Subject line if you really want to narrow things down. (But there would
still be all the other mail just in case the developer has to look at
the wider picture. There would be no "I'm only subscribed to netdev"
excuse. )
but there should still be one central repository for all kernel
discussions - just like there is one central repository for all kernel
i think that's the problem. Developers (and here i dont mean you) who
want to do "development only", without being exposed to the global state
of the kernel and without being exposed to bugs. I think that's the
basic mindset difference. That is one of the factors that is causing
asymmetric allocation of developers and the increasing detachment from
that "crap" that i'd like to see dumped upon lkml would be netdev
traffic mainly - most of the other kernel development lists (and i'm
subscribed to many of them) are low-traffic. netdev is the main reason
why we cannot do a "one common discussion forum" approach.
Ingo
-
hmm, how much work would it be to tweak the mail software on vger to have a linux-all@vger.kernel.org that got a copy of any linux-* list hosted by vger? This would solve half the problem (people on linux-kernel not seeing discussions on the other lists). David Lang -
So you have a preferred method of handling email. Please don't force it on the rest of us. I'll plan to use lkml-list-only when you have convinced DaveM to drop all of the other mailing lists at vger.kernel.org. Yeah, sure. --- ~Randy -
I'd be curious for any pointers on tools, actually. I "read" (ok, skim) lkml but still overlook relevant bug reports occasionally. (Fortunately, between Trond and Andrew and others forwarding things it's not actually a problem, but I'm still curious). --b. -
Ingo Molnar wrote: .. QA-101 and "many eyeballs" are not at all in opposition. The latter is how we find out about bugs on uncommon hardware, and the former is what we need to track them and overall quality. A HUGE problem I have with current "efforts", is that once someone reports a bug, the onus seems to be 99% on the *reporter* to find the exact line of code or commit. Ghad what a repressive method. And if the "developer" who broke the damn thing, or who at least "claims" to be supporting that code, cannot "reproduce" the bug, they drop it completely. Contrast that flawed approach with how Linus does things.. he thinks through the symptoms, matches them to the code, and figures out what the few possibilities might be, and feeds back some trial balloon patches for the bug reporter to try. MUCH better. Linus also asks for a git bisect, but doesn't insist upon the reporter learning an entire new (poorly documented) toolset just to report a bug. Blah! And remember, *I'm* an old-time Linux kernel developer.. just think about the people reporting bugs who haven't been around here since 1992.. -ml -
yes, absolutely so - that's why i used the "good" qualifier. "Good is not good enough" calls for additional efforts to make it more efficient, not for the abolition of the many eyeballs concept (which would be absurd). So what i wanted to say is that _sole_ reliance on the large numbers of eyeballs is a fundamental mistake. It's even sometimes used as an excuse to merge questionable stuff. "we'll find any bugs, many eyeballs will make bugs shallow". In reality the many eyeballs are not infinite, nor should they be taken for granted if they are used for bogus things. We have to make sure the eyeballs stay 'many', and we also have to make sure they are not wasted. It's a physical resource that must be intelligently handled. Its positive effects can be easily wasted and we do that today. for example git-bisect was a godsend. I remember that years ago bisection of a bug was a very laborious task so that it was only used as a final, last-ditch approach for really nasty bugs. Today we can autonomously bisect build bugs via a simple shell command around "git-bisect run", without any human interaction! This freed up testing resources enormously and made bisection one of the _first_ things that are tried when bugs are met. We just need more of this (distros should offer pre-built kernel rpm 'farms' for every important commit point and automated tools for users to easily specify breakage points, without them having to install those kernels individually), and everyone should be aware of the fact that we still suck (we merge too much crap and still don't have good enough tools to de-crappify what we merge) and that we are losing testers. Ingo -
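The "git-bisect run" automation mentioned above can be sketched on a throwaway repository. This is an illustration only, not anyone's actual workflow from this thread: the scratch repository, the ten commits and the BROKEN marker file are all made up for the example, and a real run would use a build or boot test as the check command. It uses the current "git bisect" spelling of the command.

```shell
#!/bin/sh
# Illustrative only: bisect a made-up regression in a scratch repository.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "tester@example.com"
git config user.name "Tester"

# Create ten commits; commit 7 introduces the hypothetical breakage
# (represented here by a marker file named BROKEN).
i=1
while [ "$i" -le 10 ]; do
    echo "$i" > version.txt
    if [ "$i" -eq 7 ]; then
        touch BROKEN
    fi
    git add -A
    git commit -q -m "commit $i"
    i=$((i + 1))
done

# Mark the tip as bad and the very first commit as good.
good=$(git rev-list --max-parents=0 HEAD)
git bisect start HEAD "$good" >/dev/null

# The check command exits 0 for a good tree and 1 for a bad one;
# git then drives the whole search without human interaction.
git bisect run sh -c 'test ! -e BROKEN' >/dev/null

# refs/bisect/bad now names the first bad commit.
first_bad_subject=$(git log -1 --format=%s refs/bisect/bad)
echo "first bad commit: $first_bad_subject"
git bisect reset >/dev/null 2>&1
```

For a build regression the check command would typically be something like "make", since "git bisect run" treats exit status 0 as good and most non-zero statuses as bad.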
.. It's only a godsend for the few people who happen to be kernel developers and who happen to already use git. It's a 540MByte download over a slow link for everyone else. -ml -
Oh, come on. Leeching CDs is so yesterday. These days some distributions don't even offer CDs anymore in favour of DVDs. I'd be amazed if a lot of the testers would still be on slownet; it's impossible to keep up with the latest distros without broadband. -
It's also a godsend for users who want a regression they observe fixed. If you can tell which patch broke it you have often turned a very hard to debug problem into a relatively easily fixable problem. As an example, [1] was an issue a normal user could discover, and bisecting made the difference between "nearly undebuggable" and As already said in thread, the required instructions for bisecting are relatively short and simple (assuming the user can build his own Not everyone has a slow connection. For me, the speed of cloning a tree from git.kernel.org is completely cpu bound and limited by the speed of the 1.8 GHz Athlon in my computer... But if there is a real life problem like people with extremely slow and expensive internet connections not being able to bisect bugs these cu Adrian [1] http://lkml.org/lkml/2007/11/12/154 -
.. Oh yes, definitely. When that user happens to be a kernel dev + git user, it saves the *fool who broke it* a hell of a lot of time, because they can slough it off onto the poor bloke who notices it. Mind you, no arguing that this is effective when that poor bloke has a day free to download the git-tree and build/reboot a dozen times. -
From: Mark Lord <liml@rtr.ca> Like the internet, this time spent is beneficial because it's pushing the work out to the end nodes. In fact git bisect is an awesome example of the end node principle in action for software development and QA. For the end-user wanting their bug fixed and the developer it's a win win situation because the reporter is actually able to do something proactive which will help get the bug they want fixed faster. So I don't agree with framing this person as a "poor bloke". Our testers are more empowered than ever to lead the process towards a fix. -
Please stop cross-posting this thread at least to linux-pcmcia until your post is relevant to PCMCIA. Sorry for being a bore. (Not that I don't love reading LKML discussions, but I found that it took too much time, and now they're over at linux-pcmcia too! :) Thank you in advance. //Peter -
"fool who broke it" are hard works. Bugs are part of software
development, so you'd have to name everyone who develops software
a fool.
But the main point is that often you don't know who broke it until you
I did bisecting myself, and I know that it costs time and work.
But the first point is the above one that it makes otherwise nearly
undebuggable problems debuggable and fixable.
Another point is that it shifts the work from the few experienced
developers to the many users. Users (and voluntary testers) we have
many, but developer time for debugging bug reports is a quite scarce
resource.
And why "poor bloke"? Bisecting takes time, but that's not different
from e.g. writing code or cleaning up code or going through bug reports.
cu
Adrian
-
Adrian Bunk wrote: .. Definitely useful, no question. But the problem is now that kernel devs are addicted to it, many won't even consider resolving a problem any other way. That's not "maintaining" (or supporting) one's code. And when a "maintainer" is too busy to find/fix their own bugs, that could be a sign that they've bitten off too big of a chunk of the kernel, and it's time for them to distribute code maintainership. Cheers -
What you replaced with two dots contained the answer to this:
Another point is that it shifts the work from the few experienced
developers to the many users. Users (and voluntary testers) we have
many, but developer time for debugging bug reports is a quite scarce
The problem is: Maintainers don't grow on trees.
You need people who are both technically capable and willing to spend
time on the non-sexy task of debugging problems.
Where do you plan to find them?
If you don't believe me, please find a maintainer for the currently
unmaintained parallel port support.
cu
Adrian
-
There is this silly limit that no one can work more than 168 hours per
week on the Linux kernel, and some kernel developers seem to take the
liberty of spending even less time on kernel development...
Considering our problems to cope with the amount of incoming bug
reports, everything that would require a kernel developer to spend more
time for getting a bug fixed would be a horrible mistake.
cu
Adrian
-
That limit of 168 hours applies all around the world to everyone. Moreover, not all kernel developers are employed to hack on the kernel for 168 hours a week. For me, personally, that figure is in reality about 24 hours a week. Yes, just 24. The rest of the time (like *now*) is time I'm volunteering because I happen to be reading my email... ... and happen to be wasting replying to discussions like this rather than reading that message which has just arrived on the ARM kernel mailing list from someone having problems using copy_from_user() with a kernel pointer. So, please, stop this idea that kernel developers can somehow spend infinite amounts of time solving lots and lots of bugs. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
Sorry, that happens when using irony in a non-native language...
What I wanted to express:
cu
Adrian
-
.. Hey, if somebody has time to break things, then they damn well ought to be able to make time to fix them again. And the best developers here on LKML do just that (fix what they break). You broke it, you fix it. A simple rule. Translation for the particularly daft: If you've been making significant updates to a driver/subsystem, and people are reporting that it is now broken for them, then it's your job to make it right. The reporters can help, and many may even git-bisect or send patches. But you cannot *expect* or *insist* upon them doing your job. -
What are "significant updates"?
Sometimes one person makes one small patch and this patch contains
We have some open drivers/ata/ regressions.
I see some person named "Mark Lord" being responsible for 4 commits.
What punishment do you plan for him if 2.6.24 ships with any libata
regressions?
Let George W. Bush wrongly accuse him of possessing weapons of
Bullshit.
Bug fixing is not about finding someone to blame, it's about getting the
bug fixed.
The bug reporter is the person who can reproduce the problem, and if
it's a regression then bisecting is the natural way of getting nearer
to getting it fixed.
cu
Adrian
-
On Tue, 13 Nov 2007 19:52:17 -0500 Why does the kernel have very few useful tests? Lack of interest? resources? expertise? Ideally each new feature would just be a small add on to an existing test. Unlike developing new features, which seems to grow well with more developers, bug fixing also seems to be a scarcity process. There often seem to be very few people who understand the problem well enough or have the necessary hardware to reproduce and fix the problem. Recent changes like tickless and scheduler rework were well thought out and caused very little impact to 90% of the users. The problem is the 10% who do have problems. Worse, the developers often only hear about a small sample of those. -- Stephen Hemminger <shemminger@linux-foundation.org> -
Tests would of course be nice, but they aren't very useful(!) Looking at this list which Natalie has generated I see around thirty which are dependent on the right hardware and ten which are not. This ratio is typical, I think. In fact I'd say that more than 75% of reported bugs are dependent on hardware. So the best test of all for the kernel is "run it on a different machine". This is why we are sooooo dependent upon our volunteer testers/reporters to Sure. For system-call-visible features it would be good to do that. But this tends not to be where bugs get exposed. Because the original developer can 100% exercise such code. That isn't the case with We're 100% dead if "having the hardware" is a prerequisite to fixing a bug. The terminal state there is that the kernel runs on about 200 machines worldwide. We have to work with reporters via email to fix these sorts of Yes. An unknown number of people just shrug and go back to an old kernel. -
.. Then that person should double check their changes against the problems reported, and re-convince themselves that the .. Yup, but they're more specific than just that entire subsystem, and the maintainers are actively pursuing the problems. .. If the code I'm touching breaks, then I'll fix it ASAP, .. It's not about blame, it's about paying attention to breakages in code that a person claims to be supporting, and then doing their best to resolve the issues. Again, if one has the time to actively write/modify code such that something breaks, .. For the third time, no disagreement here. git-bisect can help in many cases, but not in all cases. And it requires a great time commitment from somebody whose system used to work and now doesn't work. The person who broke it has a fair bit of responsibility there, too. cheers -
Simple?
Everything you have in mind with "should double check their changes" is
simply not realistic with dozens of known unfixed regressions within
more than half a million changed or new lines of code written by more
Maintainers are just humans with limited time.
You were the one who suggested to "distribute code maintainership",
code writer != subsystem maintainer
git-bisect can help only for regressions, and it can help for most
regressions.
And you shouldn't try to make a problem out of something that isn't a
problem:
Bug submitters are either volunteers who test -rc or even -git or -mm
kernels for finding bugs or people who want a problem they experience
fixed.
In both cases the submitters are usually willing to invest some time for
cu
Adrian
-
Where do you get this number from?
$ du -sh .git/objects/pack/
249M    .git/objects/pack/
$ du -sh .git/objects/
253M    .git/objects/
i.e. about half what you claim.
-- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step."
-
.. No, it's from earlier in this very thread: ..
mkdir t
cd t
git clone git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
(wait half an hour)
/usr/bin/du -s linux-2.6
522732  linux-2.6
-
You're assuming that everything in linux-2.6 was downloaded; that's not true. Everything in linux-2.6/.git was downloaded; but then you do a checkout which happens to approximately double the size of the linux-2.6 directory. If you do git-clone -n, you'll get a closer estimate to the size of the download. I suppose git-clone should grow a -v option that it could pass to rsync to let us find out how many bytes are actually transferred, but i'm happy to go with 250MB as a close estimate to the amount of data to xfer. When you compare it to the 60MB tarballs that are published, it's really not that bad. -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -
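The difference between a checked-out clone and a "-n" clone can be seen on a small scale with a scratch repository. This is only a sketch: the upstream repository and its data.txt file are invented for the example, and the sizes it prints say nothing about the real linux-2.6 tree.

```shell
#!/bin/sh
# Illustrative only: compare a normal clone with a "-n" (no checkout) clone.
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "tester@example.com"
git config user.name "Tester"

# Some made-up content so the working tree has a measurable size.
awk 'BEGIN { for (i = 0; i < 50000; i++) print "line", i }' > data.txt
git add data.txt
git commit -q -m "initial import"

cd "$work"
# A file:// URL forces a real transfer instead of local hardlinks.
git clone -q "file://$work/upstream" with-checkout
git clone -q -n "file://$work/upstream" no-checkout

# The -n clone holds only .git, so its size is closer to what was transferred.
du -sk with-checkout no-checkout
visible_files=$(ls no-checkout | wc -l)
echo "files checked out in the -n clone: $visible_files"
```

Running "du" on the .git directory of an existing clone gives the same kind of estimate without cloning again.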
.. Ah, I wondered why it took only half an hour to download. .. The tarballs I download are only 45MB. Cheers -
You clone the git repo once. Afterwards, you only update it and that usually doesn't take that much time and a little effort. Greetings, Rafael -
and you can get even lower than the 260MB by downloading a shallow clone of v2.6.23 and then populating the git tree from that point on. (see the --depth parameter of git-clone) [because most of the time you want to bisect back to the last stable release, not back to 2 years of git history.] Ingo -
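A minimal sketch of what --depth does, using a scratch repository rather than the kernel tree. The upstream repository and its five commits are made up for the example; with the real tree the depth would be chosen so that the last stable release is still included.

```shell
#!/bin/sh
# Illustrative only: a shallow clone carries less history than its upstream.
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "tester@example.com"
git config user.name "Tester"

# Five made-up commits stand in for years of history.
for n in 1 2 3 4 5; do
    echo "change $n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

cd "$work"
# Fetch only the most recent commit (file:// forces a real transfer).
git clone -q --depth 1 "file://$work/upstream" shallow

full_count=$(git -C upstream rev-list --count HEAD)
shallow_count=$(git -C shallow rev-list --count HEAD)
echo "upstream has $full_count commits, the shallow clone has $shallow_count"
```

Later fetches add new commits on top of the shallow starting point, and "git fetch --unshallow" can pull in the remaining history if it is needed after all.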
When creating additional git trees (Linville's wireless-2.6 tree, for example) for driver development, you can save a lot of download bandwidth by using the --reference parameter of git-clone. Larry -
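The effect of --reference can be sketched with two clones of a scratch repository. The names "upstream", "base" and "second" are placeholders invented for this example: "base" stands in for an existing linux-2.6 clone and "second" for an additional tree such as wireless-2.6.

```shell
#!/bin/sh
# Illustrative only: a second clone borrows objects from an existing one.
set -e
work=$(mktemp -d)
git init -q "$work/upstream"
cd "$work/upstream"
git config user.email "tester@example.com"
git config user.name "Tester"
echo "hello" > README
git add README
git commit -q -m "initial import"

cd "$work"
# First tree: an ordinary full clone.
git clone -q "file://$work/upstream" base

# Second tree: objects already present in "base" are not transferred again.
git clone -q --reference "$work/base" "file://$work/upstream" second

# The borrowing is recorded as a path to the reference repository's objects.
alternates=second/.git/objects/info/alternates
cat "$alternates"
```

One caveat with this setup is that the second tree depends on the reference repository staying in place, since it keeps pointing at that object store.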
Actually, the best command is git gc which does a repack (into a single pack file rather than an incremental), and then removes all the objects now in the pack. If, like me, you work on temporary branches which you keep rebasing, you can add a --prune to gc which will erase all unreferenced objects as it packs (use this one with care. I usually never use it but run a git prune -n just to see what would be removed, and then run git prune separately if it looks OK). James -
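The cautious sequence described above (dry run, then prune, then repack) might look like this on a scratch repository. The repository and the unreferenced blob are made up for the example; the blob merely stands in for leftovers from a rebased temporary branch.

```shell
#!/bin/sh
# Illustrative only: inspect unreferenced objects before pruning and packing.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "tester@example.com"
git config user.name "Tester"
for n in 1 2 3; do
    echo "change $n" > file.txt
    git add file.txt
    git commit -q -m "commit $n"
done

# Write a blob that nothing references.
orphan=$(echo "scratch data" | git hash-object -w --stdin)

# Dry run first: list what prune would delete without deleting anything.
would_remove=$(git prune -n)
echo "prune would remove: $would_remove"

# It looks fine, so prune for real, then pack everything that is left.
git prune
git gc -q

# After gc the reachable objects live in a pack, not as loose files.
loose=$(git count-objects -v | sed -n 's/^count: //p')
echo "loose objects left: $loose"
```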
Thanks for the comment. That managed to indeed shave a few extra bytes off my already "repack -a -d" packed repo still. Rene. -
"git-repack -a -d" gives me ~220 MB: $ du -s .git 222064 .git anyone who can download a 43 MB tar.bz2 tarball for a kernel release should be able to afford a _one time_ download size of 250 MB (the size of the current kernel.org git repository). If not, burning a CD or DVD and carrying it home ought to do the trick. Git is very bandwidth-efficient after that point - lots of people behind narrow pipes are using it - it's just the initial clone that takes time. And given all the history and metadata that the git repository carries (full changelogs, annotations, etc.) it's a no-brainer that kernel developers should be using it. (and you can shrink the 250 MB further down by using shallow clones, etc.) yes, some people complained when distros stopped doing floppy installs. Some people complained when distros stopped doing CD installs. Yes, i've myself done a 250+ MB download over a 56 kbit modem in the past, and while it indeed took overnight to finish, it's very much doable. It's not really qualitatively different from the 1.5 hours a kernel tar.bz2 took to download. Ingo -
Probably that once in a while, we should set up a complete tree in a tar.bz2 format on kernel.org. It would help a lot of people behind small pipes. I have been encountering problems with git-clone when the link is unstable. After the smallest error, it erases everything and you have to retry from start, which is quite frustrating and expensive. At least, downloading a tar.bz2 with FTP would be easier and a lot more reliable. Also, people could download it from their workplace and bring it home. Willy -
This is the only method that scales. Developer has only 24 hours in each day, and sometimes he needs to eat, sleep, and maybe even pay attention to e.g. his kids. But bug reporters are much more numerous and they have more hours in one day combined. BUT - it means that developers should try to increase user base, Developer should let reporter know that reporter needs to help a bit here. Sometimes a bit of hand holding is needed, but it Yes. Developers should not grow more and more unhelpful and arrogant towards their users just because inexperienced users send incomplete/poorly written bug reports. They need to provide help, not humiliate/ignore. I think we agree here. -- vda -
99% on the reporter? Is that why I always try to understand the reporter's problem (*provided* it's in an area I know about) and come up with a patch to test a theory or fix the issue? I'm _less_ inclined to provide such a "service" for lazy maintainers who've moved off into new and wonderfully exciting technologies, to churn out more patches for me to merge (and eventually provide a free-to-them bug-fixing service for.) That's "less" inclined, not "won't". -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
.. Same here. I just find it weird that something can be known broken for several -rc* kernels before I happen to install it, discover it's broken on my own machine, and then I track it down, fix it, and submit the patch, generally all within a couple of hours. Where the heck was the dude(ess) that broke it ?? AWOL. And when I receive hostility from the "maintainers" of said code for fixing -
Given a decent bug report, I agree that having the bug not looked at is shameful. But what can a developer do if a bug report effectively reads "there is some bug somewhere in recent kernels"? How can I know that in this particular case it is my bug that I introduced? It could just as easily be 50 other people and none of them are eager to debug it unless they suspect it to be their bug. This is a common problem and fairly unrelated to linux in general or the kernel in particular. Who is going to be the sucker that figures out which developer the bug belongs to? And I have yet to find a project, commercial or opensource, where volunteers flock to become such a sucker. One option is to push this role to the bug reporter. Another is to strong-arm some developers into this role, by whatever means. A third would be for $LARGE_COMPANY to hire some people. If you have a better idea or would volunteer your time, I'd be grateful. Simply blaming one side, whether bug reporter or a random developer, for not being the sucker doesn't help anyone. Jörn -- Joern's library part 2: http://www.art.net/~hopkins/Don/unix-haters/tirix/embarrassing-memo.html -
It's relatively common that a regression in subsystem A will manifest as a failure in subsystem B, and the report initially lands on the desk of the subsystem B developers. But that's OK. The subsystem B people are the ones with the expertise to be able to work out where the bug resides and to help the subsystem A people understand what went wrong. Alas, sometimes the B people will just roll eyes and do nothing because they know the problem wasn't in their code. Sometimes. -
And sometimes the A people will ignore the B people after the root cause has been worked out. Do you have a good idea how to shame A into action? Should I put you on Cc:? Right now I'm in the eye-rolling phase. Jörn -- The cost of changing business rules is much more expensive for software than for a secretary. -- unknown -
Well, that's the problem, isn't it? The best I can come up with is to suggest that all the info be captured in a bugzilla report so that at least it doesn't get forgotten about. I suppose that other options are a) try to fix it yourself. I'll take the patch and as long as we make a big enough mess of it, someone who knows what they're doing might fix it for real. b) If it was a regression, identify the offending commit and we'll just revert it. -
.. Most of the regressions we have are easily identifiable and not of the type where there could be "50 other people" touching the relevant code. As a developer (and former subsystem maintainer) I look hard at my own code when there's a bug reported that could have come from recent updates there. Usually there are not that many updates to consider, and tracking it down is just a matter of being willing to do so. Of late, I've given up on other developers fixing the stuff they break on my own machines, and I generally just dive into totally unfamiliar code, and find and fix it myself. Quite quickly, usually. And the bugs are often very apparent just from looking at the source code diffs (patches) from recent history in the code that's not working. This is not rocket science, and it doesn't require a log2 download/rebuild/reboot process. But yes, there are more difficult ones, like when my machine crashed yesterday with some form of corruption showing up during JBD filesystem I/O. That's one where the problem isn't going to be obvious to anyone, and I don't actually expect anyone to go looking for it right away. If more such events happen, then it will get more attention. But things like broken drivers, in almost every case those are trivial .. Nobody's blaming anyone here. I'm just asking that developers here do more like our Top Penguin does, and actually look at problems and try to understand them and suggest fixes to try. And not rely solely on the git-bisect crutch. It's a good crutch, provided the reporter is a kernel developer, or has a lot of time on their hands. But we debugged Linux here for a long time without it. And I already volunteer my time here, thanks, BIG TIME, since 1992 or so. Cheers -
Same thing can be said for compile breakages as well. Looking at the latest kautobuild output:
ARM ep93xx defconfig has been broken since 2.6.23-git1 due to:
  drivers/net/arm/ep93xx_eth.c:420: error: implicit declaration of function '__netif_rx_schedule_prep'
caused by:
  [NET]: Make NAPI polling independent of struct net_device objects.
ARM netx defconfig has been broken since 2.6.23-git1 due to:
  drivers/net/netx-eth.c: In function 'netx_eth_hard_start_xmit':
  drivers/net/netx-eth.c:131: error: 'dev' undeclared (first use in this function)
  drivers/net/netx-eth.c:131: error: (Each undeclared identifier is reported only once
  drivers/net/netx-eth.c:131: error: for each function it appears in.)
  drivers/net/netx-eth.c: In function 'netx_eth_receive':
  drivers/net/netx-eth.c:158: error: 'dev' undeclared (first use in this function)
caused by:
  [NET] drivers/net: statistics cleanup #1 -- save memory and shrink code
Haven't got a report for either of those, but Kautobuild lets people know if folk can be bothered to subscribe to its mailing list and/or look at the site occasionally. I suspect the maintainers of the above drivers aren't aware that their drivers are broken.
-- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of:
-
Btw, I used to test every -mm kernel. But since I've switched distros (gentoo->ubuntu) and I have less time, I feel it's harder to test -rc or -mm kernels (I know this isn't a lkml problem but more a distro problem, but I would love having an ubuntu blessed repo with current dev kernel for the latest stable ubuntu release). For debugging, maybe it's time someone does an amazon ec2+s3 service to automate the bisecting and create .deb/.rpm from git, I don't know how much it would cost though. regards, Benoit -
a few months ago i estimated the costs of this and it's just a few terabytes so within arm's reach. As long as the .deb/.rpm's are built by tracking -git in a rolling fashion CPU time should not be a big issue. The only limit is download bandwidth - but even that problem might be solvable via a huge git repository of ready-to-boot .o's that are linked together on the tester's machine. Ingo -
There are two parts to this. One is an Ubuntu development kernel which
we can give to large numbers of people to expand our testing pool.
But if we don't do a better job of responding to bug reports that
would be generated by expanded testing this won't necessarily help us.
The other is an automated set of standard pre-built bisection points so
that testers can more easily localize a bug down to a few hundred
commits without needing to learn how to use "git bisect" (think Ubuntu
users).
So for the first, I've actually been playing with some plans to put
together an unofficial kernel that is basically "what Ted is using on
his laptop". It generally has emergency bug fixes that haven't made it
into mainline, plus some other trees where I've been more aggressive
since I want the latest in wireless and powersaving technology, etc.
It has the property that "if it breaks, you get to keep both pieces
--- and I've helpfully included the git ID in the package name so you
can do the bisection yourself". If you want to try it, the first such
kernel is here:
http://www.kernel.org/~tytso/tbek
I wasn't planning on talking about it until it was more fully baked,
but if people want something vaguely stable based on 2.6.24-rc2, this
might be interesting.
As for the second, I was just talking to Arjan over pizza and beer
last night, and we reached the same conclusion as Ingo, which is this
really isn't that hard. It wouldn't be that hard to set up
infrastructure to do this, and it's just a matter of getting the disk
space and the network bandwidth together in the right place, plus a
relatively small amount of programming at least for the simplest
iteration of the idea. (As is quite common when doing designs over
beer, we talked about some more grandiose web-based schemes to do
custom built kernels that were tied to the kernel bugzilla, but first
things first. :-)
- Ted
-
Before that you want a flowchart or instruction list of boot options to try. A lot of errors can be localised simply by asking the reporter to boot with things like "iommu=off", "pci=routeirq", "acpi=off" etc. That takes a lot less time to run through and can be very informative. Alan -
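A first pass at such an instruction list could be little more than a table of options, each tried on its own. The sketch below is only an illustration of that idea: the three options are the ones named in the mail above, while the one-line descriptions, the function name and the sample base command line are assumptions made for the example.

```python
# Hypothetical checklist. The option names come from the mail above;
# the descriptions are informal summaries, not official documentation.
BOOT_OPTIONS = [
    ("acpi=off", "disable ACPI entirely"),
    ("pci=routeirq", "do IRQ routing for all PCI devices at boot"),
    ("iommu=off", "do not use the IOMMU"),
]


def checklist(base_cmdline):
    """Return (command line, reason) pairs, one option added per step."""
    steps = []
    for option, why in BOOT_OPTIONS:
        steps.append((base_cmdline + " " + option, why))
    return steps


if __name__ == "__main__":
    # "root=/dev/sda1 ro" is just a placeholder for the reporter's usual line.
    for n, (cmdline, why) in enumerate(checklist("root=/dev/sda1 ro"), 1):
        print("%d. boot with '%s' (%s); is the bug still there?"
              % (n, cmdline, why))
```

Trying one option per boot keeps the answer unambiguous: if the bug goes away, the reporter knows which subsystem to name in the report.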
The main problem isn't missing testers [1] - we already have relatively
experienced people testing kernels and/or reporting bugs, and we slowly
scare them away due to the many bug reports without any reaction.
The main problem is finding experienced developers who spend time on
looking into bug reports.
Getting many relatively inexperienced users (who need more guidance for
debugging issues) as additional testers is therefore IMHO not
cu
Adrian
[1] and e.g. when Greg says he has a few hundred people who want to
write drivers it would most likely be possible to find a few
dozen additional -rc testers among them
--
"Is there not promise of rain?" Ling Tan asked suddenly out
of the darkness. There had been need of rain for many days.
"Only a promise," Lao Er said.
Pearl S. Buck - Dragon Seed
-
And where do experienced developers come from? They are not born with Linux kernel skills. They grow up from within the user base. Bigger user base -> more developers (eventually) -- vda -
You missed the following in my email:
"we slowly scare them away due to the many bug reports without any
reaction."
The problem is that bug reports take time. If you go away from easy
things like compile errors then even things like describing what no
longer works, ideally producing a scenario where you can reproduce it
and verifying whether it was present in previous kernels can easily take
many hours that are spent before the initial bug report.
If the bug report then gets ignored we discourage the person who sent
cu
Adrian
-
Cannot agree more. I am in a similar position right now. My patch to the aic7xxx driver was submitted four times with not much reaction from the scsi guys. Finally they replied and asked to rediff it against their git tree. I did that and sent patches back. No reply since then. And mind you, the patch is not trying to do anything complex, it mostly moves code around, removes 'inline', adds 'const'. What should I think about it? -- vda -
I'm waiting for an ACK/NAK from Hannes, the maintainer. What should I do? -- Intel are signing my paycheques ... these opinions are still mine "Bill, look, we understand that you're interested in selling us this operating system, but compare it to ours. We can't possibly take such a retrograde step." -
I haven't actually been able to test it here (too busy, sorry). If someone else confirms it does its job then Acked-by: Hannes Reinecke <hare@suse.de> Cheers, Hannes -- Dr. Hannes Reinecke zSeries & Storage hare@suse.de +49 911 74053 688 SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg GF: Markus Rex, HRB 16746 (AG Nürnberg) -
hi Matthew, You could have informed me about this, and I would talk to Hannes myself. This would free up your mind from keeping track of this particular patch. Parallelize development, prevent things from being forgotten. It's not in my mailbox on this machine, gladly we have lkml archived in the Net. Here is a positive tester report: http://lkml.org/lkml/2007/10/15/168: ====================== Date Mon, 15 Oct 2007 15:53:08 +0200 From Gabriel C <> want. Works fine for me tested on : 03:0e.0 SCSI storage controller [0100]: Adaptec AIC-7892P U160/m [9005:008f] (rev 02) Gabriel ======================= -- vda -
I still run the patches on a box with 2.6.23 and one with 2.6.24-rc1 without any problems. Haven't tested rc2/current git but I can if needed. Gabriel -
this has nothing to do with the bugs on bugzilla. you're trying to send a janitor patch. It should be no surprise that the response to that is neither heated nor a joyous reception :) If you have a problem getting your cleanup patch to the driver maintainer, send it to the subsystem maintainer instead, or even the janitors, or even Adrian Bunk who will gladly push it to everyone. Or, even to Andrew Morton who will carry it in -mm for a while and then harass the subsystem maintainer to merge it for you! Cheers, Auke -
There are already. IMO the problem is the development model. There are tons of new features in each new kernel release and 'tons of new bugs' which are not fixed during the release cycle nor in the .XX stable kernels. Maybe after XX kernel releases there should be one just with bug-fixes _without_ any new features, e.g.: cleaning bugs from bugzilla, known regressions, cleaning up code, Gabriel -
Won't work. You cannot force people to work on things they don't find interesting, long-term. -- vda -
Hum. If only each of those would squash one bug a week besides their own work... I would expect he's got a handful that know IDE, another group that is into network drivers and so on. I predict that pile of bugs to disappear in weeks (-; Just my $0.02. Jan Evert -
I'm very encouraged to read of your expanded testing efforts. As a bcm43xx developer, Ubuntu has been our problem distro, mostly because your standard kernels have debugging turned off for bcm43xx. When an Ubuntu user reports a problem and we ask for the relevant output from dmesg, they have no information. I ask two things of all distros: (1) Turn on debugging - we don't spam the logs that badly, and (2) forward any bugs found by your testing to the maintainer, and/or the bcm43xx mailing list. Thanks, Larry -
Heh. I hadn't enabled CONFIG_BCM43XX_DEBUG myself, but I just changed it for my next kernel build. This is a slightly different issue, which is that sometimes _DEBUG options shouldn't be turned on by default (because they really trash performance and bloat log size), and sometimes they are painless to turn on and don't cost much. If that is the case, I'd suggest removing the option and just making it compiled in by default with a run-time option to enable it. - Ted -
I am taking your suggestion and will produce the necessary patches for ssb, b43 and b43legacy. As bcm43xx is likely to be removed from 2.6.25, which is the earliest such a non-bug fix patch would be accepted, I hope that your future distribution and testing kernels will include the debug option. Thanks, Larry -
I don't see any reason that we couldn't have a tool accessible to Ubuntu users that does a real "git bisect". Git is really good at being scripted by fancy GUIs. It should be easy enough to have a drop down with all of the Ubuntu kernel package releases, where the user selects what works and what doesn't. Then the tool clones a git repository with flags to only get relevant parts, and then leads a bisect run, where it's also configuring, building, and installing the kernels (as a different grub entry), and providing instructions in general. Fundamentally, "git bisect" is a really low-interaction process: you tell it a couple of commits, and then it does stuff, and then you tell it "I tested, and it worked" or "I tested, and it had the problem" or "Something else went wrong", and it asks you something new. Other than that, it just takes time (and a build system hook, which this tool would handle for the kernel). Eventually, it tells you what to report, and you do so. -Daniel *This .sig left intentionally blank* -
It should be possible for it to clone only the portion that they actually care about based on where the known-good version is. It should also (in theory, anyway) be possible to put off some amount of the download until None of this is going to take as long, even on a slow link and a slow computer, as waiting for a response to a mailing list post. It'd annoy users who are specifically waiting for it, but if the interface is that the user says "kernel package X didn't work but the current kernel does", and it says "I'll let you know when I've got something to test", and the user watches a DVD, and afterward finds a message saying there's something to test, and tries it, and reports how it went, and the process repeats until it narrows it down to a single commit after a couple of days of the user getting occasional responses, it's not that different from asking for Could have a distro-provided mask of things that aren't worth testing and That would probably help for giving the user something to try right away. I still think that the main cost to the user is the number of times that the user has to stop doing stuff to reboot with a kernel to test, whether the test kernels are available quickly from the distro site, slowly built locally, or slowly as suggested by humans helping online. -Daniel *This .sig left intentionally blank* -
(Cc: trimmed a bit). Well, the compile phase can. Especially if the first time you try to compile the kernel with EXTRAVERSION=`git describe`, which forces almost a full rebuild every time... But the worst problem is that a full recompile, with a distro .config, will take hours on my 2.66GHz/CoreDuo/1G ram. Trimming down .config is fundamental to be able to bisect effectively, but it's not an easy thing to do for an inexperienced user (and a painful one for all the rest of us). What would be an invaluable help would be a tool that generates a .config with all the modules and subsystems I am using *now*. Should be possible in principle by parsing KConfig and Makefiles and using as input the current .config and lsmod... is it possible to map the kernel object name to the option enabling it? Romano -- Sorry for the disclaimer --- ¡I cannot stop it! -- This communication contains confidential information. It is for the exclusive use of the intended addressee. If you are not the intended addressee, please note that any form of distribution, copying or use of this communication or the information in it is strictly prohibited by law. If you have received this communication in error, please immediately notify the sender by reply e-mail and destroy this message. Thank you for your cooperation. -
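For the simplest case, the object-name-to-option mapping asked about above can be read straight out of Makefile lines of the form `obj-$(CONFIG_FOO) += foo.o`. The following is a rough sketch of that one step, not a working tool: the function names and the sample Makefile text are invented for illustration, and it deliberately ignores composite objects, line continuations, directory entries and Kconfig dependencies, all of which a real tool would have to handle.

```python
import re

# Matches lines such as:  obj-$(CONFIG_FOO) += foo.o bar.o
OBJ_LINE = re.compile(r"^obj-\$\((CONFIG_[A-Za-z0-9_]+)\)\s*[+:]?=\s*(.*)$")


def normalize(name):
    """lsmod shows underscores where the object file may use hyphens."""
    return name.replace("-", "_")


def map_objects_to_options(makefile_text):
    """Build {module name: CONFIG option} from simple obj-$(CONFIG_x) lines."""
    mapping = {}
    for line in makefile_text.splitlines():
        match = OBJ_LINE.match(line.strip())
        if not match:
            continue
        option, objects = match.groups()
        for obj in objects.split():
            if obj.endswith(".o"):  # skip subdirectory entries like "foo/"
                mapping[normalize(obj[:-2])] = option
    return mapping


def options_for_modules(mapping, loaded_modules):
    """Split loaded modules into those we can map and those we cannot."""
    found, missing = {}, []
    for module in loaded_modules:
        key = normalize(module)
        if key in mapping:
            found[module] = mapping[key]
        else:
            missing.append(module)
    return found, missing


if __name__ == "__main__":
    # Made-up Makefile fragment and module list, for illustration only.
    sample = "obj-$(CONFIG_FOO_NET) += foo-net.o\nobj-y += core.o\n"
    print(options_for_modules(map_objects_to_options(sample),
                              ["foo_net", "unknown_mod"]))
```

The list of unmapped modules matters as much as the mapped ones: anything the sketch cannot resolve still has to be enabled by hand, otherwise the trimmed .config may no longer boot.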
Compared to getting useful suggestions from a mailing list, especially before you've gotten anybody's attention? Hours or overnight isn't particularly long, and doesn't take up much of your time if you've got a I don't think there's anything set up for that, aside from the actual build system generating it, and I don't know how hard that would be to repurpose for generating a configuration. -Daniel *This .sig left intentionally blank* -
I don't understand that number.
The common case are regressions in -rc1, and a bisection of
cu
Adrian
-
As a long time kernel tester, I see some problems with the newer "new development model". In the short merge windows, after too much time, there are too many patches. So it is a problem to bisect bugs, and to get the attention of developers. My impression is that in one week there are many more messages on lkml and too many bugs to be handled in those few days.
I have two proposals:
- better patch quality. I would like every commit to compile. An automatic commit test and public blame could increase the quality of first commits. [bisecting across non-compilable points is not a trivial task]
- a slowdown of patch inclusion in the merge window (i.e. not too many big changes in the first days). As a tester I would prefer that some big changes be included in a "secondary window" (a pre or rc release), in a different period from the big patch rush.
ciao
cate
-
I think the root issue there is that it's hard to get all testers to run a bisect, but easy to ask them to test snapshots. Right now the snapshots are generated nightly, but I think it would make more sense if they were generated every N patches, for some value of N... Of course, for that to really work, we have to ensure that the result is always compilable, which has been getting better, but not perfect. Ray -
I don't see a point in doing that - that would be a more manual
bisecting, and the result would not be one guilty commit.
Testers are not expected to be able to hack a kernel, but it's
reasonable to expect testers to be able to build their own kernels
(and your proposal wouldn't change that).
The small instruction below is enough for everyone who is able to
cu
Adrian
<-- snip -->
# install git
# clone Linus' tree:
git clone \
    git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux-2.6.git
# start bisecting:
cd linux-2.6
git bisect start
git bisect bad v2.6.21
git bisect good v2.6.20
cp /path/to/.config .
# start a round
make oldconfig
make
# install kernel, check whether it's good or bad, then:
git bisect [bad|good]
# start next round
After about 10-15 reboots you'll have found the guilty commit
("... is first bad commit").
More information on git bisecting:
man git-bisect
-
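The "10-15 reboots" figure follows from bisection halving the candidate range each round, so the number of rounds grows with the base-2 logarithm of the number of commits. A small worked calculation, where the function name is made up and the 5000-commit figure is only an assumed example size, not a count taken from the actual v2.6.20..v2.6.21 range:

```python
def bisect_rounds(commits_in_range):
    """Rounds needed to isolate one commit among N by halving: ceil(log2(N)).

    (N - 1).bit_length() gives the same value using integers only.
    """
    if commits_in_range < 1:
        raise ValueError("need at least one candidate commit")
    return (commits_in_range - 1).bit_length()


if __name__ == "__main__":
    # 5000 is an assumed, illustrative range size.
    for n in (1024, 5000, 32768):
        print("%6d commits -> about %d build-and-boot rounds"
              % (n, bisect_rounds(n)))
```

So 10 rounds cover roughly a thousand commits and 15 rounds roughly thirty thousand. On real, non-linear history git may need a round more or less than this estimate.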
I jump in this discussion hoping to have some more insight on git and to report my experience as a tester. I consider myself as half-literate in this (I am here since 1991, more or less, and I am able to compile a kernel and even hand-apply a patch, although I am in no way a kernel programmer).
This was what I did in my (in the end almost successful) bisecting when trying to find the mmc problem (see the thread named "2.6.24-rc1 eat my SD card").
This is true in theory, but it has some problems. The "this commit does not compile" case is the easiest, and man git-bisect explains how to solve it. The changes in .config options, added or removed, are another problem when jumping back and forth between versions (I was bitten by the gazillions of new options added to the hda-intel alsa driver, but well, that is solvable with a bit of attention). The main problem I had, and the one that stopped me from arriving at a definite answer, is this situation:
j version-bad
i
h
g unrelated (but similar) bug corrected
f
e
d unrelated (but similar) bug introduced
c
b
a version-good
(d was the series to change drivers to use sg helpers, and g was a "fix fallout from sg helpers" patch). Now I have a series of kernels (d, e, f) that did not work at all and so I cannot mark them good or bad. With the number of patches added in the free-for-all week, this is a very probable scenario. Is there a way out of this using bisect?
Romano
PS as a suggestion, I think that adding a "Reported-by", or "Tested-by", or "Debugged-by" attribution in the repository, as happened in the MMC case, is a nice and welcome reward for the effort.
-
I think there are three strategies you can use in this case:
- create a kernel config that is as simple as possible, but still supports
your hardware and reproduces your problem; a simpler config will often
avoid compilation issues in parts of the kernel that you're not using
anyway and has the benefit of speeding up the compiles too
- if you know/suspect in what part of the tree the bug is, first limit the
bisection to that; you will have to verify that you did indeed find the
correct (broken) change by doing a compile for the "last good commit + 1"
- if you find a broken commit, use 'git-reset --hard' to try to jump past
the bad set of commits, but of course that does not help in the case:
g version-bad
f unrelated bug corrected
e
d the broken commit that caused your problem
c
b unrelated bug that breaks compilation or system introduced
a version-good
in that case the best you can reasonably be expected to do is report that
you narrowed it down to "between a and g" and leave the rest to the
developers
Cheers,
FJP
-
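The "narrow it down to a range and report that" outcome can be shown with a toy model. The sketch below is not how git implements bisection; it assumes a strictly linear history, a single good-to-bad transition, and a reliable verdict for every commit that can be tested at all. The function name and the status strings are invented for the example.

```python
def narrow(statuses):
    """Bisect a linear history in which some commits cannot be tested.

    statuses[i] is 'good', 'bad' or 'skip' (does not build or boot).
    statuses[0] must be 'good' and statuses[-1] must be 'bad'.
    Returns (last_good, first_bad); the culprit is one of the commits
    last_good + 1 .. first_bad inclusive.
    """
    lo, hi = 0, len(statuses) - 1
    if statuses[lo] != "good" or statuses[hi] != "bad":
        raise ValueError("need a known-good start and a known-bad end")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        # Prefer the midpoint, fall back to the nearest testable neighbour.
        candidates = sorted(range(lo + 1, hi), key=lambda i: abs(i - mid))
        testable = [i for i in candidates if statuses[i] != "skip"]
        if not testable:
            break  # only untestable commits remain: report the range
        pick = testable[0]
        if statuses[pick] == "good":
            lo = pick
        else:
            hi = pick
    return lo, hi


if __name__ == "__main__":
    # Commits a..j with d, e, f untestable, as in the scenario in this thread.
    names = "abcdefghij"
    history = ["good"] * 3 + ["skip"] * 3 + ["bad"] * 4
    last_good, first_bad = narrow(history)
    print("culprit is after %s, up to and including %s"
          % (names[last_good], names[first_bad]))
```

With that history the search stops at "after c, up to and including g": four candidate commits instead of ten, which is already a much more useful report than "somewhere between the two releases".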
Fixed (extended) in DaveM's tree (or will be soon - patch was submitted by Pierre Ynard). Sorry, others are either driver related (and thus require hardware to be tested on and maintainers to be kicked in) or too obscure (like the 2.6.11 bug and a weird network problem which is undetectable on other systems). Yes, we suck, but we try to recover :) -- Evgeniy Polyakov -
On 13-11-2007 12:15, Andrew Morton wrote: Looks very reproducible! Maybe you should add this to ...bugzilla? Regards, Jarek P. -
Untrue. We've been discussing it on list in the past and it's now on bugzilla. Not obvious from outside I realise. That one I'm afraid is Not sure who really owns parallel. Have grabbed and will sort out. -
Actually, there has been a response (Eric asked in mailing list and created a bug and got answer to the mailing list): As I read the bug it seems that the cause was a filesystem with errors (which were in ACL's and thus kernel didn't boot only with ACL's enabled) and fsck fixed the problem... I would close this one as invalid (OK, I know the filesystem had to be corrupted somehow but unless this is at least occasionally reproducible, there's low chance of finding the bug). Honza -- Jan Kara <jack@suse.cz> SuSE CR Labs -
For christ sake Andrew. Some of us are not employed to do kernel work
24h x 365days a year. You might be, I'm not.
First thing, it's not a regression. Second thing, it's *not* a bug.
uboot requires kernel images to be specially wrapped up in their crappy
formats before uboot will recognise it. This means that if someone wants
to boot a binary image with uboot, they need to either:
1. work out the correct 'mkimage' command and run that program after
the kernel build has completed.
2. sort out adding a new target to the kernel makefiles to run this
uboot specific 'mkimage' command automatically.
And Alexandre (the original feature-missing reporter) has linked to a
message where a patch was proposed to do (2). So obviously it's no
Bug was assigned to reporter, so I ignored it on the grounds that the
reporter was resolving it. Plus, until recently I didn't have any
workable PXA systems to test stuff on.
In the end, a similar issue has been resolved anyway after a lot of
discussion on the ARM lists about how PXA should handle one-shot mode
with clockevents. It took absolutely ages to get agreement on what was
a simple patch.
commit 91bc51d8a10b00d8233dd5b6f07d7eb40828b87d
Author: Russell King <rmk@dyn-67.arm.linux.org.uk>
Date: Thu Nov 8 23:35:46 2007 +0000
[ARM] pxa: fix one-shot timer mode
One-shot timer mode on PXA has various bugs which prevent kernels
build with NO_HZ enabled booting. They end up spinning on a
permanently asserted timer interrupt because we don't properly
clear it down - clearing the OIER bit does not stop the pending
interrupt status. Fix this in the set_mode handler as well.
Moreover, the code which sets the next expiry point may race with
the hardware, and we might not set the match register sufficiently
in the future. If we encounter that situation, return -ETIME so
the generic time code retries.
Acked-by: Thomas Gleixner <tglx@linutronix.de>
Acked-by: Nicolas Pitre ...Maybe I'm optimistic, but I expected Ingo/Thomas to look after nohz problems. nohz=off highres=off fixes more than one suspend problem... ...stuff I've seen with NOHZ even without suspend (cursor blinking irregularly) makes me think that nohz perhaps should not be used in production just yet... Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
It appears that bug 9229 has been solved, and the reporter of that bug now says that: If I unset NO_TZ suspend/resume works. If I set it suspend/resume doesn't works. So I think this guy is now suffering from bug #9275 -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
FWIW, I see the same problem with another HP notebook, DV4378EA with radeon X700 video card. It does not happen frequently but I can say that since I disabled the tickless feature I can't reproduce the problem anymore. -
.. Note: that same bug exists/existed on i386 back when NO_HZ was introduced (2.6.21?). I still see it from time to time on my Quad core system (very rare), but not any more on my Duo notebook where it used to happen about 1 in n boots (n < 10). .. I *still* get very slow resume-from-RAM quite often here (new in 2.6.23 kernel, wasn't there in early 2.6.23-rc*). Something eventually times out after a minute or so and it comes back. Cannot make it happen reliably, unless I'm in a hurry to get something done. :) I suspect USB here, probably the same loopy bug that we added a "loop limit failsafe" for back in 2.6.21(?). -
Plus we've just merged a fix for NO_HZ on PXA platforms due to an utterly broken one-shot implementation. So chances are this problem is now fixed. However, I object strongly to Andrew's responses to these bugs. He's completely out of line. Given the wide range of ARM platforms today, it is utterly idiotic to expect a single person to be able to provide responses for all ARM bugs. I for one wish I'd never *VOLUNTEERED* to be a part of the kernel bugzilla, and really *WISH* I could pull out of that function. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
You can. Perhaps that bugzilla needs to point to some kind of arm-maintainers@vger.kernel.org list for the various ARM platform maintainers ? Alan -
That might work - though it would be hard to get all the platform maintainers to be signed up to yet another mailing list, I'm sure sufficient would do. -- Russell King Linux kernel 2.6 ARM Linux - http://www.arm.linux.org.uk/ maintainer of: -
As long as it would just be bug reports, I'm sure that most of us could be persuaded to subscribe. Adding another list for general discussions is probably not going to be read, the current list provides more than enough to keep us busy. -- Ben Q: What's a light-year? A: One-third less calories than a regular year. -
..
The "limit" added in the code below,
which was for messages of this form:
hub 1-1:1.0: hub_port_status failed (err = -71)
last message repeated 347 times
I'm not yet sure what's happening on resume now,
but there's this huge long pause with a dark screen
and then suddenly the USB subsystem comes to life
(my mouse lights up) and the system finally resumes.
More when I know more. But it doesn't happen every time,
or even most times, so git-bisect is not possible either.
This one actually requires a developer/maintainer to put
in some effort and think about things. Currently, that's me.
-ml
-
I have been reporting this off and on since 2.6.23 was released. This problem was not apparent up to perhaps 2.6.23-rc8, but definitely became common in 2.6.23 and 2.6.23.1.
Most of the time, a resume-from-RAM on my notebook takes about 2.1 seconds of kernel time to complete. Once in a while, it takes *much* longer, in the 14-20 second range. These long events *seem* to be mostly after the notebook has been in suspend for a longish time, but there's really nothing consistent here. So git-bisect isn't going to work for this one.
I recently rebuilt the kernel to include printk timestamps, and then it went 2 days without the issue happening, until this morning (after an overnight suspend) finally.
The machine is a Dell Inspiron 9400, Intel chipset + Core2Duo 2.1GHz w/3GB DDR2. PCI Express chipset, ATI graphics, SATA hard drive.
00:00.0 Host bridge: Intel Corporation Mobile 945GM/PM/GMS/940GML and 945GT Express Memory Controller Hub (rev 03)
00:01.0 PCI bridge: Intel Corporation Mobile 945GM/PM/GMS/940GML and 945GT Express PCI Express Root Port (rev 03)
00:1b.0 Audio device: Intel Corporation 82801G (ICH7 Family) High Definition Audio Controller (rev 01)
00:1c.0 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 1 (rev 01)
00:1c.1 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 2 (rev 01)
00:1c.3 PCI bridge: Intel Corporation 82801G (ICH7 Family) PCI Express Port 4 (rev 01)
00:1d.0 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #1 (rev 01)
00:1d.1 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #2 (rev 01)
00:1d.2 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #3 (rev 01)
00:1d.3 USB Controller: Intel Corporation 82801G (ICH7 Family) USB UHCI #4 (rev 01)
00:1d.7 USB Controller: Intel Corporation 82801G (ICH7 Family) USB2 EHCI Controller (rev 01)
00:1e.0 PCI bridge: Intel Corporation 82801 Mobile PCI Bridge (rev e1)
00:1f.0 ISA bridge: Intel Corporation 82801GBM (ICH7-M) LPC Interface Bridge ...
Well, and why the 1-second pauses eventually stop, too. Seems interesting that they don't continue. Also, they're pretty much dead-on one- and two-second pauses, with HZ accuracy. Is this with a NO_HZ kernel? -
.. Yes, A NO_HZ kernel. Full config follows: # # Automatically generated make config: don't edit # Linux kernel version: 2.6.24-rc2-git4 # Wed Nov 14 14:19:45 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=16 # CONFIG_CGROUPS is not set # CONFIG_FAIR_GROUP_SCHED is not set CONFIG_SYSFS_DEPRECATED=y # CONFIG_RELAY is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y CONFIG_EMBEDDED=y CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y # CONFIG_SLUB_DEBUG is not set # CONFIG_SLAB is not set CONFIG_SLUB=y # CONFIG_SLOB is not set CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not ...
.. Blah.. that was for the wrong kernel. Here is the correct .config that was in use for both suspend logs: # # Automatically generated make config: don't edit # Linux kernel version: 2.6.23.1 # Wed Nov 14 09:41:54 2007 # CONFIG_X86_32=y CONFIG_GENERIC_TIME=y CONFIG_GENERIC_CMOS_UPDATE=y CONFIG_CLOCKSOURCE_WATCHDOG=y CONFIG_GENERIC_CLOCKEVENTS=y CONFIG_GENERIC_CLOCKEVENTS_BROADCAST=y CONFIG_LOCKDEP_SUPPORT=y CONFIG_STACKTRACE_SUPPORT=y CONFIG_SEMAPHORE_SLEEPERS=y CONFIG_X86=y CONFIG_MMU=y CONFIG_ZONE_DMA=y CONFIG_QUICKLIST=y CONFIG_GENERIC_ISA_DMA=y CONFIG_GENERIC_IOMAP=y CONFIG_GENERIC_BUG=y CONFIG_GENERIC_HWEIGHT=y CONFIG_ARCH_MAY_HAVE_PC_FDC=y CONFIG_DMI=y CONFIG_DEFCONFIG_LIST="/lib/modules/$UNAME_RELEASE/.config" # # General setup # CONFIG_EXPERIMENTAL=y CONFIG_LOCK_KERNEL=y CONFIG_INIT_ENV_ARG_LIMIT=32 CONFIG_LOCALVERSION="" # CONFIG_LOCALVERSION_AUTO is not set CONFIG_SWAP=y CONFIG_SYSVIPC=y CONFIG_SYSVIPC_SYSCTL=y CONFIG_POSIX_MQUEUE=y # CONFIG_BSD_PROCESS_ACCT is not set # CONFIG_TASKSTATS is not set # CONFIG_USER_NS is not set # CONFIG_AUDIT is not set CONFIG_IKCONFIG=y CONFIG_IKCONFIG_PROC=y CONFIG_LOG_BUF_SHIFT=16 # CONFIG_CPUSETS is not set CONFIG_SYSFS_DEPRECATED=y # CONFIG_RELAY is not set CONFIG_BLK_DEV_INITRD=y CONFIG_INITRAMFS_SOURCE="" CONFIG_CC_OPTIMIZE_FOR_SIZE=y CONFIG_SYSCTL=y CONFIG_EMBEDDED=y CONFIG_UID16=y CONFIG_SYSCTL_SYSCALL=y CONFIG_KALLSYMS=y # CONFIG_KALLSYMS_ALL is not set # CONFIG_KALLSYMS_EXTRA_PASS is not set CONFIG_HOTPLUG=y CONFIG_PRINTK=y CONFIG_BUG=y CONFIG_ELF_CORE=y CONFIG_BASE_FULL=y CONFIG_FUTEX=y CONFIG_ANON_INODES=y CONFIG_EPOLL=y CONFIG_SIGNALFD=y CONFIG_EVENTFD=y CONFIG_SHMEM=y CONFIG_VM_EVENT_COUNTERS=y CONFIG_SLAB=y # CONFIG_SLUB is not set # CONFIG_SLOB is not set CONFIG_RT_MUTEXES=y # CONFIG_TINY_SHMEM is not ...
Can you try nohz=off highres=off? Strange stuff is happening with nohz. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -
.. (added Ingo to CC: list: maybe this is some weird interaction with CFS and jiffies being reset to 0 on resume ??) I can try it, but it won't help debug the problem much. Remember, this happens very inconsistently, maybe 3-4 times a day, or not at all for 3-4 days. But if somebody has a specific bug-fix patch that could explain this, then I'll happily apply it here. Cheers -
hm, CFS should have no impact here.
To see what's happening you could try to use the latency tracer of the -rt patch and do a cross-resume trace. pick up the latest latency tracer patch from:
http://redhat.com/~mingo/private/latency-tracer-v2.6.24-rc2-git5-combo.patch
apply it and enable CONFIG_FUNCTION_TRACING, then pick up trace-cmd.c:
http://redhat.com/~mingo/private/trace-cmd.c
and do something like:
./trace-cmd pm-suspend > trace.txt
or:
./trace-cmd /bin/bash -c "echo mem > /sys/power/state" > trace.txt
this should trigger suspend - then you should do the resume. If everything goes well then trace.txt should contain a pretty large trace of all the stuff we do during a suspend+resume. and wait for such a pause and send us the resulting trace.txt.
if it's an SMP box then first do:
echo 1 > /proc/sys/kernel/trace_all_cpus
to get a global trace. Let me know if something doesn't work with this scheme.
Ingo
-
sorry, wrong URLs, the correct links are: http://redhat.com/~mingo/latency-tracing-patches/latency-tracer-v2.6.24-rc2-git5-combo... http://redhat.com/~mingo/latency-tracing-patches/trace-cmd.c Ingo -
.. Is there a version of these that works with 2.6.23.1 ? I'm not using 2.6.24-* on this machine because: (a) behaviour may be different, and (b) something broke VMware compatibility again. Thanks -
yes, i've backported it and have uploaded the v2.6.23 version to: http://redhat.com/~mingo/latency-tracing-patches/latency-tracer-v2.6.23.1-combo.patch Ingo -
so here's an UP suspend+resume trace i did:
http://redhat.com/~mingo/latency-tracing-patches/misc/trace-suspend-long.txt.bz2
tons of detail - which might be interesting to other folks as well.
Fact is, our suspend-to-RAM+resume cycle is very, very slow, even on fast hardware - and this trace shows all the reasons why. This was a fully cached system - i.e. i've done a suspend+resume before to warm up the caches. (not that suspend+resume does much IO normally.)
The trace shows that a suspend+resume cycle is 7.95 seconds long (without counting the time the box spent suspended) - ouch! This was a T60 with Core2Duo 1.83GHz.
For example here is where freezing starts:
bash-2397 0.... 31686us : remove_wait_queue (vt_waitactive)
bash-2397 0.... 31688us : freeze_processes (enter_state)
bash-2397 0.... 31689us : printk (freeze_processes)
here is where the ACPI code triggers the suspend:
bash-2397 0D... 1904138us : acpi_hw_low_level_write (acpi_hw_register_write)
but this is a whopping 1.9 seconds into the trace already!
first sign of life after i opened the laptop lid again:
bash-2397 0D... 1904138us : __restore_processor_state (restore_processor_state)
bash-2397 0D... 1904138us : enable_sep_cpu (__restore_processor_state)
(in the trace there's no delay visible - the period of time spent suspended is not visible to the tracer.)
One good way to start looking at such traces is to filter out rescheduling events alone:
grep ': schedule <' trace-suspend-long.txt
that gives a rough outline of what's going on:
<idle>-0 0D... 1776566us : schedule <bash-2397> (0 20)
bash-2397 0D... 1786748us : schedule <<idle>-0> (20 0)
scsi_eh_-419 0D... 1786814us : schedule <bash-2397> (0 -5)
bash-2397 0D... 1786960us : schedule <scsi_eh_-419> (-5 0)
scsi_eh_-421 0D... 1787020us : schedule <bash-2397> (0 -5)
bash-2397 0D... 1787125us : schedule <scsi_eh_-421> (-5 0)
so you can zoom in on the real area ...
and the amount of time spent executing on the CPU was only 70 msecs! So
we spent 99% of those 7.9 seconds just waiting around. Here are the
top 10 sleep reasons:
864 schedule()<-schedule_timeout()<-ps2_sendbyte()<-ps2_command()
183 schedule()<-vt_waitactive()<-vt_ioctl()<-tty_ioctl()
164 schedule()<-schedule_timeout()<-acpi_ec_wait()<-acpi_ec_transaction()
157 schedule()<-refrigerator()<-get_signal_to_deliver()<-do_notify_resume()
118 schedule()<-worker_thread()<-kthread()<-kernel_thread_helper()
80 schedule()<-do_msleep()<-msleep()<-sata_link_debounce()
64 schedule()<-schedule_timeout()<-inet_csk_accept()<-inet_accept()
37 schedule()<-__mutex_lock_slowpath()<-mutex_lock()<-acpi_ec_transaction()
20 schedule()<-schedule_timeout()<-do_select()<-core_sys_select()
20 schedule()<-io_schedule()<-sync_buffer()<-__wait_on_bit()
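A tally like the one above can be rebuilt from a raw trace with a short script. The sketch below is in Python and assumes the call-chain line format that appears in this trace (task, pid, flags, timestamp, then a chain joined by "<-"). The function name tally_sleep_reasons and the sample lines are made up for illustration; they are not part of the latency tracer or of trace-cmd.

```python
from collections import Counter


def tally_sleep_reasons(lines, top=10):
    """Return the most common call chains that begin in schedule().

    Assumed line format (not a documented interface):
        <task> <pid> <flags> <time>us : schedule()<-caller()<-...
    """
    counts = Counter()
    for line in lines:
        # Everything after the first ": " is treated as the call chain.
        _, sep, chain = line.partition(": ")
        chain = chain.strip()
        if sep and chain.startswith("schedule()<-"):
            counts[chain] += 1
    return counts.most_common(top)


# Hypothetical sample lines; a real run would read them from trace.txt.
sample = [
    "bash 3500 0D... 100us : schedule()<-schedule_timeout()<-ps2_sendbyte()<-ps2_command()",
    "bash 3500 0D... 102us : psmouse_sliced_command()<-synaptics_pt_write()<-ps2_sendbyte()<-ps2_command()",
    "bash 3500 0D... 300us : schedule()<-schedule_timeout()<-ps2_sendbyte()<-ps2_command()",
    "bash 3500 0.... 900us : schedule()<-do_msleep()<-msleep()<-sata_link_debounce()",
]

for chain, count in tally_sleep_reasons(sample):
    print(f"{count:5d} {chain}")
```

Counting whole chains rather than just the immediate caller keeps, for example, the two different routes into acpi_ec_transaction() apart, as in the list above.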
what's weird are all those ps2 related sleeps - they make up much of
the delay. This is what such a sleep point looks like:
bash 3500 0D... 8641415us : schedule()<-schedule_timeout()<-ps2_sendbyte()<-ps2_command()
bash 3500 0D... 8641417us : psmouse_sliced_command()<-synaptics_pt_write()<-ps2_sendbyte()<-ps2_command()
it starts somewhere here:
bash 3500 0.... 5302376us : serio_reconnect_driver (serio_resume)
and ends:
kseriod 208 0D... 9182560us : synaptics_query_hardware()<-synaptics_reconnect()<-psmouse_reconnect()<-serio_reconnect_driver()
so this section (serio_resume()) took almost 4 seconds.
the main delay seems to be dpm_resume():
bash 3500 0.N.. 3061040us : dpm_resume (device_resume)
...
bash 3500 0.... 9105017us : mutex_unlock (dpm_resume)
bash 3500 0.... 9105018us : mutex_unlock (device_resume)
6.1 seconds!
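The section lengths quoted above come from subtracting the microsecond timestamps of two trace lines. A minimal Python sketch of that arithmetic follows; the helper names timestamp_us and elapsed_seconds are hypothetical, and the regular expression only assumes the "<digits>us" timestamp shape visible in these traces.

```python
import re

# Assumed timestamp shape: whitespace, digits, "us", optional "+" or "!", then ":".
_STAMP = re.compile(r"\s(\d+)us[+!]?\s*:")


def timestamp_us(line):
    """Extract the microsecond timestamp from one trace line."""
    match = _STAMP.search(line)
    if match is None:
        raise ValueError(f"no timestamp in line: {line!r}")
    return int(match.group(1))


def elapsed_seconds(first_line, last_line):
    """Time between two trace lines, in seconds."""
    return (timestamp_us(last_line) - timestamp_us(first_line)) / 1_000_000


# The four lines below are copied from the trace excerpts in this mail.
dpm = elapsed_seconds(
    "bash 3500 0.N.. 3061040us : dpm_resume (device_resume)",
    "bash 3500 0.... 9105017us : mutex_unlock (dpm_resume)",
)
serio = elapsed_seconds(
    "bash 3500 0.... 5302376us : serio_reconnect_driver (serio_resume)",
    "kseriod 208 0D... 9182560us : synaptics_query_hardware()<-synaptics_reconnect()<-psmouse_reconnect()<-serio_reconnect_driver()",
)
print(f"dpm_resume:   {dpm:.2f} s")
print(f"serio_resume: {serio:.2f} s")
```

For the lines shown here this gives roughly 6 seconds for dpm_resume() and just under 4 seconds for the serio section.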
Ingo
-
snd hda suspend latency goes down a second via the patch below.
Ingo
------------->
Subject: snd hda suspend latency: shorten codec read
From: Ingo Molnar <mingo@elte.hu>
not sleeping for every codec read/write but doing a short udelay and
a conditional reschedule has cut suspend+resume latency by about 1
second on my T60.
Signed-off-by: Ingo Molnar <mingo@elte.hu>
---
sound/pci/hda/hda_intel.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
Index: linux/sound/pci/hda/hda_intel.c
===================================================================
--- linux.orig/sound/pci/hda/hda_intel.c
+++ linux/sound/pci/hda/hda_intel.c
@@ -555,7 +555,8 @@ static unsigned int azx_rirb_get_respons
 		}
 		if (!chip->rirb.cmds)
 			return chip->rirb.res; /* the last value */
-		schedule_timeout_uninterruptible(1);
+		udelay(10);
+		cond_resched();
 	} while (time_after_eq(timeout, jiffies));
 
 	if (chip->msi) {
-
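To see why replacing a one-jiffy sleep with udelay(10) can add up to about a second, here is a back-of-the-envelope calculation in Python. The timer frequency (HZ = 250) and the number of polls (250) are assumptions picked for illustration, not values measured in this thread; only the 10 microsecond delay comes from the patch. A one-jiffy sleep is taken as lasting up to one full tick.

```python
HZ = 250          # assumed timer frequency in ticks per second
POLLS = 250       # assumed number of codec polls that had to wait
UDELAY_US = 10    # from the patch: udelay(10)


def total_wait_us(polls, per_poll_us):
    """Total time spent waiting if every poll waits per_poll_us microseconds."""
    return polls * per_poll_us


tick_us = 1_000_000 // HZ                      # length of one jiffy
sleeping = total_wait_us(POLLS, tick_us)       # upper bound with one-jiffy sleeps
spinning = total_wait_us(POLLS, UDELAY_US)     # with udelay(10) per poll

print(f"one-jiffy sleeps: up to {sleeping / 1_000_000:.3f} s")
print(f"udelay(10):       about {spinning / 1_000_000:.4f} s")
```

Under these assumptions the sleeping variant can cost on the order of a second while the short delay costs a few milliseconds, which is the same order of magnitude as the saving reported for the T60.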
-- "Premature optimization is the root of all evil." - Donald Knuth -
At Fri, 16 Nov 2007 13:58:24 +0100, Cute, I applied to ALSA tree now. Thanks! -
Ouch? That's an order of magnitude faster than my 3GHz P4 :) -Mike -
..
make trace-cmd
cc -Wall -O2 -s trace-cmd.c -o trace-cmd
trace-cmd.c: In function ‘main’:
trace-cmd.c:65: warning: label ‘usage’ defined but not used
-
.. An update: Ingo's 2.6.23 version of the latency-tracing-patches only locks up on resume here. But now that I've hacked vmware to work on 2.6.24, my notebook is running the newer kernels. So now to see if the "strange 1-second pauses" ever happen here, and if they do I'll patch in Ingo's stuff to try and find out why. -
Doesn't seem to work with plain 2.6.23:
kernel/sched.c:3384: warning: ‘struct prio_array’ declared inside parameter list
kernel/sched.c:3384: warning: its scope is only this definition or declaration, which is probably not what you want
kernel/sched.c: In function ‘trace_array’:
kernel/sched.c:3391: error: dereferencing pointer to incomplete type
kernel/sched.c:3393: error: dereferencing pointer to incomplete type
kernel/sched.c:3393: error: dereferencing pointer to incomplete type
kernel/sched.c:3396: error: dereferencing pointer to incomplete type
kernel/sched.c:3396: error: dereferencing pointer to incomplete type
kernel/sched.c: In function ‘trace_all_runnable_tasks’:
kernel/sched.c:3407: error: ‘struct rq’ has no member named ‘active’
make[1]: *** [kernel/sched.o] Error 1
And I cannot find a definition of struct prio_array in current git either. Is another patch needed?
Jörn
--
Time? What's that? Time is only worth what you do with it.
-- Theo de Raadt
-
change that to rt_prio_array in the code. Ingo -
could you try this updated version: http://redhat.com/~mingo/latency-tracing-patches/latency-tracing-v2.6.24-rc3.combo.patch does it work any better? Ingo -
It compiles. It boots with a 512M RAM (384M was too little with all the other debug options on). But it seems to lock up when running trace-cmd. On a rerun it locks up again, but with different output. Rerun was captured: http://logfs.org/~joern/trace1.jpg I should do a couple of runs, but my girlfriend claims realtime priority for the evening. Jörn -- Chance favors only the prepared mind. -- Louis Pasteur -
hm, you should decrease MAX_TRACE in kernel/latency_trace.c from 1 million to 16K or so. 1 million entries probably depletes lowmem quite
hm, that looks weird. if you disable CONFIG_PROVE_LOCKING, does that
yeah, SCHED_IDLE is not generally well received by them.
Ingo
-
Not much, although the dumps look different now: http://logfs.org/~joern/trace3.jpg http://logfs.org/~joern/trace4.jpg I have to change my qemu setup a little to see the top of those ...as soon as more urgent tasks have finished (weekend is over). Jörn -- It does not matter how slowly you go, so long as you do not stop. -- Confucius --
btw., if you start qemu like this:
qemu -cdrom ./cdrom.iso -hda ./hda.img -boot c -full-screen -kernel ~/bzImage -append "root=/dev/hda1 earlyprintk=serial,ttyS0,9600 console=tty console=ttyS0,9600 enforcing=0 debug"
you'll get the inner kernel's serial console log to qemu's standard output. Pretty useful for capturing kernel crashes.
Ingo
--
Almost. "-serial stdio" was missing. Much better now.
stopped custom tracer.
BUG: spinlock recursion on CPU#0, sh/953
lock: c030f280, .magic: dead4ead, .owner: sh/953, .owner_cpu: 0
Pid: 953, comm: sh Not tainted 2.6.24-rc3-ge1cca7e8-dirty #2
[<c0103a04>] show_trace_log_lvl+0x35/0x54
[<c010450a>] show_trace+0x2c/0x2e
[<c0104e6d>] dump_stack+0x84/0x8a
[<c01ded7c>] spin_bug+0xa7/0xae
[<c01def14>] _raw_spin_lock+0x45/0xfa
[<c02a02b1>] _spin_lock_irqsave+0x68/0x7a
[<c01087e7>] pit_read+0x14/0x99
[<c0130ee9>] get_monotonic_cycles+0xf/0x2d
[<c013c0ef>] now+0x2a/0x7c
[<c013c33b>] ____trace+0x4d/0x1e8
[<c013dbf3>] __mcount+0x95/0xa6
[<c010d35c>] mcount+0x14/0x18
[<c0135a44>] lock_acquired+0xe/0x1d7
[<c02a02b9>] _spin_lock_irqsave+0x70/0x7a
[<c01087e7>] pit_read+0x14/0x99
[<c0130791>] update_wall_time+0x23/0x692
[<c0121756>] do_timer+0x24/0xb1
[<c01331fe>] tick_periodic+0x49/0x84
[<c013325b>] tick_handle_periodic+0x22/0x73
[<c0106315>] timer_interrupt+0x4f/0x56
[<c013e2c7>] handle_IRQ_event+0x24/0x4f
[<c013f44a>] handle_edge_irq+0xb8/0x125
[<c01054ee>] do_IRQ+0x89/0xa3
[<c01033df>] common_interrupt+0x23/0x28
[<c015d924>] vfs_write+0xa6/0x14c
[<c015df6e>] sys_write+0x4c/0x70
[<c0102a1f>] syscall_call+0x7/0xb
=======================
I assume you have the latency tracer working. If you could send me your config, I could do a manual config-bisect and see which part of mine causes the problem.
Jörn
--
Admonish your friends privately, but praise them openly.
-- Publilius Syrus
--
ah. You should mark pit_read() function as notrace. PIT clocksource is rare. (add the 'notrace' word to the function prototype) Ingo --
Hardly a change at all. Apart from some offsets, this dump is identical.
stopped custom tracer.
BUG: spinlock recursion on CPU#0, sh/954
lock: c030f280, .magic: dead4ead, .owner: sh/954, .owner_cpu: 0
Pid: 954, comm: sh Not tainted 2.6.24-rc3-ge1cca7e8-dirty #3
[<c0103a04>] show_trace_log_lvl+0x35/0x54
[<c010450a>] show_trace+0x2c/0x2e
[<c0104e6d>] dump_stack+0x84/0x8a
[<c01ded7c>] spin_bug+0xa7/0xae
[<c01def14>] _raw_spin_lock+0x45/0xfa
[<c02a02b1>] _spin_lock_irqsave+0x68/0x7a
[<c01087e2>] pit_read+0xf/0x91
[<c0130ee1>] get_monotonic_cycles+0xf/0x2d
[<c013c0e7>] now+0x2a/0x7c
[<c013c333>] ____trace+0x4d/0x1e8
[<c013dbeb>] __mcount+0x95/0xa6
[<c010d354>] mcount+0x14/0x18
[<c0135a3c>] lock_acquired+0xe/0x1d7
[<c02a02b9>] _spin_lock_irqsave+0x70/0x7a
[<c01087e2>] pit_read+0xf/0x91
[<c0130789>] update_wall_time+0x23/0x692
[<c012174e>] do_timer+0x24/0xb1
[<c01331f6>] tick_periodic+0x49/0x84
[<c0133253>] tick_handle_periodic+0x22/0x73
[<c0106315>] timer_interrupt+0x4f/0x56
[<c013e2bf>] handle_IRQ_event+0x24/0x4f
[<c013f442>] handle_edge_irq+0xb8/0x125
[<c01054ee>] do_IRQ+0x89/0xa3
[<c01033df>] common_interrupt+0x23/0x28
[<c010d354>] mcount+0x14/0x18
[<c0120130>] sysctl_head_finish+0xc/0x33
[<c0192d64>] proc_sys_write+0x96/0xa0
[<c015d91c>] vfs_write+0xa6/0x14c
[<c015df66>] sys_write+0x4c/0x70
[<c0102a1f>] syscall_call+0x7/0xb
=======================
Jörn
--
Don't worry about people stealing your ideas. If your ideas are any good, you'll have to ram them down people's throats.
-- Howard Aiken quoted by Ken Iverson quoted by Jim Horning quoted by Raph Levien, 1979
--
hm, it seems lock_acquired() [in kernel/lockdep.c] needs to be marked 'notrace' too - otherwise we recurse back into pit_read(). Ingo --
After another ten or so notrace annotations throughout the spinlock code, the latency tracer appears to work. Not sure how much useful information is missing through all the annotations, though. Jörn -- Das Aufregende am Schreiben ist es, eine Ordnung zu schaffen, wo vorher keine existiert hat. -- Doris Lessing --
a few annotations out of thousands of function calls in the kernel are usually not significant. Ingo --
thanks - the patches applied fine, i've added them to my latency-tracer patchqueue. Steve Rostedt might want to pick the fixes up for -rt as well. Ingo --
hm, do you have CONFIG_FRAME_POINTER=y, i.e. are the dumps reliable? Ingo --
I do. Went through 10-odd runs and annotated the function right below mcount each time. Seems to work now. Trouble is that it doesn't solve my real problem at hand. Something is causing significant delays when writing to logfs. Core logfs code is not what is running, but it may be triggering whatever other code is running and burning up all the cpu time. Wasting 100ms of "qemu-time" to write a single page happens fairly frequently. With the latency tracer the problem appears to have become worse. Now the softlockup code triggers quite frequently. Which makes a bit of sense, as the problem is a busy CPU, rather than an idle one. Guess I'll try oprofile or lcov instead. Jörn -- Joern's library part 5: http://www.faqs.org/faqs/compression-faq/part2/section-9.html --
well what does the trace say, where do the delays come from? To get a quick overview you can make tracing lighter weight by doing:
echo 0 > /proc/sys/kernel/mcount_enabled
echo 1 > /proc/sys/kernel/trace_syscalls
(this turns the latency tracer into a "global strace" kind of tracer)
Ingo
--
I mistyped and did
echo 1 > /proc/sys/kernel/mcount_enabled
Result looked like a livelock and finally convinced me to abandon the latency tracer. Sorry, but it appears to be the right tool for the wrong job.
Jörn
--
They laughed at Galileo. They laughed at Copernicus. They laughed at Columbus. But remember, they also laughed at Bozo the Clown.
-- unknown
--
hm, we routinely use it in -rt to capture "what on earth is happening" incidents. The snippet below is a random snippet from a trace that i've just captured, with mcount enabled. It seems to work fine here, with and without mcount. (pit clocksource is almost never used, that's why you had those early problems.)
oprofile helps if you can reliably reproduce the slowdown in a loop or for a long amount of time, with lots of CPU utilization - and then it's also lower overhead. The tracer can be used to capture rare or complex events, and gives the full flow control and what is happening within the kernel.
Ingo
------------>
<idle> 0 1D... 811us : sched_clock_idle_sleep_event (acpi_processor_idle)
<idle> 0 1D... 813us : _spin_lock (sched_clock_idle_sleep_event)
trace-cm 2463 0.... 814us : native_flush_tlb_others (flush_tlb_mm)
<idle> 0 1D... 815us : __update_rq_clock (sched_clock_idle_sleep_event)
trace-cm 2463 0.... 817us : _spin_lock (native_flush_tlb_others)
<idle> 0 1D... 817us+: acpi_cstate_enter (acpi_processor_idle)
trace-cm 2463 0.... 820us+: send_IPI_mask_bitmask (native_flush_tlb_others)
trace-cm 2463 0D... 823us+: apic_wait_icr_idle (send_IPI_mask_bitmask)
trace-cm 2463 0.... 856us+: up_write (copy_process)
trace-cm 2463 0.... 859us+: copy_keys (copy_process)
trace-cm 2463 0.... 862us+: copy_namespaces (copy_process)
trace-cm 2463 0.... 865us+: copy_thread (copy_process)
trace-cm 2463 0.... 868us+: memcpy (copy_thread)
trace-cm 2463 0.... 871us+: alloc_pid (copy_process)
trace-cm 2463 0.... 874us+: kmem_cache_alloc (alloc_pid)
trace-cm 2463 0.... 877us+: _spin_lock_irq (alloc_pid)
trace-cm 2463 0.... 880us+: _write_lock_irq (copy_process)
trace-cm 2463 0D... 883us+: _spin_lock (copy_process)
trace-cm 2463 0D... 887us+: recalc_sigpending (copy_process)
trace-cm 2463 0D... 890us+: recalc_sigpending_tsk (recalc_sigpending)
trace-cm 2463 0D... 893us+: attach_pid (copy_process)
trace-cm 2463 ...
Such a trace would be useful indeed. But so far the patch has only given me grief and nothing remotely like useful output. Maybe I should simply use the complete -rt patch instead of debugging the broken-out latency-tracer patch. Jörn -- Mundie uses a textbook tactic of manipulation: start with some reasonable talk, and lead the audience to an unreasonable conclusion. -- Bruce Perens --
Looks like it. Guess I'll switch to something else for the moment. Jörn -- Linux is more the core point of a concept that surrounds "open source" which, in turn, is based on a false concept. This concept is that people actually want to look at source code. -- Rob Enderle --
After an eternity of compile time, this config does generate some useful output. qemu is not to blame. Jörn -- Joern's library part 9: http://www.scl.ameslab.gov/Publications/Gus/TwelveWays.html --
not sure. It could be qemu being scheduled away? You could try to run qemu with nice -20 or so, to avoid getting preempted. If time lapses like this still show up:
trace-cm 434 0D.h. 1008us!: do_timer (tick_periodic)
trace-cm 434 0D.h. 1972us+: update_wall_time (do_timer)
trace-cm 434 0D.h. 1008us!: do_timer (tick_periodic)
trace-cm 434 0D.h. 1972us+: update_wall_time (do_timer)
then that could indicate a timekeeping weirdness, OR it could mean that qemu is simply very slow. (there could be timer hw access between those two function calls)
Ingo
--
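Time lapses of this kind can be picked out of a long trace mechanically by comparing consecutive timestamps. The following Python sketch is one way to do it; the function name find_gaps and the 500 microsecond default threshold are arbitrary choices for illustration, and the regular expression only assumes the "<digits>us" timestamp shape seen in the lines quoted above.

```python
import re

# Assumed timestamp shape: whitespace, digits, "us", optional "+" or "!", then ":".
_STAMP = re.compile(r"\s(\d+)us[+!]?\s*:")


def find_gaps(lines, threshold_us=500):
    """Return (previous_line, line, delta_us) for each jump >= threshold_us."""
    gaps = []
    previous = None  # (timestamp, line) of the last entry that had a timestamp
    for line in lines:
        match = _STAMP.search(line)
        if match is None:
            continue  # skip lines that carry no timestamp
        stamp = int(match.group(1))
        if previous is not None and stamp - previous[0] >= threshold_us:
            gaps.append((previous[1], line, stamp - previous[0]))
        previous = (stamp, line)
    return gaps


# The two lines quoted in the mail above.
sample = [
    "trace-cm 434 0D.h. 1008us!: do_timer (tick_periodic)",
    "trace-cm 434 0D.h. 1972us+: update_wall_time (do_timer)",
]

for before, after, delta in find_gaps(sample):
    print(f"{delta}us between:\n  {before}\n  {after}")
```

For the quoted pair the reported gap is 964us between do_timer and update_wall_time; whether that points at timekeeping or at a slow emulator still has to be judged from the surrounding entries.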
Solves the prio_array problem, but leaves the non-existing member active. I've upgraded to -rc3 and will give your latest patch a whirl. Jörn -- Write programs that do one thing and do it well. Write programs to work together. Write programs to handle text streams, because that is a universal interface. -- Doug MacIlroy -
Well, if you could verify that it doesn't happen at all with NO_HZ unset, that I'm not aware of any fix related to the symptoms that you observe. Greetings, Rafael -
.. This is still happening. I was hoping my PCIe hotplug bug+fix might have been related, but no it happened just now after that fix. Since Ingo's latency trace patches lock up the machine on resume, the next thing I'll try instead is to re-enable CONFIG_IRQBALANCE=y. I think that I turned that flag off at around the same time as this problem began, maybe they're related (?). I did notice that CONFIG_IRQBALANCE=y *is* necessary to keep my myth box from having 1-second audio dropouts (2.6.23.1) during playback, so maybe that same 1-second lockout is happening on this box as well (?). Cheers -
hm, which patch did you try? Could you check whether all chunks from the
patch below are applied? (these are the fixes i did when i was doing
cross-suspend traces - this is not something i've done before, so the
tracer had to be adjusted)
i suspect if you turn off CONFIG_FUNCTION_TRACING then you won't get any
hung resume - and the resulting trace would still be pretty useful. (it
will show scheduling and irq activities, etc.)
Ingo
---
arch/x86/kernel/stacktrace.c | 2 +-
arch/x86/power/cpu.c | 3 ++-
drivers/acpi/namespace/nsutils.c | 2 +-
drivers/acpi/namespace/nswalk.c | 2 +-
include/linux/sched.h | 2 ++
kernel/latency_trace.c | 26 +++++++++++++++++++++++---
kernel/softirq.c | 6 +++---
7 files changed, 33 insertions(+), 10 deletions(-)
Index: linux/arch/x86/kernel/stacktrace.c
===================================================================
--- linux.orig/arch/x86/kernel/stacktrace.c
+++ linux/arch/x86/kernel/stacktrace.c
@@ -22,7 +22,7 @@ static int save_stack_stack(void *data,
 	return -1;
 }
 
-static void save_stack_address(void *data, unsigned long addr)
+static void notrace save_stack_address(void *data, unsigned long addr)
 {
 	struct stack_trace *trace = (struct stack_trace *)data;
 	if (trace->skip > 0) {
Index: linux/arch/x86/power/cpu.c
===================================================================
--- linux.orig/arch/x86/power/cpu.c
+++ linux/arch/x86/power/cpu.c
@@ -123,8 +123,9 @@ void __restore_processor_state(struct sa
 	mcheck_init(&boot_cpu_data);
 }
 
-void restore_processor_state(void)
+void notrace restore_processor_state(void)
 {
+	trace_resume();
 	__restore_processor_state(&saved_context);
 }
Index: linux/drivers/acpi/namespace/nsutils.c
===================================================================
--- linux.orig/drivers/acpi/namespace/nsutils.c
+++ linux/drivers/acpi/namespace/nsutils.c
@@ -923,7 +923,7 @@ struct .....
Hi Ingo! I was using your 2.6.23.1 version of the patches, plus the fix you posted. The new patch you gave just now is for 2.6.24, which won't apply to the older kernel. I'm not switching kernels yet, as doing so might mask the problem without actually resolving it for good. Cheers -
.. Just as it prints out these messages, sometimes one of them, sometimes both (or all four on the quad core):
kernel: switched to high resolution mode on cpu 1
kernel: switched to high resolution mode on cpu 0
-
Yeah. No magic sysrq key or anything. There's gotta be a race somewhere that's causing it, but it's not obvious where to look for it. My regular 2-core notebook no longer suffers from it, and subtle .config changes used to make it come and go back when it first appeared. The quad-core has only done it twice on me thus far. Tracking this one down looks tricky. It might require some early lockup detection code to be tailor made or something. Cheers -
Bug was filed under IO/Storage-Other so it is assigned to <other_other@kernel-bugs.osdl.org>. Could be a FS problem as well but it is best to wait for confirmation with 2.6.23 before proceeding further... -
Hi, it is assigned to 'other_modules@kernel-bugs.osdl.org', so I didn't notice, it's as simple as that. -- Jiri Kosina -
