[PATCH 1/3] build system: section garbage collection for vmlinux

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 6:43 am

Build system: section garbage collection for vmlinux


Newer gcc and binutils can do dead code and data removal
at link time. This is achieved using a combination of the
-ffunction-sections -fdata-sections options for gcc and
--gc-sections for ld.

Theory of operation:

Option -ffunction-sections instructs gcc to place each function
(including static ones) in its own section named .text.function_name
instead of placing all functions in one big .text section.

At link time, ld normally coalesces all such sections into one
output section .text again. This is achieved by having a *(.text.*) spec
along with the *(.text) spec in the built-in linker scripts.

If ld is invoked with --gc-sections, it tracks references, starting
from the entry point, and marks all input sections which are reachable
from there. Then it discards all input sections which are not marked.

This isn't buying much if you have one big .text section per .o module,
because even one referenced function will pull in the entire section.
You need -ffunction-sections in order to split .text into per-function
sections and make --gc-sections much more useful.

-fdata-sections is analogous: it places each global or static variable
into .data.variable_name, .rodata.variable_name or .bss.variable_name.
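
A minimal sketch of the behaviour described above, outside the kernel. It assumes gcc and GNU objdump on an ELF target; demo.c, its functions and variables, and the list_sections helper are hypothetical names chosen for illustration and are not part of the patch.

```shell
#!/bin/sh
# Sketch only: assumes gcc and GNU objdump on an ELF target.
# demo.c, its symbols and list_sections() are hypothetical names.
workdir=$(mktemp -d)

cat > "$workdir/demo.c" <<'EOF'
int counter = 5;                         /* initialized global  */
int scratch[64] = { 0 };                 /* zero-filled global  */
int used(void)   { return counter; }     /* called from main()  */
int unused(void) { return scratch[0]; }  /* never called        */
int main(void)   { return used() - 5; }
EOF

# Compile demo.c with any extra gcc flags given as arguments and print
# the names of the .text/.data/.bss input sections found in demo.o.
list_sections() {
    gcc -c "$@" -o "$workdir/demo.o" "$workdir/demo.c" &&
    objdump -h "$workdir/demo.o" |
        awk '$2 ~ /^\.(text|data|bss)/ { print $2 }'
}

echo "without the options:"
list_sections
echo "with the options:"
list_sections -ffunction-sections -fdata-sections

# Link the per-section object with garbage collection enabled and ask ld
# to report what it drops (the report goes to stderr, hence 2>&1).
gcc -Wl,--gc-sections -Wl,--print-gc-sections \
    -o "$workdir/demo" "$workdir/demo.o" 2>&1 | grep 'demo\.o'
```

With the two gcc options the listing should gain per-symbol sections such as .text.unused and .data.counter, and the final link should report the demo.o sections that ld discards.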

How to use it in the kernel:

First, we need to adapt existing code for new section names.
Basically, we need to stop using section names of the form
.text.xxxx
.data.xxxx
.rodata.xxxx
.bss.xxxx
in the kernel; otherwise, section placement done by the kernel's
custom linker scripts produces broken vmlinux and vdso images.

Second, kernel linker scripts need to be adapted by adding KEEP(xxx)
directives around sections which are not directly referenced, but are
nevertheless used (initcalls, altinstructions, etc).

These patches fix section names and add
CONFIG_DISCARD_UNUSED_SECTIONS. It is not enabled
unconditionally because only the newest binutils have
an ld --gc-sections which is stable enough for kernel use.
IOW: this is an experimental feature ...

From: Daniel Walker
Date: Wednesday, September 5, 2007 - 9:29 am

Your version doesn't work with CONFIG_MODULES, right?

Also, why do you need this patch,

[PATCH 1/3] build system: section garbage collection for vmlinux

Daniel

-

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 11:37 am

Because otherwise, for example, the .data.percpu sections we already have
get picked up by *(.data.*), and then *(.data.percpu) ends up empty:

                __per_cpu_start = .;
                       *(.data.percpu)
                       *(.data.percpu.shared_aligned)
                __per_cpu_end = .;

and all hell breaks loose.

We need to stop using sections named like

.text.xxxx
.data.xxxx
.rodata.xxxx
.bss.xxxx

--
vda
-

From: Daniel Walker
Date: Wednesday, September 5, 2007 - 11:38 am

Really? Take a look at this version,

http://lkml.org/lkml/2006/6/4/169

Marcelo had to implement a two-pass build to add back symbols used in
modules which got removed from the main kernel. You don't appear to do
that. Marcelo also claims better size reduction than you.

Daniel

-

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 12:14 pm

This will discard EXPORT_SYMBOLs potentially used by
out-of-tree modules.

I also saw ~10% size reductions, but then at run-time test modules
failed to load: they didn't find needed symbols.

OTOH if I know that I am not going to be using such modules,
then this can be done. Will require another CONFIG_xxx, though.
--
vda
-

From: Daniel Walker
Date: Wednesday, September 5, 2007 - 12:07 pm

Right, so it doesn't work with modules.

Daniel

-

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 12:49 pm

What does "it" stand for in this sentence?

My patch was tested to work in my limited testing,
but it is very conservative.

I can't speak for Marcelo, but his patch probably worked too,
it just didn't guarantee that you can install the kernel, and
then compile and load an external module. Well, it will compile,
but maybe will fail to load.

It sounds like *in-tree modules* were loading
just fine for Marcelo.
--
vda
-

From: Adrian Bunk
Date: Wednesday, September 5, 2007 - 12:31 pm

One point to keep in mind is that the space penalty of CONFIG_MODULES=y 
is so big that CONFIG_MODULES=n is actually the most interesting case 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 6:47 am

Part 1: fix section names over entire source (all arches).

The patch is a big and boring global s/.text.lock/.text_lock/
type of thing.

You can regenerate it using attached
linux-2.6.23-rc4.0.fixname.sh
(e.g. if you need to rebase to later kernel).
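
The attached script is not reproduced in this thread. As a rough sketch of the kind of substitution described, assuming a POSIX sed, a rename could look like the following; rename_text_lock is a hypothetical helper name, and the real script may work differently.

```shell
#!/bin/sh
# Sketch only: the real linux-2.6.23-rc4.0.fixname.sh is not shown here.
# rename_text_lock is a hypothetical helper; it filters stdin to stdout.
rename_text_lock() {
    # The dots are escaped so that only the literal name ".text.lock"
    # is rewritten; other .text.* names pass through unchanged.
    sed 's/\.text\.lock/.text_lock/g'
}

printf '%s\n' '.section .text.lock,"ax"' | rename_text_lock
printf '%s\n' '*(.text.fixup)' | rename_text_lock
```

Filtering a stream rather than editing files in place makes it easy to review the result on a few sample lines before applying it to a whole tree.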

--
vda

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 6:49 am

Part 2: fix x86_64 vdso linker script to not produce
broken vdso image with gcc -ffunction-sections -fdata-sections.

Does not affect normal build.
--
vda

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 6:55 am

Part 3:

Makefile:
init/Kconfig:
  add config DISCARD_UNUSED_SECTIONS with appropriate
  big scary warning. It enables gcc and ld options
  for section garbage collection.

arch/x86_64/kernel/vmlinux.lds.S:
include/asm-generic/vmlinux.lds.h:
  add KEEP and SORT_BY_ALIGNMENT directives, as needed.

arch/frv/Makefile:
  had half-baked option similar to DISCARD_UNUSED_SECTIONS,
  replace it.

DISCARD_UNUSED_SECTIONS=n should be safe for all arches.

DISCARD_UNUSED_SECTIONS=y is usable only for x86_64 at the moment.
--
vda

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 11:40 am

As it typically happens, the last-minute "obviously correct" change was a mistake.

This doesn't work as intended:

LDFLAGS_vmlinux += $(call ld-option, --gc-sections)

With the above line, --gc-sections doesn't get added,
and vmlinux is not garbage collected.

It must be

LDFLAGS_vmlinux += --gc-sections

Sorry.
--
vda
-

From: Sam Ravnborg
Date: Wednesday, September 5, 2007 - 1:46 pm

Doing a normal kernel build will link vmlinux three or four times.
If we introduce --gc-sections we should add a preparational link of
vmlinux where we use --gc-sections and skip it for the rest of the links
assuming that --gc-sections takes some time for ld to do.

I have already tried to introduce such a preparational link
of vmlinux - patch attached.
I skipped this patch because, surprisingly, it was no win for a kernel build.
ld does not seem to be more efficient using a prelinked vmlinux.o compared to several
.o files.

I know this prelinked vmlinux.o does not include the init stuff but
that is so little that it is not interesting for your patch.

	Sam

diff --git a/Makefile b/Makefile
index 350dedb..2cc0fd7 100644
--- a/Makefile
+++ b/Makefile
@@ -592,7 +592,7 @@ libs-y		:= $(libs-y1) $(libs-y2)
 #   +-< $(vmlinux-init)
 #   |   +--< init/version.o + more
 #   |
-#   +--< $(vmlinux-main)
+#   +--< vmlinux.o
 #   |    +--< driver/built-in.o mm/built-in.o + more
 #   |
 #   +-< kallsyms.o (see description in CONFIG_KALLSYMS section)
@@ -608,17 +608,16 @@ libs-y		:= $(libs-y1) $(libs-y2)
 
 vmlinux-init := $(head-y) $(init-y)
 vmlinux-main := $(core-y) $(libs-y) $(drivers-y) $(net-y)
-vmlinux-all  := $(vmlinux-init) $(vmlinux-main)
 vmlinux-lds  := arch/$(ARCH)/kernel/vmlinux.lds
-export KBUILD_VMLINUX_OBJS := $(vmlinux-all)
+export KBUILD_VMLINUX_OBJS := $(vmlinux-init) vmlinux.o
 
 # Rule to link vmlinux - also used during CONFIG_KALLSYMS
 # May be overridden by arch/$(ARCH)/Makefile
 quiet_cmd_vmlinux__ ?= LD      $@
       cmd_vmlinux__ ?= $(LD) $(LDFLAGS) $(LDFLAGS_vmlinux) -o $@ \
       -T $(vmlinux-lds) $(vmlinux-init)                          \
-      --start-group $(vmlinux-main) --end-group                  \
-      $(filter-out $(vmlinux-lds) $(vmlinux-init) $(vmlinux-main) vmlinux.o FORCE ,$^)
+      --start-group vmlinux.o --end-group                        \
+      $(filter-out $(vmlinux-lds) $(vmlinux-init) vmlinux.o FORCE ,$^)
 
 # Generate new ...

From: Denys Vlasenko
Date: Thursday, September 6, 2007 - 3:55 am

ld-option is designed to test whether -Wl,-Wsomething will work

Yes, this will speed up things a bit.

However, for me build time is totally dominated by CC stages, not LD.
I don't have a 32-core CPU yet :(
--
vda
-

From: Sam Ravnborg
Date: Thursday, September 6, 2007 - 3:33 pm

A typical developer workflow is editing a single file
and then building the kernel.
Here CC has much less impact than LD - and this is the case we should optimize for.

	Sam
-

From: Sam Ravnborg
Date: Monday, September 10, 2007 - 5:01 am

If we do the --gc-sections trick during the preparational link then we do
not use the arch-supplied linker script.
Will it be possible to create a dedicated linker script that is valid
for all architectures and which only includes the KEEP() directives for
the diverse sections?

That would also free us from modifying a lot of arch linker scripts, uglifying
them even more than they are today.

	Sam
-

From: Denys Vlasenko
Date: Monday, September 10, 2007 - 12:02 pm

Unfortunately, -r and --gc-sections don't mix.

x86_64-pc-linux-gnu-ld: --gc-sections and -r may not be used together
--
vda
-

From: Sam Ravnborg
Date: Monday, September 10, 2007 - 12:14 pm

OK - so much for that optimization :-(

But then we need to annotate ALL arch linker scripts before introducing this.
And that brings me back to the point that we should put some sanity into these first.

	Sam
-

From: Denys Vlasenko
Date: Tuesday, September 11, 2007 - 4:23 am

I was working with the x86_64 ld script and am willing to clean it up a bit.

I can implement and test DISCARD_UNUSED_SECTIONS for x86_64 and later
for i386.

Other arches can follow when they find it interesting/worthwhile.

At first, a big scary warning under "config DISCARD_UNUSED_SECTIONS"
should be enough to make people avoid it for production boxes, I hope.

Should I send next round of patches to you or to Andrew?
--
vda
-

From: Sam Ravnborg
Date: Tuesday, September 11, 2007 - 4:55 am

Please send them to me with appropriate cc's.

	Sam
-

From: Sam Ravnborg
Date: Wednesday, September 5, 2007 - 1:07 pm

The normal naming scheme seems to be:
.<usage>.text so in your example it would be: .lock.text
See the naming of the init and exit sections (which were renamed
during 2.5 to be compatible with -ffunction-sections).

	Sam
-

From: Denys Vlasenko
Date: Thursday, September 6, 2007 - 3:59 am

Thanks, will do it that way. I plan to re-submit patches for inclusion
into 2.6.24.
--
vda
-

From: Sam Ravnborg
Date: Thursday, September 6, 2007 - 3:36 pm

This should simmer in -mm for a few weeks at minimum before hitting
mainline. I would consider it 2.6.25 material.
We could use the 2.6.23 timeframe to bring all the linker
script files up to a level where adding all the KEEPs is mostly done
in the generic vmlinux.lds.h file.

	Sam
-

From: Denys Vlasenko
Date: Saturday, September 8, 2007 - 8:02 am

Well, there seems to be a problem at least with .bss:

http://sourceware.org/bugzilla/show_bug.cgi?id=5006

With __attribute__((section(".bss.page_aligned")))
gcc will produce a .bss.page_aligned section
with the NOBITS attribute, purely on the basis
of the section name starting with '.bss.'

With __attribute__((section(".bss_page_aligned"))),
section will get PROGBITS attribute instead.

Combining NOBITS and PROGBITS sections into one .bss
section is not funny.

IOW: at least for bss, we _must_ use ".bss.xxx" names.
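
The difference can be inspected directly. The following is a minimal sketch, assuming gcc and GNU readelf on an ELF target; the variable names and the section_type helper are hypothetical. It prints the ELF section type chosen for each of the two names; the thread reports NOBITS for the first and PROGBITS for the second, and other toolchain versions may behave differently.

```shell
#!/bin/sh
# Sketch only: assumes gcc and GNU readelf on an ELF target.
# The variable names and section_type() are hypothetical.
workdir=$(mktemp -d)

cat > "$workdir/bss.c" <<'EOF'
char with_dot[4096]   __attribute__((section(".bss.page_aligned")));
char with_under[4096] __attribute__((section(".bss_page_aligned")));
EOF
gcc -c -o "$workdir/bss.o" "$workdir/bss.c"

# Print the ELF section type (NOBITS, PROGBITS, ...) of the named section.
# In "readelf -S -W" output the type is the field right after the name.
section_type() {
    readelf -S -W "$workdir/bss.o" |
        awk -v name="$1" '{ for (i = 1; i < NF; i++) if ($i == name) print $(i + 1) }'
}

echo ".bss.page_aligned: $(section_type .bss.page_aligned)"
echo ".bss_page_aligned: $(section_type .bss_page_aligned)"
```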

I propose (and will implement in next round of patches)
.bss.k.page_aligned ('k' for 'kernel').

Luckily, we actually have only one special bss section
in the kernel today.

Sam, my question - should I also do the same for text/rodata/data,
just for paranoid reasons?
--
vda
-

From: Oleg Verych
Date: Wednesday, September 5, 2007 - 8:53 am

* Wed, 5 Sep 2007 14:43:21 +0100

Maybe this is just a test suite to get finished with `make XYZ static`?
____
-

From: Denys Vlasenko
Date: Wednesday, September 5, 2007 - 11:46 am

They are vaguely connected in the sense that an unused function which is
not marked static doesn't generate a gcc warning, but will be discarded
by --gc-sections. "make XYZ static" also tends to find them - you make
a function static, you recompile the file, and gcc informs you that
the function is not used at all!

This happened to me when I did aic7xxx patches.

You may use --print-gc-sections to see the list of discarded sections.
--
vda
-

From: Oleg Verych
Date: Wednesday, September 5, 2007 - 1:34 pm

Anyway, this is a gccism/binutilizm. What about other possible/future
options?

Give me an example, please, of why a function must be non-static if it is not used.
If usage requires kconfig tuning, then this is a better way to go than
to adopt yet another GNU/Luxury.
____
-

From: Adrian Bunk
Date: Wednesday, September 5, 2007 - 2:52 pm

The kernel requires GNU gcc and GNU binutils, and if you want to use 
other tools for building the kernel they have to be sufficiently 


The alternative would be to use an unmaintainable amount of #ifdef's.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Denys Vlasenko
Date: Thursday, September 6, 2007 - 3:55 am

Where do you see I'm saying that they must be non-static?

We already do it, but we don't have enough developers to audit
every driver for every possible combination of config options.
As a result, there will always be some amount of unused functions and data.

Actually, this is how linkers should have worked long ago.
Borland's Turbo Pascal was doing it ten+ years ago.

I don't understand why you are opposed to the toolchain helping
humans get an optimized result. But it's fine with me.
I won't force anyone to select CONFIG_DISCARD_UNUSED_SECTIONS.
--
vda
-

From: Oleg Verych
Date: Thursday, September 6, 2007 - 4:40 am

On Thu, Sep 06, 2007 at 11:55:46AM +0100, Denys Vlasenko wrote:

You've made a tool. Documenting this tool to have it available for
testers/janitors/maintainers is a better way, than to have all that

That's why. It's treating symptoms, isn't it?
____
-

From: Adrian Bunk
Date: Thursday, September 6, 2007 - 5:21 am

There is no problem with his patch.

His patch improves the build process.


There's nothing that requires treatment.

It's a matter of fact that the kernel takes advantage of some 
features of GNU binutils and GNU gcc that might not be available
in other versions of these tools.

Whether you like it or not - that's not going to change.

But don't continue arguing about something where you won't win with 
words - it's open source, so you can always create a fork of the Linux 
kernel that builds with whatever toolchain you want.

The only way you could convince other people of your point of view 
would be if your forked version of the kernel contained advantages 
that convince many users to use your kernel rather than Linus' one.

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-

From: Oleg Verych
Date: Thursday, September 6, 2007 - 1:43 pm

On Thu, Sep 06, 2007 at 02:21:43PM +0200, Adrian Bunk wrote:

I would like to know the timing, btw. Size, especially the 1% shown, doesn't
matter if link time increases dramatically. An `allyes' config, when I
had a fast and rammish machine, was a terrible thing (last winter). If 32

Patch? Did I say patch?

Ah, patch. Yes -- hide it, because it is against LKML's rules. I can
provide ftp for such things, easily.

I said tool _and_ documentation. Because if developers don't know
about `static' code or _data_ and can't find that out, then a small
description is more than welcome, I think.

But, tool. Hide it also, because it's the kind of thing to be ashamed
of (:

== untested, for demonstration only ==

SED_REM='
/\.text\./s|\.text\.|.text_|g;
/\.data\./s|\.data\.|.data_|g;
/\.bss\.p/s|\.bss\.p|.bss_p|g; # for .bss.page_aligned only
'
for place in linux/arch/* linux/kernel linux/include/asm-*
do
	# Per-arch exceptions: a line matching ADDON keeps its section name.
	case $place in
	*cris)		ADDON='/\.text\.__/n;'	;;
	*powerpc)	ADDON='/\.data\.rel/n;'	;;
	*parisc)	ADDON='/\.data\.vm[p0]/n;' ;;
	*frv)		ADDON='/\.bss\.stack/n;';;
	*)		ADDON=''		;; # reset, so no exception leaks into the next place
	esac

	find "$place" -type f -exec sed -i -e "$ADDON$SED_REM" {} +
done

== ==




____
-

From: Denys Vlasenko
Date: Thursday, September 6, 2007 - 5:33 am

If I understand you correctly, you seem to think that this work
of identifying every piece of code which can end up unused under
a specific combination of CONFIGs, and #ifdef'ing it out,
should be done by people, not machines.

I disagree.

Allyesconfig kernel has ~1700 unused functions/data objects,
and it is only *one* of possible .configs.
There are more than 2900 CONFIG options in the kernel, giving you
about 3^2000 possible permutations.

If you find it interesting to work on making them
to not have unused functions, more power to you,
and good luck convincing people to accept tons
of additional #ifdefs.
--
vda
-

From: Adrian Bunk
Date: Wednesday, September 5, 2007 - 12:27 pm

In the long term I'd like us to be able to compile the whole kernel (or at 
least most of it) with one "-combine -fwhole-program" gcc call, 
which should bring the same positive effect and also enable gcc to do 
additional optimizations.

That's meant as a remark, not against your patch (which is for a lower 

cu
Adrian

-- 

       "Is there not promise of rain?" Ling Tan asked suddenly out
        of the darkness. There had been need of rain for many days.
       "Only a promise," Lao Er said.
                                       Pearl S. Buck - Dragon Seed

-
