Please check these three patches, which make memblock allocation more top-down. Thanks Yinghai --
We pre-allocate that buffer from the top, so we should use it top-down; the unused part we return will then be on the bottom side. This leaves one less hole in unused RAM.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/include/asm/init.h | 6 +++---
arch/x86/mm/init.c | 12 ++++++------
arch/x86/mm/init_32.c | 4 ++--
arch/x86/mm/init_64.c | 5 +++--
4 files changed, 14 insertions(+), 13 deletions(-)
Index: linux-2.6/arch/x86/include/asm/init.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/init.h
+++ linux-2.6/arch/x86/include/asm/init.h
@@ -11,8 +11,8 @@ kernel_physical_mapping_init(unsigned lo
unsigned long page_size_mask);
-extern unsigned long __initdata e820_table_start;
-extern unsigned long __meminitdata e820_table_end;
-extern unsigned long __meminitdata e820_table_top;
+extern unsigned long __meminitdata e820_table_start;
+extern unsigned long __initdata e820_table_end;
+extern unsigned long __meminitdata e820_table_bottom;
#endif /* _ASM_X86_INIT_32_H */
Index: linux-2.6/arch/x86/mm/init.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init.c
+++ linux-2.6/arch/x86/mm/init.c
@@ -18,9 +18,9 @@
DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
-unsigned long __initdata e820_table_start;
-unsigned long __meminitdata e820_table_end;
-unsigned long __meminitdata e820_table_top;
+unsigned long __meminitdata e820_table_start;
+unsigned long __initdata e820_table_end;
+unsigned long __meminitdata e820_table_bottom;
int after_bootmem;
@@ -73,12 +73,12 @@ static void __init find_early_table_spac
if (base == MEMBLOCK_ERROR)
panic("Cannot find space for the kernel page tables");
- e820_table_start = base >> PAGE_SHIFT;
+ e820_table_start = (base + tables) >> PAGE_SHIFT;
e820_table_end = e820_table_start;
- e820_table_top = e820_table_start + (tables >> ...
We need to access it right away, so make sure that it is already mapped.
Prepare to put the page tables on the local node; nodemap is used before that.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/mm/numa_64.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
Index: linux-2.6/arch/x86/mm/numa_64.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/numa_64.c
+++ linux-2.6/arch/x86/mm/numa_64.c
@@ -87,7 +87,7 @@ static int __init allocate_cachealigned_
addr = 0x8000;
nodemap_size = roundup(sizeof(s16) * memnodemapsize, L1_CACHE_BYTES);
- nodemap_addr = memblock_find_in_range(addr, max_pfn<<PAGE_SHIFT,
+ nodemap_addr = memblock_find_in_range(addr, get_max_mapped(),
nodemap_size, L1_CACHE_BYTES);
if (nodemap_addr == MEMBLOCK_ERROR) {
printk(KERN_ERR
--
Introduce init_memory_mapping_high() and use it on 64-bit. It walks every memory segment above 4g and creates the page tables for each range inside that range itself.
Before this patch, all page tables were on one node.
With this patch, one RED-PEN is killed.
Debug output for an 8-socket system after the patch:
[ 0.000000] initial memory mapped : 0 - 20000000
[ 0.000000] init_memory_mapping: [0x00000000000000-0x0000007f74ffff]
[ 0.000000] 0000000000 - 007f600000 page 2M
[ 0.000000] 007f600000 - 007f750000 page 4k
[ 0.000000] kernel direct mapping tables up to 7f750000 @ [0x7f74c000-0x7f74ffff]
[ 0.000000] RAMDISK: 7bc84000 - 7f745000
....
[ 0.000000] Adding active range (0, 0x10, 0x95) 0 entries of 3200 used
[ 0.000000] Adding active range (0, 0x100, 0x7f750) 1 entries of 3200 used
[ 0.000000] Adding active range (0, 0x100000, 0x1080000) 2 entries of 3200 used
[ 0.000000] Adding active range (1, 0x1080000, 0x2080000) 3 entries of 3200 used
[ 0.000000] Adding active range (2, 0x2080000, 0x3080000) 4 entries of 3200 used
[ 0.000000] Adding active range (3, 0x3080000, 0x4080000) 5 entries of 3200 used
[ 0.000000] Adding active range (4, 0x4080000, 0x5080000) 6 entries of 3200 used
[ 0.000000] Adding active range (5, 0x5080000, 0x6080000) 7 entries of 3200 used
[ 0.000000] Adding active range (6, 0x6080000, 0x7080000) 8 entries of 3200 used
[ 0.000000] Adding active range (7, 0x7080000, 0x8080000) 9 entries of 3200 used
[ 0.000000] init_memory_mapping: [0x00000100000000-0x0000107fffffff]
[ 0.000000] 0100000000 - 1080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 1080000000 @ [0x107ffbd000-0x107fffffff]
[ 0.000000] memblock_x86_reserve_range: [0x107ffc2000-0x107fffffff] PGTABLE
[ 0.000000] init_memory_mapping: [0x00001080000000-0x0000207fffffff]
[ 0.000000] 1080000000 - 2080000000 page 2M
[ 0.000000] kernel direct mapping tables up to 2080000000 @ [0x207ff7d000-0x207fffffff]
[ 0.000000] ...
Lovely, yet another interbranch conflict. This makes me very concerned. What is the delta between these? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. --
Your new x86/numa has
setup_physnodes(addr, max_addr, acpi, amd);
fake_physnodes(acpi, amd, num_nodes);
instead of
acpi_fake_nodes(nodes, num_nodes);
in numa_emulation()
Thanks
Yinghai
--
That's from f51bf3073a1 (x86, numa: Fake apicid and pxm mappings for NUMA emulation) and c1c3443c9c (x86, numa: Fake node-to-cpumask for NUMA emulation) in x86/numa. Given the subject line, I think your patchset is targeted to the same branch so I'm not sure what's concerning?
No, it's part of a much bigger patchset which doesn't have anything to do with NUMA. That's the problem. In other words, I need a sane way to merge them and resolve the conflict. -hpa --
The two patches above from x86/numa that create the conflict should be dependent only on 4e76f4e67a (x86, numa: Avoid compiling NUMA emulation functions without CONFIG_NUMA_EMU), so cherry-pick them into x86/bootmem? --
That would hurt more, I think. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. --
x86/bootmem could be based on x86/numa - the latter is stable so it's not like we'll have to undo it from under x86/bootmem. We can then send it to Linus once x86/numa is upstream. Btw., i suspect we want to use x86/memblock instead of x86/bootmem? Thanks, Ingo --
FYI, either the x86/numa or the x86/bootmem changes cause the early boot crash below. Config attached.
Thanks,
Ingo
---------------->
Linux version 2.6.37-rc8-tip-01830-g7937b8c-dirty (mingo@sirius) (gcc version 4.5.1 20100924 (Red Hat 4.5.1-4) (GCC) ) #76886 SMP Thu Dec 30 12:12:49 CET 2010
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003fff0000 (usable)
BIOS-e820: 000000003fff0000 - 000000003fff3000 (ACPI NVS)
BIOS-e820: 000000003fff3000 - 0000000040000000 (ACPI data)
BIOS-e820: 00000000e0000000 - 00000000f0000000 (reserved)
BIOS-e820: 00000000fec00000 - 0000000100000000 (reserved)
bootconsole [earlyser0] enabled
debug: ignoring loglevel setting.
Notice: NX (Execute Disable) protection cannot be enabled: non-PAE kernel!
DMI 2.3 present.
DMI: A8N-E/System Product Name, BIOS ASUS A8N-E ACPI BIOS Revision 1008 08/22/2005
e820 update range: 0000000000000000 - 0000000000010000 (usable) ==> (reserved)
e820 remove range: 00000000000a0000 - 0000000000100000 (usable)
last_pfn = 0x3fff0 max_arch_pfn = 0x100000
MTRR default type: uncachable
MTRR fixed ranges enabled:
00000-9FFFF write-back
A0000-BFFFF uncachable
C0000-C7FFF write-protect
C8000-FFFFF uncachable
MTRR variable ranges enabled:
0 base 0000000000 mask FFC0000000 write-back
1 disabled
2 disabled
3 disabled
4 disabled
5 disabled
6 disabled
7 disabled
x86 PAT enabled: cpu 0, old 0x7040600070406, new 0x7010600070106
Scan SMP from c0000000 for 1024 bytes.
Scan SMP from c009fc00 for 1024 bytes.
Scan SMP from c00f0000 for 65536 bytes.
found SMP MP-table at [c00f5680] f5680
mpc: f1400-f152c
Scanning 0 areas for low memory corruption
initial memory mapped : 0 - 02800000
init_memory_mapping: 0000000000000000-00000000373fe000
0000000000 - 0000400000 page 4k
0000400000 - 0037000000 ...
It's x86/bootmem, one of these commits:
3c417751e4f0: x86: Rename e820_table_* to pgt_buf_*
d7992231c148: x86-64: Move out cleanup higmap [_brk_end, _end) out of init_memory_mapping()
4645b6af9427: x86: Use early pre-allocated page table buffer top-down
1411e0ec3123: x86-64, numa: Put pgtable to local node memory
dbef7b56d2fc: x86-64, numa: Allocate memnodemap under max_pfn_mapped
45635ab5e41b: x86: Change get_max_mapped() to inline
1a4a678b12c8: memblock: Make find_memory_core_early() find from top-down
32e3f2b00c52: x86-64, gart: Fix allocation with memblock
4b239f458c22: x86-64, mm: Put early page table high
i'm excluding them from tip:master for now.
Thanks,
Ingo
--
and x86/numa has this build failure:
arch/x86/mm/numa_64.c: In function ‘numa_set_cpumask’:
arch/x86/mm/numa_64.c:851:14: error: ‘physnodes’ undeclared (first use in this function)
arch/x86/mm/numa_64.c:851:14: note: each undeclared identifier is reported only once for each function it appears in
config attached.
Thanks,
Ingo
Yeah, you reported this one earlier and I sent a patch four days ago to fix it (http://marc.info/?l=linux-kernel&m=129340072128297). I'll reply to this email with it again. Thanks!
"x86, numa: Fake node-to-cpumask for NUMA emulation" broke the build when
CONFIG_DEBUG_PER_CPU_MAPS is set and CONFIG_NUMA_EMU is not. This is
because it is possible to map a cpu to multiple nodes when NUMA emulation
is used; the patch required a physical node address table to find those
nodes that was only available when CONFIG_NUMA_EMU was enabled.
This extracts the common debug functionality to its own function for
CONFIG_DEBUG_PER_CPU_MAPS and uses it regardless of whether
CONFIG_NUMA_EMU is set or not.
NUMA emulation will now iterate over the set of possible nodes for each
cpu and call the new debug function whereas only the cpu's node will be
used without NUMA emulation enabled.
Reported-by: Ingo Molnar <mingo@elte.hu>
Signed-off-by: David Rientjes <rientjes@google.com>
---
arch/x86/mm/numa_64.c | 48 +++++++++++++++++++++++++++++++++++++-----------
1 files changed, 37 insertions(+), 11 deletions(-)
diff --git a/arch/x86/mm/numa_64.c b/arch/x86/mm/numa_64.c
--- a/arch/x86/mm/numa_64.c
+++ b/arch/x86/mm/numa_64.c
@@ -833,15 +833,48 @@ void __cpuinit numa_remove_cpu(int cpu)
#endif /* !CONFIG_NUMA_EMU */
#else /* CONFIG_DEBUG_PER_CPU_MAPS */
+static struct cpumask __cpuinit *debug_cpumask_set_cpu(int cpu, int enable)
+{
+ int node = early_cpu_to_node(cpu);
+ struct cpumask *mask;
+ char buf[64];
+
+ mask = node_to_cpumask_map[node];
+ if (!mask) {
+ pr_err("node_to_cpumask_map[%i] NULL\n", node);
+ dump_stack();
+ return NULL;
+ }
+
+ cpulist_scnprintf(buf, sizeof(buf), mask);
+ printk(KERN_DEBUG "%s cpu %d node %d: mask now %s\n",
+ enable ? "numa_add_cpu" : "numa_remove_cpu",
+ cpu, node, buf);
+ return mask;
+}
/*
* --------- debug versions of the numa functions ---------
*/
+#ifndef CONFIG_NUMA_EMU
+static void __cpuinit numa_set_cpumask(int cpu, int enable)
+{
+ struct cpumask *mask;
+
+ mask = debug_cpumask_set_cpu(cpu, enable);
+ if (!mask)
+ return;
+
+ if (enable)
+ cpumask_set_cpu(cpu, ...

caused by
4645b6af9427: x86: Use early pre-allocated page table buffer top-down
The 32-bit fixmap will use the pre-allocated range too, and it needs the range to be contiguous...
please drop
4645b6af9427: x86: Use early pre-allocated page table buffer top-down
3c417751e4f0: x86: Rename e820_table_* to pgt_buf_*
and I will send out a new version of
x86: Rename e820_table_* to pgt_buf_*
Thanks
Yinghai
--
Move it into a header file, to prepare for using it in other files.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/include/asm/page_types.h | 5 +++++
arch/x86/kernel/setup.c | 9 ---------
2 files changed, 5 insertions(+), 9 deletions(-)
Index: linux-2.6/arch/x86/include/asm/page_types.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/page_types.h
+++ linux-2.6/arch/x86/include/asm/page_types.h
@@ -45,6 +45,11 @@ extern int devmem_is_allowed(unsigned lo
extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;
+static inline u64 get_max_mapped(void)
+{
+ return (u64)max_pfn_mapped<<PAGE_SHIFT;
+}
+
extern unsigned long init_memory_mapping(unsigned long start,
unsigned long end);
Index: linux-2.6/arch/x86/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup.c
+++ linux-2.6/arch/x86/kernel/setup.c
@@ -680,15 +680,6 @@ static int __init parse_reservelow(char
early_param("reservelow", parse_reservelow);
-static u64 __init get_max_mapped(void)
-{
- u64 end = max_pfn_mapped;
-
- end <<= PAGE_SHIFT;
-
- return end;
-}
-
/*
* Determine if we were loaded by an EFI loader. If so, then we have also been
* passed the efi memmap, systab, etc., so we should use these data structures
--
This is broken. <asm/page_types.h> doesn't include <linux/types.h> which is required for the u64 type -- a simple compile test would have told you this. Furthermore, it seems to me that it would make more sense for this to be phys_addr_t instead of u64; would you agree? -hpa --
Or we could just use unsigned long instead: on 32-bit it will be under 4g, and on 64-bit unsigned long is already 64 bits. Thanks Yinghai --
This is true, although it seems fragile -- the whole terminology and the differences between 32 and 64 bits are just a huge headache. -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. --
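To make the type concern in the exchange above concrete, here is a minimal user-space sketch, not kernel code. It assumes a 32-bit `unsigned long` (modelled with `uint32_t`) and a `PAGE_SHIFT` of 12; the helper names `max_mapped_narrow()` and `max_mapped_wide()` are hypothetical and only illustrate the difference between shifting in the narrow type and widening first.

```c
#include <stdint.h>

#define PAGE_SHIFT 12	/* assumption: 4k pages, as on x86 */

/* Stand-in for a page frame number held in a 32-bit "unsigned long". */
typedef uint32_t pfn32_t;

/*
 * Shift in the narrow type. This is what
 * "max_pfn_mapped << PAGE_SHIFT" does on a 32-bit build:
 * any address bits at or above 4g are silently dropped.
 */
static uint32_t max_mapped_narrow(pfn32_t pfn)
{
	return pfn << PAGE_SHIFT;
}

/*
 * Widen first, then shift. This is what
 * "(u64)max_pfn_mapped << PAGE_SHIFT" does: the full
 * physical address survives even above 4g.
 */
static uint64_t max_mapped_wide(pfn32_t pfn)
{
	return (uint64_t)pfn << PAGE_SHIFT;
}
```

For the `last_pfn = 0x3fff0` value from the boot log in this thread, both helpers return 0x3fff0000, which is the case Yinghai relies on (mapped memory stays under 4g on 32-bit). For a hypothetical pfn of 0x180000 (6g), the narrow version returns 0x80000000 while the wide one returns 0x180000000, which is the fragility hpa is pointing at.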
Move it into a header file, to prepare for using it in other files.
-v2: hpa pointed out that u64 should not be used here.
Actually we could use unsigned long here, because on 32-bit it will be under 4g.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/include/asm/page_types.h | 5 +++++
arch/x86/kernel/setup.c | 9 ---------
2 files changed, 5 insertions(+), 9 deletions(-)
Index: linux-2.6/arch/x86/include/asm/page_types.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/page_types.h
+++ linux-2.6/arch/x86/include/asm/page_types.h
@@ -45,6 +45,11 @@ extern int devmem_is_allowed(unsigned lo
extern unsigned long max_low_pfn_mapped;
extern unsigned long max_pfn_mapped;
+static inline unsigned long get_max_mapped(void)
+{
+ return max_pfn_mapped<<PAGE_SHIFT;
+}
+
extern unsigned long init_memory_mapping(unsigned long start,
unsigned long end);
Index: linux-2.6/arch/x86/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup.c
+++ linux-2.6/arch/x86/kernel/setup.c
@@ -680,15 +680,6 @@ static int __init parse_reservelow(char
early_param("reservelow", parse_reservelow);
-static u64 __init get_max_mapped(void)
-{
- u64 end = max_pfn_mapped;
-
- end <<= PAGE_SHIFT;
-
- return end;
-}
-
/*
* Determine if we were loaded by an EFI loader. If so, then we have also been
* passed the efi memmap, systab, etc., so we should use these data structures
--
The buffer is now found via memblock.
Rename it to reflect its purpose.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/include/asm/init.h | 6 +++---
arch/x86/mm/init.c | 20 ++++++++++----------
arch/x86/mm/init_32.c | 8 ++++----
arch/x86/mm/init_64.c | 4 ++--
arch/x86/xen/mmu.c | 2 +-
5 files changed, 20 insertions(+), 20 deletions(-)
Index: linux-2.6/arch/x86/include/asm/init.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/init.h
+++ linux-2.6/arch/x86/include/asm/init.h
@@ -11,8 +11,8 @@ kernel_physical_mapping_init(unsigned lo
unsigned long page_size_mask);
-extern unsigned long __meminitdata e820_table_start;
-extern unsigned long __initdata e820_table_end;
-extern unsigned long __meminitdata e820_table_bottom;
+extern unsigned long __meminitdata pgt_buf_start;
+extern unsigned long __initdata pgt_buf_end;
+extern unsigned long __meminitdata pgt_buf_bottom;
#endif /* _ASM_X86_INIT_32_H */
Index: linux-2.6/arch/x86/mm/init.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init.c
+++ linux-2.6/arch/x86/mm/init.c
@@ -18,9 +18,9 @@
DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
-unsigned long __meminitdata e820_table_start;
-unsigned long __initdata e820_table_end;
-unsigned long __meminitdata e820_table_bottom;
+unsigned long __meminitdata pgt_buf_start;
+unsigned long __initdata pgt_buf_end;
+unsigned long __meminitdata pgt_buf_bottom;
int after_bootmem;
@@ -73,12 +73,12 @@ static void __init find_early_table_spac
if (base == MEMBLOCK_ERROR)
panic("Cannot find space for the kernel page tables");
- e820_table_start = (base + tables) >> PAGE_SHIFT;
- e820_table_end = e820_table_start;
- e820_table_bottom = base >> PAGE_SHIFT;
+ pgt_buf_start = (base + tables) >> PAGE_SHIFT;
+ pgt_buf_end = pgt_buf_start;
+ pgt_buf_bottom = base >> ...
The buffer is now found via memblock.
Rename it to reflect its purpose.
-v2: Ingo found that "4/6 x86: Use early pre-allocated page table buffer top-down"
causes a 32-bit crash
and needs to be dropped, so update this one accordingly.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/include/asm/init.h | 6 +++---
arch/x86/mm/init.c | 20 ++++++++++----------
arch/x86/mm/init_32.c | 8 ++++----
arch/x86/mm/init_64.c | 4 ++--
arch/x86/xen/mmu.c | 2 +-
5 files changed, 20 insertions(+), 20 deletions(-)
Index: linux-2.6/arch/x86/include/asm/init.h
===================================================================
--- linux-2.6.orig/arch/x86/include/asm/init.h
+++ linux-2.6/arch/x86/include/asm/init.h
@@ -11,8 +11,8 @@ kernel_physical_mapping_init(unsigned lo
unsigned long page_size_mask);
-extern unsigned long __initdata e820_table_start;
-extern unsigned long __meminitdata e820_table_end;
-extern unsigned long __meminitdata e820_table_top;
+extern unsigned long __initdata pgt_buf_start;
+extern unsigned long __meminitdata pgt_buf_end;
+extern unsigned long __meminitdata pgt_buf_top;
#endif /* _ASM_X86_INIT_32_H */
Index: linux-2.6/arch/x86/mm/init.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init.c
+++ linux-2.6/arch/x86/mm/init.c
@@ -18,9 +18,9 @@
DEFINE_PER_CPU(struct mmu_gather, mmu_gathers);
-unsigned long __initdata e820_table_start;
-unsigned long __meminitdata e820_table_end;
-unsigned long __meminitdata e820_table_top;
+unsigned long __initdata pgt_buf_start;
+unsigned long __meminitdata pgt_buf_end;
+unsigned long __meminitdata pgt_buf_top;
int after_bootmem;
@@ -73,12 +73,12 @@ static void __init find_early_table_spac
if (base == MEMBLOCK_ERROR)
panic("Cannot find space for the kernel page tables");
- e820_table_start = base >> PAGE_SHIFT;
- e820_table_end = e820_table_start;
- e820_table_top = ...

It is not related to init_memory_mapping(), and init_memory_mapping() is
getting bigger.
So make it a separate function and call it from reserve_brk(), which is
the point where _brk_end is finalized.
Signed-off-by: Yinghai Lu <yinghai@kernel.org>
---
arch/x86/kernel/setup.c | 24 ++++++++++++++++++++++++
arch/x86/mm/init.c | 19 -------------------
2 files changed, 24 insertions(+), 19 deletions(-)
Index: linux-2.6/arch/x86/kernel/setup.c
===================================================================
--- linux-2.6.orig/arch/x86/kernel/setup.c
+++ linux-2.6/arch/x86/kernel/setup.c
@@ -293,10 +293,32 @@ static void __init init_gbpages(void)
else
direct_gbpages = 0;
}
+
+static void __init cleanup_highmap_brk_end(void)
+{
+ pud_t *pud;
+ pmd_t *pmd;
+
+ mmu_cr4_features = read_cr4();
+
+ /*
+ * _brk_end cannot change anymore, but it and _end may be
+ * located on different 2M pages. cleanup_highmap(), however,
+ * can only consider _end when it runs, so destroy any
+ * mappings beyond _brk_end here.
+ */
+ pud = pud_offset(pgd_offset_k(_brk_end), _brk_end);
+ pmd = pmd_offset(pud, _brk_end - 1);
+ while (++pmd <= pmd_offset(pud, (unsigned long)_end - 1))
+ pmd_clear(pmd);
+}
#else
static inline void init_gbpages(void)
{
}
+static inline void cleanup_highmap_brk_end(void)
+{
+}
#endif
static void __init reserve_brk(void)
@@ -307,6 +329,8 @@ static void __init reserve_brk(void)
/* Mark brk area as locked down and no longer taking any
new allocations */
_brk_start = 0;
+
+ cleanup_highmap_brk_end();
}
#ifdef CONFIG_BLK_DEV_INITRD
Index: linux-2.6/arch/x86/mm/init.c
===================================================================
--- linux-2.6.orig/arch/x86/mm/init.c
+++ linux-2.6/arch/x86/mm/init.c
@@ -270,25 +270,6 @@ unsigned long __init_refok init_memory_m
load_cr3(swapper_pg_dir);
#endif
-#ifdef CONFIG_X86_64
- if (!after_bootmem && !start) {
- pud_t *pud;
- pmd_t ...

Please check. Those 6 patches need to be applied after the three patches that I sent out on 12/17/2010. They will put the page tables in local node memory for 64-bit NUMA. Thanks Yinghai --
Please explain what you mean with "more top to down". Not what the code does, but what is the goal of the patchset. -hpa --
For example, take a first node with 16g of RAM that is split into two parts: [0, 2g) and [4g, 18g). alloc_bootmem will always allocate from [0, 2g) until it cannot find any more space there. With the third patch, it will try [4g, 18g) first. The second patch needs to be applied before the third patch, because the old way only happened to get memory under 4g for generic bootmem allocations that need to be under 4g. The first one tries not to put the page tables for [0, 4g) under 512M. Thanks Yinghai --
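The difference between the two search directions in the example above can be sketched in plain C. This is a simplified user-space model, not the memblock implementation: `struct range`, `find_bottom_up()`, `find_top_down()` and `NO_SPACE` are hypothetical names, the ranges are assumed to be free, sorted by address and non-overlapping, and `align` is assumed to be a power of two.

```c
#include <stddef.h>
#include <stdint.h>

#define NO_SPACE UINT64_MAX	/* hypothetical stand-in for MEMBLOCK_ERROR */

struct range {
	uint64_t start;		/* inclusive */
	uint64_t end;		/* exclusive */
};

/* Old behaviour: take the lowest aligned address that fits. */
static uint64_t find_bottom_up(const struct range *r, size_t n,
			       uint64_t size, uint64_t align)
{
	for (size_t i = 0; i < n; i++) {
		/* Round the start of the range up to the alignment. */
		uint64_t addr = (r[i].start + align - 1) & ~(align - 1);

		if (addr < r[i].end && r[i].end - addr >= size)
			return addr;
	}
	return NO_SPACE;
}

/* New behaviour: walk the ranges from the highest one down. */
static uint64_t find_top_down(const struct range *r, size_t n,
			      uint64_t size, uint64_t align)
{
	for (size_t i = n; i-- > 0; ) {
		uint64_t addr;

		if (r[i].end - r[i].start < size)
			continue;
		/* Place the block at the top of the range, rounded down. */
		addr = (r[i].end - size) & ~(align - 1);
		if (addr >= r[i].start)
			return addr;
	}
	return NO_SPACE;
}
```

With the two ranges from the example, [0, 2g) and [4g, 18g), a 1M request with 4k alignment lands at address 0 in the bottom-up search and at 18g - 1M in the top-down search, so low memory is left untouched.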
The goal of this is to free up low memory for DMA and kdump, I presume? -hpa -- H. Peter Anvin, Intel Open Source Technology Center I work for Intel. I don't speak on their behalf. --
Yes. Otherwise, if we put the page tables around 512M, we have no chance of allocating 512M for kdump under 896M. If we put the page tables near 2g (assuming [2g, 4g) is for MMIO), we can make it happen. The later 6 patches will try to put the page tables on the local node. Thanks --
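The kdump argument is simple interval arithmetic, which the following sketch spells out. It is only an illustration under stated assumptions: a single reservation is modelled, and the lower bound of usable memory (16M in the usage note below) is a made-up figure standing in for whatever the kernel image and early reservations occupy. `fits_below()` and `MB()` are hypothetical helpers, not kernel APIs.

```c
#include <stdbool.h>
#include <stdint.h>

#define MB(x) ((uint64_t)(x) << 20)

/*
 * Can a contiguous block of `want` bytes be placed inside [lo, limit)
 * when [resv_start, resv_end) is already taken? Only one reservation
 * is modelled here.
 */
static bool fits_below(uint64_t lo, uint64_t limit,
		       uint64_t resv_start, uint64_t resv_end,
		       uint64_t want)
{
	uint64_t below, above;

	/* Reservation entirely outside the window: the window is free. */
	if (resv_start >= limit || resv_end <= lo)
		return limit - lo >= want;

	/* Free space on either side of the reservation. */
	below = resv_start > lo ? resv_start - lo : 0;
	above = resv_end < limit ? limit - resv_end : 0;

	return below >= want || above >= want;
}
```

Assuming usable memory starts at 16M, a page table block at [512M, 513M) leaves 496M below it and 383M above it within the 896M limit, so `fits_below(MB(16), MB(896), MB(512), MB(513), MB(512))` is false. Moving the same block to just under 2g, for example [2047M, 2048M), leaves the whole 880M window free and the 512M request fits.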
