LKML Archive mirror
 help / color / mirror / Atom feed
* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
       [not found] <200804150623.m3F6NInZ014509@imap1.linux-foundation.org>
@ 2008-04-15  7:02 ` Yinghai Lu
  2008-04-15  7:11 ` Ingo Molnar
  1 sibling, 0 replies; 11+ messages in thread
From: Yinghai Lu @ 2008-04-15  7:02 UTC (permalink / raw
  To: akpm; +Cc: mm-commits, hannes, ak, clameter, kamezawa.hiroyu, mingo, y-goto,
	LKML

On Mon, Apr 14, 2008 at 11:23 PM,  <akpm@linux-foundation.org> wrote:
>
>  The patch titled
>      bootmem: node-setup agnostic free_bootmem()
>  has been added to the -mm tree.  Its filename is
>      bootmem-node-setup-agnostic-free_bootmem.patch
>
>  Before you just go and hit "reply", please:
>    a) Consider who else should be cc'ed
>    b) Prefer to cc a suitable mailing list as well
>    c) Ideally: find the original patch on the mailing list and do a
>       reply-to-all to that, adding suitable additional cc's
>
>  *** Remember to use Documentation/SubmitChecklist when testing your code ***
>
>  See http://www.zip.com.au/~akpm/linux/patches/stuff/added-to-mm.txt to find
>  out what to do about this
>
>  The current -mm tree may be found at http://userweb.kernel.org/~akpm/mmotm/
>
>  ------------------------------------------------------
>  Subject: bootmem: node-setup agnostic free_bootmem()
>  From: Johannes Weiner <hannes@saeurebad.de>
>
>  Make free_bootmem() look up the node holding the specified address range which
>  lets it work transparently on single-node and multi-node configurations.
>
>  Signed-off-by: Johannes Weiner <hannes@saeurebad.de>
>  Cc: Yinghai Lu <yhlu.kernel@gmail.com>
>  Acked-by: Andi Kleen <ak@suse.de>
>  Cc: Yasunori Goto <y-goto@jp.fujitsu.com>
>  Cc: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
>  Cc: Ingo Molnar <mingo@elte.hu>
>  Cc: Christoph Lameter <clameter@sgi.com>
>  Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>  ---
>
>   mm/bootmem.c |   10 +++++++++-
>   1 file changed, 9 insertions(+), 1 deletion(-)
>
>  diff -puN mm/bootmem.c~bootmem-node-setup-agnostic-free_bootmem mm/bootmem.c
>  --- a/mm/bootmem.c~bootmem-node-setup-agnostic-free_bootmem
>  +++ a/mm/bootmem.c
>  @@ -421,7 +421,15 @@ int __init reserve_bootmem(unsigned long
>
>   void __init free_bootmem(unsigned long addr, unsigned long size)
>   {
>  -       free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
>  +       bootmem_data_t *bdata;
>  +
>  +       list_for_each_entry (bdata, &bdata_list, list) {
>  +               if (addr < bdata->node_boot_start)
>  +                       continue;
>  +               free_bootmem_core(bdata, addr, size);
>  +               return;
>  +       }
>  +       BUG();
>   }

NAK,
it doesn't solve the cross nodes bootmem that is reserved via
reserve_early. like ramdisk free in x86/setup_64.c
...
                        free_bootmem(ramdisk_image, ramdisk_size);
                        printk(KERN_ERR "initrd extends beyond end of memory "
                               "(0x%08lx > 0x%08lx)\ndisabling initrd\n",
                               ramdisk_end, end_of_mem);
                        initrd_start = 0;
..
YH

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
       [not found] <200804150623.m3F6NInZ014509@imap1.linux-foundation.org>
  2008-04-15  7:02 ` + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree Yinghai Lu
@ 2008-04-15  7:11 ` Ingo Molnar
  2008-04-15 12:51   ` Johannes Weiner
  1 sibling, 1 reply; 11+ messages in thread
From: Ingo Molnar @ 2008-04-15  7:11 UTC (permalink / raw
  To: akpm
  Cc: mm-commits, hannes, ak, clameter, kamezawa.hiroyu, y-goto,
	yhlu.kernel, linux-kernel


* akpm@linux-foundation.org <akpm@linux-foundation.org> wrote:

> Subject: bootmem: node-setup agnostic free_bootmem()
> From: Johannes Weiner <hannes@saeurebad.de>
> 
> Make free_bootmem() look up the node holding the specified address 
> range which lets it work transparently on single-node and multi-node 
> configurations.

this patch does not fix the bug Yinghai's (now dropped) patches solved: 
reserve_early() allocations. So NAK until the full problem has been 
sorted out ...

	Ingo

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15  7:11 ` Ingo Molnar
@ 2008-04-15 12:51   ` Johannes Weiner
  2008-04-15 13:41     ` Roel Kluin
  2008-04-15 18:57     ` Yinghai Lu
  0 siblings, 2 replies; 11+ messages in thread
From: Johannes Weiner @ 2008-04-15 12:51 UTC (permalink / raw
  To: Ingo Molnar
  Cc: akpm, mm-commits, ak, clameter, kamezawa.hiroyu, y-goto,
	yhlu.kernel, linux-kernel

Hi Ingo,

Ingo Molnar <mingo@elte.hu> writes:

> * akpm@linux-foundation.org <akpm@linux-foundation.org> wrote:
>
>> Subject: bootmem: node-setup agnostic free_bootmem()
>> From: Johannes Weiner <hannes@saeurebad.de>
>> 
>> Make free_bootmem() look up the node holding the specified address 
>> range which lets it work transparently on single-node and multi-node 
>> configurations.
>
> this patch does not fix the bug Yinghai's (now dropped) patches solved: 
> reserve_early() allocations. So NAK until the full problem has been 
> sorted out ...

Okay, NAK on -mm and -x86 for sure.  The patch was meant for mainline
where there is no need for free_bootmem() going across nodes, right?

But I still object to the way Yinghai implemented it.
free_bootmem_core() should not be twisted like this.

How about the following (untested, even uncompiled, but you should get
the idea) proposal which would replace the patch discussed in this
thread:

--- tree-linus.orig/mm/bootmem.c
+++ tree-linus/mm/bootmem.c
@@ -421,7 +421,25 @@ int __init reserve_bootmem(unsigned long
 
 void __init free_bootmem(unsigned long addr, unsigned long size)
 {
-	free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
+	bootmem_data_t *bdata;
+
+	list_for_each_entry(bdata, &bdata_list, list) {
+		unsigned long remainder = 0;
+
+		if (addr < bdata->node_boot_start)
+			continue;
+
+		if (PFN_DOWN(addr + size) > bdata->node_low_pfn)
+			remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
+
+		size -= PFN_PHYS(remainder);
+		free_bootmem_core(bdata, addr, size)
+
+		if (!remainder)
+			break;
+
+		addr = PFN_PHYS(bdata->node_low_pfn + 1);
+	}
 }
 
 unsigned long __init free_all_bootmem(void)
---
	Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 12:51   ` Johannes Weiner
@ 2008-04-15 13:41     ` Roel Kluin
  2008-04-15 14:01       ` Johannes Weiner
  2008-04-15 18:57     ` Yinghai Lu
  1 sibling, 1 reply; 11+ messages in thread
From: Roel Kluin @ 2008-04-15 13:41 UTC (permalink / raw
  To: Johannes Weiner
  Cc: Ingo Molnar, akpm, mm-commits, ak, clameter, kamezawa.hiroyu,
	y-goto, yhlu.kernel, linux-kernel

Johannes Weiner wrote:

> How about the following (untested, even uncompiled, but you should get
> the idea) proposal which would replace the patch discussed in this
> thread:
> 
> --- tree-linus.orig/mm/bootmem.c
> +++ tree-linus/mm/bootmem.c
> @@ -421,7 +421,25 @@ int __init reserve_bootmem(unsigned long
>  
>  void __init free_bootmem(unsigned long addr, unsigned long size)
>  {
> -	free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
> +	bootmem_data_t *bdata;
> +
> +	list_for_each_entry(bdata, &bdata_list, list) {
> +		unsigned long remainder = 0;
> +
> +		if (addr < bdata->node_boot_start)
> +			continue;
> +
> +		if (PFN_DOWN(addr + size) > bdata->node_low_pfn)
> +			remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
> +
> +		size -= PFN_PHYS(remainder);
> +		free_bootmem_core(bdata, addr, size)
Missing semicolon here -----------------------------^

> +
> +		if (!remainder)
> +			break;
> +
> +		addr = PFN_PHYS(bdata->node_low_pfn + 1);
> +	}
>  }

This should do the same as (untested):

void __init free_bootmem(unsigned long addr, unsigned long size)
{
	bootmem_data_t *bdata;

	list_for_each_entry(bdata, &bdata_list, list) {
		unsigned long remainder;

		if (addr < bdata->node_boot_start)
			continue;

		remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;

		if (remainder <= 0) {
			free_bootmem_core(bdata, addr, size);
			return;
		}

		size -= PFN_PHYS(remainder);
		free_bootmem_core(bdata, addr, size);
		addr = PFN_PHYS(bdata->node_low_pfn + 1);
	}
}



^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 13:41     ` Roel Kluin
@ 2008-04-15 14:01       ` Johannes Weiner
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Weiner @ 2008-04-15 14:01 UTC (permalink / raw
  To: Roel Kluin
  Cc: Ingo Molnar, akpm, mm-commits, ak, clameter, kamezawa.hiroyu,
	y-goto, yhlu.kernel, linux-kernel

Hi,

Roel Kluin <12o3l@tiscali.nl> writes:

> Johannes Weiner wrote:
>
>> How about the following (untested, even uncompiled, but you should get
>> the idea) proposal which would replace the patch discussed in this
>> thread:
>> 
>> --- tree-linus.orig/mm/bootmem.c
>> +++ tree-linus/mm/bootmem.c
>> @@ -421,7 +421,25 @@ int __init reserve_bootmem(unsigned long
>>  
>>  void __init free_bootmem(unsigned long addr, unsigned long size)
>>  {
>> -	free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
>> +	bootmem_data_t *bdata;
>> +
>> +	list_for_each_entry(bdata, &bdata_list, list) {
>> +		unsigned long remainder = 0;
>> +
>> +		if (addr < bdata->node_boot_start)
>> +			continue;
>> +
>> +		if (PFN_DOWN(addr + size) > bdata->node_low_pfn)
>> +			remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
>> +
>> +		size -= PFN_PHYS(remainder);
>> +		free_bootmem_core(bdata, addr, size)
> Missing semicolon here -----------------------------^

Yes.  Intentional compiler trap for untested code *cough*

>> +
>> +		if (!remainder)
>> +			break;
>> +
>> +		addr = PFN_PHYS(bdata->node_low_pfn + 1);
>> +	}
>>  }
>
> This should do the same as (untested):
>
> void __init free_bootmem(unsigned long addr, unsigned long size)
> {
> 	bootmem_data_t *bdata;
>
> 	list_for_each_entry(bdata, &bdata_list, list) {
> 		unsigned long remainder;
                ^^^^^^^^
>
> 		if (addr < bdata->node_boot_start)
> 			continue;
>
> 		remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
>
> 		if (remainder <= 0) {

Boom.

> 			free_bootmem_core(bdata, addr, size);
> 			return;
> 		}
>
> 		size -= PFN_PHYS(remainder);
> 		free_bootmem_core(bdata, addr, size);
> 		addr = PFN_PHYS(bdata->node_low_pfn + 1);
> 	}
> }

	Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 12:51   ` Johannes Weiner
  2008-04-15 13:41     ` Roel Kluin
@ 2008-04-15 18:57     ` Yinghai Lu
  2008-04-15 19:55       ` Johannes Weiner
  1 sibling, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2008-04-15 18:57 UTC (permalink / raw
  To: Johannes Weiner
  Cc: Ingo Molnar, akpm, mm-commits, ak, clameter, kamezawa.hiroyu,
	y-goto, linux-kernel

On Tue, Apr 15, 2008 at 5:51 AM, Johannes Weiner <hannes@saeurebad.de> wrote:
> Hi Ingo,
>
>
>
>  Ingo Molnar <mingo@elte.hu> writes:
>
>  > * akpm@linux-foundation.org <akpm@linux-foundation.org> wrote:
>  >
>  >> Subject: bootmem: node-setup agnostic free_bootmem()
>  >> From: Johannes Weiner <hannes@saeurebad.de>
>  >>
>  >> Make free_bootmem() look up the node holding the specified address
>  >> range which lets it work transparently on single-node and multi-node
>  >> configurations.
>  >
>  > this patch does not fix the bug Yinghai's (now dropped) patches solved:
>  > reserve_early() allocations. So NAK until the full problem has been
>  > sorted out ...
>
>  Okay, NAK on -mm and -x86 for sure.  The patch was meant for mainline
>  where there is no need for free_bootmem() going across nodes, right?
>
>  But I still object to the way Yinghai implemented it.
>  free_bootmem_core() should not be twisted like this.
>
>  How about the following (untested, even uncompiled, but you should get
>  the idea) proposal which would replace the patch discussed in this
>  thread:
>
>  --- tree-linus.orig/mm/bootmem.c
>  +++ tree-linus/mm/bootmem.c
>  @@ -421,7 +421,25 @@ int __init reserve_bootmem(unsigned long
>
>
>   void __init free_bootmem(unsigned long addr, unsigned long size)
>   {
>  -       free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
>  +       bootmem_data_t *bdata;
>  +
>  +       list_for_each_entry(bdata, &bdata_list, list) {
>  +               unsigned long remainder = 0;
>
> +
>  +               if (addr < bdata->node_boot_start)
>  +                       continue;
>  +
>  +               if (PFN_DOWN(addr + size) > bdata->node_low_pfn)
>  +                       remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
>  +
>  +               size -= PFN_PHYS(remainder);
>
> +               free_bootmem_core(bdata, addr, size)
>  +
>  +               if (!remainder)
>  +                       break;
>  +
>  +               addr = PFN_PHYS(bdata->node_low_pfn + 1);
>  +       }
>
>  }
>
>   unsigned long __init free_all_bootmem(void)

how about
1. bdata is not sorted?
2. intel cross node box: node0: 0g-2g, 4g-6g, node1: 2g-4g, 6g-8g. i
don't think they have two bdata struct for every node.

YH

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 18:57     ` Yinghai Lu
@ 2008-04-15 19:55       ` Johannes Weiner
  2008-04-15 20:03         ` Yinghai Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Weiner @ 2008-04-15 19:55 UTC (permalink / raw
  To: Yinghai Lu
  Cc: Ingo Molnar, akpm, mm-commits, ak, clameter, kamezawa.hiroyu,
	y-goto, linux-kernel

Hi,

"Yinghai Lu" <yhlu.kernel@gmail.com> writes:

> On Tue, Apr 15, 2008 at 5:51 AM, Johannes Weiner <hannes@saeurebad.de> wrote:
>> Hi Ingo,
>>
>>
>>
>>  Ingo Molnar <mingo@elte.hu> writes:
>>
>>  > * akpm@linux-foundation.org <akpm@linux-foundation.org> wrote:
>>  >
>>  >> Subject: bootmem: node-setup agnostic free_bootmem()
>>  >> From: Johannes Weiner <hannes@saeurebad.de>
>>  >>
>>  >> Make free_bootmem() look up the node holding the specified address
>>  >> range which lets it work transparently on single-node and multi-node
>>  >> configurations.
>>  >
>>  > this patch does not fix the bug Yinghai's (now dropped) patches solved:
>>  > reserve_early() allocations. So NAK until the full problem has been
>>  > sorted out ...
>>
>>  Okay, NAK on -mm and -x86 for sure.  The patch was meant for mainline
>>  where there is no need for free_bootmem() going across nodes, right?
>>
>>  But I still object to the way Yinghai implemented it.
>>  free_bootmem_core() should not be twisted like this.
>>
>>  How about the following (untested, even uncompiled, but you should get
>>  the idea) proposal which would replace the patch discussed in this
>>  thread:
>>
>>  --- tree-linus.orig/mm/bootmem.c
>>  +++ tree-linus/mm/bootmem.c
>>  @@ -421,7 +421,25 @@ int __init reserve_bootmem(unsigned long
>>
>>
>>   void __init free_bootmem(unsigned long addr, unsigned long size)
>>   {
>>  -       free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
>>  +       bootmem_data_t *bdata;
>>  +
>>  +       list_for_each_entry(bdata, &bdata_list, list) {
>>  +               unsigned long remainder = 0;
>>
>> +
>>  +               if (addr < bdata->node_boot_start)
>>  +                       continue;
>>  +
>>  +               if (PFN_DOWN(addr + size) > bdata->node_low_pfn)
>>  +                       remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
>>  +
>>  +               size -= PFN_PHYS(remainder);
>>
>> +               free_bootmem_core(bdata, addr, size)
>>  +
>>  +               if (!remainder)
>>  +                       break;
>>  +
>>  +               addr = PFN_PHYS(bdata->node_low_pfn + 1);
>>  +       }
>>
>>  }
>>
>>   unsigned long __init free_all_bootmem(void)
>
> how about
> 1. bdata is not sorted?

They are kept in a sorted list.  How could they be unsorted?

> 2. intel cross node box: node0: 0g-2g, 4g-6g, node1: 2g-4g, 6g-8g. i
> don't think they have two bdata struct for every node.

How do the bdata structures represent this setup right now?  Are you
sure that there is not a node descriptor for every contiguous region?

	Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 19:55       ` Johannes Weiner
@ 2008-04-15 20:03         ` Yinghai Lu
  2008-04-15 21:14           ` Johannes Weiner
  0 siblings, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2008-04-15 20:03 UTC (permalink / raw
  To: Johannes Weiner
  Cc: Ingo Molnar, akpm, mm-commits, ak, clameter, kamezawa.hiroyu,
	y-goto, linux-kernel

On Tue, Apr 15, 2008 at 12:55 PM, Johannes Weiner <hannes@saeurebad.de> wrote:
> Hi,
>
>
>
>  "Yinghai Lu" <yhlu.kernel@gmail.com> writes:
>
>  > On Tue, Apr 15, 2008 at 5:51 AM, Johannes Weiner <hannes@saeurebad.de> wrote:
>  >> Hi Ingo,
>  >>
>  >>
>  >>
>  >>  Ingo Molnar <mingo@elte.hu> writes:
>  >>
>  >>  > * akpm@linux-foundation.org <akpm@linux-foundation.org> wrote:
>  >>  >
>  >>  >> Subject: bootmem: node-setup agnostic free_bootmem()
>  >>  >> From: Johannes Weiner <hannes@saeurebad.de>
>  >>  >>
>  >>  >> Make free_bootmem() look up the node holding the specified address
>  >>  >> range which lets it work transparently on single-node and multi-node
>  >>  >> configurations.
>  >>  >
>  >>  > this patch does not fix the bug Yinghai's (now dropped) patches solved:
>  >>  > reserve_early() allocations. So NAK until the full problem has been
>  >>  > sorted out ...
>  >>
>  >>  Okay, NAK on -mm and -x86 for sure.  The patch was meant for mainline
>  >>  where there is no need for free_bootmem() going across nodes, right?
>  >>
>  >>  But I still object to the way Yinghai implemented it.
>  >>  free_bootmem_core() should not be twisted like this.
>  >>
>  >>  How about the following (untested, even uncompiled, but you should get
>  >>  the idea) proposal which would replace the patch discussed in this
>  >>  thread:
>  >>
>  >>  --- tree-linus.orig/mm/bootmem.c
>  >>  +++ tree-linus/mm/bootmem.c
>  >>  @@ -421,7 +421,25 @@ int __init reserve_bootmem(unsigned long
>  >>
>  >>
>  >>   void __init free_bootmem(unsigned long addr, unsigned long size)
>  >>   {
>  >>  -       free_bootmem_core(NODE_DATA(0)->bdata, addr, size);
>  >>  +       bootmem_data_t *bdata;
>  >>  +
>  >>  +       list_for_each_entry(bdata, &bdata_list, list) {
>  >>  +               unsigned long remainder = 0;
>  >>
>  >> +
>  >>  +               if (addr < bdata->node_boot_start)
>  >>  +                       continue;
>  >>  +
>  >>  +               if (PFN_DOWN(addr + size) > bdata->node_low_pfn)
>  >>  +                       remainder = PFN_DOWN(addr + size) - bdata->node_low_pfn;
>  >>  +
>  >>  +               size -= PFN_PHYS(remainder);
>  >>
>  >> +               free_bootmem_core(bdata, addr, size)
>  >>  +
>  >>  +               if (!remainder)
>  >>  +                       break;
>  >>  +
>  >>  +               addr = PFN_PHYS(bdata->node_low_pfn + 1);
>  >>  +       }
>  >>
>  >>  }
>  >>
>  >>   unsigned long __init free_all_bootmem(void)
>  >
>  > how about
>  > 1. bdata is not sorted?
>
>  They are kept in a sorted list.  How could they be unsorted?
>
>
>  > 2. intel cross node box: node0: 0g-2g, 4g-6g, node1: 2g-4g, 6g-8g. i
>  > don't think they have two bdata struct for every node.
>
>  How do the bdata structures represent this setup right now?  Are you
>  sure that there is not a node descriptor for every contiguous region?

http://lkml.org/lkml/2008/3/25/233

Subject	[patch] srat, x86_64: Add support for nodes spanning other nodes

For example, If the physical address layout on a two node system with 8 GB
memory is something like:
node 0: 0-2GB, 4-6GB
node 1: 2-4GB, 6-8GB

Current kernels fail to boot/detect this NUMA topology.

ACPI SRAT tables can expose such a topology which needs to be supported.

Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

YH

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 20:03         ` Yinghai Lu
@ 2008-04-15 21:14           ` Johannes Weiner
  2008-04-15 21:19             ` Yinghai Lu
  0 siblings, 1 reply; 11+ messages in thread
From: Johannes Weiner @ 2008-04-15 21:14 UTC (permalink / raw
  To: Yinghai Lu
  Cc: Ingo Molnar, akpm, mm-commits, ak, clameter, kamezawa.hiroyu,
	y-goto, linux-kernel

Hi,

"Yinghai Lu" <yhlu.kernel@gmail.com> writes:

>>  > 2. intel cross node box: node0: 0g-2g, 4g-6g, node1: 2g-4g, 6g-8g. i
>>  > don't think they have two bdata struct for every node.
>>
>>  How do the bdata structures represent this setup right now?  Are you
>>  sure that there is not a node descriptor for every contiguous region?
>
> http://lkml.org/lkml/2008/3/25/233
>
> Subject	[patch] srat, x86_64: Add support for nodes spanning other nodes
>
> For example, If the physical address layout on a two node system with 8 GB
> memory is something like:
> node 0: 0-2GB, 4-6GB
> node 1: 2-4GB, 6-8GB
>
> Current kernels fail to boot/detect this NUMA topology.
>
> ACPI SRAT tables can expose such a topology which needs to be supported.
>
> Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>

As I understood the code (more guessing than understanding), it breaks
down these physical nodes into contiguous logical memory blocks which
then get represented by having a node descriptor for each of them.  Can
you confirm that?

	Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 21:14           ` Johannes Weiner
@ 2008-04-15 21:19             ` Yinghai Lu
  2008-04-15 21:38               ` Johannes Weiner
  0 siblings, 1 reply; 11+ messages in thread
From: Yinghai Lu @ 2008-04-15 21:19 UTC (permalink / raw
  To: Johannes Weiner, suresh.b.siddha
  Cc: Ingo Molnar, akpm, mm-commits, ak, clameter, kamezawa.hiroyu,
	y-goto, linux-kernel

On Tue, Apr 15, 2008 at 2:14 PM, Johannes Weiner <hannes@saeurebad.de> wrote:
> Hi,
>
>  "Yinghai Lu" <yhlu.kernel@gmail.com> writes:
>
>
> >>  > 2. intel cross node box: node0: 0g-2g, 4g-6g, node1: 2g-4g, 6g-8g. i
>  >>  > don't think they have two bdata struct for every node.
>  >>
>  >>  How do the bdata structures represent this setup right now?  Are you
>  >>  sure that there is not a node descriptor for every contiguous region?
>  >
>  > http://lkml.org/lkml/2008/3/25/233
>  >
>  > Subject       [patch] srat, x86_64: Add support for nodes spanning other nodes
>  >
>  > For example, If the physical address layout on a two node system with 8 GB
>  > memory is something like:
>  > node 0: 0-2GB, 4-6GB
>  > node 1: 2-4GB, 6-8GB
>  >
>  > Current kernels fail to boot/detect this NUMA topology.
>  >
>  > ACPI SRAT tables can expose such a topology which needs to be supported.
>  >
>  > Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
>
>  As I understood the code (more guessing than understanding), it breaks
>  down these physical nodes into contiguous logical memory blocks which
>  then get represented by having a node descriptor for each of them.  Can
>  you confirm that?

Not sure, on x86_64 one node should have one bdata only.
execpt suresh update that to make one node have two bdata.

YH

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree
  2008-04-15 21:19             ` Yinghai Lu
@ 2008-04-15 21:38               ` Johannes Weiner
  0 siblings, 0 replies; 11+ messages in thread
From: Johannes Weiner @ 2008-04-15 21:38 UTC (permalink / raw
  To: Yinghai Lu
  Cc: suresh.b.siddha, Ingo Molnar, akpm, mm-commits, andi, clameter,
	kamezawa.hiroyu, y-goto, linux-kernel

Hi,

"Yinghai Lu" <yhlu.kernel@gmail.com> writes:

> On Tue, Apr 15, 2008 at 2:14 PM, Johannes Weiner <hannes@saeurebad.de> wrote:
>> Hi,
>>
>>  "Yinghai Lu" <yhlu.kernel@gmail.com> writes:
>>
>>
>> >>  > 2. intel cross node box: node0: 0g-2g, 4g-6g, node1: 2g-4g, 6g-8g. i
>>  >>  > don't think they have two bdata struct for every node.
>>  >>
>>  >>  How do the bdata structures represent this setup right now?  Are you
>>  >>  sure that there is not a node descriptor for every contiguous region?
>>  >
>>  > http://lkml.org/lkml/2008/3/25/233
>>  >
>>  > Subject       [patch] srat, x86_64: Add support for nodes spanning other nodes
>>  >
>>  > For example, If the physical address layout on a two node system with 8 GB
>>  > memory is something like:
>>  > node 0: 0-2GB, 4-6GB
>>  > node 1: 2-4GB, 6-8GB
>>  >
>>  > Current kernels fail to boot/detect this NUMA topology.
>>  >
>>  > ACPI SRAT tables can expose such a topology which needs to be supported.
>>  >
>>  > Signed-off-by: Suresh Siddha <suresh.b.siddha@intel.com>
>>
>>  As I understood the code (more guessing than understanding), it breaks
>>  down these physical nodes into contiguous logical memory blocks which
>>  then get represented by having a node descriptor for each of them.  Can
>>  you confirm that?
>
> Not sure, on x86_64 one node should have one bdata only.
> execpt suresh update that to make one node have two bdata.

We are just guessing around here.

My understanding is that right now we have contigous physical nodes
represented by a node descriptor each and with Suresh's patch we have
each contigous block represented by its own node descriptor.

So in this setup

	node 0: 0-2GB, 4-6GB
	node 1: 2-4GB, 6-8GB

we have 4 node descriptors [0-2], [2-4], [4-6], [6-8]?

Can someone please ack/nak this?

	Hannes

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2008-04-15 21:38 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <200804150623.m3F6NInZ014509@imap1.linux-foundation.org>
2008-04-15  7:02 ` + bootmem-node-setup-agnostic-free_bootmem.patch added to -mm tree Yinghai Lu
2008-04-15  7:11 ` Ingo Molnar
2008-04-15 12:51   ` Johannes Weiner
2008-04-15 13:41     ` Roel Kluin
2008-04-15 14:01       ` Johannes Weiner
2008-04-15 18:57     ` Yinghai Lu
2008-04-15 19:55       ` Johannes Weiner
2008-04-15 20:03         ` Yinghai Lu
2008-04-15 21:14           ` Johannes Weiner
2008-04-15 21:19             ` Yinghai Lu
2008-04-15 21:38               ` Johannes Weiner

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).