From: Alexander Graf <graf@amazon.com>
To: Rob Herring <robh@kernel.org>
Cc: <linux-kernel@vger.kernel.org>,
	<linux-trace-kernel@vger.kernel.org>, <linux-mm@kvack.org>,
	<devicetree@vger.kernel.org>,
	<linux-arm-kernel@lists.infradead.org>,
	<kexec@lists.infradead.org>, <linux-doc@vger.kernel.org>,
	<x86@kernel.org>, Eric Biederman <ebiederm@xmission.com>,
	"H. Peter Anvin" <hpa@zytor.com>,
	Andy Lutomirski <luto@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Steven Rostedt <rostedt@goodmis.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	"Mark Rutland" <mark.rutland@arm.com>,
	Tom Lendacky <thomas.lendacky@amd.com>,
	Ashish Kalra <ashish.kalra@amd.com>,
	James Gowans <jgowans@amazon.com>,
	Stanislav Kinsburskii <skinsburskii@linux.microsoft.com>,
	<arnd@arndb.de>, <pbonzini@redhat.com>,
	<madvenka@linux.microsoft.com>,
	Anthony Yznaga <anthony.yznaga@oracle.com>,
	Usama Arif <usama.arif@bytedance.com>,
	"David Woodhouse" <dwmw@amazon.co.uk>,
	Benjamin Herrenschmidt <benh@kernel.crashing.org>
Subject: Re: [PATCH 06/15] arm64: Add KHO support
Date: Tue, 19 Dec 2023 00:01:02 +0100
Message-ID: <d843596e-0def-439b-966a-a0f10a1b7f6d@amazon.com>
In-Reply-To: <20231214223604.GA1045434-robh@kernel.org>

Hey Rob!

On 14.12.23 23:36, Rob Herring wrote:
> On Wed, Dec 13, 2023 at 12:04:43AM +0000, Alexander Graf wrote:
>> We now have all bits in place to support KHO kexecs. This patch adds
>> awareness of KHO in the kexec file as well as boot path for arm64 and
>> adds the respective kconfig option to the architecture so that it can
>> use KHO successfully.
>>
>> Signed-off-by: Alexander Graf <graf@amazon.com>
>> ---
>>   arch/arm64/Kconfig        | 12 ++++++++++++
>>   arch/arm64/kernel/setup.c |  2 ++
>>   arch/arm64/mm/init.c      |  8 ++++++++
>>   drivers/of/fdt.c          | 41 +++++++++++++++++++++++++++++++++++++++
>>   drivers/of/kexec.c        | 36 ++++++++++++++++++++++++++++++++++
>>   5 files changed, 99 insertions(+)
>>
>> diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
>> index 7b071a00425d..1ba338ce7598 100644
>> --- a/arch/arm64/Kconfig
>> +++ b/arch/arm64/Kconfig
>> @@ -1501,6 +1501,18 @@ config ARCH_SUPPORTS_CRASH_DUMP
>>   config ARCH_HAS_GENERIC_CRASHKERNEL_RESERVATION
>>        def_bool CRASH_CORE
>>
>> +config KEXEC_KHO
>> +     bool "kexec handover"
>> +     depends on KEXEC
>> +     select MEMBLOCK_SCRATCH
>> +     select LIBFDT
>> +     select CMA
>> +     help
>> +       Allow kexec to hand over state across kernels by generating and
>> +       passing additional metadata to the target kernel. This is useful
>> +       to keep data or state alive across the kexec. For this to work,
>> +       both source and target kernels need to have this option enabled.
> Why do we have the same kconfig entry twice? Here and x86.


This was how the kexec config options were done when I wrote the patches
originally. Since then, it looks like Eric DeVolder has cleaned things up
quite nicely. I'll adapt to the new way.


>
>> +
>>   config TRANS_TABLE
>>        def_bool y
>>        depends on HIBERNATION || KEXEC_CORE
>> diff --git a/arch/arm64/kernel/setup.c b/arch/arm64/kernel/setup.c
>> index 417a8a86b2db..8035b673d96d 100644
>> --- a/arch/arm64/kernel/setup.c
>> +++ b/arch/arm64/kernel/setup.c
>> @@ -346,6 +346,8 @@ void __init __no_sanitize_address setup_arch(char **cmdline_p)
>>
>>        paging_init();
>>
>> +     kho_reserve_mem();
>> +
>>        acpi_table_upgrade();
>>
>>        /* Parse the ACPI tables for possible boot-time configuration */
>> diff --git a/arch/arm64/mm/init.c b/arch/arm64/mm/init.c
>> index 74c1db8ce271..254d82f3383a 100644
>> --- a/arch/arm64/mm/init.c
>> +++ b/arch/arm64/mm/init.c
>> @@ -358,6 +358,8 @@ void __init bootmem_init(void)
>>         */
>>        arch_reserve_crashkernel();
>>
>> +     kho_reserve();
>> +
> reserve what? It is not obvious what the difference between
> kho_reserve_mem() and kho_reserve() are.


Yeah, I agree. I was struggling to find good names for them. What they 
do is:

kho_reserve() - Reserve CMA memory for later kexec. We use this memory 
region as scratch memory later.
kho_reserve_mem() - Post-KHO. Creates memblock reservations for the
memory that the pre-KHO kernel handed over.

For v2, I'll change them to kho_reserve_scratch() and 
kho_reserve_previous_mem() unless you have better ideas :)


>
>>        memblock_dump_all();
>>   }
>>
>> @@ -386,6 +388,12 @@ void __init mem_init(void)
>>        /* this will put all unused low memory onto the freelists */
>>        memblock_free_all();
>>
>> +     /*
>> +      * Now that all KHO pages are marked as reserved, let's flip them back
>> +      * to normal pages with accurate refcount.
>> +      */
>> +     kho_populate_refcount();
>> +
>>        /*
>>         * Check boundaries twice: Some fundamental inconsistencies can be
>>         * detected at build time already.
>> diff --git a/drivers/of/fdt.c b/drivers/of/fdt.c
>> index bf502ba8da95..af95139351ed 100644
>> --- a/drivers/of/fdt.c
>> +++ b/drivers/of/fdt.c
>> @@ -1006,6 +1006,44 @@ void __init early_init_dt_check_for_usable_mem_range(void)
>>                memblock_add(rgn[i].base, rgn[i].size);
>>   }
>>
>> +/**
>> + * early_init_dt_check_kho - Decode info required for kexec handover from DT
>> + */
>> +void __init early_init_dt_check_kho(void)
>> +{
>> +#ifdef CONFIG_KEXEC_KHO
> if (!IS_ENABLED(CONFIG_KEXEC_KHO))
>    return;
>
> You'll need a kho_populate() stub.


Always happy to remove #ifdefs :)
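
For reference, a rough userspace sketch of the pattern you suggest, not
the actual v2 code. KHO_ENABLED stands in for the kernel's
IS_ENABLED(CONFIG_KEXEC_KHO), and the kho_populate() signature, the
call counter and the placeholder addresses are assumptions made up for
this illustration:

```c
#include <stdint.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_KEXEC_KHO). */
#ifdef CONFIG_KEXEC_KHO
#define KHO_ENABLED 1
#else
#define KHO_ENABLED 0
#endif

/* Counts real kho_populate() calls; only here to make the effect visible. */
static int kho_populate_calls;

#ifdef CONFIG_KEXEC_KHO
/* A real implementation would record the handed-over regions. */
static void kho_populate(uint64_t dt, uint64_t scratch_start,
			 uint64_t scratch_size, uint64_t mem_start,
			 uint64_t mem_size)
{
	(void)dt; (void)scratch_start; (void)scratch_size;
	(void)mem_start; (void)mem_size;
	kho_populate_calls++;
}
#else
/* Stub: keeps the caller compiling when KHO is configured out. */
static inline void kho_populate(uint64_t dt, uint64_t scratch_start,
				uint64_t scratch_size, uint64_t mem_start,
				uint64_t mem_size)
{
	(void)dt; (void)scratch_start; (void)scratch_size;
	(void)mem_start; (void)mem_size;
}
#endif

/* Returns 1 if handover data was passed on, 0 if KHO is disabled. */
static int early_init_dt_check_kho_sketch(void)
{
	if (!KHO_ENABLED)
		return 0;

	/* Placeholder values; the patch reads these from /chosen. */
	kho_populate(0x1000, 0x2000, 0x100, 0x3000, 0x200);
	return 1;
}
```

The body after the early return is still parsed and type checked in
every configuration, and the compiler can drop it as dead code when the
option is off.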


>
>> +     unsigned long node = chosen_node_offset;
>> +     u64 kho_start, scratch_start, scratch_size, mem_start, mem_size;
>> +     const __be32 *p;
>> +     int l;
>> +
>> +     if ((long)node < 0)
>> +             return;
>> +
>> +     p = of_get_flat_dt_prop(node, "linux,kho-dt", &l);
>> +     if (l != (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32))
>> +             return;
>> +
>> +     kho_start = dt_mem_next_cell(dt_root_addr_cells, &p);
>> +
>> +     p = of_get_flat_dt_prop(node, "linux,kho-scratch", &l);
>> +     if (l != (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32))
>> +             return;
>> +
>> +     scratch_start = dt_mem_next_cell(dt_root_addr_cells, &p);
>> +     scratch_size = dt_mem_next_cell(dt_root_addr_cells, &p);
>> +
>> +     p = of_get_flat_dt_prop(node, "linux,kho-mem", &l);
>> +     if (l != (dt_root_addr_cells + dt_root_size_cells) * sizeof(__be32))
>> +             return;
>> +
>> +     mem_start = dt_mem_next_cell(dt_root_addr_cells, &p);
>> +     mem_size = dt_mem_next_cell(dt_root_addr_cells, &p);
>> +
>> +     kho_populate(kho_start, scratch_start, scratch_size, mem_start, mem_size);
>> +#endif
>> +}
>> +
>>   #ifdef CONFIG_SERIAL_EARLYCON
>>
>>   int __init early_init_dt_scan_chosen_stdout(void)
>> @@ -1304,6 +1342,9 @@ void __init early_init_dt_scan_nodes(void)
>>
>>        /* Handle linux,usable-memory-range property */
>>        early_init_dt_check_for_usable_mem_range();
>> +
>> +     /* Handle kexec handover */
>> +     early_init_dt_check_kho();
>>   }
>>
>>   bool __init early_init_dt_scan(void *params)
>> diff --git a/drivers/of/kexec.c b/drivers/of/kexec.c
>> index 68278340cecf..a612e6bb8c75 100644
>> --- a/drivers/of/kexec.c
>> +++ b/drivers/of/kexec.c
>> @@ -264,6 +264,37 @@ static inline int setup_ima_buffer(const struct kimage *image, void *fdt,
>>   }
>>   #endif /* CONFIG_IMA_KEXEC */
>>
>> +static int kho_add_chosen(const struct kimage *image, void *fdt, int chosen_node)
>> +{
>> +     int ret = 0;
>> +
>> +#ifdef CONFIG_KEXEC_KHO
> ditto
>
> Though perhaps image->kho is not defined?


Correct, it is not. But I'm happy to stash the image->kho contents into
a few local variables inside an ifdef, so that we can at least compile
check all libfdt invocations.
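
Roughly along these lines, as a userspace sketch under assumptions:
struct kimage_sketch, its kho members, setprop_sketch() and KHO_ENABLED
are hypothetical stand-ins for struct kimage, image->kho, a libfdt
property setter and IS_ENABLED(CONFIG_KEXEC_KHO). None of them are the
real definitions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for IS_ENABLED(CONFIG_KEXEC_KHO). */
#ifdef CONFIG_KEXEC_KHO
#define KHO_ENABLED 1
#else
#define KHO_ENABLED 0
#endif

/* Stand-in for struct kimage; the kho member only exists with KHO. */
struct kimage_sketch {
	int type;
#ifdef CONFIG_KEXEC_KHO
	struct {
		uint64_t dt_mem;
		uint64_t dt_len;
	} kho;
#endif
};

/* Stand-in for a libfdt property setter; it only counts its calls. */
static int setprop_calls;

static int setprop_sketch(void *fdt, int node, const char *name,
			  uint64_t addr, uint64_t len)
{
	(void)fdt; (void)node; (void)name; (void)addr; (void)len;
	setprop_calls++;
	return 0;
}

static int kho_add_chosen_sketch(const struct kimage_sketch *image,
				 void *fdt, int chosen_node)
{
	uint64_t dt_mem = 0, dt_len = 0;

#ifdef CONFIG_KEXEC_KHO
	/* Only this access needs the ifdef: image->kho may not exist. */
	dt_mem = image->kho.dt_mem;
	dt_len = image->kho.dt_len;
#else
	(void)image;
#endif

	if (!KHO_ENABLED)
		return 0;

	/* Compiled and type checked in every configuration. */
	return setprop_sketch(fdt, chosen_node, "linux,kho-dt",
			      dt_mem, dt_len);
}
```

That keeps the ifdef down to the two member reads, while everything
that touches the FDT stays visible to the compiler either way.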


Alex







