linux-trace-kernel.vger.kernel.org archive mirror
From: "张元瀚 Tio Zhang" <tiozhang@didiglobal.com>
To: Steven Rostedt <rostedt@goodmis.org>
Cc: Masami Hiramatsu <mhiramat@kernel.org>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	Ingo Molnar <mingo@kernel.org>,
	"Peter Zijlstra" <peterz@infradead.org>,
	"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
	"linux-trace-kernel@vger.kernel.org"
	<linux-trace-kernel@vger.kernel.org>,
	"zyhtheonly@yeah.net" <zyhtheonly@yeah.net>,
	"zyhtheonly@gmail.com" <zyhtheonly@gmail.com>
Subject: Re: [PATCH] trace/sched: add tgid for sched_wakeup_template
Date: Fri, 29 Mar 2024 03:02:36 +0000
Message-ID: <8246DBFF-ED09-4635-9DDA-D1DBB600B853@didiglobal.com>
In-Reply-To: <20240327102634.17013392@gandalf.local.home>

Makes sense to me. Thank you for the explanation.

On 3/27/24, 10:24 PM, "Steven Rostedt" <rostedt@goodmis.org> wrote:


On Wed, 27 Mar 2024 16:50:57 +0800
Tio Zhang <tiozhang@didiglobal.com> wrote:


> By doing this, we are able to filter tasks by tgid while tracing
> wakeup events with eBPF or other methods.
> 
> For example, when we want to trace a user space process (which has an
> uncertain number of LWPs, i.e., pids) to monitor its wakeup latency,
> without tgid available in the sched_wakeup tracepoints we would struggle
> to find all the pids to trace. Alternatively, we could use a kprobe to
> trace by tgid, which is less accurate and much less efficient than using
> a tracepoint.


This is a very common trace event, and I really do not want to add more
data than necessary to it, as that increases the size of the event, which
means fewer events can be recorded in a fixed-size trace ring buffer.
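
To put rough numbers on the size concern, here is a small sketch. The
struct layouts are mock-ups that assume an 8-byte common header followed
by the fields in the patch; the 1 MiB buffer size and all names are
hypothetical, and the calculation ignores per-event ring buffer headers
and page boundaries, so treat the result as an estimate only.

```c
#include <stddef.h>

/* Mock of the common header every trace event carries (assumed layout). */
struct mock_common {
	unsigned short type;
	unsigned char flags;
	unsigned char preempt_count;
	int pid;
};

/* Mock of the sched_wakeup_template record before the patch. */
struct mock_wakeup_before {
	struct mock_common ent;
	char comm[16];
	int pid;
	int prio;
	int target_cpu;
};

/* Mock of the same record with the proposed tgid field appended. */
struct mock_wakeup_after {
	struct mock_common ent;
	char comm[16];
	int pid;
	int prio;
	int target_cpu;
	int tgid;
};

/* Number of whole events that fit in a buffer of the given size. */
size_t events_that_fit(size_t buffer_bytes, size_t event_bytes)
{
	return event_bytes ? buffer_bytes / event_bytes : 0;
}
```

Comparing events_that_fit(1 << 20, sizeof(struct mock_wakeup_before))
with the same call for struct mock_wakeup_after gives the relative loss
for one extra 4-byte field under these assumptions.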


Note, you are not modifying the "tracepoint", but you are actually
modifying a "trace event".


A "tracepoint" is the hook in the kernel code:


trace_sched_wakeup()


A "trace event" is defined by the TRACE_EVENT() macro (and friends), which
defines what is exposed in the tracefs file system.


I thought eBPF could hook directly to the tracepoint, which is:


trace_sched_wakeup(p);


I believe that from eBPF you can access 'p' directly, before it is processed.
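
A minimal sketch of what that could look like, assuming a libbpf-style
build with a generated vmlinux.h and a kernel with BTF enabled. The
program name on_wakeup, the target_tgid variable, and the choice of the
tp_btf attach type are illustrative assumptions, not something confirmed
in this thread; a separate loader (not shown) would have to set
target_tgid and attach the program.

```c
/* Sketch only: build with clang -target bpf and load with libbpf. */
#include "vmlinux.h"
#include <bpf/bpf_helpers.h>
#include <bpf/bpf_tracing.h>

/* Hypothetical filter value, set by the loader before load; 0 means "all". */
const volatile pid_t target_tgid = 0;

/* Attach to the sched_wakeup tracepoint itself, not the trace event. */
SEC("tp_btf/sched_wakeup")
int BPF_PROG(on_wakeup, struct task_struct *p)
{
	/* 'p' is the task being woken; tgid is read straight from it. */
	if (target_tgid && p->tgid != target_tgid)
		return 0;

	bpf_printk("wakeup pid=%d tgid=%d", p->pid, p->tgid);
	return 0;
}

char LICENSE[] SEC("license") = "GPL";
```

With this approach the tgid filter lives in the BPF program, so the
trace event record itself does not need to grow.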


There's also "trace probes" (I think we are lacking documentation on this,
as well as event probes :-p):


$ gdb vmlinux
(gdb) p &((struct task_struct *)0)->tgid
$1 = (pid_t *) 0x56c
(gdb) p &((struct task_struct *)0)->pid
$2 = (pid_t *) 0x568


# echo 't:wakeup sched_waking pid=+0x568($arg1):u32 tgid=+0x56c($arg1):u32' > /sys/kernel/tracing/dynamic_events


# trace-cmd start -e wakeup
# trace-cmd show
trace-cmd-7307 [003] d..6. 599486.485762: wakeup: (__probestub_sched_waking+0x4/0x10) pid=845 tgid=845
bash-845 [001] d.s4. 599486.486136: wakeup: (__probestub_sched_waking+0x4/0x10) pid=17 tgid=17
bash-845 [001] d..4. 599486.486336: wakeup: (__probestub_sched_waking+0x4/0x10) pid=5516 tgid=5516
kworker/u18:2-5516 [001] d..4. 599486.486445: wakeup: (__probestub_sched_waking+0x4/0x10) pid=818 tgid=818
<idle>-0 [001] d.s4. 599486.491206: wakeup: (__probestub_sched_waking+0x4/0x10) pid=17 tgid=17
<idle>-0 [001] d.s5. 599486.493218: wakeup: (__probestub_sched_waking+0x4/0x10) pid=17 tgid=17
<idle>-0 [001] d.s4. 599486.497200: wakeup: (__probestub_sched_waking+0x4/0x10) pid=17 tgid=17
<idle>-0 [003] d.s4. 599486.829209: wakeup: (__probestub_sched_waking+0x4/0x10) pid=70 tgid=70


The above attaches to the tracepoint and $arg1 is the 'struct task_struct *p'.
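
The gdb expressions above are the classic offsetof idiom, and a fetch
argument such as +0x568($arg1):u32 reads a 32-bit value at that offset
from the pointer. A user-space sketch of the same idea follows; struct
mock_task and the helper names are hypothetical stand-ins, and the real
offsets depend on the kernel build, which is why they are taken from
vmlinux above.

```c
#include <stddef.h>
#include <string.h>

typedef int mock_pid_t;

/* Hypothetical stand-in for struct task_struct; only the layout idea matters. */
struct mock_task {
	long state;
	mock_pid_t pid;
	mock_pid_t tgid;
};

/* Same result as gdb's  p &((struct mock_task *)0)->pid  */
size_t mock_pid_offset(void)
{
	return offsetof(struct mock_task, pid);
}

/* Same result as gdb's  p &((struct mock_task *)0)->tgid  */
size_t mock_tgid_offset(void)
{
	return offsetof(struct mock_task, tgid);
}

/* Read a 32-bit value at base + offset, like  +OFFSET($arg1):u32  */
unsigned int read_u32_at(const void *base, size_t offset)
{
	unsigned int value;

	memcpy(&value, (const char *)base + offset, sizeof(value));
	return value;
}
```

Calling read_u32_at(&task, mock_tgid_offset()) on a filled-in struct
mock_task returns its tgid, which is the same computation the probe
performs on the task_struct pointer.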


-- Steve




> 
> Signed-off-by: Tio Zhang <tiozhang@didiglobal.com>
> Signed-off-by: Dylane Chen <dylanechen@didiglobal.com>
> ---
> include/trace/events/sched.h | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/include/trace/events/sched.h b/include/trace/events/sched.h
> index dbb01b4b7451..ea7e525649e5 100644
> --- a/include/trace/events/sched.h
> +++ b/include/trace/events/sched.h
> @@ -149,6 +149,7 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
> __field( pid_t, pid )
> __field( int, prio )
> __field( int, target_cpu )
> + __field( pid_t, tgid )
> ),
> 
> TP_fast_assign(
> @@ -156,11 +157,12 @@ DECLARE_EVENT_CLASS(sched_wakeup_template,
> __entry->pid = p->pid;
> __entry->prio = p->prio; /* XXX SCHED_DEADLINE */
> __entry->target_cpu = task_cpu(p);
> + __entry->tgid = p->tgid;
> ),
> 
> - TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d",
> + TP_printk("comm=%s pid=%d prio=%d target_cpu=%03d tgid=%d",
> __entry->comm, __entry->pid, __entry->prio,
> - __entry->target_cpu)
> + __entry->target_cpu, __entry->tgid)
> );
> 
> /*






Thread overview: 3+ messages
2024-03-27  8:50 [PATCH] trace/sched: add tgid for sched_wakeup_template Tio Zhang
2024-03-27 14:26 ` Steven Rostedt
2024-03-29  3:02   ` 张元瀚 Tio Zhang [this message]
