* [for-linus][PATCH 0/4] tracing: A few fixes for 5.10
@ 2020-11-05 16:24 Steven Rostedt
  2020-11-05 16:24 ` [for-linus][PATCH 1/4] tracing: Fix the checking of stackidx in __ftrace_trace_stack Steven Rostedt
                   ` (3 more replies)
  0 siblings, 4 replies; 5+ messages in thread
From: Steven Rostedt @ 2020-11-05 16:24 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton


Qiujun Huang (1):
      tracing: Fix the checking of stackidx in __ftrace_trace_stack

Steven Rostedt (VMware) (3):
      ring-buffer: Fix recursion protection transitions between interrupt context
      tracing: Make -ENOMEM the default error for parse_synth_field()
      kprobes: Tell lockdep about kprobe nesting

----
 kernel/kprobes.c                  | 25 ++++++++++++++---
 kernel/trace/ring_buffer.c        | 58 +++++++++++++++++++++++++++++++--------
 kernel/trace/trace.c              |  4 +--
 kernel/trace/trace_events_synth.c | 17 +++++-------
 4 files changed, 76 insertions(+), 28 deletions(-)


* [for-linus][PATCH 1/4] tracing: Fix the checking of stackidx in __ftrace_trace_stack
  2020-11-05 16:24 [for-linus][PATCH 0/4] tracing: A few fixes for 5.10 Steven Rostedt
@ 2020-11-05 16:24 ` Steven Rostedt
  2020-11-05 16:25 ` [for-linus][PATCH 2/4] ring-buffer: Fix recursion protection transitions between interrupt context Steven Rostedt
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2020-11-05 16:24 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Qiujun Huang

From: Qiujun Huang <hqjagain@gmail.com>

The array size is FTRACE_KSTACK_NESTING, so an index equal to
FTRACE_KSTACK_NESTING is out of bounds as well. Also fix two typos along the way.
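
As an illustration only (hypothetical names, not the actual trace.c
structures), the bound being enforced is the usual one for an array of
size FTRACE_KSTACK_NESTING:

	struct ftrace_stack stacks[FTRACE_KSTACK_NESTING];

	/* valid indexes are 0 .. FTRACE_KSTACK_NESTING - 1 */
	if (WARN_ON_ONCE(stackidx >= FTRACE_KSTACK_NESTING))
		goto out;
	/* with '>', the out-of-bounds stacks[FTRACE_KSTACK_NESTING] would slip through */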

Link: https://lkml.kernel.org/r/20201031085714.2147-1-hqjagain@gmail.com

Signed-off-by: Qiujun Huang <hqjagain@gmail.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index daa96215e294..410cfeb16db5 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -2750,7 +2750,7 @@ trace_event_buffer_lock_reserve(struct trace_buffer **current_rb,
 	/*
 	 * If tracing is off, but we have triggers enabled
 	 * we still need to look at the event data. Use the temp_buffer
-	 * to store the trace event for the tigger to use. It's recusive
+	 * to store the trace event for the trigger to use. It's recursive
 	 * safe and will not be recorded anywhere.
 	 */
 	if (!entry && trace_file->flags & EVENT_FILE_FL_TRIGGER_COND) {
@@ -2952,7 +2952,7 @@ static void __ftrace_trace_stack(struct trace_buffer *buffer,
 	stackidx = __this_cpu_inc_return(ftrace_stack_reserve) - 1;
 
 	/* This should never happen. If it does, yell once and skip */
-	if (WARN_ON_ONCE(stackidx > FTRACE_KSTACK_NESTING))
+	if (WARN_ON_ONCE(stackidx >= FTRACE_KSTACK_NESTING))
 		goto out;
 
 	/*
-- 
2.28.0




* [for-linus][PATCH 2/4] ring-buffer: Fix recursion protection transitions between interrupt context
  2020-11-05 16:24 [for-linus][PATCH 0/4] tracing: A few fixes for 5.10 Steven Rostedt
  2020-11-05 16:24 ` [for-linus][PATCH 1/4] tracing: Fix the checking of stackidx in __ftrace_trace_stack Steven Rostedt
@ 2020-11-05 16:25 ` Steven Rostedt
  2020-11-05 16:25 ` [for-linus][PATCH 3/4] tracing: Make -ENOMEM the default error for parse_synth_field() Steven Rostedt
  2020-11-05 16:25 ` [for-linus][PATCH 4/4] kprobes: Tell lockdep about kprobe nesting Steven Rostedt
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2020-11-05 16:25 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, stable

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The recursion protection of the ring buffer depends on preempt_count() being
correct. But it is possible for the ring buffer to be called after an
interrupt comes in but before the interrupt updates preempt_count(). This
triggers a false positive in the recursion code.

Use the same trick as the ftrace function callback recursion code: a
"transition" bit that gets set to allow a single extra level of recursion
for handling the transition between contexts.
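
Reduced to a sketch (the nest offset is dropped here; the real change is
in the ring_buffer.c diff below):

	if (val & (1 << bit)) {
		/* Looks like recursion, but it may just be an interrupt
		 * that preempt_count() does not reflect yet, so allow
		 * one extra pass via the TRANSITION bit. */
		bit = RB_CTX_TRANSITION;
		if (val & (1 << bit))
			return 1;	/* second hit: real recursion, reject */
	}
	val |= (1 << bit);		/* claim the bit and continue */

On unlock, the lowest set bit is dropped first (val &= val - 1 when nest
is 0), so the TRANSITION bit is always the first one to be released.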

Cc: stable@vger.kernel.org
Fixes: 567cd4da54ff4 ("ring-buffer: User context bit recursion checking")
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/ring_buffer.c | 58 ++++++++++++++++++++++++++++++--------
 1 file changed, 46 insertions(+), 12 deletions(-)

diff --git a/kernel/trace/ring_buffer.c b/kernel/trace/ring_buffer.c
index 7f45fd9d5a45..dc83b3fa9fe7 100644
--- a/kernel/trace/ring_buffer.c
+++ b/kernel/trace/ring_buffer.c
@@ -438,14 +438,16 @@ enum {
 };
 /*
  * Used for which event context the event is in.
- *  NMI     = 0
- *  IRQ     = 1
- *  SOFTIRQ = 2
- *  NORMAL  = 3
+ *  TRANSITION = 0
+ *  NMI     = 1
+ *  IRQ     = 2
+ *  SOFTIRQ = 3
+ *  NORMAL  = 4
  *
  * See trace_recursive_lock() comment below for more details.
  */
 enum {
+	RB_CTX_TRANSITION,
 	RB_CTX_NMI,
 	RB_CTX_IRQ,
 	RB_CTX_SOFTIRQ,
@@ -3014,10 +3016,10 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
  * a bit of overhead in something as critical as function tracing,
  * we use a bitmask trick.
  *
- *  bit 0 =  NMI context
- *  bit 1 =  IRQ context
- *  bit 2 =  SoftIRQ context
- *  bit 3 =  normal context.
+ *  bit 1 =  NMI context
+ *  bit 2 =  IRQ context
+ *  bit 3 =  SoftIRQ context
+ *  bit 4 =  normal context.
  *
  * This works because this is the order of contexts that can
  * preempt other contexts. A SoftIRQ never preempts an IRQ
@@ -3040,6 +3042,30 @@ rb_wakeups(struct trace_buffer *buffer, struct ring_buffer_per_cpu *cpu_buffer)
  * The least significant bit can be cleared this way, and it
  * just so happens that it is the same bit corresponding to
  * the current context.
+ *
+ * Now the TRANSITION bit breaks the above slightly. The TRANSITION bit
+ * is set when a recursion is detected at the current context, and if
+ * the TRANSITION bit is already set, it will fail the recursion.
+ * This is needed because there's a lag between the changing of
+ * interrupt context and updating the preempt count. In this case,
+ * a false positive will be found. To handle this, one extra recursion
+ * is allowed, and this is done by the TRANSITION bit. If the TRANSITION
+ * bit is already set, then it is considered a recursion and the function
+ * ends. Otherwise, the TRANSITION bit is set, and that bit is returned.
+ *
+ * On the trace_recursive_unlock(), the TRANSITION bit will be the first
+ * to be cleared. Even if it wasn't the context that set it. That is,
+ * if an interrupt comes in while NORMAL bit is set and the ring buffer
+ * is called before preempt_count() is updated, since the check will
+ * be on the NORMAL bit, the TRANSITION bit will then be set. If an
+ * NMI then comes in, it will set the NMI bit, but when the NMI code
+ * does the trace_recursive_unlock() it will clear the TRANSITION bit
+ * and leave the NMI bit set. But this is fine, because the interrupt
+ * code that set the TRANSITION bit will then clear the NMI bit when it
+ * calls trace_recursive_unlock(). If another NMI comes in, it will
+ * set the TRANSITION bit and continue.
+ *
+ * Note: The TRANSITION bit only handles a single transition between context.
  */
 
 static __always_inline int
@@ -3055,8 +3081,16 @@ trace_recursive_lock(struct ring_buffer_per_cpu *cpu_buffer)
 		bit = pc & NMI_MASK ? RB_CTX_NMI :
 			pc & HARDIRQ_MASK ? RB_CTX_IRQ : RB_CTX_SOFTIRQ;
 
-	if (unlikely(val & (1 << (bit + cpu_buffer->nest))))
-		return 1;
+	if (unlikely(val & (1 << (bit + cpu_buffer->nest)))) {
+		/*
+		 * It is possible that this was called by transitioning
+		 * between interrupt context, and preempt_count() has not
+		 * been updated yet. In this case, use the TRANSITION bit.
+		 */
+		bit = RB_CTX_TRANSITION;
+		if (val & (1 << (bit + cpu_buffer->nest)))
+			return 1;
+	}
 
 	val |= (1 << (bit + cpu_buffer->nest));
 	cpu_buffer->current_context = val;
@@ -3071,8 +3105,8 @@ trace_recursive_unlock(struct ring_buffer_per_cpu *cpu_buffer)
 		cpu_buffer->current_context - (1 << cpu_buffer->nest);
 }
 
-/* The recursive locking above uses 4 bits */
-#define NESTED_BITS 4
+/* The recursive locking above uses 5 bits */
+#define NESTED_BITS 5
 
 /**
  * ring_buffer_nest_start - Allow to trace while nested
-- 
2.28.0




* [for-linus][PATCH 3/4] tracing: Make -ENOMEM the default error for parse_synth_field()
  2020-11-05 16:24 [for-linus][PATCH 0/4] tracing: A few fixes for 5.10 Steven Rostedt
  2020-11-05 16:24 ` [for-linus][PATCH 1/4] tracing: Fix the checking of stackidx in __ftrace_trace_stack Steven Rostedt
  2020-11-05 16:25 ` [for-linus][PATCH 2/4] ring-buffer: Fix recursion protection transitions between interrupt context Steven Rostedt
@ 2020-11-05 16:25 ` Steven Rostedt
  2020-11-05 16:25 ` [for-linus][PATCH 4/4] kprobes: Tell lockdep about kprobe nesting Steven Rostedt
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2020-11-05 16:25 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Dan Carpenter

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

parse_synth_field() returns a pointer, and errors must be wrapped in
ERR_PTR(). The ret variable is initialized to zero, but it should never
actually be used while still zero; if it were, the caller would get a bogus
return value and could end up dereferencing a NULL pointer. Initializing
ret to zero therefore makes no sense.

Set ret to -ENOMEM (the most common error case) and have only the other
errors set it to something else. This removes the need to set ret on *every*
error branch.
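
Condensed (and slightly simplified) from the diff below, the error paths
end up with roughly this shape:

	int ret = -ENOMEM;		/* default: the most common failure */

	field->name = kmemdup_nul(field_name, len, GFP_KERNEL);
	if (!field->name)
		goto free;		/* ret is already -ENOMEM, nothing to set */

	if (!is_good_name(field->name)) {
		ret = -EINVAL;		/* rarer errors still set ret explicitly */
		goto free;
	}
	/* ... */
 free:
	return ERR_PTR(ret);		/* callers check with IS_ERR(), never 0 */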

Fixes: 761a8c58db6b ("tracing, synthetic events: Replace buggy strcat() with seq_buf operations")
Reported-by: Dan Carpenter <dan.carpenter@oracle.com>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/trace/trace_events_synth.c | 17 +++++++----------
 1 file changed, 7 insertions(+), 10 deletions(-)

diff --git a/kernel/trace/trace_events_synth.c b/kernel/trace/trace_events_synth.c
index 84b7cab55291..881df991742a 100644
--- a/kernel/trace/trace_events_synth.c
+++ b/kernel/trace/trace_events_synth.c
@@ -584,7 +584,7 @@ static struct synth_field *parse_synth_field(int argc, const char **argv,
 {
 	struct synth_field *field;
 	const char *prefix = NULL, *field_type = argv[0], *field_name, *array;
-	int len, ret = 0;
+	int len, ret = -ENOMEM;
 	struct seq_buf s;
 	ssize_t size;
 
@@ -617,10 +617,9 @@ static struct synth_field *parse_synth_field(int argc, const char **argv,
 		len--;
 
 	field->name = kmemdup_nul(field_name, len, GFP_KERNEL);
-	if (!field->name) {
-		ret = -ENOMEM;
+	if (!field->name)
 		goto free;
-	}
+
 	if (!is_good_name(field->name)) {
 		synth_err(SYNTH_ERR_BAD_NAME, errpos(field_name));
 		ret = -EINVAL;
@@ -638,10 +637,9 @@ static struct synth_field *parse_synth_field(int argc, const char **argv,
 		len += strlen(prefix);
 
 	field->type = kzalloc(len, GFP_KERNEL);
-	if (!field->type) {
-		ret = -ENOMEM;
+	if (!field->type)
 		goto free;
-	}
+
 	seq_buf_init(&s, field->type, len);
 	if (prefix)
 		seq_buf_puts(&s, prefix);
@@ -653,6 +651,7 @@ static struct synth_field *parse_synth_field(int argc, const char **argv,
 	}
 	if (WARN_ON_ONCE(!seq_buf_buffer_left(&s)))
 		goto free;
+
 	s.buffer[s.len] = '\0';
 
 	size = synth_field_size(field->type);
@@ -666,10 +665,8 @@ static struct synth_field *parse_synth_field(int argc, const char **argv,
 
 			len = sizeof("__data_loc ") + strlen(field->type) + 1;
 			type = kzalloc(len, GFP_KERNEL);
-			if (!type) {
-				ret = -ENOMEM;
+			if (!type)
 				goto free;
-			}
 
 			seq_buf_init(&s, type, len);
 			seq_buf_puts(&s, "__data_loc ");
-- 
2.28.0




* [for-linus][PATCH 4/4] kprobes: Tell lockdep about kprobe nesting
  2020-11-05 16:24 [for-linus][PATCH 0/4] tracing: A few fixes for 5.10 Steven Rostedt
                   ` (2 preceding siblings ...)
  2020-11-05 16:25 ` [for-linus][PATCH 3/4] tracing: Make -ENOMEM the default error for parse_synth_field() Steven Rostedt
@ 2020-11-05 16:25 ` Steven Rostedt
  3 siblings, 0 replies; 5+ messages in thread
From: Steven Rostedt @ 2020-11-05 16:25 UTC
  To: linux-kernel; +Cc: Ingo Molnar, Andrew Morton, Peter Zijlstra, Masami Hiramatsu

From: "Steven Rostedt (VMware)" <rostedt@goodmis.org>

The kprobe handlers have protection that prohibits other handlers from
executing in other contexts (for example, if an NMI comes in while
processing a kprobe and hits the same kprobe, it will fail with a "busy"
return), but lockdep is unaware of this protection. Use lockdep's nesting
API to differentiate between locks taken in INT3 context and in other
contexts, which suppresses the false warnings.
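
A hedged sketch of the pattern (the actual call sites are in the kprobes.c
diff below); the lockdep subclass argument keeps the NMI/INT3 acquisition
distinct from the normal one:

	unsigned long flags;

	/* subclass 0: normal context, subclass 1: taken from NMI/INT3 */
	raw_spin_lock_irqsave_nested(hlist_lock, flags, !!in_nmi());
	/* ... critical section ... */
	raw_spin_unlock_irqrestore(hlist_lock, flags);

Because lockdep treats each subclass as a separate class, the nested
acquisition no longer triggers a report for a deadlock that the kprobe
"busy" protection already rules out.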

Link: https://lore.kernel.org/r/20201102160234.fa0ae70915ad9e2b21c08b85@kernel.org

Cc: Peter Zijlstra <peterz@infradead.org>
Acked-by: Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by: Steven Rostedt (VMware) <rostedt@goodmis.org>
---
 kernel/kprobes.c | 25 +++++++++++++++++++++----
 1 file changed, 21 insertions(+), 4 deletions(-)

diff --git a/kernel/kprobes.c b/kernel/kprobes.c
index 8a12a25fa40d..41fdbb7953c6 100644
--- a/kernel/kprobes.c
+++ b/kernel/kprobes.c
@@ -1249,7 +1249,13 @@ __acquires(hlist_lock)
 
 	*head = &kretprobe_inst_table[hash];
 	hlist_lock = kretprobe_table_lock_ptr(hash);
-	raw_spin_lock_irqsave(hlist_lock, *flags);
+	/*
+	 * Nested is a workaround that will soon not be needed.
+	 * There's other protections that make sure the same lock
+	 * is not taken on the same CPU that lockdep is unaware of.
+	 * Differentiate when it is taken in NMI context.
+	 */
+	raw_spin_lock_irqsave_nested(hlist_lock, *flags, !!in_nmi());
 }
 NOKPROBE_SYMBOL(kretprobe_hash_lock);
 
@@ -1258,7 +1264,13 @@ static void kretprobe_table_lock(unsigned long hash,
 __acquires(hlist_lock)
 {
 	raw_spinlock_t *hlist_lock = kretprobe_table_lock_ptr(hash);
-	raw_spin_lock_irqsave(hlist_lock, *flags);
+	/*
+	 * Nested is a workaround that will soon not be needed.
+	 * There's other protections that make sure the same lock
+	 * is not taken on the same CPU that lockdep is unaware of.
+	 * Differentiate when it is taken in NMI context.
+	 */
+	raw_spin_lock_irqsave_nested(hlist_lock, *flags, !!in_nmi());
 }
 NOKPROBE_SYMBOL(kretprobe_table_lock);
 
@@ -2028,7 +2040,12 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 
 	/* TODO: consider to only swap the RA after the last pre_handler fired */
 	hash = hash_ptr(current, KPROBE_HASH_BITS);
-	raw_spin_lock_irqsave(&rp->lock, flags);
+	/*
+	 * Nested is a workaround that will soon not be needed.
+	 * There's other protections that make sure the same lock
+	 * is not taken on the same CPU that lockdep is unaware of.
+	 */
+	raw_spin_lock_irqsave_nested(&rp->lock, flags, 1);
 	if (!hlist_empty(&rp->free_instances)) {
 		ri = hlist_entry(rp->free_instances.first,
 				struct kretprobe_instance, hlist);
@@ -2039,7 +2056,7 @@ static int pre_handler_kretprobe(struct kprobe *p, struct pt_regs *regs)
 		ri->task = current;
 
 		if (rp->entry_handler && rp->entry_handler(ri, regs)) {
-			raw_spin_lock_irqsave(&rp->lock, flags);
+			raw_spin_lock_irqsave_nested(&rp->lock, flags, 1);
 			hlist_add_head(&ri->hlist, &rp->free_instances);
 			raw_spin_unlock_irqrestore(&rp->lock, flags);
 			return 0;
-- 
2.28.0



