From: Richard Purdie <richard.purdie@linuxfoundation.org>
To: bitbake-devel@lists.openembedded.org
Cc: randy.macleod@windriver.com
Subject: [PATCH] runqueue: Add support for BB_LOADFACTOR_MAX
Date: Wed, 21 Feb 2024 13:21:03 +0000
Message-ID: <20240221132103.794574-1-richard.purdie@linuxfoundation.org>

Some distros don't enable /proc/pressure, and it tends to be those on which
we see bitbake timeout issues, seemingly because load gets too high and the
bitbake processes don't get scheduled for minutes at a time.

Add support for holding back the start of extra tasks while the system load
average is above a certain threshold, set through BB_LOADFACTOR_MAX.

The value used is scaled by the number of CPUs, so a value of 1 corresponds
to a load average equal to the number of CPU cores in the system. A value
under one only starts tasks when the load average is below the number of
cores.

This means you can centrally set a value such as 1.5 which will then
scale correctly to different sized machines with differing numbers
of CPUs.
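
As a rough illustration of that scaling, here is a minimal standalone
sketch. It is not the runqueue code itself: the helper names load_factor()
and should_limit() are made up for this example, and only the division of
the 1-minute load average by the CPU count mirrors what the patch does.

```python
def load_factor(load_avg_1min, cpu_count):
    """Scale a 1-minute load average by the number of CPUs.

    Hypothetical helper; the patch computes the same ratio inline
    from os.getloadavg()[0] and os.cpu_count().
    """
    return float(load_avg_1min) / cpu_count


def should_limit(load_avg_1min, cpu_count, max_loadfactor):
    """Return True when the scaled load is strictly above the threshold."""
    return load_factor(load_avg_1min, cpu_count) > max_loadfactor


# With a threshold of 1.5, an 8 core machine is limited once its load
# average goes above 12, and a 16 core machine once it goes above 24.
print(should_limit(13.0, 8, 1.5))
print(should_limit(13.0, 16, 1.5))
```

The comparison is strict, so a load average of exactly 12 on 8 cores with
a threshold of 1.5 does not trigger limiting.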

The pressure regulation is probably more accurate and responsive, however
our graphs do show significant load spikes on some workers and this
patch is aimed at trying to avoid those.

Pressure regulation is used where available in preference to this load
factor regulation when both are set.
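
The precedence described above can be sketched as a small decision
function. This is an assumption-laden illustration, not the scheduler
code: pick_regulator() and its parameters are hypothetical names, and it
only encodes the rule stated in this message (pressure first where it is
available and configured, load factor otherwise).

```python
def pick_regulator(pressure_available, pressure_limits_set, max_loadfactor):
    """Return which regulation mode would apply.

    Hypothetical helper for illustration only.
    pressure_available:  whether /proc/pressure can be read on this host
    pressure_limits_set: whether any BB_PRESSURE_MAX_* value is configured
    max_loadfactor:      the BB_LOADFACTOR_MAX value as a float, or None
    """
    if pressure_available and pressure_limits_set:
        # Pressure regulation wins when both mechanisms are usable.
        return "pressure"
    if max_loadfactor:
        return "loadfactor"
    return "none"


print(pick_regulator(True, True, 1.5))
print(pick_regulator(False, True, 1.5))
print(pick_regulator(False, False, None))
```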

Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
---
 lib/bb/runqueue.py | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

diff --git a/lib/bb/runqueue.py b/lib/bb/runqueue.py
index e86ccd8c61..6987de3e29 100644
--- a/lib/bb/runqueue.py
+++ b/lib/bb/runqueue.py
@@ -220,6 +220,16 @@ class RunQueueScheduler(object):
                 bb.note("Pressure status changed to CPU: %s, IO: %s, Mem: %s (CPU: %s/%s, IO: %s/%s, Mem: %s/%s) - using %s/%s bitbake threads" % (pressure_state + pressure_values + (len(self.rq.runq_running.difference(self.rq.runq_complete)), self.rq.number_tasks)))
             self.pressure_state = pressure_state
             return (exceeds_cpu_pressure or exceeds_io_pressure or exceeds_memory_pressure)
+        elif self.rq.max_loadfactor:
+            limit = False
+            loadfactor = float(os.getloadavg()[0]) / os.cpu_count()
+            # bb.warn("Comparing %s to %s" % (loadfactor, self.rq.max_loadfactor))
+            if loadfactor > self.rq.max_loadfactor:
+                limit = True
+            if hasattr(self, "loadfactor_limit") and limit != self.loadfactor_limit:
+                bb.note("Load average limiting set to %s as load average: %s - using %s/%s bitbake threads" % (limit, loadfactor, len(self.rq.runq_running.difference(self.rq.runq_complete)), self.rq.number_tasks))
+            self.loadfactor_limit = limit
+            return limit
         return False
 
     def next_buildable_task(self):
@@ -1822,6 +1832,7 @@ class RunQueueExecute:
         self.max_cpu_pressure = self.cfgData.getVar("BB_PRESSURE_MAX_CPU")
         self.max_io_pressure = self.cfgData.getVar("BB_PRESSURE_MAX_IO")
         self.max_memory_pressure = self.cfgData.getVar("BB_PRESSURE_MAX_MEMORY")
+        self.max_loadfactor = self.cfgData.getVar("BB_LOADFACTOR_MAX")
 
         self.sq_buildable = set()
         self.sq_running = set()
@@ -1875,6 +1886,11 @@ class RunQueueExecute:
                 bb.fatal("Invalid BB_PRESSURE_MAX_MEMORY %s, minimum value is %s." % (self.max_memory_pressure, lower_limit))
             if self.max_memory_pressure > upper_limit:
                 bb.warn("Your build will be largely unregulated since BB_PRESSURE_MAX_MEMORY is set to %s. It is very unlikely that such high pressure will be experienced." % (self.max_io_pressure))
+
+        if self.max_loadfactor:
+            self.max_loadfactor = float(self.max_loadfactor)
+            if self.max_loadfactor <= 0:
+                bb.fatal("Invalid BB_LOADFACTOR_MAX %s, needs to be greater than zero." % (self.max_loadfactor))
             
         # List of setscene tasks which we've covered
         self.scenequeue_covered = set()
-- 
2.40.1