Linux-ide Archive mirror
 help / color / mirror / Atom feed
* Re: sata_mv query
       [not found] <006c01cd74ac$a06edf30$e14c9d90$@sturgeon@perpetual-data.com>
@ 2012-08-07 17:16 ` Mark Lord
  2012-08-07 18:17   ` Mark Lord
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Lord @ 2012-08-07 17:16 UTC (permalink / raw
  To: Barry J Sturgeon; +Cc: linux-ide

On 12-08-07 10:54 AM, Barry J Sturgeon wrote:
> when loading the 3.2.1. kernel onto my supermicro pdsm4 motherboard (with
> marvell 8 port sata controller) i have encountered performance degradation
> due to excessive cpu usage (i.e. three busy kworkers). after some
> investigation i have tracked down the problem to:
> 
> +static void mv_wait_for_edma_empty_idle(struct ata_port *ap)
> +{
> +       void __iomem *port_mmio = mv_ap_base(ap);
> +       const u32 empty_idle = (edma_status_cache_empty | edma_status_idle);
> +       const int per_loop = 5, timeout = (15 * 1000 / per_loop);
> +       int i;
> +
> +       /*
> +        * wait for the edma engine to finish transactions in progress.
> +        */
> +       for (i = 0; i < timeout; ++i) {
> +               u32 edma_stat = readl(port_mmio + edma_status_ofs);
> +               if ((edma_stat & empty_idle) == empty_idle)
> +                       break;
> *+               udelay(per_loop);*
> +       }
> +       /* ata_port_printk(ap, kern_info, "%s: %u+ usecs\n", __func__, i); */
> +}
>  
> it appears that we always reach the timeout value.
> note that empty_idle = 0xc0 (as you know),
> but most edma_stat values are (1100 or 1000).

Okay, that's weird.
mv_wait_for_edma_empty_idle() is called from mv_stop_edma(),
which happens whenever we transition between sending NCQ commands
(eg. read/write) and non-NCQ commands (eg. flush cache).

The mv_qc_defer() function is supposed to ensure that things are
more or less idle before it even gets to mv_stop_edma().
So we'd expect the mv_wait_for_edma_empty_idle() function to be quick,
which is a big part of why we're willing to tolerate an inline busy-wait.

But here it seems to be waiting MUCH longer, as if it has to wait for
some IO-in-progress to complete before continuing.  Which means that
the mv_qc_defer() function isn't being as effective as it should be.

The idea is, before any new command is queued, libata invokes the defer
function to see if the driver is ready to accept the new command.
The sata_mv driver normally says "yes" (go ahead) for a new NCQ command
whenever it currently has existing NCQ commands in-flight.
Otherwise it always says "no" (please defer) unless completely idle.
That's how mv_qc_defer() is supposed to work, and that should minimize
the time spent inside mv_wait_for_edma_empty_idle() by ensuring the
edma engine is already idle whenever that gets called.

This in turn suggests two possibilities:
(1) perhaps ap->nr_active_links has an accounting glitch somewhere. Or
(2) maybe there's a locking bug somewhere around the invocation
of mv_qc_defer().

Or something else is wrong.  :)

So, focus your attentions on mv_qc_defer().
Also, you could try mounting the filesystem with the "barrier=0" option,
just to see if that makes the problem go away -- confirming the analysis above.
-- 
mark lord
real-time remedies inc.
mlord@pobox.com

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: sata_mv query
  2012-08-07 17:16 ` sata_mv query Mark Lord
@ 2012-08-07 18:17   ` Mark Lord
  2012-08-07 18:20     ` Mark Lord
  0 siblings, 1 reply; 3+ messages in thread
From: Mark Lord @ 2012-08-07 18:17 UTC (permalink / raw
  To: Barry J Sturgeon; +Cc: linux-ide

[-- Attachment #1: Type: text/plain, Size: 2489 bytes --]

On 12-08-07 01:16 PM, Mark Lord wrote:
..
> mv_wait_for_edma_empty_idle() is called from mv_stop_edma(),
> which happens whenever we transition between sending NCQ commands
> (eg. read/write) and non-NCQ commands (eg. flush cache).
> 
> The mv_qc_defer() function is supposed to ensure that things are
> more or less idle before it even gets to mv_stop_edma().
> So we'd expect the mv_wait_for_edma_empty_idle() function to be quick,
> which is a big part of why we're willing to tolerate an inline busy-wait.
> 
> But here it seems to be waiting MUCH longer, as if it has to wait for
> some IO-in-progress to complete before continuing.  Which means that
> the mv_qc_defer() function isn't being as effective as it should be.
> 
> The idea is, before any new command is queued, libata invokes the defer
> function to see if the driver is ready to accept the new command.
> The sata_mv driver normally says "yes" (go ahead) for a new NCQ command
> whenever it currently has existing NCQ commands in-flight.
> Otherwise it always says "no" (please defer) unless completely idle.
> That's how mv_qc_defer() is supposed to work, and that should minimize
> the time spent inside mv_wait_for_edma_empty_idle() by ensuring the
> edma engine is already idle whenever that gets called.

Say, here's a thought:  We could get rid of (or relocate to upper layers)
the busy-wait by having something similar get called from ata_qc_defer().

If you are feeling adventurous, here is a 100% untested patch to do just that.
It looks correct to me, it compiles, but that's all I can say.

The attached copy is the better one to use -- my mailer mangles inline patches.


--- linux-3.4.4/drivers/ata/sata_mv.c	2012-06-22 14:37:50.000000000 -0400
+++ linux/drivers/ata/sata_mv.c	2012-08-07 14:14:43.554503157 -0400
@@ -1388,6 +1388,17 @@
 	}
 }

+static int mv_check_for_edma_empty_idle(struct ata_port *ap)
+{
+	void __iomem *port_mmio = mv_ap_base(ap);
+	const u32 empty_idle = (EDMA_STATUS_CACHE_EMPTY | EDMA_STATUS_IDLE);
+
+	u32 edma_stat = readl(port_mmio + EDMA_STATUS);
+	if ((edma_stat & empty_idle) == empty_idle)
+		return ATA_DEFER_PORT;
+	return 0;
+}
+
 static int mv_qc_defer(struct ata_queued_cmd *qc)
 {
 	struct ata_link *link = qc->dev->link;
@@ -1423,7 +1434,7 @@
 	 * If the port is completely idle, then allow the new qc.
 	 */
 	if (ap->nr_active_links == 0)
-		return 0;
+		return mv_check_for_edma_empty_idle(ap);

 	/*
 	 * The port is operating in host queuing mode (EDMA) with NCQ

[-- Attachment #2: xx_sata_mv_move_busywait.patch --]
[-- Type: text/x-patch, Size: 838 bytes --]

--- linux-3.4.4/drivers/ata/sata_mv.c	2012-06-22 14:37:50.000000000 -0400
+++ linux/drivers/ata/sata_mv.c	2012-08-07 14:14:43.554503157 -0400
@@ -1388,6 +1388,17 @@
 	}
 }
 
+static int mv_check_for_edma_empty_idle(struct ata_port *ap)
+{
+	void __iomem *port_mmio = mv_ap_base(ap);
+	const u32 empty_idle = (EDMA_STATUS_CACHE_EMPTY | EDMA_STATUS_IDLE);
+
+	u32 edma_stat = readl(port_mmio + EDMA_STATUS);
+	if ((edma_stat & empty_idle) == empty_idle)
+		return ATA_DEFER_PORT;
+	return 0;
+}
+
 static int mv_qc_defer(struct ata_queued_cmd *qc)
 {
 	struct ata_link *link = qc->dev->link;
@@ -1423,7 +1434,7 @@
 	 * If the port is completely idle, then allow the new qc.
 	 */
 	if (ap->nr_active_links == 0)
-		return 0;
+		return mv_check_for_edma_empty_idle(ap);
 
 	/*
 	 * The port is operating in host queuing mode (EDMA) with NCQ

^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: sata_mv query
  2012-08-07 18:17   ` Mark Lord
@ 2012-08-07 18:20     ` Mark Lord
  0 siblings, 0 replies; 3+ messages in thread
From: Mark Lord @ 2012-08-07 18:20 UTC (permalink / raw
  To: Barry J Sturgeon; +Cc: linux-ide

[-- Attachment #1: Type: text/plain, Size: 1327 bytes --]

On 12-08-07 02:17 PM, Mark Lord wrote:
..
> Say, here's a thought:  We could get rid of (or relocate to upper layers)
> the busy-wait by having something similar get called from ata_qc_defer().
> 
> If you are feeling adventurous, here is a 100% untested patch to do just that.
> It looks correct to me, it compiles, but that's all I can say.

Whoops.. return codes were reversed.
Here's the corrected patch.

The attached copy is the better one to use -- my mailer mangles inline patches.

--- linux-3.4.4/drivers/ata/sata_mv.c	2012-06-22 14:37:50.000000000 -0400
+++ linux/drivers/ata/sata_mv.c	2012-08-07 14:14:43.554503157 -0400
@@ -1388,6 +1388,17 @@
 	}
 }

+static int mv_check_for_edma_empty_idle(struct ata_port *ap)
+{
+	void __iomem *port_mmio = mv_ap_base(ap);
+	const u32 empty_idle = (EDMA_STATUS_CACHE_EMPTY | EDMA_STATUS_IDLE);
+
+	u32 edma_stat = readl(port_mmio + EDMA_STATUS);
+	if ((edma_stat & empty_idle) == empty_idle)
+		return 0;
+	return ATA_DEFER_PORT;
+}
+
 static int mv_qc_defer(struct ata_queued_cmd *qc)
 {
 	struct ata_link *link = qc->dev->link;
@@ -1423,7 +1434,7 @@
 	 * If the port is completely idle, then allow the new qc.
 	 */
 	if (ap->nr_active_links == 0)
-		return 0;
+		return mv_check_for_edma_empty_idle(ap);

 	/*
 	 * The port is operating in host queuing mode (EDMA) with NCQ

[-- Attachment #2: xx_sata_mv_move_busywait.patch --]
[-- Type: text/x-patch, Size: 838 bytes --]

--- linux-3.4.4/drivers/ata/sata_mv.c	2012-06-22 14:37:50.000000000 -0400
+++ linux/drivers/ata/sata_mv.c	2012-08-07 14:14:43.554503157 -0400
@@ -1388,6 +1388,17 @@
 	}
 }
 
+static int mv_check_for_edma_empty_idle(struct ata_port *ap)
+{
+	void __iomem *port_mmio = mv_ap_base(ap);
+	const u32 empty_idle = (EDMA_STATUS_CACHE_EMPTY | EDMA_STATUS_IDLE);
+
+	u32 edma_stat = readl(port_mmio + EDMA_STATUS);
+	if ((edma_stat & empty_idle) == empty_idle)
+		return 0;
+	return ATA_DEFER_PORT;
+}
+
 static int mv_qc_defer(struct ata_queued_cmd *qc)
 {
 	struct ata_link *link = qc->dev->link;
@@ -1423,7 +1434,7 @@
 	 * If the port is completely idle, then allow the new qc.
 	 */
 	if (ap->nr_active_links == 0)
-		return 0;
+		return mv_check_for_edma_empty_idle(ap);
 
 	/*
 	 * The port is operating in host queuing mode (EDMA) with NCQ

^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2012-08-07 18:20 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <006c01cd74ac$a06edf30$e14c9d90$@sturgeon@perpetual-data.com>
2012-08-07 17:16 ` sata_mv query Mark Lord
2012-08-07 18:17   ` Mark Lord
2012-08-07 18:20     ` Mark Lord

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for read-only IMAP folder(s) and NNTP newsgroup(s).