IOMMU Archive mirror
From: Andrew Halaney <ahalaney@redhat.com>
To: Bjorn Andersson <quic_bjorande@quicinc.com>
Cc: linux-arm-msm@vger.kernel.org, robdclark@gmail.com,
	will@kernel.org,  iommu@lists.linux.dev, joro@8bytes.org,
	linux-arm-kernel@lists.infradead.org,
	 linux-kernel@vger.kernel.org, quic_c_gdjako@quicinc.com,
	quic_cgoldswo@quicinc.com,  quic_sukadev@quicinc.com,
	quic_pdaly@quicinc.com, quic_sudaraja@quicinc.com
Subject: Re: sa8775p-ride: What's a normal SMMU TLB sync time?
Date: Fri, 5 Apr 2024 09:04:43 -0500
Message-ID: <lqdosfpb7gdjooqswgjnabzxapocndzam3ws7dq7god5bn55an@igvaowz6h7ye>
In-Reply-To: <Zg9vEJV5JyGoM8KY@hu-bjorande-lv.qualcomm.com>

On Thu, Apr 04, 2024 at 08:25:04PM -0700, Bjorn Andersson wrote:
> On Tue, Apr 02, 2024 at 04:22:31PM -0500, Andrew Halaney wrote:
> > Hey,
> > 
> > Sorry for the wide email, but I figured someone recently contributing
> > to / maintaining the Qualcomm SMMU driver may have some proper insights
> > into this.
> > 
> > Recently I remembered that performance on some Qualcomm platforms
> > takes a major hit when you use iommu.strict=1/CONFIG_IOMMU_DEFAULT_DMA_STRICT.
> > 
> > On the sa8775p-ride, I see most TLB sync calls to be about 150 us long,
> > with some spiking to 500 us, etc:
> > 
> >     [root@qti-snapdragon-ride4-sa8775p-09 ~]# trace-cmd start -p function_graph -g qcom_smmu_tlb_sync --max-graph-depth 1
> >       plugin 'function_graph'
> >     [root@qti-snapdragon-ride4-sa8775p-09 ~]# trace-cmd show
> >     # tracer: function_graph
> >     #
> >     # CPU  DURATION                  FUNCTION CALLS
> >     # |     |   |                     |   |   |   |
> >      0) ! 144.062 us  |  qcom_smmu_tlb_sync();
> > 
> > On my sc8280xp-lenovo-thinkpad-x13s (only other Qualcomm platform I can compare
> > with) I see around 2-15 us with spikes up to 20-30 us. That's thanks to this
> > patch[0], which I guess improved the platform from 1-2 ms to the ~10 us number.
> > 
> > It's not entirely clear to me how DPU-specific programming affects system-wide
> > SMMU performance, but I'm curious whether this is the only way to achieve this.
> > sa8775p doesn't even have the DPU described right now, so that's a bummer
> > as there's no way to make a similar immediate optimization, but I'm still struggling
> > to understand what that patch really did to improve things, so maybe I'm missing
> > something.
> > 
> 
> The cause was that the TLB sync is synchronized with the display updates,
> but without appropriate safe_lut_tlb values the display side wouldn't
> play nice.

In my case the display isn't being driven at all. I'm not sure whether
that changes the situation or just complicates it, i.e. whether we're
not hitting the display-related case at all and something else entirely
is going on (assuming this time is longer than ideal), or whether the
safe_lut_tlb values still affect things even though Linux knows nothing
about the display, which as far as I know nobody is configuring at the
moment.

Any thoughts on that?

> 
> Regards,
> Bjorn
> 
> > I'm honestly not even sure what a "typical" range for TLB sync time would be,
> > but on sa8775p-ride it's bad enough that some IRQs like UFS can cause RCU stalls
> > (pretty easy to reproduce with fio basic-verify.fio for example on the platform).
> > It also makes running with iommu.strict=1 impractical as performance for UFS,
> > ethernet, etc. drops 75-80%.
> > 
> > Does anyone have any bright ideas on how to improve this, or on whether I'm even
> > right to assume that time is suspiciously long?
> > 
> > Thanks,
> > Andrew
> > 
> > [0] https://lore.kernel.org/linux-arm-msm/CAF6AEGs9PLiCZdJ-g42-bE6f9yMR6cMyKRdWOY5m799vF9o4SQ@mail.gmail.com/
> > 
> 
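
For comparing sync times across platforms, the function_graph output quoted
above could be reduced to a few summary numbers rather than eyeballed. This is
only a minimal sketch: it assumes the leaf-entry layout shown in the trace
(CPU, optional marker, duration in us, function name), and the helper names
sync_durations_us and summarize are hypothetical, not part of trace-cmd.

```python
import re
import statistics

# Matches leaf entries of function_graph output, e.g.
#   " 0) ! 144.062 us  |  qcom_smmu_tlb_sync();"
# The optional character before the duration is the tracer's overhead marker.
LINE_RE = re.compile(r"\)\s*[+!#*@$]?\s*(\d+(?:\.\d+)?) us\s*\|\s*(\w+)\(\);")


def sync_durations_us(lines, func="qcom_smmu_tlb_sync"):
    """Return the durations (in microseconds) recorded for `func`."""
    durations = []
    for line in lines:
        match = LINE_RE.search(line)
        if match and match.group(2) == func:
            durations.append(float(match.group(1)))
    return durations


def summarize(durations):
    """Return count/min/median/max for the durations, or None if empty."""
    if not durations:
        return None
    ordered = sorted(durations)
    return {
        "count": len(ordered),
        "min": ordered[0],
        "median": statistics.median(ordered),
        "max": ordered[-1],
    }


if __name__ == "__main__":
    import sys

    # Reads trace text from stdin, e.g. piped from "trace-cmd show".
    print(summarize(sync_durations_us(sys.stdin)))
```

Saved under a name of your choosing (say summarize_sync.py, also hypothetical),
it can be fed with "trace-cmd show | python3 summarize_sync.py". Header and
comment lines in the trace are ignored because they do not match the pattern.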

