linux-embedded.vger.kernel.org archive mirror
From: "Theodore Ts'o" <tytso@mit.edu>
To: Martin Steigerwald <martin@lichtvoll.de>
Cc: "Alan C. Assis" <acassis@gmail.com>,
	"Bjørn Forsman" <bjorn.forsman@gmail.com>,
	"Kai Tomerius" <kai@tomerius.de>,
	linux-embedded@vger.kernel.org,
	"Ext4 Developers List" <linux-ext4@vger.kernel.org>,
	dm-devel@redhat.com
Subject: Re: Nobarrier mount option (was: Re: File system robustness)
Date: Fri, 21 Jul 2023 09:35:26 -0400
Message-ID: <20230721133526.GF5764@mit.edu>
In-Reply-To: <38426448.10thIPus4b@lichtvoll.de>

On Thu, Jul 20, 2023 at 09:55:22AM +0200, Martin Steigerwald wrote:
> 
> I thought that nowadays a cache flush would be (almost) a no-op in the 
> case the storage receiving it is backed by such reliability measures. 
> I.e. that the hardware just says "I am ready" when having the I/O 
> request in stable storage whatever that would be, even in case that 
> would be battery backed NVRAM and/or temporary flash.

That *can* be true if the storage subsystem has the reliability
measures.  For example, if you have a $$$ EMC storage array, then
sure, it has an internal UPS backup and it will know that it can
ignore that CACHE FLUSH request.
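
As a rough illustration of the point above, one way to see how cheap a
flush is on a given setup is to time a small write followed by fsync().
This is only a sketch: the helper name time_fsync and its parameters are
made up for this example, and the measured latency also includes the
data write and any journal commit, so it is an indicator rather than a
direct measurement of the CACHE FLUSH itself.

```python
import os
import tempfile
import time


def time_fsync(directory, rounds=5, size=4096):
    """Return the wall-clock seconds taken by each write+fsync pair.

    Hypothetical helper for illustration; `directory` should live on
    the file system you want to look at.
    """
    samples = []
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        payload = b"x" * size
        for _ in range(rounds):
            start = time.perf_counter()
            os.write(fd, payload)
            # Ask the kernel to push the data to stable storage.
            os.fsync(fd)
            samples.append(time.perf_counter() - start)
    finally:
        os.close(fd)
        os.unlink(path)
    return samples


if __name__ == "__main__":
    for seconds in time_fsync(tempfile.gettempdir()):
        print(f"{seconds * 1e3:.3f} ms")
```

If the numbers stay small on storage that acknowledges flushes from
protected cache, that is consistent with the flush being nearly free
there; on a plain HDD they will typically be much larger.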

However, if you are *building* a storage system, the storage device
might be an HDD that has no idea that it doesn't need to worry
about power drops.  Consider, if you will, a rack of servers, each with
a dozen or more HDDs.  There is a rack-level battery backup, and the
rack is located in a data center with diesel generators with enough
fuel supply to keep the entire data center, plus cooling, going for
days.  The rack of servers is part of a cluster file system.  So when
a file write to a cluster file system is performed, the cluster file
system will pick three servers, each in a different rack, and each
rack is in a different power distribution domain.  That way, even if
the entry-level switch on the rack dies, or the Power Distribution Unit
(PDU) servicing a group of racks blows up, the data will be available
on the other two servers.
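
The placement rule described above can be sketched roughly as follows.
This is not how any particular cluster file system does it; the Server
type, the pick_replicas function and the example names are all
hypothetical, and the greedy loop assumes each rack sits in exactly one
power distribution domain.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    name: str
    rack: str
    power_domain: str


def pick_replicas(servers, copies=3, rng=None):
    """Pick `copies` servers with no shared rack and no shared power domain.

    Greedy sketch: shuffle the candidates, then take each server whose
    rack and power domain have not been used yet.
    """
    rng = rng or random.Random()
    candidates = list(servers)
    rng.shuffle(candidates)

    chosen = []
    used_racks = set()
    used_domains = set()
    for server in candidates:
        if server.rack in used_racks or server.power_domain in used_domains:
            continue
        chosen.append(server)
        used_racks.add(server.rack)
        used_domains.add(server.power_domain)
        if len(chosen) == copies:
            return chosen
    raise ValueError("not enough independent racks/power domains")


if __name__ == "__main__":
    fleet = [
        Server(f"srv{d}{r}{n}", rack=f"rack-{d}{r}", power_domain=f"pdu-{d}")
        for d in "abc"
        for r in "12"
        for n in "xy"
    ]
    for server in pick_replicas(fleet):
        print(server.name, server.rack, server.power_domain)
```

With three copies placed this way, losing one rack switch or one PDU
takes out at most one copy, which is why the individual HDD's cache
flush matters much less in such a setup.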

> At least that is what I thought was the background for not doing the 
> "nobarrier" thing anymore: Let the storage below decide whether it is 
> safe to basically ignore cache flushes by answering them (almost) 
> immediately.

The problem is that the storage below (e.g., the HDD) has no idea that
all of this redundancy exists.  Only the system administrator who is
configuring the file system will know.  And if you are running a
hyper-scale cloud system, this kind of custom-made system will be
much, MUCH, cheaper than buying a huge number of $$$ EMC storage
arrays.

Cheers,

					- Ted

