From: Sandeep Dhavale via Linux-erofs <linux-erofs@lists.ozlabs.org>
To: linux-erofs@lists.ozlabs.org
Cc: hsiangkao@linux.alibaba.com, kernel-team@android.com
Subject: [PATCH 0/1] Opportunistically making files sparse
Date: Wed,  3 Apr 2024 16:57:23 -0700
Message-ID: <20240403235724.1919539-1-dhavale@google.com>

Hi,
We noticed that on Android, if you build EROFS images containing ELFs
with a larger alignment, say 16K or 64K, the size of the uncompressed
EROFS image increases considerably. The size increase can be mitigated
with -Ededupe or --chunksize=4096, but that still results in a lot of
redundant disk I/O during file reads, because all the zero blocks are
mapped to a single block on disk. Treating data blocks filled with zeros
as holes saves disk space and also avoids much of that disk I/O during
reads.
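
To illustrate the idea (this is a minimal sketch, not the code from the
patch), zero-block detection can be as simple as the helper below. The
names block_is_zero() and count_zero_blocks() are hypothetical and do not
exist in erofs-utils; the block size is passed in by the caller.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical helper: return true when buf[0..len) holds only zero bytes.
 * If the first byte is zero and every byte equals its successor, the whole
 * buffer is zero. memcmp() only reads, so the overlapping ranges are fine.
 */
static bool block_is_zero(const unsigned char *buf, size_t len)
{
	if (len == 0)
		return true;
	return buf[0] == 0 && memcmp(buf, buf + 1, len - 1) == 0;
}

/*
 * Hypothetical helper: count how many full blocks of size blksz in
 * data[0..size) are entirely zero, i.e. candidates for becoming holes.
 * A trailing partial block is ignored in this sketch.
 */
static size_t count_zero_blocks(const unsigned char *data, size_t size,
				size_t blksz)
{
	size_t holes = 0;

	for (size_t off = 0; off + blksz <= size; off += blksz)
		if (block_is_zero(data + off, blksz))
			holes++;
	return holes;
}
```

Each block counted this way would need no data written to the image if it
is recorded as a hole instead.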

EROFS tracepoint output for the image built without the fix:

md5sum-7535    [001] ..... 620668.748558: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 364544 llen 45056 flags RAW
md5sum-7535    [001] ..... 620668.748559: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 364544 pa 40960 llen 4096 plen 4096 mflags M ret 0
md5sum-7535    [001] ..... 620668.748560: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 368640 llen 40960 flags RAW
md5sum-7535    [001] ..... 620668.748560: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 368640 pa 40960 llen 4096 plen 4096 mflags M ret 0
md5sum-7535    [001] ..... 620668.748561: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 372736 llen 36864 flags RAW
md5sum-7535    [001] ..... 620668.748561: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 372736 pa 40960 llen 4096 plen 4096 mflags M ret 0
md5sum-7535    [001] ..... 620668.748562: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 376832 llen 32768 flags RAW

As you can see, all the reads are redirected to the same pa 40960, one
4096-byte block at a time. This also causes fragmentation.
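
A toy model of why this happens (an assumption-laden sketch, not EROFS
kernel or erofs-utils code): suppose adjacent logical blocks can be served
by one mapping only when their physical addresses are contiguous, or when
both are holes. TOY_HOLE and toy_count_extents() are hypothetical names
used only for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical sentinel marking a logical block that has no backing data. */
#define TOY_HOLE UINT64_MAX

/*
 * Toy model: pa[i] is the physical address of logical block i, or TOY_HOLE.
 * Count how many separate mappings are needed to cover all nblocks blocks,
 * merging neighbours that are physically contiguous or both holes.
 */
static size_t toy_count_extents(const uint64_t *pa, size_t nblocks,
				uint64_t blksz)
{
	size_t extents = 0;

	for (size_t i = 0; i < nblocks; i++) {
		int merges = 0;

		if (i > 0) {
			if (pa[i] == TOY_HOLE)
				merges = (pa[i - 1] == TOY_HOLE);
			else
				merges = (pa[i - 1] != TOY_HOLE &&
					  pa[i] == pa[i - 1] + blksz);
		}
		if (!merges)
			extents++;
	}
	return extents;
}
```

Under this model, four zero blocks deduplicated onto pa 40960 count as
four mappings, because the same physical block repeated is never
contiguous with itself, while four hole blocks count as one.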

EROFS tracepoint output for the image built with detection of zero blocks:

md5sum-7496    [000] ..... 620150.387246: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 0 llen 65536 flags RAW
md5sum-7496    [000] ..... 620150.387249: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 0 pa 0 llen 262144 plen 262144 mflags  ret 0
md5sum-7496    [000] ..... 620150.387358: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 65536 llen 131072 flags RAW
md5sum-7496    [000] ..... 620150.387358: erofs_map_blocks_exit: dev = (7,0), nid = 60, flags RAW la 0 pa 0 llen 262144 plen 262144 mflags  ret 0
md5sum-7496    [000] ..... 620150.387460: erofs_map_blocks_enter: dev = (7,0), nid = 60, la 196608 llen 212992 flags RAW

I think this optimization is a win for both disk space and I/O cost, so it
is better to make it the default than to enable it conditionally with a
--sparse flag.

Thanks,
Sandeep.

PS: This patch is based on erofs-utils.git/experimental as it builds on the
previous fix of minextblks at
https://lore.kernel.org/all/20240403070700.1716252-1-dhavale@google.com/
which is not in erofs-utils.git/dev yet.


Sandeep Dhavale (1):
  erofs-utils: lib: treat data blocks filled with 0s as a hole

 lib/blobchunk.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

-- 
2.44.0.478.gd926399ef9-goog


Thread overview: 4+ messages
2024-04-03 23:57 Sandeep Dhavale via Linux-erofs [this message]
2024-04-03 23:57 ` [PATCH 1/1] erofs-utils: lib: treat data blocks filled with 0s as a hole Sandeep Dhavale via Linux-erofs
2024-04-04 14:00   ` Gao Xiang
2024-04-04 16:52     ` Sandeep Dhavale via Linux-erofs