* disk might be cheap, everything else isn't
@ 2015-09-16 20:16 Eric Wong
2016-08-09 19:09 ` Eric Wong
0 siblings, 1 reply; 2+ messages in thread
From: Eric Wong @ 2015-09-16 20:16 UTC
To: misc
TL;DR: Compress your data, and do it early.
Disk latency is high.
Disks (including SSDs) wear out faster.
Memory (for cache) is expensive.
Memory bandwidth is expensive.
Memory latency is high.
Network bandwidth is expensive.
Network latency is high.
Storage bus bandwidth (SAS, SATA, USB, etc.) is expensive.
Storage bus latency sucks.
The CPU overhead for common zlib-based compression is relatively
inexpensive compared to these things.
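To get a rough feel for that CPU cost on your own data, something
like the sketch below can help. It only uses Python's standard zlib
module; the compression_stats() helper and the log-like sample are
made up for illustration, and the numbers will vary a lot by machine
and by input.

```python
import time
import zlib


def compression_stats(data: bytes, level: int = 6):
    """Return (ratio, seconds) for one zlib.compress() call on `data`."""
    start = time.perf_counter()
    packed = zlib.compress(data, level)
    elapsed = time.perf_counter() - start
    # Round-trip to make sure nothing was lost.
    assert zlib.decompress(packed) == data
    return len(data) / len(packed), elapsed


if __name__ == "__main__":
    # Made-up sample: repetitive log-like text, which compresses well.
    sample = b"".join(
        b"GET /index.html 200 host-%d\n" % (i % 50) for i in range(100_000)
    )
    ratio, secs = compression_stats(sample)
    print(f"{len(sample)} bytes, ratio {ratio:.1f}x, {secs * 1000:.1f} ms")
```

Data that is already compressed or encrypted will show a ratio near
1, so measure with something representative of what you store.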
Everything that gets stored on disk is expected to be read at some
point. Reading that will use memory and memory bandwidth on just
about any OS. Memory used for caching is not cheap, and neither are
memory bandwidth and latency.
Sure, one could use O_DIRECT, an interface designed by deranged
monkeys[2] to avoid the caching, but it is tricky to use and
most apps need to be modified to use it.
Transparent compression at the filesystem or virtual memory[1]
layers helps at some points, but becomes worthless once your
data needs to be transferred to other machines which do not
compress transparently.
As a bonus, compression formats such as FLAC and gzip tend to come
with integrity checking, too, giving you extra peace of mind when
you have unreliable hardware.
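As a small illustration of the gzip case, the sketch below damages
one byte of a gzip blob and checks whether decompression notices.
It uses only Python's standard gzip and zlib modules; the helper
names and sample records are hypothetical. Depending on where the
damage lands, the failure may show up as a CRC mismatch or as a
stream that no longer parses, so the sketch catches both.

```python
import gzip
import zlib


def corruption_detected(blob: bytes) -> bool:
    """Return True if decompressing `blob` raises an integrity error."""
    try:
        gzip.decompress(blob)
    except (OSError, EOFError, zlib.error):
        # gzip.BadGzipFile (an OSError) covers CRC/length mismatches;
        # zlib.error and EOFError cover a stream that no longer parses.
        return True
    return False


def flip_byte(blob: bytes, index: int) -> bytes:
    """Return a copy of `blob` with every bit of one byte inverted."""
    damaged = bytearray(blob)
    damaged[index] ^= 0xFF
    return bytes(damaged)


if __name__ == "__main__":
    text = "".join(f"record {i}: some payload\n" for i in range(2000))
    good = gzip.compress(text.encode())
    bad = flip_byte(good, len(good) // 2)
    print("intact blob flagged: ", corruption_detected(good))
    print("damaged blob flagged:", corruption_detected(bad))
```

Note this only detects damage; it does not repair it.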
Sometimes compression does not even require special algorithms or
code. It could be as simple as choosing tabs over spaces for
indentation to get a 16% improvement in grep performance :)
http://mid.gmane.org/20071018024553.GA5186@coredump.intra.peff.net
("Re: On Tabs and Spaces" - Jeff King on the git mailing list)
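To see how many bytes tabs can save on a given file, here is a
minimal sketch. The retab() helper and the C-like snippet are made
up, and it assumes 8-column indentation. It only counts bytes; it
does not reproduce the grep timing from the thread above.

```python
def retab(source: str, width: int = 8) -> str:
    """Replace each full run of `width` leading spaces with one tab."""
    out = []
    for line in source.splitlines(keepends=True):
        body = line.lstrip(" ")
        indent = len(line) - len(body)
        out.append("\t" * (indent // width) + " " * (indent % width) + body)
    return "".join(out)


if __name__ == "__main__":
    # Made-up C-like snippet indented with 8 spaces per level.
    spaces = (
        "int main(void)\n"
        "{\n"
        "        if (x) {\n"
        "                return 1;\n"
        "        }\n"
        "        return 0;\n"
        "}\n"
    )
    tabs = retab(spaces)
    saved = 1 - len(tabs) / len(spaces)
    print(f"{len(spaces)} -> {len(tabs)} bytes ({saved:.0%} smaller)")
```

Only leading spaces are touched, so alignment inside a line is left
alone.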
Footnotes:
[1] https://en.wikipedia.org/wiki/Virtual_memory_compression
[2] http://man7.org/linux/man-pages/man2/open.2.html
* Re: disk might be cheap, everything else isn't
From: Eric Wong @ 2016-08-09 19:09 UTC
To: misc
Eric Wong <e@80x24.org> wrote:
> http://mid.gmane.org/20071018024553.GA5186@coredump.intra.peff.net
> ("Re: On Tabs and Spaces" - Jeff King on the git mailing list)
Since gmane is down:
https://public-inbox.org/git/20071018024553.GA5186@coredump.intra.peff.net/