everything related to duct tape audio suite (dtas)
* Re: [PATCH] dtas-archive: paranoid archival script
  2015-04-07  7:47  3% [PATCH] dtas-archive: paranoid archival script Eric Wong
@ 2015-09-18  8:54  7% ` Eric Wong
  0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2015-09-18  8:54 UTC (permalink / raw)
  To: dtas-all

Eric Wong <e@80x24.org> wrote:
> +dtas-archive is intended for archiving audio data to/from laptops and
> +computers without ECC memory, attempting to read data multiple times in
> +an attempt to detect memory corruption.  dtas-archive may only be
> +effective on machines running the Linux kernel where posix_fadvise(2)
> +can be used to drop caches for a particular file after fsync(2).
> 
> +dtas-archive spawns sox(1) to archive audio data (likely uncompressed
> +WAVE) to FLAC and verifies the result using sndfile-cmp(1), a tool
> +implemented independently of sox and less likely to share the same bugs
> +(if any) as sox.

Fwiw, I'm VERY happy to note this script actually just detected an error
the kernel + ECC RAM did not notice.  The sox wav-to-flac copy was
corrupted, and sndfile-cmp caught the error after dropping the cache and
re-comparing.  I tried remounting the device just in case, but running
sndfile-cmp by hand reproduced the error.

Again, this was from my workstation with ECC memory, even, so the error
probably happened on the USB or device level.  I was just copying a WAV
file recording from a MicroSD card off a USB card reader.

Fwiw, I've been doing this cache-dropping + sndfile-cmp dance for nearly
3 years now (with a different, unpublished script), and this is my
first time detecting a real error.

I haven't listened, but stats off the corrupt copy did indicate
clipping.

So the paranoia with verifying copies really is justified :>
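
For readers who want to try the cache-dropping step outside dtas-archive,
here is a minimal Ruby sketch.  It assumes Linux posix_fadvise(2)
semantics, as the script does.  The helper names `drop_cache` and
`verified_copy?` are hypothetical, and a plain byte comparison with
FileUtils.compare_file stands in for sndfile-cmp(1), which compares
decoded audio rather than raw bytes.

```ruby
require 'fileutils'
require 'tmpdir'

# Hypothetical helper: ask the kernel to drop cached pages for one file.
# A freshly written file is fsync-ed first, since only clean pages can
# be dropped.  The advice is a hint and may be a no-op on non-Linux systems.
def drop_cache(path, written: false)
  File.open(path, written ? 'ab' : 'rb') do |fp|
    fp.fsync if written
    fp.advise(:dontneed) if fp.respond_to?(:advise)
  end
end

# Hypothetical helper: re-read both files `repeat` times, dropping the
# page cache before each comparison so data comes from the device again.
def verified_copy?(src, dst, repeat: 1)
  repeat.times.all? do
    drop_cache(src)
    drop_cache(dst, written: true)
    FileUtils.compare_file(src, dst) # byte-wise stand-in for sndfile-cmp
  end
end

Dir.mktmpdir do |dir|
  src = File.join(dir, 'take1.wav')
  dst = File.join(dir, 'take1.copy')
  File.binwrite(src, 'RIFF' * 1024)
  FileUtils.cp(src, dst)
  puts verified_copy?(src, dst, repeat: 2) ? 'copy verified' : 'MISMATCH'
end
```

A byte comparison only works for verbatim copies; for a WAV-to-FLAC
conversion the comparison has to decode both files, which is why the
script relies on sndfile-cmp.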



* [ANN] dtas 0.10.0 - major features and small fixes
@ 2015-04-13  6:20  5% Eric Wong
  0 siblings, 0 replies; 3+ results
From: Eric Wong @ 2015-04-13  6:20 UTC (permalink / raw)
  To: dtas-all

Bug fixes:

* Exported INFILE environment variable is always shell-escaped
  This prevents screw-ups when users are using funky filenames.

* dtas-player: enqueued commands cannot use audio format bypass
  (the audio format cannot be known ahead-of-time from raw commands)

* YAML omap (ordered map) is explicitly used for all env hashes for
  user editing.  Normal (unordered) hashes are still allowed when loading
  existing files.  This does not affect Ruby 1.9+ users, but allows
  easier processing for users of other languages.
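
To illustrate why shell-escaping INFILE matters, here is a minimal
Ruby sketch using the standard Shellwords library.  The filename is
made up, and this is not necessarily how dtas itself exports INFILE.

```ruby
require 'shellwords'

# Hypothetical "funky" filename: spaces, a quote and shell metacharacters
infile = "live set; it's loud (take 2).wav"
escaped = Shellwords.escape(infile)

# The escaped form can be interpolated into a shell command line:
# the shell sees exactly one argument, identical to the original name.
cmd = "soxi #{escaped}"
puts cmd
```

Without the escaping, the `;` would end the command early and the
unbalanced quote would leave the shell waiting for more input.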

New features (all platforms):

* dtas-player now plays dtas-splitfx YAML files and supports cue sheet
  emulation based on the track list.  Under Linux[1], changes to
  the YAML file are reflected in real-time as the file is edited
  and saved in an $EDITOR.  This feature is useful for dialing
  in EQ, compressor, and limiter effects on tracks.

* dtas-player supports the "source restart" command for restarting
  playback on modified files for systems without inotify support.

* dtas-splitfx now exports the INDIR and INBASE environment variables
  which are intended to act like `$(@D)' and `$(@F)' in GNU make(1).
  It should ease managing temporary files for some effects
  (e.g. noiseprof + noisered in sox)

* dtas-console supports '!' and '@' hotkeys for moving within
  files with embedded cue sheets.

* dtas-player supports the "trim" command to focus on a particular
  portion of a track.  It may be useful when combined with the existing
  "tl repeat" command for dialing in audio editing parameters
  (via a splitfx YAML file):

  To continuously repeat a 5 second part of the current track starting
  at 1 minute into the track:

    dtas-ctl tl repeat 1 && dtas-ctl trim 1:00 5

  Passing "off" as the parameter disables trim:

    dtas-ctl trim off

* dtas-env(7) manpage added for common environment variables across
  the suite

* dtas-sinkedit shows default parameters in addition to user-changed
  parameters
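
As a rough illustration of the INDIR and INBASE variables mentioned
above: `$(@D)' and `$(@F)' in GNU make(1) are the directory and file
parts of a path, which correspond to File.dirname and File.basename in
Ruby.  The path and the ".noiseprof" suffix below are hypothetical, and
the exact values dtas-splitfx exports are not spelled out in this
announcement.

```ruby
# Hypothetical input path; dtas-splitfx takes the real one from its YAML file
infile = '/music/live/2015-04-12/set1.flac'

indir  = File.dirname(infile)   # directory part, like $(@D) in GNU make
inbase = File.basename(infile)  # file part, like $(@F) in GNU make

# Keep a temporary noise profile next to the input file, named after it
noiseprof = File.join(indir, "#{inbase}.noiseprof")
puts noiseprof
```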

New features (Linux-only):

* dtas-sourceedit and dtas-sinkedit support inotify[1] when editing
  the YAML text file.  This allows real-time updates on $EDITOR
  file save as the user edits the parameters of the commands used
  for decoding and playback.

* dtas-archive - paranoid archival script for copying (and re-reading)
  files.  This is useful when transferring files from removable devices
  to computers without ECC memory (or when bit errors may occur in
  transport before the data reaches main memory).  This requires Ruby
  1.9.3 or later (no 3rd-party RubyGems) on Linux for IO#advise support.

There are also many internal cleanups and more work-in-progress
for dtas-splitfx features.

[1] feature requires the sleepy_penguin RubyGem to be installed.

Upgrade or install "dtas" via RubyGems or via tarball at:

    http://dtas.80x24.org/2015/dtas-0.10.0.tar.gz



* [PATCH] dtas-archive: paranoid archival script
@ 2015-04-07  7:47  3% Eric Wong
  2015-09-18  8:54  7% ` Eric Wong
  0 siblings, 1 reply; 3+ results
From: Eric Wong @ 2015-04-07  7:47 UTC (permalink / raw)
  To: dtas-all

This archives audio files (typically .wav from a portable device)
as FLAC and performs a best-effort verification that the file was
transferred successfully without bit errors by dropping kernel caches
and rechecking the result.
---
 Documentation/GNUmakefile      |   1 +
 Documentation/dtas-archive.txt |  61 ++++++++++++++
 bin/dtas-archive               | 187 +++++++++++++++++++++++++++++++++++++++++
 3 files changed, 249 insertions(+)
 create mode 100644 Documentation/dtas-archive.txt
 create mode 100755 bin/dtas-archive

diff --git a/Documentation/GNUmakefile b/Documentation/GNUmakefile
index 4f44cdc..537c659 100644
--- a/Documentation/GNUmakefile
+++ b/Documentation/GNUmakefile
@@ -18,6 +18,7 @@ m1 += dtas-sinkedit
 m1 += dtas-sourceedit
 m1 += dtas-tl
 m1 += dtas-splitfx
+m1 += dtas-archive
 
 m7 =
 m7 += dtas-player_protocol
diff --git a/Documentation/dtas-archive.txt b/Documentation/dtas-archive.txt
new file mode 100644
index 0000000..9f2fe04
--- /dev/null
+++ b/Documentation/dtas-archive.txt
@@ -0,0 +1,61 @@
+% dtas-archive(1) dtas user manual
+%
+
+# NAME
+
+dtas-archive - paranoid audio file copy
+
+# SYNOPSIS
+
+dtas-archive [OPTIONS] SOURCE DESTINATION
+
+# DESCRIPTION
+
+dtas-archive is intended for archiving audio data to/from laptops and
+computers without ECC memory, attempting to read data multiple times in
+an attempt to detect memory corruption.  dtas-archive may only be
+effective on machines running the Linux kernel where posix_fadvise(2)
+can be used to drop caches for a particular file after fsync(2).
+
+dtas-archive spawns sox(1) to archive audio data (likely uncompressed
+WAVE) to FLAC and verifies the result using sndfile-cmp(1), a tool
+implemented independently of sox and less likely to share the same bugs
+(if any) as sox.
+
+# OPTIONS
+
+-j, \--jobs [JOBS]
+:    Number of jobs to run in parallel.  Increasing this may hurt
+     performance on slow storage devices.  Default: 1
+
+-n, \--dry-run
+:    Print, but do not run the commands to be executed
+
+-s, \--quiet, \--silent
+:    Silent operation, commands are not printed as executed
+
+-S, \--stats
+:    Run and save the text output of the sox "stats" effect as
+     $DESTINATION_FILE_WITHOUT_SUFFIX.stats next to the output file
+
+-k, \--keep-going
+:    Continue after error
+
+-r, \--repeat [COUNT]
+:    Number of times to repeat the sndfile-cmp(1) check.  Default: 1
+
+# COPYRIGHT
+
+Copyright 2015 all contributors <dtas-all@nongnu.org>.\
+License: GPLv3 or later <http://www.gnu.org/licenses/gpl-3.0.txt>
+
+# CONTACT
+
+All feedback welcome via plain-text mail to: <dtas-all@nongnu.org>\
+Mailing list archives available at <http://80x24.org/dtas-all/> and
+<ftp://lists.gnu.org/dtas-all/>\
+No subscription is necessary to post to the mailing list.
+
+# SEE ALSO
+
+sndfile-cmp(1), sox(1)
diff --git a/bin/dtas-archive b/bin/dtas-archive
new file mode 100755
index 0000000..69fc40e
--- /dev/null
+++ b/bin/dtas-archive
@@ -0,0 +1,187 @@
+#!/usr/bin/env ruby
+# Copyright (C) 2015 all contributors <dtas-all@nongnu.org>
+# License: GPLv3 or later (https://www.gnu.org/licenses/gpl-3.0.txt)
+usage = "#$0 SOURCE DESTINATION"
+
+# We could use the equivalent sox command here, but some folks working on
+# dtas are more likely to write patches for sox (and thus introduce bugs
+# into it), so we'll use sndfile-cmp as it lives in a different source tree
+%w(sndfile-cmp sox).each do |cmd|
+  `which #{cmd} 2>/dev/null`.chomp.empty? and abort "#{cmd} not found in PATH"
+end
+
+RUBY_PLATFORM =~ /linux/ or
+  warn "#$0 is unproven without Linux kernel fadvise semantics"
+have_advise = IO.instance_methods.include?(:advise)
+have_advise or warn "#$0 does not work reliably without IO#advise support"
+
+require 'shellwords'
+require 'fileutils'
+require 'find'
+require 'optparse'
+Thread.abort_on_exception = true
+dry_run = false
+silent = false
+type = 'flac'
+jobs = 1
+repeat = 1
+stats = false
+keep_going = false
+
+OptionParser.new('', 24, '  ') do |op|
+  op.banner = usage
+  op.on('-t', '--type [TYPE]', 'FILE-TYPE (default: flac)') { |t| type = t }
+  op.on('-j', '--jobs [JOBS]', Integer) { |j| jobs = j }
+  op.on('-S', '--stats', 'save stats on the file') { stats = true }
+  op.on('-k', '--keep-going', 'continue after error') { keep_going = true }
+  op.on('-n', '--dry-run', 'only print commands, do not run them') do
+    dry_run = true
+  end
+  op.on('-r', '--repeat [COUNT]', 'number of times to check', Integer) do |r|
+    repeat = r
+  end
+  op.on('-s', '--quiet', '--silent') { silent = true }
+  op.on('-h', '--help') do
+    puts(op.to_s)
+    exit
+  end
+  op.parse!(ARGV)
+end
+
+dst = ARGV.pop
+src = ARGV.dup
+
+FileUtils.mkpath(dst) unless File.exist?(dst)
+src_files = Hash.new { |h,dest_dir| h[dest_dir] = [] }
+
+src.each do |s|
+  src_st = File.stat(s)
+  if src_st.directory?
+    Find.find(s) do |path|
+      File.file?(path) or next
+      dir = File.dirname(path)
+      dir_st = File.stat(dir)
+      if dir_st.ino == src_st.ino && dir_st.dev == src_st.dev
+        src_files['.'] << path
+      else
+        dir = File.basename(File.dirname(path))
+        src_files[dir] << path
+      end
+    end
+  else
+    src_files['.'] << s
+  end
+end
+
+pairs = []
+type = ".#{type}" unless type.start_with?('.')
+
+src_files.each do |dir, files|
+  dir = dir == '.' ? dst : File.join(dst, dir)
+  if dry_run || !silent
+    puts "mkdir -p #{Shellwords.escape(dir)}"
+  end
+  FileUtils.mkpath(dir) unless dry_run
+
+  files.each do |path|
+    base = File.basename(path).sub(/\.[^\.]+\z/, type)
+    out = File.join(dir, base)
+    pairs << [ path, out ]
+  end
+end
+
+mtx = Mutex.new # protects pairs, fails and mismatches
+fails = []
+mismatches = []
+
+on_fail = lambda do |job, status|
+  mtx.synchronize do
+    pairs.clear unless keep_going
+    fails << [ job, status ]
+  end
+end
+
+on_mismatch = lambda do |job, status|
+  mtx.synchronize do
+    mismatches << [ job, status ]
+  end
+end
+
+exiting = false
+%w(INT TERM).each do |s|
+  trap(s) do
+    warn "Caught SIG#{s}, stopping gracefully..."
+    exiting = true
+    trap(s, 'DEFAULT') # non-graceful if signaled again
+  end
+end
+
+thrs = jobs.times.map do |i|
+  Thread.new do
+    while job = mtx.synchronize { pairs.shift }
+      break if exiting
+
+      input, output = *job
+
+      unless system('soxi', '-s', input, out: IO::NULL, err: IO::NULL)
+        warn "skipping #{input.inspect}, not an audio file"
+        next
+      end
+
+      stats_out = "#{output.sub(/\.[^\.]+\z/, '')}.stats" if stats
+
+      if dry_run || !silent
+        names = job.map { |x| Shellwords.escape(x) }
+        cmd = [ 'sox', *names ]
+        if stats
+          cmd << 'stats'
+          cmd << "2>#{Shellwords.escape(stats_out)}"
+        end
+
+        puts cmd.join(' ')
+        cmpcmd = "sndfile-cmp #{names[0]} #{names[1]}"
+        if dry_run
+          puts cmpcmd
+          next
+        end
+      end
+
+      cmd = [ 'sox', input, output ]
+      if stats
+        cmd << 'stats'
+        cmd = [ *cmd, { err: stats_out } ]
+      end
+      system(*cmd) or on_fail.call(job, $?)
+
+      # clear kernel caches, this relies on Linux behavior
+      repeat.times do
+        if have_advise
+          th = Thread.new { File.open(input) { |fp| fp.advise(:dontneed) } }
+          File.open(output, 'ab') do |fp|
+            fp.fsync
+            fp.advise(:dontneed)
+          end
+          th.join
+        end
+
+        puts cmpcmd unless silent
+        system('sndfile-cmp', input, output) or on_mismatch.call(job, $?)
+      end
+      st = File.stat(input)
+      File.utime(st.atime, st.mtime, output)
+    end
+  end
+end
+
+thrs.each(&:join)
+ok = true
+fails.each do |job, status|
+  $stderr.puts "#{job.inspect} failed: #{status.inspect}"
+  ok = false
+end
+mismatches.each do |job, status|
+  $stderr.puts "#{job.inspect} mismatched: #{status.inspect}"
+  ok = false
+end
+
+exit ok
-- 
EW
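
The job scheduling in the patch above boils down to a shared queue
drained by N threads under a Mutex.  A stripped-down sketch of that
pattern follows.  The method name `run_jobs` is hypothetical, and the
block stands in for the sox and sndfile-cmp invocations.

```ruby
# Run each job on one of `jobs` worker threads.  The block returns a
# truthy value on success.  Without keep_going, the first failure
# empties the queue so no new work starts; jobs already running finish.
def run_jobs(pairs, jobs: 1, keep_going: false)
  queue = pairs.dup
  mtx = Mutex.new # protects queue and fails
  fails = []

  workers = jobs.times.map do
    Thread.new do
      while job = mtx.synchronize { queue.shift }
        next if yield(job)
        mtx.synchronize do
          queue.clear unless keep_going
          fails << job
        end
      end
    end
  end
  workers.each(&:join)

  fails
end

todo = [%w(a.wav a.flac), %w(b.wav b.flac)]
failed = run_jobs(todo, jobs: 2, keep_going: true) do |input, _output|
  input != 'b.wav' # pretend that converting b.wav fails
end
p failed
```

Taking the lock only around queue and result updates keeps the slow
work (conversion and comparison) running in parallel.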




Code repositories for project(s) associated with this public inbox

	https://80x24.org/dtas.git/
