From: Eric Wong <e@80x24.org>
To: dtas-all@nongnu.org
Subject: [PATCH] mlib: pathnames may be blobs
Date: Fri, 19 Mar 2021 02:36:25 -0500
Message-ID: <20210319073625.19041-1-e@80x24.org>

POSIX filesystems do not enforce encodings, so we'll convert
non-UTF-8 filenames to blobs for SQLite instead of failing on
encoding errors.  This should allow us to work on collections
which feature legacy encodings.
---
 lib/dtas/mlib.rb | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/lib/dtas/mlib.rb b/lib/dtas/mlib.rb
index 026e931..eb7554a 100644
--- a/lib/dtas/mlib.rb
+++ b/lib/dtas/mlib.rb
@@ -1,5 +1,5 @@
 # -*- encoding: utf-8 -*-
-# Copyright (C) 2015-2020 all contributors <dtas-all@nongnu.org>
+# Copyright (C) 2015-2021 all contributors <dtas-all@nongnu.org>
 # License: GPL-3.0+ <https://www.gnu.org/licenses/gpl-3.0.txt>
 # frozen_string_literal: true
 #
@@ -129,9 +129,13 @@ def worker_work(job)
       comments.where(q).delete
       tmp.each do |tid, val|
         v = vals[val: val]
-        q[:val_id] = v ? v[:id] : vals.insert(val: val)
-        q[:tag_id] = tid
-        comments.insert(q)
+        begin
+          q[:val_id] = v ? v[:id] : vals.insert(val: val)
+          q[:tag_id] = tid
+          comments.insert(q)
+        rescue => e
+          warn "E: #{e.message} (#{e.class}) q=#{q.inspect} val=#{val.inspect}"
+        end
       end
     end
   end
@@ -214,12 +218,16 @@ def scan_any(path, parent_id)
     end
   end
 
+  def maybe_blob(path)
+    path.valid_encoding? ? path : Sequel.blob(path)
+  end
+
   def scan_file(path, st, parent_id)
     return if @suffixes !~ path || st.size == 0
 
     # no-op if no change
     unless @force
-      if node = @db[:nodes][name: path, parent_id: parent_id]
+      if node = @db[:nodes][name: maybe_blob(path), parent_id: parent_id]
         return if st.ctime.to_i == node[:ctime] || node[:tlen] == DM_IGN
       end
     end
@@ -271,14 +279,16 @@ def node_update_maybe(node, tlen, ctime)
     node_id = node.delete(:id)
     @db[:nodes].where(id: node_id).update(node.merge(q))
     node[:id] = node_id
+  rescue => e
+    warn "E: #{e.message} (#{e.class}) node=#{node.inspect}"
   end
 
   def node_lookup(parent_id, name)
-    @db[:nodes][name: name, parent_id: parent_id]
+    @db[:nodes][name: maybe_blob(name), parent_id: parent_id]
   end
 
   def node_ensure(parent_id, name, tlen, ctime = nil)
-    q = { name: name, parent_id: parent_id }
+    q = { name: maybe_blob(name), parent_id: parent_id }
     if node = @db[:nodes][q]
       node_update_maybe(node, tlen, ctime)
     else
@@ -289,6 +299,8 @@ def node_ensure(parent_id, name, tlen, ctime = nil)
       node[:id] = @db[:nodes].insert(node)
     end
     node
+  rescue => e
+    warn "E: #{e.message} (#{e.class}) q=#{q.inspect}"
   end
 
   def cd(path)
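
The maybe_blob helper above relies on String#valid_encoding? to decide
whether a pathname can be stored as text. A minimal sketch of that
check follows, under stated assumptions: FakeBlob and maybe_blob_sketch
are hypothetical stand-ins so the example runs without the sequel gem
(the patch itself calls Sequel.blob), and the filenames are made up.

```ruby
# Hypothetical stand-in for the blob wrapper that Sequel.blob returns;
# used only so this sketch runs without the sequel gem installed.
class FakeBlob < String; end

# Same decision as maybe_blob in the patch: pass valid strings through,
# wrap anything with invalid byte sequences so it is stored as raw bytes.
def maybe_blob_sketch(path)
  path.valid_encoding? ? path : FakeBlob.new(path)
end

# A valid UTF-8 filename (illustrative only).
utf8_name = "caf\u00e9.flac"

# The same name as ISO-8859-1 bytes, but tagged as UTF-8, which is how
# a legacy-encoded filename can appear when read from the filesystem.
latin1_name = "caf\xE9.flac".b.force_encoding(Encoding::UTF_8)

[utf8_name, latin1_name].each do |name|
  stored = maybe_blob_sketch(name)
  puts "#{name.inspect} valid=#{name.valid_encoding?} stored_as=#{stored.class}"
end
```

The wrapped value keeps the original bytes unchanged, so the filename
can still be looked up later by the same byte sequence.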

