From: Eric Wong
To: dtas-all@nongnu.org
Subject: [PATCH] mlib: pathnames may be blobs
Date: Fri, 19 Mar 2021 02:36:25 -0500
Message-Id: <20210319073625.19041-1-e@80x24.org>
List-Id: duct tape audio suite

POSIX filesystems do not enforce encodings, so we'll convert
non-UTF-8 filenames to blobs for SQLite instead of failing on
encoding errors.  This should allow us to work on collections
which feature legacy encodings.
---
 lib/dtas/mlib.rb | 26 +++++++++++++++++++-------
 1 file changed, 19 insertions(+), 7 deletions(-)

diff --git a/lib/dtas/mlib.rb b/lib/dtas/mlib.rb
index 026e931..eb7554a 100644
--- a/lib/dtas/mlib.rb
+++ b/lib/dtas/mlib.rb
@@ -1,5 +1,5 @@
 # -*- encoding: utf-8 -*-
-# Copyright (C) 2015-2020 all contributors
+# Copyright (C) 2015-2021 all contributors
 # License: GPL-3.0+
 # frozen_string_literal: true
 #
@@ -129,9 +129,13 @@ def worker_work(job)
           comments.where(q).delete
           tmp.each do |tid, val|
             v = vals[val: val]
-            q[:val_id] = v ? v[:id] : vals.insert(val: val)
-            q[:tag_id] = tid
-            comments.insert(q)
+            begin
+              q[:val_id] = v ? v[:id] : vals.insert(val: val)
+              q[:tag_id] = tid
+              comments.insert(q)
+            rescue => e
+              warn "E: #{e.message} (#{e.class}) q=#{q.inspect} val=#{val.inspect}"
+            end
           end
         end
       end
@@ -214,12 +218,16 @@ def scan_any(path, parent_id)
     end
   end

+  def maybe_blob(path)
+    path.valid_encoding? ? path : Sequel.blob(path)
+  end
+
   def scan_file(path, st, parent_id)
     return if @suffixes !~ path || st.size == 0

     # no-op if no change
     unless @force
-      if node = @db[:nodes][name: path, parent_id: parent_id]
+      if node = @db[:nodes][name: maybe_blob(path), parent_id: parent_id]
         return if st.ctime.to_i == node[:ctime] || node[:tlen] == DM_IGN
       end
     end
@@ -271,14 +279,16 @@ def node_update_maybe(node, tlen, ctime)
     node_id = node.delete(:id)
     @db[:nodes].where(id: node_id).update(node.merge(q))
     node[:id] = node_id
+  rescue => e
+    warn "E: #{e.message} (#{e.class}) node=#{node.inspect}"
   end

   def node_lookup(parent_id, name)
-    @db[:nodes][name: name, parent_id: parent_id]
+    @db[:nodes][name: maybe_blob(name), parent_id: parent_id]
   end

   def node_ensure(parent_id, name, tlen, ctime = nil)
-    q = { name: name, parent_id: parent_id }
+    q = { name: maybe_blob(name), parent_id: parent_id }
     if node = @db[:nodes][q]
       node_update_maybe(node, tlen, ctime)
     else
@@ -289,6 +299,8 @@ def node_ensure(parent_id, name, tlen, ctime = nil)
       node[:id] = @db[:nodes].insert(node)
     end
     node
+  rescue => e
+    warn "E: #{e.message} (#{e.class}) q=#{q.inspect}"
   end

   def cd(path)
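
For readers who want to see the encoding check in isolation, here is a
minimal sketch of the idea behind maybe_blob, not the code from the patch
itself.  BlobStandIn is a hypothetical stand-in for the object that
Sequel.blob returns, used only so the example runs without the sequel gem,
and both sample filenames are invented for illustration.

```ruby
# Hypothetical stand-in for the blob wrapper returned by Sequel.blob;
# it exists only so this sketch has no gem dependency.
class BlobStandIn < String; end

# Same shape as maybe_blob in the patch: strings whose bytes are valid
# for their tagged encoding pass through untouched, anything else is
# wrapped so it can be treated as raw bytes.
def maybe_blob(path)
  path.valid_encoding? ? path : BlobStandIn.new(path)
end

# A filename that is valid UTF-8.
utf8_name = "caf\u00e9.flac"

# The same name stored as ISO-8859-1 bytes on disk, but tagged as UTF-8
# (an assumption here about how such a name might reach this code).
# The lone 0xE9 byte is not valid UTF-8.
latin1_name = "caf\xE9.flac".b.force_encoding(Encoding::UTF_8)

puts utf8_name.valid_encoding?      # prints "true"
puts latin1_name.valid_encoding?    # prints "false"
puts maybe_blob(utf8_name).class    # prints "String"
puts maybe_blob(latin1_name).class  # prints "BlobStandIn"
```

The wrapped value keeps the original bytes unchanged, so a legacy-encoded
name can be stored and looked up again without being transcoded.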