From: holger.sebert.ext@karlstorz.com
To: "toaster@lists.yoctoproject.org" <toaster@lists.yoctoproject.org>
Subject: Database errors due to UTF-8 filenames
Date: Mon, 16 Nov 2020 12:56:53 +0000
Message-ID: <e49af3aebfea45f18869a5ce08d93cd6@karlstorz.com>

Hi,

I've set up Toaster and a MySQL Docker container, all running on Ubuntu 16.04.
I am encountering the following database error when building my Yocto project:

	ERROR: (1366, "Incorrect string value: '\\xC5\\x91tan\\xC3...' for column 'path' at row 1")
	Traceback (most recent call last):
	  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/utils.py", line 84, in _execute
		return self.cursor.execute(sql, params)
	  File "/usr/local/lib/python3.7/dist-packages/django/db/backends/mysql/base.py", line 71, in execute
		return self.cursor.execute(query, args)
	  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/cursors.py", line 206, in execute
		res = self._query(query)
	  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/cursors.py", line 319, in _query
		db.query(q)
	  File "/usr/local/lib/python3.7/dist-packages/MySQLdb/connections.py", line 260, in query
		_mysql.connection.query(self, query)
	MySQLdb._exceptions.OperationalError: (1366, "Incorrect string value: '\\xC5\\x91tan\\xC3...' for column 'path' at row 1")

The query that raised this error looks as follows:

	INSERT INTO `orm_target_file`
		(`target_id`, `path`, `size`, `inodetype`, `permission`,
		`owner`, `group`, `directory_id`, `sym_target_id`)
	VALUES (19,
		'/usr/share/ca-certificates/mozilla/NetLock_Arany_=Class_Gold=_F\xc5\x91tan\xc3\xbas\xc3\xadtv\xc3\xa1ny.crt',
		1476, 1, 'rw-r--r--', 'root', 'root', NULL, NULL)

The file causing this error has the following UTF-8 encoded filename:

	NetLock_Arany_=Class_Gold=_Főtanúsítvány.crt
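
To illustrate why this name trips a latin1 column, here is a minimal Python sketch that decodes the escaped bytes from the query above and tries to re-encode the result. It assumes that Python's `latin-1` and `cp1252` codecs are a fair stand-in for MySQL's `latin1` character set (which MySQL documents as cp1252); the variable names are only illustrative.

```python
# Bytes of the problematic part of the filename, copied from the INSERT above.
raw = b"F\xc5\x91tan\xc3\xbas\xc3\xadtv\xc3\xa1ny"

# Interpreted as UTF-8, the bytes form the readable name fragment.
name = raw.decode("utf-8")
print(name)

# Check whether the decoded text fits into single-byte Western encodings.
fits = {}
for codec in ("latin-1", "cp1252"):
    try:
        name.encode(codec)
        fits[codec] = True
    except UnicodeEncodeError:
        # U+0151 (the letter after "F") has no code point in these encodings.
        fits[codec] = False
print(fits)
```

The accented vowels ú, í and á exist in latin1; it is the ő (bytes `\xC5\x91`, the first bytes quoted in the error message) that cannot be represented.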

When looking into the database I found out that the column `path` of table
`orm_target_file` has the following properties:

	CHARACTER_SET_NAME: latin1
	COLLATION_NAME: latin1_swedish_ci
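
For reference, these properties can be read from MySQL's `information_schema.COLUMNS` view. The following is a minimal sketch, assuming a DB-API cursor that uses the `%s` parameter style (as MySQLdb does); the helper name `column_charset` and the schema name in the usage note are hypothetical.

```python
def column_charset(cursor, schema, table, column):
    """Return (CHARACTER_SET_NAME, COLLATION_NAME) for one column, or None."""
    cursor.execute(
        "SELECT CHARACTER_SET_NAME, COLLATION_NAME "
        "FROM information_schema.COLUMNS "
        "WHERE TABLE_SCHEMA = %s AND TABLE_NAME = %s AND COLUMN_NAME = %s",
        (schema, table, column),
    )
    # One row is expected; fetchone() returns None if the column is unknown.
    return cursor.fetchone()
```

With an open connection this could be called as `column_charset(conn.cursor(), "toaster", "orm_target_file", "path")`, where `"toaster"` stands in for whatever the database is actually named.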

Apparently, the column `path` is not ready for UTF-8 strings. I can fix that
manually by running the following statement with the `mysql` tool:

	ALTER TABLE orm_target_file
	CONVERT TO CHARACTER SET utf8
	COLLATE utf8_general_ci;

This change makes the database error disappear.

I would like to fix that directly in Toaster's `orm/models.py`. I found the
following definition in class `Target_File`:

    path = models.FilePathField()

It seems like I need to pass some clever options to `FilePathField`, but which?
My own research in that direction has brought up nothing useful so far.

My questions are thus:

* How can I parametrize `FilePathField` to properly handle UTF-8 encoded
  filenames in the underlying database?

* What should a corresponding migration file in `orm/migrations` look like?
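
One possible direction for the second question, offered as a sketch and not as a confirmed answer: the manual `ALTER TABLE` statement above could be carried by a migration that runs raw SQL (Django provides the `migrations.RunSQL` operation for that). The helper below only builds the forward and reverse statements. The names `convert_table_sql`, `FORWARD_SQL` and `REVERSE_SQL` are hypothetical, and the reverse values are simply the latin1 properties observed on the column.

```python
def convert_table_sql(table, charset, collation):
    """Build an ALTER TABLE statement that converts a table's text columns."""
    return (
        f"ALTER TABLE `{table}` "
        f"CONVERT TO CHARACTER SET {charset} COLLATE {collation};"
    )


# Forward: the conversion used manually in this message.
FORWARD_SQL = convert_table_sql("orm_target_file", "utf8", "utf8_general_ci")
# Reverse: back to the character set and collation observed on the column.
REVERSE_SQL = convert_table_sql("orm_target_file", "latin1", "latin1_swedish_ci")

print(FORWARD_SQL)
print(REVERSE_SQL)
```

In a migration file these two strings would be passed as `migrations.RunSQL(FORWARD_SQL, REVERSE_SQL)` inside `operations`, with `dependencies` pointing at the latest existing migration of the `orm` app. Note that the statement is MySQL-specific, so such a migration would presumably need to be skipped on other database backends. MySQL's `utf8` is also limited to three bytes per character; `utf8mb4` would be the choice if characters outside the Basic Multilingual Plane can occur in paths.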

Thanks!

Best,
Holger
