MinimServer Forum
Amazon Cloud : from a good news to a nightmare... - Printable Version


Amazon Cloud : from a good news to a nightmare... - lyapounov - 13-09-2016 13:19

Dear all

it all starts with good news, which is turning into a nightmare.

The good news is Amazon's unlimited cloud storage offer for 70 euros / year, with three months free.

So I installed Cloud Sync on my Synology server and started syncing with Amazon to have a backup of all my music. It works perfectly (up to 240 Mb/sec...)

I then noticed that some parts of my music tree were not synced. After some digging, I realized that every file or folder whose name has an accented character coded as two characters (I guess because of UTF-8) was not synced. As an example, the Fauré folder was not synced. I opened File Station on the Synology, went to the Fauré folder and renamed it: when I backspaced the é it became Faure, and after another backspace it became Faur. I then typed an é (single character) and, magic, the folder was synced and the é showed up in the Amazon cloud web interface. Same thing with files inside a folder. Of course, when the é was not coded as two characters, it synced perfectly.
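The "two backspaces" behaviour described above is what one would expect from a decomposed é (an e followed by a combining accent). Here is a minimal Python sketch of the difference, assuming the names really are Unicode in decomposed form; the helper name to_nfc is only illustrative:

```python
import unicodedata

composed = "Faur\u00e9"      # é stored as a single code point (composed form, NFC)
decomposed = "Faure\u0301"   # e followed by a combining acute accent (decomposed form, NFD)

def to_nfc(name: str) -> str:
    """Return the name with every decomposed sequence recombined (NFC)."""
    return unicodedata.normalize("NFC", name)

print(len(composed), len(decomposed))   # 5 6
print(composed == decomposed)           # False: they look the same but differ
print(decomposed[:-1])                  # Faure: one "backspace" removes only the accent
print(to_nfc(decomposed) == composed)   # True
```

Both strings display as Fauré, which is why the difference is invisible in a file browser.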

Here is the nightmare: I have tons of files (basically all the ones I ripped myself rather than bought on Qobuz, I would say around 25000 of my 65000) which have this encoding, many with accented letters, and which are therefore not synced.

I am now in the mood to change all such filenames to something more standard, without any accents, like what Qobuz does when you download.

The problem is: some filenames are coded in UTF-8, and some others are not. How can I tell?
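One possible way to answer the "how can I tell" question is to look at the raw bytes of each name and try a strict UTF-8 decode. This is only a sketch in Python (not the PHP mentioned in this thread); the function name classify_name and its labels are made up for illustration, and a name that fails the UTF-8 test could be in any legacy encoding, which this does not try to guess:

```python
import unicodedata

def classify_name(raw: bytes) -> str:
    """Roughly classify the encoding of a filename given as raw bytes."""
    try:
        text = raw.decode("utf-8")          # strict decode: fails on invalid UTF-8
    except UnicodeDecodeError:
        return "not UTF-8"
    if raw.isascii():                       # no accented characters at all
        return "ASCII"
    if text == unicodedata.normalize("NFC", text):
        return "UTF-8, NFC"                 # composed form
    if text == unicodedata.normalize("NFD", text):
        return "UTF-8, NFD"                 # decomposed form
    return "UTF-8, mixed"

print(classify_name(b"Faur\xc3\xa9"))       # UTF-8, NFC
print(classify_name(b"Faure\xcc\x81"))      # UTF-8, NFD
print(classify_name(b"Faur\xe9"))           # not UTF-8
```

In Python, calling os.listdir with a bytes path (for example os.listdir(b".")) returns the names as raw bytes, which can be passed straight to a function like this.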

I thought of writing a PHP routine which would read the track name tag using getID3 and then use some preg_replace calls to generate a clean filename, but for that to work I need to know the character encoding of the track name. How?
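For the "clean filename" step, here is a rough sketch of the idea in Python rather than PHP. It assumes the track name has already been decoded to a Unicode string (reading the tag is not shown), and the function name clean_filename and the set of allowed characters are arbitrary choices for illustration:

```python
import re
import unicodedata

def clean_filename(title: str) -> str:
    """Turn a track title into an accent-free, ASCII-only filename stem."""
    # Decompose first, so é becomes e + combining accent whatever form it started in.
    decomposed = unicodedata.normalize("NFD", title)
    # Drop the combining marks (the accents), keeping the base letters.
    no_marks = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Discard anything still outside ASCII (letters with no decomposition are lost).
    ascii_only = no_marks.encode("ascii", "ignore").decode("ascii")
    # Replace runs of characters that are awkward in filenames with an underscore.
    cleaned = re.sub(r"[^A-Za-z0-9._ -]+", "_", ascii_only)
    return cleaned.strip(" _")

print(clean_filename("03 Hop-l\u00e0"))   # 03 Hop-la
print(clean_filename("Faure\u0301"))      # Faure
```

Because the name is decomposed before the accents are removed, composed and decomposed input give the same result.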

Oh well, if anyone has a magic wand, I'll take it :-)

Thx


RE: Amazon Cloud : from a good news to a nightmare... - simoncn - 13-09-2016 22:40

This might be caused by the difference between the NFC and NFD forms of Unicode. Because of this issue, MinimServer automatically converts all NFD characters in tag values to NFC (see this post).

You can do ls -l and redirect the output to a file, then use a hex editor to look at the bytes in the file to see what encoding has been used.
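If a hex editor is not at hand, the same inspection can be sketched in Python. The function names hex_of and hex_names are illustrative only, and bytes.hex with a separator needs Python 3.8 or later:

```python
import os

def hex_of(raw: bytes) -> str:
    """Return the bytes of a name as space-separated hex pairs."""
    return raw.hex(" ")

def hex_names(directory: str = "."):
    """List (raw name, hex dump) pairs for every entry in a directory."""
    # Passing a bytes path makes os.listdir return undecoded bytes.
    return [(raw, hex_of(raw)) for raw in sorted(os.listdir(os.fsencode(directory)))]

# What the three likely encodings of "Fauré" look like:
print(hex_of("Faur\u00e9".encode("utf-8")))    # 46 61 75 72 c3 a9     (UTF-8, composed)
print(hex_of("Faure\u0301".encode("utf-8")))   # 46 61 75 72 65 cc 81  (UTF-8, decomposed)
print(hex_of("Faur\u00e9".encode("latin-1")))  # 46 61 75 72 e9        (Latin-1)
```

A c3 a9 pair means a composed é in UTF-8, 65 cc 81 means e plus a combining accent, and a lone e9 points to a single-byte encoding such as Latin-1.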

If you would like to attach a file containing "raw" ls -l output to a post here, I will look at it and let you know what the encoding is.


RE: Amazon Cloud : from a good news to a nightmare... - lyapounov - 13-09-2016 23:48

Here are two files.
The first one has the actual encoding.
In the second one, I changed the à in track number 03, Hop-là, using File Station.

Thanks Simon


RE: Amazon Cloud : from a good news to a nightmare... - lyapounov - 14-09-2016 01:53

Well, I must correct my post.

Actually, the #233 was coming from the ID3 tag, not the filename.

Sorry; will keep you posted


OK, I think I understand the problem
https://developer.apple.com/library/content/qa/qa1173/_index.html

My files are transferred from a Mac to the Synology using Transmit (I always use sftp to transfer, I never mount my NAS volume), and yes, it does not properly handle composed / decomposed characters.

When I run htmlentities on the filename, instead of converting é, for example, to &eacute; it converts it to #233

So I have to find a way to convert the #233 to e, if possible without creating a table in PHP...
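A table-free way to do this can be sketched as follows, again in Python rather than PHP. It assumes the #233 seen above is the HTML numeric character reference &#233; (233 is the code point of é); the function name entity_to_plain is made up for the example:

```python
import html
import unicodedata

def entity_to_plain(text: str) -> str:
    """Decode HTML character references, then strip accents without a lookup table."""
    decoded = html.unescape(text)                       # "&#233;" becomes "é"
    decomposed = unicodedata.normalize("NFD", decoded)  # "é" becomes "e" + combining accent
    # Keep only the base characters, dropping the combining accents.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

print(entity_to_plain("Faur&#233;"))      # Faure
print(entity_to_plain("Hop-l&agrave;"))   # Hop-la
```

The Unicode decomposition data plays the role of the table, so no hand-written mapping from accented to plain letters is needed.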

Thx