Problem with UTF-8 in playlists (.m3u8 files)
30-09-2015, 14:44
Post: #11
RE: Problem with UTF-8 in playlists (.m3u8 files)
Please see attachment. Out of curiosity: How can I distinguish between files containing composed or decomposed Unicode? The 'file' utility just seems to output "UTF-8 Unicode text".


Attached File(s)
.gz  sample.txt.gz (Size: 521 bytes / Downloads: 1)
30-09-2015, 15:17
Post: #12
RE: Problem with UTF-8 in playlists (.m3u8 files)
(30-09-2015 14:44)Manul Wrote:  Please see attachment. Out of curiosity: How can I distinguish between files containing composed or decomposed Unicode? The 'file' utility just seems to output "UTF-8 Unicode text".

Thanks for the quick reply. These filenames are composed. I will try to set up a test using decomposed filenames to see what Java returns to MinimServer in this case.

I think the only reliable way to find out whether Unicode characters within a file are composed or decomposed is to use a hex editor to look at the bytes within the file. It's also possible for a file to contain a mixture of composed and decomposed characters.
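As an alternative to reading hex dumps by eye, here is a minimal Python sketch that classifies each line of a file by comparing it with its own NFC and NFD normalizations. It assumes the file is valid UTF-8 (as 'file' reported for the sample). The names `normalization_form` and `report` are made up for illustration.

```python
import unicodedata


def normalization_form(text):
    """Classify text as 'neutral', 'NFC', 'NFD', or 'mixed'."""
    nfc = unicodedata.normalize("NFC", text)
    nfd = unicodedata.normalize("NFD", text)
    if text == nfc and text == nfd:
        return "neutral"  # e.g. pure ASCII: both forms are identical
    if text == nfc:
        return "NFC"      # composed
    if text == nfd:
        return "NFD"      # decomposed
    return "mixed"        # neither purely composed nor purely decomposed


def report(path):
    # Assumption: the file is UTF-8 encoded.
    with open(path, encoding="utf-8") as f:
        for number, line in enumerate(f, start=1):
            form = normalization_form(line)
            if form != "neutral":
                print(number, form, line.rstrip("\n"))
```

Calling `report("party.m3u8")` would print only the lines that contain characters affected by normalization, which also makes a mixture of both forms in one file easy to spot.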
30-09-2015, 15:43
Post: #13
RE: Problem with UTF-8 in playlists (.m3u8 files)
Thanks for your answer! I've investigated some more: The files are rsynced from my Mac to the NAS. Interestingly, when I do an 'ls | hexdump -C' on the AFP-mounted NAS directory on my Mac, 'ü' is encoded as 75 cc 88 (composed) - same as in the playlist file. Doing the same on the NAS directly or on the Pi shows 'ü' as c3 bc (decomposed). Somehow AFP seems to do some on-the-fly conversion here...
30-09-2015, 16:16
Post: #14
RE: Problem with UTF-8 in playlists (.m3u8 files)
(30-09-2015 15:43)Manul Wrote:  Thanks for your answer! I've investigated some more: The files are rsynced from my Mac to the NAS. Interestingly, when I do an 'ls | hexdump -C' on the AFP-mounted NAS directory on my Mac, 'ü' is encoded as 75 cc 88 (composed) - same as in the playlist file. Doing the same on the NAS directly or on the Pi shows 'ü' as c3 bc (decomposed). Somehow AFP seems to do some on-the-fly conversion here...

Actually, 75 cc 88 is decomposed ("u" + "combining diaeresis") and c3 bc is composed ("u with diaeresis").
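The two byte sequences can be reproduced with Python's standard `unicodedata` module. This is only a small illustration. The helper `hex_bytes` is a made-up name.

```python
import unicodedata


def hex_bytes(text):
    """Return the UTF-8 encoding of text as space-separated hex bytes."""
    return " ".join(f"{b:02x}" for b in text.encode("utf-8"))


composed = unicodedata.normalize("NFC", "u\u0308")   # U+00FC, "u with diaeresis"
decomposed = unicodedata.normalize("NFD", "\u00fc")  # U+0075 + U+0308

print(hex_bytes(composed))    # c3 bc
print(hex_bytes(decomposed))  # 75 cc 88
```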

I have tested the fix and it seems to work. It should be available soon in the next update. This will mean that users don't need to be concerned about the murky depths of how Unicode represents accented characters and how this is handled by various file systems.
30-09-2015, 18:23
Post: #15
RE: Problem with UTF-8 in playlists (.m3u8 files)
Great work and thanks for your help, Simon!

As a workaround, I've fixed the Python script I wrote to extract my iTunes playlists into m3u8 files so that it now normalizes the text to composed form (NFC, hope I have it the right way around this time). This brought the error count in my MinimServer log down from ~5000 to ~500. Now to tackle the rest...
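The script itself isn't shown in this thread, so the following is only a sketch of what such a normalization step might look like, assuming the playlist is already a UTF-8 text file. The function name `normalize_playlist` and its parameters are hypothetical.

```python
import unicodedata


def normalize_playlist(src_path, dst_path, form="NFC"):
    """Copy a UTF-8 playlist, rewriting every line in the given normalization form."""
    # newline="" keeps the original line endings (LF or CRLF) untouched.
    with open(src_path, encoding="utf-8", newline="") as src, \
         open(dst_path, "w", encoding="utf-8", newline="") as dst:
        for line in src:
            dst.write(unicodedata.normalize(form, line))
```

Writing to a separate output file leaves the original playlist in place, so both versions can be compared afterwards.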
01-10-2015, 22:28
Post: #16
RE: Problem with UTF-8 in playlists (.m3u8 files)
(30-09-2015 16:16)simoncn Wrote:  I have tested the fix and it seems to work. It should be available soon in the next update. This will mean that users don't need to be concerned about the murky depths of how Unicode represents accented characters and how this is handled by various file systems.

This fix is now available in MinimServer update 67.
02-10-2015, 11:26
Post: #17
RE: Problem with UTF-8 in playlists (.m3u8 files)
Thank you again, Simon. I can confirm that with this update MinimServer finds the corresponding files with both the old and the normalized versions of my playlist files.
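For anyone writing their own tools around playlists, the same effect (matching a name regardless of its normalization form) can be sketched in Python as below. This is not MinimServer's actual code, which isn't shown in this thread. It only illustrates one possible approach: compare both sides after normalizing to NFC. The function name `find_entry` is made up.

```python
import os
import unicodedata


def find_entry(directory, wanted_name):
    """Return the on-disk name in `directory` that equals `wanted_name`
    after both are normalized to NFC, or None if there is no match."""
    target = unicodedata.normalize("NFC", wanted_name)
    for name in os.listdir(directory):
        if unicodedata.normalize("NFC", name) == target:
            return name  # the name exactly as the file system reports it
    return None
```

Returning the name as stored on disk, rather than the normalized one, means the result can be passed straight to `open()` on file systems that treat the two forms as different names.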
09-03-2018, 07:18
Post: #18
RE: Problem with UTF-8 in playlists (.m3u8 files)
Hmm.

I seem to be having the same problem running MinimServer 0.8.5.2 update 121 under Armbian 5.38 (Ubuntu 16.04.4 LTS). A typical error in the log file:

Code:
Error: playlist party.m3u8: no matching file for Movits!/Äppelknyckarjazz/03 Swing för hyresgästföreningen.mp3

The same .m3u8 file works fine on MinimServer 0.8.4 update 114 on MacOSX.
09-03-2018, 09:33
Post: #19
RE: Problem with UTF-8 in playlists (.m3u8 files)
A little more info on the configuration on Armbian:

Code:
MinimServer 0.8.5.2 update 121, Copyright (c) 2012-2018 Simon Nash. All rights reserved.
MinimStreamer 0.7.6, Copyright (c) 2012-2018 Simon Nash. All rights reserved.
Java(TM) SE Runtime Environment (build 1.8.0_161-b12)
Java HotSpot(TM) Client VM (build 25.161-b12, mixed mode)
Platform default charset is UTF-8
Language setting is 'eng'

I also tried updating the MacOSX instance to MinimServer 0.8.5.2 update 121. That still works correctly.
10-03-2018, 17:15 (This post was last modified: 10-03-2018 17:18 by simoncn.)
Post: #20
RE: Problem with UTF-8 in playlists (.m3u8 files)
Please do the following from a terminal window on the Armbian machine:

Code:
cd 'Movits!/'
ls -l Äppelknyckarjazz/* >~/lstest.txt

then attach the lstest.txt file as a file attachment here. Do not copy the contents of this file into a post.

Also, please confirm that the Movits!/ folder is a subfolder of the folder containing the party.m3u8 file.
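If Python is available on the machine, a complementary check is to print each file name with its non-ASCII characters escaped, next to the raw bytes. Unlike pasted text, this output stays unambiguous about composed versus decomposed characters. This is only a sketch: the function name `describe_names` is made up, and it assumes the file system encoding is UTF-8.

```python
import os


def describe_names(directory):
    """Return (escaped_name, hex_bytes) pairs for each entry, sorted by name."""
    rows = []
    for name in sorted(os.listdir(directory)):
        raw = os.fsencode(name)  # the bytes as stored by the file system
        rows.append((ascii(name), " ".join(f"{b:02x}" for b in raw)))
    return rows


if __name__ == "__main__":
    for escaped, hexed in describe_names("."):
        print(escaped, hexed)
```

A decomposed "ü" shows up as `u\u0308` in the first column, while a composed one shows up as `\xfc`.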