comment display
09-01-2015, 08:29
Post: #11
RE: comment display
(09-01-2015 02:48)Andre Gosselin Wrote:  Thanks Simon,
Your advice worked, but I have another problem.
I tend to use the COMMENT tag to store free form multi-line text which is difficult to "formalize" using standard tags. A good example is the cast of an opera. I frequently copy-paste this text from other sources like the album booklet originally in PDF format. I found that, in doing this, I may paste an "illegal" character into the COMMENT tag, which MinimServer reports as follows inside the log file:
Quote:Warning: illegal character 0x01 in COMMENT tag value in file Callas, Maria/Callas- Lyric & Coloratura arias (1954) - etc...
From my observations, this illegal character is almost always 0x01. Even though the log qualifies this situation as a "warning", the net result is that MinimServer ignores the COMMENT tag completely and does not send it to the control point (BubbleUPnP in my case).

Can you post a link to an example PDF booklet that is causing this problem?
09-01-2015, 18:43
Post: #12
RE: comment display
(09-01-2015 08:29)simoncn Wrote:  Can you post a link to an example PDF booklet that is causing this problem?
Simon,
The booklet comes from the "Callas Remastered" edition, which is not publicly available. I have uploaded the booklet to your ftp server, under "/andregosselin".

Try pasting in a COMMENT tag the contents of the last page of the booklet:

Quote:Recorded: 12–17.VI.1954, Teatro alla Scala, Milan
Producer: Walter Legge · Balance engineer: Robert Beckett
Recorded in cooperation with E.A. Teatro alla Scala, Milan
Newly remastered from the original tapes at Abbey Road Studios
Digital remastering  2014 Warner Classics, Warner Music UK Ltd. A Warner Music Group Company.
? 2014 Warner Classics, Warner Music UK Ltd. A Warner Music Group Company.
http://www.warnerclassics.com

I realize that, when creating this message, I can see two offending characters (originally shown in the PDF as a copyright symbol, and a 'P' inside a circle). But when pasting to a COMMENT tag using mp3tag, foobar or mediamonkey, only the second one is "hinted" at in the software UI. The same is true once my message is previewed: the first illegal character goes unnoticed.

Hunting for illegal characters introduced by a cut&paste operation can thus quickly become really troublesome. Cutting & pasting inside the comments tag appears such a natural thing to do (at least for me) that I risk the suggestion that, for this tag only, illegal characters should be reported in the log, but stripped so that the tag contents can be safely and usefully passed to the control point.

Thanks for your help
09-01-2015, 21:52
Post: #13
RE: comment display
(09-01-2015 18:43)Andre Gosselin Wrote:  I realize that, when creating this message, I can see two offending characters (originally shown in the PDF as a copyright symbol, and a 'P' inside a circle). But when pasting to a COMMENT tag using mp3tag, foobar or mediamonkey, only the second one is "hinted" at in the software UI. The same is true once my message is previewed: the first illegal character goes unnoticed.

Hunting for illegal characters introduced by a cut&paste operation can thus quickly become really troublesome. Cutting & pasting inside the comments tag appears such a natural thing to do (at least for me) that I risk the suggestion that, for this tag only, illegal characters should be reported in the log, but stripped so that the tag contents can be safely and usefully passed to the control point.

Thanks for your help

I have reproduced the problem using the PDF file you uploaded. I have also tried copying and pasting text from some of my own booklet PDF files. These don't create the same problem with illegal characters because the pasted text contains C in place of the © symbol and P in place of the ℗ symbol.

My conclusions are:

1) This 0x01 problem is relatively unusual because it occurs only with faulty PDF files

2) When this 0x01 problem does occur, there is no simple or convenient way for the user to locate the 0x01 character and remove it

3) Because the position of the 0x01 character can't be located easily, MinimServer should convert the illegal tag to a legal format instead of ignoring it

4) It would be helpful for MinimServer to show the user where the 0x01 character is in the tag value so that the user can fix it with a tag editor

My proposed solution is to add code to replace all illegal characters (not just 0x01) in all tag values (not just COMMENT) with a legal substitute such as the Unicode replacement character � (U+FFFD). The replacement would be internal to MinimServer and the file tags would not be modified.

When the user is viewing the tag value in a control point, the replacement character will be visible and this will show the user that the tag value contains an illegal character as well as identifying the exact position of the illegal character.
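
The replacement idea described above can be sketched as follows. This is a minimal illustration in Python, not MinimServer's own code. It assumes that "illegal" means a character outside the XML 1.0 Char production; the names ILLEGAL_XML_CHARS, sanitize_tag_value and illegal_positions are hypothetical.

```python
import re

# Assumption: "illegal" means not allowed by XML 1.0, i.e. control
# characters below 0x20 other than tab (0x09), line feed (0x0A) and
# carriage return (0x0D), plus the noncharacters U+FFFE and U+FFFF.
# Lone surrogates are not covered by this sketch.
ILLEGAL_XML_CHARS = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\ufffe\uffff]")

REPLACEMENT = "\ufffd"  # Unicode replacement character


def sanitize_tag_value(value: str) -> str:
    """Return a copy of value with each illegal character replaced.

    Only the in-memory copy changes; nothing is written back to the file.
    """
    return ILLEGAL_XML_CHARS.sub(REPLACEMENT, value)


def illegal_positions(value: str) -> list[int]:
    """Return the index of every illegal character, e.g. for a log message."""
    return [match.start() for match in ILLEGAL_XML_CHARS.finditer(value)]
```

Because each illegal character is replaced one for one, the replacement character appears at the same position in the displayed value as the illegal character has in the tag, which is what lets the user find it with a tag editor.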
09-01-2015, 23:11
Post: #14
RE: comment display
(09-01-2015 21:52)simoncn Wrote:  My proposed solution is to add code to replace all illegal characters (not just 0x01) in all tag values (not just COMMENT) with a legal substitute such as the Unicode replacement character � (U+FFFD). The replacement would be internal to MinimServer and the file tags would not be modified.

When the user is viewing the tag value in a control point, the replacement character will be visible and this will show the user that the tag value contains an illegal character as well as identifying the exact position of the illegal character.

I warmly welcome this proposal. When I was reading this thread, my immediate thought was that the problem was by no means confined to the Comment tag. I copy and paste into a number of tags, and it is always possible to pick up illegal characters by accident. The combination of workaround and diagnostic in MinimServer would make this problem very manageable.

David
09-01-2015, 23:49
Post: #15
RE: comment display
(09-01-2015 21:52)simoncn Wrote:  I have reproduced the problem using the PDF file you uploaded. I have also tried copying and pasting text from some of my own booklet PDF files. These don't create the same problem with illegal characters because the pasted text contains C in place of the © symbol and P in place of the ℗ symbol.

My conclusions are:

1) This 0x01 problem is relatively unusual because it occurs only with faulty PDF files

2) When this 0x01 problem does occur, there is no simple or convenient way for the user to locate the 0x01 character and remove it

3) Because the position of the 0x01 character can't be located easily, MinimServer should convert the illegal tag to a legal format instead of ignoring it

4) It would be helpful for MinimServer to show the user where the 0x01 character is in the tag value so that the user can fix it with a tag editor

My proposed solution is to add code to replace all illegal characters (not just 0x01) in all tag values (not just COMMENT) with a legal substitute such as the Unicode replacement character � (U+FFFD). The replacement would be internal to MinimServer and the file tags would not be modified.

When the user is viewing the tag value in a control point, the replacement character will be visible and this will show the user that the tag value contains an illegal character as well as identifying the exact position of the illegal character.

Simon,
Is it the PDF that is faulty, or the technique I used to cut&paste (open the PDF with Adobe Reader, highlight text + Ctrl-C, Ctrl-V in mp3tag, all done on Win7)? It would be a shame if all volumes of the Callas Remastered edition in high-res download came with faulty PDF booklets. When you say that you do not face the same problems when pasting from your own PDF files, and that "pasted text contains C in place of the © symbol and P in place of the ℗ symbol", could it be that this is because you go a different route than me?

As for your diagnosis of the problem and the solution you proposed, I totally agree, and would like to thank you very much for the care you take to solve issues with MinimServer.

Regards
10-01-2015, 08:53
Post: #16
RE: comment display
(09-01-2015 23:49)Andre Gosselin Wrote:  Simon,
Is it the PDF that is faulty, or the technique I used to cut&paste (open the PDF with Adobe Reader, highlight text + Ctrl-C, Ctrl-V in mp3tag, all done on Win7)? It would be a shame if all volumes of the Callas Remastered edition in high-res download came with faulty PDF booklets. When you say that you do not face the same problems when pasting from your own PDF files, and that "pasted text contains C in place of the © symbol and P in place of the ℗ symbol", could it be that this is because you go a different route than me?

I am copying text by opening the PDF file in Adobe Reader, selecting text with the mouse and pressing Ctrl+C to copy it to the Windows clipboard. I am then pasting the copied text into various text editors and an empty Microsoft Word document. Using this approach, the © and ℗ symbols are translated to C and P, even though the copied text is Unicode and both © and ℗ are valid Unicode characters. I don't know how or why this translation is happening.

Quote:As for your diagnosis of the problem and the solution you proposed, I totally agree, and would like to thank you very much for the care you take to solve issues with MinimServer.

Regards

This change will be in the next update. Thanks for drawing my attention to the problem with the "invisible" 0x01 character.
10-01-2015, 23:54
Post: #17
RE: comment display
Simon, can you look at this concerning description XML string encoding:

http://forum.xda-developers.com/showpost...count=8397
11-01-2015, 01:03
Post: #18
RE: comment display and BubbleUPnP v2.2.3.1
(10-01-2015 23:54)bubbleguuum Wrote:  Simon, can you look at this concerning description XML string encoding:

http://forum.xda-developers.com/showpost...count=8397

This post is about an incompatibility in the encoding of newline characters sent to BubbleUPnP. With the new BubbleUPnP v2.2.3.1 release, newlines in the COMMENT tag as sent by MinimServer are rejected by BubbleUPnP as illegally encoded. As a result, all the text is displayed as a single long line in the Show Metadata window, which quickly becomes unreadable.

My advice is not to upgrade to v2.2.3.1 until this issue is fixed.

Regards
11-01-2015, 10:19
Post: #19
RE: comment display
(10-01-2015 23:54)bubbleguuum Wrote:  Simon, can you look at this concerning description XML string encoding:

http://forum.xda-developers.com/showpost...count=8397

The characters 0x9, 0xA and 0xD are valid in XML. See this section of the XML specification.
11-01-2015, 11:12 (This post was last modified: 11-01-2015 11:12 by bubbleguuum.)
Post: #20
RE: comment display
(11-01-2015 10:19)simoncn Wrote:  
(10-01-2015 23:54)bubbleguuum Wrote:  Simon, can you look at this concerning description XML string encoding:

http://forum.xda-developers.com/showpost...count=8397

The characters 0x9, 0xA and 0xD are valid in XML. See this section of the XML specification.

Yes indeed, I found out about it this morning after checking. I'll fix my buggy XML invalid-character method.

Still, some media servers (not MinimServer) manage to send invalid characters < 0x20 (other than the three valid ones above), and BubbleUPnP has to filter them out before serializing XML. Curiously, the Android XML Pull parser will parse these invalid characters but will not (rightly so) serialize them (it throws an exception).
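
A rough sketch of this kind of filter, written in Python rather than BubbleUPnP's actual Android code. The validity test follows the Char production of the XML 1.0 specification, which is why 0x9, 0xA and 0xD pass while other characters below 0x20 are dropped. The function names and the "desc" element name are hypothetical.

```python
import xml.etree.ElementTree as ET


def is_valid_xml_char(cp: int) -> bool:
    """True if code point cp matches the XML 1.0 Char production."""
    return (
        cp in (0x09, 0x0A, 0x0D)        # tab, line feed, carriage return
        or 0x20 <= cp <= 0xD7FF
        or 0xE000 <= cp <= 0xFFFD
        or 0x10000 <= cp <= 0x10FFFF
    )


def strip_invalid_xml_chars(text: str) -> str:
    """Drop every character that XML 1.0 does not allow."""
    return "".join(ch for ch in text if is_valid_xml_char(ord(ch)))


def to_xml(tag: str, text: str) -> str:
    """Filter the text, then serialize it as the content of one element."""
    elem = ET.Element(tag)
    elem.text = strip_invalid_xml_chars(text)
    return ET.tostring(elem, encoding="unicode")


# Newlines survive the filter; the stray 0x01 does not.
xml_text = to_xml("desc", "Act 1\nAct 2\x01")
```

Filtering before serialization keeps the output parseable by strict XML parsers while leaving multi-line text intact.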