Portuguese Characters in Unicode (UTF-8) or Quoted-Printable

 
Rick
New User


Joined: 22 Jul 2007
Posts: 2

Posted: Sun Jul 22, 2007 11:45 pm    Post subject: Portuguese Characters in Unicode (UTF-8) or Quoted-Printable

Hi,

I'm a Portuguese user (from Portugal) of the Registered version of NewsMan Pro 2.7 (Build 2.7.0.2).
Congratulations to the developers for building a great software program! :)

Unfortunately, I'm having problems reading some posts in Portuguese newsgroups that contain accented Portuguese letters, like á ("a acute"), ã ("a tilde"), ç ("c cedilla") and others (entered using Portuguese keyboards).

To sum up:

1 - If the post was made using the ISO-8859-1 ("Latin 1") encoding with 8-bit characters, all the characters appear as they should in NewsMan.

Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/71d43e73e5620ba6?dmode=source

Same Sample - Parsed version:
http://groups.google.com/group/misc.test/msg/71d43e73e5620ba6

Message-ID link:
<news:13a72huq4ifhe42@corp.supernews.com>


2 - If the post was made using the ISO-8859-1 ("Latin 1") charset with 8-bit encoding, but the *Subject* of the message uses only 7 bits rather than 8, then the Subject is displayed incorrectly.

Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/8a081d48d84bdfd7?dmode=source&hl=en

Same Sample - Parsed version:
http://groups.google.com/group/misc.test/msg/8a081d48d84bdfd7

Message-ID link:
<news:13a71purb0fvk92@corp.supernews.com>

The original subject was:
Test of Portuguese accented characters using ISO-8859-1 ("Latin 1") with 8-bit (ÀÁÂÃ Ç ÈÉÊ ÌÍÎ ÒÓÔÕ ÚÙÛ àáâã ç èéê ìíî òóôõ ùú)

In NewsMan it appears like this:
Test of Portuguese accented characters usi Test of Portuguese accented characters using ISO-8859-1 ("Latin 1") with 8-bit (ÀÁÂÃ Test of Portuguese acce...

(note that the Subject breaks after the letters "usi" - from the word "using" - and then starts from the beginning - and then stops again after "ÀÁÂÃ" and then starts again from the beginning)
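For reference, a 7-bit Subject that carries accented characters is normally written as RFC 2047 "encoded words". A minimal Python sketch of decoding one follows; the header value is made up for illustration (it is not copied from the test post), and `decode_subject` is a hypothetical helper name:

```python
from email.header import decode_header, make_header

# Hypothetical 7-bit Subject: one RFC 2047 encoded word
# (Q encoding, ISO-8859-1 charset). Not taken from the test post.
raw_subject = "=?ISO-8859-1?Q?Test_=E0=E1=E2=E3_=E7?="


def decode_subject(raw: str) -> str:
    """Decode RFC 2047 encoded words into a Unicode string."""
    return str(make_header(decode_header(raw)))


subject = decode_subject(raw_subject)
print(subject)
```

A reader that skips this step, or applies it to only part of a long folded header, could plausibly produce a broken Subject like the one above, but that is a guess about the cause.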


3 - If the post was made using the ISO-8859-1 ("Latin 1") charset with 7-bit "Quoted-Printable" encoding - http://en.wikipedia.org/wiki/Quoted-printable - then the "Quoted-Printable" sequences are NOT converted back to the correct accented characters.

Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/5b7571177da2db6d?dmode=source

Same sample - Parsed version:
http://groups.google.com/group/misc.test/msg/5b7571177da2db6d

Message-ID link:
<news:13a73a2e3tgv803@corp.supernews.com>

In NewsMan, the post should appear like it does in the "Parsed" version in Google Groups, but instead it appears like it does in the "Message Source" version.
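For anyone reproducing case 3, the expected conversion can be sketched in Python with the standard `quopri` module. The sample line is invented for illustration and `decode_qp` is a hypothetical helper; the point is only that decoding takes two steps, Quoted-Printable first and the declared charset second:

```python
import quopri

# Hypothetical body line as it would appear in the raw message source.
raw_line = b"a=E7=E3o, p=E1gina, cora=E7=E3o"


def decode_qp(raw: bytes, charset: str) -> str:
    """Undo Quoted-Printable, then interpret the bytes in the declared charset."""
    return quopri.decodestring(raw).decode(charset)


text = decode_qp(raw_line, "iso-8859-1")
print(text)
```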

4 - If the post was made using Unicode with 8-bit encoding ("UTF-8"), the accented characters also do not appear correctly:

Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.pt/group/misc.test/msg/c46a521fa2f5802b?dmode=source

Same sample - Parsed Version:
http://groups.google.pt/group/misc.test/msg/c46a521fa2f5802b

Message-ID link:
<news:13a743ktdd2u407@corp.supernews.com>

In NewsMan, the post should appear like it does in the "Parsed" version in Google Groups, but instead it appears like it does in the "Message Source" version.
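A likely explanation for case 4 (an assumption, not something confirmed in this thread at this point) is that the UTF-8 bytes are shown as if they were single-byte Latin 1 characters. A small Python sketch of that effect, using an invented sample word:

```python
original = "ação"

# UTF-8 stores each of these accented letters as two bytes.
utf8_bytes = original.encode("utf-8")

# Wrong: reading those bytes as ISO-8859-1 yields two characters
# for every accented letter.
misread = utf8_bytes.decode("iso-8859-1")

# Right: honour the charset declared in the Content-Type header.
correct = utf8_bytes.decode("utf-8")

print(len(utf8_bytes), repr(misread), correct)
```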

5 - If the post was made using Unicode with 8-bit encoding ("UTF-8") but then encoded using "Quoted-Printable", the accented characters also do not appear correctly (granted, this would probably be a strange setup for one to use):

Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/db19c735fea29d97?dmode=source

Same sample - Parsed Version:
http://groups.google.pt/group/misc.test/msg/c46a521fa2f5802b

Message-ID link:
<news:13a73imicl06810@corp.supernews.com>

In NewsMan, the post should appear like it does in the "Parsed" version in Google Groups, but instead it appears like it does in the "Message Source" version.
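To show how case 5 differs from case 3 on the wire, here is a short Python sketch that encodes the same letter both ways. The sample character is arbitrary; only the standard `quopri` module is used:

```python
import quopri

char = "á"

# Quoted-Printable escapes every non-ASCII byte as "=XX".
qp_latin1 = quopri.encodestring(char.encode("iso-8859-1"))  # one escape
qp_utf8 = quopri.encodestring(char.encode("utf-8"))          # two escapes

# Decoding reverses both layers: Quoted-Printable first, then the charset.
round_trip = quopri.decodestring(qp_utf8).decode("utf-8")

print(qp_latin1, qp_utf8, round_trip)
```

So a reader has to handle Quoted-Printable and UTF-8 together for this case to display correctly.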


All these posts appear correctly to me when using Microsoft Outlook Express 6 or Forté Agent 4.2 (on the same laptop, using Windows XP Professional with Service Pack 2, and all critical updates applied).


In my NewsMan Pro configuration ("Settings"), I have the following:

"Fonts and Colors" > "Character Set"
<Default> option selected in the Dropdown

"Fonts and Colors" > "Message View"
I'm using the "Tahoma" font (but "Arial", "Courier New" and "Verdana" seem to give me the same results).


Is this some problem with my configuration or are these bugs / features not yet developed?

Any help and/or information would be much appreciated! :)

Thanks in advance.

Best wishes,
Ricardo Dias Marques
ricmarques AT spamcop DOT net
administrator
Developer


Joined: 24 Jul 2004
Posts: 4750
Location: King William, VA

Posted: Mon Jul 23, 2007 2:29 am    Post subject: Re: Portuguese Characters in Unicode (UTF-8) or Quoted-Print

Rick wrote:
Hi,

I'm a Portuguese user (from Portugal) of the Registered version of NewsMan Pro 2.7 (Build 2.7.0.2).
Congratulations to the developers for building a great software program! :)

Unfortunately, I'm having problems reading some posts in Portuguese newsgroups that contain accented Portuguese letters, like á ("a acute"), ã ("a tilde"), ç ("c cedilla") and others (entered using Portuguese keyboards).

To sum up:

1 - If the post was made using the ISO-8859-1 ("Latin 1") encoding with 8-bit characters, all the characters appear as they should in NewsMan.


Good.

Rick wrote:
2 - If the post was made using the ISO-8859-1 ("Latin 1") charset with 8-bit encoding, but the *Subject* of the message uses only 7 bits rather than 8, then the Subject is displayed incorrectly.


Fixed for the next beta release.

The remaining three test posts have been captured and we'll take a look at what the issue might be as soon as we can.

Best Regards
administrator
Developer


Joined: 24 Jul 2004
Posts: 4750
Location: King William, VA

Posted: Mon Jul 23, 2007 5:57 pm

Cases #3 and #5 should be fixed in the next beta release due to changes in the handling of "Quoted-Printable" characters. It looks like the characters were being correctly decoded, but the decoded data was not getting transferred to the text buffer. A simple fix that was a bit hard to find.
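To illustrate the kind of bug described here, the following is a hand-rolled Quoted-Printable decoder in Python. It is a sketch only and not the actual NMP source; the function name and structure are assumptions. The lines to notice are the `out.append(...)` calls: a decoder that computes the byte value but never stores it in the output buffer would leave the text looking undecoded.

```python
HEX_DIGITS = b"0123456789ABCDEFabcdef"


def decode_quoted_printable(raw: bytes) -> bytes:
    """Illustrative Quoted-Printable decoder (not NewsMan Pro's code)."""
    out = bytearray()  # the output "text buffer"
    i = 0
    while i < len(raw):
        byte = raw[i]
        hex_pair = raw[i + 1:i + 3]
        if byte != ord("="):
            out.append(byte)               # ordinary character: copy as-is
            i += 1
        elif hex_pair == b"\r\n":
            i += 3                         # soft line break "=\r\n": drop it
        elif hex_pair[:1] == b"\n":
            i += 2                         # soft line break "=\n": drop it
        elif len(hex_pair) == 2 and all(c in HEX_DIGITS for c in hex_pair):
            out.append(int(hex_pair, 16))  # store the decoded byte
            i += 3
        else:
            out.append(byte)               # malformed escape: keep "=" literally
            i += 1
    return bytes(out)


decoded = decode_quoted_printable(b"cora=E7=E3o com quebra=\r\n suave")
print(decoded.decode("iso-8859-1"))
```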

Regards
administrator
Developer


Joined: 24 Jul 2004
Posts: 4750
Location: King William, VA

Posted: Mon Jul 23, 2007 6:17 pm

Case #4 is a Unicode UTF-8 encoding which is not supported by NMP. We have made a change for the next beta release to, at the very least, decode the UTF-8 encoding. This will allow the message body to be displayed correctly as long as a Unicode-enabled font is used.
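As a rough sketch of what "decode the UTF-8 encoding" involves on the reading side, here is how the same job looks with Python's standard `email` package. The sample article, the `body_text` helper, and the ISO-8859-1 fallback are all assumptions made for illustration; none of this reflects how NMP is implemented.

```python
from email import message_from_bytes

# Hypothetical raw article: UTF-8 body sent as 8-bit (like case #4).
raw_article = (
    b"Subject: test\r\n"
    b"MIME-Version: 1.0\r\n"
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: 8bit\r\n"
    b"\r\n"
    + "ação\r\n".encode("utf-8")
)


def body_text(raw: bytes, fallback_charset: str = "iso-8859-1") -> str:
    """Return the body of a single-part text article as Unicode text."""
    msg = message_from_bytes(raw)
    # decode=True undoes Quoted-Printable or base64 when the header says so.
    payload = msg.get_payload(decode=True)
    # Use the declared charset; the fallback is an assumption of this sketch.
    charset = msg.get_content_charset() or fallback_charset
    return payload.decode(charset, errors="replace")


text = body_text(raw_article)
print(text)
```

Once the body is real Unicode text, whether the accents render depends only on the font, which matches the note above about needing a Unicode-enabled font.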

Regards
Rick
New User


Joined: 22 Jul 2007
Posts: 2

Posted: Mon Jul 23, 2007 10:46 pm    Post subject: Re: Portuguese Characters in Unicode (UTF-8) or Quoted-Print

Hi,
administrator wrote:

Rick wrote:
2 - If the post was made using the ISO-8859-1 ("Latin 1") charset with 8-bit encoding, but the *Subject* of the message uses only 7 bits rather than 8, then the Subject is displayed incorrectly.

Fixed for the next beta release.


That's great! Thanks a lot for your quick reply and for fixing that bug for the next beta release. :)

administrator wrote:

Cases #3 and #5 should be fixed in the next beta release due to changes in the handling of "Quoted-Printable" characters. It looks like the characters were being correctly decoded, but the decoded data was not getting transferred to the text buffer. A simple fix that was a bit hard to find.


That's also great. Thanks for the fix and information.

administrator wrote:

Case #4 is a Unicode UTF-8 encoding which is not supported by NMP. We have made a change for the next beta release to, at the very least, decode the UTF-8 encoding. This will allow the message body to be displayed correctly as long as a Unicode-enabled font is used.


Well, at least for me, being able to READ posts that use Unicode UTF-8 is enough (I don't need to WRITE posts in Unicode).

Thanks again for your very quick reply and for correcting these issues! :)

Best wishes,
Ricardo Dias Marques
ricmarques AT spamcop DOT net
All times are GMT
Copyright 2003-2006, Daniel Cumpian