Rick New User

Joined: 22 Jul 2007 Posts: 2
|
Posted: Sun Jul 22, 2007 11:45 pm Post subject: Portuguese Characters in Unicode (UTF-8) or Quoted-Printable |
|
|
Hi,
I'm a Portuguese user (from Portugal) of the Registered version of NewsMan Pro 2.7 (Build 2.7.0.2).
Congratulations to the developers for building a great software program!
Unfortunately, I'm having problems reading some posts in Portuguese newsgroups that contain accented Portuguese letters, such as á ("a acute"), ã ("a tilde"), ç ("c cedilla") and others (entered using Portuguese keyboards).
To sum up:
1 - If the post uses the ISO-8859-1 ("Latin 1") charset with 8-bit characters, all the characters appear as they should in NewsMan.
Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/71d43e73e5620ba6?dmode=source
Same Sample - Parsed version:
http://groups.google.com/group/misc.test/msg/71d43e73e5620ba6
Message-ID link:
<news:13a72huq4ifhe42@corp.supernews.com>
2 - If the post uses the ISO-8859-1 ("Latin 1") charset with 8-bit encoding, but the *Subject* of the message uses only 7-bit characters rather than 8-bit ones, then the Subject is displayed incorrectly.
Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/8a081d48d84bdfd7?dmode=source&hl=en
Same Sample - Parsed version:
http://groups.google.com/group/misc.test/msg/8a081d48d84bdfd7
Message-ID link:
<news:13a71purb0fvk92@corp.supernews.com>
The original subject was:
Test of Portuguese accented characters using ISO-8859-1 ("Latin 1") with 8-bit (ÀÁÂÃ Ç ÈÉÊ ÌÍÎ ÒÓÔÕ ÚÙÛ àáâã ç èéê ìíî òóôõ ùú)
In NewsMan it appears like this:
Test of Portuguese accented characters usi Test of Portuguese accented characters using ISO-8859-1 ("Latin 1") with 8-bit (ÀÁÂÃ Test of Portuguese acce...
(Note that the Subject breaks off after the letters "usi", from the word "using", and restarts from the beginning; it then stops again after "ÀÁÂÃ" and restarts from the beginning once more.)
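A break in the middle of a word is consistent with a Subject that was sent as several RFC 2047 "encoded-words" (7-bit text of the form =?charset?Q?...?=), although that is an assumption here, since the message source is only linked and not reproduced. Under that assumption, a minimal Python sketch of joining the pieces back together could look like this (the header value below is a made-up, shortened example, not the real one):

```python
from email.header import decode_header, make_header

# Hypothetical folded Subject header: one logical subject split across two
# RFC 2047 encoded-words (charset ISO-8859-1, "Q" encoding).
raw_subject = (
    "=?ISO-8859-1?Q?Test_of_Portuguese_accented_characters_usi?=\r\n"
    " =?ISO-8859-1?Q?ng_ISO-8859-1_(=C0=C1=C2=C3_=C7)?="
)


def decode_subject(raw: str) -> str:
    """Decode every encoded-word and join the results into one string."""
    # decode_header() drops the whitespace that separates adjacent
    # encoded-words; make_header() turns the decoded bytes into text
    # using the charset named in each word.
    return str(make_header(decode_header(raw)))


subject = decode_subject(raw_subject)
print(subject)
```

The important detail is that the whitespace between two adjacent encoded-words is not part of the subject, so "usi" and "ng" must be joined without a space.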
3 - If the post uses the ISO-8859-1 ("Latin 1") charset with 7-bit "Quoted-Printable" encoding (http://en.wikipedia.org/wiki/Quoted-printable), then the Quoted-Printable sequences are NOT converted back to the correct accented characters.
Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/5b7571177da2db6d?dmode=source
Same sample - Parsed version:
http://groups.google.com/group/misc.test/msg/5b7571177da2db6d
Message-ID link:
<news:13a73a2e3tgv803@corp.supernews.com>
In NewsMan, the post should appear like it does in the "Parsed" version in Google Groups, but instead it appears like it does in the "Message Source" version.
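To illustrate what "converted back" means for this case, here is a minimal Python sketch of decoding a Quoted-Printable body that uses the ISO-8859-1 charset. The sample bytes and the function name decode_qp_body are made up for illustration; they are not taken from the test post or from NewsMan.

```python
import quopri


def decode_qp_body(body: bytes, charset: str) -> str:
    """Undo Quoted-Printable, then interpret the bytes in the given charset."""
    raw_bytes = quopri.decodestring(body)  # "=E7" becomes the single byte 0xE7
    return raw_bytes.decode(charset)


# Hypothetical 7-bit body as it would appear in the "Message Source" view.
encoded = b"acentua=E7=E3o: =E1 =E3 =E7"
decoded = decode_qp_body(encoded, "iso-8859-1")
print(decoded)
```

A reader that skips the first step shows the "=E7"-style sequences literally, which matches what the "Message Source" view displays.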
4 - If the post uses Unicode in 8-bit "UTF-8" encoding, the accented characters also do not appear correctly:
Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.pt/group/misc.test/msg/c46a521fa2f5802b?dmode=source
Same sample - Parsed Version:
http://groups.google.pt/group/misc.test/msg/c46a521fa2f5802b
Message-ID link:
<news:13a743ktdd2u407@corp.supernews.com>
In NewsMan, the post should appear like it does in the "Parsed" version in Google Groups, but instead it appears like it does in the "Message Source" version.
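One plausible explanation (an assumption, not something confirmed for NewsMan) is that the UTF-8 bytes are being interpreted as a single-byte charset such as ISO-8859-1. In UTF-8 each of these accented letters takes two bytes, so every letter turns into two unrelated characters. A small Python sketch of the effect:

```python
text = "á ã ç"

# What goes over the wire in a UTF-8 post: two bytes per accented letter.
wire_bytes = text.encode("utf-8")

# Reading those bytes as Latin 1 maps each byte to its own character.
misread = wire_bytes.decode("iso-8859-1")

# Reading them with the charset the post declares restores the text.
correct = wire_bytes.decode("utf-8")

print(len(wire_bytes), misread, correct)
```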
5 - If the post uses Unicode ("UTF-8") but is encoded using "Quoted-Printable", then the accented characters also do not appear correctly (granted, this would probably be an unusual setup):
Sample - Message Source (posted in the misc.test newsgroup):
http://groups.google.com/group/misc.test/msg/db19c735fea29d97?dmode=source
Same sample - Parsed Version:
http://groups.google.pt/group/misc.test/msg/c46a521fa2f5802b
Message-ID link:
<news:13a73imicl06810@corp.supernews.com>
In NewsMan, the post should appear like it does in the "Parsed" version in Google Groups, but instead it appears like it does in the "Message Source" version.
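This case needs two decoding steps in sequence: first the Quoted-Printable transfer encoding, then the UTF-8 charset. As a rough Python sketch (the article below is a made-up, minimal example, and body_text is a hypothetical helper name), both steps can be driven by the headers of the post itself:

```python
from email import message_from_bytes

# Hypothetical minimal article: headers, blank line, Quoted-Printable body.
raw_article = (
    b"Content-Type: text/plain; charset=UTF-8\r\n"
    b"Content-Transfer-Encoding: quoted-printable\r\n"
    b"\r\n"
    b"acentua=C3=A7=C3=A3o: =C3=A1\r\n"
)


def body_text(raw: bytes) -> str:
    """Return the body as text, honouring both headers."""
    msg = message_from_bytes(raw)
    # Step 1: undo the Content-Transfer-Encoding (gives raw bytes).
    payload = msg.get_payload(decode=True)
    # Step 2: interpret those bytes using the declared charset.
    charset = msg.get_content_charset() or "us-ascii"
    return payload.decode(charset)


text = body_text(raw_article).strip()
print(text)
```

Taking the charset from the Content-Type header, instead of assuming Latin 1, is what lets the same code handle cases 3 and 5 alike.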
All these posts appear correctly to me when using Microsoft Outlook Express 6 or Forté Agent 4.2 (on the same laptop, running Windows XP Professional with Service Pack 2 and all critical updates applied).
In my NewsMan Pro configuration ("Settings"), I have the following:
"Fonts and Colors" > "Character Set"
<Default> option selected in the Dropdown
"Fonts and Colors" > "Message View"
I'm using the "Tahoma" font (but "Arial", "Courier New" and "Verdana" seem to give me the same results).
Is this a problem with my configuration, or are these bugs or features not yet developed?
Any help and/or information would be much appreciated!
Thanks in advance.
Best wishes,
Ricardo Dias Marques
ricmarques AT spamcop DOT net |
|
administrator Developer


Joined: 24 Jul 2004 Posts: 4750 Location: King William, VA
|
Posted: Mon Jul 23, 2007 2:29 am Post subject: Re: Portuguese Characters in Unicode (UTF-8) or Quoted-Print |
|
|
| Rick wrote: | Hi,
I'm a Portuguese user (from Portugal) of the Registered version of NewsMan Pro 2.7 (Build 2.7.0.2).
Congratulations to the developers for building a great software program!
Unfortunately, I'm having problems reading some posts in Portuguese newsgroups that contain accented Portuguese letters, such as á ("a acute"), ã ("a tilde"), ç ("c cedilla") and others (entered using Portuguese keyboards).
To sum up:
1 - If the post uses the ISO-8859-1 ("Latin 1") charset with 8-bit characters, all the characters appear as they should in NewsMan. |
Good.
| Rick wrote: | | 2 - If the post uses the ISO-8859-1 ("Latin 1") charset with 8-bit encoding, but the *Subject* of the message uses only 7-bit characters rather than 8-bit ones, then the Subject is displayed incorrectly. |
Fixed for the next beta release.
The remaining three test posts have been captured and we'll take a look at what the issue might be as soon as we can.
Best Regards |
|
administrator Developer


Joined: 24 Jul 2004 Posts: 4750 Location: King William, VA
|
Posted: Mon Jul 23, 2007 5:57 pm Post subject: |
|
|
Cases #3 and #5 should be fixed in the next beta release due to changes in the handling of "Quoted-Printable" characters. It looks like the characters were being correctly decoded, but the decoded data was not getting transferred to the text buffer. A simple fix that was a bit hard to find.
Regards |
|
administrator Developer


Joined: 24 Jul 2004 Posts: 4750 Location: King William, VA
|
Posted: Mon Jul 23, 2007 6:17 pm Post subject: |
|
|
Case #4 uses Unicode UTF-8 encoding, which NMP does not support. We have made a change for the next beta release to, at the very least, decode the UTF-8 encoding. This will allow the message body to be displayed correctly as long as a Unicode-enabled font is used.
Regards |
|
Rick New User

Joined: 22 Jul 2007 Posts: 2
|
Posted: Mon Jul 23, 2007 10:46 pm Post subject: Re: Portuguese Characters in Unicode (UTF-8) or Quoted-Print |
|
|
Hi,
| administrator wrote: |
| Rick wrote: | | 2 - If the post uses the ISO-8859-1 ("Latin 1") charset with 8-bit encoding, but the *Subject* of the message uses only 7-bit characters rather than 8-bit ones, then the Subject is displayed incorrectly. |
Fixed for the next beta release.
|
That's great! Thanks a lot for your quick reply and for fixing that bug for the next beta release.
| administrator wrote: |
Cases #3 and #5 should be fixed in the next beta release due to changes in the handling of "Quoted-Printable" characters. It looks like the characters were being correctly decoded, but the decoded data was not getting transferred to the text buffer. A simple fix that was a bit hard to find.
|
That's also great. Thanks for the fix and information.
| administrator wrote: |
Case #4 uses Unicode UTF-8 encoding, which NMP does not support. We have made a change for the next beta release to, at the very least, decode the UTF-8 encoding. This will allow the message body to be displayed correctly as long as a Unicode-enabled font is used.
|
Well, at least for me, being able to READ posts that use Unicode UTF-8 is enough (I don't need to WRITE posts in Unicode).
Thanks again for your very quick reply and for correcting these issues!
Best wishes,
Ricardo Dias Marques
ricmarques AT spamcop DOT net |
|