Converting RTF to HTML with Webdings-Symbols

General TRichView support forum. Please post your questions here
rbom
Posts: 4
Joined: Mon Nov 26, 2018 1:56 pm

Post by rbom » Mon Nov 26, 2018 2:29 pm

Hello *,

I'm having an issue while converting RTF files to HTML. The generated HTML file shows some unknown characters instead of a Webdings symbol.

Can you help me?

I have attached two files to show the problem. The first (Input.rtf) is the file I'm trying to convert, and the second (Output.txt) is the generated HTML file (renamed to a .txt file because I could not upload HTML files), which shows the unknown characters.

In addition, I've analysed the problem a little bit deeper and I found the function RVUnicodeToEncodedHTMLEx in the RVHtmlSave file. This function checks at the beginning if the given string is written in a symbol font.

If not: the text is encoded normally. (This path works for my Webdings symbols.)
Otherwise: the text is supposed to be converted into a known HTML symbol, but this does not work for my Webdings symbols.

Is there a way to avoid this behaviour? Or is there another way to convert my files?

Thank you for your help.
Attachments
Output.txt (1.62 KiB): rename it to Output.htm to get the normal file.
Input.rtf (501 Bytes)

Sergey Tkachenko
Site Admin
Posts: 14212
Joined: Sat Aug 27, 2005 10:28 am

Re: Converting RTF to HTML with Webdings-Symbols

Post by Sergey Tkachenko » Tue Nov 27, 2018 9:44 am

Yes, when saving to HTML, TRichView converts the characters of the most important symbol fonts (including "Wingdings") to the corresponding Unicode character.
But I cannot reproduce the problem. What version of TRichView do you use?
I opened your RTF file in the ActionTest demo and exported it as "HTML - Simplified". The result: https://www.trichview.com/support/forum ... gChar.html
You can see your character (browsers may display it as an emoji icon). I used the newest version of TRichView and Delphi 10.3.

But I can also see a header in your HTML file. Maybe the HTML was corrupted when this header was added? For example, the application that added it may not understand Unicode surrogate pairs; such a pair is used to represent this character.
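For readers unfamiliar with surrogate pairs, here is a minimal Python sketch of how a code point above U+FFFF is split into two UTF-16 code units. It uses U+1F3DE, the character that appears later in this thread; the function name `to_surrogate_pair` is made up for illustration and is not part of TRichView or Delphi.

```python
from typing import Tuple


def to_surrogate_pair(code_point: int) -> Tuple[int, int]:
    """Split a code point above U+FFFF into UTF-16 high and low surrogates."""
    if not 0x10000 <= code_point <= 0x10FFFF:
        raise ValueError("code point is not in a supplementary plane")
    offset = code_point - 0x10000
    high = 0xD800 + (offset >> 10)    # top 10 bits of the 20-bit offset
    low = 0xDC00 + (offset & 0x3FF)   # bottom 10 bits of the 20-bit offset
    return high, low


high, low = to_surrogate_pair(0x1F3DE)
print(f"U+1F3DE -> {high:04X} {low:04X}")  # prints "U+1F3DE -> D83C DFDE"
```

Software that treats each of the two code units as an independent character can split or mangle the pair, which would produce unknown characters in the output.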

Post by rbom » Tue Nov 27, 2018 3:01 pm

Hello,

I've created a small test project. This project cannot convert the symbols.

It was created with Delphi 2007, and I am using TRichView 17.5.2.

The project contains only one function, which does the conversion.

For testing: enter the path to an RTF file in the edit field and click the button. The converted HTML file will be placed next to the input file.
Attachments
RTF Testtool.zip (16.2 KiB)

Post by rbom » Wed Nov 28, 2018 2:08 pm

Hello,

I've analysed the problem a little bit further and I found out that the SaveOptions were different.

I used the rvsoUTF8 flag.

If I convert files without this flag, I get an NCR (numeric character reference) for the symbol. In this case the code is "🏞".

My problem is that I'm not able to use this code. We were expecting a "P" at the same position instead of this code.

Is there a way to prevent TRichView from converting the symbol to a Unicode character?
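As an illustration of what an NCR is, the following Python sketch replaces every non-ASCII character with a decimal numeric character reference. It only demonstrates the general HTML mechanism and is not TRichView's code; the helper name `to_html_ncr` is hypothetical, and whether TRichView writes decimal or hexadecimal references is not stated in this thread.

```python
def to_html_ncr(text: str) -> str:
    """Return text with every non-ASCII character replaced by a decimal NCR."""
    # The "xmlcharrefreplace" error handler emits &#NNNN; for each character
    # that cannot be represented in the target encoding (ASCII here).
    return text.encode("ascii", errors="xmlcharrefreplace").decode("ascii")


print(to_html_ncr("\U0001F3DE"))  # prints "&#127966;"
```

ASCII text such as "P" passes through unchanged, so only the converted symbol ends up as a reference.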

Post by Sergey Tkachenko » Wed Nov 28, 2018 6:10 pm

I found the problem. It is in Delphi's Utf8Encode function. In Delphi 6, 7 and 2007 it does not handle Unicode surrogate pairs correctly.
I'll fix this problem in the next update, I'll try to upload it tomorrow.
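The Python sketch below shows one way a UTF-8 encoder can mishandle surrogate pairs: encoding each UTF-16 code unit separately instead of combining the pair first. Whether the old Delphi Utf8Encode fails in exactly this way is an assumption; the post only says that pairs are not handled correctly. Both function names are hypothetical.

```python
def encode_utf8_per_unit(units):
    """Naive encoder: treats every UTF-16 code unit as a separate character."""
    # "surrogatepass" lets Python encode lone surrogates for demonstration.
    return b"".join(
        chr(u).encode("utf-8", errors="surrogatepass") for u in units
    )


def encode_utf8_pair_aware(units):
    """Combine each surrogate pair into one code point before encoding."""
    out = bytearray()
    i = 0
    while i < len(units):
        u = units[i]
        is_pair = (
            0xD800 <= u <= 0xDBFF
            and i + 1 < len(units)
            and 0xDC00 <= units[i + 1] <= 0xDFFF
        )
        if is_pair:
            cp = 0x10000 + ((u - 0xD800) << 10) + (units[i + 1] - 0xDC00)
            i += 2
        else:
            cp = u
            i += 1
        out += chr(cp).encode("utf-8", errors="surrogatepass")
    return bytes(out)


units = [0xD83C, 0xDFDE]  # UTF-16 code units of U+1F3DE
print(encode_utf8_per_unit(units).hex(" "))    # prints "ed a0 bc ed bf 9e"
print(encode_utf8_pair_aware(units).hex(" "))  # prints "f0 9f 8f 9e"
```

The six-byte result is not valid UTF-8, so a browser would show replacement or unknown characters, while the four-byte result decodes to the intended character.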

As for turning off the conversion of symbol characters to Unicode: no, it is hard-coded. And writing "P" there is not a good option.
Back in the days when I started to implement symbol-to-Unicode conversion, all browsers displayed it as "P". Today I tested it with Edge, Chrome and Firefox. Edge and Chrome display the home character, while Firefox displays "P".
The solution with a Unicode character is universal and is displayed correctly in all modern browsers.
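The idea of symbol-to-Unicode conversion can be sketched in Python as a per-font lookup table. This is not TRichView's implementation: the table and function names are hypothetical, and the single entry ("P" in Webdings mapped to U+1F3DE) is taken only from the characters shown in this thread, not from a verified Webdings table.

```python
# Hypothetical one-entry table; a real table would cover the whole font.
SYMBOL_FONT_MAPS = {
    "webdings": {"P": "\U0001F3DE"},
}


def convert_symbol_text(text: str, font_name: str) -> str:
    """Map characters of a known symbol font to Unicode; leave others as-is."""
    table = SYMBOL_FONT_MAPS.get(font_name.lower())
    if table is None:
        return text  # not a known symbol font: nothing to convert
    # Characters without a table entry are passed through unchanged.
    return "".join(table.get(ch, ch) for ch in text)
```

With a table like this, the output no longer depends on the browser honouring the Webdings font, which is the portability argument made in the post above.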

Post by rbom » Thu Nov 29, 2018 7:32 am

Cool, thank you for your quick help. :D

Post by Sergey Tkachenko » Fri Nov 30, 2018 7:04 pm

Fixed in TRichView 17.6 (just uploaded)
