[Example] Removing formatting

Sergey Tkachenko
Site Admin
Posts: 13636
Joined: Sat Aug 27, 2005 10:28 am

Post by Sergey Tkachenko » Wed Mar 12, 2008 3:16 pm

This example removes all formatting from the document by setting the text and paragraph style of every item to style 0 and (optionally) removing bullets and numbering. If KeepLinks=True, hyperlinks are kept: all text links are converted to the hypertext style with the lowest index.
Note 1: if the document contains both Unicode and non-Unicode text, or non-Unicode text of different Charsets, some text information may be lost.
Note 2: this is not an editing operation. When it is called for TRichViewEdit, the undo buffer must be cleared afterwards.

Code:

procedure RemoveFormatting(RVData: TCustomRVData; RemoveMarkers, KeepLinks: Boolean);
var i, HypertextStyleNo: Integer;
    UnicodeTo, UnicodeHTo: Boolean;
  {.............................................................}
  procedure DoRemoveFormatting(RVData: TCustomRVData);
  var i, r, c, StyleNo, StyleNoTo: Integer;
      PB, UnicodeFrom, ThisUnicodeTo: Boolean;
      table: TRVTableItemInfo;
  begin
    for i := RVData.ItemCount-1 downto 0 do begin
      RVData.GetItem(i).ParaNo := 0;
      StyleNo := RVData.GetItemStyle(i);
      case StyleNo of
        rvsTable:
          begin
            table := TRVTableItemInfo(RVData.GetItem(i));
            for r := 0 to table.RowCount-1 do
              for c := 0 to table.ColCount-1 do
                if table.Cells[r,c]<>nil then
                  DoRemoveFormatting(table.Cells[r,c].GetRVData);
          end;
        rvsListMarker:
          if RemoveMarkers then begin
            PB := RVData.PageBreaksBeforeItems[i];
            RVData.DeleteItems(i, 1);
            if i<RVData.ItemCount then begin
              RVData.GetItem(i).SameAsPrev := False;
              RVData.GetItem(i).PageBreakBefore := PB;
            end;
          end;
        rvsTab:
          if RVData.GetRVStyle.TextStyles[
            TRVTabItemInfo(RVData.GetItem(i)).TextStyleNo].Jump and KeepLinks then
            TRVTabItemInfo(RVData.GetItem(i)).TextStyleNo := HypertextStyleNo
          else begin
            TRVTabItemInfo(RVData.GetItem(i)).TextStyleNo := 0;
            RVData.SetItemTag(i, '');
          end;
        1..MaxInt: // text items with a non-zero text style
          begin
            if KeepLinks and RVData.GetRVStyle.TextStyles[StyleNo].Jump then begin
              // keep the link: switch to the first hypertext style
              ThisUnicodeTo := UnicodeHTo;
              StyleNoTo := HypertextStyleNo;
            end
            else begin
              // plain text: switch to style 0 and clear the tag
              ThisUnicodeTo := UnicodeTo;
              StyleNoTo := 0;
              RVData.SetItemTag(i, '');
            end;
            UnicodeFrom := RVData.GetRVStyle.TextStyles[StyleNo].Unicode;
            RVData.GetItem(i).StyleNo := StyleNoTo;
            // convert the item text if the old and new styles differ in Unicode mode
            // (the code page of text style 0 is used in both directions)
            if UnicodeFrom and not ThisUnicodeTo then begin
              RVData.GetItem(i).ItemOptions := RVData.GetItem(i).ItemOptions-[rvioUnicode];
              RVData.SetItemTextR(i,
                RVU_UnicodeToAnsi(RVData.GetStyleCodePage(0), RVData.GetItemTextR(i)));
            end
            else if not UnicodeFrom and ThisUnicodeTo then begin
              RVData.GetItem(i).ItemOptions := RVData.GetItem(i).ItemOptions+[rvioUnicode];
              RVData.SetItemTextR(i,
                RVU_AnsiToUnicode(RVData.GetStyleCodePage(0), RVData.GetItemTextR(i)));
            end;
          end;
      end;
    end;
  end;
  {.............................................................}
begin
  HypertextStyleNo := 0;
  if KeepLinks then
    for i := 0 to RVData.GetRVStyle.TextStyles.Count-1 do
      if RVData.GetRVStyle.TextStyles[i].Jump then begin
        HypertextStyleNo := i;
        break;
      end;
  UnicodeTo  := RVData.GetRVStyle.TextStyles[0].Unicode;
  UnicodeHTo := RVData.GetRVStyle.TextStyles[HypertextStyleNo].Unicode;
  DoRemoveFormatting(RVData);
end;

How to use:

Code:

procedure TForm3.ToolButton66Click(Sender: TObject);
begin
  RemoveFormatting(RichViewEdit1.RVData, True, True);
  NormalizeRichView(RichViewEdit1.RVData); // from RVNormalize.pas, from RichViewActions
  RichViewEdit1.DeleteUnusedStyles(True, True, True);
  RichViewEdit1.ClearUndo;
  RichViewEdit1.Format;
end;

Updates:
2008-Dec-11: for compatibility with TRichView 11 (using GetItemTextR and SetItemTextR instead of GetItemText and SetItemText)
2011-Oct-2: for compatibility with TRichView 13.4 (item tags became strings)
Last edited by Sergey Tkachenko on Sun Oct 02, 2011 6:39 pm, edited 2 times in total.

Post by Sergey Tkachenko » Thu Mar 13, 2008 5:34 pm

Update: the KeepLinks parameter has been added.

Klixx
Posts: 5
Joined: Tue Jul 27, 2010 9:28 am

Post by Klixx » Tue Jul 27, 2010 9:37 am

Hello,
I changed your example to remove only tables, since I need just the content of the cells, but with my RichView v10.1 it does not remove them.
The RTF text remains unmodified. My code is below:

Code:

procedure RemoveRTFTable(RVData: TCustomRVData);
  {.............................................................}
  procedure DoRemoveRTFTable(RVData: TCustomRVData);
  var
    i, r, c, StyleNo: Integer;
    table: TRVTableItemInfo;
  begin
    for i := RVData.ItemCount - 1 downto 0 do
    begin
      RVData.GetItem(i).ParaNo := 0;
      StyleNo := RVData.GetItemStyle(i);
      case StyleNo of
        rvsTable:
          begin
            table := TRVTableItemInfo(RVData.GetItem(i));
            for r := 0 to table.RowCount - 1 do
              for c := 0 to table.ColCount - 1 do
                if table.Cells[r, c] <> nil then
                  DoRemoveRTFTable(table.Cells[r, c].GetRVData);
          end;
      end;
    end;
  end;
  {.............................................................}
begin
  DoRemoveRTFTable(RVData);
end;
../..
  RemoveRTFTable(frm.RichView1.RVData);
  frm.RichView1.DeleteUnusedStyles(true, true, true);
  frm.RichView1.Format;

Where's the trap?
Thanks

Post by Sergey Tkachenko » Tue Jul 27, 2010 3:10 pm

Your procedure does not contain any code for deleting items (it must call the RVData.DeleteItems method somewhere).
I still do not understand what you want to delete.
If you want to delete all tables:

Code:

for i := RichView1.ItemCount-1 downto 0 do
  if RichView1.GetItemStyle(i)=rvsTable then
    RichView1.DeleteItems(i, 1);
RichView1.DeleteUnusedStyles(True, True, True);
RichView1.Format;
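
Both loops in this thread run from the last item down to the first. This matters when items are deleted inside the loop: deleting item i shifts only the items after it, and those have already been visited. A minimal standalone sketch of the same pattern, assuming a plain TStringList as a hypothetical stand-in for the document's item list (it does not use TRichView at all):

```pascal
program ReverseDeleteDemo;

{$IFDEF FPC}{$MODE DELPHI}{$ENDIF}
{$APPTYPE CONSOLE}

uses
  SysUtils, Classes;

var
  Items: TStringList; // hypothetical stand-in for a list of document items
  i: Integer;
begin
  Items := TStringList.Create;
  try
    // 'table' marks the items that should be deleted
    Items.Add('text');
    Items.Add('table');
    Items.Add('table');
    Items.Add('text');

    // Iterate from the end: deleting item i moves only the items
    // after i, and those have already been processed
    for i := Items.Count - 1 downto 0 do
      if Items[i] = 'table' then
        Items.Delete(i);

    // Print what is left
    for i := 0 to Items.Count - 1 do
      WriteLn(Items[i]);
  finally
    Items.Free;
  end;
end.
```

Run as a console program, this should leave only the two 'text' entries. A forward loop (0 to Count-1) is not safe here: the upper bound of a for loop is evaluated only once, so after a deletion the loop would skip the item that moved into the freed position and eventually index past the end of the list.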

Post by Klixx » Wed Jul 28, 2010 10:17 am

Hi,

That's exactly what I guessed: I tried your whole example, but it didn't remove the tables from my RTF file.
So, as I only want to convert table cells to paragraphs, I tried your other example (converting a table to text) without modifying it: http://www.trichview.com/forums/viewtopic.php?t=581
It works only partially: it removes just the first table row.
Thank you for your help.

An RTF example (sorry, it was done quick and dirty in Word, so it is heavy):

Code:

{\rtf1\adeflang1025\ansi\ansicpg1252\uc1\adeff0\deff0\stshfdbch0\stshfloch0\stshfhich0\stshfbi0\deflang1036\deflangfe1036{\fonttbl{\f0\froman\fcharset0\fprq2{\*\panose 02020603050405020304}Times New Roman;}{\f50\fswiss\fcharset0\fprq2{\*\panose 020b0604030504040204}Verdana;}{\f135\froman\fcharset238\fprq2 Times New Roman CE;}
{\f136\froman\fcharset204\fprq2 Times New Roman Cyr;}{\f138\froman\fcharset161\fprq2 Times New Roman Greek;}{\f139\froman\fcharset162\fprq2 Times New Roman Tur;}{\f140\fbidi \froman\fcharset177\fprq2 Times New Roman (Hebrew);}
{\f141\fbidi \froman\fcharset178\fprq2 Times New Roman (Arabic);}{\f142\froman\fcharset186\fprq2 Times New Roman Baltic;}{\f143\froman\fcharset163\fprq2 Times New Roman (Vietnamese);}{\f635\fswiss\fcharset238\fprq2 Verdana CE;}
{\f636\fswiss\fcharset204\fprq2 Verdana Cyr;}{\f638\fswiss\fcharset161\fprq2 Verdana Greek;}{\f639\fswiss\fcharset162\fprq2 Verdana Tur;}{\f642\fswiss\fcharset186\fprq2 Verdana Baltic;}{\f643\fswiss\fcharset163\fprq2 Verdana (Vietnamese);}}
{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;
\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;\red128\green128\blue128;\red192\green192\blue192;\red255\green255\blue255;}{\stylesheet{\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 
\af0\afs24\alang1025 \ltrch\fcs0 \fs24\lang1036\langfe1036\cgrid\langnp1036\langfenp1036 \snext0 Normal;}{\*\cs10 \additive \ssemihidden Default Paragraph Font;}{\*
\ts11\tsrowd\trftsWidthB3\trpaddl108\trpaddr108\trpaddfl3\trpaddft3\trpaddfb3\trpaddfr3\tblind0\tblindtype3\tscellwidthfts0\tsvertalt\tsbrdrt\tsbrdrl\tsbrdrb\tsbrdrr\tsbrdrdgl\tsbrdrdgr\tsbrdrh\tsbrdrv 
\ql \li0\ri0\widctlpar\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0\itap0 \rtlch\fcs1 \af0\afs20 \ltrch\fcs0 \fs20\lang1024\langfe1024\cgrid\langnp1024\langfenp1024 \snext11 \ssemihidden Normal Table;}}
{\*\latentstyles\lsdstimax156\lsdlockeddef0}{\*\rsidtbl \rsid2781679\rsid9505419\rsid9598588}{\*\generator Microsoft Word 11.0.0000;}{\info{\title Comment}{\operator jmblanchet}{\creatim\yr2010\mo7\dy28\hr12\min14}{\revtim\yr2010\mo7\dy28\hr12\min14}
{\version3}{\edmins25}{\nofpages1}{\nofwords3}{\nofchars21}{\nofcharsws23}{\vern24615}{\*\password 00000000}}{\*\xmlnstbl {\xmlns1 http://schemas.microsoft.com/office/word/2003/wordml}}
\paperw12240\paperh15840\margl1417\margr1417\margt1417\margb1417\gutter0\ltrsect 
\widowctrl\ftnbj\aenddoc\hyphhotz425\donotembedsysfont0\donotembedlingdata1\grfdocevents0\validatexml0\showplaceholdtext0\ignoremixedcontent0\saveinvalidxml0\showxmlerrors0\horzdoc\dghspace120\dgvspace120\dghorigin1701\dgvorigin1984\dghshow0\dgvshow3
\jcompress\viewkind1\viewscale144\viewzk2\rsidroot9598588 \fet0{\*\wgrffmtfilter 013f}\ilfomacatclnup0\ltrpar \sectd \ltrsect\linex0\sectdefaultcl\sftnbj {\*\pnseclvl1\pnucrm\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl2
\pnucltr\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl3\pndec\pnstart1\pnindent720\pnhang {\pntxta .}}{\*\pnseclvl4\pnlcltr\pnstart1\pnindent720\pnhang {\pntxta )}}{\*\pnseclvl5\pndec\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl6
\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl7\pnlcrm\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl8\pnlcltr\pnstart1\pnindent720\pnhang {\pntxtb (}{\pntxta )}}{\*\pnseclvl9\pnlcrm\pnstart1\pnindent720\pnhang 
{\pntxtb (}{\pntxta )}}\ltrrow\trowd \irow0\irowband0\ltrrow
\ts11\trgaph30\trleft0\trftsWidth1\trftsWidthB3\trftsWidthA3\trautofit1\trspdl15\trspdt15\trspdb15\trspdr15\trspdfl3\trspdft3\trspdfb3\trspdfr3\trpaddl15\trpaddt15\trpaddb15\trpaddr15\trpaddfl3\trpaddft3
\trpaddfb3\trpaddfr3\tblrsid9505419\tblind45\tblindtype3 \clvertalt\clbrdrt\brdrnone \clbrdrl\brdrnone \clbrdrb\brdrnone \clbrdrr\brdrnone \cltxlrtb\clftsWidth1\clshdrawnil \cellx1187\pard\plain \ltrpar
\ql \li0\ri0\nowidctlpar\intbl\wrapdefault\faauto\rin0\lin0 \rtlch\fcs1 \af0\afs24\alang1025 \ltrch\fcs0 \fs24\lang1036\langfe1036\cgrid\langnp1036\langfenp1036 {\rtlch\fcs1 \ab\af0 \ltrch\fcs0 \b\cf1\lang0\langfe1036\langnp0\insrsid9598588 First Row}{
\rtlch\fcs1 \ab\af0 \ltrch\fcs0 \b\cf1\lang0\langfe1036\langnp0\insrsid9505419 \cell }\pard \ltrpar\ql \li0\ri0\widctlpar\intbl\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0 {\rtlch\fcs1 \af0 \ltrch\fcs0 
\cf1\lang0\langfe1036\langnp0\insrsid9505419 \trowd \irow0\irowband0\ltrrow
\ts11\trgaph30\trleft0\trftsWidth1\trftsWidthB3\trftsWidthA3\trautofit1\trspdl15\trspdt15\trspdb15\trspdr15\trspdfl3\trspdft3\trspdfb3\trspdfr3\trpaddl15\trpaddt15\trpaddb15\trpaddr15\trpaddfl3\trpaddft3
\trpaddfb3\trpaddfr3\tblrsid9505419\tblind45\tblindtype3 \clvertalt\clbrdrt\brdrnone \clbrdrl\brdrnone \clbrdrb\brdrnone \clbrdrr\brdrnone \cltxlrtb\clftsWidth1\clshdrawnil \cellx1187\row \ltrrow}\pard \ltrpar
\ql \li0\ri0\nowidctlpar\intbl\wrapdefault\faauto\rin0\lin0 {\rtlch\fcs1 \ab\af50\afs16 \ltrch\fcs0 \b\f50\fs16\cf1\lang0\langfe1036\langnp0\insrsid9598588 Second Row}{\rtlch\fcs1 \af50\afs16 \ltrch\fcs0 
\f50\fs16\cf1\lang0\langfe1036\langnp0\insrsid9505419 \cell }\pard \ltrpar\ql \li0\ri0\widctlpar\intbl\wrapdefault\aspalpha\aspnum\faauto\adjustright\rin0\lin0 {\rtlch\fcs1 \af0 \ltrch\fcs0 \cf1\lang0\langfe1036\langnp0\insrsid9505419 
\trowd \irow1\irowband1\lastrow \ltrrow\ts11\trgaph30\trleft0\trftsWidth1\trftsWidthB3\trftsWidthA3\trautofit1\trspdl15\trspdt15\trspdb15\trspdr15\trspdfl3\trspdft3\trspdfb3\trspdfr3\trpaddl15\trpaddt15\trpaddb15\trpaddr15\trpaddfl3\trpaddft3
\trpaddfb3\trpaddfr3\tblrsid9505419\tblind45\tblindtype3 \clvertalt\clbrdrt\brdrnone \clbrdrl\brdrnone \clbrdrb\brdrnone \clbrdrr\brdrnone \cltxlrtb\clftsWidth1\clshdrawnil \cellx1187\row }\pard \ltrpar
\ql \li0\ri0\nowidctlpar\wrapdefault\faauto\rin0\lin0\itap0 {\rtlch\fcs1 \af0 \ltrch\fcs0 \cf1\lang0\langfe1036\langnp0\insrsid9505419 
\par }}

Post by Sergey Tkachenko » Wed Jul 28, 2010 11:30 am

I cannot reproduce the problem. In my tests, it works as expected: the table is removed, and its cells are inserted as normal paragraphs.
This is the result I got:

Code:

{\rtf1\ansi\ansicpg0\uc0\deff0\deflang0\deflangfe0{\fonttbl{\f0\fnil Arial;}{\f1\fnil\fcharset0 Times New Roman;}{\f2\fnil\fcharset0 Verdana;}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;\red128\green128\blue128;\red192\green192\blue192;}


\pard\fi0\li0\ql\ri0\sb0\sa0\itap0 \plain \f1\fs24    
\par \plain \f1\b\fs24\cf1 First Row\plain \f1\fs24  
\par \plain \f2\b\fs16\cf1 Second Row
\par \plain \f1\fs24\cf1 \par}

Post by Klixx » Wed Jul 28, 2010 11:38 am

I don't understand. My sample may be wrong. Here is the real data I have to "convert". Can you check that the cells are converted to paragraphs?
TY

Code:

{\rtf1\ansi\ansicpg0\uc1\deff0\deflang0\deflangfe0{\fonttbl{\f0\fnil Times New Roman;}{\f1\fnil Verdana;}}{\colortbl;\red0\green0\blue0;\red0\green0\blue255;\red0\green255\blue255;\red0\green255\blue0;\red255\green0\blue255;\red255\green0\blue0;\red255\green255\blue0;\red255\green255\blue255;\red0\green0\blue128;\red0\green128\blue128;\red0\green128\blue0;\red128\green0\blue128;\red128\green0\blue0;\red128\green128\blue0;\red128\green128\blue128;\red192\green192\blue192;\red0\green51\blue153;\red172\green168\blue153;\red255\green255\blue255;}

\uc1
\pard\fi0\li0\ql\ri0\sb0\sa0\itap0 \plain \f0\b\fs24\cf1 Commentaire
\par \plain \f0\fs24\cf1 
\par {\trowd\trgaph30\trleft0\itap1\trpaddl15\trpaddt15\trpaddr15\trpaddb15\trpaddfl3\trpaddft3\trpaddfr3\trpaddfb3\trspdl15\trspdr15\trspdfl3\trspdfr3\trspdt15\trspdft3\trspdb15\trspdfb3\trftsWidth1\richviewtbw0\clftsWidth1\richviewcbw0\richviewcbh0\cellx16065\pard\intbl\itap1{{
\pard\fi0\li0\ql\ri0\sb0\sa0\itap1\intbl \plain \f0\b\fs24\cf1 Note ATCD\cell}}\pard\intbl\itap1\row}{\trowd\trgaph30\trleft0\itap1\trpaddl15\trpaddt15\trpaddr15\trpaddb15\trpaddfl3\trpaddft3\trpaddfr3\trpaddfb3\trspdl15\trspdr15\trspdfl3\trspdfr3\trspdt15\trspdft3\trspdb15\trspdfb3\trftsWidth1\lastrow\richviewtbw0\clftsWidth1\richviewcbw0\richviewcbh0\cellx16065\pard\intbl\itap1{{
\pard\fi0\li0\ql\ri0\sb0\sa0\itap1\intbl \plain \f1\b\fs16\cf1 DNID et HTA -> 01/12/2099
\par \plain \f1\fs16\cf1 
\par 1990 : DNID
\par 1985 : HTA trait\'e9e par Tenormine
\par Dyslipid\'e9mie
\par Insuffisance veineuse
\par Hernie hiatale
\par 2000 : proth\'e8se genou gauche
\par 2007 : glaucome droit
\par 2007 : fracture f\'e9morale osteosynth\'e9s\'e9e
\par Extrasystoles auriculaires
\par 
\par \plain \f1\b\ul\fs16\cf1 Mode de vie
\par \plain \f1\fs16\cf1 M\'e9decin Traitant :\plain \f1\fs16\cf17  Bonfils
\par \plain \f1\fs16\cf1 Allergie m\'e9dicamenteuse :\plain \f1\fs16\cf17  aucune
\par \plain \f1\fs16\cf1 Tabac :\plain \f1\fs16\cf17  0
\par \plain \f0\fs24\cf1 
\par \plain \f1\b\ul\fs16\cf1 Facteurs de risque CV
\par \plain \f1\fs16\cf1 Nature :\plain \f1\fs16\cf17  HTA, DNID, Femme age>60 ans
\par \plain \f1\fs16\cf1 Nombre FR :\plain \f1\fs16\cf17  3
\par \plain \f0\fs24\cf1 
\par \plain \f1\b\ul\fs16\cf1 familiaux
\par \plain \f1\fs16\cf1 Fille : dcd d'un n\'e9o du sein \'e0 40 ans \cell}}\pard\intbl\itap1\row}\par}

Post by Sergey Tkachenko » Wed Jul 28, 2010 11:44 am

I see no problems with this file either.

ConvertTableToText converts only the current table (the one at the position of the caret).
Maybe you expect that all tables will be converted, even if the caret is not in a table?

Post by Klixx » Wed Jul 28, 2010 11:46 am

Yes! How can I achieve this?

I'm sorry, I'm not a TRichView pro.

Post by Sergey Tkachenko » Wed Jul 28, 2010 11:50 am

I can modify this procedure to convert all tables.
But I have some questions:
1) Should this be an editing operation (one that the user can undo) or not?
2) Do you need to convert nested tables as well?
3) Which version of the procedure do you need to modify? The first one adds line breaks between cells, the second one adds ';' (I guess the first, since you have multiline data in cells).

Post by Klixx » Wed Jul 28, 2010 1:50 pm

1) Should this be an editing operation (one that the user can undo) or not?
No, it does not need to be undoable.
2) Do you need to convert nested tables as well?
Yes, if possible.
3) Which version of the procedure do you need to modify? The first one adds line breaks between cells, the second one adds ';' (I guess the first, since you have multiline data in cells).
The first one.
Thank you so much.
BTW, I use the RvHTMLImporter component to obtain the data, which is why there are tables. Web devs love them! :)

