[Examples] Count of characters and words

Demos, code samples. Only questions related to the existing topics are allowed here.
Sergey Tkachenko
Site Admin
Posts: 13636
Joined: Sat Aug 27, 2005 10:28 am


Post by Sergey Tkachenko » Sun Aug 28, 2005 11:42 am

Calculating the number of characters

Code:

uses CRVData, RVTable;

// Returns the number of characters in RVData, including text inside table cells.
function GetCharCount(RVData: TCustomRVData): Integer;
var i,r,c: Integer;
    table: TRVTableItemInfo;
begin
  Result := 0;
  for i := 0 to RVData.Items.Count-1 do
    if RVData.GetItemStyle(i)>=0 then // this is a text item
      inc(Result, RVData.GetItemTextLength(i))
    else if RVData.GetItemStyle(i)=rvsTab then // a tab counts as one character
      inc(Result)
    else if RVData.GetItemStyle(i)=rvsTable then begin
      // recurse into every cell; cells hidden by merging are nil
      table := TRVTableItemInfo(RVData.GetItem(i));
      for r := 0 to table.Rows.Count-1 do
        for c := 0 to table.Rows[r].Count-1 do
          if table.Cells[r,c]<>nil then
            inc(Result, GetCharCount(table.Cells[r,c].GetRVData));
    end;
end;
Call:

Code:

r := GetCharCount(RichView.RVData);
This function does not count images, etc.

Updated 2017-02-15 for compatibility with TRichView 16.13. For older versions of TRichView, use ItemLength instead of GetItemTextLength.
Last edited by Sergey Tkachenko on Thu Jun 22, 2006 3:33 pm, edited 1 time in total.

Post by Sergey Tkachenko » Sun Aug 28, 2005 11:43 am

Calculating the number of words

You can use a class from
http://www.trichview.com/resources/spell/rvspell.zip
(Update: in new versions of TRichView, RVWordEnum.pas is included in the main component set.)

Only the unit RVWordEnum.pas is needed from that archive (the other files are specific to particular spell checkers).

Create a class

Code:

uses RVWordEnum;

type
  TWordCounter = class(TRVWordEnumerator)
  private
    FCounter: Integer;
  protected
    function ProcessWord: Boolean; override;
  public
    function GetWordCount(rve: TCustomRichViewEdit): Integer;
  end;

// Called by the enumerator once for every word it finds.
function TWordCounter.ProcessWord: Boolean;
begin
  inc(FCounter);
  Result := True; // continue enumeration
end;

function TWordCounter.GetWordCount(rve: TCustomRichViewEdit): Integer;
begin
  FCounter := 0;
  Run(rve, rvesFromStart);
  Result := FCounter;
end;
(This function treats a word written in two different fonts as two words.)

Call:

Code:

var wcnt: TWordCounter;

wcnt := TWordCounter.Create;
try
  r := wcnt.GetWordCount(RichViewEdit1);
finally
  wcnt.Free; // release the counter even if an exception is raised
end;

christopher00
Posts: 2
Joined: Wed Jun 21, 2006 8:58 pm

Re: [Examples] Count of characters and words

Post by christopher00 » Wed Jun 21, 2006 9:13 pm

Sergey Tkachenko wrote:Calculating a number of characters

Code:

uses CRVData;

function GetCharCount(RVData: TCustomRVData): Integer;
var i,r,c: Integer;
     table: TRVTableItemInfo;
begin
Result := 0;
for i := 0 to RVData.Items.Count-1 do
  if RVData.GetItemStyle(i)>=0 then // this is a text item
    inc(Result, RVData.ItemLength(i))
  else if RVData.GetItemStyle(i)=rvsTable then begin
    table := TRVTableItemInfo(RVData.GetItem(i));
    for r := 0 to table.Rows.Count-1 do
      for c := 0 to table.Rows[r].Count-1 do
        if table.Cells[r,c]<>nil then
          inc(Result, GetCharCount(table.Cells[r,c].GetRVData));
  end;
end;
Call:

Code:

r := GetCharCount(RichView.RVData);
This function does not count images, etc.
Will this function count the characters in a TRichViewEdit? The word count function looks like it would.

Post by Sergey Tkachenko » Thu Jun 22, 2006 3:34 pm

Yes, it can be used for TRichViewEdit too.

PS: I just modified it to take tab characters into account. Line breaks are still not counted.

beboyle
Posts: 1
Joined: Sun Oct 23, 2005 7:46 pm

Calculating Words and Characters

Post by beboyle » Sun Jul 30, 2006 7:43 pm

This pair of functions counts the number of words and characters in a TRichViewEdit control. The WordCount function is just a generic routine to return word count from any text. The second is fired when the text changes. It loops through the text items in the RVE and passes each to WordCount and totals up the results.

Code:

// Function to provide word count from memo or text field
function WordCount(t: String): LongInt;
var
  ws: Boolean;
  wc: Integer;
  i: Integer;
begin
  ws := True; // In whitespace
  wc := 0;
  for i := 1 to Length(t) do // Delphi strings are indexed from 1
  begin
    if t[i] in [' ', #9, #10, #13] then
      ws := True
    else if ws then
    begin
      ws := False;
      Inc(wc);
    end;
  end;
  Result := wc;
end;

// Change in text - count words and characters.
procedure Tform1.RVEChange(Sender: TObject);
var
  wc, cc: LongInt;
  I: Integer;
begin
  cc := 0;
  wc := 0;
  with RVE do
  begin
    for I := 0 to RVData.Items.Count - 1 do
      if RVData.GetItemStyle(I) >= 0 then
      begin
        inc (cc, RVData.GetItemTextLength(I));
        inc (wc, WordCount(RVData.GetItemTextA(I)));
      end;
  end;
  lbCount.Caption := IntToStr(cc) + ' characters, ' + IntToStr(wc) + ' words.';
end;
Updated 2017-02-15 for compatibility with TRichView 16.13. For older versions of TRichView, use ItemLength instead of GetItemTextLength

Post by Sergey Tkachenko » Sat Jul 21, 2007 6:08 pm

Calculating the number of words in the selection.

Parsers for spellcheckers v1.9.1 (2007-Jul-21) are required:
http://www.trichview.com/resources/spell/rvspell.zip

(Update: this parser (RVWordEnum unit) is now included in the main set of TRichView units)

Code:

uses RVWordEnum, RVTable, CRVFData;

type
  TSelWordCounter = class(TRVWordEnumerator)
  private
    FCounter: Integer;
    FSelRVData: TCustomRVFormattedData;
    FSelStartItemNo, FSelStartOffs, FSelEndItemNo, FSelEndOffs: Integer;
  protected
    function ProcessWord: Boolean; override;
  public
    function GetWordCount(rve: TCustomRichViewEdit): Integer;
  end;

function TSelWordCounter.ProcessWord: Boolean;
begin
  Result :=
    not
    (
     (FRVData=FSelRVData) and
     ((FItemNo>FSelEndItemNo) or ((FItemNo=FSelEndItemNo) and (FStartOffs>=FSelEndOffs)))
    );
  if Result then
    inc(FCounter);
end;

function TSelWordCounter.GetWordCount(rve: TCustomRichViewEdit): Integer;
var table: TRVTableItemInfo;
    r, c: Integer;
begin
  Result   := 0;
  FCounter := 0;
  rve := rve.TopLevelEditor;
  if rve.RVData.PartialSelectedItem<>nil then begin
    // multicell selection
    FSelRVData := nil;
    table := rve.RVData.PartialSelectedItem as TRVTableItemInfo;
    for r := 0 to table.RowCount-1 do
      for c := 0 to table.ColCount-1 do
        if (table.Cells[r,c]<>nil) and table.IsCellSelected(r, c) then
          RunRVData(rve, table.Cells[r,c], 0, table.Cells[r,c].GetOffsBeforeItem(0));
    end
  else begin
    if not rve.SelectionExists then
      exit;
    // normal selection
    FSelRVData := rve.RVData;
    FSelRVData.GetSelectionBoundsEx(FSelStartItemNo, FSelStartOffs,
      FSelEndItemNo, FSelEndOffs, True);
    RunRVData(rve, rve.RVData, FSelStartItemNo, FSelStartOffs);
  end;
  Result := FCounter;
end;

sr1111
Posts: 27
Joined: Thu Jan 31, 2008 12:18 pm

Post by sr1111 » Sun Jan 18, 2009 3:54 am

I used this function in Delphi 2009 and got this warning:

[DCC Warning] Unit3.pas(777): W1050 WideChar reduced to byte char in set expressions. Consider using 'CharInSet' function in 'SysUtils' unit.

Can you fix this function?

function WordCount(t: String): LongInt;
var
  ws: Boolean;
  wc: Integer;
  i: Integer;
begin
  ws := True; // In whitespace
  wc := 0;
  for i := 1 to Length(t) do // Delphi strings are indexed from 1
  begin
    if t[i] in [' ', #9, #10, #13] then
      ws := True
    else if ws then
    begin
      ws := False;
      Inc(wc);
    end;
  end;
  Result := wc;
end;
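
One possible way to avoid the W1050 warning is to follow the compiler's own hint and replace the set test with CharInSet from SysUtils. The version below is only a sketch and assumes Delphi 2009 or later, where CharInSet is available. The name InWhitespace is just an illustrative rename of ws, and the loop runs from 1 to Length(t) because Delphi strings are indexed from 1.

```pascal
uses SysUtils;

// Counts whitespace-separated words in t.
// CharInSet replaces the "t[i] in [...]" test that triggers W1050
// when String is a Unicode string.
function WordCount(const t: String): LongInt;
var
  InWhitespace: Boolean;
  i: Integer;
begin
  Result := 0;
  InWhitespace := True; // treat the start of the text as whitespace
  for i := 1 to Length(t) do
    if CharInSet(t[i], [' ', #9, #10, #13]) then
      InWhitespace := True
    else if InWhitespace then
    begin
      InWhitespace := False; // first character of a new word
      Inc(Result);
    end;
end;
```

The counting logic is otherwise the same as in the function above: a word is counted whenever a non-whitespace character follows whitespace or the start of the text.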

chmichael
Posts: 12
Joined: Sat Aug 27, 2005 4:53 pm
Location: Greece

Post by chmichael » Sun Jan 18, 2009 3:35 pm

Is it possible to save both counts in the RVF file for performance reasons?

Post by Sergey Tkachenko » Sun Jan 18, 2009 6:20 pm

You can save any additional information in DocProperties.
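
A rough sketch of how that might look, not a tested implementation. It assumes DocProperties is a string list of name=value pairs, that GetCharCount and TWordCounter are the routines shown earlier in this thread, and that saving and loading of document properties is enabled in RVFOptions (as far as I recall, the rvfoSaveDocProperties and rvfoLoadDocProperties options; please check the TRichView help). The key names 'CharCount' and 'WordCount' and both routine names are made up for this example.

```pascal
uses SysUtils, RVEdit, RVWordEnum;

// Hypothetical helper: recalculates both counts and stores them
// in the document properties before the document is saved.
procedure StoreCounts(rve: TCustomRichViewEdit);
var wcnt: TWordCounter;
begin
  wcnt := TWordCounter.Create;
  try
    rve.DocProperties.Values['CharCount'] := IntToStr(GetCharCount(rve.RVData));
    rve.DocProperties.Values['WordCount'] := IntToStr(wcnt.GetWordCount(rve));
  finally
    wcnt.Free;
  end;
end;

// Hypothetical helper: reads the stored word count after loading.
// Returns -1 if the document does not contain this property.
function LoadWordCount(rve: TCustomRichViewEdit): Integer;
begin
  Result := StrToIntDef(rve.DocProperties.Values['WordCount'], -1);
end;
```

The stored values are only as fresh as the last call to StoreCounts, so it would need to be called right before saving.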

Post by chmichael » Sun Jan 18, 2009 7:40 pm

Thanks!

Petko
Posts: 166
Joined: Tue Sep 06, 2005 12:42 pm

Post by Petko » Wed Mar 10, 2010 4:09 pm

Is it possible to count lines and paragraphs too? If yes, do you think it would be too time-consuming for large documents (compared to the character and word counts)?

Post by Sergey Tkachenko » Wed Mar 10, 2010 8:34 pm

No, it will be fast.
Count of lines - do you mean lines that depend on word wrapping?

How do you want to calculate the count of paragraphs and lines for tables?

Post by Petko » Thu Mar 11, 2010 9:47 am

I need something similar to Word's word count. I've made a test and it seems to work like this:

* Lines - they depend on word wrapping and each table cell is counted as a line, if the cell is empty
* Paragraphs - it seems that only items, containing text are counted as paragraphs (empty table cells are not)

You can see my test file here:
http://www.box.net/shared/v9ximbdj0f

Anyway, absolute accuracy is not necessary, speed is priority here. People using a word count function need accuracy in the word and character counts more.

Post by Sergey Tkachenko » Thu Mar 11, 2010 7:14 pm

I created functions calculating these values in the most natural way for TRichView. It may be different from Word.

Paragraph count

Code:

uses CRVData, RVTable;

function GetParagraphCount(RVData: TCustomRVData): Integer;
var i,r,c: Integer;
  table: TRVTableItemInfo;
begin
  Result := 0;
  for i := 0 to RVData.ItemCount-1 do begin
    if RVData.IsParaStart(i) then
      inc(Result);
    if RVData.GetItemStyle(i)=rvsTable then begin
      table := TRVTableItemInfo(RVData.GetItem(i));
      for r := 0 to table.RowCount-1 do
        for c := 0 to table.ColCount-1 do
          if table.Cells[r,c]<>nil then
            inc(Result, GetParagraphCount(table.Cells[r,c].GetRVData));
    end;
  end;
end;
Use: Count := GetParagraphCount(RichViewEdit1.RVData);
In the show-special-characters mode, you can see pilcrow characters at the end of paragraphs. GetParagraphCount returns the count of these characters. Note that they are displayed in places where MS Word does not display them:
- at the end of table cells
- at the end of tables.
Last edited by Sergey Tkachenko on Thu Mar 11, 2010 7:24 pm, edited 1 time in total.

Post by Sergey Tkachenko » Thu Mar 11, 2010 7:19 pm

Line count

Code:

uses CRVFData, RVTable, DLines;

function GetLineCount(RVData: TCustomRVFormattedData): Integer;
var i,r,c: Integer;
  table: TRVTableItemInfo;
begin
  Result := 0;
  for i := 0 to RVData.DrawItems.Count-1 do begin
    if RVData.DrawItems[i].FromNewLine then
      inc(Result);
    if RVData.GetItemStyle(RVData.DrawItems[i].ItemNo)=rvsTable then begin
      table := TRVTableItemInfo(RVData.GetItem(RVData.DrawItems[i].ItemNo));
      for r := 0 to table.RowCount-1 do
        for c := 0 to table.ColCount-1 do
          if table.Cells[r,c]<>nil then
            inc(Result, GetLineCount(TCustomRVFormattedData(table.Cells[r,c].GetRVData)));
    end;
  end;
end;
Use: Count := GetLineCount(RichViewEdit1.RVData);

This code uses some undocumented methods.
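
For illustration, the functions from this thread could be combined into one summary string, roughly like this. It is a sketch under assumptions: GetCountsText is a made-up name, GetCharCount, GetParagraphCount and GetLineCount are the functions posted above, and the editor is assumed to be formatted already, because GetLineCount reads DrawItems.

```pascal
uses SysUtils, RVEdit;

// Hypothetical helper: builds a summary string from the counting
// functions shown in this thread. rve must already be formatted,
// otherwise the line count is not meaningful.
function GetCountsText(rve: TCustomRichViewEdit): String;
begin
  Result := Format('%d characters, %d paragraphs, %d lines',
    [GetCharCount(rve.RVData),
     GetParagraphCount(rve.RVData),
     GetLineCount(rve.RVData)]);
end;
```

Use: lbCount.Caption := GetCountsText(RichViewEdit1);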
