Deadlock with Synchronize in TRVWordEnumThread (XE6)

General TRichView support forum. Please post your questions here
anovikov
Posts: 5
Joined: Fri Oct 03, 2014 2:33 am

Deadlock with Synchronize in TRVWordEnumThread (XE6)

Post by anovikov » Fri Oct 03, 2014 2:52 am

1. Problem background

Here is the new method implementation of

procedure TRVWordEnumThread.Finish;
{$IFNDEF RICHVIEWDEF6}
var Msg: TMsg;
{$ENDIF}
begin
if StopWorking=3 then
exit;
StopWorking := 1;
Priority := tpNormal;
while (StopWorking=1) and not Suspended do
{$IFDEF RICHVIEWDEF6}CheckSynchronize;{$ELSE}
if PeekMessage(Msg, 0, $8FFF, $8FFF, PM_REMOVE) then
TranslateMessage(Msg);
{$ENDIF};
if Suspended then begin
Terminate;
Resume;
end
else
Terminate;
end;


It is supposed to wait until the TRVWordEnumThread.Execute finishes gracefully and enters the Suspended state. To do that, the TRVWordEnumThread.Execute calls TRVWordEnumThread.SyncProc via Synchronize() upon exitting.

This mechanism does not work under scenarios leading to freezing the application.

2. Here is the problem description.

The TRVWordEnumThread.Execute attempts to call Synchronize(SynProc) and enters waiting state till message queue is empty and this call can be completed in the MainThread. However, if the MainThread is already executing the TRVWordEnumThread.Finish; there is a deadlock as the scheduled Synchronize cannot be completed (CheckSynchronize does NOT trigger waking of the Synchronize!) and so the application enters the eternal loop, as the MainThread waits for CheckSynchronize to return, and the TRVWordEnumThread waits for Synchronize to return.

3. How to reproduce

a. Place a RichViewEdit on a form with a lot of other controls (to allow for some time creating the form).
b. Create this form in run-time and immediately click on the RichViewEdit control
c. The application will freeze as on Focus, TRVWordEnumThread.Finish will be called, but the TRVWordEnumThread.Execute is not completed yet.

4. How to hot-fix

a. Replace CheckSynchronize in the TRVWordEnumThread.Finish method with Sleep(1)
b. Comment out Synchronize(SyncProc) to avoid running dummy SyncProc altogether and hence the deadlock

5. Patch required

Sergei, please provide a more robust synchronization solution to this problem as this hot fix is still not 100% clean (ideally you want this status request/response signalling done via WIN API Events). I can provide further detail in the private messages how we fixed this problem with other similar situations. We have encountered that this is a design bug with Delphi and we are NOT using Synchronize() / CheckSynchronize() / ProcessMessages for Main Thread synchronization.

anovikov
Posts: 5
Joined: Fri Oct 03, 2014 2:33 am

Re: Deadlock with Synchronize in TRVWordEnumThread (XE6)

Post by anovikov » Fri Oct 03, 2014 3:19 am

anovikov wrote: 5. Patch required

Sergei, please provide a more robust synchronization solution to this problem as this hot fix is still not 100% clean (ideally you want this status request/response signalling done via WIN API Events). I can provide further detail in the private messages how we fixed this problem with other similar situations. We have encountered that this is a design bug with Delphi and we are NOT using Synchronize() / CheckSynchronize() / ProcessMessages for Main Thread synchronization.
Additionally, you might want to use InterlockedExchange() when changing the value of the TRVWordEnumThread.StopWorking. Delphi does not support volatile declarations (yet); so using the InterlockedExchange() will help avoid read/write reordering on the CPU level. Also, your existing code will stop working if the compiler starts caching the memory value of the StopWorking into a local variable. You might think about rewriting this synchronization code altogether as this is not a good synchronization practice subject to race conditions on compiler and CPU levels.

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Fri Oct 03, 2014 10:19 am

CheckSynchronize does call thread's Synchronize method (actually, synchronized methods are always called from CheckSynchronize).
So if the main thread calls CheckSynchronize periodically, a deadlock with a thread calling Synchronize is not possible.

This is not very gracious method, because this cycle eats CPU time. If there are some other actively working threads (created by other components), it might be a problem, but not a deadlock, since it will finish soon or later.

Sorry, your fix may lead to a deadlock, because waiting without calling CheckSyncronize will lead to deadlock when a thread calls Synchronize(AddMisspelling).

Please send me stacks of threads when your application freezes to richviewgmailcom

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Fri Oct 03, 2014 10:26 am

As for InterlockedExchange, I do not think it is necessary in case of StopWorking.
A variable assignment is already an atomic operation.
InterlockedExchange is useful only when we need both assign a new value and use the previous value.

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Fri Oct 03, 2014 10:30 am

Can you test simply by adding sleep after checksynchronize:

change

Code: Select all

CheckSynchronize
to

Code: Select all

begin CheckSynchronize; Sleep(1); end
without removing Synchronize(SyncProc)?

It will reduce CPU usage on waiting.

anovikov
Posts: 5
Joined: Fri Oct 03, 2014 2:33 am

Post by anovikov » Fri Oct 03, 2014 6:23 pm

Sergey, thank you for quick but unfortunately rushed reply. Please read below the responses to your comments.

1. “CheckSynchronize does call thread's Synchronize method (actually, synchronized methods are always called from CheckSynchronize).
So if the main thread calls CheckSynchronize periodically, a deadlock with a thread calling Synchronize is not possible.”

Unfortunately, this is not true. As I mentioned before, there is an architectural flaw in the Delphi design, which leads to the deadlock under certain scenario. Please take your time and reproduce it as per my instructions. It happens universally (i.e. it’s not your code, it’s the Delphi internals). You should NOT use this as is.

2. “Sorry, your fix may lead to a deadlock, because waiting without calling CheckSyncronize will lead to deadlock when a thread calls Synchronize(AddMisspelling).”

Unfortunately, call Synchronize(AddMisspelling) will also deadlock under the same scenario. I agree that neither scenario is a robust solution.

3. “As for InterlockedExchange, I do not think it is necessary in case of StopWorking.
A variable assignment is already an atomic operation.
InterlockedExchange is useful only when we need both assign a new value and use the previous value.”

No, you are unfortunately not correct on 3 accounts.

A. It will be atomic only under certain alignment, the size of a defined type, and the behavior of the underlying hardware. So in general case your statement is NOT true. You are lucky that most of the users fall into the scenario when this is true. So you MUST explicitly make atomic operations either by calling special CPU instructions, or using CRITICAL SECTIONS (or mutexes on Linux) if you don’t want to deal with hardware. In your case, to make it clean, I would use Critical Sections to read/write to this value to guarantee atomic operations.

B. On top of it, InterlockedExchange helps access the variable implicitly as volatile. There is no volatile concept in Delphi which is a huge problem for multithreaded programming. For example, what if the optimizing complier decides to cache your StopWorking value in a arithmetic register? So you will never see that the value changed in the memory by the other thread. You are lucky Delphi compiler is bad; your code would never work in C++ without a volatile declaration (at least). The moment the Delphi compiler starts doing some basic optimizations, your code will need rewriting. I suggest you do it clean from the get-go.

C. Also, InterlockedExchange introduces a full memory barrier that guarantees read/write ordering, besides doing atomic changes. You should read about memory barriers as this concept is extremely important in multithreaded programming, as the CPU may read/write memory in a different order compared to your source code, giving rise to plenty of glitches.

I will be happy to provide you with further assistance and will send you the screenshots as per your request.

BTW, thank you for your excellent product! I just wanted to help you make it better.

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Sat Oct 04, 2014 8:51 am

Thank you for InterlockedExchange explanation, I will change the code to use it.

As for potential deadlock solving. Unfortunately, as far as I understand, a simple patching is not possible, if we need to remove Synchronize (and replace it to events and PostMessage), it's a lot of work, close to the full rewriting.

So, I need to understand why CheckSynchronize does not work.

In your test, you suggest to click to call thread.Finish. But clicking does not call Finish, Finish is called from ClearLiveSpellingResults (which is normally called when the editor is cleared)

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Sat Oct 04, 2014 9:13 am

By the way, if the compiler decides to cache StopWorking, can it pass a cached value to InterlockedExchange? It's not volatile, after all.
Last edited by Sergey Tkachenko on Sun Oct 05, 2014 6:21 am, edited 1 time in total.

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Sat Oct 04, 2014 9:33 am

There is one case of deadlock I can imagine.
When CheckSynchronize is called in a cycle, it calls a synchronized method of another thread (not a live spelling thread, but another one), and this method hangs.

So, do you use thirdparty components, or may be your own code uses Synchronize?

anovikov
Posts: 5
Joined: Fri Oct 03, 2014 2:33 am

Post by anovikov » Sun Oct 05, 2014 6:47 pm

Sergey Tkachenko wrote: So, do you use thirdparty components, or may be your own code uses Synchronize?
I have emailed you the RVThread implementation with the atomic, volatile read/write to thread status.

I also added the unit that fixes the synchronization deadlock universally. I inherited your RVThread from TSynchronizedThread and replaced CheckSynchronize by SyncOrAbort implemented in the TSynchronizedThread unit. Please note that TSynchronizedThread has its own Synchronize() method implementation so it uses a different mechanism to synchronize with the Main Thread that helps avoid the deadlock.

BTW I did not have time to fully test this new unit in your case (it just works for now and I have other things to do), but you might want to run different tests on to ensure this mechanism solves this problem without creating new issues.

You can go ahead and add this to your library if it works for you in the end as the fix.

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Mon Oct 06, 2014 3:25 pm

I answered by email.

x_ginger
Posts: 12
Joined: Fri May 07, 2010 9:22 am

Post by x_ginger » Wed Dec 17, 2014 4:12 pm

We have the same problem. We use Synchronize in our another code, and when I call Finish to stop live spelling, sometimes I fall into CheckSynchronize loop and have no exit.
If you have any solution I'd like to see it. Thanks.

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Wed Dec 17, 2014 4:35 pm

Does your procedure called via Synchronize contain some waiting code?

x_ginger
Posts: 12
Joined: Fri May 07, 2010 9:22 am

Post by x_ginger » Fri Dec 19, 2014 3:16 pm

One of threads calls Synchronize, and another calls just Sleep. And sometimes they are blocked by the live spelling thread. Before calling Finish for the live spelling thread it was OK

Sergey Tkachenko
Site Admin
Posts: 14755
Joined: Sat Aug 27, 2005 10:28 am
Contact:

Post by Sergey Tkachenko » Fri Dec 19, 2014 9:28 pm

It's hard to tell why it happens without knowing what the tread does in the procedure called from Syncronize. Can you send me more details to richviewgmailcom?

Post Reply