본문 바로가기

Robotics/Software Tech.

Synchronization in Multithreaded Applications with MFC

Screenshot - arsynmtmfc.gif

Introduction

This article discusses the basic synchronization concepts and practices that are supposed to be useful for beginners to do multithreaded programming. By saying beginner, I don't mean those that are beginners in learning C++ language, but the people that are somewhat new in multithreaded programming. The main concentration of this article is on synchronization techniques. Thus this article is like a tutorial on synchronization.

The General View

During their execution, threads, more or less, are interoperating with each other. This interoperation may have various forms and may be of various kinds. For example, a thread, after performing the task it is assigned to, informs another thread about it. Then the second thread whose job is a logical continuation of the first thread starts operating.

All the forms of interoperations might be described by the term synchronization which can be supported in several ways. Most usable ones are those whose primary aim is to support synchronization itself. The following objects are intended to support the synchronization (this is not a complete list):

  • Semaphores
  • Mutexes
  • Critical Sections
  • Events

Each of these objects has a different special purpose and usage but the general purpose is to support synchronization. I will introduce them to you through this article later. There are other objects that can be used as synchronization mediums such as Process and Thread objects. Their usage enables a programmer to decide, for example, if a given process or thread has finished its execution or not.

To use the Process and Thread objects for synchronization purposes, we are supposed to use wait-functions. Before getting to learn these functions, you should learn a key concept, that is, any kernel object that can be used as a synchronization object can be in one of the two states; signaled state and nonsignaled state. Except for critical sections, all synchronization objects can be in either of these two states. For example, for Process and Thread objects, the nonsignaled state is encountered when they start their execution and the signaled state is encountered when they finish their execution. To decide whether a given process or thread has finished, we should find out whether their representative objects are in signaled state; to do that, we should turn to the wait-functions.

Wait-functions

The following function is the simplest wait-function amongst the other wait-functions. It has the following declaration format:

DWORD WaitForSingleObject
(
  HANDLE hHandle,
  DWORD dwMilliseconds
);

The parameter hHandle takes the descriptor of an object whose signaled or nonsignaled state is going to be examined. The parameter dwMilliseconds takes the time that the calling thread should wait until the examining object enters the signaled state. As soon as the object is signaled or the given time interval expires, the function returns the control to the caller thread. If dwMilliseconds takes INIFINITE value (-1), the thread will wait until the object becomes signaled. If it doesn't become signaled, the thread will wait forever.

For example, the following call checks whether a process [identified by hProcess descriptor] is in execution or not:

DWORD dw = WaitForSingleObject(hProcess, 0);
switch (dw)
{
   case WAIT_OBJECT_0:
      // the process has exited
      break;

   case WAIT_TIMEOUT:
      // the process is still executing
      break;

   case WAIT_FAILED:
      // failure
      break;
}

As you notice, we passed 0 to the function's dwMilliseconds parameter in which case the function instantly checks the object's state [signaled or nonsignaled] and immediately returns the control. If the object is signaled, the function returns WAIT_OBJECT_0. If it is nonsignaled - WAIT_TIMEOUT is returned. In case of failure, WAIT_FAILED is returned (a failure may occur when an invalid descriptor is passed to the function).

Next wait-function is similar to the previous one except that it takes a list of descriptors and waits until either one of them or all of them become signaled:

DWORD WaitForMultipleObjects
(
  DWORD nCount,
  CONST HANDLE *lpHandles,
  BOOL fWaitAll,
  DWORD dwMilliseconds
);

The parameter nCount takes the number of descriptors to be examined. The parameter lpHandles should point an array of descriptors. If the parameter fWaitAll is TRUE, the function will wait until all the objects become signaled. If it is FALSE, the function returns even if a single object becomes signaled [no matter what the others are]. dwMilliseconds is the same as in the previous function.

For example, the following code decides which process will exit first from the list of given HANDLEs:

HANDLE h[3];
h[0] = hThread1;
h[1] = hThread2;
h[2] = hThread3;

DWORD dw = WaitForMultipleObjects(3, h, FALSE, 5000);
switch (dw)
{
   case WAIT_FAILED:
      // failure
      break;

   case WAIT_TIMEOUT:
      // no processes exited during 5000ms
      break;

   case WAIT_OBJECT_0 + 0:
      // a process with h[0] descriptor has exited
      break;

   case WAIT_OBJECT_0 + 1:
      // a process with h[1] descriptor has exited
      break;

   case WAIT_OBJECT_0 + 2:
      // a process with h[2] descriptor has exited
      break;
}

As we see, the function can return different values which show the reason the function returned. You already know the meaning of the first two values. Next values are returned by the following logic; WAIT_OBJECT_0 + index is returned which shows that the object from the array of HANDLEs whose index is index, has got signaled. If fWaitAll parameter is TRUE, WAIT_OBJECT_0 will be returned [if all the objects become signaled].

A thread, if it calls a wait-function, enters the kernel mode from the user mode. This fact is both bad and good. It is bad because to enter the kernel mode, approximately 1000 processor cycles are required which may be too expensive in a concrete situation. The good point is that after entering the kernel mode, no processor usage is needed; the thread is asleep.

Let's turn to MFC and see what it can do for us. There are two classes that encapsulate calls to wait-functions; CSingleLock and CMultiLock. We will see their usage later in this article.

Synchronization object Equivalent C++ class
Events CEvent
Critical sections CCriticalSection
Mutexes CMutex
Semaphores CSemaphore

Each of these classes inherits a single class - CSyncObject whose most usable member is the overloaded HANDLE operator that returns the underlying descriptor of a given synchronization object. All these classes are declared in <AfxMt.h> include file.

Events

Generally, events are used in cases when a thread [or threads] is supposed to start doing its job after a specified action has occurred. For example, a thread might wait until the necessary data is gathered and then start saving them in the hard drive. There are two kinds of events; manual-reset and auto-reset. By using an event we simply can notify another thread that a specified action has occurred. With a first kind of event, that is manual-reset, a thread can notify more than one thread about a specified action. But with a second kind of event, that is auto-reset, only one can be notified. In MFC, there is CEvent class that encapsulates the event object (in terms of Windows, it is represented by an HANDLE value). The constructor of CEvent allows us to create both manual-reset and auto-reset events. By default, the second kind of event is created. To notify the waiting threads, we should use CEvent::SetEvent method, this means that this kind of call will make the event enter the signaled state. If the event is manual-reset, then it will stay in signaled state until a corresponding CEvent::ResetEvent call is invoked which will make the event enter the nonsignaled state. This is the feature that allows a thread to notify more than one thread by a single SetEvent call. If the event is auto-reset, then only one thread from all waiting threads will be able to receive the notification. After it is received by a thread, the event will automatically enter the nonsignaled state. The following two examples will illustrate these thoughts. The first example:
// create an auto-reset event
CEvent g_eventStart;

UINT ThreadProc1(LPVOID pParam)
{
    ::WaitForSingleObject(g_eventStart, INFINITE);

        ...

    return 0;
}

UINT ThreadProc2(LPVOID pParam)
{
    ::WaitForSingleObject(g_eventStart, INFINITE);

        ...

    return 0;
}

In this code, a global CEvent object is created of auto-reset type. In addition, there are two working threads which are waiting for that event in order to start their job. As soon as a third thread calls SetEvent for that object, one and only one thread from these two threads (note that nobody can say exactly which one) will receive the notification, and afterwards the event will enter the nonsignaled state which will not allow a second thread to catch the event. The code, though not very useful, illustrates how an auto-reset event works. Let's look at the second example:

// create a manual-reset event
CEvent g_eventStart(FALSE, TRUE);

UINT ThreadProc1(LPVOID pParam)
{
    ::WaitForSingleObject(g_eventStart, INFINITE);

        ...

    return 0;
}

UINT ThreadProc2(LPVOID pParam)
{
    ::WaitForSingleObject(g_eventStart, INFINITE);

        ...

    return 0;
}

This code differs from the previous one by only the CEvent constructor's parameters. But in sense of functionality, there is a principal difference in the way that the two threads may work. If a third thread calls SetEvent method for this object, then it will be possible to guarantee that the two threads will start working at the same (almost same) time. This is because a manual-reset event, after entering the signaled state, will not enter the nonsignaled state until a corresponding ResetEvent call is done.

Yet another method for working with events - CEvent::PulseEvent. This method first makes the event enter the signaled state and then makes it enter back into the nonsignaled state. If the event is of manual-reset type, the event enters the signaled state then all the waiting threads are getting notified, and then it enters the nonsignaled state. If the event is of auto-reset type, then only one thread will get notified even if there are many threads waiting. If no thread is waiting, the call to ResetEvent will do nothing.

Example - WorkerThreads

In this example I will show how to create worker threads and how to destroy them properly. Here we define a controlling function which is used by all threads. Every time we click the view, one thread is created. All the created threads use the mentioned controlling function which will draw a moving ellipse in the view's client area. Here a manual-reset event is used which informs all the working threads about their death. Besides, we will see how to make the primary thread wait until all the worker threads leave the scene.

Maxheap example

All the ellipses are traversing in the client area and are not leaving its boundaries

  1. You should have an SDI application open. Assume the project name is WorkerThreads.
  2. Let's have a WM_LBUTTONDOWN message handler for launching our threads.
  3. Declare the controlling function. A controlling function may be declared in any file; the point is that it should have global access. Assume we have a Threads.h/Threads.cpp file in which the controlling function is declared/defined as follows:
    // Threads.h
    #pragma once
    
    struct THREADINFO
    {
        HWND hWnd;
        POINT point;
    };
    
    
    UINT ThreadDraw(PVOID pParam);
    
    Collapse
    // Threads.cpp
    extern CEvent g_eventEnd;
    
    UINT ThreadDraw(PVOID pParam)
    {
        static int snCount = 0;
        snCount ++;
        TRACE("- ThreadDraw %d: started...\n", snCount);
    
        THREADINFO *pInfo = reinterpret_cast<threadinfo /> (pParam);
    
        CWnd *pWnd = CWnd::FromHandle(pInfo->hWnd);
    
        CClientDC dc(pWnd);
    
        int x = pInfo->point.x;
        int y = pInfo->point.y;
    
        srand((UINT)time(NULL));
        CRect rectEllipse(x - 25, y - 25, x + 25, y + 25);
    
        CSize sizeOffset(1, 1);
    
        CBrush brush(RGB(rand()% 256, rand()% 256, rand()% 256));
        CBrush *pOld = dc.SelectObject(&brush);
        while (WAIT_TIMEOUT == ::WaitForSingleObject(g_eventEnd, 0))
        {
            CRect rectClient;
            pWnd->GetClientRect(rectClient);
    
            if (rectEllipse.left < rectClient.left || 
                rectEllipse.right > rectClient.right)
                sizeOffset.cx *= -1;
    
            if (rectEllipse.top < rectClient.top || 
                rectEllipse.bottom > rectClient.bottom)
                sizeOffset.cy *= -1;
    
            dc.FillRect(rectEllipse, CBrush::FromHandle
                ((HBRUSH)GetStockObject(WHITE_BRUSH)));
    
            rectEllipse.OffsetRect(sizeOffset);
    
            dc.Ellipse(rectEllipse);
            Sleep(25);
        }
    
        dc.SelectObject(pOld);
    
        delete pInfo;
    
        TRACE("- ThreadDraw %d: exiting.\n", snCount --);
        return 0;
    }
    

    This function takes a single object via its PVOID parameter, that is, a struct whose fields are the handle of the view, in order to be able to draw on its client area, and the point from where to start the cycle. Note that we should pass the very handle and not a CWnd pointer to let each thread create a temporary C++ object over the handle and use it. Otherwise all of them would share a single C++ object which is a potential danger in sense of safe multithreaded programming. In its core, the controlling function renders a moving circle in the client area of the view. Besides, include <Afxmt.h> file in "StdAfx.h" file to make CEvent visible.

    Another key point here is that we prepare a structure THREADINFO to pass to the thread. This technique is mostly used when there is a need to pass more than one value to a thread (or get them from a thread). We need to pass the window handle of the view and the initial point of the cycle that is going to be created. Each thread deletes the THREADINFO object passed to itself. Beware that this deletion is done in regard to our convention; that is, the primary thread should reserve a heap memory for a THREADINFO object and the targeting thread should delete it. The idea is that the primary thread doesn't know when to do deletion as the object will have been owned by the secondary thread itself.

  4. Declare an array variable in CWorkerThreadView class. We should store the pointer to CWinThread objects to use them later:
    private:
        CArray<CWinThread *, CWinThread *> m_ThreadArray;
    

    Besides, include <AfxTempl.h>; file in "StdAfx.h" file to make CArray visible.

  5. Change the file WorkerThreadsView.cpp. First define a global CEvent manual-reset variable somewhere at the beginning of the file:
    // manual-reset event
    CEvent g_eventEnd(FALSE, TRUE);
    

    Now add code to the WM_LBUTTONDOWN message handler:

    void CWorkerThreadsView::OnLButtonDown()
    {
        THREADINFO *pInfo = new THREADINFO;
        pInfo->hWnd = GetSafeHwnd();
        pInfo->point = point;
    
        CWinThread *pThread = AfxBeginThread(ThreadDraw, 
        (PVOID) pInfo, THREAD_PRIORITY_NORMAL, 0, CREATE_SUSPENDED);
        pThread->m_bAutoDelete = FALSE;
        pThread->ResumeThread();
        m_ThreadArray.Add(pThread);
    }
    

    Be aware that we exclude the auto-deletion property of a newly created thread but instead we store the pointer to that CWinThread object in our array. Note that we create an instance of THREADINFO in the heap and let the thread delete it after it finishes working with the structure. To make ThreadDraw and THREADINFO visible in WorkerThreadsView.cpp file, include "Threads.h" file.

  6. Take care to destroy the threads properly. As all threads are related to the view object (they are working with it), it will be reasonable to destroy them in the view's WM_DESTROY message handler:
    void CWorkerThreadsView::OnDestroy()
    {
        CView::OnDestroy();
    
        // TODO: Add your message handler code here
        g_eventEnd.SetEvent();
        for (int j = 0; j < m_ThreadArray.GetSize(); j ++)
        {
        ::WaitForSingleObject(m_ThreadArray[j]->m_hThread, INFINITE);
        delete m_ThreadArray[j];
        }
    }
    

    This function first makes the event become signaled to notify the working threads about their death, and then it uses WaitForSingleObject to make the primary thread wait for each worker thread until the later is destroyed fully. To do this we should have a valid CWinThread pointer even when the corresponding thread is destroyed; that is why we removed the auto-deletion property from CWinThread objects in the previous step. As soon as a worker thread exits, the second line of the for loop destroys the corresponding C++ object. Note that in each iteration a call to WaitForSingleObject is done which simply results in entering the kernel mode from the user mode. For example, for 10 iterations there will be wasted ~10000 processor cycles. To overcome this moment, we might use WaitForMultipleObjects. In this case we will need a C-array of thread descriptors. So, the above for loop could be replaced with the following code:

    //second way (comment in 'for' loop above)
    int nSize = m_ThreadArray.GetSize();
    HANDLE *p = new HANDLE[nSize];
    
    for (int j = 0; j < nSize; j ++)
    {
        p[j] = m_ThreadArray[j]->m_hThread;
    }
    
    ::WaitForMultipleObjects(nSize, p, TRUE, INFINITE);
    
    for (j = 0; j < nSize; j ++)
    {
        delete m_ThreadArray[j];
    }
    delete [] p;
    

    As the previous code executes only once and in addition at the end of the application, such improvements could hardly be valued much.

  7. This is all. You can test it.

Critical Sections

Unlike other synchronization objects, critical sections are working in the user mode unless there is a need to enter the kernel mode. If a thread tries to run a code that is caught be a critical section, it first does a spin blocking and after a specified amount of time, it enters the kernel mode to wait for the critical section. Actually, a critical section consists of a spin counter and a semaphore; the former is for the user mode waiting, and the later is for the kernel mode waiting (sleeping). In Win32 API, there is a CRITICAL_SECTION structure that represents critical section objects. In MFC, there is a class named CCriticalSection. Conceptually, a critical section is a sector of source code that is needed in integrated execution, that is, during the execution of that part of the code it should be guaranteed that the execution will not be interrupted by another thread. Such sectors of code may be required in cases when there is a need to grant a single thread the monopoly of using a shared resource. A simple case is using global variables by more than one thread. For example:

int g_nVariable = 0;

UINT Thread_First(LPVOID pParam)
{
    if (g_nVariable < 100)
    {
       ...
    }
    return 0;
}

UINT Thread_Second(LPVOID pParam)
{
    g_nVariable += 50;
    ...
    return 0;
}

This is not a safe code as no thread has a monopoly access to g_nVariable variable. Consider the following scenario; assume the initial value of g_nVariable is 80, the control is passed to the first thread which sees that the value of g_nVariable is less than 100 and thus it tries to execute the block under the condition. But at that time the processor switches to the second thread which adds 50 to the variable, so it becomes greater than 100. Afterwards, the processor switches back to the first thread and continues executing the if block. Guess what? Inside the if block the value of g_nVariable is greater than 100 though it is supposed to be less than 100. To cover this gap, we may use a critical section like so:

CCriticalSection g_cs;
int g_nVariable = 0;

UINT Thread_First(LPVOID pParam)
{
    g_cs.Lock();
    if (g_nVariable < 100)
    {
       ...
    }
    g_cs.Unlock();
    return 0;
}

UINT Thread_Second(LPVOID pParam)
{
    g_cs.Lock();
    g_nVariable += 20;
    g_cs.Unlock();
    ...
    return 0;
}

Here, two methods of CCriticalSection class are used. A call to Lock function informs the system that the execution of underlying code should not be interrupted until the same thread makes a call to Unlock function. In response to this call, the system first checks whether that code is not captured by another thread with the same critical section object. If it is, the thread waits until the capturing thread releases the critical section and than captures it itself.

If there are more than two shared resources to be protected, it would be a good practice to use a separate critical section per resource. Do not forget to match Unlock to each Lock. When using critical sections, one should be careful not to prepare mutual blocking situations for collaborating threads. This means that a thread could wait for a critical section to be freed by another thread, which in turn, waits for a critical section that is captured by the first thread. It is obvious that in such a case the two threads will wait forever.

There is a practice to embed critical sections into C++ classes and thus make them thread-safe. This kind of trick might be needed when the objects of a specific class are supposed to be used by more than one thread simultaneously. The big picture looks like this:

class CSomeClass
{
    CCriticalSection m_cs;
    int m_nData1;
    int m_nData2;

public:
    void SetData(int nData1, int nData2)
    {
        m_cs.Lock();
        m_nData1 = Function(nData1);
        m_nData2 = Function(nData2);
        m_cs.Unlock();
    }

    int GetResult()
    {
        m_cs.Lock();
        int nResult = Function(m_nData1, m_nData2);
        m_cs.Unlock();
        return nResult;
    }
};

It's possible that at the same time two or more threads call SetData and/or GetData methods for the same object of CSomeClass type. Therefore, by wrapping the content of those methods, we will prevent the data from getting distorted during those calls.

Mutexes

Mutexes, like critical sections, are designated to protect shared resources from simultaneous accesses. Mutexes are implemented inside the kernel and thus they enter the kernel mode to operate. A mutex can perform synchronization not only between different threads but also between different processes. Such a mutex should have a unique name to be recognized by another process (such mutexes are called named mutexes). MFC represents CMutex class for working with mutexes. A mutex might be used in this way:

CSingleLock singleLock(&m_Mutex);
singleLock.Lock();  // try to capture the shared resource
if (singleLock.IsLocked())  // we did it
{
    // use the shared resource ...

    // After we done, let other threads use the resource
    singleLock.Unlock();
}

Or the same by Win32 API functions:

// try to capture the shared resource
::WaitForSingleObject(m_Mutex, INFINITE);

// use the shared resource ...

// After we done, let other threads use the resource
::ReleaseMutex(m_Mutex);

A mutex can also be used to limit the number of running instances by a single one. The following code might be placed at the beginning of InitInstance method (or WinMain):

HANDLE h = CreateMutex(NULL, FALSE, "MutexUniqueName");
if (GetLastError() == ERROR_ALREADY_EXISTS)
{
    AfxMessageBox("An instance is already running.");
    return(0);
}

To guarantee a globally unique name, use a GUID instead.

Semaphores

In order to limit the number of threads that use shared resources we should use semaphores. A semaphore is a kernel object. It stores a counter variable to keep track of the number of threads that are using the shared resource. For example, the following code creates a semaphore by the MFC CSemaphore class which could be used to guarantee that only 5 threads at a maximum would be able to use the shared resource in a given time period (this fact is indicated by the first parameter of the constructor). It is supposed that no threads have captured the resource initially (the second parameter):

CSemaphore g_Sem(5, 5);

As soon as a thread gets access to the shared resource, the counter variable of the semaphore is decremented by one. If it becomes equal to zero, then any further attempt to use the resource will be rejected until at least one thread that has captured the resource leaves it (in other words, releases the semaphore). We may turn to CSingleLock and/or CMultiLock classes to wait/capture/release a semaphore. We could also use the API functions as shown below:

// Try to use the shared resource
::WaitForSingleObject(g_Sem, INFINITE);
// Now the user's counter of the semaphore has decremented by one

//... Use the shared resource ...

// After we done, let other threads use the resource
::ReleaseSemaphore(g_Sem, 1, NULL);
// Now the user's counter of the semaphore has incremented by one

Communication between Secondary Threads and the Primary Thread

If a primary thread wants to inform a secondary thread about some action, it is convenient to use an event object. But doing vice-versa will be inefficient and not convenient for users since stopping the primary thread to wait for an event may (and mostly does) slow down the application. In this case it would be correct to use user-defined messages to interact with the primary thread. Such a message should be addressed to a specific window which means that the descriptor of such a window should be visible to callers (secondary threads).

To create a user-defined message, we firstly should define an identifier for that message (more correctly - define the message itself). Supposedly, such an identifier should be visible to both the primary thread and secondary threads:

#define WM_MYMSG WM_USER + 1

WM_USER+n messages are supposed to be unique through a window class but not through the application. A more secure [in sense of its uniqueness] way is to use WM_APP+n messages like so:

#define WM_MYMSG WM_APP + 1

Next, a handler method should be declared for the message inside the window class declaration to which (window) the message is going to be addressed:

afx_msg LRESULT OnMyMessage(WPARAM , LPARAM );

Of course, there should be some definition of the method:

LRESULT CMyWnd::OnMyMessage(WPARAM wParam, LPARAM lParam)
{
    // A notification got
    // Do something ...
    return 0;
}

And finally, to assign the handler to the message identifier, ON_MESSAGE macro should be used inside BEGIN_MESSAGE_MAP and END_MESSAGE_MAP pairs:

BEGIN_MESSAGE_MAP(CMyWnd, CWnd)
    ...

    ON_MESSAGE(WM_MYMSG, OnMyMessage)
END_MESSAGE_MAP()

Now a secondary thread having a window handle [that lives in the primary thread], can notify it by the user-defined message as follows:

UINT ThreadProc(LPVOID pParam)
{
    HWND hWnd = (HWND) pParam;

    ...

    // notify the primary thread's window
    ::PostMessage(hWnd, WM_MYMSG, 0, 0);

    return 0;
}

History

This text was first written more than three years ago. At that time I was a two-year old programmer. My intention was to write a book about MFC. Funny? But I was too young to write a book, and thus my chapters have stayed in my computer only. Now I've rewritten a text from there and submitted it to you. And of course, any note you think is worth suggesting about this essay would be appreciated very much.