Why Researchers Should Use C++ Unit Testing

In this post, I will explain how to use GTest (the Google C++ Testing Framework), configured with CMake, for C++ unit testing. Unit testing gives you confidence in the correctness of your code, especially when you modify it to incorporate optimizations or special conditions.

Purpose Statement

Often in academia and research, people tend to think that investing time in proper testing is a waste, since they expect to prototype multiple ideas and simply observe whether their initial hypothesis was correct. But I have seen how this approach leads to very complicated projects, and when you introduce new people to such a project, it’s virtually impossible for them to understand what specific sections of the code do. Therefore, I am a strong proponent of unit testing. Testing lets you focus on functionality and gives you a sense of assurance that your code works!

Differences in C++ Unit Testing between Academia and Industry

From my 3+ years of industry experience, I can say that in commercial environments your tests need to be extremely thorough. For example, the assumptions you make about user inputs should be kept to a minimum (e.g., you must handle unexpected inputs such as string literals and negative numbers). Also, exception handling and terminating conditions are vital, since you want your application to handle all possible error conditions safely.

However, in research, the focus is more on functionality, and it is safe to make less restrictive assumptions. For example, input sizes to your algorithm can be safely assumed to be unsigned integers. The objective is to guarantee functionality under a reasonable set of assumptions, not to produce production-level code.
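To make the contrast concrete, here is a minimal sketch in C++. Both functions and their names (chunk_size, parse_length) are hypothetical and are not part of the merge sort repository; they only illustrate where the input assumption might live in each setting.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <string>

// Research-style (hypothetical): the unsigned type encodes the assumption,
// and the remaining precondition is only checked in debug builds.
std::size_t chunk_size(std::size_t length, std::size_t num_threads) {
    assert(num_threads > 0);                          // trusted caller, debug-only check
    return (length + num_threads - 1) / num_threads;  // ceiling division
}

// Industry-style (hypothetical): raw user input is validated before it
// ever reaches the algorithm, and bad input is reported explicitly.
std::size_t parse_length(const std::string& text) {
    if (text.empty() ||
        text.find_first_not_of("0123456789") != std::string::npos) {
        throw std::invalid_argument("length must be a non-negative integer: " + text);
    }
    return static_cast<std::size_t>(std::stoull(text));
}
```

In a research prototype, tests for chunk_size would cover only valid sizes; in a commercial setting, parse_length would also need tests for empty strings, signs, letters and overflow.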

Testing Parallel Merge Sort

In this post, I’m using an example from my previous post: https://malithjayaweera.com/2019/02/parallel-merge-sort/. I will walk you through configuring Google Test and writing the tests, step by step. The code is available in the GitHub repository: https://github.com/malithj/gtest-merge-sort

Step 1: Configuring CMake Project

You will need to add the Google Test libraries and header files to your build. The easiest way is to configure Google Test as an external project. This ensures that CMake downloads the Google Test GitHub repository, builds the library and links it with your program. A sample CMakeLists.txt is given below. The CMake version used in this example is 3.15.1.

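# ExternalProject_Add is provided by the ExternalProject module
include(ExternalProject)

# Assumed install location for the downloaded library; adjust as needed
set(EXTERNAL_INSTALL_LOCATION ${CMAKE_BINARY_DIR}/external)

ExternalProject_Add(googletest_
    GIT_REPOSITORY https://github.com/google/googletest
    GIT_TAG release-1.8.1
    CMAKE_ARGS -DCMAKE_INSTALL_PREFIX=${EXTERNAL_INSTALL_LOCATION}
)

include_directories(${EXTERNAL_INSTALL_LOCATION}/include)
link_directories(${EXTERNAL_INSTALL_LOCATION}/lib)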

Step 2: Adding the Link Libraries

Next, you will need to link the Google Test library with your binary.

# Add the executable target; its source files are attached with target_sources below
add_executable(main)

target_sources(main
    PRIVATE
        ${CMAKE_CURRENT_LIST_DIR}/parallel_merge_sort.cc
        ${CMAKE_CURRENT_LIST_DIR}/../test/main.cc
)

add_dependencies(main googletest_)
target_link_libraries(main gtest gtest_main pthread)

Now, if you execute cmake followed by make, a binary should be created for your program.

Step 3: Writing a Test Case

Ideally, each test case should cover one aspect of functionality. In this scenario, I’m writing tests to ensure that the program handles the input size and concurrency properly. To do that, I will compare the output of std::sort with the output of the parallel merge sort algorithm that I have written. The code is available in the test directory (test/merge_test_suite.h).
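The listing that follows uses a SEED constant and a generate_random_number helper that this post does not show. As an assumption only (the actual definitions live in the repository and may well differ), they could look something like this:

```cpp
#include <cassert>
#include <cstdlib>

// Assumed value: any fixed constant makes the rand() sequence repeatable.
static const unsigned int SEED = 42;

// Assumed behaviour: a pseudo-random integer in the closed range [lower, upper].
// This sketch assumes the range does not span the whole of unsigned int.
int generate_random_number(unsigned int lower, unsigned int upper) {
    assert(lower <= upper);                        // precondition of this sketch
    unsigned int span = upper - lower + 1;         // number of admissible values
    return static_cast<int>(lower + static_cast<unsigned int>(rand()) % span);
}
```

Calling srand(SEED) at the start of each test means every run sees the same inputs, so a failing case can be reproduced exactly.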

/*  Test to check whether sort is performed properly with varying  *
 *  input sizes but with fixed parallelism                         */
TEST(PARALLEL_MERGE_SORT, INPUT_SIZE) {
    srand(SEED);
    int length = 10000;
    int num_threads = 2;
    unsigned int lower_lim = 1;
    unsigned int upper_lim = 10000;

    /* change input sizes and test sort */
    for (int j = 10; j < 1000; j ++) {
        length = j;
        /* define array */
        int * test_arr = (int *)malloc(sizeof(int) * length);
        int * actual_arr = (int *)malloc(sizeof(int) * length);

        /* initialize array with random numbers */
        for (int i = 0; i < length; i ++) {
            int random = generate_random_number(lower_lim, upper_lim);
            test_arr[i] = random;
            actual_arr[i] = random;
        }

        /* sort */
        parallel_merge_sort(test_arr, length, num_threads);
        std::sort(actual_arr, actual_arr + length); 

        /* perform comparison */
        for (int i = 0; i < length; i ++) {
            EXPECT_EQ(test_arr[i], actual_arr[i]);
        }
        free(test_arr);
        free(actual_arr);
    }
}


/*  Test to check whether sort is performed properly with varying  *
 *  concurrency but with fixed input size                          */
TEST(PARALLEL_MERGE_SORT, NUM_THREADS) {
    srand(SEED);
    int length = 10000;
    int num_threads = 2;
    unsigned int lower_lim = 1;
    unsigned int upper_lim = 10000;

    /* change number of threads and test sort */
    for (int j = 1; j < 5; j ++) {
        num_threads = j;
        /* define array */
        int * test_arr = (int *)malloc(sizeof(int) * length);
        int * actual_arr = (int *)malloc(sizeof(int) * length);

        /* initialize array with random numbers */
        for (int i = 0; i < length; i ++) {
            int random = generate_random_number(lower_lim, upper_lim);
            test_arr[i] = random;
            actual_arr[i] = random;
        }

        /* sort */
        parallel_merge_sort(test_arr, length, num_threads);
        std::sort(actual_arr, actual_arr + length); 

        /* perform comparison */
        for (int i = 0; i < length; i ++) {
            EXPECT_EQ(test_arr[i], actual_arr[i]);
        }
        free(test_arr);
        free(actual_arr);
    }
}
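Both tests share the same generate, sort and compare pattern. As a rough sketch, that pattern could be factored into a helper that uses std::vector, so there is no manual malloc/free, and that reports only the first mismatch instead of one failure per element. The name first_mismatch, the SortFn alias, and the use of std::mt19937 in place of rand() are my assumptions for illustration, not code from the repository. The four-iterator std::mismatch needs C++14.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <functional>
#include <random>
#include <vector>

using SortFn = std::function<void(std::vector<int>&)>;

// Hypothetical helper: index of the first position where candidate_sort
// disagrees with std::sort on the same random input, or -1 if they agree.
std::ptrdiff_t first_mismatch(const SortFn& candidate_sort,
                              std::size_t length, unsigned int seed) {
    std::mt19937 gen(seed);                              // fixed seed, reproducible input
    std::uniform_int_distribution<int> dist(1, 10000);   // same value range as the tests

    std::vector<int> test_arr(length);
    for (int& v : test_arr) v = dist(gen);
    std::vector<int> expected = test_arr;                // identical copy for the reference

    candidate_sort(test_arr);
    std::sort(expected.begin(), expected.end());

    auto diff = std::mismatch(test_arr.begin(), test_arr.end(),
                              expected.begin(), expected.end());
    bool same = diff.first == test_arr.end() && diff.second == expected.end();
    return same ? -1 : diff.first - test_arr.begin();
}

// Sanity check of the helper itself: std::sort must agree with the reference.
void check_first_mismatch() {
    SortFn reference = [](std::vector<int>& v) { std::sort(v.begin(), v.end()); };
    assert(first_mismatch(reference, 1000, 42u) == -1);
}
```

Inside a test body, this might be used as EXPECT_EQ(first_mismatch(adapter, length, SEED), -1), where adapter is a lambda that forwards v.data() and v.size() to parallel_merge_sort together with the desired thread count.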

Step 4: Configuring the Driver for Tests

You can add a main program that drives the tests by including the header file that contains them.

#include "gtest/gtest.h"
#include "merge_test_suite.h"

int main(int argc, char* argv[]) {
    testing::InitGoogleTest(&argc, argv);
    /* RUN_ALL_TESTS() returns 0 if every test passes and 1 otherwise */
    return RUN_ALL_TESTS();
}

Step 5: Add CTest to Automate Testing

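Adding include(CTest) to your CMakeLists.txt enables CMake's testing module. Once the test binary is registered with add_test, for example add_test(NAME merge_sort_tests COMMAND main) (the test name here is only an illustration), the tests can be run with ctest or make test.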

And that’s it! Writing tests can be cumbersome at times, but it will save you a significant amount of debugging time in the future. I ended up thanking myself for writing unit tests when I had to debug extreme cases.
