How to Write a Makefile with Ease

Makefiles provide a way to organize the build steps involved in compiling a C / C++ project. This article explains how you can set up your own makefile for your C / C++ project. Why Use a Makefile? Compiling with g++ usually involves a command like the following. The command will compile each C++ source file and…

Why Researchers Should Use C++ Unit Testing

In this post, I will explain how to use GTest (Google C++ Testing Framework) configured with CMake to handle C++ unit testing. Unit testing helps you verify the correctness of your code, especially when you modify it to incorporate optimizations or special conditions. Purpose Statement Often in academia and research, people tend to think that investing…

Hacking DNN Architectures in Cloud Environments

Deep Neural Networks (DNNs) are increasingly being deployed in commercial cloud environments. However, in shared environments your intellectual property might not be as safe as you once thought. In fact, you might just have allowed your competitor to reverse engineer your architecture. In this post, I will explain how an adversary can use the cache hierarchy…

IWOCL 19: Let’s Teach Our Computers to Program

Recently I had the privilege of joining the organizing committee of the International Workshop on OpenCL (IWOCL) 2019 as a volunteer. The workshop ran for three days; however, I was most excited to attend a conference held in conjunction with IWOCL, namely DHPCC++ (Distributed & Heterogeneous Programming in C/C++). IWOCL: The Push Towards Heterogeneous Programming The end…

Compilation and Linking CUDA with C

Managing complexity and modularity becomes important as your project scope increases. Therefore, separately compiling CUDA code and linking it with C is a must-have. Learn how you can compile your CUDA code separately and link it with your C object code. Example Files As an example, we will look at a stencil computation (nearest neighbor computation). Let’s…

Parallel Merge Sort with Pthreads

Most parallel merge sort implementations on the web do not consider how elements are divided among threads when the total number of elements is not evenly divisible by the number of threads. Also, the final merge (after all threads have been joined) should happen in a recursive manner. But first, let’s go through a…