This tutorial explains how users can inject their own OpenCL compute kernels for use with ViennaCL objects.
We start with including the necessary headers:
The next step is to define the custom compute kernels in a string. It is assumed that you are familiar with writing basic OpenCL kernels. If this is not the case, please have a look at one of the many OpenCL tutorials on the web.
We define two custom compute kernels that compute the element-wise product and the element-wise division of two vectors.
Input: v1 ... vector
v2 ... vector
Output: result ... vector
Algorithm: set result[i] <- v1[i] * v2[i]
or result[i] <- v1[i] / v2[i]
(in MATLAB notation: 'result = v1 .* v2' and 'result = v1 ./ v2')
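The specification above can be sketched as an OpenCL source string embedded in the host code. This is a minimal sketch, not necessarily the exact source used in the tutorial: the kernel names elementwise_prod and elementwise_div, the float element type, and the helper declares_kernel() are assumptions made for illustration. Only the string name my_compute_kernel and the fact that the fourth argument is the vector size come from the text.

```cpp
#include <cassert>
#include <string>

// OpenCL source for both kernels, held in one string (one compilation unit).
// Assumed here: kernel names, float elements, and a grid-stride loop so that
// any number of work items covers all 'size' entries.
static const char * const my_compute_kernel = R"CLC(
__kernel void elementwise_prod(__global const float * vec1,
                               __global const float * vec2,
                               __global float * result,
                               unsigned int size)
{
  for (unsigned int i = get_global_id(0); i < size; i += get_global_size(0))
    result[i] = vec1[i] * vec2[i];
}

__kernel void elementwise_div(__global const float * vec1,
                              __global const float * vec2,
                              __global float * result,
                              unsigned int size)
{
  for (unsigned int i = get_global_id(0); i < size; i += get_global_size(0))
    result[i] = vec1[i] / vec2[i];
}
)CLC";

// Hypothetical host-side sanity check: does the source declare a kernel
// with the given name? Useful for catching typos before compilation.
inline bool declares_kernel(const std::string & source, const std::string & name)
{
  assert(!name.empty());
  return source.find("__kernel void " + name + "(") != std::string::npos;
}
```

The loop stride of get_global_size(0) means the kernels stay correct whether they are launched with exactly one work item per entry or with fewer.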
Since no auxiliary routines are needed, we can directly start with main().
Initialize OpenCL vectors:
Fill the operands vec1 and vec2 with some numbers.
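One way to fill the operands is to prepare the values in host memory first. The sketch below uses plain std::vector<float> objects as stand-ins for the device vectors; the helper name fill_operands() and the particular values (i and n - i) are assumptions, since the text only asks for "some numbers". The second operand is chosen so that it is never zero, which keeps the element-wise division well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Fills two equally sized host-side operands: a[i] = i and b[i] = n - i.
// b[i] is at least 1 for every valid index, so a[i] / b[i] never divides by zero.
inline void fill_operands(std::vector<float> & a, std::vector<float> & b)
{
  assert(a.size() == b.size());
  const std::size_t n = a.size();
  for (std::size_t i = 0; i < n; ++i)
  {
    a[i] = static_cast<float>(i);
    b[i] = static_cast<float>(n - i);
  }
}
```

Preparing the data on the host and transferring it in one step is typically cheaper than writing device entries one at a time.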
Set up the OpenCL program given in my_compute_kernel. A program is one compilation unit and can hold many different compute kernels.
Now we can get the kernels from the program 'my_program'. Note that all kernels must be registered via add_kernel() before get_kernel() is called for the first time; otherwise, existing references might be invalidated.
Launch the kernel with 'vector_size' threads in one work group. Note that the width of std::size_t might differ between host and device, so a cast to cl_uint is necessary for the fourth argument.
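The cast silently truncates if the host-side size does not fit into 32 bits, so a checked conversion can be worthwhile. The following is a minimal sketch: cl_uint_like is a stand-in for OpenCL's cl_uint (a 32-bit unsigned integer) so that the example compiles without OpenCL headers, and to_kernel_size() is a hypothetical helper, not part of ViennaCL or OpenCL.

```cpp
#include <cstddef>
#include <cstdint>
#include <limits>
#include <stdexcept>

// Stand-in for OpenCL's cl_uint, which is a 32-bit unsigned integer.
typedef std::uint32_t cl_uint_like;

// Hypothetical helper: narrows a host-side size to 32 bits and throws
// instead of truncating when the value does not fit.
inline cl_uint_like to_kernel_size(std::size_t n)
{
  if (n > std::numeric_limits<cl_uint_like>::max())
    throw std::overflow_error("vector size does not fit into a 32-bit kernel argument");
  return static_cast<cl_uint_like>(n);
}
```

In real host code the result of such a helper would take the place of the plain cast when the size is passed as the fourth kernel argument.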
Print the result:
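Besides printing, the device results can be compared against a reference computed on the host. The sketch below shows one possible check; the function names reference_prod(), reference_div(), and max_abs_diff() are hypothetical, and std::vector<float> stands in for result data that has been copied back from the device.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Host reference for result[i] = v1[i] * v2[i].
inline std::vector<float> reference_prod(const std::vector<float> & v1,
                                         const std::vector<float> & v2)
{
  assert(v1.size() == v2.size());
  std::vector<float> result(v1.size());
  for (std::size_t i = 0; i < v1.size(); ++i)
    result[i] = v1[i] * v2[i];
  return result;
}

// Host reference for result[i] = v1[i] / v2[i].
inline std::vector<float> reference_div(const std::vector<float> & v1,
                                        const std::vector<float> & v2)
{
  assert(v1.size() == v2.size());
  std::vector<float> result(v1.size());
  for (std::size_t i = 0; i < v1.size(); ++i)
    result[i] = v1[i] / v2[i];
  return result;
}

// Largest absolute entry-wise difference between two equally sized vectors.
inline float max_abs_diff(const std::vector<float> & a, const std::vector<float> & b)
{
  assert(a.size() == b.size());
  float diff = 0.0f;
  for (std::size_t i = 0; i < a.size(); ++i)
    diff = std::max(diff, std::fabs(a[i] - b[i]));
  return diff;
}
```

A small tolerance on max_abs_diff() is usually more appropriate than an exact comparison, because device and host floating-point arithmetic may round differently, particularly for division.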
We are already done. We only needed a few lines of code by letting ViennaCL deal with the details :-)
: User-Defined Kernels