By: Jay Kruemcke

October 12, 2020 2:49 pm

742 views

Simplified access to the NVIDIA CUDA toolkit on SUSE Linux for HPC

Overview: The High-Performance Computing industry is rapidly embracing AI and ML technology in addition to legacy parallel computing. Heterogeneous computing, the use of both CPUs and accelerators such as graphics processing units (GPUs), has become increasingly common, and GPUs from NVIDIA are the most popular accelerators used today for AI/ML workloads. To […]



By: Sumit Jamgade

May 11, 2018 10:00 am

2,418 views

An Experiment with Gnocchi – (the Database) – Part 3

In Part 2, I summarized the iterations applied to the kernel (the piece of CUDA code that executes on the GPU) to remove the bottlenecks found during profiling, such as using shared memory to avoid non-coalesced memory access. In this part, I will talk about the final version of the kernel and using the GPU in other […]



By: Sumit Jamgade

May 7, 2018 8:41 am

3,078 views

An Experiment with Gnocchi – (the Database) – Part 2

In Part 1, I introduced the problem of metrics aggregation for Ceilometer in OpenStack and how Gnocchi (a time-series database) tries to solve it in a different way. I argued for the possibility of offloading work to a GPU (an NVIDIA Quadro K620). This part summarizes the iterations applied to the kernel (the piece of CUDA code that executes […]



By: Sumit Jamgade

May 3, 2018 2:11 pm

3,583 views

An Experiment with Gnocchi – (the Database) – Part 1

Background (Why): OpenStack has a component called Ceilometer (Telemetry). Ceilometer was huge, and its main purpose was to help with the monitoring and metering of the whole of OpenStack. But because of the scale at which OpenStack operates, Ceilometer would often fall behind and become a major bottleneck for the very function it was designed to perform. […]
