In Part 2, I summarized the iterations applied to the kernel (the piece of CUDA code that executes on GPU) to remove the bottlenecks encountered during profiling, like using shared memory to avoid non-coalesced memory access. In this part, I will talk about the final version of the kernel and using the GPU in other […]
In Part 1, I introduced the problem of metrics aggregations for Ceilometer in OpenStack and how Gnocchi (a time-series database) tries to solve it in a different way. I argued about possibility of offloading work to (Nvida Quadro K620) GPU.
This part summarizes the iterations applied to the kernel (piece of CUDA code, that executes […]
Openstack has component: Ceilometer (Telemetry). Ceilometer was huge and its main purpose was to help with the Monitoring and Metering of whole of th OpenStack. But because of the scale at which OpenStack operates, Ceilometer would often fall behind and become a major bottleneck for the same function it was designed to do. […]