I am currently doing research on accelerating FHE with GPUs. I am experimenting with accelerating OpenFHE on AMD GPUs using code written in HIP. Is there currently a GPU backend in development?
I am trying to understand exactly where I should place my code so that it slots in properly with the HAL. I originally put it inside the operator+= overload in dcrtpoly-impl.h that adds two DCRTPolyImpl objects (line 409). This code is executed when I run examples such as simple-real-numbers. However, if I try to benchmark using poly-benchmark-64k, this code is never executed.
My other idea is to duplicate the fixed bitwidth backend, rewrite its vector operations to utilize the GPU, and then compile with that. Is that a better approach?
Any insights would be greatly appreciated.
It is difficult to figure out what caused the problems you encountered without examining your code.
Regarding your alternative approach, I would recommend focusing on the native integer/vector mathematical backends. On GPUs, big integer operations are unnecessary; RNS (Residue Number System) can handle everything effectively, especially if you are interested in GPU acceleration of homomorphic evaluations. Big integer backends are mostly used for the setup phase of OpenFHE which does not need to be accelerated.
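To make the RNS point concrete, here is a minimal sketch of why no big integers are needed once values are in residue form. The moduli and the names `kModuli`, `Residues`, `ToRns`, `RnsAdd`, and `RnsMul` are hypothetical and chosen only for illustration; they are not OpenFHE APIs, and real moduli would be NTT-friendly primes of roughly 60 bits.

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Toy pairwise-coprime moduli (assumption, for illustration only).
// Their product is Q = 97 * 101 * 103 = 1009091.
constexpr std::array<uint64_t, 3> kModuli = {97, 101, 103};

using Residues = std::array<uint64_t, 3>;

// Map an integer to its residues modulo each q_i.
Residues ToRns(uint64_t x) {
    Residues r{};
    for (std::size_t i = 0; i < kModuli.size(); ++i) {
        r[i] = x % kModuli[i];
    }
    return r;
}

// Addition mod Q becomes independent additions mod q_i.
Residues RnsAdd(const Residues& a, const Residues& b) {
    Residues r{};
    for (std::size_t i = 0; i < kModuli.size(); ++i) {
        r[i] = (a[i] + b[i]) % kModuli[i];
    }
    return r;
}

// Multiplication mod Q becomes independent multiplications mod q_i.
// With these toy moduli the product fits easily in 64 bits; 60-bit
// moduli would need a 128-bit intermediate or Barrett/Montgomery reduction.
Residues RnsMul(const Residues& a, const Residues& b) {
    Residues r{};
    for (std::size_t i = 0; i < kModuli.size(); ++i) {
        r[i] = (a[i] * b[i]) % kModuli[i];
    }
    return r;
}
```

Each residue channel is processed independently with word-sized arithmetic, which is the property that makes this representation a natural fit for GPU threads.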
I am a little confused. What do you mean by the setup phase of OpenFHE? Key generation and encryption? Why are the big integer backends under the HAL if they aren't really used for acceleration?
If I understand you correctly, you are saying that GPU acceleration is mostly useful for the matrix operations once the ciphertexts are in CRT/DCRT format which only uses the native integer/vector operations.
The setup phase involves calculating cryptographic parameters and deriving scheme-specific constants. Since these computations are performed only once during system initialization, they can be efficiently executed on the CPU. OpenFHE does require multi-precision (BigInteger) support during this phase.
The primary cryptographic operations, including Key Generation, Encoding, Decoding, Encryption, Decryption, and Homomorphic Evaluations, can all be implemented using native integers/vectors for enhanced performance. Notably, homomorphic evaluations, which are considered computationally intensive and executed more frequently, should be offloaded to the GPU for accelerated execution.
If you follow the approach above, you can focus your development efforts on supporting the native integer/vector backend only. You are of course free to implement all the mathematical backends on GPUs, but I do not recommend it.
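As a rough illustration of the kind of native-integer work that would be offloaded, below is a CPU-side sketch of an element-wise modular addition over all RNS towers of a polynomial. The `FlatRnsPoly` struct, its tower-major layout, and `AddInPlace` are hypothetical and are not how OpenFHE stores a DCRTPoly; the sketch also assumes every coefficient is already reduced and every modulus is below 2^63. On a GPU, the loop body would become the per-thread work, with `idx` taken from the thread index.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat layout: coeffs holds moduli.size() * ringDim values,
// tower by tower, so tower t occupies [t * ringDim, (t + 1) * ringDim).
struct FlatRnsPoly {
    std::size_t ringDim;
    std::vector<uint64_t> moduli;  // one native modulus per tower
    std::vector<uint64_t> coeffs;  // residues, each already reduced mod its tower modulus
};

// a += b, coefficient-wise, each tower with its own modulus.
void AddInPlace(FlatRnsPoly& a, const FlatRnsPoly& b) {
    assert(a.ringDim == b.ringDim && a.moduli == b.moduli);
    assert(a.coeffs.size() == a.moduli.size() * a.ringDim);
    assert(b.coeffs.size() == a.coeffs.size());

    const std::size_t total = a.coeffs.size();
    for (std::size_t idx = 0; idx < total; ++idx) {
        const uint64_t q = a.moduli[idx / a.ringDim];  // modulus of this tower
        uint64_t sum = a.coeffs[idx] + b.coeffs[idx];  // cannot overflow when q < 2^63
        if (sum >= q) {
            sum -= q;  // one conditional subtraction replaces a division
        }
        a.coeffs[idx] = sum;
    }
}
```

Every iteration is independent of the others, so the whole polynomial can be processed in a single kernel launch rather than one launch per tower.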