OpenFHE on AMD APUs

Hi,

I’ve been experimenting with running parts of the OpenFHE library on the AMD MI300A APU [1], which, unlike a discrete GPU, provides unified shared memory between its CPU and GPU cores.

After profiling and considering the work presented in [2], I modified the ApproxSwitchCRTBasis function. Specifically, I replaced 128-bit integer operations with a struct of two 64-bit integers, and rewrote the related arithmetic functions such as multiplication and Barrett reduction to operate on this new representation.
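For anyone curious, the general idea can be sketched as below. This is a minimal illustration assuming a two-limb representation and a 64x64 -> 128-bit multiply built from 32-bit partial products; the names (`Uint128`, `Mul64`) are illustrative, not the actual OpenFHE or HIP types:

```cpp
#include <cstdint>

// Illustrative two-limb replacement for a native 128-bit integer.
// (Hypothetical names; not OpenFHE's internal representation.)
struct Uint128 {
    uint64_t lo;
    uint64_t hi;
};

// Full 64x64 -> 128-bit product computed from four 32-bit partial
// products, avoiding any native 128-bit type.
Uint128 Mul64(uint64_t a, uint64_t b) {
    const uint64_t mask = 0xFFFFFFFFULL;
    uint64_t aLo = a & mask, aHi = a >> 32;
    uint64_t bLo = b & mask, bHi = b >> 32;

    uint64_t p0 = aLo * bLo;  // low x low
    uint64_t p1 = aLo * bHi;  // cross terms
    uint64_t p2 = aHi * bLo;
    uint64_t p3 = aHi * bHi;  // high x high

    // Sum the 32-bit-aligned middle terms; this also collects the carry
    // that propagates into the high limb.
    uint64_t mid = (p0 >> 32) + (p1 & mask) + (p2 & mask);

    Uint128 r;
    r.lo = (p0 & mask) | (mid << 32);
    r.hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
    return r;
}
```

Barrett reduction can then be rebuilt on top of this primitive in the same style, operating limb-by-limb on the `Uint128` result.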

For a ring dimension of 16384 and p and q sizes of 4, I measured the performance (wall time) of the modified parallelized loop by executing a single multiplication operation under the BFV encryption scheme. So far, performance is roughly on par with the default implementation. This figure excludes the overhead of marshalling and unmarshalling data for the GPU computation.
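For reference, the wall-time measurement was taken in the usual way with a steady clock around the loop under test. A minimal sketch (the kernel below is a stand-in, not OpenFHE's ApproxSwitchCRTBasis, and the parallelization pragma is omitted):

```cpp
#include <chrono>
#include <cstdint>
#include <vector>

// Times one pass over the data and returns elapsed wall time in ms.
// The loop body is a placeholder for the parallelized kernel under test.
double TimeLoopMs(std::vector<uint64_t>& v) {
    auto start = std::chrono::steady_clock::now();

    for (auto& x : v) {
        x = x * 6364136223846793005ULL + 1442695040888963407ULL;
    }

    auto stop = std::chrono::steady_clock::now();
    return std::chrono::duration<double, std::milli>(stop - start).count();
}
```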

I will continue experimenting; any feedback is appreciated.

[1] https://www.amd.com/content/dam/amd/en/documents/instinct-tech-docs/white-papers/amd-cdna-3-white-paper.pdf

[2] Towards GPU Accelerated FHE Computations (IEEE conference publication, IEEE Xplore)

I just wanted to comment that I achieved a performance improvement of about 25% compared to the CPU (AMD EPYC 9374F), excluding data (un)marshalling overhead. The ring dimension was 32768. Ensuring that vector and array elements are stored in contiguous memory locations had the most influence on performance.
