Running OpenFHE in the most efficient way for an intensive application

ngesbrian23 · February 1, 2025, 12:32am

I have access to 96 CPUs (AMD EPYC 7713 64-Core Processor) to run my OpenFHE application. When I run it naively on the system, it takes about 90 minutes to complete a single round of my job, and I have 100 rounds to go through in my for-loop. I am heavily underutilizing these resources: I use only about 10% of what I have access to.

What is the best way to parallelize OpenFHE or get the maximum throughput for my job on this system? I am not really resource-bound. I have looked at openfhe-development/src/core/examples/parallel.cpp at main · openfheorg/openfhe-development · GitHub, and I am not sure whether I should use OpenMP to parallelize all my loops, or whether there are OpenFHE configuration options, in the library build or in my code, that would increase performance by default.

Thanks

ypolyakov · February 6, 2025, 5:11am

Please read the following section of OpenFHE performance optimization guidelines: openfhe-development/docs/static_docs/Best_Performance.md at main · openfheorg/openfhe-development · GitHub

adolgert · February 6, 2025, 8:12pm

Hi. I’ve run OpenFHE on large HPC clusters. You can only call the OpenFHE library from a single thread within each process. If you use OpenMP, for instance, to make calls that look like they are independent, you will usually see random failures. These result from some overlapping use of data within the library. That means that if you want to use every one of 64 cores, you need to start 64 processes and exchange data among the processes using, for instance, MPI, TCP/IP, or other message protocols.
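To make the one-process-per-worker idea concrete, here is a minimal launcher-side sketch in Python. It assumes the 100 rounds are independent of each other (if each round needs the previous round's output, this split does not apply), and it assumes a worker executable that accepts a range of rounds. The name `./fhe_worker` and the `--first-round` / `--end-round` flags are hypothetical placeholders, not part of OpenFHE.

```python
def split_rounds(total_rounds, num_procs):
    """Split rounds 0..total_rounds-1 into contiguous [start, stop) ranges,
    one per process, with sizes differing by at most one."""
    if num_procs < 1:
        raise ValueError("num_procs must be at least 1")
    base, extra = divmod(total_rounds, num_procs)
    ranges, start = [], 0
    for i in range(num_procs):
        # The first `extra` processes take one additional round.
        stop = start + base + (1 if i < extra else 0)
        ranges.append((start, stop))
        start = stop
    return ranges


def worker_commands(total_rounds, num_procs, worker="./fhe_worker"):
    """Build one command line per process. The worker binary and its
    flags are hypothetical; empty ranges are skipped."""
    return [
        [worker, "--first-round", str(a), "--end-round", str(b)]
        for a, b in split_rounds(total_rounds, num_procs)
        if a < b
    ]
```

Each command can then be started with `subprocess.Popen`, or the same ranges can be handed to MPI ranks. Either way, every process makes its OpenFHE calls from a single thread and writes or sends its results separately.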

There is one important exception. It is possible to serialize and deserialize ciphertexts in parallel using OpenMP. That works well, which is great because this is a slow part of parallel code.

Yuriy pointed to a document that recommends using the parallelism built into the library. He’s right, and this means that, if you have 64 cores, you might start 4 or 8 processes, each of which will try to use a bunch of cores. You could limit each of those processes to a subset of the cores. It depends on the workload, meaning the timing of specific calls, as to which is most efficient. You would experiment with this balance if you have time.
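As a rough sketch of that balance, the Python helper below divides the cores into equal, disjoint blocks and caps each process's OpenMP thread count to match. It assumes a Linux host, assumes the library was built with OpenMP and honors the standard `OMP_NUM_THREADS` environment variable, and assumes consecutive core IDs are a sensible grouping (worth checking against the machine's actual topology). The worker command is whatever you pass in; nothing here is an OpenFHE API.

```python
import os
import subprocess


def plan_layout(total_cores, num_procs):
    """Give each process an equal, disjoint block of core IDs and a
    matching OpenMP thread limit."""
    if num_procs < 1 or total_cores % num_procs != 0:
        raise ValueError("total_cores must divide evenly among num_procs")
    per_proc = total_cores // num_procs
    return [
        {
            "cpus": set(range(i * per_proc, (i + 1) * per_proc)),
            "env": {"OMP_NUM_THREADS": str(per_proc)},
        }
        for i in range(num_procs)
    ]


def launch(worker_argv, layout):
    """Start one worker per slot, pinned to its cores (Linux only)."""
    procs = []
    for slot in layout:
        env = dict(os.environ, **slot["env"])
        cpus = slot["cpus"]
        procs.append(subprocess.Popen(
            worker_argv,
            env=env,
            # Runs in the child before exec; binds it to this slot's cores.
            preexec_fn=lambda cpus=cpus: os.sched_setaffinity(0, cpus),
        ))
    return procs
```

For 64 cores, `plan_layout(64, 4)` gives four processes with 16 threads each and `plan_layout(64, 8)` gives eight with 8 threads each. Timing one round under each layout is a simple way to pick between them.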

I hope I’ve saved you some heartache. Good luck!
Drew

ngesbrian23 · February 7, 2025, 12:31am

thank you for the input

ngesbrian23 · February 7, 2025, 12:32am

thank you for your input.
