I am doing some performance profiling of OpenFHE, and I noticed that, in the existing lib-benchmark.cpp, changing the scaling technique from FIXEDMANUAL to FLEXIBLEAUTOEXT (which is the default choice) nearly triples the runtime of EvalAdd. The change is:
diff --git a/benchmark/src/lib-benchmark.cpp b/benchmark/src/lib-benchmark.cpp
index b92dda15..a49ea70b 100644
--- a/benchmark/src/lib-benchmark.cpp
+++ b/benchmark/src/lib-benchmark.cpp
@@ -77,7 +77,7 @@ using namespace lbcrypto;
     CCParams<CryptoContextCKKSRNS> parameters;
     parameters.SetScalingModSize(48);
     parameters.SetBatchSize(8);
-    parameters.SetScalingTechnique(FIXEDMANUAL);
+    parameters.SetScalingTechnique(FLEXIBLEAUTOEXT);
     parameters.SetMultiplicativeDepth(mdepth);
     auto cc = GenCryptoContext(parameters);
     cc->Enable(PKE);
With this change, the benchmark reports (columns are time, CPU time, and iterations):
CKKSrns_Add 82.5 us 82.4 us 7973
Without this change:
CKKSrns_Add 31.3 us 31.3 us 21808
Is there an easy way to understand why the scaling method affects the performance here? I am bringing up EvalAdd because it is the simplest example of a nontrivial performance change I am seeing when using the default parameters versus the ones configured in the benchmark file.
For reference, I am building with
cmake -DCMAKE_BUILD_TYPE=Release -DWITH_NTL=OFF -DWITH_TCM=OFF -DMATHBACKEND=4 -DWITH_NATIVEOPT=OFF -DNATIVE_SIZE=64 -DBUILD_BENCHMARKS=ON -DBUILD_EXAMPLES=OFF -DBUILD_UNITTESTS=OFF ..
And my platform details are:
-- The C compiler identification is Clang 18.1.8
-- The CXX compiler identification is Clang 18.1.8
-- Architecture is x86_64
-- Found OpenMP_C: -fopenmp=libomp (found version "5.1")
-- Found OpenMP_CXX: -fopenmp=libomp (found version "5.1")
(etc., ask if you need more info here)
I tried the same thing with Clang 19.1.7, just to see whether the compiler version was the issue, but the timings are roughly the same.