Hi
I compiled the library with and without OpenMP support, and the benchmark results show that several kernels run much slower in the build with OpenMP enabled than in the build without it.
Build with OpenMP disabled (presumably running single-threaded)
Compilation Command:
cmake -DWITH_OPENMP=OFF ..
make -j
./bin/
The profiled latency from different kernels is shown below:
--------------------------------------------------------------------
Benchmark                           Time           CPU    Iterations
--------------------------------------------------------------------
CKKSrns_KeyGen                   2266 us       2266 us           309
CKKSrns_MultKeyGen               4387 us       4386 us           161
CKKSrns_EvalAtIndexKeyGen        4746 us       4745 us           147
CKKSrns_Encryption               1622 us       1622 us           433
CKKSrns_Decryption                160 us        160 us          4384
CKKSrns_Add                      31.5 us       31.5 us         22255
CKKSrns_AddInPlace               23.4 us       23.4 us         29659
CKKSrns_MultNoRelin               289 us        288 us          2425
CKKSrns_MultRelin                2770 us       2769 us           253
CKKSrns_Relin                    2531 us       2530 us           276
CKKSrns_RelinInPlace             2623 us       2621 us           267
CKKSrns_Rescale                   449 us        449 us          1555
CKKSrns_RescaleInPlace            439 us        439 us          1590
CKKSrns_EvalAtIndex              2456 us       2456 us           282
Build with OpenMP enabled (the default configuration; presumably running multi-threaded)
Compilation Command:
cmake ..
make -j
./bin/
The profiled latency from different kernels is shown below:
--------------------------------------------------------------------
Benchmark                           Time           CPU    Iterations
--------------------------------------------------------------------
CKKSrns_KeyGen                   2561 us       2561 us           277
CKKSrns_MultKeyGen               4580 us       4580 us           153
CKKSrns_EvalAtIndexKeyGen        4880 us       4880 us           143
CKKSrns_Encryption               1623 us       1623 us           431
CKKSrns_Decryption                149 us        149 us          4656
CKKSrns_Add                      24.4 us       24.4 us         28618
CKKSrns_AddInPlace               18.7 us       18.7 us         37505
CKKSrns_MultNoRelin               185 us        185 us          3812
CKKSrns_MultRelin               57316 us      55202 us            12
CKKSrns_Relin                   57121 us      54685 us            10
CKKSrns_RelinInPlace            57108 us      55116 us            12
CKKSrns_Rescale                   453 us        453 us          1547
CKKSrns_RescaleInPlace            447 us        447 us          1564
CKKSrns_EvalAtIndex             56590 us      54281 us            13
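To quantify the gap between the two runs, the per-kernel slowdown can be computed from the Time columns above. The following is a small sketch in Python; the numbers are copied from the two tables, while the dictionary names, the `slowdown` helper, and the 2x threshold used to flag a regression are illustrative choices, not part of the library or its benchmark tool.

```python
# Time column (microseconds) copied from the two benchmark tables above.
time_without_openmp = {
    "CKKSrns_KeyGen": 2266, "CKKSrns_MultKeyGen": 4387,
    "CKKSrns_EvalAtIndexKeyGen": 4746, "CKKSrns_Encryption": 1622,
    "CKKSrns_Decryption": 160, "CKKSrns_Add": 31.5,
    "CKKSrns_AddInPlace": 23.4, "CKKSrns_MultNoRelin": 289,
    "CKKSrns_MultRelin": 2770, "CKKSrns_Relin": 2531,
    "CKKSrns_RelinInPlace": 2623, "CKKSrns_Rescale": 449,
    "CKKSrns_RescaleInPlace": 439, "CKKSrns_EvalAtIndex": 2456,
}
time_with_openmp = {
    "CKKSrns_KeyGen": 2561, "CKKSrns_MultKeyGen": 4580,
    "CKKSrns_EvalAtIndexKeyGen": 4880, "CKKSrns_Encryption": 1623,
    "CKKSrns_Decryption": 149, "CKKSrns_Add": 24.4,
    "CKKSrns_AddInPlace": 18.7, "CKKSrns_MultNoRelin": 185,
    "CKKSrns_MultRelin": 57316, "CKKSrns_Relin": 57121,
    "CKKSrns_RelinInPlace": 57108, "CKKSrns_Rescale": 453,
    "CKKSrns_RescaleInPlace": 447, "CKKSrns_EvalAtIndex": 56590,
}


def slowdown(name):
    """Ratio of OpenMP time to non-OpenMP time; values above 1 mean slower."""
    return time_with_openmp[name] / time_without_openmp[name]


# Kernels that are more than 2x slower with OpenMP (threshold is arbitrary).
regressions = [name for name in time_without_openmp if slowdown(name) > 2.0]

for name in time_without_openmp:
    flag = "  <-- regression" if name in regressions else ""
    print(f"{name:<28}{slowdown(name):6.2f}x{flag}")
```

Only the four kernels that involve relinearization or rotation are flagged; the remaining kernels are either roughly unchanged or somewhat faster with OpenMP.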
Is it expected that multi-threaded performance is so much worse for MultRelin, Relin, RelinInPlace, and EvalAtIndex (roughly 20x slower than in the build without OpenMP)?
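One way to narrow this down would be to rerun the OpenMP build restricted to a single thread and compare against the numbers above. Below is a minimal sketch, assuming the OpenMP runtime used by the library honors the standard `OMP_NUM_THREADS` environment variable. The function names are hypothetical, and `benchmark_path` is a placeholder for the benchmark binary, whose full path is not given above.

```python
import os
import subprocess


def single_thread_env(base_env=None):
    """Return a copy of the environment with OpenMP limited to one thread."""
    env = dict(os.environ if base_env is None else base_env)
    env["OMP_NUM_THREADS"] = "1"  # standard OpenMP environment variable
    return env


def run_benchmark_single_threaded(benchmark_path):
    """Run the benchmark binary (placeholder path) with one OpenMP thread."""
    return subprocess.run([benchmark_path], env=single_thread_env(), check=True)
```

If the four slow kernels return to roughly their non-OpenMP timings under this setting, that would point to threading overhead rather than to the OpenMP build itself.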
Best
Jianming