How to use a custom NTT/CRT library

Can anyone tell me how to use my own NTT/CRT routines?
I have looked at the following directories, but it is hard to understand how they work.

  • src/include/math/hal/bigintntl
  • src/include/lattice/hal/default
  • src/lib/lattice/hal/default

I also tried to figure out how Intel HEXL-FPGA offloading works, but it is still not clear to me.

Any tips or comments would be appreciated.
Thanks in advance.

Could you provide more details on your custom NTT/CRT implementation?
Which hardware platform is it targeting? Which programming language is it written in?

OpenFHE includes integration of Intel HEXL, an open-source library that takes advantage of optimizations in the latest Intel® AVX-512 instructions targeting specific Intel CPUs. OpenFHE does not yet include integration with Intel HEXL-FPGA.

Bump to follow up on this

I am not sure this is the right way, but there is a simple workaround for plugging in NTT and INTT for a specific data type such as intnat.
This is not what I am looking for, though; I believe there should be a systematic backend plug-in mechanism like the one for Intel HEXL. If anyone can give any tips on how to do this, I would be grateful.

  1. Go to the “src/core/include/math/hal/intnat” directory.
  2. Look at ‘transformnat-impl.h’, where a number of forward and inverse NTTs are provided; only a few of them are actually called while a program runs. Two examples are shown below.

void NumberTheoreticTransformNat::
ForwardTransformToBitReverseInPlace(const VecType& rootOfUnityTable,
                                    const VecType& preconRootOfUnityTable,
                                    VecType* element) { .... }

void NumberTheoreticTransformNat::
const VecType& rootOfUnityInverseTable,
const VecType& preconRootOfUnityInverseTable,
const IntType& cycloOrderInv,
const IntType& preconCycloOrderInv,
VecType* element) { … }

  3. Now you can add your own NTT and INTT. Make sure to match how these member methods read their input and write their results; they implement a negative wrapped (negacyclic) convolution.
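To make step 3 more concrete, here is a minimal, self-contained sketch of a negacyclic NTT/INTT pair of the kind one might drop in. It is not OpenFHE's code: the names MyForwardNTT, MyInverseNTT and BitReversedPowers are hypothetical, it uses plain std::vector<uint64_t> instead of VecType, it assumes the root tables hold powers of a primitive 2n-th root of unity psi in bit-reversed order, and it uses naive % reduction (no preconditioned tables), so it only works for moduli below 2^32. The forward transform takes natural-order input and leaves the result in bit-reversed order; the inverse does the opposite, so no separate bit-reversal pass is needed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using u64 = std::uint64_t;

// Naive modular helpers; valid only for q < 2^32 so products fit in 64 bits.
static u64 MulMod(u64 a, u64 b, u64 q) { return (a * b) % q; }

static u64 PowMod(u64 base, u64 exp, u64 q) {
    u64 result = 1 % q;
    base %= q;
    while (exp > 0) {
        if (exp & 1) result = MulMod(result, base, q);
        base = MulMod(base, base, q);
        exp >>= 1;
    }
    return result;
}

static std::uint32_t ReverseBits(std::uint32_t x, std::uint32_t bits) {
    std::uint32_t r = 0;
    for (std::uint32_t i = 0; i < bits; ++i) { r = (r << 1) | (x & 1); x >>= 1; }
    return r;
}

// table[i] = root^bitreverse(i); pass psi for the forward table, psi^-1 for the inverse.
std::vector<u64> BitReversedPowers(u64 root, std::size_t n, u64 q) {
    std::uint32_t logn = 0;
    while ((std::size_t{1} << logn) < n) ++logn;
    std::vector<u64> table(n);
    for (std::size_t i = 0; i < n; ++i)
        table[i] = PowMod(root, ReverseBits(static_cast<std::uint32_t>(i), logn), q);
    return table;
}

// Forward negacyclic NTT (Cooley-Tukey butterflies), in place.
// Input in natural order, output in bit-reversed order.
void MyForwardNTT(const std::vector<u64>& psiRev, u64 q, std::vector<u64>* a) {
    const std::size_t n = a->size();
    assert(n != 0 && (n & (n - 1)) == 0);  // n must be a power of two
    std::size_t t = n;
    for (std::size_t m = 1; m < n; m <<= 1) {
        t >>= 1;
        for (std::size_t i = 0; i < m; ++i) {
            const u64 s = psiRev[m + i];
            for (std::size_t j = 2 * i * t; j < 2 * i * t + t; ++j) {
                const u64 u = (*a)[j];
                const u64 v = MulMod((*a)[j + t], s, q);
                (*a)[j]     = (u + v) % q;
                (*a)[j + t] = (u + q - v) % q;
            }
        }
    }
}

// Inverse negacyclic NTT (Gentleman-Sande butterflies), in place.
// Input in bit-reversed order, output in natural order, scaled by n^-1.
void MyInverseNTT(const std::vector<u64>& psiInvRev, u64 nInv, u64 q, std::vector<u64>* a) {
    const std::size_t n = a->size();
    assert(n != 0 && (n & (n - 1)) == 0);
    std::size_t t = 1;
    for (std::size_t m = n; m > 1; m >>= 1) {
        const std::size_t h = m >> 1;
        std::size_t j1 = 0;
        for (std::size_t i = 0; i < h; ++i) {
            const u64 s = psiInvRev[h + i];
            for (std::size_t j = j1; j < j1 + t; ++j) {
                const u64 u = (*a)[j];
                const u64 v = (*a)[j + t];
                (*a)[j]     = (u + v) % q;
                (*a)[j + t] = MulMod((u + q - v) % q, s, q);
            }
            j1 += 2 * t;
        }
        t <<= 1;
    }
    for (auto& x : *a) x = MulMod(x, nInv, q);
}
```

For a quick check one can take q = 17, n = 4 and psi = 2 (since 2^4 = -1 mod 17): transform two vectors, multiply them pointwise, transform back, and compare with a schoolbook product reduced modulo X^4 + 1. A real replacement would additionally have to accept the preconditioned tables and the cycloOrderInv arguments that the OpenFHE methods receive.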

By the way, I’m interested in accelerating NTT/INTT/CRT/ICRT/ModMul operations using FPGA through PCIe.

I’m also looking forward to hearing more from those who have better ideas. Thanks.

FYI, we have plans to expand the HAL to PCI-based solutions, where we include operations in the lattice layer. Clearly, one has to manage memory differently than in the AVX512 solution.
Does the HEXL-HAL give any insight for NTT?

Note that in OpenFHE, NTT and iNTT do not use bit reversal, for efficiency reasons. The only operation that needs it is automorphism, and there we fold the bit reversal into the re-indexing operation.
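As an illustration of what folding the bit reversal into the re-indexing can look like, here is a small sketch. It is an assumption-laden example, not OpenFHE's implementation: AutomorphismIndexMap and ApplyAutomorphism are hypothetical names, and it assumes the evaluation-domain vector stores the value at psi^(2*bitreverse(i)+1) in position i, which is what a forward transform that leaves its output in bit-reversed order produces. Under that assumption, the automorphism X -> X^k (k odd) becomes a single permutation that can be precomputed once per k.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

static std::uint32_t ReverseBits(std::uint32_t x, std::uint32_t bits) {
    std::uint32_t r = 0;
    for (std::uint32_t i = 0; i < bits; ++i) { r = (r << 1) | (x & 1); x >>= 1; }
    return r;
}

// Builds a map such that out[i] = in[map[i]] applies X -> X^k to a vector
// stored in bit-reversed evaluation order. Both bit reversals are baked into
// the table, so applying the automorphism is a single gather.
std::vector<std::uint32_t> AutomorphismIndexMap(std::uint32_t n, std::uint32_t k) {
    assert(n != 0 && (n & (n - 1)) == 0);  // ring dimension must be a power of two
    assert(k % 2 == 1);                    // k must be odd (a unit modulo 2n)
    std::uint32_t logn = 0;
    while ((1u << logn) < n) ++logn;
    const std::uint64_t m = 2ull * n;      // cyclotomic order
    std::vector<std::uint32_t> map(n);
    for (std::uint32_t j = 0; j < n; ++j) {
        const std::uint64_t odd = 2ull * j + 1;  // exponent of psi for slot j
        const std::uint32_t idx = static_cast<std::uint32_t>(((odd * k) % m) >> 1);
        map[ReverseBits(j, logn)] = ReverseBits(idx, logn);
    }
    return map;
}

// Applies a precomputed map: one pass, no separate bit-reversal step.
std::vector<std::uint64_t> ApplyAutomorphism(const std::vector<std::uint64_t>& in,
                                             const std::vector<std::uint32_t>& map) {
    std::vector<std::uint64_t> out(in.size());
    for (std::size_t i = 0; i < in.size(); ++i) out[i] = in[map[i]];
    return out;
}
```

For n = 4 and k = 3 the map works out to {2, 3, 0, 1}, and k = 1 gives the identity. An accelerator backend could keep such tables on the device and apply them as a gather.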