Polynomial Multiplication & other high-level operations on hardware backend

Hi @homer_g.
The current HAL is insufficient for any PCI-based accelerator, and it also stops at the lowest math and transform level. It needs to be augmented up to the lattice layer in order to really accelerate commercial-level FHE workloads. We plan to do this under our DARPA program.
You are actually in the vanguard of some of this, being the first one to investigate what is and isn’t there in the HAL for a PCI-interfaced accelerator.

I think we need some definitions here.

What do you call a poly?
And tell me exactly what your poly_mult(a,b) would do, and how you see it being used in FHE.

Because in OpenFHE our polys are basically 1) all in RNS form, where they are called DCRTPoly, and 2) components of other things such as ciphertexts and keys.
You rarely just multiply two DCRTPolys, which is why there isn’t an obvious standalone call. @Caesar points out Times, but that is an elementwise Hadamard multiply of all the towers in a DCRTPoly, not a “multiplication of two polynomials”, which requires a convolution of their coefficients. The code that @Caesar pointed out is what is used for that, since direct convolution is O(N^2) vs O(N log N) and N is LARGE.
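
To make the distinction concrete, here is a toy sketch of a single RNS tower. It is not OpenFHE code: the names (negacyclic_schoolbook, to_eval, to_coeff, hadamard) and the parameters N = 4, q = 17, psi = 9 are all made up for illustration, and I am assuming the usual ring Z_q[x]/(x^N + 1). It shows that an elementwise multiply of coefficient vectors is not a polynomial product, while the same elementwise multiply applied to evaluation-form vectors is. The transforms here are deliberately naive O(N^2) evaluations; a real NTT computes the same values in O(N log N).

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Vec = std::vector<uint64_t>;

// Toy parameters (assumptions for illustration only).
constexpr uint64_t N   = 4;   // ring dimension, ring is Z_Q[x]/(x^N + 1)
constexpr uint64_t Q   = 17;  // prime modulus with Q = 1 (mod 2N)
constexpr uint64_t PSI = 9;   // primitive 2N-th root of unity mod Q (9^4 = -1 mod 17)

// Modular exponentiation by repeated squaring.
uint64_t powmod(uint64_t base, uint64_t exp) {
    uint64_t result = 1;
    base %= Q;
    while (exp > 0) {
        if (exp & 1) result = result * base % Q;
        base = base * base % Q;
        exp >>= 1;
    }
    return result;
}

// Direct O(N^2) negacyclic convolution: terms that wrap past x^N pick up a minus sign.
Vec negacyclic_schoolbook(const Vec& a, const Vec& b) {
    Vec c(N, 0);
    for (std::size_t i = 0; i < N; ++i) {
        for (std::size_t j = 0; j < N; ++j) {
            uint64_t prod = a[i] * b[j] % Q;
            std::size_t k = i + j;
            if (k < N) c[k] = (c[k] + prod) % Q;
            else       c[k - N] = (c[k - N] + Q - prod) % Q;
        }
    }
    return c;
}

// Naive forward transform: evaluate the polynomial at the odd powers psi^(2k+1).
Vec to_eval(const Vec& a) {
    Vec A(N, 0);
    for (uint64_t k = 0; k < N; ++k)
        for (uint64_t i = 0; i < N; ++i)
            A[k] = (A[k] + a[i] * powmod(PSI, (2 * k + 1) * i)) % Q;
    return A;
}

// Naive inverse transform: interpolate back to coefficients.
Vec to_coeff(const Vec& A) {
    const uint64_t n_inv   = powmod(N, Q - 2);    // N^{-1} mod Q (Fermat)
    const uint64_t psi_inv = powmod(PSI, Q - 2);  // psi^{-1} mod Q
    Vec a(N, 0);
    for (uint64_t j = 0; j < N; ++j) {
        uint64_t sum = 0;
        for (uint64_t k = 0; k < N; ++k)
            sum = (sum + A[k] * powmod(psi_inv, (2 * k + 1) * j)) % Q;
        a[j] = sum * n_inv % Q;
    }
    return a;
}

// Elementwise (Hadamard) product of two vectors mod Q.
Vec hadamard(const Vec& x, const Vec& y) {
    Vec z(N);
    for (std::size_t i = 0; i < N; ++i) z[i] = x[i] * y[i] % Q;
    return z;
}

// Polynomial product computed as: transform, elementwise multiply, inverse transform.
Vec negacyclic_via_eval(const Vec& a, const Vec& b) {
    return to_coeff(hadamard(to_eval(a), to_eval(b)));
}
```

With a = {1, 2, 3, 4} and b = {5, 6, 7, 8}, negacyclic_schoolbook(a, b) and negacyclic_via_eval(a, b) agree, while hadamard(a, b) applied directly to the coefficients gives a different vector. In a DCRTPoly the same thing would be repeated per tower, each with its own modulus.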

I would suggest that you provide us with a list of your expected acceleration primitives, and then we may be able to suggest the corresponding points in the OpenFHE code. Basically, you have a small number of primitives and there is a huge number of OpenFHE functions, so wouldn’t that make sense?