Question about the Cooley-Tukey operation in NTT

Yusaku_suzuki · January 27, 2025, 5:29am

I have a question about the ForwardTransformToBitReverseInPlace method in the NumberTheoreticTransformNat class.
Source name is transformnat-impl.h.
Function Name is below.
template <typename VecType> void NumberTheoreticTransformNat<VecType>::ForwardTransformToBitReverseInPlace(const VecType& rootOfUnityTable, const VecType& preconRootOfUnityTable, VecType* element);

Does the Cooley-Tukey (CT) butterfly operation follow the structure shown in the diagram “Figure 1”?

The legend is “Figure 2”.

This structure differs from the general Cooley-Tukey butterfly operation shown “Figure 3”

Could you explain the reason for this difference?

Caesar · January 27, 2025, 3:29pm

Figure (1) is not complete. First, we do not skip points as we move from one stage to the next. Also, the twiddle factors for each stage are not shown.

Anyway, forward NTT in OpenFHE employs the standard Cooley-Tukey dataflow with a slight modification of the twiddles order.
The NTT implementation in OpenFHE follows the NTT algorithms in this work.

At a high level, for NTT, in stage 0 \leq s \lt \log_{2}{N}, you divide the points into 2^{s+1} groups of points. The butterfly operation is then applied to these groups in an interleaved manner, processing pairs of groups sequentially. My description might not be very clear, but it should be straightforward to figure out the pattern if you trace ForwardTransformToBitReverseInPlace.

Yusaku_suzuki · January 28, 2025, 1:48am

Thank you for response.

I have traced the code.
I included the traced code (Figure 4) and its output (Figure 5) in the attached images.

The trace code is a butterfly operation part.
The indices of the elements used in the operation (loIdx, hiIdx) and the indices of omega (omegaIdx) were output according to the base code.

I would like to confirm the case of Step m=2, t=2, logt=1 specifically Step i=1.
In this case, the butterfly operation is applied to ‘elements 2–5’.
However, I understand that in the standard Cooley-Tukey FFT, this operation would involve elements 4–7.
Additionally, in Step m=2, t=2, logt=1 no operation is performed for elements 6 and 7.

Could you give me your opinion on the above?

Caesar · January 28, 2025, 5:13pm

This does not seem correct to me. You might have implemented it incorrectly. Instead of pulling the code out of OpenFHE and re-implementing it, why don’t you trace it inside OpenFHE itself? This way, you can rule out the possibility of incorrect implementation. Your code differs significantly from OpenFHE’s implementation.

The code you provided must have an issue since all elements must be touched in every stage in NTT.

Yusaku_suzuki · January 29, 2025, 2:30am

Thank you for your response.

Upon rechecking in OpenFHE, I identified the cause of my misunderstanding. I had incorrectly interpreted the behavior of GetMSB.
For example, I assumed GetMSB(4) would be equivalent to log2(4) = 2, counting from 0. However, it actually counts from 1, making GetMSB(4) = 3.

As a result, the Cooley-Tukey FFT was being executed correctly.

I will close this Q&A. Thank you very much for your support.

Topic		Replies	Views
Differences between rootOfUnityTable and preconRootOfUnityTable Library Questions	3	222	June 12, 2023
How to use custom NTT/CRT library Library Questions	4	492	February 27, 2023
NTT operations in CKKS Library Questions	4	267	March 30, 2024
Is adapting a NTT done in a GPU to OpenFHE difficult? Library Questions	1	159	March 12, 2024
IFFT and Bit reversal Library Questions	6	135	December 19, 2023

Question about the Cooley-Tukey operation in NTT

Related topics