Hey folks,

I am new to FHE and want to use it for my deep learning project.

I have just started learning TFHE and CKKS, and I have a question. **How do I do matrix multiplication in both schemes on a GPU?** After exploring the CKKS side, I understand that the data is encoded as a polynomial and that we have to use diagonal encoding to perform efficient matrix multiplication.
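To make my current understanding of the diagonal method concrete, here is a minimal plaintext sketch in pure Python. It is not CKKS code: nothing is encrypted, the list rotation only stands in for a homomorphic slot rotation, it covers the matrix-vector case rather than full matrix-matrix multiplication, and the function names (`rotate`, `diagonals`, `diag_matvec`) are my own, not from any library.

```python
def rotate(vec, k):
    """Cyclic left rotation by k slots (stand-in for a homomorphic rotation)."""
    k %= len(vec)
    return vec[k:] + vec[:k]


def diagonals(matrix):
    """Generalized diagonals of a square matrix: d_i[j] = matrix[j][(j + i) % n]."""
    n = len(matrix)
    return [[matrix[j][(j + i) % n] for j in range(n)] for i in range(n)]


def diag_matvec(matrix, vec):
    """Compute matrix @ vec using only slot-wise multiplies, adds, and rotations."""
    n = len(vec)
    acc = [0] * n
    for i, diag in enumerate(diagonals(matrix)):
        rotated = rotate(vec, i)  # slot j now holds vec[(j + i) % n]
        acc = [a + d * r for a, d, r in zip(acc, diag, rotated)]
    return acc


M = [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]
v = [1, 0, 2]
print(diag_matvec(M, v))  # [7, 16, 25]
```

The point of the layout, as far as I can tell, is that every step is a slot-wise operation or a rotation, which are the operations a packed ciphertext supports.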

I am not able to wrap my head around the TFHE side of things. According to my understanding, TFHE encodes each number by placing its data in the most significant bits (MSBs), adds noise to the least significant bits (LSBs), and encrypts each number independently while preserving the shape of the matrix. **In TFHE, can I use existing GEMM implementations for CPUs and GPUs, given that the numbers are encrypted independently and the shape of the data remains the same?** For example, say I have a 256x256 matrix and I want to multiply it by another matrix of size 256x512. If I use TFHE, then each number in my matrices will be encrypted separately. When I want to do the matrix multiplication on a GPU, can I reuse existing GEMM kernels with their efficient tiling algorithms? If not, what am I missing?
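Here is a toy sketch of the encoding as I currently picture it, written as a plain LWE-style encryption in pure Python. Everything in it is an assumption for illustration: the parameters (`Q_BITS`, `MSG_BITS`, `N`, the noise range) are tiny and completely insecure, the function names are my own, and it is not the API of any TFHE library.

```python
import random

Q_BITS = 32                            # ciphertext modulus is 2**32 (toy choice)
MSG_BITS = 4                           # messages live in the top 4 bits
Q = 1 << Q_BITS
DELTA = 1 << (Q_BITS - MSG_BITS)       # scaling factor that shifts the message into the MSBs
N = 16                                 # toy secret-key length, far too small to be secure


def keygen(rng):
    """Binary secret key."""
    return [rng.randrange(2) for _ in range(N)]


def encrypt(m, key, rng):
    """Return (a, b) with b = <a, key> + m * DELTA + small noise (mod Q)."""
    a = [rng.randrange(Q) for _ in range(N)]
    noise = rng.randint(-8, 8)         # noise sits in the low bits
    b = (sum(x * s for x, s in zip(a, key)) + m * DELTA + noise) % Q
    return a, b


def decrypt(ct, key):
    """Remove the mask, then round away the noise in the low bits."""
    a, b = ct
    phase = (b - sum(x * s for x, s in zip(a, key))) % Q
    return ((phase + DELTA // 2) // DELTA) % (1 << MSG_BITS)


def add(ct1, ct2):
    """Homomorphic addition: component-wise sum of the two ciphertexts."""
    a = [(x + y) % Q for x, y in zip(ct1[0], ct2[0])]
    return a, (ct1[1] + ct2[1]) % Q


rng = random.Random(0)
key = keygen(rng)
c3, c4 = encrypt(3, key, rng), encrypt(4, key, rng)
print(decrypt(add(c3, c4), key))       # 7
```

In this sketch each encrypted number is a pair `(a, b)` holding `N + 1` integers rather than a single scalar, and only addition of ciphertexts is shown.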

I am currently trying to compare the two schemes with respect to their performance on accelerators. With my current understanding, TFHE should be the better choice because we can leverage existing algorithms, whereas for CKKS we have to design new ones. **Then why is CKKS considered better for deep learning from a performance point of view?**

Thanks!