Hello,
I’m wondering if Openfhe has the capability of efficiently taking one batched CKKS ciphertext ct and create multiple ciphertexts from the slots contained in ct.
As explained in this previous post Slot extraction in BFV or CKKS, one might just multiply by the corresponding masks and then apply a rotation. Unfortunately, I have tried this approach and it is painfully slow. For example, if I want to go from a ciphertext of just 2^9 slots with depth 20 it takes around 0.8 seconds for each EvalMult + Rotation on a normal laptop. Is this the best approach or am I missing something?
The opposite question also applies: if we want to go from multiple ciphertext to a single ciphertext, do we have to use masks and rotations or is there a generalization of EvalMerge that considers more than the first slot of the original ciphertexts?
Thanks in advance!
I suggest reading section II.F (page 8) of Optimized Homomorphic Encryption Solution for Secure Genome-Wide Association Studies It describes three different ways of converting one batched ciphertext into multiple ones (one for each slot).
For the inverse case, EvalMerge
still uses masking and rotations (see https://github.com/openfheorg/openfhe-development/blob/v1.1.0/src/pke/lib/schemebase/base-advancedshe.cpp#L412). So all common approaches in CKKS are based on this.
There is also an option to convert a single ciphertext into multiple LWE ciphertexts (in scheme switching), but I don’t think you are asking about this case.
Thank you for your answer, Yuriy.
I have studied the 3 approaches suggested and I believe they won’t be an inprovement over the simple mask + rotation for my case. That is, assume that you start with a vector y = y_1, \dots, y_N and N is a power of 2. What we want is the encryption of y_i for 1 \leq i \leq n without the need to clone each value y_i for each slot of the ciphertexts (y_i is encrypted only in the first slot). Then we just need N mask multiplications and N-1 rotations. Indeed, if you add the overhead of cloning the result to each slot then the methods in the paper perform better.
This leads me to the question of why do you want to clone each value to every slot of the new ciphertexts. Is this just to perform fast matrix products by means of SumRowVec and SumColVec? Is this fast matrix multiplication approach currently implemented in openfhe?
The motivation for cloning is explained in the first paragraph of Section II.6 of the paper. It makes operations with SNPs after the conversion faster (w/o rotations).
In your case you only need each value in the zero-th slot (for the purposes of your application), which is slightly different. BTW, what it the motivation of generating these sparse ciphertexts (with only 1 slot used out many thousands); typically such sparse packing is avoided because it is inefficient.
Thanks again for your reply, Yuriy.
In reality, what I’m doing is to start with a ciphertext that is not sparse, and then I split it in many ciphertexts using the mask + rotations idea (this is to avoid bandwidth cost). The new ciphertexts have more than the first slot used (they are sparse though, I just need less than a 100 slots). This is done because I need to work with the information that these new ciphertexts contain independently than the rest.
As you mention, working with these sparse packing is inefficient, but I didn’t find a way to reduce the slots after creating the new ciphertexts. I tried using the SetSlots method to much the size of the new ciphertexts, but I obtained wrong decryption outputs while testing. If you have any pointer towards a way to reduce the sparsity of the new ciphertexts, that would be also very valuable.
The common approach to improve efficiency in the scenario you described is to pack multiple sparse ciphertexts into one fully packed ciphertext, i.e., use a SIMD-like approach. For example, if you use 100 slots, you can reserve 128 slots for the first ciphertext. Then add the second sparse ciphertext between slots 129 and 256, and so on. This way you can do multiple operations using single ciphertexts, i.e., you case you can pack [ring dimension]/(2*128) sparse ciphertexts into one fully packed ciphertext.
SetSlots
is a “meta instruction” telling OpenFHE how to interpret the current ciphertext (it does not perform any underlying operations). The number of slots implies there is a vector of that size embedded in the subring element, and this vector is cloned to fill the full ring element.
Perfect, everything is clear now. Thank you for your help, Yuriy!