Hi there,
I recall reading somewhere (unfortunately, I cannot find the source) that in CKKS the difference between the parameters `firstModSize` and `scalingModSize` corresponds to the number of bits used to represent the integer part of a value. Thus, with the default settings of 60 and 59 respectively, the scheme should only be able to handle values in the range [-1, +1]; conversely, if the parameters are set to 60 and 56, it should support values in the range [-8, +8]. First of all, could you confirm that this is correct?
In my application, which involves deep neural network training, it is inevitable that some values (more specifically, the gradients) will exceed this range. What I do not understand is how the library is expected to behave when values fall outside this range, and what the potential side effects are.
I am asking because I have been running experiments with the parameters set to 60 and 59, and I observed that some gradient values exceed the [-1, +1] range, extending up to [-4, +4]. I was expecting the library to raise an exception, or at least to show a severe drop in precision, but neither seems to be the case.
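For concreteness, this is roughly how I configure the context (a minimal OpenFHE sketch; the multiplicative depth and the gradient values are placeholders, not my actual training setup):

```cpp
#include "openfhe.h"

#include <iostream>
#include <vector>

using namespace lbcrypto;

int main() {
    CCParams<CryptoContextCKKSRNS> parameters;
    parameters.SetMultiplicativeDepth(10);  // placeholder depth
    parameters.SetFirstModSize(60);
    parameters.SetScalingModSize(59);       // 60 - 59 = 1 bit for the integer part

    CryptoContext<DCRTPoly> cc = GenCryptoContext(parameters);
    cc->Enable(PKE);
    cc->Enable(KEYSWITCH);
    cc->Enable(LEVELEDSHE);

    auto keys = cc->KeyGen();

    // Some gradients fall outside [-1, +1], up to roughly [-4, +4].
    std::vector<double> grads = {0.02, -0.15, 3.7, -2.9};
    Plaintext pt = cc->MakeCKKSPackedPlaintext(grads);
    auto ct      = cc->Encrypt(keys.publicKey, pt);

    Plaintext dec;
    cc->Decrypt(keys.secretKey, ct, &dec);
    dec->SetLength(grads.size());
    std::cout << dec << std::endl;  // no exception, and the values look accurate
    return 0;
}
```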
Thanks for your assistance.
There are a few considerations:
- If no CKKS bootstrapping is used, then you can easily support larger values by reserving an extra level of multiplicative depth (all 59 bits will then be used for the integer part); see the sketch after this list.
- If CKKS bootstrapping is used, then you need to normalize the bootstrapping input to values close to 1 (bootstrapping requires this). You can still support larger values outside of bootstrapping.
- The limits are not precise: the inverse FFT is applied to the input vector, so the effective bound is averaged over all slots, and individual values may be higher than 1.
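A minimal sketch of the first option, assuming the OpenFHE `CCParams` API (`computationDepth` is a placeholder for whatever depth your circuit actually needs):

```cpp
#include "openfhe.h"

using namespace lbcrypto;

CCParams<CryptoContextCKKSRNS> MakeParams(uint32_t computationDepth) {
    CCParams<CryptoContextCKKSRNS> parameters;
    // Reserve one level beyond what the circuit consumes; the spare 59-bit
    // scaling modulus can then absorb the integer part of larger values.
    parameters.SetMultiplicativeDepth(computationDepth + 1);
    parameters.SetFirstModSize(60);
    parameters.SetScalingModSize(59);
    return parameters;
}
```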
Hi Yuriy,
Thanks for your feedback; it is right on point.
I have a few remaining questions.
- If CKKS bootstrapping is used, then you need to normalize the bootstrapping input to values close to 1 (bootstrapping requires this). You can still support larger values outside of bootstrapping.
I am utilizing bootstrapping, but it is applied to the weights rather than the gradients. The weights are indeed within the range [-1, +1]. Therefore, I should be able to both apply bootstrapping to the weights and support larger values for the gradients.
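For reference, this is roughly what my bootstrapping path looks like (a sketch assuming OpenFHE's bootstrapping API; `numSlots` and the level budget are placeholders):

```cpp
#include "openfhe.h"

#include <vector>

using namespace lbcrypto;

// cc and keys are the context and key pair set up earlier.
void RefreshWeights(CryptoContext<DCRTPoly> cc, const KeyPair<DCRTPoly>& keys,
                    Ciphertext<DCRTPoly>& ctWeights, uint32_t numSlots) {
    cc->Enable(ADVANCEDSHE);
    cc->Enable(FHE);

    std::vector<uint32_t> levelBudget = {4, 4};  // placeholder budget
    cc->EvalBootstrapSetup(levelBudget, {0, 0}, numSlots);
    cc->EvalBootstrapKeyGen(keys.secretKey, numSlots);

    // Only the weights (kept in [-1, +1]) are refreshed; the gradient
    // ciphertexts are never passed to EvalBootstrap.
    ctWeights = cc->EvalBootstrap(ctWeights);
}
```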
- The limits are not precise: the inverse FFT is applied to the input vector, so the effective bound is averaged over all slots, and individual values may be higher than 1.
Interesting. This clarifies why I am not observing a significant precision drop. Indeed, even if some values exceed the [-1, +1] range, the average (absolute) value of the gradients remains close to 0. Thus, could you confirm that the 60/59 setting is appropriate for my use case?
I do not know your concrete use case very well, but if you get correct/accurate results (to reasonable precision), i.e., they match your reference implementation in the clear, then this is a good sign. Note that I do suggest checking the robustness of the FHE results against some edge cases as well; it is always a good idea to do this.
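For illustration, such a check could look like the sketch below; `referenceStep` and `encryptedStep` are hypothetical stand-ins for your cleartext and FHE pipelines, and the tolerance is a placeholder:

```cpp
#include "openfhe.h"

#include <cmath>
#include <iostream>
#include <vector>

using namespace lbcrypto;

// Hypothetical helpers standing in for your actual training step.
std::vector<double> referenceStep(const std::vector<double>& grads);
Ciphertext<DCRTPoly> encryptedStep(CryptoContext<DCRTPoly> cc,
                                   Ciphertext<DCRTPoly> ctGrads);

void CheckEdgeCase(CryptoContext<DCRTPoly> cc, const KeyPair<DCRTPoly>& keys) {
    // Extreme-but-plausible gradients: the largest magnitudes seen in training.
    std::vector<double> edgeCase = {4.0, -4.0, 3.9, -3.9};

    std::vector<double> expected = referenceStep(edgeCase);

    Plaintext pt = cc->MakeCKKSPackedPlaintext(edgeCase);
    auto ctOut   = encryptedStep(cc, cc->Encrypt(keys.publicKey, pt));

    Plaintext dec;
    cc->Decrypt(keys.secretKey, ctOut, &dec);
    dec->SetLength(edgeCase.size());
    std::vector<double> actual = dec->GetRealPackedValue();

    for (size_t i = 0; i < expected.size(); ++i)
        if (std::fabs(expected[i] - actual[i]) > 1e-3)  // placeholder tolerance
            std::cerr << "precision loss in slot " << i << std::endl;
}
```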
Thanks for the tip.
Could you elaborate on what such edge cases would be?
Are you suggesting testing scenarios where the weights fall outside the bootstrapping range of [−1,+1] and/or where the gradients exceed a mean absolute value of 1?
I am referring to practically possible edge cases for your application, i.e., the highest average values you may run into.