I am new to CKKS and logistic regression.
I want to do homomorphic logistic regression in OpenFHE. I’ve skimmed the openfhe-genomic-examples project and its corresponding thesis, ‘Secure large-scale genome-wide association studies using homomorphic encryption’, as well as the appendix. I’ve also looked briefly at KyoohyungHan/HELR: Homomorphic Logistic Regression on Encrypted Data.
I have several questions.
- Is the whole LRA in OpenFHE a one-round procedure? (Compared to HELR, which needs several iterations to update the weight vector and raise the accuracy.)
Furthermore, since Q_L is sufficiently large for all 16 evaluations, is it correct that we don’t need any bootstrapping in LRA?
- How can I use the given LRA with another dataset, such as the Default of Credit Card Clients Dataset (Kaggle)? Is it enough to change the relevant runtime input parameters?
- How can I get the accuracy result out of LRA?
- Any other high-level or detailed comparison between HELR and LRA would be helpful.
Many thanks for your help!
The openfhe-genomic-examples repo provides an implementation of an optimized solution that works well for genome-wide association studies (the plaintext version is described in "GWAS on your notebook: fast semi-parallel linear and logistic regression for genome-wide association studies", BMC Bioinformatics). This solution is orders of magnitude faster than the traditional approach of running logistic regression training separately for each SNP (tens of thousands of SNPs in this case). A nice feature of this solution is that, as you noticed, it does not require deep computations, so no bootstrapping is needed. However, it works only for special cases.
Your question refers to the more general case of logistic regression training, where many iterations are needed. The idea is to perform one iteration of logistic regression training, bootstrap the current estimate of the regression vector, and repeat this for as many iterations as needed to reach the desired accuracy.
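To make the loop structure concrete, here is a minimal plaintext sketch in Python. It is not OpenFHE code and not the implementation from any of the repositories mentioned above. The names `train`, `refresh`, `accuracy`, and `make_toy_data` are hypothetical, the toy dataset is synthetic, and `refresh` is only a placeholder marking where CKKS bootstrapping of the weight vector would happen in an encrypted version. The accuracy check assumes the final weights have been decrypted and are evaluated in the clear.

```python
import math
import random


def sigmoid(z):
    # Exact logistic function; an encrypted version would use a polynomial approximation.
    return 1.0 / (1.0 + math.exp(-z))


def refresh(weights):
    # Placeholder (assumption): marks where CKKS bootstrapping would refresh
    # the encrypted weight vector. In plaintext there is nothing to refresh.
    return list(weights)


def train(features, labels, iterations=100, learning_rate=0.5):
    # Plain batch gradient descent for logistic regression.
    n, d = len(features), len(features[0])
    weights = [0.0] * d
    for _ in range(iterations):
        grad = [0.0] * d
        for x, y in zip(features, labels):
            err = sigmoid(sum(w * xi for w, xi in zip(weights, x))) - y
            for j in range(d):
                grad[j] += err * x[j]
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad)]
        weights = refresh(weights)  # one refresh per training iteration
    return weights


def accuracy(weights, features, labels):
    # Fraction of samples whose thresholded prediction matches the label.
    correct = 0
    for x, y in zip(features, labels):
        prob = sigmoid(sum(w * xi for w, xi in zip(weights, x)))
        correct += int((1 if prob >= 0.5 else 0) == y)
    return correct / len(labels)


def make_toy_data(n=200, seed=0):
    # Synthetic, linearly separable data: label is 1 when x1 + x2 > 0.
    rng = random.Random(seed)
    features, labels = [], []
    for _ in range(n):
        x1, x2 = rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0)
        features.append([1.0, x1, x2])  # leading 1.0 is the bias term
        labels.append(1 if x1 + x2 > 0 else 0)
    return features, labels


if __name__ == "__main__":
    X, y = make_toy_data()
    w = train(X, y)
    print("weights:", w)
    print("training accuracy:", accuracy(w, X, y))
```

In an encrypted setting, the number of iterations that fit between two refreshes depends on the multiplicative depth budget, so refreshing after every iteration as above is only one possible schedule.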
OpenFHE provides all the HE primitives needed for logistic regression training, i.e., CKKS bootstrapping, `EvalLogistic` to evaluate the logistic function, and low-level vector operations for the linear operations. However, there is no full public example of logistic regression training at the moment (though we have one in a private repository). We plan to share one publicly in the near future. I will check on the timeline for releasing it, and we will update this thread.
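Since CKKS can only evaluate additions and multiplications, the logistic function has to be replaced by a polynomial over a chosen input interval. The plaintext Python sketch below shows one common way to build such a polynomial, Chebyshev interpolation. It only illustrates the general idea and is not a description of how `EvalLogistic` is implemented internally. The function names, the interval [-8, 8], and the degree 31 are assumptions chosen for this example.

```python
import math


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def chebyshev_coefficients(func, a, b, degree):
    # Coefficients of the Chebyshev interpolant of func on [a, b],
    # sampled at the degree + 1 Chebyshev nodes.
    n = degree + 1
    nodes = [math.cos(math.pi * (k + 0.5) / n) for k in range(n)]
    values = [func(0.5 * (b - a) * t + 0.5 * (b + a)) for t in nodes]
    coeffs = []
    for j in range(n):
        s = sum(values[k] * math.cos(math.pi * j * (k + 0.5) / n) for k in range(n))
        coeffs.append(2.0 * s / n)
    return coeffs


def chebyshev_eval(coeffs, a, b, x):
    # Clenshaw recurrence; the series is c0/2 + sum_{j>=1} c_j * T_j(t).
    t = (2.0 * x - a - b) / (b - a)  # map [a, b] to [-1, 1]
    d1 = d2 = 0.0
    for c in reversed(coeffs[1:]):
        d1, d2 = 2.0 * t * d1 - d2 + c, d1
    return t * d1 - d2 + 0.5 * coeffs[0]


def max_error(coeffs, a, b, samples=1000):
    # Largest absolute deviation from the true logistic function on a uniform grid.
    worst = 0.0
    for i in range(samples + 1):
        x = a + (b - a) * i / samples
        worst = max(worst, abs(chebyshev_eval(coeffs, a, b, x) - logistic(x)))
    return worst


if __name__ == "__main__":
    lo, hi, deg = -8.0, 8.0, 31  # assumed interval and degree
    c = chebyshev_coefficients(logistic, lo, hi, deg)
    print("max abs error on [-8, 8]:", max_error(c, lo, hi))
```

The approximation is only meaningful inside the chosen interval, so the inputs to the logistic function (the inner products of weights and features) need to stay within it, for example by scaling the features. A higher degree lowers the error but consumes more multiplicative depth.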
An update: we have just published more general examples of logistic regression training. See "ML using OpenFHE: Logistic regression training examples are now available" for more details.