SignedDigitDecompose error caused by EvalBinGate in the Boolean scheme

Hi OpenFHE Team!

We ran OpenFHE under the UBSAN tool while computing the following Boolean polynomial:
((NOT (0 AND 0)) AND ((0 OR 0)) AND ((1 OR 0) AND (1 OR 1)) OR ((1 AND 1))) OR ((NOT (NOT 1)) AND ((NOT 0)))
This computation causes UBSAN to report the following runtime errors:

/data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/rgsw-acc.cpp:74:21: runtime error: left shift of 67101697 by 54 places cannot be represented in type 'long int'
/data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/rgsw-acc.cpp:84:22: runtime error: left shift of 65529 by 54 places cannot be represented in type 'long int'
/data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/rgsw-acc.cpp:71:21: runtime error: left shift of negative value -6946191
/data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/rgsw-acc.cpp:78:22: runtime error: left shift of negative value -6783

After investigating, we narrowed the problem down.
In our code, the following line triggers these runtime errors:

auto ctOR = cc.EvalBinGate(OR, ct1, ct2);
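For reference, our driver follows the standard symmetric Boolean workflow. A minimal sketch of the setup around this call (not our exact demo_boolean_symmetric.cpp; the variable names here are just for illustration) looks like this:

#include "binfhecontext.h"
#include <iostream>

using namespace lbcrypto;

int main() {
    // Boolean FHE context with the STD256 security parameter set
    auto cc = BinFHEContext();
    cc.GenerateBinFHEContext(STD256);

    // secret key plus bootstrapping/refreshing keys
    auto sk = cc.KeyGen();
    cc.BTKeyGen(sk);

    // encrypt two bits and evaluate a single OR gate
    auto ct1 = cc.Encrypt(sk, 1);
    auto ct2 = cc.Encrypt(sk, 0);
    auto ctOR = cc.EvalBinGate(OR, ct1, ct2);

    // decrypt and check the result
    LWEPlaintext result;
    cc.Decrypt(sk, ctOR, &result);
    std::cout << "1 OR 0 = " << result << std::endl;
    return 0;
}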

Why does this call trigger the left-shift runtime errors inside SignedDigitDecompose?
Please refer to the following call stack:

 #0 0x7f1bc3bce265 in lbcrypto::RingGSWAccumulator::SignedDigitDecompose(std::shared_ptr<lbcrypto::RingGSWCryptoParams> const&, std::vector<lbcrypto::PolyImpl<intnat::NativeVectorT<intnat::NativeIntegerT<unsigned long> > >, std::allocator<lbcrypto::PolyImpl<intnat::NativeVectorT<intnat::NativeIntegerT<unsigned long> > > > > const&, std::vector<lbcrypto::PolyImpl<intnat::NativeVectorT<intnat::NativeIntegerT<unsigned long> > >, std::allocator<lbcrypto::PolyImpl<intnat::NativeVectorT<intnat::NativeIntegerT<unsigned long> > > > >&) const /data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/rgsw-acc.cpp:74
    #1 0x7f1bc3e0d659 in lbcrypto::RingGSWAccumulatorCGGI::AddToAccCGGI(std::shared_ptr<lbcrypto::RingGSWCryptoParams> const&, std::shared_ptr<lbcrypto::RingGSWEvalKeyImpl const> const&, std::shared_ptr<lbcrypto::RingGSWEvalKeyImpl const> const&, intnat::NativeIntegerT<unsigned long> const&, std::shared_ptr<lbcrypto::RLWECiphertextImpl>&) const /data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/rgsw-acc-cggi.cpp:112
    #2 0x7f1bc3e220a1 in lbcrypto::RingGSWAccumulatorCGGI::EvalAcc(std::shared_ptr<lbcrypto::RingGSWCryptoParams> const&, std::shared_ptr<lbcrypto::RingGSWACCKeyImpl const> const&, std::shared_ptr<lbcrypto::RLWECiphertextImpl>&, intnat::NativeVectorT<intnat::NativeIntegerT<unsigned long> > const&) const /data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/rgsw-acc-cggi.cpp:66
    #3 0x7f1bc3c546e5 in lbcrypto::BinFHEScheme::BootstrapGateCore(std::shared_ptr<lbcrypto::BinFHECryptoParams> const&, lbcrypto::BINGATE, std::shared_ptr<lbcrypto::RingGSWACCKeyImpl const> const&, std::shared_ptr<lbcrypto::LWECiphertextImpl const> const&) const /data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/binfhe-base-scheme.cpp:538
    #4 0x7f1bc3c597e1 in lbcrypto::BinFHEScheme::EvalBinGate(std::shared_ptr<lbcrypto::BinFHECryptoParams> const&, lbcrypto::BINGATE, lbcrypto::RingGSWBTKey const&, std::shared_ptr<lbcrypto::LWECiphertextImpl const> const&, std::shared_ptr<lbcrypto::LWECiphertextImpl const> const&, bool) const /data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/binfhe-base-scheme.cpp:100
    #5 0x7f1bc3d19956 in lbcrypto::BinFHEContext::EvalBinGate(lbcrypto::BINGATE, std::shared_ptr<lbcrypto::LWECiphertextImpl const> const&, std::shared_ptr<lbcrypto::LWECiphertextImpl const> const&, bool) const /data/homomorphic/openfhe-1.2.3/openfhe-development/src/binfhe/lib/binfhecontext.cpp:281
    #6 0x40932d in analysis(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >, lbcrypto::BinFHEContext, std::shared_ptr<lbcrypto::LWEPrivateKeyImpl>) /data/homomorphic/test-openfhe/demo_boolean_symmetric.cpp:89
    #7 0x40ca65 in main /data/homomorphic/test-openfhe/demo_boolean_symmetric.cpp:266
    #8 0x7f1bbcc32082 in __libc_start_main ../csu/libc-start.c:308
    #9 0x4078bd in _start (/data/homomorphic/test-openfhe/new-build/test+0x4078bd)

Because of the depth of the call chain, we were not able to use GDB to trace the specific argument values passed at each level.
UBSAN detects two different kinds of errors: a left shift of a negative value, and a left shift whose result cannot be represented in type 'long int'. We therefore suspect that these two errors may affect the correctness of the OR gate evaluation inside SignedDigitDecompose.

Are these two reports actual bugs? Both behaviours are undefined behaviour in C++, which could in principle cause SignedDigitDecompose to compute incorrect results. It may also be worth adding some logging to understand why SignedDigitDecompose sometimes left-shifts negative numbers or shifts bits past the top of the integer type, to find the root cause. Note that the exact error messages vary depending on the specific polynomial being evaluated.
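For context, the two categories of reports can be reproduced in isolation with the constants from the UBSAN messages above (a standalone sketch, independent of OpenFHE):

#include <cstdint>
#include <iostream>

int main() {
    int64_t a = 67101697;   // value from the first UBSAN message above
    int64_t b = -6946191;   // value from the third UBSAN message above

    // "left shift of 67101697 by 54 places cannot be represented in type 'long int'":
    // bits are shifted past bit 63 of a signed 64-bit type
    // (undefined behaviour before C++20; defined modulo 2^64 from C++20 on).
    std::cout << (a << 54) << std::endl;

    // "left shift of negative value": shifting a negative signed value
    // (also undefined behaviour before C++20).
    std::cout << (b << 54) << std::endl;
    return 0;
}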

Thank you for taking the time to look into this issue. Please let us know if any further details or clarification would be helpful!

Best regards,
wowblk

Hi @wowblk8,

We do not explicitly use long int in the code. The signed integer type is determined by NATIVE_SIZE: if NATIVE_SIZE=32, it is int32_t; if NATIVE_SIZE=64, OpenFHE uses int64_t. It looks to me like during compilation you (or the UBSAN tool) chose NATIVE_SIZE=32 and then tried to run binfhe for the case when the modulus size is 54 bits. By design, this should not work. We do not do extra checks in SignedDigitDecompose, as such checks would be too computationally expensive. We expect the user to know that if NATIVE_SIZE is set to 32, it will not work for moduli larger than 32 bits. In other words, I do not see a problem here. Please let me know if there is something else that I am missing.
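(A quick way to double-check which native width an installed build actually uses is to query it at run time. The sketch below assumes the binfhe headers are on the include path and relies only on NativeInteger::MaxBits() and NativeInteger::SignedNativeInt, both of which also appear in the SignedDigitDecompose source quoted later in this thread.)

#include "binfhecontext.h"
#include <iostream>

int main() {
    using namespace lbcrypto;
    // 64 for a NATIVE_SIZE=64 build, 32 for a NATIVE_SIZE=32 build
    std::cout << "NativeInteger::MaxBits() = " << NativeInteger::MaxBits() << std::endl;
    // 8 bytes (int64_t) or 4 bytes (int32_t), respectively
    std::cout << "sizeof(SignedNativeInt)  = " << sizeof(NativeInteger::SignedNativeInt) << std::endl;
    return 0;
}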


Hi!
Thank you for the kind response.
Actually, I chose NATIVE_SIZE=64, so it should work correctly.
Here is the CMake configuration status:

-- FOUND PACKAGE OpenFHE
-- OpenFHE Version: 1.2.3
-- OpenFHE installed as shared libraries: ON
-- OpenFHE include files location: /usr/local/include/openfhe
-- OpenFHE lib files location: /usr/local/lib
-- OpenFHE Native Backend size: 64
-- Configuring done
-- Generating done
-- Build files have been written to: /data/homomorphic/test-openfhe/new-build

As you can see, the Native Backend size is 64, so the 54-bit modulus reported above should be within range. I did not choose the moduli manually; I used cc.GenerateBinFHEContext(STD256). Also, when I use cc.GenerateBinFHEContext(STD128), the modulus is set to 55. Here is some of the debug output from our instrumented SignedDigitDecompose (the instrumented source is shown further below):

t0:102502330 Type:unsigned long
d0:100100 Type:long
r0:-70 Type:long
t1:6235415 Type:unsigned long
d1:6089 Type:long
r1:279 Type:long
d0:116005 Type:long
r0:-389 Type:long
t1:112374035 Type:unsigned long
d1:109740 Type:long
r1:275 Type:long
t0:189770870 Type:unsigned long
d0:-76757 Type:long
r0:117 Type:long
t1:228567131 Type:unsigned long
d1:-38870 Type:long
r1:90 Type:long
QHalf: 134184960 Type:unsigned long
Q_int: 134184960 Type:long
gBits: 134184960 Type:long
gBitsMaxBits: 54 Type:long

I explored this function in depth and printed many of its parameters. As shown above, these parameters are of type long, rather than the int32_t or int64_t you mentioned.
We can also see that the values computed in this function do indeed overflow (long int overflow) and trigger undefined behaviour, such as left-shifting negative numbers.
By the way, I forcibly returned from this function early, at a point where no result had been written yet, and the program still computed the correct result. This suggests the function did not introduce calculation errors; to some extent, its design did not hinder the final computation.
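As a side note on the printed type names: on a typical 64-bit Linux (LP64) toolchain, int64_t is usually just a typedef for long, so typeid reporting long is still consistent with the values being int64_t. A small sketch that can confirm this on a given platform (assuming an LP64 target):

#include <cstdint>
#include <iostream>
#include <type_traits>

int main() {
    // On LP64 platforms (e.g. x86-64 Linux) int64_t is normally a typedef for long,
    // so a debugger or typeid printing "long" still means a 64-bit signed integer.
    // On other ABIs int64_t may instead be long long.
    std::cout << std::boolalpha
              << "int64_t is long:      " << std::is_same<std::int64_t, long>::value << '\n'
              << "int64_t is long long: " << std::is_same<std::int64_t, long long>::value << '\n';
    return 0;
}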

Here is the source code of SignedDigitDecompose, instrumented for debugging:

void RingGSWAccumulator::SignedDigitDecompose(const std::shared_ptr<RingGSWCryptoParams>& params,
                                              const std::vector<NativePoly>& input,
                                              std::vector<NativePoly>& output) const {
    auto QHalf{params->GetQ().ConvertToInt<BasicInteger>() >> 1};
    auto Q_int{params->GetQ().ConvertToInt<NativeInteger::SignedNativeInt>()};
    auto gBits{static_cast<NativeInteger::SignedNativeInt>(__builtin_ctz(params->GetBaseG()))};
    auto gBitsMaxBits{static_cast<NativeInteger::SignedNativeInt>(NativeInteger::MaxBits() - gBits)};
    // approximate gadget decomposition is used; the first digit is ignored
    uint32_t digitsG2{(params->GetDigitsG() - 1) << 1};
    uint32_t N{params->GetN()};

    std::cout << "QHalf: " << QHalf << " Type:" << typeid(QHalf).name() << std::endl; 
    std::cout << "Q_int: " << QHalf << " Type:" << typeid(Q_int).name() << std::endl; 
    std::cout << "gBits: " << QHalf << " Type:" << typeid(gBits).name() << std::endl; 
    std::cout << "gBitsMaxBits: " << gBitsMaxBits << " Type:" << typeid(gBitsMaxBits).name() << std::endl; 
    std::cout << "params: " << params << std::endl;

    std::cout << "inputs: ";
    for(auto i = 0UL; i < input.size(); i++) {
        std::cout << input[i] << std::endl;
    }

    std::cout << "outputs: ";
    for(auto i = 0UL; i < output.size(); i++) {
        std::cout << output[i] << std::endl;
    }

    for (uint32_t k{0}; k < N; ++k) {
        auto t0{input[0][k].ConvertToInt<BasicInteger>()};
        auto d0{static_cast<NativeInteger::SignedNativeInt>(t0 < QHalf ? t0 : t0 - Q_int)};
        auto t1{input[1][k].ConvertToInt<BasicInteger>()};
        auto d1{static_cast<NativeInteger::SignedNativeInt>(t1 < QHalf ? t1 : t1 - Q_int)};

        // keep only the signed low-order gBits bits of d0: shift up, then arithmetic-shift back down
        // (these are the left shifts UBSAN reports)
        auto r0{(d0 << gBitsMaxBits) >> gBitsMaxBits};
        d0 = (d0 - r0) >> gBits;

        std::cout << "t0:" << t0 << " Type:" << typeid(t0).name() << std::endl;
        std::cout << "d0:" << d0 << " Type:" << typeid(d0).name() << std::endl;
        std::cout << "r0:" << r0 << " Type:" << typeid(r0).name() << std::endl;

        auto r1{(d1 << gBitsMaxBits) >> gBitsMaxBits};
        d1 = (d1 - r1) >> gBits;

        std::cout << "t1:" << t1 << " Type:" << typeid(t1).name() << std::endl;
        std::cout << "d1:" << d1 << " Type:" << typeid(d1).name() << std::endl;
        std::cout << "r1:" << r1 << " Type:" << typeid(r1).name() << std::endl;

        return; // interrupted this function

        for (uint32_t d{0}; d < digitsG2; d += 2) {
            r0 = (d0 << gBitsMaxBits) >> gBitsMaxBits;
            d0 = (d0 - r0) >> gBits;
            if (r0 < 0)
                r0 += Q_int;
            output[d + 0][k] += r0;

            r1 = (d1 << gBitsMaxBits) >> gBitsMaxBits;
            d1 = (d1 - r1) >> gBits;
            if (r1 < 0)
                r1 += Q_int;
            output[d + 1][k] += r1;
        }
    }
}

It seems the auto keyword is deducing types here that produce results we did not expect.

I also rebuilt the library once with NATIVE_SIZE=32:

QHalf: 134184960 Type:unsigned int
Q_int: 134184960 Type:int
gBits: 134184960 Type:int
gBitsMaxBits: 22 Type:int
t0:0 Type:unsigned int
d0:0 Type:int
r0:0 Type:int
t1:33546241 Type:unsigned int
d1:32760 Type:int
r1:1 Type:int

At this point we can see that gBitsMaxBits is 22, there is no issue in this part of the output, and the type is int (which is presumably int32_t). With NATIVE_SIZE=64, I believe the compiler simply treats int64_t as long int. But even if the type is int64_t, overflow and left-shifting of negative numbers (undefined behaviour) should not occur. Further along in the same NATIVE_SIZE=32 run, however, we see values like these:

d0:-120494 Type:int
r0:-67 Type:int
t1:67604203 Type:unsigned int
d1:66020 Type:int
r1:-277 Type:int
t0:20293471 Type:unsigned int
d0:19818 Type:int
r0:-161 Type:int
t1:114177879 Type:unsigned int
d1:111502 Type:int
r1:-169 Type:int
t0:135681650 Type:unsigned int
d0:-129578 Type:int
r0:-399 Type:int
t1:102970428 Type:unsigned int
d1:100557 Type:int
r1:60 Type:int
t0:9560797 Type:unsigned int
d0:9337 Type:int
r0:-291 Type:int
t1:217510989 Type:unsigned int
d1:-49667 Type:int

It can be seen that, with NATIVE_SIZE=32, the two aforementioned errors also occur: d0/d1 values such as 66020 shifted left by 22 exceed the range of int32_t (66020 × 2^22 is roughly 2^38), and negative values such as d0 = -120494 are left-shifted.

After reviewing all my settings, I believe I have not gone outside the supported usage of OpenFHE or done anything unusual. Given that the two issues above still occur, I think the behaviour of this function should be reported as an issue and addressed.

Hi @ypolyakov, any clue about this problem?

Best regards,
wowblk

I looked at it more carefully. You are correct that the left shift in src/binfhe/lib/rgsw-acc.cpp (openfheorg/openfhe-development, v1.2.3) pushes the most significant bits outside 64 bits. But the intent of the operation is exactly that, i.e., to discard the most significant bits. In your first example, gBits = 10 (the digit size is 2^10). We want to extract the 10 least significant bits, so we left-shift by 54; all the bits above the 10 least significant bits are dropped. Then we right-shift by 54 to make those 10 bits least significant again (with sign extension). For the next values of d0/d1, the 10 least significant bits are then subtracted to achieve the floor/round functionality.
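To make the intended arithmetic concrete, here is a small standalone sketch (not OpenFHE code) that applies the same shift pair to one of the values from the debug output earlier in this thread, alongside an equivalent mask-and-recenter formulation that avoids the signed left shift; on a two's-complement target both yield the same digit. The constants gBits = 10 and the sample input 102502330 are taken from this thread.

#include <cstdint>
#include <iostream>

int main() {
    const int64_t gBits        = 10;            // digit base 2^10, as in the first example
    const int64_t gBitsMaxBits = 64 - gBits;    // 54
    const int64_t base         = int64_t{1} << gBits;

    int64_t d = 102502330;                      // t0 value from the debug output above

    // Shift-based extraction, as in rgsw-acc.cpp: push the high bits out, then
    // arithmetic-shift back down so the low gBits bits come back sign-extended.
    // (This is the left shift UBSAN flags; it is well-defined from C++20 on.)
    int64_t r_shift = (d << gBitsMaxBits) >> gBitsMaxBits;

    // Equivalent formulation without a signed left shift: take the low gBits
    // bits and recenter them into [-2^(gBits-1), 2^(gBits-1)).
    int64_t r_mask = d & (base - 1);
    if (r_mask >= base / 2)
        r_mask -= base;

    std::cout << r_shift << " " << r_mask << std::endl;    // both print -70, matching r0 above
    std::cout << ((d - r_shift) >> gBits) << std::endl;    // prints 100100, matching d0 above
    return 0;
}

The mask-and-recenter version is shown only to illustrate that the net effect of the shift pair is plain signed digit extraction.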

Could you point to a scenario where this can produce an incorrect result? I am not aware of this logic causing any issues on any architecture. The digit extraction is implemented this way for efficiency reasons.