Simple Homomorphic Encryption (SHE)
Basics
Homomorphic Encryption allows you to send encrypted data to a third
party and that third party can do computations on the data and return it
to you will all the computations happening on the encrypted data itself.
This library uses the HElib’s binArithmetic library to implement as full
a C++ semantic as possible. HElib uses the BGV homomorphic encryption
system. This system can do primitive operations of bit level addition
and bit level multiplication. Each operation adds some noise to the
result, so only a limited number of operations can be done before the
noise overwhelms our actual data. Part of BGV is a way to set the noise
level to a lower value by decrypting our the encrypted value
homophorphically. The details aren’t important for the use of SHELib,
but this affects the performance of any calculations you do.
The goals of this library.
HElib as been available for about 10 years, and is already used in
systems to provide homomorphic services. It is the "assembler" level of
doing homomorphic encryption, with a few higher level features like
binArithmetic. The latter provides basic functionality like shift,
bitwise operations, addition, subtraction, and multiplication. Using
helib requires finding correct parameters, and initializing crypto
systems with those parameters, which can be tricky.
The goals of this library is to reduce the complexity of initializing
your the system and provide C++ like access to underlying homomorphic
operations.
Overall the goals, in order, are:
Ease of Use: Encrypted values are represented by C++ classes that mimic
the C++ builtin types, with operator overrides so you an do math on them
just like the native types. The compiler precedence work, non-bool value
will take on bool values in logical expression (1=non-zero, 0=0).
this means much of the details of HElib are hidden, and while you can
get the underlying HElib data structures, you usually don’t have to.
as much as possible, your standard programming will work (with the
notable exception of comparison, which returns an encrypted value and
the results cannot affect your program flow (see more below)).
Correct and fully functioning: The goal is that anything that you can
possibly do with the built-ins, you should be able to do with the
Encrypted values, except you can’t cast encrypted values to unencrypted
values (so you can’t get a bool, or instance from a comparison).
this means that integer division is included, even if it could take
hours or days to complete.
encrypted int64 types are allowed, even though they would take hours to
do basic operations.
bootstrapping (see below) is implemented automatically so you don’t have
to worry about it in your initial implementation (even if it slows down
performance).
Performance: This is goal 3. This means if there are two ways of doing
something, one way is significantly slower, but produces the correct
results and the other way is faster, but sometimes hits corner cases.
I’ve chosen correctness. That being said, I will usually try to chose
the method that provides optimal performance within the correctness
constraint. I’ve also provided access to methods the application
programmer can use to optimize their implementation.
Using SHELib
KeyGeneration and management.
To use SHE, you first need to get a keypair. This keyPair is generated
using the SHEGenerate_KeyPair(), which takes a type, a security level,
and an operation level. The operation level sets how many operations you
can do before you need to bootstrap again. The higher the operation
level, the slower each operation will be for a given security level.
Security level is the strength of the system in bits. Several levels are
provided that are not really secure, but allow you to experiment before
cranking up to a higher security level.
One you have your keyPair, you can save them to a file, or stream using
the C++ << operators. In a real system, you will want to send the
publickey along with your encrypted data, so that they server can
operate on your encrypted data.
NOTE: in production, only the client needs to generate a keypair, the
server just accepts publicKey along with the encrypted data.
Integer operations.
Encrypted data is stored in various SHEInt classes: SHEInt8, SHEInt16,
SHEInt32, SHEInt64, SHEUInt8, SHEUInt16, SHEUInt32, SHEUInt64and
SHEBool. These correspond the the C++ types int8_t, int16_t, int32_t,
int64_t, uint8_t, uint16_t, uint32_t, uint64_t, and bool respectively
These classes are all subclasses of SHEInt, so you can implement generic
integer libraries that work on any of these native types by using
SHEInt. All SHEInt constructs must have a publicKey, either explicitly,
or implicitly. Simply declare a SHEInt variable, and start using it.
SHEInt16 a(publicKey);
SHEInt16 b(publicKey);
SHEInt16 r(publicKey);
a = 5;
b = -7;
r = a*a + b;
All the basic integer operators are supported except the ? operator
(a?b:c). Instead you can use a.select(b:c), or select(a,b,c);
Logical and comparison operations return SHEInt and not bool, and there
is no cast from SHEInt to bool. This is because the results of these
operations are encrypted, so they cannot be examined directly in your
program. This means they can not be used in a if, or as an exit to a
loop condition.
if (a == b) // This will not work.
This is because the result of the comparison is itself encrypted. So you
can’t query the result (unless you decrypt it). What you can do is use
the result to select a value:
r = (a==b).select(10, 5);
or
r = select(a==b, 10, 5);
r will get the selected encrypted value (either 10 or 5) depending on
the state of the comparison. You can use integer constants or SHEInt
values or a mix of values in the select.
To implement a while loop with an encrypted index, you can’t use and
encrypted bool result to exit the loop. Instead you need to use a for
loop to loop over all the possibilities and stop updating your variables
when the loop condition ‘ends’:
// unencrypted a and b:
while (a < b) {
a += 1;
b >> 1;
}
// encrypted a && b, assumes a is not negative…
SHEBool lbreak(publicKey, false);
for (int i=0; i < b.getSize(); i++) {
// lbreak is zero until the condition is reached
lbreak= lbreak || (a < b);
// update variables until the lbreak condition occurs
a = lbreak.select(a, a+1);
b = lbreak.select(b, b>>1);
}
Note: it’s important to know the actual maximum bound of the while loop,
if, in the above example, a is -b.getSize(), then this loop won’t
execute as long as expected an will produce incorrect results.
SHEVector operations.
SHEVector allows you to store a vector or array of encrypted values and
access that array with an encrypted index. In all other ways it
functions as a normal std::vector. When using an encrypted index, you
can’t assign to it, but you can use the assign() method:
SHEInt16 model(publicKey);
SHEVector<SHEInt16> array(model,1);
SHEInt16 r(publicKey,5);
SHEInt16 r2(publicKey);
SHEInt8 index(publicKey, 3);
array[1] = SHEInt16(publicKey, 5); // this works because the index is
// unencrypted
array[3] = r;
r2 = array[index]; // r2 will get an encrypted value of r
array[index] = SHEInt16(publicKey, 7) // this will not work.
array.assign(index,SHEInt16(publicKey, 7)) // use this instead
SHEVectors can store and index any value that has a
<type>select(const SHEInt &, const <type>&, const
<type>&); function. Indexes are type SHEInt.
Using encrypted indicies on unencrypted arrays, vectors and maps.
SHEVector.h also defines 3 template functions: getArray, getVector, and
getMap which takes normal arrays, vectors and maps and returns encrypted
values based on encrypted indicies. These functions also take a default
encrypted value to use if the index is out of bounds.
SHEFp operations
Like SHEInt, SHEFp is a base class used to implement SHEHalfFloat,
SHEBFloat16, SHEFloat, SHEDouble, SHEExtendedFloat, and SHELongDouble.
Like SHEInt, in needs to be initialized with a publicKey either
implicitly or explicitly. Once you have a float variable, you can assign
to it, operate, on it, etc. Basic operators +, -, *, and / are defined
(returning SHEFp). Comparison operators return SHEBool and can be used
in a select function:
SHEFp select(&SHEBool sel,const SHEFp &a_true, const SHEFp
&a_false);
as well as versions which also allow mixing in floating point constants
as well.
A special SHEFpBool class is defined which allows syntax like:
SHEFp a = SHEFPBool(c>b).select(c,b);
This is equivalent to a = select(c>b,c,b);
SHEString operations
SHEString provides an encrypted version of std::string. Constructors
take either a model string or a public key, just like SHEInt and SHEFp.
Strings are implemented as a vector of SHEChar. The length of the string
may or may not be encrypted. You can use the hasEncryptedLength flag to
specify encryptedLength in the constructor. You can also call the
encryptLength() methode to force a string to have an encrypted length.
Any operation that uses an encrypted length itself, and encrypted
position, or another string with an encrypted length will force the
resulting string to have an encrypted length. The only operation that
will cause a string with an encrypted length to have an unencrypted
length is a resize with an unencrypted length value.
Strings with encrypted lengths have 2 different modes, set by global
class variables. The modes are fixed and non-fixed. Fixed strings all
have the same length , set by the maxStringLength variable. If a string
result would be larger than maxStringLength, then that result is
truncated. Non-fixed strings have variable lengths which are guaranteed
to be the same or longer than the encrypted length. These strings are
still limited by the maxEncryptedStringLength because the length is
stored in a SHEUInt8 variable. Applications can set the behavior with
SHEString::SetFixedSize(), and query it with SHEString::getFixedSize()
and SHEString::getHasFixedSize().
Performance and bootstrapping.
While pretty much anything you want to do with Integers and Doubles is
support, not everything can be done in finite time. While the library
‘hides’ the details of underlying bootstrapping operations, they will
quickly become apparent in your application. For any real security
level, bootstrapping large integers, like 64 bits, can take 3-8 hours to
complete. So while you may be able to do some operation in multiple
seconds, when you run close to the noise limit, they can suddenly take
hours. There are a couple of ways to mitigate this issue.
you can select a high noise tolerance context and try to complete your
entire calculation before bootstrapping becomes necessary.
you can use a lower noise tolerance to get faster performance for the
same security level.
you can reduce your security level.
you can reduce your variable size.
For almost all usable security levels, SHEInt64 will have extremely poor
performance. SHEInt16 will have the best compromise of performance and
functionality. If you do have to bootstrap, you can use the
.bitCapacity() method to see how many more operations your variable can
go before it needs to bootstrap. There is also a useful method
(.verifyArgs()) that will preemptively bootstrap those variables before
you execute a particular function. This is important because you may
find that while SHELib can automatically bootstrap, it may bootstrap
temporary copies, leaving the original variable unupdated. Also
bootstrapping a variable that may be used in multiple locations helps
keep all the variables that depend on it from loosing capacity.
Debugging
You can get logging output from the internals of each component of
SHELib by providing a stream to the SHExxx::setLog function. Setting the
logger will cause the function to output various calls to that function.
You can also output summary information for SHEInt and SHEFp types using
the SHEIntSummary and SHEFpSummary casts before outputting them to a
stream:
std::cout << "My encrypted vars: " << (SHEIntSummary) myint
<< ", " << (SHEFpSummary) myfloat <<
<< ", " << (SHEStringSummary) mystring << std::endl
For SHEInts you’ll get:
SHEInt(label, bitSize, isUnsigned, isZero, capacity[:value])
where:
label is the label you set on the variable (temporaries get their own
unique ‘t’ label which is set when they are first printed).
bitSize is the number of encrypted bits of this SHEInt (same as returned
from myint.getSize()).
isUnsigned is ‘U’ for unsigned values and ‘S’ for signed values.
isZero is ‘Z’ if this value is an unencrypted explicit zero, and ‘E’ if
this is a properly encrypted value.
capacity how many ’level’s are left before this variable is too noisy to
be useful, MAX means the value is effectively unencrypted and has no
noise.
value is the decrypted value. It’s only included if you have set the
private key with SHEInt::setDebugPrivateKey(SHEPrivateKey&)
For SHEFps you’ll get:
SHEFp(label, expBitSize, mantissaBitSize, capacity[:value])
label is the label you set on the variable (temporaries get their own
unique ‘t’ label which is set when they are first printed).
expBitSize is the number of encrypted bits of exponent (same as returned
from myfloat.getExp().getSize()).
mantBitSize is the number of encrypted bits mantissa (same as returned
from myfloat.getMatissa().getSize()).
capacity how many ’level’s are left before this variable is too noisy to
be useful, MAX means the value is effectively unencrypted and has no
noise.
value is the decrypted value. It’s only included if you have set the
private key with SHEFp::setDebugPrivateKey(SHEPrivateKey&)
For SHEStrings you’ll get:
SHEString(label, encryptedLength, size, capacity[:”value”])
where:
label is the label you set on the variable (temporaries get their own
unique ‘t’ label which is set when they are first printed).
encryptedLength is ‘El’ if the length is encrypted and ‘Ul’ for
unencrypted length.
size size of the string buffer, for unencrypted lengths this is the
string size, for encrypted lengths it’s always bigger than equal to the
real length.
capacity how many ’level’s are left before this variable is too noisy to
be useful, MAX means the value is effectively unencrypted and has no
noise.
value is the decrypted value. It’s only included if you have set the
private key with SHEString::setDebugPrivateKey(SHEPrivateKey&)
If you don’t cast the variables, you will get the object encoded as
json, which is suitable for communicating to other locations, but not
all that usable for debugging.
If you set some radix other than std::dec, all the above values will
still be printed in radix std::dec except the SHEInt values, which will
use the radix you supplied (std::hex or std::octal)