I. Introduction
Recent advances in lattice-based Homomorphic Encryption (HE) have shown that it can be practical to securely run arbitrary computations over encrypted data [1], [2]. In addition to supporting encrypted computing, lattice-based HE schemes are attractive because they are post-quantum public-key schemes [3], meaning they are resistant to quantum computing attacks. Despite recent advances, HE is still not widely practical, partially because of computational bottlenecks in lattice-based HE schemes when run on commodity CPU-based computation devices. It would be valuable to accelerate the execution of core HE operations, possibly in a low-cost but computationally efficient and highly optimized hardware co-processor.