Alex²

Research

I’m interested in doing computer architecture research where I apply techniques from classical algorithms and data structures. I’m not yet sure exactly what this entails, but it will probably have something to do with accelerators, compilers, and/or hardware-software co-design. Ideally, I would be able to focus on maximizing efficiency, free from the limitations of existing architectures.

Previously, I worked with Nikola Samardzic on accelerating Fully Homomorphic Encryption—encryption that allows running programs on secret data without decrypting it.

A Tensor Compiler with Automatic Data Packing for Simple and Efficient Fully Homomorphic Encryption

Aleksandar Krastev, Nikola Samardzic, Simon Langowski, Srinivas Devadas, Daniel Sanchez

Fhelipe is an FHE compiler exposing an easy-to-use tensor programming interface. Fhelipe simplifies programming by abstracting data layouts and noise management, while achieving great performance via two key contributions: a novel bit-permutation tensor layout representation and a novel bootstrap placement algorithm. Fhelipe is the first compiler to match the performance of large hand-optimized FHE applications, outperforming prior compilers by gmean 18.5×.

I designed Fhelipe’s layout representation, helped design all algorithms in the compiler, and implemented the compiler frontend.

CraterLake: A Hardware Accelerator for Efficient Unbounded Computation on Encrypted Data

Nikola Samardzic, Axel Feldmann, Aleksandar Krastev, Nathan Manohar, Nicholas Genise, Srinivas Devadas, Karim Eldefrawy, Chris Peikert, Daniel Sanchez

CraterLake is a state-of-the-art hardware accelerator for FHE, providing speedups of 5,000× over a CPU on a broad range of applications. Building upon F1’s functional units, CraterLake introduces a novel architecture that significantly reduces on- and off-chip data movement.

I came up with the way computation is distributed across the chip (Sec. 4), designed the on-chip network (Sec. 5.3), and designed the KeySwitch hint generator (Sec. 5.2).

F1: A Fast and Programmable Accelerator for Fully Homomorphic Encryption

Axel Feldmann, Nikola Samardzic, Aleksandar Krastev, Srini Devadas, Ron Dreslinski, Christopher Peikert, Daniel Sanchez

F1 was our initial proposal for an FHE accelerator. F1 proposes novel high-throughput FHE functional units, but suffers from excessive on-chip data movement that prevents it from scaling to large FHE applications. As a result, F1 is about 5,000× faster than a CPU on small applications (similar to CraterLake), but only 400× faster on large ones (11× slower than CraterLake).

I designed the first SRAM-only, fully pipelined transpose unit (Sec. 5.1), which is a crucial component of F1’s novel FFT and automorphism units.