Creating Defenses Against Multi-Tenant ML Cloud-FPGA Vulnerabilities

ECE Assistant Professor Xiaolin Xu, in collaboration with Deliang Fan from Arizona State University, was awarded a $500K NSF grant for the project “Secure and Robust Machine Learning in Multi-Tenant Cloud FPGA.”


Abstract Source: NSF

Alongside the rapid growth of the cloud-computing market and critical developments in machine learning (ML) computation, cloud FPGAs (Field-Programmable Gate Arrays) have become a vital hardware resource for public lease, where multiple tenants can co-reside on and share an FPGA chip over time or even simultaneously. With many hardware resources jointly used in the multi-tenant cloud-FPGA environment, a unique attack surface is created: a malicious tenant can leverage such indirect interaction to manipulate the circuit applications of other tenants, e.g., by intentionally injecting faults. Prior research has demonstrated that small but carefully designed perturbations of the ML model parameters transmitted between off-chip memory and the on-chip buffer can completely disable ML functionality, even under a black-box attack scenario, posing an unprecedented threat to future ML cloud-FPGA systems. This project (1) aims to understand the vulnerability of multi-tenant ML cloud-FPGA systems and explore defensive approaches, which is crucial and timely for both industry and academia in the cloud-FPGA computing domain; (2) advances the security of ML cloud systems against hardware-based model tampering on off-chip data transmission in the multi-tenant cloud-FPGA computing infrastructure; and (3) integrates the research outcomes with education through new curriculum development, undergraduate and graduate student training, and K-12 outreach programs that promote women and underrepresented minorities in STEM.
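To make the threat concrete, the sketch below is a minimal illustration (an assumption for exposition, not the attack studied in this project) of how flipping a single high-order bit of a float32 weight, as might happen to a parameter in flight between off-chip memory and the on-chip buffer, changes its value by many orders of magnitude. The function and variable names are hypothetical.

```python
# Minimal sketch: a single bit flip in a float32 "model parameter" can
# blow up its value, which is why small hardware faults can break a model.
import numpy as np

def flip_bit(weights: np.ndarray, index: int, bit: int) -> np.ndarray:
    """Return a copy of `weights` with one bit flipped in the element at `index`."""
    perturbed = weights.copy()
    as_bits = perturbed.view(np.uint32)    # reinterpret float32 bits as uint32
    as_bits[index] ^= np.uint32(1 << bit)  # XOR toggles the chosen bit
    return perturbed

rng = np.random.default_rng(0)
w = rng.normal(0.0, 0.05, size=8).astype(np.float32)  # toy weight vector
w_faulty = flip_bit(w, index=3, bit=30)               # flip a high exponent bit

print("clean :", w[3])         # a small weight on the order of 0.05
print("faulty:", w_faulty[3])  # the same weight scaled by roughly 2**128
```

Because bit 30 sits in the float32 exponent field, the flip multiplies a small weight by about 2^128, which is the kind of catastrophic parameter corruption the abstract describes.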

This project integrates ML algorithm security and FPGA hardware security in a software-hardware co-design approach, exploring novel solutions that improve the security of multi-tenant ML cloud-FPGA systems. It consists of three research thrusts. Thrust-1 systematically studies, models, and characterizes an adversarial weight-duplication hardware fault-injection method, which leverages aggressive power-plundering circuits in a malicious tenant's design to inject faults into the victim tenant's ML model. Thrust-2 explores ML algorithmic methodologies that enhance the intrinsic robustness and resiliency of ML models against adversarial faults injected into model parameters during transmission from off-chip memory to the on-chip buffer. Thrust-3 investigates FPGA system-level tamper-resistant approaches that provide comprehensive solutions for improving ML-FPGA system security.
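As one illustration of the kind of tamper resistance Thrust-3 targets, the sketch below is a hypothetical construction (not the project's published design): each weight block is checksummed before it leaves off-chip memory and re-verified on arrival in the on-chip buffer. Here `dma_transfer` is a stand-in for the real transfer path.

```python
# Minimal sketch: detect in-flight tampering of weight blocks by comparing
# a digest computed before and after the off-chip -> on-chip transfer.
import hashlib
import numpy as np

def digest(block: np.ndarray) -> bytes:
    """SHA-256 digest of a weight block's raw bytes."""
    return hashlib.sha256(block.tobytes()).digest()

def dma_transfer(block: np.ndarray, inject_fault: bool = False) -> np.ndarray:
    """Stand-in for the off-chip to on-chip transfer; optionally corrupts one bit."""
    received = block.copy()
    if inject_fault:
        bits = received.view(np.uint32)
        bits[0] ^= np.uint32(1 << 30)  # simulate an adversarial bit flip in flight
    return received

weights = np.random.default_rng(1).normal(size=16).astype(np.float32)
expected = digest(weights)

received = dma_transfer(weights, inject_fault=True)
if digest(received) != expected:
    print("tamper detected: discard the block and re-fetch")
```

A real FPGA design would implement the check in hardware alongside the memory controller, but the before/after-digest comparison captures the basic tamper-detection idea.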

Related Departments: Electrical & Computer Engineering