Creating High-Performance and Reliable Storage Systems
Improving Data Storage for Technological Advancements
ECE Associate Professor Ningfang Mi was recently awarded a $500K NSF grant for “New Techniques for I/O Behavior Modeling and Persistent Storage Device Configuration.”
In collaboration with Florida International University, Mi will develop new techniques for benchmarking storage systems. The research will also focus on configuring those systems to obtain the best possible performance and reliability.
“We want to derive new input/output (I/O) models to accurately capture I/O behaviors when running multiple applications with different workloads on storage systems, like in flash-based solid-state drives (SSDs),” Mi said.
Different storage devices have different drivers and parameters, she added. Using default parameters does not always achieve the best performance.
“We will leverage the benchmarks to see how to set up and select good algorithms on devices to better manage them and optimize their performance,” Mi explained.
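As a rough illustration of why default parameters can fall short, the sketch below runs a benchmark over several candidate settings and keeps the best one. It is not the project's method: the queue-depth parameter, the candidate values, the default of 1, and the `toy_benchmark` cost curve are all assumptions made up for this example. A real study would replace the toy function with measurements taken on an actual device.

```python
def pick_best_setting(candidates, run_benchmark):
    """Benchmark every candidate and return the one with the lowest cost.

    `run_benchmark` is any callable that maps a setting to a cost
    (for example, average latency); lower is better.
    """
    results = {setting: run_benchmark(setting) for setting in candidates}
    best = min(results, key=results.get)
    return best, results


def toy_benchmark(queue_depth):
    """Hypothetical cost curve, for illustration only.

    A very shallow queue leaves the device idle, while a very deep
    queue adds waiting time, so the cost is lowest somewhere between.
    """
    return 100 / queue_depth + 2 * queue_depth


DEFAULT_QUEUE_DEPTH = 1  # assumed default, not taken from any real driver

best, results = pick_best_setting([1, 4, 8, 16, 64], toy_benchmark)
print("default cost:", results[DEFAULT_QUEUE_DEPTH])
print("best setting:", best, "cost:", results[best])
```

In this toy setup the assumed default is far from the best candidate, which is the kind of gap a good benchmark is meant to expose.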
The challenge, though, is not simply to emulate a single I/O application. It is to emulate multiple I/O workloads or applications running at the same time, she said.
“Simple benchmarks cannot capture real applications in real systems,” Mi said. “We want to ask the question, can good benchmarks emulate the real workloads in real systems?”
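A small sketch can show why concurrent workloads are hard to capture with a single-application benchmark. The example below is an illustration under stated assumptions, not the models the project will derive: the 4 KB request size, the two synthetic applications, and the round-robin merge are all hypothetical simplifications. Each application on its own issues perfectly sequential requests, yet the merged stream that reaches the device contains no sequential requests at all.

```python
BLOCK = 4096  # assumed request size in bytes


def sequential_workload(start, count):
    """Offsets that advance one block at a time, like a log writer."""
    return [start + i * BLOCK for i in range(count)]


def interleave(*streams):
    """Round-robin merge of equal-length streams.

    A crude stand-in for what a device sees when several
    applications issue requests at the same time.
    """
    merged = []
    for group in zip(*streams):
        merged.extend(group)
    return merged


def sequential_fraction(offsets):
    """Share of requests that start exactly where the previous one ended."""
    if len(offsets) < 2:
        return 0.0
    hits = sum(
        1 for prev, cur in zip(offsets, offsets[1:]) if cur == prev + BLOCK
    )
    return hits / (len(offsets) - 1)


# Two hypothetical applications writing to distant regions of the device.
app_a = sequential_workload(0, 1000)
app_b = sequential_workload(1_000_000 * BLOCK, 1000)
mixed = interleave(app_a, app_b)

print("app A alone:", sequential_fraction(app_a))
print("app B alone:", sequential_fraction(app_b))
print("mixed stream:", sequential_fraction(mixed))
```

Real systems interleave requests far less regularly than a round-robin merge, so this only hints at the effect. Still, it suggests why a benchmark built from one application at a time may misjudge how a device behaves under a realistic mix.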
The research will “analyze the impact of various system components while running multiple workloads on emerging storage systems,” according to the NSF grant. Furthermore, the project centers on helping “to configure storage systems with respect to their workloads and data processing requirements.”
Mi added that this is an ideal opportunity to learn new storage techniques. Devices like laptops already use flash memory and can process data quickly, but new storage technologies continue to emerge. This research can potentially help computers, smartphones, and other devices run even faster, she said.
The future is female
In addition to creating higher performing and more reliable storage systems, Mi hopes the project will open up more career opportunities for female students.
“Engineering has previously been thought of as a ‘male’ industry,” she said. “But we want more female students to study in engineering fields and enjoy their research. We can do the same thing, and at times, better.”
Mi added that the grant aims to recruit more female students at both the undergraduate and graduate levels. Having had a woman as her own mentor, Mi wants to continue to show young women that they can achieve great things, especially in the STEM fields.
“Currently in my class that I teach, there are not too many female students,” she said. “But I work with many female faculty members and researchers. I’m proud to work with them, and I want to show [young women] that they can also publish papers and conduct research in these fields.”
Abstract Source: NSF
Currently, there is a rapidly growing diversity in data processing workloads. Likewise, new advancements in persistent storage technologies are emerging. Therefore, it is important to have new techniques for benchmarking and appropriately configuring storage systems in order to obtain the best possible performance and reliability. This project proposes to derive new input/output (I/O) models to capture I/O behaviors accurately when running multiple applications with different workloads on storage systems such as flash-based solid-state drives (SSDs). In addition, this project develops new approaches to identify the most appropriate internal algorithm for different types of persistent storage devices and dynamically adjust the associated algorithm parameters according to I/O activities.
This project makes empirical contributions to storage systems by addressing challenges issued by large-scale data-intensive applications. Specifically, it advances (1) how to analyze the impact of various system components while running multiple workloads on emerging storage systems; (2) how to design interactive frameworks that allow users to modify the internal algorithms and parameters of modern storage devices; (3) how to enable novices to configure storage systems with respect to their workloads and data processing requirements; and (4) how to derive I/O models to predict future I/O workload patterns and accordingly configure storage systems in advance for better performance.
This project will lead to better storage systems design with high performance and reliability. The outcome of this project will bring a significant impact on many areas that are dependent on processing a large amount of data. This project will share the findings with undergraduate and graduate students through computer science and engineering programs and open up career opportunities to female students, underrepresented minorities, and first-generation college students. This project will disseminate the proposed techniques into the industry and foster technology transfer through new industrial collaborations. The developed infrastructure will be available to the research community through a web-based portal.
All the publicly disclosable NSF funded work products developed under this project will be maintained at the project website (https://damrl.cis.fiu.edu/research/) at Florida International University (FIU) for at least five years beyond the end of the project. Data generated and collected as part of this project will be deposited into Digital Repository Service (DRS) at Northeastern University (NEU) and maintained for at least 5 years beyond the end of the project. The developed software code and tools will be published in scholarly articles and be made available online via NEU’s DRS, and FIU’s project website.