CSR: System and Middleware Approaches to Predictable Services in Multi-Tenant Clouds (NSF CNS-1649502)
Project description and goals
Datacenter-based cloud services exhibit unpredictable performance variations due to multi-tenant interference and heterogeneity in datacenter hardware. The investigators attribute this unpredictability to two service guarantees that existing cloud providers do not offer: resource capacity and application agility. To provide guaranteed resource capacity and enhanced application agility, this project develops independent but complementary approaches at the system and middleware levels to reduce performance variations of in-cloud applications without compromising other objectives such as high datacenter utilization and good average performance. The deliverables are new system support in cloud resource management that accounts for interference and hardware heterogeneity in shared infrastructures, and middleware approaches that perform agile, non-invasive, and application-centric resource provisioning. The research methodology combines architectural knowledge of the complex interplay between simultaneous multithreading, multicore, and non-uniform memory access architectures with statistical learning algorithms to quantify interference and heterogeneity, and integrates the strengths of self-optimizing learning and control techniques to automate resource provisioning under dynamic workloads. This project broadens its impact by exploring interdisciplinary techniques in computer system design and enhancing cloud services with predictability guarantees. Its success will guide resource management and metering in future cloud systems.
Participants
Dr. Jia Rao, Principal investigator
Dr. Xiaobo Zhou, Co-Principal investigator
Kun Suo, Ph.D. student, 2013-2017
Yong Zhao, Ph.D. student, 2014-2017
Xiaofeng Wu, Ph.D. student, 2017
Anthony Ayodele, Ph.D. student, 2013-2016
Sawyer Peterson, REU student, 2014-2015
Kevin Zarkovacki, REU student, 2014-2015
Khanh Nguyen, REU student, 2016-2017
Mason Moreland, REU student, 2016-2017
Scott Laue, REU student, 2017
Project-sponsored Publications
Characterizing and Optimizing Hotspot Parallel Garbage Collection on Multicore Systems
Kun Suo, Jia Rao, Hong Jiang, and Witawas Srisa-an.
To appear in The European Conference on Computer Systems (EuroSys), 2018
An Analysis and Empirical Study of Container Networks
Kun Suo, Yong Zhao, Wei Chen, and Jia Rao.
To appear in The IEEE International Conference on Computer Communications (INFOCOM), 2018
Scheduler Activations for Interference-resilient SMP Virtual Machine Scheduling
Yong Zhao, Kun Suo, Luwei Cheng, and Jia Rao.
In Proceedings of The ACM/IFIP/USENIX Conference on Middleware (Middleware), 2017
Preserving I/O Prioritization in Virtualized OSes
Kun Suo, Yong Zhao, Jia Rao, Luwei Cheng, Xiaobo Zhou, and Francis C.M. Lau.
In Proceedings of The Symposium on Cloud Computing (SoCC), 2017
Preemptive, Low Latency Datacenter Scheduling via Lightweight Virtualization
Wei Chen, Jia Rao, and Xiaobo Zhou.
In Proceedings of The USENIX Annual Technical Conference (ATC), 2017
Characterizing and Optimizing the Performance of Multithreaded Programs Under Interference
Yong Zhao, Jia Rao and Qing Yi.
In Proceedings of The 25th International Conference on Parallel Architecture and Compilation Techniques (PACT), 2016
Time Capsule: Tracing Packet Latency across Different Layers in Virtualized Systems
Kun Suo, Jia Rao, Luwei Cheng and Francis C.M. Lau.
In Proceedings of The 7th ACM SIGOPS Asia-Pacific Workshop on Systems (APSys), 2016
Best paper award (2 out of 52 submissions)
Resource and Deadline-aware Job Scheduling in Dynamic Hadoop Clusters
Dazhao Cheng, Jia Rao, Changjun Jiang and Xiaobo Zhou.
In Proceedings of the IEEE International Parallel & Distributed Processing Symposium (IPDPS), 2015
StoreApp: A Shared Storage Appliance for Efficient and Scalable Virtualized Hadoop Clusters
Yanfei Guo, Jia Rao, Dazhao Cheng, Changjun Jiang, Cheng-Zhong Xu and Xiaobo Zhou.
In Proceedings of the 34th IEEE Conference on Computer Communications (INFOCOM), 2015
Co-tenancy Interference Measurement and Performance Anomaly Detection in a Multi-tenant Cloud Computing Environment
Anthony Ayodele, Terrance Boult and Jia Rao.
In Proceedings of IEEE International Conference on Cloud Computing (CLOUD), 2015.
Understanding Parallel Performance Under Interferences in Multi-tenant Clouds
Yong Zhao, Jia Rao, Xiaobo Zhou, and Qing Yi.
In Proceedings of International Conference on Measurement and Modeling of Computer Systems (SIGMETRICS), Poster, 2015.
Improving MapReduce Performance in Heterogeneous Environments with Adaptive Task Tuning
Dazhao Cheng, Jia Rao, Yanfei Guo and Xiaobo Zhou.
In Proceedings of ACM/IFIP/USENIX International Conference on Middleware (Middleware), 2014.
Moving Hadoop into the Cloud with Flexible Slots
Yanfei Guo, Jia Rao, Changjun Jiang and Xiaobo Zhou.
In Proceedings of ACM/IEEE International Conference for High Performance Computing, Networking, Storage, and Analysis (SC), 2014.
User-Centric Heterogeneity-Aware MapReduce Job Provisioning in the Public Cloud
Eric Pettijohn, Yanfei Guo, Palden Lama, and Xiaobo Zhou.
In Proceedings of USENIX International Conference on Autonomic Computing (ICAC), 2014.
Software release