Session D1

Edge-Cloud Performance

4:20 PM — 5:40 PM HKT
Dec 2 Wed, 3:20 AM — 4:40 AM EST

Task Offloading in Trusted Execution Environment empowered Edge Computing

Yuepeng Li, Deze Zeng, Lin Gu, Andong Zhu and Quan Chen

To tackle the scarcity of computation resources on end devices, task offloading has been developed to reduce task completion time and improve Quality-of-Service (QoS). Edge computing facilitates such offloading by provisioning resources in the proximity of end devices. Nowadays, many tasks on end devices have an urgent demand for a secure execution environment. To address this need, we introduce the trusted execution environment (TEE) to empower edge computing with secure task offloading. To exploit the TEE, the offloading process must be redesigned to incorporate data encryption and decryption, which prevents traditional offloading optimization policies from being applied directly. We are therefore motivated to incorporate data encryption and decryption into the offloading scheduling algorithm. In particular, we propose a Customized List Scheduling based Offloading (CLSO) algorithm that aims to minimize the total completion time while respecting the energy budget limitations of the end devices. Experimental results show that our approximation algorithm effectively reduces the total completion time and significantly outperforms existing state-of-the-art offloading strategies.
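The key point of the abstract is that TEE-based offloading adds encryption, transfer, and decryption stages to the cost of each offloaded task. A minimal sketch of that decision, with hypothetical task parameters (the actual CLSO algorithm also handles task precedence and a tighter energy model):

```python
# Sketch of list-scheduling offloading with TEE encryption/decryption costs.
# All task fields are hypothetical illustrations, not the paper's model.

def schedule(tasks, energy_budget):
    """Greedily place each task locally or on the TEE-backed edge.

    tasks: list of dicts with local_time, enc_time, upload_time,
           remote_time, dec_time, local_energy, offload_energy.
    Returns (placements, total_time, energy_used).
    """
    placements, total_time, energy = [], 0.0, 0.0
    for t in tasks:
        # Offloading now pays for encryption, transfer, and decryption too.
        offload_time = (t["enc_time"] + t["upload_time"]
                        + t["remote_time"] + t["dec_time"])
        if (offload_time < t["local_time"]
                and energy + t["offload_energy"] <= energy_budget):
            placements.append("edge")
            total_time += offload_time
            energy += t["offload_energy"]
        else:
            placements.append("local")
            total_time += t["local_time"]
            energy += t["local_energy"]
    return placements, total_time, energy
```

The sketch only shows how the encryption/decryption terms enter the placement choice; CLSO additionally orders tasks by a priority list before placement.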

Gecko: Guaranteeing Latency SLO on a Multi-Tenant Distributed Storage System

Zhenyu Leng, Dejun Jiang, Liuying Ma and Jin Xiong

Meeting tail latency Service Level Objectives (SLOs) while achieving high resource utilization is important for distributed storage systems. Recent works adopt strict priority scheduling or constant rate limiting to provide SLO guarantees, but these cause resource under-utilization. To address this issue, we first analyze the relationship between workload bursts and latency SLOs. Based on burst patterns and latency SLOs, we classify tenants into two categories: Postponement-Tolerable tenants and Postponement-Intolerable tenants. We then explore the opportunity to improve resource utilization by carefully allocating resources to each tenant type. We design a Rate-Limiting-Priority scheduling algorithm to limit the impact of high-priority tenants on low-priority ones. Meanwhile, we propose a Postponement-Aware scheduling algorithm that allows Postponement-Intolerable tenants to preempt system capacity from Postponement-Tolerable tenants, which helps increase resource utilization. We build these into Gecko, a latency SLO guarantee framework that guarantees multi-tenant latency SLOs by combining the two proposed scheduling algorithms with an admission control strategy. We evaluate Gecko with real production traces, and the results show that Gecko admits 44% more tenants on average than state-of-the-art techniques while guaranteeing latency SLOs.
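The core of rate-limiting-priority scheduling is that high-priority tenants are served first, but only within a rate budget, so they cannot starve low-priority tenants. A minimal token-bucket sketch of that idea (parameters and class name are illustrative, not Gecko's actual design):

```python
# Sketch of a Rate-Limiting-Priority dispatcher: high-priority requests
# consume tokens from a bucket; leftover capacity goes to low priority.
from collections import deque

class RateLimitedPriorityScheduler:
    def __init__(self, tokens_per_tick, burst):
        self.tokens_per_tick = tokens_per_tick
        self.burst = burst       # cap on accumulated tokens
        self.tokens = 0.0
        self.high = deque()      # e.g. Postponement-Intolerable tenants
        self.low = deque()       # e.g. Postponement-Tolerable tenants

    def submit(self, request, high_priority):
        (self.high if high_priority else self.low).append(request)

    def dispatch_tick(self, capacity):
        """Serve up to `capacity` requests; high priority is rate-limited."""
        self.tokens = min(self.tokens + self.tokens_per_tick, self.burst)
        served = []
        # High-priority requests are served first but cost one token each.
        while self.high and self.tokens >= 1.0 and len(served) < capacity:
            served.append(self.high.popleft())
            self.tokens -= 1.0
        # Remaining capacity is guaranteed to low-priority tenants.
        while self.low and len(served) < capacity:
            served.append(self.low.popleft())
        return served
```

With `tokens_per_tick=1` and per-tick capacity 2, a backlog of high-priority requests still leaves one slot per tick for low-priority tenants, which is the starvation-avoidance property described above.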

A Learning-based Dynamic Load Balancing Approach for Microservice Systems in Multi-cloud Environment

Jieqi Cui, Pengfei Chen and Guangba Yu

Multi-cloud environments have become common as companies seek to avoid cloud vendor lock-in for security and cost reasons. Meanwhile, the microservice architecture is often adopted for its flexibility. Combining multi-cloud with microservices raises the problem of routing requests among all possible microservice instances in a multi-cloud environment. This paper presents a learning-based approach to route requests so as to balance the load. In our approach, the performance of a microservice is modeled explicitly through machine learning models, which derive the response time from the request volume, the routing decision, and other cloud metrics. A balanced routing decision is then obtained by optimizing over the model with Bayesian Optimization. With this approach, the request routing decision can adapt to dynamic runtime metrics instead of remaining static across different circumstances. Explicit performance modeling also avoids time-consuming searches on the actual microservice system. Experiments show that our approach reduces average response time by at least 10%.
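The workflow above is: predict response time as a function of load split, then optimize the split against the model rather than against the live system. A tiny two-cloud sketch, where a made-up queueing-style curve stands in for the paper's learned ML model and grid search stands in for Bayesian Optimization:

```python
# Sketch: optimize a route split against a response-time model instead of
# probing the real system. The model and parameters are illustrative only.

def response_time(load, capacity):
    # Simple M/M/1-style latency curve: blows up as load nears capacity.
    if load >= capacity:
        return float("inf")
    return 1.0 / (capacity - load)

def best_split(total_load, capacities, steps=1000):
    """Search the fraction of traffic routed to cloud A (two clouds)."""
    best_f, best_t = 0.0, float("inf")
    for i in range(steps + 1):
        f = i / steps
        la, lb = f * total_load, (1 - f) * total_load
        # Average response time, weighted by each cloud's traffic share.
        t = (f * response_time(la, capacities[0])
             + (1 - f) * response_time(lb, capacities[1]))
        if t < best_t:
            best_f, best_t = f, t
    return best_f
```

For two equal-capacity deployments the model-driven optimum is the even split, and with asymmetric capacities the split shifts toward the faster cloud; Bayesian Optimization replaces the grid search when the model is expensive to evaluate or the decision space is larger.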

Enhancing Availability for the MEC Service: CVaR-based Computation Offloading

Shengli Pan, Zhiyong Zhang, Tao Xue and Guangmin Hu

Mobile Edge Computing (MEC) enables mobile users to offload their computation loads to nearby edge servers, and is expected to be integrated into the 5G architecture to support a variety of low-latency applications and services. However, an edge server can quickly become overloaded when its computation resources are heavily requested, and would then fail to process all of its received computation loads in time. Unlike most existing schemes, which instruct the overloaded edge server to transfer computation loads to the remote cloud, we make use of spare computation resources on other local edge servers while explicitly taking the risk of network link failures into account. We measure such link failure risks with the financial risk management metric of Conditional Value-at-Risk (CVaR) and constrain it in the offloading decisions using a Minimum Cost Flow (MCF) problem formulation. Numerical results validate the enhanced availability of the MEC service under our risk-aware offloading scheme.
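CVaR at confidence level α is the expected loss in the worst (1 − α) fraction of outcomes, which is what makes it suitable for penalizing rare-but-severe link failures. A sample-based sketch of the metric itself (the paper's MCF formulation is not reproduced here):

```python
# Sample-based Conditional Value-at-Risk: the mean of the worst
# (1 - alpha) fraction of observed losses.

def cvar(losses, alpha):
    """CVaR_alpha over a list of sampled losses (larger = worse)."""
    if not 0.0 <= alpha < 1.0:
        raise ValueError("alpha must be in [0, 1)")
    ordered = sorted(losses, reverse=True)
    k = max(1, int(round(len(ordered) * (1.0 - alpha))))
    worst = ordered[:k]
    return sum(worst) / len(worst)
```

Unlike plain Value-at-Risk, which only gives the loss threshold at the α-quantile, CVaR averages everything beyond that threshold, so a single catastrophic link-failure scenario still moves the objective.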

Session Chair

Yufeng Zhan (Beijing Institute of Technology)

Session D2

Performance of Distributed Systems

4:20 PM — 5:20 PM HKT
Dec 2 Wed, 3:20 AM — 4:20 AM EST

Intermediate Value Size Aware Coded MapReduce

Yamei Dong, Bin Tang, Baoliu Ye, Zhihao Qu and Sanglu Lu

MapReduce is a commonly used framework for parallel processing of data-intensive tasks, but its performance usually suffers from the heavy communication load incurred by shuffling intermediate values (IVs) among computing servers. Recently, the Coded MapReduce framework was proposed, which uses a coding scheme named coded distributed computing (CDC) to trade extra computation for reduced communication load in MapReduce. CDC achieves the optimal computation-communication tradeoff when all IVs have the same size. In many practical applications, however, the sizes of IVs vary over a large range, leading to inferior performance. In this paper, we introduce a generalized CDC scheme that takes the sizes of IVs into account, and we formulate a combinatorial optimization problem that minimizes the communication load for a fixed computation load. We show that the problem is NP-hard and propose a very efficient algorithm that achieves an approximation ratio of 2. Experiments conducted on Alibaba Cloud show that, compared to the original CDC scheme, our IV-size-aware approach significantly reduces the communication load and achieves a lower total execution time.
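For context, the equal-IV-size tradeoff that this paper generalizes is the CDC result of Li et al.: with K servers and computation load r (each Map task replicated on r servers), coded multicasting in the shuffle phase reduces the normalized communication load by a factor of r. A one-function sketch of that baseline curve:

```python
# Baseline CDC computation-communication tradeoff for equal-size IVs:
# L(r) = (1/r) * (1 - r/K) for K servers and computation load r.
# r = 1 recovers uncoded MapReduce's load of 1 - 1/K.

def cdc_comm_load(r, K):
    """Normalized shuffle communication load under CDC."""
    if not 1 <= r <= K:
        raise ValueError("need 1 <= r <= K")
    return (1.0 / r) * (1.0 - r / K)
```

For K = 10 servers, doubling the Map work (r = 2) more than halves the shuffle load (0.9 → 0.4); the paper's contribution is choosing how to realize this tradeoff when IV sizes are unequal, which this formula does not capture.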

A Customized Reinforcement Learning based Binary Offloading in Edge Cloud

Yuepeng Li, Lvhao Chen, Deze Zeng and Lin Gu

To tackle the scarcity of computation resources on end devices, task offloading has been developed to reduce task completion time and improve Quality-of-Service (QoS). The edge cloud facilitates such offloading by provisioning resources in the proximity of end devices. Modern applications are usually deployed as a chain of subtasks (e.g., microservices), to which a special offloading strategy, referred to as binary offloading, can be applied. Binary offloading divides the chain into two parts, executed on the end device and the edge cloud, respectively. The offloading point in the chain is therefore critical to QoS in terms of task completion time. Considering the system dynamics and algorithm sensitivity, we apply Q-learning to address this problem. To deal with the late feedback problem, we propose a reward rewind match strategy to customize Q-learning. Trace-driven simulation results show that our customized Q-learning based approach achieves a significant reduction in total execution time, outperforming both traditional offloading strategies and non-customized Q-learning.
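In binary offloading the action space is just the split point in the subtask chain, so the learning problem can be sketched as a tabular Q-learner over split points. The costs below are hypothetical and the paper's reward rewind match strategy for late feedback is not reproduced; this only illustrates the action space and reward signal:

```python
# Minimal tabular Q-learning sketch for choosing a binary offloading
# split point. Single-state setting: one Q-value per candidate split.
import random

def completion_time(split, local_cost, edge_cost, transfer_cost):
    # Subtasks [0, split) run on the device, [split, n) on the edge cloud;
    # transfer_cost[split] is paid once at the hand-off point.
    return (sum(local_cost[:split]) + transfer_cost[split]
            + sum(edge_cost[split:]))

def learn_split(local_cost, edge_cost, transfer_cost,
                episodes=500, seed=0):
    rng = random.Random(seed)
    n = len(local_cost)
    q = [0.0] * (n + 1)                 # Q-value per split point 0..n
    for _ in range(episodes):
        if rng.random() < 0.2:          # epsilon-greedy exploration
            a = rng.randrange(n + 1)
        else:
            a = max(range(n + 1), key=lambda s: q[s])
        reward = -completion_time(a, local_cost, edge_cost, transfer_cost)
        q[a] += 0.5 * (reward - q[a])   # single-state Q-update
    return max(range(n + 1), key=lambda s: q[s])
```

With a fast edge cloud and a cheap initial upload, the learner settles on offloading the whole chain (split 0); the paper's customization addresses the harder case where the reward arrives late relative to the decision.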

Optimal Use Of The TCP/IP Stack in User-space Storage Applications with ADQ feature in NIC

Ziye Yang, Ben Walker, James R Harris, Yadong Li, and Gang Cao

For storage applications based on TCP/IP, the performance of the TCP/IP stack is often the dominant driver of the application's overall performance. In this paper, we introduce Tbooster, a library for TCP/IP based storage applications that optimally leverages the Linux kernel's TCP/IP stack and the socket interface to improve performance. The library efficiently groups connections onto threads and operates in an asynchronous, poll-mode fashion, scaling to a massive number of TCP connections on each thread. Tbooster avoids expensive system calls by batching and merging operations into a single operation per connection for each poll loop. Further, Tbooster leverages the Application Device Queues (ADQ) feature available in some Network Interface Cards to steer incoming data to the correct NIC receive queue. This avoids expensive coordination and message passing within the kernel when handling incoming data, and especially improves outlier tail latencies of requests. Compared with a more standard usage of the Linux kernel TCP/IP stack, Tbooster significantly improves storage I/Os per second (e.g., 9% to 22.7% IOPS improvement for an iSCSI target at 4KiB I/O size, and 36.3% to 59.4% IOPS improvement for an NVMe-oF TCP target at 8KiB I/O size). Moreover, with the ADQ feature of Intel's 100GbE NIC, Tbooster reduces the 99.99th-percentile tail latency by 30% on average under heavy workloads for NVMe-oF TCP.
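The poll-mode pattern described above, many nonblocking connections grouped on one thread and drained with a single readiness poll per loop iteration, can be sketched in a few lines. This is only an illustration of the pattern; Tbooster itself is a native library with much deeper syscall batching and ADQ queue steering:

```python
# Sketch of a poll-mode event loop: one selector poll per iteration,
# then at most one recv() per ready connection.
import selectors

def poll_once(sel, inbox):
    """One non-blocking iteration: drain every readable connection."""
    for key, _mask in sel.select(timeout=0):
        conn = key.fileobj
        data = conn.recv(4096)     # single syscall per ready connection
        if data:
            inbox.setdefault(conn, bytearray()).extend(data)
```

Calling `poll_once` in a tight loop on a dedicated thread approximates the asynchronous, per-thread connection grouping the paper describes; the kernel-side win from ADQ is that the data for these connections already arrives on the NIC queue owned by that thread.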

Session Chair

Shengli Pan (China University of Geosciences (Wuhan))
