Running compute-intensive tasks across a fifth-generation mobile network of edge devices introduces distributed computing challenges: edge devices are heterogeneous in their compute, storage, and communication capabilities, and they can exhibit unpredictable straggler effects and failures. In this work, we propose an error-correcting-code-inspired strategy for executing computing tasks in edge computing environments, designed to mitigate the variability in response times and the errors caused by edge devices’ heterogeneity and lack of reliability. Unlike prior coding approaches, we incorporate partially completed coded tasks into our computation recovery, which allows us to achieve smooth performance degradation with low-complexity decoding when the coded tasks are run on edge devices under a fixed deadline. By carrying out coding on the edge devices as well as on a master node, the proposed computing scheme also alleviates communication bottlenecks during data shuffling and is amenable to distributed implementation in a highly variable and limited network. Such distributed encoding raises new decoding challenges, which we address. Using a representative implementation based on federated multi-task learning frameworks, we carry out extensive performance simulations, which demonstrate that the proposed strategy offers significant gains in latency and accuracy over conventional coded computing schemes.