Fast edge resource scaling with distributed DNN

Giannakas, Theodoros; Tsilimantos, Dimitrios; Destounis, Apostolos; Spyropoulos, Thrasyvoulos
IEEE Transactions on Network and Service Management, 27 January 2025

Network slicing has been proposed as a paradigm for 5G+ networks. The operators slice physical resources from the edge all the way to the datacenter, and are responsible for micro-managing the allocation of these resources among tenants bound by predefined Service Level Agreements (SLAs). A key task, for which recent works have advocated the use of Deep Neural Networks (DNNs), is tracking the tenant demand and scaling its resources. Nevertheless, for the edge resources (e.g., the RAN), a question arises on whether operators can: (a) scale them fast enough (often on the order of milliseconds) and (b) afford to transmit huge amounts of data towards a remote cloud where such a DNN model might operate. We propose a Distributed DNN (DDNN) architecture for a class of such problems: a small subset of the DNN layers at the edge attempts to act as a fast, standalone resource allocator; this is complemented by a mechanism to intelligently offload a percentage of (harder) decisions to additional DNN layers running at a remote cloud. To implement the offloading, we propose: (i) a Bayes-inspired method, using dropout during inference, to estimate the confidence in the local prediction; (ii) a learnable function which automatically classifies samples as “remote” (to be offloaded) or “local”. Using the public Milano dataset, we investigate how such a DDNN should be trained and operated to address (a) and (b). In some cases, our offloading methods are near-optimal, resolving up to 50% of decisions locally with little or no penalty on the allocation cost.
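The dropout-based confidence estimate in (i) can be illustrated with a minimal sketch. This is not the authors' implementation: the single-layer "edge model", the use of the standard deviation across stochastic passes as the confidence signal, the fixed `threshold`, and all function and parameter names below are assumptions made purely for illustration.

```python
import random
import statistics


def edge_forward(x, weights, drop_p, rng):
    """One stochastic pass of a toy single-layer edge model (hypothetical).

    Dropout stays active at inference: each weight is zeroed with
    probability drop_p, and survivors are rescaled by 1 / (1 - drop_p).
    """
    kept = [0.0 if rng.random() < drop_p else w / (1.0 - drop_p) for w in weights]
    return sum(w * xi for w, xi in zip(kept, x))


def mc_dropout(x, weights, drop_p=0.2, passes=50, seed=0):
    """Repeat the stochastic pass and summarise the predictions."""
    rng = random.Random(seed)
    samples = [edge_forward(x, weights, drop_p, rng) for _ in range(passes)]
    return statistics.mean(samples), statistics.pstdev(samples)


def decide(x, weights, threshold, drop_p=0.2, passes=50, seed=0):
    """Label a sample "remote" when the spread of predictions exceeds
    the (assumed) threshold, otherwise keep the decision "local"."""
    mean, std = mc_dropout(x, weights, drop_p, passes, seed)
    label = "remote" if std > threshold else "local"
    return label, mean, std


if __name__ == "__main__":
    x = [1.0, 2.0, 3.0]          # illustrative demand features
    w = [0.5, -1.0, 2.0]         # illustrative edge-layer weights
    print(decide(x, w, threshold=1.0))
```

In this sketch a larger spread across passes is read as lower confidence in the local prediction, so the sample is handed to the remote layers; raising `threshold` keeps more decisions at the edge at the risk of a higher allocation cost.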


Type: Journal
Date: 2025-01-27
Department: Communication systems
Eurecom Ref: 8046
Copyright: © 2025 IEEE. Personal use of this material is permitted. However, permission to reprint/republish this material for advertising or promotional purposes or for creating new collective works for resale or redistribution to servers or lists, or to reuse any copyrighted component of this work in other works must be obtained from the IEEE.

PERMALINK: https://www.eurecom.fr/publication/8046