Large-scale network simulation over heterogeneous computing architecture

Ben Romdhanne, Bilel

Simulation is a primary step in the evaluation of modern networked systems. As emerging networks grow more complex, the scalability and efficiency of the simulation tool become key to deriving valuable results. Discrete event simulation is recognized as the most scalable model and copes with both parallel and distributed architectures. Nevertheless, recent hardware provides new heterogeneous computing resources that can be exploited in parallel.

The main scope of this thesis is to provide new mechanisms and optimizations that enable efficient and scalable parallel simulation on heterogeneous computing nodes that include multicore CPUs and GPUs. To address efficiency, we propose to describe events that differ only in their data as a single entry, which reduces the event management cost. At run time, the proposed hybrid scheduler dispatches and injects the events on the most appropriate computing target, based on the event descriptor and on the current load obtained through a feedback mechanism, such that the hardware usage rate is maximized. Results show a significant gain of 100 times compared to traditional CPU-based approaches.
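The two ideas in this paragraph, event grouping and load-aware hybrid dispatch, can be illustrated with a minimal Python sketch. It is not the implementation developed in the thesis: the names GroupedEvent, group_events, HybridScheduler and gpu_threshold are hypothetical, the grouping key (timestamp and event type) is an assumption, and the "load" is a simple counter of pending events standing in for the real feedback mechanism.

```python
import heapq
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass(order=True)
class GroupedEvent:
    """One queue entry standing for many events that differ only in their data."""
    timestamp: float
    kind: str
    payloads: list = field(default_factory=list, compare=False)


def group_events(events):
    """Merge (timestamp, kind, payload) tuples into a heap of grouped entries.

    Assumption: events with the same timestamp and kind are interchangeable
    except for their payload, so they can share a single descriptor.
    """
    buckets = defaultdict(list)
    for timestamp, kind, payload in events:
        buckets[(timestamp, kind)].append(payload)
    queue = [GroupedEvent(t, k, p) for (t, k), p in buckets.items()]
    heapq.heapify(queue)
    return queue


class HybridScheduler:
    """Toy dispatcher choosing 'cpu' or 'gpu' from group size and current load."""

    def __init__(self, gpu_threshold=4):
        # Hypothetical tuning knob: smaller groups are not worth a GPU launch.
        self.gpu_threshold = gpu_threshold
        # Pending event counts per target, updated by dispatch/report_done.
        self.load = {"cpu": 0, "gpu": 0}

    def pick_target(self, group):
        if len(group.payloads) < self.gpu_threshold:
            return "cpu"
        # Large group: prefer the GPU unless its backlog exceeds the CPU's.
        return "gpu" if self.load["gpu"] <= self.load["cpu"] else "cpu"

    def dispatch(self, group):
        target = self.pick_target(group)
        self.load[target] += len(group.payloads)
        return target

    def report_done(self, target, count):
        """Feedback path: a target reports how many events it has finished."""
        self.load[target] = max(0, self.load[target] - count)
```

In this sketch, nine "rx" events sharing one timestamp occupy a single queue entry instead of nine, which is where the reduction in event management cost comes from. The scheduler then routes that entry as a whole, and the report_done calls play the role of the feedback loop that keeps both targets busy.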

To increase the scalability of the system, we propose a new simulation model, denoted general purpose coordinator-master-worker (GP-CMW), to jointly address the challenges of distributed and parallel simulation at different levels. The performance of a distributed simulation that relies on the GP-CMW architecture tends toward the maximal theoretical efficiency in a homogeneous deployment. The scalability of this simulation model is validated on the largest European GPU-based supercomputer, the TGCC Curie, with 1024 logical processes (LPs), each of which simulates up to 1 million nodes.

To further validate the efficiency and scalability of the proposed mechanisms and optimizations, we applied event grouping, hybrid scheduling, and GP-CMW to the popular network simulator NS-3. Results demonstrate a gain of 25 times under a large-scale deployment including 288 GPUs and 1152 CPUs on the TGCC Curie.

Systèmes de Communication
Eurecom Ref:
© TELECOM ParisTech. Personal use of this material is permitted. The definitive version of this paper was published in Thesis and is available at :