Data Science Department


The amount of Data in our world has been exploding. E-commerce, financial applications, billing and customer services, multimedia and social applications - to name a few examples - will continue to fuel exponential growth of large pools of Data that have become a competitive asset for a wide range of companies and scientific institutions. Nowadays, the ability to store, analyze, and ultimately explain the underlying meaning of Data has become ever more accessible as trends such as Moore’s Law in computing, its equivalent in digital storage, and Cloud Computing continue to lower costs and other technology barriers. As a result, a new breed of innovative and better services can be envisioned, with benefits for businesses, their customers and the general public.


Our vision is defined through an interdisciplinary approach to research, merging contributions from computer science, web science, machine learning and statistics, and addressing numerous applied problems. The study of data analysis comes with its own challenges, such as the development of methods, algorithms and ultimately computer programs for making reliable inferences from high-dimensional and heterogeneous data. As a consequence, the Data Science research program is centered on four activities: semantically integrating and enriching data, modeling and understanding data, designing and analyzing scalable computational approaches to machine learning, and building systems that store and process vast amounts of Data. Ultimately, our research enables new and improved applications to emerge in a multitude of domains.


Our research areas

The main research lines underpinning our academic and industrial projects involve the development of a solid foundation of systems and theoretical tools to interact with, manipulate and model Data:

  • Machine learning, deep learning and statistical modeling
  • Large-scale data mining and fusion
  • Information extraction and knowledge base population
  • Game theory, adversarial learning and economic models of Data
  • Distributed systems and data management systems

In addition, our work develops around several application domains that cover the multitude and variety of modern Data sources:

  • Multimedia Data: image and video platforms, connected TVs
  • Machine/Sensor Data: smart cities, web of things, IoT, smart grids and ICT security
  • User Generated Data: social Data processing


The Data Science cloud computing platform

Our cloud computing platform enables teaching and research in key areas such as data science tools and applications, parallel and distributed systems, and high-performance cloud architectures. Hosted in our private data center, the platform features 1000+ computing cores, 2.5 TB of RAM and several hundred TB of storage, backed by a well-provisioned network fabric.

The platform can host both traditional, virtualization-based services and modern micro-service architectures that use Docker containers. In particular, we have developed the concept of analytics-as-a-service, which lets our users focus on their data science challenges rather than on the low-level intricacies of distributed data management systems.