Contact

Prométhée Spathis
Head of the CNI program at SU

Email: master.info.digit-cni@upmc.fr

 

 

Network Data Analysis

 

Person Responsible for Module (Name, Mail address):

Anastasios GIOVANIDIS

anastasios.giovanidis@lip6.fr

Credit Points (ECTS): 6
Module-ID: MU5IN076
University: Sorbonne Université
Department: Master Informatique

 

Prerequisites for Participation

The NDA course is largely self-contained and has only two prerequisites:

  • Students should have followed and understood an elementary course in probability, and should be comfortable with the notions of random variables, expectation, variance, and probability distributions.
  • Students should have basic knowledge of, and some experience with, the Python programming language.

Intended Learning Outcomes

Students who have successfully completed the module should have sufficient background to understand machine learning techniques and to manipulate data using the Python language and its libraries, especially for network-oriented problems. They should be able to estimate basic parameters from raw data, and to perform regression and classification tasks for prediction. Furthermore, they should be able to apply certain anomaly detection techniques and to study data described by time series. These capabilities are extremely important for computer scientists who wish to work in academia or industry, since it is now crucial to validate models against data and to use available measurements to estimate and predict network performance in very complex network structures.

Content

The course spans statistics, machine learning, and time series, together with their applications to networks. Students first build their knowledge of data manipulation and statistics in order to estimate, with some confidence, unknown network parameters from available data (e.g. traffic measurements), or to make decisions under uncertainty. The core of the course covers machine learning topics including regression, classification, and feature selection, as well as unsupervised learning with applications in clustering and anomaly detection. The course concludes with a study of time series, which are often available in industrial environments, e.g. from network operators. The course combines theory with practice. Every subject is treated in a 2-hour lecture where the basic notions and analysis are presented; it is then complemented by a 2-hour Python laboratory, in which students directly apply their new knowledge to provided datasets and familiarise themselves with existing code libraries. In this way students get direct hands-on experience of the notions taught in each class.
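As an illustration of the first topic (estimating an unknown network parameter "with some confidence"), the following is a minimal sketch rather than actual course material. It assumes hypothetical traffic measurements and uses a normal approximation (z = 1.96 for roughly 95% coverage), which is only a rough guide for small samples; the function name and the data are invented for the example.

```python
import math
import statistics


def mean_confidence_interval(samples, z=1.96):
    """Approximate confidence interval for the mean of `samples`.

    Uses the normal approximation: mean +/- z * s / sqrt(n), where s is
    the sample standard deviation. z = 1.96 corresponds to about 95%.
    """
    n = len(samples)
    if n < 2:
        raise ValueError("at least two samples are needed")
    mean = statistics.mean(samples)
    half_width = z * statistics.stdev(samples) / math.sqrt(n)
    return mean - half_width, mean + half_width


# Hypothetical link throughput measurements (Mbit/s), invented for this sketch.
throughput = [94.1, 97.3, 92.8, 95.5, 96.0, 93.7, 98.2, 94.9]
low, high = mean_confidence_interval(throughput)
print(f"estimated mean throughput lies in [{low:.2f}, {high:.2f}] Mbit/s")
```

For small samples, a Student t quantile would normally replace the fixed z value.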

Teaching and Learning Methods

  • 2 h of lectures per week
  • 2 h of integrated interactive tutorials per week (problem solving, discussion of assignments, lab sessions)

Assessment and Grading Procedures

The course is taught over 14 lectures, each of which comprises 2 hours of teaching and 2 hours of laboratory exercises in Python. The grade is based 40% on the student's performance in the Python labs and 60% on a final examination. Students are therefore expected to deliver their code for the Python exercise after every lab. These exercises are graded and together account for 40% of the total grade. There will be no mid-term written examination, but a project exercise may also be requested. The final examination, which accounts for 60% of the grade, will be written.
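The 40%/60% weighting can be written out as a short calculation. This is only a sketch of the stated weights: the function name and the example marks are hypothetical, both marks are assumed to be on the same scale, and how the individual lab marks are aggregated into a single lab mark is not specified here.

```python
def final_grade(lab_grade: float, exam_grade: float) -> float:
    """Combine the lab and exam marks using the stated 40% / 60% weights.

    Both marks are assumed to be on the same scale.
    """
    return 0.4 * lab_grade + 0.6 * exam_grade


# Hypothetical marks, invented for illustration only.
print(round(final_grade(lab_grade=15.0, exam_grade=12.0), 2))
```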

Workload calculation (contact hours, homework, exam preparation, ...)

  • 4 h of weekly contact hours x 14 weeks = 56 h
  • 5 h of weekly preparation and follow-up work x 14 weeks = 70 h
  • Exam preparation: 24 h
  • Total: 150 h

Frequency and dates

Offered every Fall semester:

  • Classes start in mid-September and end at the end of January;
  • Mid-semester exam around the beginning of November;
  • Final exam at the beginning of February;
  • Makeup exam the following September for those who failed the first session.

Max. Number of Participants

30

Enrolment Procedures

By request to the head of the CNI program

Recommended Reading, Course Material

  • H. Pishro-Nik, "Introduction to Probability, Statistics, and Random Processes", Kappa Research LLC, 2014, available at https://www.probabilitycourse.com
  • G. James, D. Witten, T. Hastie, R. Tibshirani, "An Introduction to Statistical Learning (with Applications in R)", Springer, DOI 10.1007/978-1-4614-7138-7, ISSN 1431-875X, ISBN 978-1-4614-7137-0, see also http://www-bcf.usc.edu/~gareth/ISL/
  • C. M. Bishop, "Pattern Recognition and Machine Learning", Springer, 2006, ISBN 978-0-387-31073-2