Machine learning in Python with scikit-learn

What you will learn

At the end of this course, you will be able to:

Description

Predictive modeling is a pillar of modern data science. In this field, scikit-learn is a central tool: it is easily accessible yet powerful, and it dovetails naturally with the wider ecosystem of data-science tools based on the Python programming language.

This course is an in-depth introduction to predictive modeling with scikit-learn. Step-by-step, didactic lessons introduce the fundamental methodological and software tools of machine learning. As such, the course is a stepping stone to more advanced challenges in artificial intelligence, text mining, or data science.

The course is more than a cookbook: it will teach you to be critical about each step in the design of a predictive modeling pipeline, from choices in data preprocessing to choosing models, gaining insights into their failure modes, and interpreting their predictions.
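To make the idea of a predictive modeling pipeline concrete, here is a minimal sketch of what such a pipeline can look like in scikit-learn. It is not taken from the course materials: the synthetic dataset, the choice of `StandardScaler` for preprocessing, and the choice of `LogisticRegression` as the model are all assumptions made for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic data stands in for a real dataset (assumption for illustration).
X, y = make_classification(n_samples=200, n_features=5, random_state=0)

# Chain preprocessing and model so that the scaler is fit on the
# training folds only during cross-validation.
model = make_pipeline(StandardScaler(), LogisticRegression())

# Evaluate on held-out folds rather than on the data used for fitting.
scores = cross_val_score(model, X, y, cv=5)
print(f"Accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Each line of this sketch corresponds to a design decision that can be questioned: whether scaling is the right preprocessing, whether a linear model suits the data, and whether accuracy on five folds is a trustworthy measure.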

The training will be essentially practical, focusing on examples of applications with code executed by the participants.

The MOOC is free of charge, and all the course materials are available at https://inria.github.io/scikit-learn-mooc/.

The authors of the course are scikit-learn core developers, and they will be your guides throughout the training!

Format

The course will cover practical aspects through the use of Jupyter notebooks and regular exercises. Throughout the course, we will highlight scikit-learn best practices and give you the intuition to use scikit-learn in a methodologically sound way.

Prerequisites

The course aims to be accessible without a strong technical background. The requirements for this course are:
- basic knowledge of Python programming: defining variables, writing functions, importing modules
- some prior experience with the NumPy, pandas, and Matplotlib libraries is recommended but not required

For a quick introduction to these libraries, you can use the following resources: "Introduction to NumPy and Matplotlib" by Sebastian Raschka and "10 minutes to pandas".
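As a rough self-check for the first prerequisite, the short sketch below uses only the three skills listed: defining variables, writing a function, and importing a module. It is an illustrative example rather than part of the course, and the function name `mean_and_std` is hypothetical.

```python
import math  # importing a module from the standard library

# Defining a variable
values = [1.0, 2.0, 3.0, 4.0]


# Writing a function
def mean_and_std(numbers):
    """Return the mean and population standard deviation of a list of numbers."""
    mean = sum(numbers) / len(numbers)
    variance = sum((x - mean) ** 2 for x in numbers) / len(numbers)
    return mean, math.sqrt(variance)


mean, std = mean_and_std(values)
print(mean, std)
```

If you can read this code and predict roughly what it prints, you likely have the Python background the course assumes.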

Assessment and certification

Students' work in the course is assessed through quizzes after the lessons and programming exercises at the end of every module.

An Open Badge for successful completion of the course will be issued on request to learners who obtain an overall score of at least 60% correct answers across all the quizzes and programming exercises.

Course plan

Course team

Arturo Amor

Arturo Amor is an engineer at Inria. He is in charge of broadening the scikit-learn documentation's accessibility to all kinds of users.

Loïc Estève

Loïc Estève is a research engineer at Inria. He has been a scikit-learn core developer since 2016.

Olivier Grisel

Olivier Grisel is a machine learning engineer at Inria. He has been a scikit-learn core developer since 2010.

Guillaume Lemaître

Guillaume Lemaître is a research engineer at Inria. He has been a scikit-learn core developer since 2017.

Gaël Varoquaux

Gaël Varoquaux is a research director at Inria. He is one of the creators of scikit-learn and the project manager for the scikit-learn consortium.

Thomas Schmitt

Thomas Schmitt is a machine learning engineer at Inria.

Organizations

Partnership

Hosting the Jupyter notebook execution environment for this MOOC.

Social networks

Follow us on Twitter at @InriaLearnLab and feel free to use the #ScikitLearnMooc hashtag.

License

License for the course content

Attribution

You are free to:

Under the following terms:

License for the content created by course participants