{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# NumPy Indexing and Selection\n", "\n", "In this session, we discuss how to select single elements or groups of elements from an array." ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "import numpy as np" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [], "source": [ "# Create a simple array\n", "arr = np.arange(0,11)" ] }, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Show\n", "arr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing and Selection\n", "The simplest way to pick one or several elements of an array looks very similar to Python lists:" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "8" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get the value at an index\n", "arr[8]" ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([1, 2, 3, 4])" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get the values in a range of indices\n", "arr[1:5]" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4])" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get the values in a range of indices\n", "arr[0:5]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Broadcasting\n", "\n", "NumPy arrays differ from a normal Python list in their ability to broadcast:" ] }, { 
"cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([100, 100, 100, 100, 100, 5, 6, 7, 8, 9, 10])" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Set a value over an index range (broadcasting)\n", "arr[0:5]=100\n", "\n", "# Show\n", "arr" ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Reset the array; the reason for the reset\n", "# becomes clear further down.\n", "arr = np.arange(0,11)\n", "\n", "# Show\n", "arr" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([0, 1, 2, 3, 4, 5])" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Important notes on slices\n", "slice_of_arr = arr[0:6]\n", "\n", "# Show the slice\n", "slice_of_arr" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([99, 99, 99, 99, 99, 99])" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Change the slice\n", "slice_of_arr[:] = 99\n", "\n", "# Show the slice again\n", "slice_of_arr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Now note that the changes also occur in our original array!" ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The data is not copied: the slice is a view of the original array! This avoids memory problems, since slicing a large array does not duplicate its data." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# To get a copy, you need to be explicit\n", "arr_copy = arr.copy()\n", "\n", "arr_copy" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Indexing a 2D array (matrices)\n", "\n", "The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I usually recommend the comma notation for clarity." ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([[ 5, 10, 15],\n", " [20, 25, 30],\n", " [35, 40, 45]])" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))\n", "\n", "# Show\n", "arr_2d" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([20, 25, 30])" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Index a row\n", "arr_2d[1]\n" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "20" ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The format is arr_2d[row][col] or arr_2d[row,col]\n", "\n", "# Get the value of an individual element\n", "arr_2d[1][0]" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "20" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Get the value of an individual element\n", "arr_2d[1,0]" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "data": { 
"text/plain": [ "array([[10, 15],\n", " [25, 30]])" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Slicing a 2D array\n", "\n", "# Shape (2,2) from the top right corner\n", "arr_2d[:2,1:]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([35, 40, 45])" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Bottom row\n", "arr_2d[2]" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([35, 40, 45])" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Bottom row\n", "arr_2d[2,:]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## More indexing help\n", "Indexing a 2D matrix can be a bit confusing at first, especially once you start adding a step size. A Google image search for \"NumPy indexing\" turns up useful diagrams." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Conditional selection\n", "\n", "This is a fundamental concept that carries over directly to pandas later on, so make sure you understand this part!\n", "\n", "Let's briefly go over how to use square brackets for selection based on comparison operators." 
] }, { "cell_type": "code", "execution_count": 28, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr = np.arange(1,11)\n", "arr" ] }, { "cell_type": "code", "execution_count": 30, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([False, False, False, False, True, True, True, True, True, True], dtype=bool)" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr > 4" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": true }, "outputs": [], "source": [ "bool_arr = arr>4" ] }, { "cell_type": "code", "execution_count": 32, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([False, False, False, False, True, True, True, True, True, True], dtype=bool)" ] }, "execution_count": 32, "metadata": {}, "output_type": "execute_result" } ], "source": [ "bool_arr" ] }, { "cell_type": "code", "execution_count": 33, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 5, 6, 7, 8, 9, 10])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr[bool_arr]" ] }, { "cell_type": "code", "execution_count": 34, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 3, 4, 5, 6, 7, 8, 9, 10])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "arr[arr>2]" ] }, { "cell_type": "code", "execution_count": 37, "metadata": {}, "outputs": [ { "data": { "text/plain": [ "array([ 3, 4, 5, 6, 7, 8, 9, 10])" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x = 2\n", "arr[arr>x]" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Great job!\n" ] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { 
"codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.5" } }, "nbformat": 4, "nbformat_minor": 1 }