{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {
    "collapsed": true
   },
   "source": [
    "# DataFrames\n",
    "\n",
    "Les DataFrames sont le point central de pandas et sont directement inspirés par le langage de programmation R. Nous pouvons considérer un DataFrame comme un ensemble d'objets Series assemblés et qui partagent le même index. Utilisons pandas pour explorer ce sujet !"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from numpy.random import randn\n",
    "np.random.seed(101)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.DataFrame(randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118 -0.319318 -0.848077  0.605965\n",
       "C -2.018168  0.740122  0.528813 -0.589001\n",
       "D  0.188695 -0.758872 -0.933237  0.955057\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 4
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Sélection et Indexation\n",
    "\n",
    "Apprenons les différentes méthodes pour récupérer des données à partir d'une DataFrame"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "A    2.706850\n",
       "B    0.651118\n",
       "C   -2.018168\n",
       "D    0.188695\n",
       "E    0.190794\n",
       "Name: W, dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 5
    }
   ],
   "source": [
    "df['W']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         Z\n",
       "A  2.706850  0.503826\n",
       "B  0.651118  0.605965\n",
       "C -2.018168 -0.589001\n",
       "D  0.188695  0.955057\n",
       "E  0.190794  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 6
    }
   ],
   "source": [
    "# Passer une liste de noms de colonnes\n",
    "df[['W','Z']]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "A    2.706850\n",
       "B    0.651118\n",
       "C   -2.018168\n",
       "D    0.188695\n",
       "E    0.190794\n",
       "Name: W, dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 7
    }
   ],
   "source": [
    "# Syntaxe SQL (Non Recommandée!)\n",
    "df.W"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Les colonnes d'un DataFrame sont juste des séries"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "pandas.core.series.Series"
      ]
     },
     "metadata": {},
     "execution_count": 8
    }
   ],
   "source": [
    "type(df['W'])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Création d'une nouvelle colonne:**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['new'] = df['W'] + df['Y']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z       new\n",
       "A  2.706850  0.628133  0.907969  0.503826  3.614819\n",
       "B  0.651118 -0.319318 -0.848077  0.605965 -0.196959\n",
       "C -2.018168  0.740122  0.528813 -0.589001 -1.489355\n",
       "D  0.188695 -0.758872 -0.933237  0.955057 -0.744542\n",
       "E  0.190794  1.978757  2.605967  0.683509  2.796762"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n      <th>new</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n      <td>3.614819</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n      <td>-0.196959</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n      <td>-1.489355</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n      <td>-0.744542</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n      <td>2.796762</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 10
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Supression d'une colonne**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118 -0.319318 -0.848077  0.605965\n",
       "C -2.018168  0.740122  0.528813 -0.589001\n",
       "D  0.188695 -0.758872 -0.933237  0.955057\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 11
    }
   ],
   "source": [
    "df.drop('new',axis=1)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z       new\n",
       "A  2.706850  0.628133  0.907969  0.503826  3.614819\n",
       "B  0.651118 -0.319318 -0.848077  0.605965 -0.196959\n",
       "C -2.018168  0.740122  0.528813 -0.589001 -1.489355\n",
       "D  0.188695 -0.758872 -0.933237  0.955057 -0.744542\n",
       "E  0.190794  1.978757  2.605967  0.683509  2.796762"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n      <th>new</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n      <td>3.614819</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n      <td>-0.196959</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n      <td>-1.489355</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n      <td>-0.744542</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n      <td>2.796762</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 12
    }
   ],
   "source": [
    "# Pas de remplacement sauf si spécifié!\n",
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.drop('new',axis=1,inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118 -0.319318 -0.848077  0.605965\n",
       "C -2.018168  0.740122  0.528813 -0.589001\n",
       "D  0.188695 -0.758872 -0.933237  0.955057\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 14
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "On peut aussi supprimer une ligne de cette façon:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118 -0.319318 -0.848077  0.605965\n",
       "C -2.018168  0.740122  0.528813 -0.589001\n",
       "D  0.188695 -0.758872 -0.933237  0.955057"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 15
    }
   ],
   "source": [
    "df.drop('E',axis=0)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Sélection de lignes**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "W    2.706850\n",
       "X    0.628133\n",
       "Y    0.907969\n",
       "Z    0.503826\n",
       "Name: A, dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 16
    }
   ],
   "source": [
    "df.loc['A']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Ou sélectionner en fonction de la position au lieu de l'étiquette "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "W   -2.018168\n",
       "X    0.740122\n",
       "Y    0.528813\n",
       "Z   -0.589001\n",
       "Name: C, dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 17
    }
   ],
   "source": [
    "df.iloc[2]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "**Sélection d'un sous-ensemble de lignes et de colonnes**"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "-0.8480769834036315"
      ]
     },
     "metadata": {},
     "execution_count": 18
    }
   ],
   "source": [
    "df.loc['B','Y']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         Y\n",
       "A  2.706850  0.907969\n",
       "B  0.651118 -0.848077"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>Y</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.907969</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.848077</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 19
    }
   ],
   "source": [
    "df.loc[['A','B'],['W','Y']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Sélection conditionnelle\n",
    "\n",
    "Une caractéristique importante de pandas est la sélection conditionnelle à l'aide des crochets, très similaire à celle de numpy :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118 -0.319318 -0.848077  0.605965\n",
       "C -2.018168  0.740122  0.528813 -0.589001\n",
       "D  0.188695 -0.758872 -0.933237  0.955057\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 20
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "       W      X      Y      Z\n",
       "A   True   True   True   True\n",
       "B   True  False  False   True\n",
       "C  False   True   True  False\n",
       "D   True  False  False   True\n",
       "E   True   True   True   True"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>True</td>\n      <td>True</td>\n      <td>True</td>\n      <td>True</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>True</td>\n      <td>False</td>\n      <td>False</td>\n      <td>True</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>False</td>\n      <td>True</td>\n      <td>True</td>\n      <td>False</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>True</td>\n      <td>False</td>\n      <td>False</td>\n      <td>True</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>True</td>\n      <td>True</td>\n      <td>True</td>\n      <td>True</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 21
    }
   ],
   "source": [
    "df>0"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118       NaN       NaN  0.605965\n",
       "C       NaN  0.740122  0.528813       NaN\n",
       "D  0.188695       NaN       NaN  0.955057\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>NaN</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>NaN</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>NaN</td>\n      <td>NaN</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 22
    }
   ],
   "source": [
    "df[df>0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118 -0.319318 -0.848077  0.605965\n",
       "D  0.188695 -0.758872 -0.933237  0.955057\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 23
    }
   ],
   "source": [
    "df[df['W']>0]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 24,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "A    0.907969\n",
       "B   -0.848077\n",
       "D   -0.933237\n",
       "E    2.605967\n",
       "Name: Y, dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 24
    }
   ],
   "source": [
    "df[df['W']>0]['Y']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          Y         X\n",
       "A  0.907969  0.628133\n",
       "B -0.848077 -0.319318\n",
       "D -0.933237 -0.758872\n",
       "E  2.605967  1.978757"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>Y</th>\n      <th>X</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>0.907969</td>\n      <td>0.628133</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>-0.848077</td>\n      <td>-0.319318</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>-0.933237</td>\n      <td>-0.758872</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>2.605967</td>\n      <td>1.978757</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 25
    }
   ],
   "source": [
    "df[df['W']>0][['Y','X']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Pour 2 conditions, vous pouvez utiliser | et & avec des parenthèses:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 26
    }
   ],
   "source": [
    "df[(df['W']>0) & (df['Y'] > 1)]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Plus de détails sur l'Index\n",
    "\n",
    "Discutons d'autres caractéristiques de l'indexation, y compris la réinitialisation de l'index ou la définition d'une autre fonction. Nous parlerons aussi de la hiérarchie des indices !"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z\n",
       "A  2.706850  0.628133  0.907969  0.503826\n",
       "B  0.651118 -0.319318 -0.848077  0.605965\n",
       "C -2.018168  0.740122  0.528813 -0.589001\n",
       "D  0.188695 -0.758872 -0.933237  0.955057\n",
       "E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 27
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "  index         W         X         Y         Z\n",
       "0     A  2.706850  0.628133  0.907969  0.503826\n",
       "1     B  0.651118 -0.319318 -0.848077  0.605965\n",
       "2     C -2.018168  0.740122  0.528813 -0.589001\n",
       "3     D  0.188695 -0.758872 -0.933237  0.955057\n",
       "4     E  0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>index</th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>0</th>\n      <td>A</td>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>1</th>\n      <td>B</td>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>C</td>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>D</td>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>4</th>\n      <td>E</td>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 28
    }
   ],
   "source": [
    "# Réinitialisation de l'indice par défaut 0,1...n\n",
    "df.reset_index()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [],
   "source": [
    "newind = 'CA NY WY OR CO'.split()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [],
   "source": [
    "df['States'] = newind"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z States\n",
       "A  2.706850  0.628133  0.907969  0.503826     CA\n",
       "B  0.651118 -0.319318 -0.848077  0.605965     NY\n",
       "C -2.018168  0.740122  0.528813 -0.589001     WY\n",
       "D  0.188695 -0.758872 -0.933237  0.955057     OR\n",
       "E  0.190794  1.978757  2.605967  0.683509     CO"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n      <th>States</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n      <td>CA</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n      <td>NY</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n      <td>WY</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n      <td>OR</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n      <td>CO</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 31
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "               W         X         Y         Z\n",
       "States                                        \n",
       "CA      2.706850  0.628133  0.907969  0.503826\n",
       "NY      0.651118 -0.319318 -0.848077  0.605965\n",
       "WY     -2.018168  0.740122  0.528813 -0.589001\n",
       "OR      0.188695 -0.758872 -0.933237  0.955057\n",
       "CO      0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n    <tr>\n      <th>States</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>CA</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>NY</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>WY</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>OR</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>CO</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 32
    }
   ],
   "source": [
    "df.set_index('States')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          W         X         Y         Z States\n",
       "A  2.706850  0.628133  0.907969  0.503826     CA\n",
       "B  0.651118 -0.319318 -0.848077  0.605965     NY\n",
       "C -2.018168  0.740122  0.528813 -0.589001     WY\n",
       "D  0.188695 -0.758872 -0.933237  0.955057     OR\n",
       "E  0.190794  1.978757  2.605967  0.683509     CO"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n      <th>States</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>A</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n      <td>CA</td>\n    </tr>\n    <tr>\n      <th>B</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n      <td>NY</td>\n    </tr>\n    <tr>\n      <th>C</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n      <td>WY</td>\n    </tr>\n    <tr>\n      <th>D</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n      <td>OR</td>\n    </tr>\n    <tr>\n      <th>E</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n      <td>CO</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 33
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.set_index('States',inplace=True)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 35,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "               W         X         Y         Z\n",
       "States                                        \n",
       "CA      2.706850  0.628133  0.907969  0.503826\n",
       "NY      0.651118 -0.319318 -0.848077  0.605965\n",
       "WY     -2.018168  0.740122  0.528813 -0.589001\n",
       "OR      0.188695 -0.758872 -0.933237  0.955057\n",
       "CO      0.190794  1.978757  2.605967  0.683509"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>W</th>\n      <th>X</th>\n      <th>Y</th>\n      <th>Z</th>\n    </tr>\n    <tr>\n      <th>States</th>\n      <th></th>\n      <th></th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>CA</th>\n      <td>2.706850</td>\n      <td>0.628133</td>\n      <td>0.907969</td>\n      <td>0.503826</td>\n    </tr>\n    <tr>\n      <th>NY</th>\n      <td>0.651118</td>\n      <td>-0.319318</td>\n      <td>-0.848077</td>\n      <td>0.605965</td>\n    </tr>\n    <tr>\n      <th>WY</th>\n      <td>-2.018168</td>\n      <td>0.740122</td>\n      <td>0.528813</td>\n      <td>-0.589001</td>\n    </tr>\n    <tr>\n      <th>OR</th>\n      <td>0.188695</td>\n      <td>-0.758872</td>\n      <td>-0.933237</td>\n      <td>0.955057</td>\n    </tr>\n    <tr>\n      <th>CO</th>\n      <td>0.190794</td>\n      <td>1.978757</td>\n      <td>2.605967</td>\n      <td>0.683509</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 35
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Multi-index et hiérarchie des indices\n",
    "\n",
    "Voyons comment travailler avec un Multi-Index, nous allons d'abord créer un exemple rapide de ce à quoi ressemblerait un DataFrame Multi-Indexé :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Niveaux d'Index\n",
    "outside = ['G1','G1','G1','G2','G2','G2']\n",
    "inside = [1,2,3,1,2,3]\n",
    "hier_index = list(zip(outside,inside))\n",
    "hier_index = pd.MultiIndex.from_tuples(hier_index)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "MultiIndex([('G1', 1),\n",
       "            ('G1', 2),\n",
       "            ('G1', 3),\n",
       "            ('G2', 1),\n",
       "            ('G2', 2),\n",
       "            ('G2', 3)],\n",
       "           )"
      ]
     },
     "metadata": {},
     "execution_count": 37
    }
   ],
   "source": [
    "hier_index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "             A         B\n",
       "G1 1  0.302665  1.693723\n",
       "   2 -1.706086 -1.159119\n",
       "   3 -0.134841  0.390528\n",
       "G2 1  0.166905  0.184502\n",
       "   2  0.807706  0.072960\n",
       "   3  0.638787  0.329646"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th></th>\n      <th>A</th>\n      <th>B</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th rowspan=\"3\" valign=\"top\">G1</th>\n      <th>1</th>\n      <td>0.302665</td>\n      <td>1.693723</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>-1.706086</td>\n      <td>-1.159119</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>-0.134841</td>\n      <td>0.390528</td>\n    </tr>\n    <tr>\n      <th rowspan=\"3\" valign=\"top\">G2</th>\n      <th>1</th>\n      <td>0.166905</td>\n      <td>0.184502</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>0.807706</td>\n      <td>0.072960</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>0.638787</td>\n      <td>0.329646</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 38
    }
   ],
   "source": [
    "df = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])\n",
    "df"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Maintenant, montrons comment indexer ceci ! Pour la hiérarchie d'index nous utilisons df.loc[], si c'était sur l'axe des colonnes, vous n'utiliseriez que la notation normale entre crochets df[]. L'appel d'un niveau de l'index retourne un sous-dataframe :"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 39,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "          A         B\n",
       "1  0.302665  1.693723\n",
       "2 -1.706086 -1.159119\n",
       "3 -0.134841  0.390528"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>A</th>\n      <th>B</th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>1</th>\n      <td>0.302665</td>\n      <td>1.693723</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>-1.706086</td>\n      <td>-1.159119</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>-0.134841</td>\n      <td>0.390528</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 39
    }
   ],
   "source": [
    "df.loc['G1']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "A    0.302665\n",
       "B    1.693723\n",
       "Name: 1, dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 40
    }
   ],
   "source": [
    "df.loc['G1'].loc[1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "FrozenList([None, None])"
      ]
     },
     "metadata": {},
     "execution_count": 41
    }
   ],
   "source": [
    "df.index.names"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [],
   "source": [
    "df.index.names = ['Group','Num']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "                  A         B\n",
       "Group Num                    \n",
       "G1    1    0.302665  1.693723\n",
       "      2   -1.706086 -1.159119\n",
       "      3   -0.134841  0.390528\n",
       "G2    1    0.166905  0.184502\n",
       "      2    0.807706  0.072960\n",
       "      3    0.638787  0.329646"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th></th>\n      <th>A</th>\n      <th>B</th>\n    </tr>\n    <tr>\n      <th>Group</th>\n      <th>Num</th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th rowspan=\"3\" valign=\"top\">G1</th>\n      <th>1</th>\n      <td>0.302665</td>\n      <td>1.693723</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>-1.706086</td>\n      <td>-1.159119</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>-0.134841</td>\n      <td>0.390528</td>\n    </tr>\n    <tr>\n      <th rowspan=\"3\" valign=\"top\">G2</th>\n      <th>1</th>\n      <td>0.166905</td>\n      <td>0.184502</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>0.807706</td>\n      <td>0.072960</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>0.638787</td>\n      <td>0.329646</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 43
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "            A         B\n",
       "Num                    \n",
       "1    0.302665  1.693723\n",
       "2   -1.706086 -1.159119\n",
       "3   -0.134841  0.390528"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>A</th>\n      <th>B</th>\n    </tr>\n    <tr>\n      <th>Num</th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>1</th>\n      <td>0.302665</td>\n      <td>1.693723</td>\n    </tr>\n    <tr>\n      <th>2</th>\n      <td>-1.706086</td>\n      <td>-1.159119</td>\n    </tr>\n    <tr>\n      <th>3</th>\n      <td>-0.134841</td>\n      <td>0.390528</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 44
    }
   ],
   "source": [
    "df.xs('G1')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 45,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "A    0.302665\n",
       "B    1.693723\n",
       "Name: (G1, 1), dtype: float64"
      ]
     },
     "metadata": {},
     "execution_count": 45
    }
   ],
   "source": [
    "df.xs(['G1',1])"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "              A         B\n",
       "Group                    \n",
       "G1     0.302665  1.693723\n",
       "G2     0.166905  0.184502"
      ],
      "text/html": "<div>\n<style scoped>\n    .dataframe tbody tr th:only-of-type {\n        vertical-align: middle;\n    }\n\n    .dataframe tbody tr th {\n        vertical-align: top;\n    }\n\n    .dataframe thead th {\n        text-align: right;\n    }\n</style>\n<table border=\"1\" class=\"dataframe\">\n  <thead>\n    <tr style=\"text-align: right;\">\n      <th></th>\n      <th>A</th>\n      <th>B</th>\n    </tr>\n    <tr>\n      <th>Group</th>\n      <th></th>\n      <th></th>\n    </tr>\n  </thead>\n  <tbody>\n    <tr>\n      <th>G1</th>\n      <td>0.302665</td>\n      <td>1.693723</td>\n    </tr>\n    <tr>\n      <th>G2</th>\n      <td>0.166905</td>\n      <td>0.184502</td>\n    </tr>\n  </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 46
    }
   ],
   "source": [
    "df.xs(1,level='Num')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Bon travail!"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3.7.9 64-bit ('pyfinance': conda)",
   "metadata": {
    "interpreter": {
     "hash": "e89404a230d8800c54ad520c7b67d1bd9bb833a07b37dd3e521a178a3dc34904"
    }
   }
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.9-final"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}