python-pour-finance/03-Pandas/03-DataFrames.ipynb

2023-08-21 15:12:19 +00:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# DataFrames\n",
"\n",
"DataFrames are the workhorse of pandas and are directly inspired by the R programming language. We can think of a DataFrame as a set of Series objects put together to share the same index. Let's use pandas to explore this topic!"
]
},
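{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the idea of Series sharing one index concrete, here is a minimal sketch that builds a small DataFrame from two Series. The names `s1`, `s2` and `demo` are illustrative and are not used elsewhere in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Two Series with the same index labels\n",
"s1 = pd.Series([1.0, 2.0, 3.0], index=['a', 'b', 'c'])\n",
"s2 = pd.Series([10.0, 20.0, 30.0], index=['a', 'b', 'c'])\n",
"\n",
"# Each Series becomes one column; the shared index becomes the row labels\n",
"demo = pd.DataFrame({'first': s1, 'second': s2})\n",
"demo"
]
},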
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from numpy.random import randn\n",
"np.random.seed(101)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame(randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 4
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Selection and Indexing\n",
"\n",
"Let's learn the various methods for retrieving data from a DataFrame."
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 2.706850\n",
"B 0.651118\n",
"C -2.018168\n",
"D 0.188695\n",
"E 0.190794\n",
"Name: W, dtype: float64"
]
},
"metadata": {},
"execution_count": 5
}
],
"source": [
"df['W']"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W Z\n",
"A 2.706850 0.503826\n",
"B 0.651118 0.605965\n",
"C -2.018168 -0.589001\n",
"D 0.188695 0.955057\n",
"E 0.190794 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 6
}
],
"source": [
"# Pass a list of column names\n",
"df[['W','Z']]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 2.706850\n",
"B 0.651118\n",
"C -2.018168\n",
"D 0.188695\n",
"E 0.190794\n",
"Name: W, dtype: float64"
]
},
"metadata": {},
"execution_count": 7
}
],
"source": [
"# Attribute (dot) access, SQL-like syntax (not recommended!)\n",
"df.W"
]
},
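{
"cell_type": "markdown",
"metadata": {},
"source": [
"One reason attribute access is discouraged: it only works when the column name is a valid Python identifier that does not clash with an existing DataFrame attribute or method. A minimal sketch, using a hypothetical frame `demo` whose single column is named `count`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"demo = pd.DataFrame({'count': [1, 2, 3]})\n",
"\n",
"# Bracket access always returns the column as a Series\n",
"print(type(demo['count']))\n",
"\n",
"# Attribute access resolves to the DataFrame.count method, not the column\n",
"print(callable(demo.count))"
]
},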
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The columns of a DataFrame are just Series."
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"metadata": {},
"execution_count": 8
}
],
"source": [
"type(df['W'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Creating a new column:**"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"df['new'] = df['W'] + df['Y']"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z new\n",
"A 2.706850 0.628133 0.907969 0.503826 3.614819\n",
"B 0.651118 -0.319318 -0.848077 0.605965 -0.196959\n",
"C -2.018168 0.740122 0.528813 -0.589001 -1.489355\n",
"D 0.188695 -0.758872 -0.933237 0.955057 -0.744542\n",
"E 0.190794 1.978757 2.605967 0.683509 2.796762"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>new</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>3.614819</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>-0.196959</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>-1.489355</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>-0.744542</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>2.796762</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 10
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Removing a column:**"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 11
}
],
"source": [
"df.drop('new',axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z new\n",
"A 2.706850 0.628133 0.907969 0.503826 3.614819\n",
"B 0.651118 -0.319318 -0.848077 0.605965 -0.196959\n",
"C -2.018168 0.740122 0.528813 -0.589001 -1.489355\n",
"D 0.188695 -0.758872 -0.933237 0.955057 -0.744542\n",
"E 0.190794 1.978757 2.605967 0.683509 2.796762"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>new</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>3.614819</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>-0.196959</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>-1.489355</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>-0.744542</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>2.796762</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 12
}
],
"source": [
"# drop() is not in place unless specified!\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"df.drop('new',axis=1,inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 14
}
],
"source": [
"df"
]
},
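{
"cell_type": "markdown",
"metadata": {},
"source": [
"An alternative to `inplace=True` is to rebind the name to the result of `drop`. The sketch below works on a hypothetical frame `demo`, so `df` is left untouched."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"demo = pd.DataFrame({'a': [1, 2], 'b': [3, 4], 'tmp': [5, 6]})\n",
"\n",
"# drop() returns a new DataFrame; columns='tmp' is the same as drop('tmp', axis=1)\n",
"demo = demo.drop(columns='tmp')\n",
"list(demo.columns)"
]
},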
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Rows can also be dropped this way:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 15
}
],
"source": [
"df.drop('E',axis=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Selecting rows**"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"W 2.706850\n",
"X 0.628133\n",
"Y 0.907969\n",
"Z 0.503826\n",
"Name: A, dtype: float64"
]
},
"metadata": {},
"execution_count": 16
}
],
"source": [
"df.loc['A']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Or select by position instead of by label:"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"W -2.018168\n",
"X 0.740122\n",
"Y 0.528813\n",
"Z -0.589001\n",
"Name: C, dtype: float64"
]
},
"metadata": {},
"execution_count": 17
}
],
"source": [
"df.iloc[2]"
]
},
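{
"cell_type": "markdown",
"metadata": {},
"source": [
"Label-based and position-based selection also differ when slicing: a `.loc` slice includes its end label, while an `.iloc` slice excludes its end position. A small sketch on a hypothetical frame `demo`:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"demo = pd.DataFrame({'v': [10, 20, 30, 40]}, index=['a', 'b', 'c', 'd'])\n",
"\n",
"by_label = demo.loc['a':'c']   # label slice: includes the end label 'c'\n",
"by_position = demo.iloc[0:2]   # position slice: excludes position 2\n",
"print(len(by_label), len(by_position))"
]
},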
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Selecting a subset of rows and columns**"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"-0.8480769834036315"
]
},
"metadata": {},
"execution_count": 18
}
],
"source": [
"df.loc['B','Y']"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W Y\n",
"A 2.706850 0.907969\n",
"B 0.651118 -0.848077"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>Y</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.907969</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.848077</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 19
}
],
"source": [
"df.loc[['A','B'],['W','Y']]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Conditional selection\n",
"\n",
"An important feature of pandas is conditional selection using bracket notation, very similar to NumPy:"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 20
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A True True True True\n",
"B True False False True\n",
"C False True True False\n",
"D True False False True\n",
"E True True True True"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n </tr>\n <tr>\n <th>B</th>\n <td>True</td>\n <td>False</td>\n <td>False</td>\n <td>True</td>\n </tr>\n <tr>\n <th>C</th>\n <td>False</td>\n <td>True</td>\n <td>True</td>\n <td>False</td>\n </tr>\n <tr>\n <th>D</th>\n <td>True</td>\n <td>False</td>\n <td>False</td>\n <td>True</td>\n </tr>\n <tr>\n <th>E</th>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 21
}
],
"source": [
"df>0"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 NaN NaN 0.605965\n",
"C NaN 0.740122 0.528813 NaN\n",
"D 0.188695 NaN NaN 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>NaN</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 22
}
],
"source": [
"df[df>0]"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 23
}
],
"source": [
"df[df['W']>0]"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 0.907969\n",
"B -0.848077\n",
"D -0.933237\n",
"E 2.605967\n",
"Name: Y, dtype: float64"
]
},
"metadata": {},
"execution_count": 24
}
],
"source": [
"df[df['W']>0]['Y']"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Y X\n",
"A 0.907969 0.628133\n",
"B -0.848077 -0.319318\n",
"D -0.933237 -0.758872\n",
"E 2.605967 1.978757"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Y</th>\n <th>X</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>0.907969</td>\n <td>0.628133</td>\n </tr>\n <tr>\n <th>B</th>\n <td>-0.848077</td>\n <td>-0.319318</td>\n </tr>\n <tr>\n <th>D</th>\n <td>-0.933237</td>\n <td>-0.758872</td>\n </tr>\n <tr>\n <th>E</th>\n <td>2.605967</td>\n <td>1.978757</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 25
}
],
"source": [
"df[df['W']>0][['Y','X']]"
]
},
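{
"cell_type": "markdown",
"metadata": {},
"source": [
"Chaining two sets of brackets first builds an intermediate DataFrame and then selects from it. Here is a sketch of the same kind of selection in a single `.loc` call, which takes the row mask and the column labels together. The names `demo` and `mask` are illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"demo = pd.DataFrame({'W': [1.0, -2.0, 3.0], 'X': [4.0, 5.0, 6.0], 'Y': [7.0, 8.0, 9.0]})\n",
"\n",
"mask = demo['W'] > 0          # boolean Series, one value per row\n",
"demo.loc[mask, ['Y', 'X']]    # rows where W > 0, columns Y and X"
]
},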
{
"cell_type": "markdown",
"metadata": {},
"source": [
"To combine two conditions, use | (or) and & (and), wrapping each condition in parentheses:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 26
}
],
"source": [
"df[(df['W']>0) & (df['Y'] > 1)]"
]
},
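{
"cell_type": "markdown",
"metadata": {},
"source": [
"For completeness, here is a small sketch of `|` on a hypothetical frame `demo`. The parentheses are needed because `|` and `&` bind more tightly than comparison operators such as `>`."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"demo = pd.DataFrame({'W': [1.0, -2.0, -3.0], 'Y': [0.5, 2.0, -1.0]})\n",
"\n",
"# Keep rows where at least one of the two conditions holds\n",
"demo[(demo['W'] > 0) | (demo['Y'] > 1)]"
]
},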
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## More Index Details\n",
"\n",
"Let's look at a few more features of indexing, including resetting the index or setting it to something else. We'll also talk about index hierarchy!"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 27
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" index W X Y Z\n",
"0 A 2.706850 0.628133 0.907969 0.503826\n",
"1 B 0.651118 -0.319318 -0.848077 0.605965\n",
"2 C -2.018168 0.740122 0.528813 -0.589001\n",
"3 D 0.188695 -0.758872 -0.933237 0.955057\n",
"4 E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>index</th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>A</td>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>1</th>\n <td>B</td>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>2</th>\n <td>C</td>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>3</th>\n <td>D</td>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>4</th>\n <td>E</td>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 28
}
],
"source": [
"# Reset to the default index 0, 1, ..., n-1\n",
"df.reset_index()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"newind = 'CA NY WY OR CO'.split()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"df['States'] = newind"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z States\n",
"A 2.706850 0.628133 0.907969 0.503826 CA\n",
"B 0.651118 -0.319318 -0.848077 0.605965 NY\n",
"C -2.018168 0.740122 0.528813 -0.589001 WY\n",
"D 0.188695 -0.758872 -0.933237 0.955057 OR\n",
"E 0.190794 1.978757 2.605967 0.683509 CO"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>States</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>CA</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>NY</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>WY</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>OR</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>CO</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 31
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"States \n",
"CA 2.706850 0.628133 0.907969 0.503826\n",
"NY 0.651118 -0.319318 -0.848077 0.605965\n",
"WY -2.018168 0.740122 0.528813 -0.589001\n",
"OR 0.188695 -0.758872 -0.933237 0.955057\n",
"CO 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n <tr>\n <th>States</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>CA</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>NY</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>WY</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>OR</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>CO</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 32
}
],
"source": [
"df.set_index('States')"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z States\n",
"A 2.706850 0.628133 0.907969 0.503826 CA\n",
"B 0.651118 -0.319318 -0.848077 0.605965 NY\n",
"C -2.018168 0.740122 0.528813 -0.589001 WY\n",
"D 0.188695 -0.758872 -0.933237 0.955057 OR\n",
"E 0.190794 1.978757 2.605967 0.683509 CO"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>States</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>CA</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>NY</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>WY</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>OR</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>CO</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 33
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"df.set_index('States',inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"States \n",
"CA 2.706850 0.628133 0.907969 0.503826\n",
"NY 0.651118 -0.319318 -0.848077 0.605965\n",
"WY -2.018168 0.740122 0.528813 -0.589001\n",
"OR 0.188695 -0.758872 -0.933237 0.955057\n",
"CO 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n <tr>\n <th>States</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>CA</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>NY</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>WY</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>OR</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>CO</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 35
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multi-Index and Index Hierarchy\n",
"\n",
"Let's go over how to work with a Multi-Index. We'll first create a quick example of what a Multi-Indexed DataFrame looks like:"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"# Index levels\n",
"outside = ['G1','G1','G1','G2','G2','G2']\n",
"inside = [1,2,3,1,2,3]\n",
"hier_index = list(zip(outside,inside))\n",
"hier_index = pd.MultiIndex.from_tuples(hier_index)"
]
},
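{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a side note, when the index is every combination of a few labels, `pd.MultiIndex.from_product` can build it without zipping lists by hand. The cell below is only a minimal sketch: the names `alt_index` and `zipped_index` are illustrative and are not used elsewhere in this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Every (group, number) combination, generated as a Cartesian product\n",
"alt_index = pd.MultiIndex.from_product([['G1', 'G2'], [1, 2, 3]])\n",
"\n",
"# The same pairs built with zip, as in the cell above, for comparison\n",
"zipped_index = pd.MultiIndex.from_tuples(\n",
"    list(zip(['G1', 'G1', 'G1', 'G2', 'G2', 'G2'], [1, 2, 3, 1, 2, 3]))\n",
")\n",
"\n",
"# Compare the two indexes (same tuples, same order)\n",
"alt_index.equals(zipped_index)"
]
},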
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"MultiIndex([('G1', 1),\n",
" ('G1', 2),\n",
" ('G1', 3),\n",
" ('G2', 1),\n",
" ('G2', 2),\n",
" ('G2', 3)],\n",
" )"
]
},
"metadata": {},
"execution_count": 37
}
],
"source": [
"hier_index"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"G1 1 0.302665 1.693723\n",
" 2 -1.706086 -1.159119\n",
" 3 -0.134841 0.390528\n",
"G2 1 0.166905 0.184502\n",
" 2 0.807706 0.072960\n",
" 3 0.638787 0.329646"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G1</th>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G2</th>\n <th>1</th>\n <td>0.166905</td>\n <td>0.184502</td>\n </tr>\n <tr>\n <th>2</th>\n <td>0.807706</td>\n <td>0.072960</td>\n </tr>\n <tr>\n <th>3</th>\n <td>0.638787</td>\n <td>0.329646</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 38
}
],
"source": [
"df = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now let's see how to index this! For an index hierarchy on the rows we use `df.loc[]`; if the hierarchy were on the columns axis, you would use the normal bracket notation `df[]` instead. Selecting one label of the outer level returns a sub-DataFrame:"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"1 0.302665 1.693723\n",
"2 -1.706086 -1.159119\n",
"3 -0.134841 0.390528"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 39
}
],
"source": [
"df.loc['G1']"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 0.302665\n",
"B 1.693723\n",
"Name: 1, dtype: float64"
]
},
"metadata": {},
"execution_count": 40
}
],
"source": [
"df.loc['G1'].loc[1]"
]
},
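{
"cell_type": "markdown",
"metadata": {},
"source": [
"The chained call above performs two separate lookups. A single `.loc` with a tuple reaches the same row in one step, and adding a column label narrows the result to one value. This is a minimal sketch on a small hand-made frame: `demo`, `demo_index` and the values in them are illustrative assumptions, not data from this notebook."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical two-level frame with easy-to-read values\n",
"demo_index = pd.MultiIndex.from_product([['G1', 'G2'], [1, 2, 3]])\n",
"demo = pd.DataFrame(\n",
"    {'A': [10, 20, 30, 40, 50, 60], 'B': [1, 2, 3, 4, 5, 6]},\n",
"    index=demo_index,\n",
")\n",
"\n",
"# One tuple selects the row at group 'G1', number 1; ':' keeps all columns\n",
"row = demo.loc[('G1', 1), :]\n",
"\n",
"# Tuple plus a column label selects a single cell\n",
"value = demo.loc[('G1', 1), 'A']\n",
"\n",
"row"
]
},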
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"FrozenList([None, None])"
]
},
"metadata": {},
"execution_count": 41
}
],
"source": [
"df.index.names"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"df.index.names = ['Group','Num']"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"Group Num \n",
"G1 1 0.302665 1.693723\n",
" 2 -1.706086 -1.159119\n",
" 3 -0.134841 0.390528\n",
"G2 1 0.166905 0.184502\n",
" 2 0.807706 0.072960\n",
" 3 0.638787 0.329646"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n <tr>\n <th>Group</th>\n <th>Num</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G1</th>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G2</th>\n <th>1</th>\n <td>0.166905</td>\n <td>0.184502</td>\n </tr>\n <tr>\n <th>2</th>\n <td>0.807706</td>\n <td>0.072960</td>\n </tr>\n <tr>\n <th>3</th>\n <td>0.638787</td>\n <td>0.329646</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 43
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"Num \n",
"1 0.302665 1.693723\n",
"2 -1.706086 -1.159119\n",
"3 -0.134841 0.390528"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n <tr>\n <th>Num</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 44
}
],
"source": [
"df.xs('G1')"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 0.302665\n",
"B 1.693723\n",
"Name: (G1, 1), dtype: float64"
]
},
"metadata": {},
"execution_count": 45
}
],
"source": [
"df.xs(('G1',1))"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"Group \n",
"G1 0.302665 1.693723\n",
"G2 0.166905 0.184502"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n <tr>\n <th>Group</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>G1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>G2</th>\n <td>0.166905</td>\n <td>0.184502</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 46
}
],
"source": [
"df.xs(1,level='Num')"
]
},
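{
"cell_type": "markdown",
"metadata": {},
"source": [
"By default, `xs` drops the level it selects on. If you want to keep both index levels in the result, one alternative is `.loc` with `slice(None)` standing in for the outer level. A minimal sketch, again on an illustrative hand-made frame (`demo`, `by_xs` and `by_loc` are not part of the original notebook):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Hypothetical two-level frame with named levels\n",
"demo_index = pd.MultiIndex.from_product(\n",
"    [['G1', 'G2'], [1, 2, 3]], names=['Group', 'Num']\n",
")\n",
"demo = pd.DataFrame({'A': range(6), 'B': range(6, 12)}, index=demo_index)\n",
"\n",
"# Cross-section: rows where Num == 1; the 'Num' level is dropped\n",
"by_xs = demo.xs(1, level='Num')\n",
"\n",
"# Same rows via .loc; slice(None) means \"all groups\" and both levels are kept\n",
"by_loc = demo.loc[(slice(None), 1), :]\n",
"\n",
"by_loc"
]
},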
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Great job!"
]
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3.7.9 64-bit ('pyfinance': conda)",
"metadata": {
"interpreter": {
"hash": "e89404a230d8800c54ad520c7b67d1bd9bb833a07b37dd3e521a178a3dc34904"
}
}
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9-final"
}
},
"nbformat": 4,
"nbformat_minor": 1
}