python-pour-finance/03-Pandas/03-DataFrames.ipynb

1139 lines
51 KiB
Plaintext
Raw Normal View History

2023-08-21 15:12:19 +00:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {
"collapsed": true
},
"source": [
"# DataFrames\n",
"\n",
"Les DataFrames sont le point central de pandas et sont directement inspirés par le langage de programmation R. Nous pouvons considérer un DataFrame comme un ensemble d'objets Series assemblés et qui partagent le même index. Utilisons pandas pour explorer ce sujet !"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"from numpy.random import randn\n",
"np.random.seed(101)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"df = pd.DataFrame(randn(5,4),index='A B C D E'.split(),columns='W X Y Z'.split())"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 4
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Sélection et Indexation\n",
"\n",
"Apprenons les différentes méthodes pour récupérer des données à partir d'une DataFrame"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 2.706850\n",
"B 0.651118\n",
"C -2.018168\n",
"D 0.188695\n",
"E 0.190794\n",
"Name: W, dtype: float64"
]
},
"metadata": {},
"execution_count": 5
}
],
"source": [
"df['W']"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W Z\n",
"A 2.706850 0.503826\n",
"B 0.651118 0.605965\n",
"C -2.018168 -0.589001\n",
"D 0.188695 0.955057\n",
"E 0.190794 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 6
}
],
"source": [
"# Passer une liste de noms de colonnes\n",
"df[['W','Z']]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 2.706850\n",
"B 0.651118\n",
"C -2.018168\n",
"D 0.188695\n",
"E 0.190794\n",
"Name: W, dtype: float64"
]
},
"metadata": {},
"execution_count": 7
}
],
"source": [
"# Syntaxe SQL (Non Recommandée!)\n",
"df.W"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Les colonnes d'un DataFrame sont juste des séries"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"pandas.core.series.Series"
]
},
"metadata": {},
"execution_count": 8
}
],
"source": [
"type(df['W'])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Création d'une nouvelle colonne:**"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"df['new'] = df['W'] + df['Y']"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z new\n",
"A 2.706850 0.628133 0.907969 0.503826 3.614819\n",
"B 0.651118 -0.319318 -0.848077 0.605965 -0.196959\n",
"C -2.018168 0.740122 0.528813 -0.589001 -1.489355\n",
"D 0.188695 -0.758872 -0.933237 0.955057 -0.744542\n",
"E 0.190794 1.978757 2.605967 0.683509 2.796762"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>new</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>3.614819</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>-0.196959</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>-1.489355</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>-0.744542</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>2.796762</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 10
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Supression d'une colonne**"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 11
}
],
"source": [
"df.drop('new',axis=1)"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z new\n",
"A 2.706850 0.628133 0.907969 0.503826 3.614819\n",
"B 0.651118 -0.319318 -0.848077 0.605965 -0.196959\n",
"C -2.018168 0.740122 0.528813 -0.589001 -1.489355\n",
"D 0.188695 -0.758872 -0.933237 0.955057 -0.744542\n",
"E 0.190794 1.978757 2.605967 0.683509 2.796762"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>new</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>3.614819</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>-0.196959</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>-1.489355</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>-0.744542</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>2.796762</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 12
}
],
"source": [
"# Pas de remplacement sauf si spécifié!\n",
"df"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"df.drop('new',axis=1,inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 14
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"On peut aussi supprimer une ligne de cette façon:"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 15
}
],
"source": [
"df.drop('E',axis=0)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Sélection de lignes**"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"W 2.706850\n",
"X 0.628133\n",
"Y 0.907969\n",
"Z 0.503826\n",
"Name: A, dtype: float64"
]
},
"metadata": {},
"execution_count": 16
}
],
"source": [
"df.loc['A']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Ou sélectionner en fonction de la position au lieu de l'étiquette "
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"W -2.018168\n",
"X 0.740122\n",
"Y 0.528813\n",
"Z -0.589001\n",
"Name: C, dtype: float64"
]
},
"metadata": {},
"execution_count": 17
}
],
"source": [
"df.iloc[2]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Sélection d'un sous-ensemble de lignes et de colonnes**"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"-0.8480769834036315"
]
},
"metadata": {},
"execution_count": 18
}
],
"source": [
"df.loc['B','Y']"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W Y\n",
"A 2.706850 0.907969\n",
"B 0.651118 -0.848077"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>Y</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.907969</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.848077</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 19
}
],
"source": [
"df.loc[['A','B'],['W','Y']]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Sélection conditionnelle\n",
"\n",
"Une caractéristique importante de pandas est la sélection conditionnelle à l'aide des crochets, très similaire à celle de numpy :"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 20
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A True True True True\n",
"B True False False True\n",
"C False True True False\n",
"D True False False True\n",
"E True True True True"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n </tr>\n <tr>\n <th>B</th>\n <td>True</td>\n <td>False</td>\n <td>False</td>\n <td>True</td>\n </tr>\n <tr>\n <th>C</th>\n <td>False</td>\n <td>True</td>\n <td>True</td>\n <td>False</td>\n </tr>\n <tr>\n <th>D</th>\n <td>True</td>\n <td>False</td>\n <td>False</td>\n <td>True</td>\n </tr>\n <tr>\n <th>E</th>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n <td>True</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 21
}
],
"source": [
"df>0"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 NaN NaN 0.605965\n",
"C NaN 0.740122 0.528813 NaN\n",
"D 0.188695 NaN NaN 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>NaN</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>NaN</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>NaN</td>\n <td>NaN</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 22
}
],
"source": [
"df[df>0]"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 23
}
],
"source": [
"df[df['W']>0]"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 0.907969\n",
"B -0.848077\n",
"D -0.933237\n",
"E 2.605967\n",
"Name: Y, dtype: float64"
]
},
"metadata": {},
"execution_count": 24
}
],
"source": [
"df[df['W']>0]['Y']"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" Y X\n",
"A 0.907969 0.628133\n",
"B -0.848077 -0.319318\n",
"D -0.933237 -0.758872\n",
"E 2.605967 1.978757"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>Y</th>\n <th>X</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>0.907969</td>\n <td>0.628133</td>\n </tr>\n <tr>\n <th>B</th>\n <td>-0.848077</td>\n <td>-0.319318</td>\n </tr>\n <tr>\n <th>D</th>\n <td>-0.933237</td>\n <td>-0.758872</td>\n </tr>\n <tr>\n <th>E</th>\n <td>2.605967</td>\n <td>1.978757</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 25
}
],
"source": [
"df[df['W']>0][['Y','X']]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Pour 2 conditions, vous pouvez utiliser | et & avec des parenthèses:"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 26
}
],
"source": [
"df[(df['W']>0) & (df['Y'] > 1)]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Plus de détails sur l'Index\n",
"\n",
"Discutons d'autres caractéristiques de l'indexation, y compris la réinitialisation de l'index ou la définition d'une autre fonction. Nous parlerons aussi de la hiérarchie des indices !"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"A 2.706850 0.628133 0.907969 0.503826\n",
"B 0.651118 -0.319318 -0.848077 0.605965\n",
"C -2.018168 0.740122 0.528813 -0.589001\n",
"D 0.188695 -0.758872 -0.933237 0.955057\n",
"E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 27
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" index W X Y Z\n",
"0 A 2.706850 0.628133 0.907969 0.503826\n",
"1 B 0.651118 -0.319318 -0.848077 0.605965\n",
"2 C -2.018168 0.740122 0.528813 -0.589001\n",
"3 D 0.188695 -0.758872 -0.933237 0.955057\n",
"4 E 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>index</th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>0</th>\n <td>A</td>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>1</th>\n <td>B</td>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>2</th>\n <td>C</td>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>3</th>\n <td>D</td>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>4</th>\n <td>E</td>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 28
}
],
"source": [
"# Réinitialisation de l'indice par défaut 0,1...n\n",
"df.reset_index()"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [],
"source": [
"newind = 'CA NY WY OR CO'.split()"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [],
"source": [
"df['States'] = newind"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z States\n",
"A 2.706850 0.628133 0.907969 0.503826 CA\n",
"B 0.651118 -0.319318 -0.848077 0.605965 NY\n",
"C -2.018168 0.740122 0.528813 -0.589001 WY\n",
"D 0.188695 -0.758872 -0.933237 0.955057 OR\n",
"E 0.190794 1.978757 2.605967 0.683509 CO"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>States</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>CA</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>NY</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>WY</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>OR</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>CO</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 31
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"States \n",
"CA 2.706850 0.628133 0.907969 0.503826\n",
"NY 0.651118 -0.319318 -0.848077 0.605965\n",
"WY -2.018168 0.740122 0.528813 -0.589001\n",
"OR 0.188695 -0.758872 -0.933237 0.955057\n",
"CO 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n <tr>\n <th>States</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>CA</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>NY</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>WY</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>OR</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>CO</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 32
}
],
"source": [
"df.set_index('States')"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z States\n",
"A 2.706850 0.628133 0.907969 0.503826 CA\n",
"B 0.651118 -0.319318 -0.848077 0.605965 NY\n",
"C -2.018168 0.740122 0.528813 -0.589001 WY\n",
"D 0.188695 -0.758872 -0.933237 0.955057 OR\n",
"E 0.190794 1.978757 2.605967 0.683509 CO"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n <th>States</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>A</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n <td>CA</td>\n </tr>\n <tr>\n <th>B</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n <td>NY</td>\n </tr>\n <tr>\n <th>C</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n <td>WY</td>\n </tr>\n <tr>\n <th>D</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n <td>OR</td>\n </tr>\n <tr>\n <th>E</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n <td>CO</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 33
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [],
"source": [
"df.set_index('States',inplace=True)"
]
},
{
"cell_type": "code",
"execution_count": 35,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" W X Y Z\n",
"States \n",
"CA 2.706850 0.628133 0.907969 0.503826\n",
"NY 0.651118 -0.319318 -0.848077 0.605965\n",
"WY -2.018168 0.740122 0.528813 -0.589001\n",
"OR 0.188695 -0.758872 -0.933237 0.955057\n",
"CO 0.190794 1.978757 2.605967 0.683509"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>W</th>\n <th>X</th>\n <th>Y</th>\n <th>Z</th>\n </tr>\n <tr>\n <th>States</th>\n <th></th>\n <th></th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>CA</th>\n <td>2.706850</td>\n <td>0.628133</td>\n <td>0.907969</td>\n <td>0.503826</td>\n </tr>\n <tr>\n <th>NY</th>\n <td>0.651118</td>\n <td>-0.319318</td>\n <td>-0.848077</td>\n <td>0.605965</td>\n </tr>\n <tr>\n <th>WY</th>\n <td>-2.018168</td>\n <td>0.740122</td>\n <td>0.528813</td>\n <td>-0.589001</td>\n </tr>\n <tr>\n <th>OR</th>\n <td>0.188695</td>\n <td>-0.758872</td>\n <td>-0.933237</td>\n <td>0.955057</td>\n </tr>\n <tr>\n <th>CO</th>\n <td>0.190794</td>\n <td>1.978757</td>\n <td>2.605967</td>\n <td>0.683509</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 35
}
],
"source": [
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Multi-index et hiérarchie des indices\n",
"\n",
"Voyons comment travailler avec un Multi-Index, nous allons d'abord créer un exemple rapide de ce à quoi ressemblerait un DataFrame Multi-Indexé :"
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"# Niveaux d'Index\n",
"outside = ['G1','G1','G1','G2','G2','G2']\n",
"inside = [1,2,3,1,2,3]\n",
"hier_index = list(zip(outside,inside))\n",
"hier_index = pd.MultiIndex.from_tuples(hier_index)"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"MultiIndex([('G1', 1),\n",
" ('G1', 2),\n",
" ('G1', 3),\n",
" ('G2', 1),\n",
" ('G2', 2),\n",
" ('G2', 3)],\n",
" )"
]
},
"metadata": {},
"execution_count": 37
}
],
"source": [
"hier_index"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"G1 1 0.302665 1.693723\n",
" 2 -1.706086 -1.159119\n",
" 3 -0.134841 0.390528\n",
"G2 1 0.166905 0.184502\n",
" 2 0.807706 0.072960\n",
" 3 0.638787 0.329646"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G1</th>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G2</th>\n <th>1</th>\n <td>0.166905</td>\n <td>0.184502</td>\n </tr>\n <tr>\n <th>2</th>\n <td>0.807706</td>\n <td>0.072960</td>\n </tr>\n <tr>\n <th>3</th>\n <td>0.638787</td>\n <td>0.329646</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 38
}
],
"source": [
"df = pd.DataFrame(np.random.randn(6,2),index=hier_index,columns=['A','B'])\n",
"df"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Maintenant, montrons comment indexer ceci ! Pour la hiérarchie d'index nous utilisons df.loc[], si c'était sur l'axe des colonnes, vous n'utiliseriez que la notation normale entre crochets df[]. L'appel d'un niveau de l'index retourne un sous-dataframe :"
]
},
{
"cell_type": "code",
"execution_count": 39,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"1 0.302665 1.693723\n",
"2 -1.706086 -1.159119\n",
"3 -0.134841 0.390528"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 39
}
],
"source": [
"df.loc['G1']"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 0.302665\n",
"B 1.693723\n",
"Name: 1, dtype: float64"
]
},
"metadata": {},
"execution_count": 40
}
],
"source": [
"df.loc['G1'].loc[1]"
]
},
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"FrozenList([None, None])"
]
},
"metadata": {},
"execution_count": 41
}
],
"source": [
"df.index.names"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [],
"source": [
"df.index.names = ['Group','Num']"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"Group Num \n",
"G1 1 0.302665 1.693723\n",
" 2 -1.706086 -1.159119\n",
" 3 -0.134841 0.390528\n",
"G2 1 0.166905 0.184502\n",
" 2 0.807706 0.072960\n",
" 3 0.638787 0.329646"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n <tr>\n <th>Group</th>\n <th>Num</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G1</th>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n <tr>\n <th rowspan=\"3\" valign=\"top\">G2</th>\n <th>1</th>\n <td>0.166905</td>\n <td>0.184502</td>\n </tr>\n <tr>\n <th>2</th>\n <td>0.807706</td>\n <td>0.072960</td>\n </tr>\n <tr>\n <th>3</th>\n <td>0.638787</td>\n <td>0.329646</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 43
}
],
"source": [
"df"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"Num \n",
"1 0.302665 1.693723\n",
"2 -1.706086 -1.159119\n",
"3 -0.134841 0.390528"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n <tr>\n <th>Num</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>2</th>\n <td>-1.706086</td>\n <td>-1.159119</td>\n </tr>\n <tr>\n <th>3</th>\n <td>-0.134841</td>\n <td>0.390528</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 44
}
],
"source": [
"df.xs('G1')"
]
},
{
"cell_type": "code",
"execution_count": 45,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"A 0.302665\n",
"B 1.693723\n",
"Name: (G1, 1), dtype: float64"
]
},
"metadata": {},
"execution_count": 45
}
],
"source": [
"df.xs(['G1',1])"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
" A B\n",
"Group \n",
"G1 0.302665 1.693723\n",
"G2 0.166905 0.184502"
],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n <tr>\n <th>Group</th>\n <th></th>\n <th></th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>G1</th>\n <td>0.302665</td>\n <td>1.693723</td>\n </tr>\n <tr>\n <th>G2</th>\n <td>0.166905</td>\n <td>0.184502</td>\n </tr>\n </tbody>\n</table>\n</div>"
},
"metadata": {},
"execution_count": 46
}
],
"source": [
"df.xs(1,level='Num')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Bon travail!"
]
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3.7.9 64-bit ('pyfinance': conda)",
"metadata": {
"interpreter": {
"hash": "e89404a230d8800c54ad520c7b67d1bd9bb833a07b37dd3e521a178a3dc34904"
}
}
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9-final"
}
},
"nbformat": 4,
"nbformat_minor": 1
}