python-pour-finance/03-Pandas/Pandas-Exercices/Pandas Exercices - SOLUTION...

788 lines
19 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Pandas Exercices - Solutions"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Il est temps de tester vos nouvelles compétences en pandas ! Utilisez le fichier csv de ce dossier pour effectuer les tâches en gras ci-dessous!\n",
"\n",
"#### NOTE : TOUTES LES TÂCHES DOIVENT ÊTRE EFFECTUÉES SUR UNE SEULE LIGNE AVEC DU CODE PANDAS. BLOQUé(e) ? PAS DE PROBLÈME ! CONSULTEZ LE NOTEBOOK AVEC LES SOLUTIONS ! "
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Importer pandas et lire le fichier banklist.csv dans un dataframe nommé banks. "
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"banks = pd.read_csv('banklist.csv')"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Afficher les 5 premières lignes du dataframe"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Bank Name</th>\n",
" <th>City</th>\n",
" <th>ST</th>\n",
" <th>CERT</th>\n",
" <th>Acquiring Institution</th>\n",
" <th>Closing Date</th>\n",
" <th>Updated Date</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>0</th>\n",
" <td>Fayette County Bank</td>\n",
" <td>Saint Elmo</td>\n",
" <td>IL</td>\n",
" <td>1802</td>\n",
" <td>United Fidelity Bank, fsb</td>\n",
" <td>26-May-17</td>\n",
" <td>1-Jun-17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>1</th>\n",
" <td>Guaranty Bank, (d/b/a BestBank in Georgia &amp; Mi...</td>\n",
" <td>Milwaukee</td>\n",
" <td>WI</td>\n",
" <td>30003</td>\n",
" <td>First-Citizens Bank &amp; Trust Company</td>\n",
" <td>5-May-17</td>\n",
" <td>1-Jun-17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>2</th>\n",
" <td>First NBC Bank</td>\n",
" <td>New Orleans</td>\n",
" <td>LA</td>\n",
" <td>58302</td>\n",
" <td>Whitney Bank</td>\n",
" <td>28-Apr-17</td>\n",
" <td>23-May-17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>3</th>\n",
" <td>Proficio Bank</td>\n",
" <td>Cottonwood Heights</td>\n",
" <td>UT</td>\n",
" <td>35495</td>\n",
" <td>Cache Valley Bank</td>\n",
" <td>3-Mar-17</td>\n",
" <td>18-May-17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Seaway Bank and Trust Company</td>\n",
" <td>Chicago</td>\n",
" <td>IL</td>\n",
" <td>19328</td>\n",
" <td>State Bank of Texas</td>\n",
" <td>27-Jan-17</td>\n",
" <td>18-May-17</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Bank Name City ST \\\n",
"0 Fayette County Bank Saint Elmo IL \n",
"1 Guaranty Bank, (d/b/a BestBank in Georgia & Mi... Milwaukee WI \n",
"2 First NBC Bank New Orleans LA \n",
"3 Proficio Bank Cottonwood Heights UT \n",
"4 Seaway Bank and Trust Company Chicago IL \n",
"\n",
" CERT Acquiring Institution Closing Date Updated Date \n",
"0 1802 United Fidelity Bank, fsb 26-May-17 1-Jun-17 \n",
"1 30003 First-Citizens Bank & Trust Company 5-May-17 1-Jun-17 \n",
"2 58302 Whitney Bank 28-Apr-17 23-May-17 \n",
"3 35495 Cache Valley Bank 3-Mar-17 18-May-17 \n",
"4 19328 State Bank of Texas 27-Jan-17 18-May-17 "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks.head()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Quels sont les noms de colonne?"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"Index(['Bank Name', 'City', 'ST', 'CERT', 'Acquiring Institution',\n",
" 'Closing Date', 'Updated Date'],\n",
" dtype='object')"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks.columns"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Combien d'états (ST) sont représentés dans cet ensemble de données?"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"44"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks['ST'].nunique()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Obtenir une liste ou tableau de tous les états du dataset."
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array(['IL', 'WI', 'LA', 'UT', 'NJ', 'AR', 'GA', 'PA', 'TN', 'WA', 'CO',\n",
" 'PR', 'FL', 'MN', 'CA', 'MD', 'OK', 'OH', 'SC', 'VA', 'ID', 'TX',\n",
" 'CT', 'AZ', 'NV', 'NC', 'KY', 'MO', 'KS', 'AL', 'MI', 'IN', 'IA',\n",
" 'NE', 'MS', 'NM', 'OR', 'NY', 'MA', 'SD', 'WY', 'WV', 'NH', 'HI'],\n",
" dtype=object)"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks['ST'].unique()"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Quels sont les 5 états avec le plus de banques en faillite?"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {
"scrolled": true
},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"ST\n",
"GA 93\n",
"FL 75\n",
"IL 67\n",
"CA 41\n",
"MN 23\n",
"Name: Bank Name, dtype: int64"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks.groupby(\"ST\").count().sort_values('Bank Name',ascending=False).iloc[:5]['Bank Name']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Quels sont les 5 principaux établissements acquéreurs?"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"No Acquirer 31\n",
"State Bank and Trust Company 12\n",
"First-Citizens Bank & Trust Company 11\n",
"Ameris Bank 10\n",
"U.S. Bank N.A. 9\n",
"Name: Acquiring Institution, dtype: int64"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks['Acquiring Institution'].value_counts().iloc[:5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Combien de banques State Bank of Texas a-t-elle acquises ? Combien d'entre elles étaient au Texas ?"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Bank Name</th>\n",
" <th>City</th>\n",
" <th>ST</th>\n",
" <th>CERT</th>\n",
" <th>Acquiring Institution</th>\n",
" <th>Closing Date</th>\n",
" <th>Updated Date</th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>4</th>\n",
" <td>Seaway Bank and Trust Company</td>\n",
" <td>Chicago</td>\n",
" <td>IL</td>\n",
" <td>19328</td>\n",
" <td>State Bank of Texas</td>\n",
" <td>27-Jan-17</td>\n",
" <td>18-May-17</td>\n",
" </tr>\n",
" <tr>\n",
" <th>21</th>\n",
" <td>The National Republic Bank of Chicago</td>\n",
" <td>Chicago</td>\n",
" <td>IL</td>\n",
" <td>916</td>\n",
" <td>State Bank of Texas</td>\n",
" <td>24-Oct-14</td>\n",
" <td>6-Jan-16</td>\n",
" </tr>\n",
" <tr>\n",
" <th>450</th>\n",
" <td>Millennium State Bank of Texas</td>\n",
" <td>Dallas</td>\n",
" <td>TX</td>\n",
" <td>57667</td>\n",
" <td>State Bank of Texas</td>\n",
" <td>2-Jul-09</td>\n",
" <td>26-Oct-12</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Bank Name City ST CERT \\\n",
"4 Seaway Bank and Trust Company Chicago IL 19328 \n",
"21 The National Republic Bank of Chicago Chicago IL 916 \n",
"450 Millennium State Bank of Texas Dallas TX 57667 \n",
"\n",
" Acquiring Institution Closing Date Updated Date \n",
"4 State Bank of Texas 27-Jan-17 18-May-17 \n",
"21 State Bank of Texas 24-Oct-14 6-Jan-16 \n",
"450 State Bank of Texas 2-Jul-09 26-Oct-12 "
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks[banks['Acquiring Institution']=='State Bank of Texas']"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Quelle est la ville la plus fréquente en Californie pour qu'une banque fasse faillite ?"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<style scoped>\n",
" .dataframe tbody tr th:only-of-type {\n",
" vertical-align: middle;\n",
" }\n",
"\n",
" .dataframe tbody tr th {\n",
" vertical-align: top;\n",
" }\n",
"\n",
" .dataframe thead th {\n",
" text-align: right;\n",
" }\n",
"</style>\n",
"<table border=\"1\" class=\"dataframe\">\n",
" <thead>\n",
" <tr style=\"text-align: right;\">\n",
" <th></th>\n",
" <th>Bank Name</th>\n",
" <th>ST</th>\n",
" <th>CERT</th>\n",
" <th>Acquiring Institution</th>\n",
" <th>Closing Date</th>\n",
" <th>Updated Date</th>\n",
" </tr>\n",
" <tr>\n",
" <th>City</th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" <th></th>\n",
" </tr>\n",
" </thead>\n",
" <tbody>\n",
" <tr>\n",
" <th>Los Angeles</th>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" <td>4</td>\n",
" </tr>\n",
" </tbody>\n",
"</table>\n",
"</div>"
],
"text/plain": [
" Bank Name ST CERT Acquiring Institution Closing Date \\\n",
"City \n",
"Los Angeles 4 4 4 4 4 \n",
"\n",
" Updated Date \n",
"City \n",
"Los Angeles 4 "
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"banks[banks['ST']=='CA'].groupby('City').count().sort_values('Bank Name',ascending=False).head(1)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Combien de banques en faillite n'ont pas le mot \"Bank\" dans leur nom?"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"14"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# banks['Bank Name'].apply(lambda name: 'Bank' not in name).value_counts()\n",
"sum(banks['Bank Name'].apply(lambda name: 'Bank' not in name))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Combien de noms de banques commencent par la lettre 's' ?"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"53"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum(banks['Bank Name'].apply(lambda name:name[0].upper() =='S'))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Combien de valeurs CERT sont supérieures à 20000 ?"
]
},
{
"cell_type": "code",
"execution_count": 23,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 24,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"417"
]
},
"execution_count": 24,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum(banks['CERT']>20000)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"##### Combien de noms de banques se composent de seulement deux mots ? (par exemple \"First Bank\" et \"Bank Georgia\")"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"114"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"sum(banks['Bank Name'].apply(lambda name: len(name.split())==2))"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"#### Bonus: Combien de banques ont fermé en 2008 ? (c'est difficile parce que techniquement nous n'avons pas encore vu les séries temporelles avec pandas ! N'hésitez pas à passer !"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {},
"outputs": [],
"source": [
"# CODE ICI"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"25"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# ON APPRENDRA UN MEILLEUR MOYEN POUR FAIRE CELA BIENTOT!\n",
"sum(banks['Closing Date'].apply(lambda date: date[-2:]) == '08')\n",
"\n",
"# Meilleure solution\n",
"# sum(pd.to_datetime(banks['Closing Date']).apply(lambda date: date.year) == 2008)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Bon travail!"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.5"
}
},
"nbformat": 4,
"nbformat_minor": 2
}