python-pour-finance/03-Pandas/02-Series.ipynb

2023-08-21 15:12:19 +00:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Series"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The first major data type we will learn about in pandas is the Series. Let's import pandas and explore the Series object.\n",
"\n",
"A Series is very similar to a NumPy array (in fact, it is built on top of the NumPy array object). What sets a Series apart from a NumPy array is that a Series can have axis labels, meaning it can be indexed by a label instead of only by integer position. It also does not have to hold numeric data; it can hold any arbitrary Python object.\n",
"\n",
"Let's explore this concept through a few examples:"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd"
]
},
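{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a minimal sketch of the difference described above, the same values can be reached by position in a NumPy array and by label in a Series. The names `prices_arr` and `prices_ser`, the values, and the labels are made up for illustration."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"import pandas as pd\n",
"\n",
"# Hypothetical example values and labels, not taken from a real data set\n",
"prices_arr = np.array([101.5, 99.0, 102.25])\n",
"prices_ser = pd.Series(prices_arr, index=['AAA', 'BBB', 'CCC'])\n",
"\n",
"# A NumPy array is addressed by integer position only\n",
"print(prices_arr[1])\n",
"\n",
"# A Series can also be addressed by its label\n",
"print(prices_ser['BBB'])"
]
},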
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Creating a Series\n",
"\n",
"You can convert a list, a NumPy array, or a dictionary to a Series:"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"labels = ['a','b','c']\n",
"my_list = [10,20,30]\n",
"arr = np.array([10,20,30])\n",
"d = {'a':10,'b':20,'c':30}"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Using lists**"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 10\n",
"1 20\n",
"2 30\n",
"dtype: int64"
]
},
"metadata": {},
"execution_count": 3
}
],
"source": [
"pd.Series(data=my_list)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"a 10\n",
"b 20\n",
"c 30\n",
"dtype: int64"
]
},
"metadata": {},
"execution_count": 4
}
],
"source": [
"pd.Series(data=my_list,index=labels)"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"a 10\n",
"b 20\n",
"c 30\n",
"dtype: int64"
]
},
"metadata": {},
"execution_count": 5
}
],
"source": [
"pd.Series(my_list,labels)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Using NumPy arrays**"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 10\n",
"1 20\n",
"2 30\n",
"dtype: int32"
]
},
"metadata": {},
"execution_count": 6
}
],
"source": [
"pd.Series(arr)"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"a 10\n",
"b 20\n",
"c 30\n",
"dtype: int32"
]
},
"metadata": {},
"execution_count": 7
}
],
"source": [
"pd.Series(arr,labels)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"**Using dictionaries**"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"a 10\n",
"b 20\n",
"c 30\n",
"dtype: int64"
]
},
"metadata": {},
"execution_count": 8
}
],
"source": [
"pd.Series(d)"
]
},
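{
"cell_type": "markdown",
"metadata": {},
"source": [
"When both a dictionary and an explicit `index` are passed, pandas looks up each index label among the dictionary keys. The sketch below assumes a label `'z'` that is deliberately absent from the dictionary, to show that a missing label becomes `NaN` and that the values are then stored as floats."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Same dictionary as above, redefined so this cell runs on its own\n",
"d = {'a': 10, 'b': 20, 'c': 30}\n",
"\n",
"# Select and reorder keys; 'z' is a hypothetical label that is not in d\n",
"s = pd.Series(d, index=['c', 'a', 'z'])\n",
"print(s)\n",
"\n",
"# isna() flags the label that had no matching key\n",
"print(s.isna())"
]
},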
{
"cell_type": "markdown",
"metadata": {},
"source": [
"### Data in a Series\n",
"\n",
"A pandas Series can hold a variety of object types:"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 a\n",
"1 b\n",
"2 c\n",
"dtype: object"
]
},
"metadata": {},
"execution_count": 9
}
],
"source": [
"pd.Series(data=labels)"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"0 <built-in function sum>\n",
"1 <built-in function print>\n",
"2 <built-in function len>\n",
"dtype: object"
]
},
"metadata": {},
"execution_count": 10
}
],
"source": [
"# Even functions (although you are unlikely to need this)\n",
"pd.Series([sum,print,len])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Using an Index\n",
"\n",
"The key to using a Series is understanding its index. pandas uses these index names or numbers to allow fast lookups of information (the index works like a hash table or dictionary).\n",
"\n",
"Let's see some examples of how to extract information from a Series. Let's create two Series, `ser1` and `ser2`:"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [],
"source": [
"ser1 = pd.Series([1,2,3,4],index = ['USA', 'Germany','France', 'Japan']) "
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"USA 1\n",
"Germany 2\n",
"France 3\n",
"Japan 4\n",
"dtype: int64"
]
},
"metadata": {},
"execution_count": 12
}
],
"source": [
"ser1"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [],
"source": [
"ser2 = pd.Series([1,2,5,4],index = ['USA', 'Germany','Italy', 'Japan']) "
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"USA 1\n",
"Germany 2\n",
"Italy 5\n",
"Japan 4\n",
"dtype: int64"
]
},
"metadata": {},
"execution_count": 14
}
],
"source": [
"ser2"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"1"
]
},
"metadata": {},
"execution_count": 15
}
],
"source": [
"ser1['USA']"
]
},
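{
"cell_type": "markdown",
"metadata": {},
"source": [
"Bracket indexing with a label is the shortest form. As a sketch of some alternatives, `.loc` makes the label-based lookup explicit, `.iloc` looks up by integer position, and the `in` operator tests whether a label exists in the index."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Same Series as above, redefined so this cell runs on its own\n",
"ser1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'France', 'Japan'])\n",
"\n",
"print(ser1.loc['Germany'])     # label-based lookup -> 2\n",
"print(ser1.iloc[0])            # position-based lookup -> 1\n",
"print('Italy' in ser1)         # membership test against the index -> False\n",
"print(ser1[['USA', 'Japan']])  # a list of labels returns a smaller Series"
]
},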
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Operations between Series are also carried out based on the index, so values are matched by label rather than by position:"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"output_type": "execute_result",
"data": {
"text/plain": [
"France NaN\n",
"Germany 4.0\n",
"Italy NaN\n",
"Japan 8.0\n",
"USA 2.0\n",
"dtype: float64"
]
},
"metadata": {},
"execution_count": 16
}
],
"source": [
"ser1 + ser2"
]
},
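{
"cell_type": "markdown",
"metadata": {},
"source": [
"The `NaN` values appear because `France` and `Italy` each exist in only one of the two Series, and the integers are converted to floats so that `NaN` can be represented. If a missing label should instead count as zero, one option is the `Series.add` method with its `fill_value` parameter. This is a sketch of one possible approach, not the only way to handle missing labels."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Same Series as above, redefined so this cell runs on its own\n",
"ser1 = pd.Series([1, 2, 3, 4], index=['USA', 'Germany', 'France', 'Japan'])\n",
"ser2 = pd.Series([1, 2, 5, 4], index=['USA', 'Germany', 'Italy', 'Japan'])\n",
"\n",
"# Labels missing from one side are treated as 0 before adding,\n",
"# so France keeps 3.0 and Italy keeps 5.0 instead of becoming NaN\n",
"total = ser1.add(ser2, fill_value=0)\n",
"print(total)"
]
},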
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Let's stop here for now and move on to DataFrames, which expand on the Series concept!\n",
"# Great job!"
]
}
],
"metadata": {
"kernelspec": {
"name": "python3",
"display_name": "Python 3.7.9 64-bit ('pyfinance': conda)",
"metadata": {
"interpreter": {
"hash": "e89404a230d8800c54ad520c7b67d1bd9bb833a07b37dd3e521a178a3dc34904"
}
}
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.9-final"
}
},
"nbformat": 4,
"nbformat_minor": 1
}