{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Introduction to Time Series with Pandas\n",
    "\n",
    "Much of our financial data will have a datetime index, so let's learn how to handle this kind of data with pandas!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np\n",
    "import pandas as pd\n",
    "import matplotlib.pyplot as plt"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [],
   "source": [
    "from datetime import datetime"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {},
   "outputs": [],
   "source": [
    "# Named values to illustrate the order of the datetime arguments\n",
    "my_year = 2017\n",
    "my_month = 1\n",
    "my_day = 2\n",
    "my_hour = 13\n",
    "my_minute = 30\n",
    "my_second = 15"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [],
   "source": [
    "# January 2nd, 2017\n",
    "my_date = datetime(my_year, my_month, my_day)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "datetime.datetime(2017, 1, 2, 0, 0)"
      ]
     },
     "metadata": {},
     "execution_count": 5
    }
   ],
   "source": [
    "# The time defaults to 0:00 when no hour or minute is given\n",
    "my_date"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "# January 2nd, 2017 at 13:30:15\n",
    "my_date_time = datetime(my_year, my_month, my_day, my_hour, my_minute, my_second)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "datetime.datetime(2017, 1, 2, 13, 30, 15)"
      ]
     },
     "metadata": {},
     "execution_count": 7
    }
   ],
   "source": [
    "my_date_time"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "You can access any part of the datetime object through its attributes:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "2"
      ]
     },
     "metadata": {},
     "execution_count": 8
    }
   ],
   "source": [
    "my_date.day"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "13"
      ]
     },
     "metadata": {},
     "execution_count": 9
    }
   ],
   "source": [
    "my_date_time.hour"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Datetime Index with Pandas\n",
    "\n",
    "When you work with pandas DataFrames obtained from financial APIs, the time series will usually be the index. Fortunately, pandas has many functions and methods for working with time series!"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "[datetime.datetime(2016, 1, 1, 0, 0), datetime.datetime(2016, 1, 2, 0, 0)]"
      ]
     },
     "metadata": {},
     "execution_count": 10
    }
   ],
   "source": [
    "# Create an example list of datetime objects\n",
    "first_two = [datetime(2016, 1, 1), datetime(2016, 1, 2)]\n",
    "first_two"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "DatetimeIndex(['2016-01-01', '2016-01-02'], dtype='datetime64[ns]', freq=None)"
      ]
     },
     "metadata": {},
     "execution_count": 11
    }
   ],
   "source": [
    "# Convert the list to a DatetimeIndex\n",
    "dt_ind = pd.DatetimeIndex(first_two)\n",
    "dt_ind"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {},
   "outputs": [
    {
     "output_type": "stream",
     "name": "stdout",
     "text": [
      "[[-1.05763777 -0.15818632]\n [-0.79985649 -0.93157171]]\n"
     ]
    }
   ],
   "source": [
    "# Some random data (2 rows x 2 columns) and the column labels\n",
    "data = np.random.randn(2, 2)\n",
    "print(data)\n",
    "cols = ['A', 'B']"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {},
   "outputs": [],
   "source": [
    "df = pd.DataFrame(data, index=dt_ind, columns=cols)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "                   A         B\n",
       "2016-01-01 -1.057638 -0.158186\n",
       "2016-01-02 -0.799856 -0.931572"
      ],
"text/html": "<div>\n<style scoped>\n .dataframe tbody tr th:only-of-type {\n vertical-align: middle;\n }\n\n .dataframe tbody tr th {\n vertical-align: top;\n }\n\n .dataframe thead th {\n text-align: right;\n }\n</style>\n<table border=\"1\" class=\"dataframe\">\n <thead>\n <tr style=\"text-align: right;\">\n <th></th>\n <th>A</th>\n <th>B</th>\n </tr>\n </thead>\n <tbody>\n <tr>\n <th>2016-01-01</th>\n <td>-1.057638</td>\n <td>-0.158186</td>\n </tr>\n <tr>\n <th>2016-01-02</th>\n <td>-0.799856</td>\n <td>-0.931572</td>\n </tr>\n </tbody>\n</table>\n</div>"
     },
     "metadata": {},
     "execution_count": 14
    }
   ],
   "source": [
    "df"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "DatetimeIndex(['2016-01-01', '2016-01-02'], dtype='datetime64[ns]', freq=None)"
      ]
     },
     "metadata": {},
     "execution_count": 15
    }
   ],
   "source": [
    "df.index"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "1"
      ]
     },
     "metadata": {},
     "execution_count": 16
    }
   ],
   "source": [
    "# Integer position of the most recent date\n",
    "df.index.argmax()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "Timestamp('2016-01-02 00:00:00')"
      ]
     },
     "metadata": {},
     "execution_count": 17
    }
   ],
   "source": [
    "df.index.max()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "0"
      ]
     },
     "metadata": {},
     "execution_count": 18
    }
   ],
   "source": [
    "# Integer position of the earliest date\n",
    "df.index.argmin()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {},
   "outputs": [
    {
     "output_type": "execute_result",
     "data": {
      "text/plain": [
       "Timestamp('2016-01-01 00:00:00')"
      ]
     },
     "metadata": {},
     "execution_count": 19
    }
   ],
   "source": [
    "df.index.min()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Great, let's move on!"
   ]
  }
 ],
 "metadata": {
  "anaconda-cloud": {},
  "kernelspec": {
   "name": "python3",
   "display_name": "Python 3.7.9 64-bit ('pyfinance': conda)",
   "metadata": {
    "interpreter": {
     "hash": "e89404a230d8800c54ad520c7b67d1bd9bb833a07b37dd3e521a178a3dc34904"
    }
   }
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.9-final"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 1
}