{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "___\n", "\n", "
*Copyright Pierian Data 2017*
\n", "
*For more information, visit us at www.pieriandata.com*
" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Introduction to Time Series with Pandas\n", "\n", "Much of our financial data will have a datetime index, so let's learn how to work with this kind of data in pandas!" ] }, { "cell_type": "code", "execution_count": 15, "metadata": { "collapsed": true }, "outputs": [], "source": [ "import numpy as np\n", "import pandas as pd\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": 2, "metadata": { "collapsed": true }, "outputs": [], "source": [ "from datetime import datetime" ] }, { "cell_type": "code", "execution_count": 3, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# Named variables to illustrate the order of the datetime arguments\n", "my_year = 2017\n", "my_month = 1\n", "my_day = 2\n", "my_hour = 13\n", "my_minute = 30\n", "my_second = 15" ] }, { "cell_type": "code", "execution_count": 4, "metadata": { "collapsed": false }, "outputs": [], "source": [ "# January 2nd, 2017\n", "my_date = datetime(my_year, my_month, my_day)" ] }, { "cell_type": "code", "execution_count": 5, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "datetime.datetime(2017, 1, 2, 0, 0)" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The time defaults to 00:00 when only a date is given\n", "my_date" ] }, { "cell_type": "code", "execution_count": 6, "metadata": { "collapsed": true }, "outputs": [], "source": [ "# January 2nd, 2017 at 13:30:15\n", "my_date_time = datetime(my_year, my_month, my_day, my_hour, my_minute, my_second)" ] }, { "cell_type": "code", "execution_count": 7, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "datetime.datetime(2017, 1, 2, 13, 30, 15)" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_date_time" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "You can access any component of a datetime object as an attribute:" ] }, { "cell_type": "code", "execution_count": 8, 
"metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_date.day" ] }, { "cell_type": "code", "execution_count": 9, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "13" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "my_date_time.hour" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Pandas with Datetime Index\n", "\n", "When you load data from a financial API into a pandas DataFrame, the time series is usually stored as the index. Fortunately, pandas has many functions and methods for working with time series!" ] }, { "cell_type": "code", "execution_count": 13, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "[datetime.datetime(2016, 1, 1, 0, 0), datetime.datetime(2016, 1, 2, 0, 0)]" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Create an example list of datetime objects\n", "first_two = [datetime(2016, 1, 1), datetime(2016, 1, 2)]\n", "first_two" ] }, { "cell_type": "code", "execution_count": 14, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "DatetimeIndex(['2016-01-01', '2016-01-02'], dtype='datetime64[ns]', freq=None)" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Convert the list to a DatetimeIndex\n", "dt_ind = pd.DatetimeIndex(first_two)\n", "dt_ind" ] }, { "cell_type": "code", "execution_count": 29, "metadata": { "collapsed": false }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[-1.58270607 0.47766839]\n", " [ 0.34171008 0.5889566 ]]\n" ] } ], "source": [ "# Random data to attach to the index\n", "data = np.random.randn(2, 2)\n", "print(data)\n", "cols = ['A', 'B']" ] }, { "cell_type": "code", "execution_count": 28, "metadata": { "collapsed": false }, "outputs": [], "source": [ "df = pd.DataFrame(data, index=dt_ind, columns=cols)" ] }, { "cell_type": "code", "execution_count": 30, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/html": [ "
<table>\n", "  <thead>\n", "    <tr><th></th><th>A</th><th>B</th></tr>\n", "  </thead>\n", "  <tbody>\n", "    <tr><th>2016-01-01</th><td>0.165224</td><td>-0.767629</td></tr>\n", "    <tr><th>2016-01-02</th><td>-0.482305</td><td>0.307934</td></tr>\n", "  </tbody>\n", "</table>
\n", "
" ], "text/plain": [ "                   A         B\n", "2016-01-01  0.165224 -0.767629\n", "2016-01-02 -0.482305  0.307934" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df" ] }, { "cell_type": "code", "execution_count": 31, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "DatetimeIndex(['2016-01-01', '2016-01-02'], dtype='datetime64[ns]', freq=None)" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "df.index" ] }, { "cell_type": "code", "execution_count": 34, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Integer position of the latest date in the index\n", "df.index.argmax()" ] }, { "cell_type": "code", "execution_count": 35, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Timestamp('2016-01-02 00:00:00')" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The latest date itself\n", "df.index.max()" ] }, { "cell_type": "code", "execution_count": 37, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "0" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# Integer position of the earliest date in the index\n", "df.index.argmin()" ] }, { "cell_type": "code", "execution_count": 38, "metadata": { "collapsed": false }, "outputs": [ { "data": { "text/plain": [ "Timestamp('2016-01-01 00:00:00')" ] }, "execution_count": 38, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# The earliest date itself\n", "df.index.min()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Great, let's move on!" 
] } ], "metadata": { "anaconda-cloud": {}, "kernelspec": { "display_name": "Python [conda root]", "language": "python", "name": "conda-root-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.5.3" } }, "nbformat": 4, "nbformat_minor": 0 }