python-pour-finance/02-NumPy/.ipynb_checkpoints/3-Numpy-Indexing-et-Selecti...

647 lines
12 KiB
Plaintext

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"___\n",
"\n",
"<a href='http://www.pieriandata.com'> <img src='../Pierian_Data_Logo.png' /></a>\n",
"___\n",
"<center>*Copyright Pierian Data 2017*</center>\n",
"<center>*For more information, visit us at www.pieriandata.com*</center>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NumPy Indexing and Selection\n",
"\n",
"In this lecture we will discuss how to select elements or groups of elements from an array."
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np"
]
},
{
"cell_type": "code",
"execution_count": 3,
"metadata": {},
"outputs": [],
"source": [
"#Creating sample array\n",
"arr = np.arange(0,11)"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Show\n",
"arr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Bracket Indexing and Selection\n",
"The simplest way to pick one or some elements of an array looks very similar to python lists:"
]
},
{
"cell_type": "code",
"execution_count": 5,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"8"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Get a value at an index\n",
"arr[8]"
]
},
{
"cell_type": "code",
"execution_count": 6,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 2, 3, 4])"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Get values in a range\n",
"arr[1:5]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4])"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Get values in a range\n",
"arr[0:5]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Broadcasting\n",
"\n",
"Numpy arrays differ from a normal Python list because of their ability to broadcast:"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([100, 100, 100, 100, 100, 5, 6, 7, 8, 9, 10])"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Setting a value with index range (Broadcasting)\n",
"arr[0:5]=100\n",
"\n",
"#Show\n",
"arr"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Reset array, we'll see why I had to reset in a moment\n",
"arr = np.arange(0,11)\n",
"\n",
"#Show\n",
"arr"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([0, 1, 2, 3, 4, 5])"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Important notes on Slices\n",
"slice_of_arr = arr[0:6]\n",
"\n",
"#Show slice\n",
"slice_of_arr"
]
},
{
"cell_type": "code",
"execution_count": 11,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([99, 99, 99, 99, 99, 99])"
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Change Slice\n",
"slice_of_arr[:]=99\n",
"\n",
"#Show Slice again\n",
"slice_of_arr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Now note the changes also occur in our original array!"
]
},
{
"cell_type": "code",
"execution_count": 12,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Data is not copied, it's a view of the original array! This avoids memory problems!"
]
},
{
"cell_type": "code",
"execution_count": 13,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([99, 99, 99, 99, 99, 99, 6, 7, 8, 9, 10])"
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#To get a copy, need to be explicit\n",
"arr_copy = arr.copy()\n",
"\n",
"arr_copy"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Indexing a 2D array (matrices)\n",
"\n",
"The general format is **arr_2d[row][col]** or **arr_2d[row,col]**. I recommend usually using the comma notation for clarity."
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 5, 10, 15],\n",
" [20, 25, 30],\n",
" [35, 40, 45]])"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr_2d = np.array(([5,10,15],[20,25,30],[35,40,45]))\n",
"\n",
"#Show\n",
"arr_2d"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([20, 25, 30])"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Indexing row\n",
"arr_2d[1]\n"
]
},
{
"cell_type": "code",
"execution_count": 16,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"20"
]
},
"execution_count": 16,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Format is arr_2d[row][col] or arr_2d[row,col]\n",
"\n",
"# Getting individual element value\n",
"arr_2d[1][0]"
]
},
{
"cell_type": "code",
"execution_count": 17,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"20"
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Getting individual element value\n",
"arr_2d[1,0]"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[10, 15],\n",
" [25, 30]])"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# 2D array slicing\n",
"\n",
"#Shape (2,2) from top right corner\n",
"arr_2d[:2,1:]"
]
},
{
"cell_type": "code",
"execution_count": 19,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([35, 40, 45])"
]
},
"execution_count": 19,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Shape bottom row\n",
"arr_2d[2]"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([35, 40, 45])"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"#Shape bottom row\n",
"arr_2d[2,:]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## More Indexing Help\n",
"Indexing a 2d matrix can be a bit confusing at first, especially when you start to add in step size. Try google image searching NumPy indexing to fins useful images, like this one:\n",
"\n",
"<img src= 'http://memory.osu.edu/classes/python/_images/numpy_indexing.png' width=500/>"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Conditional Selection\n",
"\n",
"This is a very fundamental concept that will directly translate to pandas later on, make sure you understand this part!\n",
"\n",
"Let's briefly go over how to use brackets for selection based off of comparison operators."
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 1, 2, 3, 4, 5, 6, 7, 8, 9, 10])"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr = np.arange(1,11)\n",
"arr"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([False, False, False, False, True, True, True, True, True, True], dtype=bool)"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr > 4"
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {
"collapsed": true
},
"outputs": [],
"source": [
"bool_arr = arr>4"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([False, False, False, False, True, True, True, True, True, True], dtype=bool)"
]
},
"execution_count": 32,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"bool_arr"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 5, 6, 7, 8, 9, 10])"
]
},
"execution_count": 33,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr[bool_arr]"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 3, 4, 5, 6, 7, 8, 9, 10])"
]
},
"execution_count": 34,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"arr[arr>2]"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 3, 4, 5, 6, 7, 8, 9, 10])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = 2\n",
"arr[arr>x]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Great Job!\n"
]
}
],
"metadata": {
"anaconda-cloud": {},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.6.1"
}
},
"nbformat": 4,
"nbformat_minor": 1
}