{ "cells": [ { "cell_type": "markdown", "id": "6cdd9a9f", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "(array_ops_nb)=\n", "# Introduction to Array Operations in Python\n", "\n", "> Meenal Jhajharia" ] }, { "cell_type": "markdown", "id": "8773d758", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "**Meenal Jhajharia. she/her.**\n", "\n", "- CS and Math undergrad, University of Delhi\n", "- PyMC core contributor | GSoC student\n", "- Contact: [meenal@mjhajharia.com](mailto:meenal@mjhajharia.com) | [mjhajharia.com](https://mjhajharia.com)" ] }, { "cell_type": "markdown", "id": "56c4a741", "metadata": { "slideshow": { "slide_type": "slide" }, "tags": [ "remove-cell" ] }, "source": [ "
\n", "
\n", " \n", "
\n", "
" ] }, { "cell_type": "markdown", "id": "418f6eb4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "This banner is generated from [this code](https://raw.githubusercontent.com/pymc-devs/pymc-data-umbrella/main/banner.py), the code in this link is a trivial customization of the [original code](https://github.com/pymc-devs/pymcon/blob/gh-pages/assets/make_trajectories.py) by [Colin Caroll](https://colindcarroll.com/) who designed a [similar banner for pymcon’20](https://pymcon.com/), Colin is amazing at visualization stuff and even has a couple of talks about it!!" ] }, { "cell_type": "markdown", "id": "3831be6f", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Overview \n", "\n", "- Introduction\n", "- Python Objects\n", "- List Comprehension\n", "- Basics of NumPy" ] }, { "cell_type": "markdown", "id": "216f1aa6", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Why Python?\n", "\n", "- Useful for quick prototyping\n", "- Dynamically Typed, Interpreted, High level data types\n", "- Large number of scientific open source software\n", "\n", "Best Place to learn more : [Official Python Tutorial](https://docs.python.org/3/tutorial/index.html)" ] }, { "cell_type": "markdown", "id": "d14a7454", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Let's get started!\n", "\n", "All the code that is shown in this webinar can be executed from its website. 
Therefore you have two ways to follow\n", "along:\n", "\n", "* Click on the run code button and execute the code straight from this page\n", "\n", "::::{div} sd-d-flex-row sd-align-major-center\n", ":::{thebe-button}\n", ":::\n", "::::\n", "\n", "* Clone the GitHub repo: [pymc-devs/pymc-data-umbrella](https://github.com/pymc-devs/pymc-data-umbrella) \n", " and follow along locally using Jupyter" ] }, { "cell_type": "markdown", "id": "5dc851f9", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Python Data Types\n", "\n", "![data types](data_types.png)" ] }, { "cell_type": "markdown", "id": "0c26bc43", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Numbers\n", "\n", "Certain numeric modules ship with Python" ] }, { "cell_type": "code", "execution_count": 1, "id": "956c70dd", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "0.962693373774243" ] }, "execution_count": 1, "metadata": {}, "output_type": "execute_result" } ], "source": [ "import random\n", "random.random()" ] }, { "cell_type": "markdown", "id": "dfa9a9bb", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Strings\n", "\n", "Sequence Operations" ] }, { "cell_type": "code", "execution_count": 2, "id": "6ea905d6", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "4" ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = 'Data'\n", "len(X)" ] }, { "cell_type": "code", "execution_count": 3, "id": "43c43aef", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'Da'" ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X[0:-2]" ] }, { "cell_type": "markdown", "id": "3c38a482", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Immutability\n", "\n", "Immutable objects cannot be changed" ] }, { 
"cell_type": "code", "execution_count": 4, "id": "deee27a9", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'DataUmbrella'" ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X = 'Data'\n", "X + 'Umbrella'" ] }, { "cell_type": "code", "execution_count": 5, "id": "4f775eb3", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "TypeError", "evalue": "'str' object does not support item assignment", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/sd/pc07b3wn65nflgx8wpkkr7wr0000gn/T/ipykernel_2451/1913827336.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m \u001b[0mX\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m0\u001b[0m\u001b[0;34m]\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0;34m'P'\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mTypeError\u001b[0m: 'str' object does not support item assignment" ] } ], "source": [ "X[0] = 'P'" ] }, { "cell_type": "markdown", "id": "a4171a33", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Polymorphism\n", "\n", "Operators or functions mean different things for different objects" ] }, { "cell_type": "code", "execution_count": null, "id": "4f58c1e2", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "1+2" ] }, { "cell_type": "code", "execution_count": 6, "id": "8e2e910d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "'PyMC'" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "'Py'+'MC'" ] }, { "cell_type": "markdown", "id": "1844a5f0", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Length or 
size means different things for different datatypes\n", "\n" ] }, { "cell_type": "code", "execution_count": 7, "id": "b8770fd2", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "6" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len(\"Python\")" ] }, { "cell_type": "code", "execution_count": 8, "id": "91a27b11", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len([\"Python\", \"Java\", \"C\"])" ] }, { "cell_type": "code", "execution_count": 9, "id": "225d8273", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "len({\"Language\": \"Python\", \"IDE\": \"VSCode\"})" ] }, { "cell_type": "markdown", "id": "fa543ece", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Related: Class Polymorphism, Method Overriding and Inheritance" ] }, { "cell_type": "markdown", "id": "978ffddb", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## Lists\n", "\n", "Positionally ordered collections of arbitrarily typed objects (mutable, no fixed size)" ] }, { "cell_type": "code", "execution_count": 10, "id": "d2b14d34", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "3" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L = ['Python', 45, 1.23]\n", "len(L)" ] }, { "cell_type": "code", "execution_count": 11, "id": "62091eb5", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Python', 45, 1.23, 4, 5, 6]" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L + [4, 5, 6]" ] }, { 
"cell_type": "code", "execution_count": 12, "id": "26fc55f6", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1.23" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L[-1]" ] }, { "cell_type": "markdown", "id": "6825a607", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "List-specific operations" ] }, { "cell_type": "code", "execution_count": 13, "id": "ac588cbe", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Python', 45, 1.23, 'Aesara']" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L.append('Aesara');L" ] }, { "cell_type": "code", "execution_count": 14, "id": "0a7bb5eb", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Python', 45, 'Aesara']" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L.pop(2); L" ] }, { "cell_type": "markdown", "id": "583dd11e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "More: sort(), reverse()" ] }, { "cell_type": "markdown", "id": "2699ce32", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "List indexing and slicing" ] }, { "cell_type": "code", "execution_count": 15, "id": "6b171a71", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "IndexError", "evalue": "list index out of range", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/sd/pc07b3wn65nflgx8wpkkr7wr0000gn/T/ipykernel_2451/2456052074.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[0;32m----> 1\u001b[0;31m 
\u001b[0mL\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m99\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mIndexError\u001b[0m: list index out of range" ] } ], "source": [ "L[99]" ] }, { "cell_type": "code", "execution_count": 16, "id": "79248346", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "2 2\n" ] } ], "source": [ "X = [[1,2],[2,1]]\n", "print(len(X), len(X[0]))" ] }, { "cell_type": "code", "execution_count": 17, "id": "b13a2c03", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "1" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "X[0][0]" ] }, { "cell_type": "code", "execution_count": 18, "id": "da6186da", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "data": { "text/plain": [ "['Python', 45, 'Aesara']" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L[:]" ] }, { "cell_type": "code", "execution_count": 19, "id": "44a5ffdd", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Python', 45, 'Aesara']" ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L[-3:]" ] }, { "cell_type": "code", "execution_count": 20, "id": "26fdf15b", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[2, 4, 6, 8, 10]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "L = [1,2,3,4,5,6,7,8,9,10]\n", "L[1::2] #L[start:end:step_size]" ] }, { "cell_type": "code", "execution_count": 21, "id": "53c09385", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[10, 9, 8, 7, 6, 5, 4, 3, 2, 1]" ] }, "execution_count": 21, "metadata": {}, "output_type": "execute_result" 
} ], "source": [ "L[::-1]" ] }, { "cell_type": "markdown", "id": "2a638f20", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## List Comprehension" ] }, { "cell_type": "code", "execution_count": 22, "id": "f4fd3888", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "List = []\n", " \n", "for character in 'Python':\n", " List.append(character)" ] }, { "cell_type": "code", "execution_count": 23, "id": "f84018dc", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "List = [character for character in 'Python']" ] }, { "cell_type": "code", "execution_count": 24, "id": "18e094bc", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ "M = [['OS','Percentage of Users'],['Linux', '40'],['Windows', '20'], ['OSX','40']]" ] }, { "cell_type": "code", "execution_count": 25, "id": "0f69c570", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Linux', 'Windows', 'OSX']" ] }, "execution_count": 25, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[row[0] for row in M][1:]" ] }, { "cell_type": "code", "execution_count": 26, "id": "7179a832", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Linux*', 'Windows*', 'OSX*']" ] }, "execution_count": 26, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[row[0] + '*' for row in M][1:]" ] }, { "cell_type": "code", "execution_count": 27, "id": "3d9ec073", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "['Linux', 'Windows']" ] }, "execution_count": 27, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[row[0] for row in M if row[0][0]!='O']" ] }, { "cell_type": "markdown", "id": "f4acfb5b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Nested List Comprehension" ] }, { "cell_type": "code", 
"execution_count": 28, "id": "9733037a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[[1, 0, 0], [0, 1, 0], [0, 0, 1]]" ] }, "execution_count": 28, "metadata": {}, "output_type": "execute_result" } ], "source": [ "n = 3; [[ 1 if i==j else 0 for i in range(n) ] for j in range(n)]" ] }, { "cell_type": "code", "execution_count": 29, "id": "a63e9792", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[0, 6, 12, 18]" ] }, "execution_count": 29, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[x for x in range(21) if x%2==0 if x%3==0] " ] }, { "cell_type": "markdown", "id": "9d1afcb8", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Lambda Function" ] }, { "cell_type": "code", "execution_count": 30, "id": "a7d91f19", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]" ] }, "execution_count": 30, "metadata": {}, "output_type": "execute_result" } ], "source": [ "[i*10 for i in range(10)]" ] }, { "cell_type": "code", "execution_count": 31, "id": "dc55d59e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "[0, 10, 20, 30, 40, 50, 60, 70, 80, 90]" ] }, "execution_count": 31, "metadata": {}, "output_type": "execute_result" } ], "source": [ "list(map(lambda i: i*10, [i for i in range(10)]))" ] }, { "cell_type": "markdown", "id": "09109d7c", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "## NumPy" ] }, { "cell_type": "markdown", "id": "700f3b54", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "NumPy’s array class -> ndarray(array)\n", "\n", "- ndarray.ndim\n", "- ndarray.shape\n", "- ndarray.size" ] }, { "cell_type": "code", "execution_count": 32, "id": "39ffee59", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [], "source": [ 
"import numpy as np\n", "\n", "a = np.arange(16).reshape(4, 4)" ] }, { "cell_type": "code", "execution_count": 33, "id": "46128000", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 1, 2, 3],\n", " [ 4, 5, 6, 7],\n", " [ 8, 9, 10, 11],\n", " [12, 13, 14, 15]])" ] }, "execution_count": 33, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a" ] }, { "cell_type": "markdown", "id": "84f4d7d8", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Simple array operation" ] }, { "cell_type": "code", "execution_count": 34, "id": "b6f60dd8", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0, 2, 4, 6],\n", " [ 8, 10, 12, 14],\n", " [16, 18, 20, 22],\n", " [24, 26, 28, 30]])" ] }, "execution_count": 34, "metadata": {}, "output_type": "execute_result" } ], "source": [ "2*a" ] }, { "cell_type": "markdown", "id": "408aa1f0", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### General Properties of ndarrays" ] }, { "cell_type": "code", "execution_count": 35, "id": "b79cc705", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(4, 4)" ] }, "execution_count": 35, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.shape" ] }, { "cell_type": "code", "execution_count": 36, "id": "966729b1", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "2" ] }, "execution_count": 36, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.ndim" ] }, { "cell_type": "code", "execution_count": 37, "id": "8a9cbcef", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "16" ] }, "execution_count": 37, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a.size" ] }, { "cell_type": "markdown", "id": "981cd213", "metadata": { "slideshow": { 
"slide_type": "slide" } }, "source": [ "### Ways to create new arrays\n", "\n" ] }, { "cell_type": "code", "execution_count": 38, "id": "d5fd0260", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a = np.array(['PyMC', 'Arviz', 'Aesara'])" ] }, { "cell_type": "code", "execution_count": 39, "id": "ba1de54e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[0., 0., 0., 0.],\n", " [0., 0., 0., 0.],\n", " [0., 0., 0., 0.],\n", " [0., 0., 0., 0.]])" ] }, "execution_count": 39, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.zeros((4, 4))" ] }, { "cell_type": "code", "execution_count": 40, "id": "6d76bdaa", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[1., 1., 1., 1.],\n", " [1., 1., 1., 1.],\n", " [1., 1., 1., 1.],\n", " [1., 1., 1., 1.]])" ] }, "execution_count": 40, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.ones((4, 4))" ] }, { "cell_type": "markdown", "id": "bc3aef8b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "#### Generate values in a certain range\n", "\n" ] }, { "cell_type": "code", "execution_count": 41, "id": "9ab1b01a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([ 1, 11, 21, 31, 41, 51, 61, 71, 81, 91])" ] }, "execution_count": 41, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.arange(1, 100, 10)" ] }, { "cell_type": "markdown", "id": "c97941e5", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Random Number Generator" ] }, { "cell_type": "code", "execution_count": 42, "id": "0bce9196", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.51182162, 0.9504637 , 0.14415961])" ] }, "execution_count": 42, "metadata": {}, "output_type": "execute_result" } ], "source": [ 
"rg = np.random.default_rng(1)\n", "x = rg.random(3);x" ] }, { "cell_type": "markdown", "id": "6b893144", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Cumulative sum against specified axis (in this case only one axis is present)\n", "\n" ] }, { "cell_type": "code", "execution_count": 43, "id": "9a1486da", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([0.51182162, 1.46228532, 1.60644493])" ] }, "execution_count": 43, "metadata": {}, "output_type": "execute_result" } ], "source": [ "x.cumsum()" ] }, { "cell_type": "markdown", "id": "827b4173", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Multi-dimensional arrays\n", "\n" ] }, { "cell_type": "code", "execution_count": 44, "id": "dfc3022a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "c = np.array([[[0, 1, 2],[ 10, 12, 13]],\n", "[[100, 101, 102],[110, 112, 113]]])" ] }, { "cell_type": "code", "execution_count": 45, "id": "98221d46", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(2, 2, 3)" ] }, "execution_count": 45, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.shape" ] }, { "cell_type": "code", "execution_count": 46, "id": "8a3edefa", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "[[ 0 1 2]\n", " [10 12 13]] -\n", "[[100 101 102]\n", " [110 112 113]] -\n" ] } ], "source": [ "for row in c:\n", " print(row,'-')" ] }, { "cell_type": "markdown", "id": "7cc9767b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Element-wise printing\n", "\n" ] }, { "cell_type": "code", "execution_count": 47, "id": "3fa280c1", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "0\n", "1\n", "2\n", "10\n", "12\n", "13\n", "100\n", 
"101\n", "102\n", "110\n", "112\n", "113\n" ] } ], "source": [ "for row in c.flat:\n", " print(row)" ] }, { "cell_type": "markdown", "id": "e1124672", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Transpose" ] }, { "cell_type": "code", "execution_count": 48, "id": "94237df4", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[[ 0, 100],\n", " [ 10, 110]],\n", "\n", " [[ 1, 101],\n", " [ 12, 112]],\n", "\n", " [[ 2, 102],\n", " [ 13, 113]]])" ] }, "execution_count": 48, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.T" ] }, { "cell_type": "markdown", "id": "932f8909", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Reshape" ] }, { "cell_type": "code", "execution_count": 49, "id": "61e10acd", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 0],\n", " [ 1],\n", " [ 2],\n", " [ 10],\n", " [ 12],\n", " [ 13],\n", " [100],\n", " [101],\n", " [102],\n", " [110],\n", " [112],\n", " [113]])" ] }, "execution_count": 49, "metadata": {}, "output_type": "execute_result" } ], "source": [ "c.reshape((12,1))" ] }, { "cell_type": "markdown", "id": "d49e9d6d", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Stacking" ] }, { "cell_type": "code", "execution_count": 50, "id": "9f3cf80d", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [], "source": [ "a = np.ones((2,2))\n", "b = np.zeros((2,2))" ] }, { "cell_type": "code", "execution_count": 51, "id": "5ac0a52b", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[1., 1.],\n", " [1., 1.],\n", " [0., 0.],\n", " [0., 0.]])" ] }, "execution_count": 51, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.vstack((a, b))" ] }, { "cell_type": "code", "execution_count": 52, "id": "80911dd4", "metadata": { "slideshow": { "slide_type": 
"fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[1., 1., 0., 0.],\n", " [1., 1., 0., 0.]])" ] }, "execution_count": 52, "metadata": {}, "output_type": "execute_result" } ], "source": [ "np.hstack((a, b))" ] }, { "cell_type": "markdown", "id": "442dde2b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Broadcasting\n", "\n", "Used to deal with inputs that do not have exactly the same shape\n", "\n", "- If all input arrays do not have the same number of dimensions, a “1” will be repeatedly prepended to the shapes of the smaller arrays until all the arrays have the same number of dimensions.\n", "\n", "- Arrays with a size of 1 along a particular dimension act as if they had the size of the array with the largest shape along that dimension. The value of the array element is assumed to be the same along that dimension for the “broadcast” array." ] }, { "cell_type": "markdown", "id": "c0fe75bc", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Arrays with same dimensions\n", "\n" ] }, { "cell_type": "code", "execution_count": 53, "id": "e3f6b545", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([3, 6, 9])" ] }, "execution_count": 53, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.array([1, 2, 3])\n", "b = np.array([3, 3, 3])\n", "a*b" ] }, { "cell_type": "markdown", "id": "eedeae5a", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "1-d Array and a Scalar\n", "\n" ] }, { "cell_type": "code", "execution_count": 54, "id": "fb9f5c77", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([3, 6, 9])" ] }, "execution_count": 54, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.array([1, 2, 3])\n", "b = 3\n", "a*b" ] }, { "cell_type": "markdown", "id": "6ca5d77b", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ 
"Intuitively: scalar b being \"stretched\" to same shape as a\n", " \n", "Reality: broadcasting moves less memory around (computationally efficient)" ] }, { "cell_type": "markdown", "id": "d2d6c451", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Arrays where dimensions aren’t exactly same, but are aligned along the leading dimension" ] }, { "cell_type": "code", "execution_count": 55, "id": "a56f2023", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]]])" ] }, "execution_count": 55, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.ones((5,2,3))\n", "b = np.ones((2,3))\n", "a*b" ] }, { "cell_type": "markdown", "id": "a5a84ddd", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Arrays where dimensions aren’t exactly same, but leading dimension is 1, so it works" ] }, { "cell_type": "code", "execution_count": 56, "id": "acb24e19", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]],\n", "\n", " [[1., 1., 1.],\n", " [1., 1., 1.]]])" ] }, "execution_count": 56, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.ones((5,2,1))\n", "b = np.ones((2,3))\n", "a*b" ] }, { "cell_type": "markdown", "id": "ddb3c352", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Broadcasting fails!\n", "\n" ] }, { "cell_type": "code", "execution_count": 57, "id": "5250837e", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "ValueError", "evalue": 
"operands could not be broadcast together with shapes (5,2,2) (2,3) ", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/sd/pc07b3wn65nflgx8wpkkr7wr0000gn/T/ipykernel_2451/2083074354.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0ma\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mones\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m5\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mb\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mones\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (5,2,2) (2,3) " ] } ], "source": [ "a = np.ones((5,2,2))\n", "b = np.ones((2,3))\n", "a*b" ] }, { "cell_type": "markdown", "id": "359a847e", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "NumPy compares shapes element-wise for two given arrays\n", "\n", "It starts with the trailing (i.e. rightmost) dimensions Two dimensions are compatible when\n", "- they are equal, or\n", "- one of them is 1\n", "\n", "Arrays do not need to have the same exact number of dimensions to be compatible. 
Broadcasting is a convenient way of taking the outer product (or any outer operation)\n", "\n", "Here broadcasting fails because the trailing dimensions do not match and neither of them is 1" ] }, { "cell_type": "code", "execution_count": 58, "id": "91b7912a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "ename": "ValueError", "evalue": "operands could not be broadcast together with shapes (4,) (3,) ", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mValueError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/sd/pc07b3wn65nflgx8wpkkr7wr0000gn/T/ipykernel_2451/257425980.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0ma\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0marray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mb\u001b[0m \u001b[0;34m=\u001b[0m \u001b[0mnp\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0marray\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0;36m1\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m2\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m3\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m*\u001b[0m\u001b[0mb\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mValueError\u001b[0m: operands could not be broadcast together with shapes (4,) (3,) " ] } ], "source": [ "a = np.array([1,2,3,4])\n", "b = np.array([1,2,3])\n", "a*b" ] }, { "cell_type": "markdown", "id": "8053cdb3", "metadata": { "slideshow": { "slide_type": 
"slide" } }, "source": [ "We wrap `a` in an outer array and transpose it, which turns its shape from (4,) into (4, 1)\n", "\n" ] }, { "cell_type": "code", "execution_count": 59, "id": "31c3343a", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "(4, 1)" ] }, "execution_count": 59, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.asarray([a]).T  # equivalent to a[:, np.newaxis]\n", "a.shape" ] }, { "cell_type": "markdown", "id": "c4bd95b2", "metadata": { "slideshow": { "slide_type": "fragment" } }, "source": [ "Now it works!" ] }, { "cell_type": "code", "execution_count": 60, "id": "6f6c01dd", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[ 1, 2, 3],\n", " [ 2, 4, 6],\n", " [ 3, 6, 9],\n", " [ 4, 8, 12]])" ] }, "execution_count": 60, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a*b" ] }, { "cell_type": "markdown", "id": "c2382b87", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "### Indexing" ] }, { "cell_type": "code", "execution_count": 61, "id": "fdb1f26f", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([6, 6, 9, 8])" ] }, "execution_count": 61, "metadata": {}, "output_type": "execute_result" } ], "source": [ "a = np.array([0, 6, 9, 8, 8, 6, 2, 7, 2, 8, 1, 0, 4, 6, 9, 0])\n", "i = np.array([1, 1, 2, 3])\n", "a[i]" ] }, { "cell_type": "code", "execution_count": 62, "id": "64eb9f69", "metadata": { "slideshow": { "slide_type": "fragment" } }, "outputs": [ { "data": { "text/plain": [ "array([[8, 0],\n", " [9, 6]])" ] }, "execution_count": 62, "metadata": {}, "output_type": "execute_result" } ], "source": [ "j = np.array([[3, 0], [2, 1]])\n", "a[j]" ] }, { "cell_type": "code", "execution_count": 63, "id": "358c6005", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(16,) (4,) (2, 2)\n" ] }, { "ename": "IndexError", 
"evalue": "too many indices for array: array is 1-dimensional, but 2 were indexed", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/sd/pc07b3wn65nflgx8wpkkr7wr0000gn/T/ipykernel_2451/3150026016.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 2\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mj\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mIndexError\u001b[0m: too many indices for array: array is 1-dimensional, but 2 were indexed" ] } ], "source": [ "print(a.shape, i.shape, j.shape)\n", "a[i,j]" ] }, { "cell_type": "code", "execution_count": 64, "id": "7aa688d6", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(4, 4) (4,) (2, 2)\n" ] }, { "ename": "IndexError", "evalue": "shape mismatch: indexing arrays could not be broadcast together with shapes (4,) (2,2) ", "output_type": "error", "traceback": [ "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m", "\u001b[0;31mIndexError\u001b[0m Traceback (most recent call last)", "\u001b[0;32m/var/folders/sd/pc07b3wn65nflgx8wpkkr7wr0000gn/T/ipykernel_2451/214778350.py\u001b[0m in \u001b[0;36m\u001b[0;34m\u001b[0m\n\u001b[1;32m 1\u001b[0m \u001b[0ma\u001b[0m \u001b[0;34m=\u001b[0m 
\u001b[0ma\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mreshape\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0;36m4\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[1;32m 2\u001b[0m \u001b[0mprint\u001b[0m\u001b[0;34m(\u001b[0m\u001b[0ma\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mi\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m,\u001b[0m \u001b[0mj\u001b[0m\u001b[0;34m.\u001b[0m\u001b[0mshape\u001b[0m\u001b[0;34m)\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0;32m----> 3\u001b[0;31m \u001b[0ma\u001b[0m\u001b[0;34m[\u001b[0m\u001b[0mi\u001b[0m\u001b[0;34m,\u001b[0m\u001b[0mj\u001b[0m\u001b[0;34m]\u001b[0m\u001b[0;34m\u001b[0m\u001b[0;34m\u001b[0m\u001b[0m\n\u001b[0m", "\u001b[0;31mIndexError\u001b[0m: shape mismatch: indexing arrays could not be broadcast together with shapes (4,) (2,2) " ] } ], "source": [ "a = a.reshape((4,4))\n", "print(a.shape, i.shape, j.shape)\n", "a[i,j]" ] }, { "cell_type": "code", "execution_count": 65, "id": "96dde615", "metadata": { "slideshow": { "slide_type": "slide" } }, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "(4, 4) (2, 2) (2, 2)\n" ] }, { "data": { "text/plain": [ "array([[7, 8],\n", " [1, 6]])" ] }, "execution_count": 65, "metadata": {}, "output_type": "execute_result" } ], "source": [ "i = i.reshape((2,2))\n", "print(a.shape, i.shape, j.shape)\n", "a[i,j]" ] }, { "cell_type": "markdown", "id": "0b3c6050", "metadata": { "slideshow": { "slide_type": "slide" } }, "source": [ "Next thing to look at: https://numpy.org/doc/stable/user/basics.html\n", "\n", "Note / Reference: Many of the examples here are taken or adapted from the official Python and NumPy documentation, which remains the best source for learning comprehensively. This notebook is meant to be an accessible introduction!" 
] } ], "metadata": { "celltoolbar": "Tags", "kernelspec": { "display_name": "Python 3 (ipykernel)", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.9.10" } }, "nbformat": 4, "nbformat_minor": 5 }