{
 "cells": [
  {
   "attachments": {},
   "cell_type": "markdown",
   "metadata": {
    "id": "TsH_T3FL-Ms4"
   },
   "source": [
    "# 9. DataFrame indexing\n",
    "\n",
    "We have seen two ways of doing indexing in Numpy: with numerical indices and with boolean arrays. We will see that we can apply a very similar approach to DataFrame indexing. However we have one additional component that we didn't have before: while elements in a Numpy array can be purely located by their position (numerical indices), elements of a DataFrame can additionally be located thanks to their column name and index (not necessarily numerical).\n",
    "\n",
    "To understand this, we have to explore a bit further the \"anatomy\" of a dataframe. Let's first load one:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {
    "executionInfo": {
     "elapsed": 611,
     "status": "ok",
     "timestamp": 1614350856080,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "3XFBO4ws-RIM",
    "tags": []
   },
   "outputs": [],
   "source": [
    "import pandas as pd\n",
    "import numpy as np"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {
    "executionInfo": {
     "elapsed": 1034,
     "status": "ok",
     "timestamp": 1614350856765,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "nQCKYzxG_DfN",
    "tags": []
   },
   "outputs": [],
   "source": [
    "composers = pd.read_excel('https://github.com/guiwitz/NumpyPandas_course/blob/master/Data/composers.xlsx?raw=true')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 3,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 173
    },
    "executionInfo": {
     "elapsed": 748,
     "status": "ok",
     "timestamp": 1614350856773,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "9ENJCS3MKLeY",
    "outputId": "5ec5d74e-65b4-4468-e38a-1d51d387cfc8",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>composer</th>\n",
       "      <th>birth</th>\n",
       "      <th>death</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>Mahler</td>\n",
       "      <td>1860</td>\n",
       "      <td>1911</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Beethoven</td>\n",
       "      <td>1770</td>\n",
       "      <td>1827</td>\n",
       "      <td>Bonn</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>Puccini</td>\n",
       "      <td>1858</td>\n",
       "      <td>1924</td>\n",
       "      <td>Lucques</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Shostakovich</td>\n",
       "      <td>1906</td>\n",
       "      <td>1975</td>\n",
       "      <td>Saint-Petersburg</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       composer  birth  death              city\n",
       "0        Mahler   1860   1911           Kaliste\n",
       "1     Beethoven   1770   1827              Bonn\n",
       "2       Puccini   1858   1924           Lucques\n",
       "3  Shostakovich   1906   1975  Saint-Petersburg"
      ]
     },
     "execution_count": 3,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "JMfsEzTJKMxL"
   },
   "source": [
    "## Accessing columns\n",
    "\n",
    "We have already see how to access single columns e.g.:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 305,
     "status": "ok",
     "timestamp": 1614350856994,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "iNHtJLM6SVJ-",
    "outputId": "c8dd79a2-832f-4e8d-f789-a48aa155e5f7",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0    1860\n",
       "1    1770\n",
       "2    1858\n",
       "3    1906\n",
       "Name: birth, dtype: int64"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers['birth']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "phMp80ViSXAT"
   },
   "source": [
    "We can also access muliple columns by specifying a list of those that we need:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 5,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 173
    },
    "executionInfo": {
     "elapsed": 503,
     "status": "ok",
     "timestamp": 1614350858511,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "c5NVRE9JSdH_",
    "outputId": "549adef4-d77d-40be-9a98-3fb7accf88e9",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>birth</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>0</th>\n",
       "      <td>1860</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>1770</td>\n",
       "      <td>Bonn</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>2</th>\n",
       "      <td>1858</td>\n",
       "      <td>Lucques</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>1906</td>\n",
       "      <td>Saint-Petersburg</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   birth              city\n",
       "0   1860           Kaliste\n",
       "1   1770              Bonn\n",
       "2   1858           Lucques\n",
       "3   1906  Saint-Petersburg"
      ]
     },
     "execution_count": 5,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers[['birth', 'city']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "s6YDATmkSgee"
   },
   "source": [
    "As you can see, as soon as we have more than one column, the returned object is a DataFrame.\n",
    "\n",
    "## Accessing indices\n",
    "\n",
    "We have seen an example with Numpy where we accessed elements in a 2D array using a pair of indices e.g.:\n",
    "\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 424,
     "status": "ok",
     "timestamp": 1614350860255,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "ZNOGZuTvYNQL",
    "outputId": "52417273-183d-49c5-a4a1-9419a2f50be4",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.27943032,  0.53917334,  0.04905016, -0.29523685, -0.32803448],\n",
       "       [ 1.29974059, -0.87046224,  0.56457039, -0.16125738,  1.90088289],\n",
       "       [ 1.18116165,  0.95418676,  0.47466928,  1.12268135, -0.74403994]])"
      ]
     },
     "execution_count": 6,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_array = np.random.normal(size=(3,5))\n",
    "my_array"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 560,
     "status": "ok",
     "timestamp": 1614350860949,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "IQwWkUybYVlb",
    "outputId": "0b7e7845-0b36-4a87-fc2c-69f772c032a0",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "0.9541867643923373"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_array[2,1]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "M9h0HJWBYMok"
   },
   "source": [
    "Even though our DataFrame is a two dimensional object, we can't access its elements in the same way. For example we cannot recover the top left element (```Mahler```) by using:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 562
    },
    "executionInfo": {
     "elapsed": 436,
     "status": "error",
     "timestamp": 1614350861908,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "4XIaEAR7Yuij",
    "outputId": "f1398aec-c5f3-4c25-ad8a-f7bf19a4f393",
    "tags": []
   },
   "outputs": [
    {
     "ename": "KeyError",
     "evalue": "(0, 0)",
     "output_type": "error",
     "traceback": [
      "\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "File \u001b[0;32m~/mambaforge/envs/DAVPy2023/lib/python3.10/site-packages/pandas/core/indexes/base.py:3802\u001b[0m, in \u001b[0;36mIndex.get_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m   3801\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m-> 3802\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_engine\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget_loc\u001b[49m\u001b[43m(\u001b[49m\u001b[43mcasted_key\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   3803\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m err:\n",
      "File \u001b[0;32m~/mambaforge/envs/DAVPy2023/lib/python3.10/site-packages/pandas/_libs/index.pyx:138\u001b[0m, in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc\u001b[0;34m()\u001b[0m\n",
      "File \u001b[0;32m~/mambaforge/envs/DAVPy2023/lib/python3.10/site-packages/pandas/_libs/index.pyx:165\u001b[0m, in \u001b[0;36mpandas._libs.index.IndexEngine.get_loc\u001b[0;34m()\u001b[0m\n",
      "File \u001b[0;32mpandas/_libs/hashtable_class_helper.pxi:5745\u001b[0m, in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item\u001b[0;34m()\u001b[0m\n",
      "File \u001b[0;32mpandas/_libs/hashtable_class_helper.pxi:5753\u001b[0m, in \u001b[0;36mpandas._libs.hashtable.PyObjectHashTable.get_item\u001b[0;34m()\u001b[0m\n",
      "\u001b[0;31mKeyError\u001b[0m: (0, 0)",
      "\nThe above exception was the direct cause of the following exception:\n",
      "\u001b[0;31mKeyError\u001b[0m                                  Traceback (most recent call last)",
      "Cell \u001b[0;32mIn[8], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mcomposers\u001b[49m\u001b[43m[\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m,\u001b[49m\u001b[38;5;241;43m0\u001b[39;49m\u001b[43m]\u001b[49m\n",
      "File \u001b[0;32m~/mambaforge/envs/DAVPy2023/lib/python3.10/site-packages/pandas/core/frame.py:3807\u001b[0m, in \u001b[0;36mDataFrame.__getitem__\u001b[0;34m(self, key)\u001b[0m\n\u001b[1;32m   3805\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mcolumns\u001b[38;5;241m.\u001b[39mnlevels \u001b[38;5;241m>\u001b[39m \u001b[38;5;241m1\u001b[39m:\n\u001b[1;32m   3806\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_getitem_multilevel(key)\n\u001b[0;32m-> 3807\u001b[0m indexer \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mcolumns\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mget_loc\u001b[49m\u001b[43m(\u001b[49m\u001b[43mkey\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m   3808\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m is_integer(indexer):\n\u001b[1;32m   3809\u001b[0m     indexer \u001b[38;5;241m=\u001b[39m [indexer]\n",
      "File \u001b[0;32m~/mambaforge/envs/DAVPy2023/lib/python3.10/site-packages/pandas/core/indexes/base.py:3804\u001b[0m, in \u001b[0;36mIndex.get_loc\u001b[0;34m(self, key, method, tolerance)\u001b[0m\n\u001b[1;32m   3802\u001b[0m     \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_engine\u001b[38;5;241m.\u001b[39mget_loc(casted_key)\n\u001b[1;32m   3803\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m err:\n\u001b[0;32m-> 3804\u001b[0m     \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mKeyError\u001b[39;00m(key) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01merr\u001b[39;00m\n\u001b[1;32m   3805\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m:\n\u001b[1;32m   3806\u001b[0m     \u001b[38;5;66;03m# If we have a listlike key, _check_indexing_error will raise\u001b[39;00m\n\u001b[1;32m   3807\u001b[0m     \u001b[38;5;66;03m#  InvalidIndexError. Otherwise we fall through and re-raise\u001b[39;00m\n\u001b[1;32m   3808\u001b[0m     \u001b[38;5;66;03m#  the TypeError.\u001b[39;00m\n\u001b[1;32m   3809\u001b[0m     \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_check_indexing_error(key)\n",
      "\u001b[0;31mKeyError\u001b[0m: (0, 0)"
     ]
    }
   ],
   "source": [
    "composers[0,0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "ZxDFtA8T1KFN"
   },
   "source": [
    "As Pandas has columns names and indices (the bold numbers on the left of the DataFrame), there are specific functions to access elements either by using those values or directly by numerical indexing.\n",
    "\n",
    "### ```loc```\n",
    "\n",
    "The ```.loc[index, name]``` method allows us to access a specific element situated at a specific ```index``` (row) and ```name``` (column). For example:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 35
    },
    "executionInfo": {
     "elapsed": 497,
     "status": "ok",
     "timestamp": 1614350864943,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "-Qyy11xVU0uQ",
    "outputId": "3647fd3c-d984-41c5-a1f2-db1a762fecfe",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Saint-Petersburg'"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.loc[3, 'city']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "IXk1dX8nZfL7"
   },
   "source": [
    "We can also recover all the items of a given index by only speciying the latter:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 432,
     "status": "ok",
     "timestamp": 1614350866412,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "RTqvuO04Z0SY",
    "outputId": "4bee092c-86ab-4b3b-edc0-f5d225a0f5ab",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "composer        Shostakovich\n",
       "birth                   1906\n",
       "death                   1975\n",
       "city        Saint-Petersburg\n",
       "Name: 3, dtype: object"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.loc[3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "m5fRD3_yVBfF"
   },
   "source": [
    "We see that this returns a simple Series with all the items defined in the DataFrame. Just like with the columns we can also pass a **list of indices** to recover more than one line, in which case we recover a DataFrame:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 11,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 111
    },
    "executionInfo": {
     "elapsed": 337,
     "status": "ok",
     "timestamp": 1614350868584,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "csGE0hcLWWFh",
    "outputId": "ba7ae9cc-bdb6-4c3d-fbd9-5b970087ba62",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>city</th>\n",
       "      <th>birth</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>1</th>\n",
       "      <td>Bonn</td>\n",
       "      <td>1770</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>3</th>\n",
       "      <td>Saint-Petersburg</td>\n",
       "      <td>1906</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "               city  birth\n",
       "1              Bonn   1770\n",
       "3  Saint-Petersburg   1906"
      ]
     },
     "execution_count": 11,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.loc[[1,3], ['city','birth']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "F2TUgZ_zWVZT"
   },
   "source": [
    "Note that the index **does not have to be a sequential integer**. For example we can replace the index with a list of strings:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 12,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 173
    },
    "executionInfo": {
     "elapsed": 492,
     "status": "ok",
     "timestamp": 1614350915666,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "NE7Cy81ASn9s",
    "outputId": "9e5f9eea-3920-4613-faec-96d28f55eada",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>composer</th>\n",
       "      <th>birth</th>\n",
       "      <th>death</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>a</th>\n",
       "      <td>Mahler</td>\n",
       "      <td>1860</td>\n",
       "      <td>1911</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>b</th>\n",
       "      <td>Beethoven</td>\n",
       "      <td>1770</td>\n",
       "      <td>1827</td>\n",
       "      <td>Bonn</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>c</th>\n",
       "      <td>Puccini</td>\n",
       "      <td>1858</td>\n",
       "      <td>1924</td>\n",
       "      <td>Lucques</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>Shostakovich</td>\n",
       "      <td>1906</td>\n",
       "      <td>1975</td>\n",
       "      <td>Saint-Petersburg</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       composer  birth  death              city\n",
       "a        Mahler   1860   1911           Kaliste\n",
       "b     Beethoven   1770   1827              Bonn\n",
       "c       Puccini   1858   1924           Lucques\n",
       "d  Shostakovich   1906   1975  Saint-Petersburg"
      ]
     },
     "execution_count": 12,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.index = np.array(['a','b','c','d'])\n",
    "composers"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "uTim5DLdSrXY"
   },
   "source": [
    "Here we can still use the ```.loc``` method even though we don't deal with integers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 13,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 111
    },
    "executionInfo": {
     "elapsed": 522,
     "status": "ok",
     "timestamp": 1614350936044,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "Eh1SIWHTTWG5",
    "outputId": "f49f3ab9-4d22-4a4b-e358-d50bfb52a737",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>composer</th>\n",
       "      <th>birth</th>\n",
       "      <th>death</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>a</th>\n",
       "      <td>Mahler</td>\n",
       "      <td>1860</td>\n",
       "      <td>1911</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>c</th>\n",
       "      <td>Puccini</td>\n",
       "      <td>1858</td>\n",
       "      <td>1924</td>\n",
       "      <td>Lucques</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "  composer  birth  death     city\n",
       "a   Mahler   1860   1911  Kaliste\n",
       "c  Puccini   1858   1924  Lucques"
      ]
     },
     "execution_count": 13,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.loc[['a','c']]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "4FEqqQkFWGb-"
   },
   "source": [
    "### ```iloc```\n",
    "\n",
    "The alternative to the ```.loc[index, name]``` method is the ```.iloc[row, column]``` method. This method is closer to the Numpy approach, as here we can use the **actual location** in the DataFrame to recover elements. For example to recover the ```city``` of Puccini, we need the third line (row index = 2) and the fourth columns (column index = 3):"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 173
    },
    "executionInfo": {
     "elapsed": 588,
     "status": "ok",
     "timestamp": 1614350971083,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "6m6qaZkzXAoJ",
    "outputId": "a8cfc7ac-e212-4ad5-b246-bb97233337c9",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>composer</th>\n",
       "      <th>birth</th>\n",
       "      <th>death</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>a</th>\n",
       "      <td>Mahler</td>\n",
       "      <td>1860</td>\n",
       "      <td>1911</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>b</th>\n",
       "      <td>Beethoven</td>\n",
       "      <td>1770</td>\n",
       "      <td>1827</td>\n",
       "      <td>Bonn</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>c</th>\n",
       "      <td>Puccini</td>\n",
       "      <td>1858</td>\n",
       "      <td>1924</td>\n",
       "      <td>Lucques</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>Shostakovich</td>\n",
       "      <td>1906</td>\n",
       "      <td>1975</td>\n",
       "      <td>Saint-Petersburg</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       composer  birth  death              city\n",
       "a        Mahler   1860   1911           Kaliste\n",
       "b     Beethoven   1770   1827              Bonn\n",
       "c       Puccini   1858   1924           Lucques\n",
       "d  Shostakovich   1906   1975  Saint-Petersburg"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 35
    },
    "executionInfo": {
     "elapsed": 456,
     "status": "ok",
     "timestamp": 1614350975979,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "FW0FuGkiXRQe",
    "outputId": "f4b87a96-5a0d-4731-9ba5-cfcc6f6c8837",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "'Lucques'"
      ]
     },
     "execution_count": 15,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.iloc[2,3]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "xtFC74r3XiCW"
   },
   "source": [
    "With ```iloc``` we can now use the **same indexing** approach as seen with Numpy arrays, i.e. we can select a range of values:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 16,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 111
    },
    "executionInfo": {
     "elapsed": 920,
     "status": "ok",
     "timestamp": 1614353640093,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "aXsPnfqFXsNG",
    "outputId": "5df3fca2-25a3-4e71-b3d0-331aea6bfbe4",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>death</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>a</th>\n",
       "      <td>1911</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>b</th>\n",
       "      <td>1827</td>\n",
       "      <td>Bonn</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "   death     city\n",
       "a   1911  Kaliste\n",
       "b   1827     Bonn"
      ]
     },
     "execution_count": 16,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.iloc[0:2,2:4]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "eCYc3srDXwVy"
   },
   "source": [
    "## Logical indexing\n",
    "\n",
    "We have seen before that we can create a boolean array by using for example a comparison such as:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 17,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 535,
     "status": "ok",
     "timestamp": 1614353663081,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "xCQz_qDR-fy0",
    "outputId": "5c409085-cf8f-4d6b-c5e0-3252dddb8b90",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[-0.27943032,  0.53917334,  0.04905016, -0.29523685, -0.32803448],\n",
       "       [ 1.29974059, -0.87046224,  0.56457039, -0.16125738,  1.90088289],\n",
       "       [ 1.18116165,  0.95418676,  0.47466928,  1.12268135, -0.74403994]])"
      ]
     },
     "execution_count": 17,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_array"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 460,
     "status": "ok",
     "timestamp": 1614353666053,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "7imUxMM5aigA",
    "outputId": "dcddbf75-0549-46c5-9d61-e396211be193",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[False,  True,  True, False, False],\n",
       "       [ True, False,  True, False,  True],\n",
       "       [ True,  True,  True,  True, False]])"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_array > 0"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "2kgIxf-WajyQ"
   },
   "source": [
    "and then use it to **extract** the ```True``` elements in another array:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 19,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 473,
     "status": "ok",
     "timestamp": 1614353674913,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "tBGIUiwOaqg5",
    "outputId": "82d2f2f8-d817-4036-8698-3049b37607bb",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([0.53917334, 0.04905016, 1.29974059, 0.56457039, 1.90088289,\n",
       "       1.18116165, 0.95418676, 0.47466928, 1.12268135])"
      ]
     },
     "execution_count": 19,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "my_array[my_array > 0]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "C3l5qyXOauTX"
   },
   "source": [
    "We can apply the **same logic** to Pandas DataFrames. For example, let's select composers by data of birth. Let's see what happens:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 173
    },
    "executionInfo": {
     "elapsed": 439,
     "status": "ok",
     "timestamp": 1614353715375,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "IQZOv2M0-sRO",
    "outputId": "2783485c-733c-4162-95f0-abfed78714e4",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>composer</th>\n",
       "      <th>birth</th>\n",
       "      <th>death</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>a</th>\n",
       "      <td>Mahler</td>\n",
       "      <td>1860</td>\n",
       "      <td>1911</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>b</th>\n",
       "      <td>Beethoven</td>\n",
       "      <td>1770</td>\n",
       "      <td>1827</td>\n",
       "      <td>Bonn</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>c</th>\n",
       "      <td>Puccini</td>\n",
       "      <td>1858</td>\n",
       "      <td>1924</td>\n",
       "      <td>Lucques</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>Shostakovich</td>\n",
       "      <td>1906</td>\n",
       "      <td>1975</td>\n",
       "      <td>Saint-Petersburg</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       composer  birth  death              city\n",
       "a        Mahler   1860   1911           Kaliste\n",
       "b     Beethoven   1770   1827              Bonn\n",
       "c       Puccini   1858   1924           Lucques\n",
       "d  Shostakovich   1906   1975  Saint-Petersburg"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 441,
     "status": "ok",
     "timestamp": 1614353717582,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "8Xl6Rq4nbFCy",
    "outputId": "35c78351-d0ea-4136-9ee4-23ea027a1633",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "a     True\n",
       "b    False\n",
       "c    False\n",
       "d     True\n",
       "Name: birth, dtype: bool"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers['birth'] > 1859"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "Ch-USCi5bLwa"
   },
   "source": [
    "We see that we obtain just like we obtained a boolean array previously, we obtain here a **boolean series**. And here as well we can use it to extract elements from the DataFrame. This works again with simple brackets:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/",
     "height": 111
    },
    "executionInfo": {
     "elapsed": 496,
     "status": "ok",
     "timestamp": 1614353737020,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "6J5oP2xdbiTJ",
    "outputId": "480de67c-79c1-4cc9-9836-d911c97fa078",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/html": [
       "<div>\n",
       "<style scoped>\n",
       "    .dataframe tbody tr th:only-of-type {\n",
       "        vertical-align: middle;\n",
       "    }\n",
       "\n",
       "    .dataframe tbody tr th {\n",
       "        vertical-align: top;\n",
       "    }\n",
       "\n",
       "    .dataframe thead th {\n",
       "        text-align: right;\n",
       "    }\n",
       "</style>\n",
       "<table border=\"1\" class=\"dataframe\">\n",
       "  <thead>\n",
       "    <tr style=\"text-align: right;\">\n",
       "      <th></th>\n",
       "      <th>composer</th>\n",
       "      <th>birth</th>\n",
       "      <th>death</th>\n",
       "      <th>city</th>\n",
       "    </tr>\n",
       "  </thead>\n",
       "  <tbody>\n",
       "    <tr>\n",
       "      <th>a</th>\n",
       "      <td>Mahler</td>\n",
       "      <td>1860</td>\n",
       "      <td>1911</td>\n",
       "      <td>Kaliste</td>\n",
       "    </tr>\n",
       "    <tr>\n",
       "      <th>d</th>\n",
       "      <td>Shostakovich</td>\n",
       "      <td>1906</td>\n",
       "      <td>1975</td>\n",
       "      <td>Saint-Petersburg</td>\n",
       "    </tr>\n",
       "  </tbody>\n",
       "</table>\n",
       "</div>"
      ],
      "text/plain": [
       "       composer  birth  death              city\n",
       "a        Mahler   1860   1911           Kaliste\n",
       "d  Shostakovich   1906   1975  Saint-Petersburg"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers[composers['birth'] > 1859]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "CVdKrt8Gb9Dy"
   },
   "source": [
    "As well as with the more complex function ```loc```:"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 23,
   "metadata": {
    "colab": {
     "base_uri": "https://localhost:8080/"
    },
    "executionInfo": {
     "elapsed": 561,
     "status": "ok",
     "timestamp": 1614353757398,
     "user": {
      "displayName": "Guillaume Witz",
      "photoUrl": "https://lh3.googleusercontent.com/a-/AOh14GgT0K2JVYzEsjzsS5nhkUVjUrSIJ5jHzXnBoYrmVf8=s64",
      "userId": "16033870147214403532"
     },
     "user_tz": -60
    },
    "id": "FR4RLKnJblVs",
    "outputId": "727492cf-4a54-4a56-fd48-b60c9514bfbc",
    "tags": []
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "a             Kaliste\n",
       "d    Saint-Petersburg\n",
       "Name: city, dtype: object"
      ]
     },
     "execution_count": 23,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "composers.loc[composers['birth'] > 1859, 'city']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "RnWH-dSscDO9"
   },
   "source": [
    "This is extremely useful to extract only specific elements from a large table if some values should be discarded"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {
    "id": "0LCIbzojcPxr"
   },
   "source": [
    "## Exercises\n",
    "\n",
    "1. Import the penguin dataset https://raw.githubusercontent.com/allisonhorst/palmerpenguins/master/inst/extdata/penguins.csv\n",
    "2. Create a new dataframe ```new_dataframe``` by extracting the ```species```, ```bill_length_mm``` and ```body_mass_g``` columns.\n",
    "3. Extract the row with index 4 of ```new_dataframe```\n",
    "4. Extract the ```bill_length_mm``` of the 3 first rows of ```new_dataframe```\n",
    "5. Extract all rows for which the ```body_mass_g > 6000```"
   ]
  }
 ],
 "metadata": {
  "colab": {
   "authorship_tag": "ABX9TyMpzZqNr/SheIIWLwhtDLw2",
   "collapsed_sections": [],
   "name": "08-DataFrame_indexing.ipynb",
   "provenance": [],
   "toc_visible": true
  },
  "kernelspec": {
   "display_name": "Python 3 (ipykernel)",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.10.10"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}