{ "cells": [ { "cell_type": "markdown", "id": "7415d1af-3f30-4166-9f72-7b2ec8a42f61", "metadata": {}, "source": [ "# Predicting categories: logistic regression\n", "\n", "In the previous chapters we have seen how to create an ML model that predicts a **continuous** variable such as the price of an apartment. However, very often we want to predict a category, i.e. a **discrete** value. Such a prediction is generally called **classification** and can be obtained by multiple methods. Here we will first have a look at **logistic regression**, which is conceptually close to the linear regression seen before. We will first try to solve the problem with linear regression to understand why it is not a good solution." ] }, { "cell_type": "markdown", "id": "324e28c0-c74a-4ea3-878a-cbf0aa53e396", "metadata": {}, "source": [ "## Data exploration\n", "\n", "Here we use a dataset in which different types of movements were recorded, resulting in variables indicating acceleration and angular velocity." ] }, { "cell_type": "code", "execution_count": 3, "id": "74fbc59c-6326-4ae2-898d-d61bf70fee80", "metadata": { "tags": [] }, "outputs": [], "source": [ "import pandas as pd\n", "import numpy as np\n", "import matplotlib.pyplot as plt\n", "import seaborn as sns\n", "\n", "movement = pd.read_csv('../data/movement.csv')\n" ] }, { "cell_type": "code", "execution_count": 4, "id": "94e13e6d-4cfd-43c7-b116-c429b77e5b51", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
 | t | x_acc | y_acc | z_acc | x_rot | y_rot | z_rot | move_type |
---|---|---|---|---|---|---|---|---|
0 | 0.001658 | 2.603237 | -0.068707 | 9.457633 | 0.098532 | 0.079406 | -1.539468 | 1 |
1 | 0.011610 | 1.871558 | -0.642763 | 10.219399 | 0.128653 | -0.004559 | -1.495045 | 1 |
2 | 0.021562 | 1.897454 | -0.996478 | 10.046209 | 0.142974 | -0.081562 | -1.505501 | 1 |
3 | 0.031514 | 2.120041 | -0.338596 | 9.839938 | 0.115268 | -0.088669 | -1.557105 | 1 |
4 | 0.041466 | 2.452201 | -0.256117 | 9.470506 | 0.050623 | -0.083579 | -1.636483 | 1 |
LogisticRegression()
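The introduction argues that linear regression is a poor fit for predicting a category. The sketch below illustrates that point on a hypothetical one-dimensional toy dataset (the values of `x` and `y` are made up for illustration and are not taken from `movement.csv`). It fits a least-squares line and a logistic model trained by plain gradient descent, then evaluates both far from the training data. This is a from-scratch illustration under those assumptions, not the algorithm scikit-learn's `LogisticRegression` uses internally.

```python
import math

# Hypothetical toy data: one feature x and a binary label y (0 or 1).
x = [-3.0, -2.0, -1.0, -0.5, 0.5, 1.0, 2.0, 3.0]
y = [0, 0, 0, 0, 1, 1, 1, 1]
n = len(x)

# --- Linear regression: closed-form least squares for one feature ---
x_mean = sum(x) / n
y_mean = sum(y) / n
covariance = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
variance = sum((xi - x_mean) ** 2 for xi in x)
slope = covariance / variance
intercept = y_mean - slope * x_mean


def linear_predict(value):
    """Prediction of the fitted straight line (unbounded)."""
    return intercept + slope * value


# --- Logistic regression: gradient descent on the mean log-loss ---
def sigmoid(z):
    """Map any real number to the interval [0, 1]."""
    return 1.0 / (1.0 + math.exp(-z))


w, b = 0.0, 0.0
learning_rate = 0.1
for _ in range(2000):
    errors = [sigmoid(w * xi + b) - yi for xi, yi in zip(x, y)]
    grad_w = sum(e * xi for e, xi in zip(errors, x)) / n
    grad_b = sum(errors) / n
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b


def logistic_predict_proba(value):
    """Estimated probability that the label is 1."""
    return sigmoid(w * value + b)


# Evaluate both models well outside the training range.
far_point = 10.0
linear_far = linear_predict(far_point)
logistic_far = logistic_predict_proba(far_point)
predicted_labels = [int(logistic_predict_proba(xi) >= 0.5) for xi in x]

print("linear prediction at x=10:  ", linear_far)
print("logistic probability at x=10:", logistic_far)
print("logistic labels on training data:", predicted_labels)
```

The straight line keeps growing with `x`, so its output cannot be read as a probability, whereas the sigmoid keeps the logistic model's output between 0 and 1 and a simple 0.5 threshold turns it into a category.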