{ "cells": [ { "cell_type": "markdown", "source": [ "## Introduction to Modeling" ], "metadata": { "id": "gIPdySTgL9k7" } }, { "cell_type": "markdown", "source": [ "\n", "\n", "---\n", "\n" ], "metadata": { "id": "eeMKpX2jMDqM" } }, { "cell_type": "markdown", "source": [ "### Demonstrate idea behind MSE" ], "metadata": { "id": "6uZyaJdzL61x" } }, { "cell_type": "markdown", "source": [ "Complete below" ], "metadata": { "id": "RM5qknxdMU8T" } }, { "cell_type": "code", "source": [], "metadata": { "id": "i-e56REtMXSd" }, "execution_count": null, "outputs": [] }, { "cell_type": "markdown", "source": [ "\n", "\n", "---\n", "\n" ], "metadata": { "id": "OLNuc4ZfMEwG" } }, { "cell_type": "markdown", "metadata": { "id": "BruPxyad0fWj" }, "source": [ "### Linear regression" ] }, { "cell_type": "markdown", "metadata": { "id": "tObKDZrP0fWk" }, "source": [ "**Simple Example with Simulated Data**\n", "\n", "For this example, we are going to keep it simple, stay in 2 dimensions, and use OLS to fit a line to some data." 
] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Tmxldb2C0fWk" }, "outputs": [], "source": [ "import numpy as np\n", "%matplotlib inline\n", "# this accommodates high resolution displays\n", "%config InlineBackend.figure_format = 'retina'\n", "import matplotlib.pyplot as plt" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "tzo-Y8-C0fWl" }, "outputs": [], "source": [ "n = 10\n", "np.random.seed(146)\n", "x = np.random.normal(size=(n,1))" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "Rixag8Kr0fWl", "collapsed": true }, "outputs": [], "source": [ "noise_strength = 0.5\n", "np.random.seed(147)\n", "noise = np.random.normal(scale=noise_strength, size=(n,1))\n", "y = 1 + 2*x + noise\n", "plt.scatter(x,y, label='Original data', color='k')\n", "plt.xlabel('x')\n", "plt.ylabel('y')\n", "plt.show()" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "id": "zfkOmO1v0fWl" }, "outputs": [], "source": [ "from sklearn.linear_model import LinearRegression as LR\n" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "tags": [], "id": "r2FlKS5w0fWl", "outputId": "b74fc26d-5825-4cb5-a5a5-e2981d7de01f", "colab": { "base_uri": "https://localhost:8080/", "height": 78 } }, "outputs": [ { "output_type": "execute_result", "data": { "text/plain": [ "LinearRegression()" ], "text/html": [ "
| | Cement | Blast_Furnace_Slag | Fly_Ash | Water | Superplasticizer | Coarse_Aggregate | Fine_Aggregate | Age | Concrete_compressive_strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.986111 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.887366 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.269535 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.052780 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.296075 |