About this lesson
As we begin to set up our linear regression model, we must define testing and training splits.
Split Data into Training and Testing Set
When to use
Before running any linear regression, you'll need to designate an X (the feature columns), a y (the target column), and a train/test split.
First, we need to import a couple of things into Jupyter Notebook:
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
Next we need to designate our X and our y:
X = bost[bost.columns]
y = pd.DataFrame(boston.target, columns=['Price'])
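To make the shapes of X and y concrete, here is a minimal sketch that uses a small made-up DataFrame in place of the lesson's Boston data. The names data, rooms, and age are hypothetical and chosen only for illustration; only the Price column name comes from the lesson.

```python
import pandas as pd

# Hypothetical stand-in for the lesson's data: two feature columns and a target.
data = pd.DataFrame({
    "rooms": [5.0, 6.2, 7.1, 4.8, 6.6],
    "age": [65.0, 40.5, 12.3, 88.0, 30.1],
    "Price": [21.0, 27.5, 35.2, 16.4, 30.0],
})

# X holds the features: every column except the target.
X = data.drop(columns=["Price"])

# y holds the target, kept as a one-column DataFrame as in the lesson.
y = data[["Price"]]

print(X.shape, y.shape)
```

X and y must have the same number of rows, since each row of X is paired with the matching row of y.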
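Finally, we need to designate which of our data will be test data and which will be training data for our model. Here test_size=0.4 holds out 40% of the rows for testing, and random_state=10 fixes the shuffle so the same split is produced on every run: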
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.4, random_state=10)
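The effect of the split can be checked on a small example. The sketch below assumes synthetic random data rather than the lesson's dataset; the column names f1, f2, and f3 and the variable rng are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

# Synthetic stand-in data: 10 rows, 3 hypothetical feature columns.
rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(10, 3)), columns=["f1", "f2", "f3"])
y = pd.DataFrame(rng.normal(size=(10, 1)), columns=["Price"])

# Hold out 40% of the rows for testing; random_state fixes the shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, random_state=10
)

# With 10 rows, this gives 6 training rows and 4 test rows.
print(X_train.shape, X_test.shape)

# Repeating the call with the same random_state reproduces the same split.
X_train_again, _, _, _ = train_test_split(X, y, test_size=0.4, random_state=10)
print(X_train.index.equals(X_train_again.index))
```

The training and test sets share no rows, so the test set can serve as unseen data when evaluating the model.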
Hints & tips
- Import LinearRegression
- Import the train_test_split function
- Set our X and y
- Designate which data we want to use for training and which for testing