9. Python - Data Analytics - Creating a Model

import pandas as pd

data_file_path = r'filepath\file.csv'  # indicate the path (raw string, so the backslash is not read as an escape)

data_data = pd.read_csv(data_file_path)  # reads the data and creates a data frame

data_data.describe()  # shows summary statistics for the numeric columns

data_data.columns  # lists the column headers

data_data.head()  # shows the first five rows of the data frame
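
The inspection calls above can be tried without a file on disk. This is only a minimal sketch: the table is made up (the column names Rooms, Area, and Price and all values are hypothetical), and it is passed to read_csv through io.StringIO in place of a real path.

```python
import io

import pandas as pd

# Hypothetical CSV content standing in for a real file on disk.
csv_text = """Rooms,Area,Price
2,60,150000
3,85,210000
4,120,320000
3,90,230000
"""

# read_csv accepts a path or any file-like object.
data_data = pd.read_csv(io.StringIO(csv_text))

summary = data_data.describe()      # count, mean, std, min, quartiles, max per numeric column
columns = list(data_data.columns)   # column headers as a plain list
first_rows = data_data.head(2)      # first two rows (head() defaults to five)

print(summary)
print(columns)
print(first_rows)
```

Passing a number to head() is handy when five rows are too many or too few for a quick look.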

Read Methods

pd.read_csv(datapath)  # reads a comma-separated text file

pd.read_excel(datapath)  # reads an Excel workbook

Prediction Target: the column we want to predict. Usually represented as y.

Use dot notation to select the prediction target column from the data frame.

y = data_data.columnname  # this creates a pandas Series holding the prediction target
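
A short sketch of selecting the target, assuming a made-up data frame in which the hypothetical column Price plays the role of columnname. It also shows bracket notation, which selects the same column and is the safer choice when a column name contains spaces or clashes with a data frame attribute.

```python
import pandas as pd

# Hypothetical data; 'Price' stands in for columnname.
data_data = pd.DataFrame({
    'Rooms': [2, 3, 4],
    'Area': [60, 85, 120],
    'Price': [150000, 210000, 320000],
})

y = data_data.Price              # dot notation
y_bracket = data_data['Price']   # bracket notation, same column

print(type(y).__name__)          # the selected column is a Series, not a data frame
print(y.equals(y_bracket))       # both notations give the same values
```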

Features: the columns used as inputs to make the predictions. Usually represented as X.

Create a list of features to use by making a list of columns.

data_features = ['Feature1', 'Feature2', 'Feature3']

x = data_data[data_features]  # this creates a data frame holding only the feature columns
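
A small sketch of the feature selection, again on made-up data (the column names are hypothetical). Indexing with a list of column names returns a data frame, even when the list holds a single name.

```python
import pandas as pd

# Hypothetical data for illustration.
data_data = pd.DataFrame({
    'Rooms': [2, 3, 4],
    'Area': [60, 85, 120],
    'Price': [150000, 210000, 320000],
})

data_features = ['Rooms', 'Area']   # list of column names to use as inputs
x = data_data[data_features]        # list indexing keeps only those columns

print(type(x).__name__)
print(x.shape)                      # (number of rows, number of features)
```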

Scikit-learn Library to Create Models:

from sklearn.tree import DecisionTreeRegressor

The steps to building and using a model are:

Define: What type of model will it be? A decision tree? Some other type of model? Some other parameters of the model type are specified too.

Fit: Capture patterns from provided data. This is the heart of modeling.

Predict: Use the fitted model to generate predictions from feature data.

Evaluate: Determine how accurate the model's predictions are.
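
The four steps can be sketched end to end as follows. The data is made up, and mean absolute error is an assumption chosen here for the Evaluate step, since the notes do not name a metric. The error is measured on the same rows the model was fitted on, so it will look better than it would on unseen data.

```python
import pandas as pd
from sklearn.metrics import mean_absolute_error
from sklearn.tree import DecisionTreeRegressor

# Hypothetical data for illustration.
data_data = pd.DataFrame({
    'Rooms': [2, 3, 4, 3, 5],
    'Area': [60, 85, 120, 90, 150],
    'Price': [150000, 210000, 320000, 230000, 400000],
})
y = data_data.Price
x = data_data[['Rooms', 'Area']]

# Define: choose the model type and its parameters.
model = DecisionTreeRegressor(random_state=1)

# Fit: capture patterns from the features and target.
model.fit(x, y)

# Predict: one prediction per row of features.
predictions = model.predict(x)

# Evaluate: average absolute gap between actual and predicted values
# (in-sample here, so this figure is optimistic).
mae = mean_absolute_error(y, predictions)

print(predictions)
print(mae)
```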

Define Model with sklearn:

data_data_model = DecisionTreeRegressor(random_state=1)

# defines a decision tree model; specifying random_state keeps the results the same on every run.

Fit the model with the features and the prediction target:

data_data_model.fit(x,y)

Predict with the model:

print(data_data_model.predict(dataframeOFfeaturesTOpredictY))
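
A minimal sketch of predicting for rows the model has not seen, on made-up data (the column names and the new_rows frame are hypothetical). The data frame passed to predict needs the same feature columns the model was fitted on; predict returns one value per row.

```python
import pandas as pd
from sklearn.tree import DecisionTreeRegressor

# Hypothetical training data for illustration.
data_data = pd.DataFrame({
    'Rooms': [2, 3, 4, 3, 5],
    'Area': [60, 85, 120, 90, 150],
    'Price': [150000, 210000, 320000, 230000, 400000],
})
data_features = ['Rooms', 'Area']
x = data_data[data_features]
y = data_data.Price

data_data_model = DecisionTreeRegressor(random_state=1)
data_data_model.fit(x, y)

# New rows to predict for; selecting data_features keeps the
# columns identical to the ones used for fitting.
new_rows = pd.DataFrame({'Rooms': [3, 4], 'Area': [88, 125]})
predicted = data_data_model.predict(new_rows[data_features])

print(predicted)
```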

Learning Notes