14. Python - Pandas - Summary Functions & Maps
Summary Methods:
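data_frame.describe() // returns summary statistics for the data (count, mean, std, min, quartiles, max for numeric columns)
data_frame.COLUMNA.mean() // returns the mean
data_frame.COLUMNA.median() // returns the median
data_frame.COLUMNA.unique() // returns only the unique values
data_frame.COLUMNA.value_counts() // returns each value and how many times it occurs
data_frame.COLUMNA.idxmax() // returns the index label of the row holding the max value (use .max() for the max value itself)
data_frame.COLUMNA.sum() // returns the sum of the values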
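A small sketch of the summary methods above, run on a made-up data frame. The column names (points, country) and the values are assumptions chosen only for illustration.

```python
import pandas as pd

# Hypothetical data; the column names and values are placeholders.
data_frame = pd.DataFrame({
    "points": [87, 90, 90, 95],
    "country": ["Italy", "France", "Italy", "Spain"],
})

points_mean = data_frame.points.mean()              # 90.5
points_median = data_frame.points.median()          # 90.0
unique_countries = data_frame.country.unique()      # distinct values, in order of first appearance
country_counts = data_frame.country.value_counts()  # Series mapping each value to its count
top_row_label = data_frame.points.idxmax()          # index label of the largest value: 3
top_points = data_frame.points.max()                # the largest value itself: 95
points_total = data_frame.points.sum()              # 362
```

Note the difference between the last two lookups: idxmax() gives the row label (3), which can then be passed to data_frame.loc to fetch the whole row.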
Maps:
data_frame_COLUMNA_mean = data_frame.COLUMNA.mean()
// finds the mean of the column
data_frame.COLUMNA.map(lambda p: p - data_frame_COLUMNA_mean)
// returns a new Series built from COLUMNA. Each value in the column is passed to the lambda function, which returns that value minus the mean.
// a lambda is an anonymous function.
data_frame_mean = data_frame.COLUMNA.mean()
data_frame.COLUMNA - data_frame_mean
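// returns the same result as the two map steps above. pandas applies the operator to every value in the column, so no lambda is needed, and this is usually faster than map.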
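To check that both approaches really agree, here is a minimal sketch on made-up values. The points column and the variable names are assumptions for illustration.

```python
import pandas as pd

# Hypothetical column of scores.
data_frame = pd.DataFrame({"points": [87, 90, 90, 95]})
points_mean = data_frame.points.mean()  # 90.5

# Approach 1: map calls the lambda once per value.
centered_with_map = data_frame.points.map(lambda p: p - points_mean)

# Approach 2: plain arithmetic on the whole column at once.
centered_with_operator = data_frame.points - points_mean

# Both are new Series holding -3.5, -0.5, -0.5, 4.5.
same_result = centered_with_map.tolist() == centered_with_operator.tolist()
```

Neither line changes data_frame itself. To keep the result, assign it back, for example to a new column.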
data_frame.COLUMNA + "SPACE" + data_frame.COLUMNB
// returns a new Series whose values combine COLUMNA and COLUMNB (both must hold strings) with the string "SPACE" in between
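The same idea with concrete values might look like this. The country and region columns and the " - " separator are assumptions picked for the example.

```python
import pandas as pd

# Hypothetical string columns.
data_frame = pd.DataFrame({
    "country": ["Italy", "France"],
    "region": ["Tuscany", "Bordeaux"],
})

# Row by row: country, then the separator, then region.
combined = data_frame.country + " - " + data_frame.region
# combined holds "Italy - Tuscany" and "France - Bordeaux"
```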
Apply Function: // calls a function on every row (or column) of the data frame; returns a new Series or DataFrame depending on what the function returns
review_points_mean = data_frame.points.mean()
def remean_points(row):
    row.points = row.points - review_points_mean
    return row
// this function takes one whole row of data_frame, subtracts the mean from that row's points value, and returns the changed row
data_frame.apply(remean_points, axis='columns')
// with axis='columns', apply calls the function once for every row of data_frame
// returns a whole new data_frame with the new values in the points column
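Put together as something runnable, the apply steps could be sketched like this. The data is made up, and the country column is only there to show that the other columns pass through untouched.

```python
import pandas as pd

# Hypothetical data with one numeric and one text column.
data_frame = pd.DataFrame({
    "points": [87.0, 90.0, 90.0, 95.0],
    "country": ["Italy", "France", "Italy", "Spain"],
})
review_points_mean = data_frame.points.mean()  # 90.5

def remean_points(row):
    # row is a Series holding one row of data_frame
    row.points = row.points - review_points_mean
    return row

# axis='columns' means the function is called once per row.
remeaned = data_frame.apply(remean_points, axis="columns")
# remeaned.points holds -3.5, -0.5, -0.5, 4.5; country is unchanged
```

In this sketch data_frame keeps its original points values and only remeaned holds the shifted ones. For a simple subtraction like this, data_frame.points - review_points_mean gives the same numbers with less work; apply is more useful when the function needs several columns of the row at once.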