15. Python - Pandas - Sorting/Grouping
Group By:
// Basic Methods
data_frame.groupby('COLUMNA').COLUMNAA.count()
// creates a Series with the unique values of COLUMNA as the index and, as values, the count of non-null COLUMNAA entries in each group
// min(), max(), etc. can be used instead of count()
// COLUMNAA is a placeholder for the column whose values are summarized within each COLUMNA group.
// can group by more than one column by passing a list, e.g. groupby(['COLUMNA','COLUMNB']).
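// A minimal sketch of the basic methods, assuming a small made-up data frame (the values are hypothetical, the column names are the placeholders used above). It also shows size(), which counts rows including missing values, next to count(), which skips them.
```python
import pandas as pd

# Hypothetical sample data; None becomes a missing value (NaN)
data_frame = pd.DataFrame({
    "COLUMNA": ["x", "x", "y", "y", "y"],
    "COLUMNAA": [10, 20, 30, None, 50],
})

grouped = data_frame.groupby("COLUMNA").COLUMNAA

counts = grouped.count()  # non-null values per group
sizes = grouped.size()    # rows per group, missing values included
lows = grouped.min()      # smallest value per group

print(counts)
print(sizes)
print(lows)
```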
// Lambda Functions
data_frame.groupby('COLUMNA').apply(lambda p: p.COLUMNAA.iloc[0])
// this gets the first COLUMNAA value in each group made from COLUMNA; the lambda receives each group as a data frame p.
// agg() function // with a list of functions, this creates a data frame
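// A small sketch of the lambda approach on hypothetical data. Here the column is selected before apply(), so the lambda receives each group's COLUMNAA values as a Series; this is a variation on the line above, not the only way to write it.
```python
import pandas as pd

# Hypothetical sample data
data_frame = pd.DataFrame({
    "COLUMNA": ["x", "y", "x", "y"],
    "COLUMNAA": [1, 2, 3, 4],
})

# For each COLUMNA group, take the first COLUMNAA value in row order
first_items = data_frame.groupby("COLUMNA").COLUMNAA.apply(lambda s: s.iloc[0])

print(first_items)
```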
data_frame.groupby(['COLUMNA']).COLUMNB.agg([len,min,max])
// groups by the unique values in COLUMNA (these become the index) and gives one column each for len, min, and max of COLUMNB within each group.
data_frame.groupby(['COLUMNA']).COLUMNB.min()
// using this instead makes a series
// options: max(), mean(), min()
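// A sketch contrasting the two return types, on hypothetical data. The string names "min" and "max" are used here in place of the builtins; that substitution is an assumption made for this example, and the result columns are named the same way.
```python
import pandas as pd

# Hypothetical sample data
data_frame = pd.DataFrame({
    "COLUMNA": ["x", "x", "y"],
    "COLUMNB": [5, 7, 9],
})

grouped = data_frame.groupby(["COLUMNA"]).COLUMNB

summary = grouped.agg([len, "min", "max"])  # data frame: one column per function
lowest = grouped.min()                      # series: one value per group

print(summary)
print(lowest)
```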
Multi-Indexes:
data_frame.groupby(['COL_A','COL_B']).COL_C.agg([len])
// grouping by a list of columns gives a multi-index with one level per column
// Reset Index:
data_frame.reset_index()
// returns a new data frame with the regular 0..n-1 index; the old index levels become ordinary columns
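// A short sketch of a multi-index result and of reset_index(), assuming hypothetical data with the placeholder column names used above.
```python
import pandas as pd

# Hypothetical sample data
data_frame = pd.DataFrame({
    "COL_A": ["x", "x", "x", "y"],
    "COL_B": ["p", "p", "q", "p"],
    "COL_C": [1, 2, 3, 4],
})

# Index has two levels: (COL_A, COL_B)
counts = data_frame.groupby(["COL_A", "COL_B"]).COL_C.agg([len])

# Move both index levels back into columns and restore a 0..n-1 index
flat = counts.reset_index()

print(counts)
print(flat)
```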
Sorting:
data_frame.sort_values(by = ['COLUMNA','COLUMNB'], ascending = False)
// by can list one or more columns; ascending can be False or True (True by default)
data_frame.sort_index()
// sort by index (e.g. if you have a multi-index that is not the regular 0..n)
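// A sketch of both sorting calls on hypothetical data with a shuffled index. The per-column list for ascending is an extra shown for illustration; it needs one entry per column in by.
```python
import pandas as pd

# Hypothetical sample data with a non-sequential index
data_frame = pd.DataFrame(
    {"COLUMNA": [2, 1, 2], "COLUMNB": [5, 6, 7]},
    index=[10, 30, 20],
)

# Both columns descending
by_values = data_frame.sort_values(by=["COLUMNA", "COLUMNB"], ascending=False)

# COLUMNA ascending, ties broken by COLUMNB descending
mixed = data_frame.sort_values(by=["COLUMNA", "COLUMNB"], ascending=[True, False])

# Sort rows by index label instead of by column values
by_index = data_frame.sort_index()

print(by_values)
print(mixed)
print(by_index)
```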
Series Methods:
min()
max()
mean()
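size // number of elements (an attribute, so no parentheses); count() gives the number of non-null values
sort_values(ascending = False) // a series has no by argument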
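// A quick sketch of these on a hypothetical series with one missing value. On a plain series, size is an attribute and sort_values() takes no by argument.
```python
import pandas as pd

# Hypothetical series; None becomes a missing value (NaN)
series = pd.Series([3.0, 1.0, None, 2.0])

lowest = series.min()      # missing values are skipped
highest = series.max()
average = series.mean()
n_items = series.size      # attribute: counts every element
n_valid = series.count()   # method: counts non-null elements only

# Largest first; missing values are placed last by default
ordered = series.sort_values(ascending=False)

print(lowest, highest, average, n_items, n_valid)
print(ordered)
```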