31. Data Analytics - Data Analysis with R Programming - Week 4

Visualizing data in R using ggplot2, part of the tidyverse.

Base R and other packages such as Plotly and RGL can also create visuals.


Definitions:

Aesthetic (R) // a visual property of an object in your plot. Size, shape, color.

Geom (R) // the geometric object used to represent your data

Facets (R) // let you display smaller groups, or subsets, of your data. Can create a separate graph for each group in the dataset

Labels and annotations (R) // lets you customize your plot. Titles, annotations, subtitles, etc

Mapping (R) // matching up a specific variable in your dataset with a specific aesthetic

Annotate // to add notes to a document or diagram to explain or comment upon it


Popular visual packages:

        - ggplot2                - Plotly

        - Lattice                - RGL

        - Dygraphs            - Leaflet

        - Highcharter        - Patchwork

        - gganimate            - ggridges


Benefits of ggplot2:

        - Create different types of plots

        - Customize the look and feel of plots

        - Create high quality visuals

        - Combine data manipulation and visualization


ggplot2 core concepts:

        - aesthetics 

        - geoms

        - facets

        - labels and annotations


ggplot2:

        - install.packages("ggplot2")

        - library(ggplot2)


        Steps:

            1. Start with ggplot function and choose a dataset to work with

            2. Add a geom_ function to display your data

            3. Map the variables you want to plot in the arguments of the aes() function


        Creating a plot:

                - ggplot(data = dataname) + geom_point(mapping = aes(x = datacolumn, y = datacolumn))

                        // creates a scatter plot to show the relationship between 2 variables

                        // the "+" adds a layer to the ggplot. Add more layers by putting "+" at the end of a line of code, never at the start of the next line

                - mapping = aes(property, property, etc)

                        // color = datacolumn // vary color by the values in the column

                        // x = datacolumn // variable on the x-axis

                        // y = datacolumn // variable on the y-axis

                        // shape = datacolumn // vary point shape by value

                        // size = datacolumn // vary point size by value

                        // alpha = datacolumn // vary transparency by value

                - geom_point(property, property, etc)

                        // mapping = aes() // set the mapping for the points

                        // color = "color" // placed outside aes(), sets one color for all points
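
A minimal sketch of the steps above. It assumes the mpg dataset that ships with ggplot2 (the notes do not name a dataset); displ, hwy, and class are columns of mpg, and p_mapped and p_fixed are just illustrative names.

```r
library(ggplot2)

# color inside aes(): each vehicle class gets its own color
p_mapped <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy, color = class))

# color outside aes(): every point is drawn in the same color
p_fixed <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy), color = "blue")

print(p_mapped)
print(p_fixed)
```

The only difference between the two plots is where color sits: inside aes() it is mapped to a variable, outside aes() it is a fixed setting.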

        Geom functions:

                - geom_point // creates scatterplot

                - geom_bar // creates bar chart

                        mapping = aes() properties:

                                - fill = datacolumn // solid color fill

                                - color = datacolumn // outline color

                - geom_line // creates line chart

                - geom_smooth // adds a smoothed trend line fitted to the data

                        mapping = aes() properties:

                                - linetype = datacolumn

                - geom_jitter // adds a small amount of random noise to each data point so overlapping points are easier to see
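
A short sketch of a few of these geoms side by side, again assuming the mpg dataset from ggplot2 (class, drv, displ, and hwy are its columns; the variable names are illustrative).

```r
library(ggplot2)

# Bar chart: geom_bar counts the rows in each class; fill colors the bars by drive type
p_bar <- ggplot(data = mpg) +
  geom_bar(mapping = aes(x = class, fill = drv))

# Smooth line: one fitted line per drive type, told apart by linetype
p_smooth <- ggplot(data = mpg) +
  geom_smooth(mapping = aes(x = displ, y = hwy, linetype = drv))

# Jitter: same data as a scatter plot, with a little random noise on each point
p_jitter <- ggplot(data = mpg) +
  geom_jitter(mapping = aes(x = displ, y = hwy))
```

Typing the name of any of these objects (for example p_bar) in the console displays the plot.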


Types of smoothing:

        - Loess smoothing // best for smoothing plots with fewer than 1,000 points

                - ggplot(data, aes(x = datacolumn, y = datacolumn)) + geom_point() + geom_smooth(method = "loess")

        - Gam smoothing // generalized additive model smoothing; useful for smoothing plots with a large number of points

                - ggplot(data, aes(x = datacolumn, y = datacolumn)) + geom_point() + geom_smooth(method = "gam", formula = y ~ s(x))
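
The two smoothing methods can be tried on ggplot2's built-in datasets. This is only a sketch: mpg (234 rows) and diamonds (about 54,000 rows) are assumed stand-ins for a small and a large dataset, and method = "gam" relies on the mgcv package, which is normally installed alongside R.

```r
library(ggplot2)

# Small dataset: loess smoothing
p_loess <- ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point() +
  geom_smooth(method = "loess")

# Large dataset: gam smoothing with the formula from the notes
p_gam <- ggplot(diamonds, aes(x = carat, y = price)) +
  geom_point() +
  geom_smooth(method = "gam", formula = y ~ s(x))
```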


Facet functions:

        - facet_wrap(~datacolumn) // facet a plot by a single variable

                - add facet_wrap(~datacolumn) after "ggplot() +"

                // creates a separate plot for each value of datacolumn

        - facet_grid(1stcolumn~2ndcolumn) // facet a plot by two variables. Splits the plot into a grid of facets:

                rows (stacked vertically) by values of the first variable and columns (laid out horizontally) by values of the second variable.

                - add facet_grid(1stcolumn~2ndcolumn) after "ggplot() +"
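
A quick sketch of both facet functions, assuming the mpg dataset from ggplot2 (class, drv, and cyl are its columns; the variable names are illustrative).

```r
library(ggplot2)

# One panel per vehicle class
p_wrap <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_wrap(~class)

# Grid of panels: rows by drive type (drv), columns by cylinder count (cyl)
p_grid <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  facet_grid(drv ~ cyl)
```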


Filtering before plotting: (uses dplyr package)

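        - dataname %>% filter(variable1 == "SOMETHING") %>% ggplot() // filter the rows first, then pipe the result into ggplot(). Layers after ggplot() are still added with "+", not "%>%"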
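
As a sketch of filtering before plotting, assuming the mpg dataset from ggplot2 and the value "compact" in its class column (compact_cars and p_compact are illustrative names):

```r
library(ggplot2)
library(dplyr)

# Keep only the compact cars
compact_cars <- mpg %>%
  filter(class == "compact")

# Pipe the filtered data into ggplot(), then add layers with "+"
p_compact <- compact_cars %>%
  ggplot() +
  geom_point(mapping = aes(x = displ, y = hwy))
```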


Labeling and Annotating:

        - titles

                    - (+) to end of ggplot() with labs(title="TITLE")

        - subtitles

                    - (+) to end of ggplot() with labs(subtitle="SUBTITLE")

        - captions

                    - (+) to end of ggplot() with labs(caption="CAPTION")

        - annotate // used to put text at a specific position on the plot

                    - (+) to end of ggplot() with annotate(properties)

                    - annotate("text", x = xposition, y = yposition, label = "LABEL") // the first argument is the geom name and must be lowercase "text"

                            Optional properties:

                            - color = "COLOR"

                            - fontface = "bold"

                            - size = float

                            - angle = int
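
A sketch that puts the labels and an annotation together. The mpg dataset, the label text, and the x/y position are all assumptions chosen for illustration.

```r
library(ggplot2)

p_labeled <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy)) +
  labs(
    title = "Engine size vs. highway mileage",
    subtitle = "Cars in the mpg dataset",
    caption = "Data: ggplot2::mpg"
  ) +
  # x and y are in the units of the plot's axes
  annotate("text", x = 5.5, y = 40, label = "Bigger engines, lower mileage",
           color = "purple", fontface = "bold", size = 4.5, angle = 25)
```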


Saving plots:

        - Export option or ggsave() function

        - Export option is found in the Plots tab of RStudio

        - ggsave() // saves last plot you displayed.

                - ggsave("filename.png")
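
A small sketch of ggsave(). By default it saves the last plot displayed; passing plot = explicitly, as below, saves a specific plot object instead. The file name, the temporary folder, and the width and height (in inches) are assumptions.

```r
library(ggplot2)

p_save <- ggplot(data = mpg) +
  geom_point(mapping = aes(x = displ, y = hwy))

# Write to a temporary folder so the example does not clutter the working directory
out_file <- file.path(tempdir(), "scatter.png")
ggsave(out_file, plot = p_save, width = 6, height = 4)
```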



Additional Resources:

https://ggplot2.tidyverse.org/

http://statseducation.com/Introduction-to-R/modules/graphics/aesthetics/

https://www.rdocumentation.org/packages/ggplot2/versions/3.3.3/topics/aes

https://rladiessydney.org/courses/ryouwithme/03-vizwhiz-1/#1-4-putting-it-all-together-dplyr-ggplot

https://r4ds.had.co.nz/transform.html

https://datacarpentry.org/dc_zurich/R-ecology/05-visualisation-ggplot2.html

https://ggplot2.tidyverse.org/reference/annotate.html

https://www.r-graph-gallery.com/233-add-annotations-on-ggplot2-chart.html

https://ggplot2-book.org/annotations.html

https://www.r-bloggers.com/2017/02/how-to-annotate-a-plot-in-ggplot2/

https://viz-ggplot2.rsquaredacademy.com/textann.html

https://ggplot2.tidyverse.org/reference/ggsave.html#saving-images-without-ggsave-

https://www.datanovia.com/en/blog/how-to-save-a-ggplot/

https://www.datamentor.io/r-programming/saving-plot/
