28. Data Analytics - Data Analysis with R Programming - Week 1

- Introduction to programming.

Definitions:

Computer programming // giving instructions to a computer to perform an action or set of actions

Programming languages // words and symbols we use to write instructions for computers to follow

Syntax // the rules for how to arrange words and symbols in a programming language

Coding // writing instructions for a computer in the syntax of a specific programming language

R // programming language frequently used for statistical analysis, visualization, and other data analysis. Based on the S language.

Open source // code that is freely available and may be modified and shared by the people who use it

Integrated Development Environment (IDE) // software application that brings together all the tools you may want to use in a single place


R Features:

        - R packages // add-on packages (often loosely called libraries) that extend what R can do
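
A minimal sketch of how packages are typically loaded, assuming a standard R installation. The package name "ggplot2" is only an illustrative choice and does not come from these notes, so those lines are left commented out.

```r
# Packages that ship with R (such as "stats") only need to be loaded.
library(stats)

# Add-on packages from CRAN are installed once, then loaded in each session.
# "ggplot2" is an illustrative package name (an assumption, not from the notes).
# install.packages("ggplot2")
# library(ggplot2)

# Check whether a package is available. This loads its namespace
# but does not attach it to the search path.
has_stats <- requireNamespace("stats", quietly = TRUE)
print(has_stats)
```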


R uses for data analysis:

        - Reproducing your analysis // R can reproduce every step of your analysis

        - Processing lots of data // like SQL, R can handle large datasets

        - Creating data visualizations


        Also:

                - take a specific analysis step and perform it across many different groups of data

                - create flexible visualizations

                - automatically generate an output of summary statistics
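
The uses above can be sketched in a few lines of base R. This is only an illustration: it assumes the built-in mtcars sample dataset, and the variable names (mpg_summary, mpg_by_cyl) are made up for the example. Because the steps live in a script, rerunning the script reproduces the same analysis.

```r
# mtcars is a sample dataset that ships with R.
data(mtcars)

# Summary statistics (min, quartiles, median, mean, max) in a single call.
mpg_summary <- summary(mtcars$mpg)
print(mpg_summary)

# Perform the same analysis step (mean miles per gallon) across groups,
# here one group per number of cylinders.
mpg_by_cyl <- aggregate(mpg ~ cyl, data = mtcars, FUN = mean)
print(mpg_by_cyl)
```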

Programming languages:

        - R

        - Python

        - JavaScript

        - SAS, Scala, Julia


Why work with R:

        - Accessible // easy to use for beginners

        - Data-centric // specifically designed to make data analysis easier

        - Open source // freely available and ready to be modified

        - Community // has an active community


Benefits of using programming with data:

        - Clarify the steps of your analysis

        - Save time

        - Reproduce and share your work


Key questions: Spreadsheets vs. SQL vs. R

        What is it?

                - Spreadsheets // a program that uses rows and columns to organize data and allows for analysis and manipulation through formulas, functions, and built-in features

                - SQL // a database programming language used to communicate with databases to conduct an analysis of data

                - R // a general-purpose programming language used for statistical analysis, visualization, and other data analysis

        What is a primary advantage?

                - Spreadsheets // includes a variety of visualization tools and features

                - SQL // allows users to manipulate and reorganize data as needed to aid analysis

                - R // provides an accessible language to organize, modify, and clean data frames, and create insightful data visualizations

        Which datasets does it work best with?

                - Spreadsheets // smaller datasets

                - SQL // larger datasets

                - R // larger datasets

        What is the source of the data?

                - Spreadsheets // entered manually or imported from an external source

                - SQL // accessed from an external database

                - R // loaded with R when installed, imported from your computer, or loaded from external sources

        Where is the data from my analysis usually stored?

                - Spreadsheets // in a spreadsheet file on your computer

                - SQL // inside tables in the accessed database

                - R // in an R file on your computer

        Do I use formulas and functions?

                - Spreadsheets // yes

                - SQL // yes

                - R // yes

        Can I create visualizations?

                - Spreadsheets // yes

                - SQL // yes, by using an additional tool like a database management system (DBMS) or a business intelligence (BI) tool

                - R // yes
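
The data sources listed for R in the comparison above can be sketched as follows. This is a hedged illustration in base R: the CSV file, its column names, and its values are made up for the example, and the URL in the last step is hypothetical, so that line is commented out.

```r
# 1. Data loaded with R when installed: built-in sample datasets.
data(iris)
print(head(iris, 3))

# 2. Data imported from your computer: write a small CSV to a temporary
#    file, then read it back. The contents are invented for illustration.
csv_path <- tempfile(fileext = ".csv")
write.csv(data.frame(id = 1:3, score = c(90, 85, 77)),
          csv_path, row.names = FALSE)
scores <- read.csv(csv_path)
print(scores)

# 3. Data loaded from external sources: read.csv() also accepts a URL.
# remote <- read.csv("https://example.com/data.csv")  # hypothetical URL
```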


Additional Resources:

https://medium.com/analytics-and-data/r-vs-python-a-comprehensive-guide-for-data-professionals-321e8dead598

https://blog.rstudio.com/2019/12/17/r-vs-python-what-s-the-best-for-language-for-data-science/

https://www.rstudio.com/solutions/r-and-python/

https://www.r-project.org/

https://cran.r-project.org/manuals.html

https://ourcodingclub.github.io/tutorials.html

https://cran.r-project.org/doc/contrib/Paradis-rdebuts_en.pdf

https://docs.python.org/3/tutorial/

https://lgatto.github.io/2017_11_09_Rcourse_Jena/before-we-start.html

https://www.theanalysisfactor.com/the-advantages-of-rstudio/

https://community.rstudio.com/
