
Data Analysis

Last Updated: Jan 10, 2022

EDA

Exploratory Data Analysis (EDA) is an approach/philosophy for data analysis that employs a variety of techniques (mostly graphical) to

  1. maximize insight into a data set;
  2. uncover underlying structure;
  3. extract important variables;
  4. detect outliers and anomalies;
  5. test underlying assumptions;
  6. develop parsimonious models; and
  7. determine optimal factor settings.

 

  • EDA is not identical to statistical graphics, although the two terms are used almost interchangeably. Statistical graphics is a collection of techniques, all graphically based and each focusing on one aspect of data characterization.
  • EDA encompasses a larger venue: it is an approach to data analysis that postpones the usual assumptions about what kind of model the data follow, in favor of the more direct approach of letting the data itself reveal its underlying structure and model.
  • EDA is not a mere collection of techniques; it is a philosophy of how we dissect a data set: what we look for, how we look, and how we interpret. EDA does make heavy use of the collection of techniques that we call "statistical graphics", but it is not identical to statistical graphics.
  • Most EDA techniques are graphical in nature, with a few quantitative techniques. The heavy reliance on graphics follows from the main role of EDA, which is to explore open-mindedly: graphics give analysts unparalleled power to do so, enticing the data to reveal its structural secrets and regularly yielding new, often unsuspected, insight into the data.
  • Many data scientists will agree that it is very easy to get lost in data: the more you collect, study, and analyze, the more you want to explore. Rabbit holes of data are familiar and friendly places for data analysts and data scientists, who can spend hours extracting, modeling, and analyzing large datasets.
  • EDA techniques are either graphical or quantitative (non-graphical). Graphical methods summarize the data in a diagrammatic or visual way, whereas quantitative methods involve calculating summary statistics. Both types of methods are further divided into univariate and multivariate methods.
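As a rough illustration of the quantitative (non-graphical) side, the sketch below computes univariate summary statistics for one variable and a Pearson correlation between two. The data and the helper names `summarize` and `pearson` are made up for this example; only Python's standard `statistics` module is assumed.

```python
import statistics

# Hypothetical sample: heights (cm) and weights (kg) of six people.
heights = [150.0, 160.0, 165.0, 170.0, 175.0, 180.0]
weights = [50.0, 56.0, 61.0, 66.0, 72.0, 79.0]


def summarize(values):
    """Univariate, non-graphical EDA: basic summary statistics."""
    q1, _median, q3 = statistics.quantiles(values, n=4)
    return {
        "count": len(values),
        "mean": statistics.fmean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),  # sample standard deviation
        "min": min(values),
        "q1": q1,
        "q3": q3,
        "max": max(values),
    }


def pearson(xs, ys):
    """Bivariate, non-graphical EDA: Pearson correlation coefficient."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


height_summary = summarize(heights)
height_weight_corr = pearson(heights, weights)
```

A correlation close to 1 here would suggest that a scatterplot of the two variables is worth drawing next, which is where the graphical methods take over.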

EDA Steps:-


  1. Data Sourcing
  2. Data Cleaning
  3. Univariate analysis
  4. Bivariate analysis
  5. Multivariate analysis
  6. Handling missing values
  7. Removing duplicates
  8. Outlier Treatment
  9. Normalizing and scaling (numerical variables)
  10. Encoding categorical variables (dummy variables)
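The last two steps can be sketched in a few lines. This is a minimal illustration under assumptions, not a prescribed implementation: the function names `min_max_scale`, `standardize`, and `one_hot` are hypothetical, and real projects usually rely on a data-frame or ML library for the same operations.

```python
import statistics


def min_max_scale(values):
    """Rescale numerical values to the range [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


def standardize(values):
    """Rescale numerical values to zero mean and unit (population) variance."""
    mean = statistics.fmean(values)
    sd = statistics.pstdev(values)
    if sd == 0:  # constant column: avoid division by zero
        return [0.0 for _ in values]
    return [(v - mean) / sd for v in values]


def one_hot(labels):
    """Encode a categorical variable as one 0/1 dummy variable per category."""
    categories = sorted(set(labels))
    return [{f"is_{c}": int(label == c) for c in categories} for label in labels]


scaled = min_max_scale([10, 20, 30])
encoded = one_hot(["red", "blue", "red"])
```

When the dummy variables feed a linear model, one category is commonly dropped so that the remaining columns are not perfectly collinear.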

Types of Graphical Analysis:-


  1. Numerical vs. Numerical
    1. Scatterplot
    2. Line plot
    3. Heatmap for correlation
    4. Joint plot

  2. Categorical vs. Numerical
    1. Bar chart
    2. Categorical box plot

Handling Missing Values:-

  1. Deleting rows with missing values
  2. Imputing missing data based on mean/median/mode
  3. Estimating missing data with a machine-learning model, e.g. k-nearest neighbours (KNN)
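The three options above can be sketched as follows. The records, column names, and the helpers `drop_missing`, `impute`, and `knn_impute` are all hypothetical; the KNN variant is deliberately simplified to a single numerical feature and averages the neighbours' values (for a categorical target, a majority vote among neighbours would take the place of the average).

```python
import statistics

# Hypothetical records; None marks a missing value.
rows = [
    {"age": 25.0, "income": 30.0},
    {"age": 27.0, "income": None},
    {"age": 45.0, "income": 80.0},
    {"age": 47.0, "income": 90.0},
]


def drop_missing(rows):
    """Option 1: delete rows that contain any missing value."""
    return [r for r in rows if all(v is not None for v in r.values())]


def impute(rows, column, strategy="mean"):
    """Option 2: fill missing values with the column mean, median, or mode."""
    observed = [r[column] for r in rows if r[column] is not None]
    funcs = {
        "mean": statistics.fmean,
        "median": statistics.median,
        "mode": statistics.mode,
    }
    fill = funcs[strategy](observed)
    return [{**r, column: fill if r[column] is None else r[column]} for r in rows]


def knn_impute(rows, target, feature, k=2):
    """Option 3: fill a missing target with the average target value of the
    k complete rows whose feature value is closest."""
    complete = [r for r in rows if r[target] is not None]
    result = []
    for r in rows:
        if r[target] is None:
            nearest = sorted(complete, key=lambda c: abs(c[feature] - r[feature]))[:k]
            r = {**r, target: statistics.fmean([n[target] for n in nearest])}
        result.append(r)
    return result


dropped = drop_missing(rows)
mean_filled = impute(rows, "income", strategy="mean")
knn_filled = knn_impute(rows, target="income", feature="age", k=2)
```

Each helper returns a new list and leaves `rows` untouched, so the three strategies can be compared side by side on the same data.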

Outlier Detection:-

  1. Based on standard deviations away from the mean (continuous variables)
  2. Based on the interquartile range (numerical variables, including skewed ones; it cannot be computed for nominal categorical data)
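Both rules can be sketched with the standard library. The thresholds used below (3 standard deviations, 1.5 times the interquartile range) are common conventions rather than values given in this article, and the function names and sample data are hypothetical.

```python
import statistics


def zscore_outliers(values, threshold=3.0):
    """Flag values more than `threshold` sample standard deviations from the mean."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    if sd == 0:  # all values identical: nothing can be an outlier
        return []
    return [v for v in values if abs(v - mean) / sd > threshold]


def iqr_outliers(values, k=1.5):
    """Flag values below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one obviously extreme value.
data = [10, 12, 11, 13, 12, 11, 10, 12, 13, 11, 100]

by_zscore = zscore_outliers(data, threshold=2.5)
by_iqr = iqr_outliers(data)
```

The standard-deviation rule is itself sensitive to the outliers it looks for, because an extreme value inflates both the mean and the standard deviation; the quartile-based rule is more robust in small or skewed samples.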





 
