Data Science and Application 2020 PDF

Title Data Science and Application 2020
Author Tariq Islam
Course Introduction to Data Science and Analytics
Institution University of Toronto
Pages 21
File Size 376.9 KB
File Type PDF


Description

DATA SCIENCE AND APPLICATION - ADVANCED DIPLOMA

Program Description

This program is designed to provide students with knowledge of data analytics and statistics and with skills in data analysis using the SQL, VBA, SAS, R and Python programming languages. The program also covers data mining, advanced statistical modeling, machine learning and big data analytics. Students will also learn Tableau to visualize and present insights from data analytics. The data science projects will train students to apply statistical theory and data analysis programming skills to perform actual data analysis. Upon successful completion of this program, students will be awarded an advanced diploma and will be competent in many data analysis positions such as data scientist, data analyst, data engineer …

Admission Requirements

1) Post-secondary diploma or degree in business administration/management, accounting, marketing, finance, financial service, economics, commerce, mathematics, statistics, actuarial science, public health, medicine, pharmaceutical technology, computer science, technology, engineering, natural science or equivalent
2) Computer skills

Course Outline

Introduction to Data Science

Data Analytics Thinking The ubiquity of Data Opportunities What is driving the data deluge? Examples of Data Opportunities State of the practice in Analytics Business Intelligence vs Data Science Current Analytical Architecture Three Key Roles of the New Data Ecosystem A New Approach to Analytics

789 Don Mills Road, SUITE 500 TORONTO, ON M3C 1T5

TEL.: 416-585-9880

FAX: 416-585-2117



The Analytics Framework Linking Business with Analytics The Analytics Framework Data Analytics Lifecycle overview



Business Problems and Analytics solution Data Driven Decision (DDD) Fundamental concepts of data mining From business problem to data mining tasks Answering Business Question with the analytics techniques Decision Analytic Thinking Data Science and strategy Use Cases



Basic and Advanced Data Analytics Methods Attributes and Data Types Descriptive Statistics Exploratory Data Analysis (EDA) Statistical Methods for evaluation Advanced Analytics: Regression and Classification Advanced Analytics: Association Rules, Clustering Advanced Analytics: Text Analytics, Natural Language Processing Use Cases



Analytics: Tools and Technology SAS/SQL/R/Python Essentials Analytics for Unstructured Data The Hadoop Ecosystem



Putting It All Together: The Endgame Data Visualization Basics Communicating and Operationalizing an Analytics Project Creating the Final Deliverables


Fundamentals of Data Analytics and Statistics 

Introduction to Data Science Data science process and components: learn the history of statistics, the science of statistics, applications of statistics and data analytics, and their connections with data science Practical cases of applying data science to various industries



Data Collection and Preparation Data collection, processing, sampling, QC, production, reporting and visualization Various use cases of data processing



Descriptive Statistics Concepts of descriptive statistics, and calculation method Examples of computing descriptive statistics using different software packages
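As a rough illustration of the calculation methods this module refers to, the sketch below computes a few common descriptive statistics with Python's standard `statistics` module. The function name `describe` and the sample values are hypothetical and are not taken from the course material.

```python
import statistics


def describe(values):
    """Return common descriptive statistics for a list of numbers."""
    ordered = sorted(values)
    return {
        "count": len(ordered),
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "min": ordered[0],
        "max": ordered[-1],
        "range": ordered[-1] - ordered[0],
        # Sample standard deviation (divides by n - 1).
        "stdev": statistics.stdev(ordered),
    }


# Hypothetical sample data, for illustration only.
scores = [2, 4, 4, 4, 5, 5, 7, 9]
summary = describe(scores)
print(summary)
```

Other packages named in the outline (SAS, R, Excel) may default to either the sample or the population standard deviation, so it is worth checking which one a given tool reports.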



Introducing Distributions and Statistical Tests Concepts of distribution, frequency, quintiles, outliers Different kinds of distribution, shape and attributes Different statistical tests, p values and interpretation Basic statistical test using statistical software



Introducing Correlation Introduce concepts of correlation, different types of correlation measures Introduce binning data, bucket analysis and profiling to recognize patterns

Use intuitive methods to present data association Weight of evidence, IV and calculation Conduct correlation study using software 
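The weight of evidence (WoE) and information value (IV) calculation mentioned above can be sketched as follows. This is a minimal example that assumes the common convention WoE = ln(share of goods / share of bads) per bin and IV = sum over bins of (share of goods - share of bads) x WoE; some references flip the ratio. The function name `woe_iv`, the bin labels and the counts are hypothetical.

```python
import math


def woe_iv(bins):
    """Compute weight of evidence per bin and the total information value.

    `bins` maps a bin label to a (good_count, bad_count) pair.
    Assumes every bin has at least one good and one bad record;
    real data usually needs smoothing for empty cells.
    """
    total_good = sum(good for good, _ in bins.values())
    total_bad = sum(bad for _, bad in bins.values())
    woe = {}
    iv = 0.0
    for label, (good, bad) in bins.items():
        pct_good = good / total_good  # share of all goods in this bin
        pct_bad = bad / total_bad     # share of all bads in this bin
        woe[label] = math.log(pct_good / pct_bad)
        iv += (pct_good - pct_bad) * woe[label]
    return woe, iv


# Hypothetical binned counts: label -> (goods, bads).
age_bins = {"18-29": (100, 50), "30-49": (300, 30), "50+": (100, 20)}
woe, iv = woe_iv(age_bins)
print(woe, iv)
```

A bin whose share of goods equals its share of bads gets a WoE of zero and adds nothing to the IV, which is why the measure is used to rank how well a binned variable separates the two groups.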

Understanding the Concept and Reason of Feature Engineering and Selection Concepts of feature Various methods of feature engineering and selection Test impact and efficiency of feature engineering and selection Use cases of feature engineering and selection




Introducing Analytics Tools Overview of popular Analytics Tools Comparison of Analytic Tools Test different analytic tools



Introducing SQL for Data Analysis Overview of databases and SQL Data query and manipulation using SQL Test SQL using R, Python and SAS



Introducing Predictive Models, Data Mining and Machine Learning Basic concepts and knowledge of predictive models, data mining and machine learning, including regression, classification, association rules Introducing application of predictive models, data mining and machine learning to various use cases in different industry fields



Introducing Big Data and Hadoop Learn Big Data concepts, the distributed architecture of Hadoop and related products Introduce MapReduce and big data analytics packages in different software.

SQL Programming 

Introduction to Query, Expressions, Conditions and Operators General rules of syntax Building blocks of data retrieval Changing order of the column Indenting code, and operators



Functions: Modeling the Data you Retrieve Aggregate, date, time, and arithmetic functions Character and conversion functions



Clauses in SQL Where, starting with, grouping by, having clauses



Joining Tables Referring to multiple tables in a single statement


Inner joins, outer joins, joining a table to itself 
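The three join types listed above can be compared with a small in-memory example. The sketch below uses Python's built-in `sqlite3` module as a convenient SQL engine; the course itself may use a different database. The `employee` and `department` tables and their rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT,
                           dept_id INTEGER, manager_id INTEGER);
    INSERT INTO department VALUES (1, 'Analytics'), (2, 'Finance');
    INSERT INTO employee VALUES
        (1, 'Ana', 1, NULL),
        (2, 'Ben', 1, 1),
        (3, 'Cy', NULL, 1);
""")

# Inner join: only employees that have a matching department.
inner_rows = conn.execute("""
    SELECT e.name, d.name FROM employee e
    INNER JOIN department d ON e.dept_id = d.id
    ORDER BY e.id
""").fetchall()

# Left outer join: every employee, with NULL where no department matches.
left_rows = conn.execute("""
    SELECT e.name, d.name FROM employee e
    LEFT OUTER JOIN department d ON e.dept_id = d.id
    ORDER BY e.id
""").fetchall()

# Self join: the employee table joined to itself to find each manager.
self_rows = conn.execute("""
    SELECT e.name, m.name FROM employee e
    INNER JOIN employee m ON e.manager_id = m.id
    ORDER BY e.id
""").fetchall()

print(inner_rows)
print(left_rows)
print(self_rows)
```

The employee without a department appears only in the left outer join result, and the employee without a manager drops out of the self join because it is written as an inner join.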

The Embedded SELECT Statement Using aggregate functions with subqueries, nested subqueries, and correlated subqueries Using EXISTS, ANY, and ALL



Manipulating Data Insert, update, delete statements Working with insert and select statements



Creating Views and Indexes Creating views and modifying data in views Views, security, and index types



Controlling Transactions Beginning and finishing transactions Using transaction savepoints Table or view does not exist, invalid username/password, invalid column name, and missing comma
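To make the idea of a transaction savepoint concrete, here is a minimal sketch using Python's built-in `sqlite3` module. It assumes SQLite's `SAVEPOINT` and `ROLLBACK TO SAVEPOINT` syntax, which other databases covered in the course may spell differently. The `account` table and its rows are hypothetical.

```python
import sqlite3

# isolation_level=None puts the connection in autocommit mode, so the
# explicit BEGIN / SAVEPOINT / COMMIT statements below are in control.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (name TEXT PRIMARY KEY, balance INTEGER)")

conn.execute("BEGIN")                                      # start the transaction
conn.execute("INSERT INTO account VALUES ('alice', 100)")
conn.execute("SAVEPOINT before_bob")                       # mark a point to return to
conn.execute("INSERT INTO account VALUES ('bob', 50)")
conn.execute("ROLLBACK TO SAVEPOINT before_bob")           # undo only bob's insert
conn.execute("COMMIT")                                     # finish the transaction

rows = conn.execute("SELECT name, balance FROM account").fetchall()
print(rows)
```

Rolling back to the savepoint discards only the work done after it, so the first insert still commits.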



Using SQL to Generate SQL Statements Setting echo, feedback, and heading on/off ODBC, dynamic uses of SQL and setting up Visual Basic and SQL



NoSQL Introduction to NoSQL PostgreSQL MemSQL Popular Window Functions

Microsoft Excel and VBA Programming 

Introduction to Excel Charts Why we need to use Excel Charts? How to use Excel Charts? How to select the right Excel Charts to display your data



PivotTable and Pivot Chart


Create a Pivot Table Use the Field List to arrange fields in a PivotTable Change the source data for a Pivot Table Update (refresh) data in a Pivot Table Insert Pivot Chart Filter Pivot Chart Change Pivot Chart Type 

Introduction to Excel VBA Programming What is VBA? What can VBA do? What is a Macro? Recording, Testing, Examining, Modifying Macros Working with the Visual Basic Editor Working with the Project Window, Code Window Customizing the VBA Environment



VBA Object and Procedures Introducing the Excel Object Model Object Properties, Methods, Events VBA Sub and Function Procedure Executing Sub procedure Executing Function Procedure



Programming Concepts Essential VBA Language Elements Using Variables, Constants and Data Types Using Assignment Statements Working with Arrays Using Label Using Comments in VBA code



Working with Range Objects The Ways to Refer to a Range Range Object Properties Range Object Methods




Using VBA and Worksheet Functions Built-In VBA Functions Worksheet Functions in VBA Custom Functions



Controlling Program Flow and Making Decisions GoTo Statement The If-Then structure The Select Case structure For-Next loop Do-While loop Do-Until loop



Dialog Boxes The MsgBox Function The InputBox Function The Built-in Dialog Boxes User Form Basics Inserting a new Userform Viewing the UserForm Code Window Displaying a UserForm A UserForm Example



Using UserForm Controls Adding controls Control properties Dialog Box Controls Working with Dialog Box Controls



Accessing Macros through the User Interface Ribbon Customization Customizing Shortcut Menus Error-Handling Techniques

Fundamentals of SAS Programming 

Introduction to SAS


SAS basic components and concepts SAS windows Library Save code and dataset



SAS Steps, SAS Dataset and Procedures Data step and Proc step Dataset, variables and observations SAS syntax rules Input data to dataset



SAS Procedures PROC PRINT and PROC CONTENTS PROC SORT and OPTIONS PROC FORMAT PROC MEANS/SUMMARY/UNIVARIATE PROC FREQ



SAS Operators and Functions SAS comparison and logical operators Where statement and IF statement Numeric, Date and Character function



SAS Statements Create variables by recoding or grouping Do-loop statement First and last processing in SAS Array statement



Combine Datasets Concatenate datasets Interleave datasets Merge datasets


Advanced SAS Programming 

Input and Output Infile options Reading data techniques Restricting observations options Controlling data output Random sampling ODS features



SAS Data Technique Data validation Data cleaning Data manipulation Data Transformation



SAS SQL Procedures The SQL procedure and its clauses Data combination using SQL Join tables: inner, left, right and full join



SAS Macro Language SAS Macro variables SAS Macro functions SAS Macro programs



SAS Statistics Procedures Introduction to statistics Estimation for population mean Hypothesis test of mean comparison Correlation and linear regression


Logistic regression

R for Data Analytics 

R Background, Console and Environment; Introducing Learning Roadmap Understand, create and use R variables R objects that can hold multiple values ATOMIC DATA TYPES in R R packages and usages Generate a ‘VECTOR’ with certain patterns, extract sub-vectors based on index and conditions, special values in R. Concatenate several vectors Change data types, some functions that return vector attributes



More Data Objects in R Understand and use arrays in R Data analysis using arrays More about ‘MATRIX’ and its computations in R Assign (and refer to) column or row names of matrix objects List object in R, create list object, Refer and work on the elements of a list, and manipulate lists List factor in R, create factor object Increase or drop a level from a factor, Refer and work on the elements of a list



Data Frame, Attributes and Methods Create a data frame by directly inputting raw data Create a data frame by converting other data objects Create and remove variables (columns) of a data frame Obtain the column or row names of a matrix-like object Change column names in data frames Return the number of rows and columns of a data frame


Extract a subset (as another data object such as a data frame, matrix or vector) from a data frame Functionalities of ‘attach()’ and ‘detach()’ for data frames



R Language Syntax and Structure Understand ‘if else’ and ‘loop’ structure Write user defined functions Debug and handle errors



Data Analysis Using Data Frame, Attributes and Methods Read external data (such as CSV and TXT files) into data objects Save data objects into external files Missing values in a data object and how to treat them Sort a data frame by some variables (columns) Change the values of elements in vectors, arrays and data frames Add columns or rows to data frames Merge two data frames by common columns or row names Math functions to summarize data Statistical distribution and character and date/time functions in R Data (frames) analysis for summary or counting by examples Data (frames) analysis for reshaping, sampling, standardization and deduplication by examples



Data Analysis Using SQL in R Use R package for SQL statements on R data frames Application of SQL by examples Use sqldf() function to execute SQL statement and return a resulting data frame by use cases



Advanced R Programming Generate your own package R-specific and advanced objects and features Apply Object-oriented Programming in R Use S4 OOP. Access values in the slots within an object Create Method. Understand Inheritance of Class




Introducing Graphics in R R Graphics – packages and functions Use different Graphics packages and procedures to realize various data visualization by use cases

Python for Data Analytics 

Python Background, Installation Setup and Overview, IPython Notebook Overview, Introducing Python Packages Understand the Python environment, versions, Anaconda, and other resources on the web, installation and usage for data analysis Use IDLE (Integrated Development and Learning Environment) and the interactive Shell Window Provide code examples Test Python code and packages by use cases



Python Data Types, Conversion And Operations Use cases of data types, conversion methods Scalar concepts Numeric, Boolean and date/time data types and arithmetic operators. Understand String definition, slicing and concatenating methods by use cases



Data Collection Objects, Conversion and Operations Understand collection objects concepts by examples. Compare with scalar objects Python data objects by examples various methods of creating data objects list, tuples, dictionaries and sets definitions access, query and manipulate objects based on certain conditions



Control Flow, User Defined Functions and Usage of Python Packages If/else and loop statements and structures, indentation for block structure by various use cases and examples


Python user defined functions by examples Test programs and functionalities Understand and use Python packages by different classes



Understanding and Using File I/O Different methods for File Input by examples Different Procedures for File Output by examples Various I/O functions for data analysis and examples



NumPy (Numerical Python) and SciPy Packages Download and install the NumPy and SciPy packages Single and N-dimensional NumPy Arrays, data types, Create or convert NumPy Arrays using different methods Test several NumPy Array examples



Various Operations on NumPy Arrays Mathematical operations on NumPy Arrays Understand and return properties of NumPy Arrays Show slicing NumPy Arrays by examples Understand splitting NumPy Arrays Understand and apply fancy indexing on NumPy Arrays by examples Understand reshaping NumPy Array methods by examples Merge NumPy Arrays Use NumPy Data Process Functions Sampling and Data Generation Load and Write NumPy Arrays



Data Analysis Using Pandas Understand Series and Data Frame objects Create Series and Data Frame by inputting raw data Read (Write) Data Frames from (into) .TXT and CSV Files, etc. Arithmetic operations on Series or Data Frames Select (Copy) Sub-Data Frames (rows and columns)


Manipulate (Add, Delete, Update) columns of Data Frames Manipulate (Add, Delete, Update) rows of Data Frames under different conditions Sort and rank Data Frames and Series Slice Series and Data Frames using various methods Combine Data Frames in different ways Reshape Data Frames under different conditions Re-index and Hierarchical Indexing of a Data Frame Handle missing and duplicated values in Data Frames Draw different samples from Data Frames Summarize and analyze Data Frames (various properties and methods) Load (and save) Data Frames from (into) other Data Sources such as websites



Matplotlib Package and Plotting Methods Various methods in Matplotlib package to draw scatter, bar, pie, 3-D and time series plots Show cases of applying Matplotlib package and methods

Data Mining - Advanced Statistical Modeling

Introduction to Predictive Analytics Understand predictive models, supervised and unsupervised learning, data mining and machine learning concepts and use cases The process for conducting predictive analytics step by step The different types of predictive models and business background



Introducing Data Sampling, Target, Bivariate Analysis, Interaction and Transformation Understand and conduct data sampling Understand target rate Perform random and weighted sampling and testing power Conduct bivariate analysis, interaction and transformation using computer software Feature engineering and selection methods



Introducing Data Exploration, Performance Checking and Univariate Analysis


Understand data exploration, performance checking and univariate analysis Conduct analysis using software. Read and organize outputs Various performance measures Obtain these measures using software Understand statistics using tables and visualization Data cleaning, missing values, imputation and variable generation



Introducing Linear Regression and Application Understand theory, procedure and related statistics Use computer software to perform linear regression Understand output and interpretation Diagnose model by statistics and plots Detect outliers Apply linear regression model to use cases Feature selection and cross validation
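As a small illustration of the procedure and related statistics this module covers, the sketch below fits a one-variable linear regression by ordinary least squares and reports R-squared. It uses plain Python rather than the statistical software used in the course. The function names `fit_line` and `r_squared` and the sample data are hypothetical.

```python
def fit_line(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r_squared(xs, ys, slope, intercept):
    """Share of the variance in ys explained by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return 1 - ss_res / ss_tot


# Hypothetical data that lies close to a straight line.
xs = [1, 2, 3, 4, 5]
ys = [2.1, 3.9, 6.2, 7.8, 10.1]
slope, intercept = fit_line(xs, ys)
r2 = r_squared(xs, ys, slope, intercept)
print(slope, intercept, r2)
```

The residuals `y - (intercept + slope * x)` computed inside `r_squared` are the same quantities a diagnostic plot would show when checking the model or looking for outliers.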



Introducing Logistic Regression and Application Understand theory, procedure and model usage Difference between logistic and linear regression Use software to perform logistic regression Understand output and interpretation Apply logistic regression model to use cases Feature selection and cross validation by decil...

