Statistics II

Course Description:

The course covers the concepts of sampling, hypothesis testing, parametric and non-parametric tests, correlation and regression, experimental design, and stochastic processes.

Course Objectives:

The main objective of the course is to provide theoretical as well as practical knowledge of estimation, hypothesis testing, the application of parametric and non-parametric statistical tests, design of experiments, multiple regression analysis, and the basic concepts of stochastic processes, with special focus on data and problems related to computer science and information technology.

Course Contents:

Unit 1: Sampling Distribution and Estimation (6 Hrs.)

Sampling distribution; Sampling distribution of mean and proportion; Central Limit Theorem; Concept of inferential statistics; Estimation; Methods of estimation; Properties of a good estimator; Determination of sample size; Relationship of sample size with desired level of error

Problems and illustrative examples related to computer science and IT
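As one illustrative example for this unit, the sketch below computes the sample size needed to estimate a population mean within a desired margin of error, using the standard relation n = (z * sigma / E)^2 for a known standard deviation. Python is an assumption here (the laboratory work names SPSS and Stata), and the function name and the response-time figures are hypothetical.

```python
import math
from statistics import NormalDist


def sample_size_for_mean(sigma, margin, confidence=0.95):
    """Smallest n for which the confidence-interval half-width is at most `margin`.

    Assumes the population standard deviation `sigma` is known and the
    sample mean is approximately normal (Central Limit Theorem).
    """
    # Two-sided critical value, e.g. about 1.96 for 95% confidence.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return math.ceil((z * sigma / margin) ** 2)


# Hypothetical data: server response times with sigma = 40 ms,
# to be estimated within 5 ms at 95% confidence.
print(sample_size_for_mean(sigma=40, margin=5))
```

Halving the margin of error roughly quadruples the required sample size, which is the relationship between sample size and desired level of error listed above.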

Unit 2: Testing of Hypothesis (8 Hrs.)

Types of statistical hypotheses; Power of the test; Concept of p-value and use of p-value in decision making; Steps used in testing of hypothesis; One-sample tests for mean of normal population (for known and unknown variance); Test for single proportion; Test for difference between two means and two proportions; Paired-sample t-test; Linkage between confidence interval and testing of hypothesis

Problems and illustrative examples related to computer science and IT
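As one illustrative example for this unit, the sketch below runs a two-sided one-sample z-test for the mean of a normal population with known variance and reports the p-value. Python is an assumption (the laboratory work names SPSS and Stata), and the function name and the numbers are hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_z_test(sample_mean, mu0, sigma, n):
    """Two-sided z-test of H0: mu = mu0 when the population sigma is known.

    Returns the test statistic z and the two-sided p-value.
    """
    z = (sample_mean - mu0) / (sigma / math.sqrt(n))
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, p_value


# Hypothetical data: 100 page loads average 103 ms; the claimed mean is
# 100 ms and the population standard deviation is taken as 15 ms.
z, p = one_sample_z_test(sample_mean=103, mu0=100, sigma=15, n=100)
print(f"z = {z:.2f}, p-value = {p:.4f}")

# Decision rule: reject H0 at significance level alpha when p < alpha.
alpha = 0.05
print("reject H0" if p < alpha else "fail to reject H0")
```

When the variance is unknown, the same steps apply with the sample standard deviation and the t distribution in place of the normal distribution.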

Unit 3: Non-parametric Tests (8 Hrs.)

Parametric vs. non-parametric tests; Need for applying non-parametric tests; One-sample tests: Run test, Binomial test, Kolmogorov-Smirnov test; Two independent sample tests: Median test, Kolmogorov-Smirnov test, Wilcoxon-Mann-Whitney test, Chi-square test; Paired-sample test: Wilcoxon signed-rank test; Cochran’s Q test; Friedman two-way analysis of variance test; Kruskal-Wallis test

Problems and illustrative examples related to computer science and IT

Unit 4: Multiple Correlation and Regression (6 Hrs.)

Multiple and partial correlation; Introduction to multiple linear regression; Hypothesis testing of multiple regression; Test of significance of regression; Test of individual regression coefficients; Model adequacy tests

Problems and illustrative examples related to computer science and IT
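As one illustrative example for this unit, the sketch below computes a first-order partial correlation from three pairwise correlation coefficients, using the standard formula r12.3 = (r12 - r13 * r23) / sqrt((1 - r13^2)(1 - r23^2)). Python is an assumption (the laboratory work names SPSS and Stata), and the function name, variables, and correlation values are hypothetical.

```python
import math


def partial_correlation(r12, r13, r23):
    """Correlation between variables 1 and 2 after controlling for variable 3."""
    denominator = math.sqrt((1 - r13 ** 2) * (1 - r23 ** 2))
    return (r12 - r13 * r23) / denominator


# Hypothetical correlations: 1 = CPU load, 2 = response time, 3 = active users.
r = partial_correlation(r12=0.8, r13=0.6, r23=0.5)
print(f"r12.3 = {r:.4f}")
```

If the third variable is uncorrelated with the other two (r13 = r23 = 0), the partial correlation reduces to the simple correlation r12.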

Unit 5: Design of Experiments (10 Hrs.)

Experimental design; Basic principles of experimental design; Completely Randomized Design (CRD); Randomized Block Design (RBD): ANOVA table, Efficiency of RBD relative to CRD, Estimation of missing value (one observation only), Advantages and disadvantages; Latin Square Design (LSD): Statistical analysis of m × m LSD for one observation per experimental unit, ANOVA table, Estimation of missing value in LSD (one observation only), Efficiency of LSD relative to RBD, Advantages and disadvantages.

Problems and illustrative examples related to computer science and IT

Unit 6: Stochastic Process (7 Hrs.)

Definition and classification; Markov process: Markov chain, Matrix approach, Steady-state distribution; Counting process: Binomial process, Poisson process; Simulation of stochastic processes; Queuing system: Main components of a queuing system, Little’s law; Bernoulli single-server queuing process: system with limited capacity; M/M/1 system: Evaluating the system performance.
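As one illustrative example for this unit, the sketch below evaluates the steady-state performance of an M/M/1 queue with the standard closed-form results and checks them against Little's law (L = lambda * W). Python is an assumption (the laboratory work names SPSS and Stata), and the function name and the arrival and service rates are hypothetical.

```python
def mm1_performance(arrival_rate, service_rate):
    """Steady-state measures of an M/M/1 queue (requires arrival_rate < service_rate)."""
    if arrival_rate <= 0 or arrival_rate >= service_rate:
        raise ValueError("a steady state needs 0 < arrival_rate < service_rate")
    rho = arrival_rate / service_rate  # server utilization
    return {
        "utilization": rho,
        "mean_jobs_in_system": rho / (1 - rho),                         # L
        "mean_jobs_in_queue": rho ** 2 / (1 - rho),                     # Lq
        "mean_time_in_system": 1 / (service_rate - arrival_rate),       # W
        "mean_waiting_time": rho / (service_rate - arrival_rate),       # Wq
    }


# Hypothetical server: 8 requests/s arrive, 10 requests/s can be served.
perf = mm1_performance(arrival_rate=8, service_rate=10)
for name, value in perf.items():
    print(f"{name}: {value:.3f}")

# Little's law: mean number in system = arrival rate * mean time in system.
print(abs(perf["mean_jobs_in_system"] - 8 * perf["mean_time_in_system"]) < 1e-9)
```

As the utilization approaches 1, the mean number of jobs and the mean time in the system grow without bound, which is why the steady-state formulas require the arrival rate to be below the service rate.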

Laboratory Works:

The laboratory work includes implementing the concepts of statistics using statistical software tools such as SPSS, Stata, etc.

S. No.	Practical problems	No. of practical problems
1	Sampling distribution, random number generation, and computation of sample size	1
2	Methods of estimation (including interval estimation)	1
3	Parametric tests (covering most of the tests)	3
4	Non-parametric tests (covering most of the tests)	3
5	Partial correlation	1
6	Multiple regression	1
7	Design of Experiments	3
8	Stochastic process	2
	Total number of practical problems	15

Text Books:

Ronald E. Walpole, Raymond H. Myers, Sharon L. Myers, & Keying Ye (2012). Probability & Statistics for Engineers & Scientists. 9th Ed., Prentice Hall
Michael Baron (2013). Probability and Statistics for Computer Scientists. 2nd Ed., CRC Press, Taylor & Francis Group, A Chapman & Hall Book

Reference Books:

Douglas C. Montgomery & George C. Runger (2003). Applied Statistics and Probability for Engineers. 3rd Ed., John Wiley & Sons, Inc.
Sidney Siegel & N. John Castellan, Jr. Nonparametric Statistics for the Behavioral Sciences. 2nd Ed., McGraw-Hill International Editions