# Data Analysis, Design of Experiments and Machine Learning

## ECE59500

### Credit Hours:

1

This is a mini-course that will run for 5 weeks.

## Course Details

Lecture Hours: 3 Credits: 1

### Areas of Specialization:

- Microelectronics and Nanotechnology

### Counts as:

- EE Elective
- CMPE Special Content Elective

### Normally Offered:

Each Fall

### Campus/Online:

On-campus and online

### Requisites:

Grade of B or better in the following courses: [ENGR 13200 or ENGR 16200 or ENGR 13300] and [MA 16200 or MA 16600] and MA 26100 and [MA 26200 or [MA 26500 and MA 26600]]

### Requisites by Topic:

Two semesters of calculus; working knowledge of linear algebra topics such as vectors and matrices; simple first-order differential equations; familiarity with the concepts of probability distribution functions and cumulative distribution functions; computer literacy and experience with a programming language such as MATLAB and a spreadsheet program such as Excel.

### Catalog Description:

This course will provide the conceptual foundation so that a student can use modern statistical concepts and tools to analyze data generated by experiments or numerical simulation. We will also discuss principles of design of experiments so that the data generated by experiments/simulation are statistically relevant and useful. We will conclude with a discussion of analytical tools for machine learning and principal component analysis. At the end of the course, a student will be able to use a broad range of tools embedded in MATLAB and Excel to analyze and interpret their data.

### Required Text(s):

None.

### Recommended Text(s):

- *Applied Statistics and Probability for Engineers*, 3rd Edition, Montgomery and Runger, Wiley, 2003
- *Understanding Robust and Exploratory Data Analysis*, D. C. Hoaglin, F. Mosteller, and J. W. Tukey, Wiley Interscience, 1983
- Video Lectures by Stuart Hunter (available on YouTube)

### Learning Outcomes

A student who successfully fulfills the course requirements will have demonstrated:

- an ability to analyze data taken from a variety of sources
- an ability to design randomized experiments and analyze the results
- an ability to understand key machine learning algorithms and use them in illustrative examples

### Lecture Outline:

Lecture | Topic(s) |
---|---|
1 | Where Do Data Come From: A short history of data; An example of small data; Small vs. big data: The key characteristics of big data; Examples: Genome sequencing, energy farms, health information, consumer information; Treat the data with respect, because they have a story to tell. |
2 | Collecting and Plotting Data: Data is not information: Nature of statistical inference; Parametric vs. nonparametric information; Preparing data for projection: Hazen formula; Preparing data for projection: Kaplan formula |
3 | Physical vs. Empirical Distribution: Physical vs. empirical distribution; Properties of classical distribution functions; Moment-based fitting of data |
4 | Model Selection and Goodness of Fit: Classical Approach: The problem of matching data with a theoretical distribution; Parameter extraction: Moments, linear regression, maximum likelihood; Classical measures of goodness of fit: Residual, Q-Q, K-S, Pearson chi-squared, Cox |
5 | Computer-Aided and Information-Theoretic Approach to Goodness of Fit: Information-theoretic approach: Adjusted R-square, AIC methods; Computer-aided techniques: Jackknife, cross-validation, and bootstrap methods; Parametric vs. non-parametric distribution |
6 | Design of Experiments: Nondimensionalizing Equations: Rules of scaling or nondimensionalization; Scaling of ordinary differential equations; Scaling of partial differential equations; Equivalence of equations and solutions |
7 | Equation-Free Scaling Theory for Design of Experiments: Buckingham Pi theorem; A few illustrative examples; Why the method works |
8 | Statistical Design of Experiments: Single factor and full factorial method; Orthogonal vector analysis: Taguchi/Fisher model; Correlation in dependent parameters |
9 | Fractional Design of Experiments (Taguchi Model): Three representations of the full factorial design; Linear graph approach to partial factorial design; Taguchi table approach to partial factorial design |
10 | Design of Experiments: Analysis by ANOVA: Introduction to analysis of variance; Single-factor analysis of variance; Two-factor analysis of variance; Multi-factor analysis of variance |
11 | Basics of Machine Learning: Principal component analysis; Illustrative examples; Good, bad, and ugly parts of principal component analysis |
12 | Machine Learning by Principal Component Analysis: Principal component analysis; Illustrative examples; Good, bad, and ugly parts of principal component analysis |
13 | Machine Learning from Big Data, Part 1: Neural network and classification, noise reduction, and prediction by neural networks; Hidden Markov models and deep neural networks; Good, bad, and ugly parts of neural network modeling |
14 | Physics-Based Machine Learning from Big Data: What is physics-based machine learning; An example involving dropping a ball from a height; A second example of lake temperature modeling; Opportunities and challenges |
15 | Concluding Lecture: Data is not information; Homework and solutions; If you want to learn more: Introduction to advanced literature |
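
As a small illustration of the Hazen formula named in the lecture on collecting and plotting data, the sketch below computes Hazen plotting positions, p_i = (i - 0.5)/n, for a sorted sample of size n. It is not course material. The course itself uses MATLAB and Excel, Python is used here only for brevity, and the function name `hazen_positions` and the sample values are hypothetical.

```python
def hazen_positions(samples):
    """Pair each sorted observation with its Hazen plotting position.

    The i-th smallest of n observations (i starting at 1) is assigned
    the empirical cumulative probability (i - 0.5) / n.
    """
    ordered = sorted(samples)
    n = len(ordered)
    return [(x, (i - 0.5) / n) for i, x in enumerate(ordered, start=1)]


# Made-up measurements, used only to show the call.
data = [12.1, 9.8, 15.3, 11.0, 13.7]
pairs = hazen_positions(data)

for value, prob in pairs:
    print(f"{value:6.1f}  {prob:.2f}")
```

Plotting the sorted values against these positions on a suitably scaled probability axis is one way to compare the data with a candidate distribution before fitting it.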