Course code MateD005

Credit points 3

Multivariate Data Analysis I

Total Hours in Course 81

Number of hours for lectures 16

Number of hours for seminars and practical classes 16

Independent study hours 49

Date of course confirmation 04.09.2019

Responsible Unit Institute of Computer Systems and Data Science

Course developers

Author: Prof. Līga Paura, Dr. agr.

Author: Prof. Irina Arhipova, Dr. sc. ing.

Course abstract

PhD students learn multivariate data analysis methods and how to check for violations of model assumptions. The course focuses on choosing among multivariate methods and comparing them. Real examples from biology, agriculture and other sciences are used throughout the studies. Statistical software is provided for problem-solving tasks. The course covers the following topics: parametric and non-parametric statistical methods for two or more samples, ANOVA, repeated measures ANOVA, and general linear models (GLM).

Learning outcomes and their assessment

After completing the course, the PhD student will have:
• knowledge of multivariate data analysis methods for emerging scientific theories and of their application to PhD research in professional fields (the statistical methods needed to test the hypotheses of the doctoral thesis are analysed during the practical classes);
• skills to independently evaluate and select multivariate data analysis methods for significant, original and internationally cited research (the statistical methods needed for the data analysis of the doctoral thesis are selected during the practical classes);
• competence to perform critical analysis and evaluation in collaboration with the PhD supervisor, to solve important scientific or innovative tasks, and to propose research ideas independently (the independent work is developed and defended).

Course Content (Calendar)

1. Introduction to study course [L - 2h].
2. Statistical analysis software [P - 2h].
3. Statistical hypothesis testing. Types of hypothesis testing. Types of errors [L - 2h].
4. Parametric methods for data analysis. Testing of assumptions [L - 2h].
5. Non-parametric methods for data analysis and application in research [P - 2h].
6. Parametric methods for two independent and dependent samples [P - 2h].
7. Non-parametric methods for two independent samples [L/P - 2h].
8. Parametric methods for two dependent samples [L/P - 2h].
9. Parametric methods for three and more samples [L/P - 2h].
10. One-way, two-way and multi-way analysis of variance (ANOVA) [L/P - 2h].
11. Testing ANOVA assumptions [L/P - 2h].
12. Multiple comparison analysis testing in ANOVA [L/P - 2h].
13. Non-parametric methods for three and more samples [L/P - 2h].
14. Repeated Measures ANOVA [L/P - 2h].
15. Introduction to General Linear Models [L/P - 2h].
16. Defense of independent work [P - 2h].
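
As an illustration of topic 10 in the calendar above, the following minimal Python sketch computes the one-way ANOVA F statistic from its definition (between-group mean square divided by within-group mean square). The course does not prescribe a particular language or software package; the function name one_way_anova_f and the sample data are hypothetical and chosen only for illustration.

```python
from statistics import mean


def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of numeric samples.

    Illustrative helper (hypothetical name), not part of any course material.
    """
    k = len(groups)                               # number of groups
    n_total = sum(len(g) for g in groups)         # total number of observations
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Between-group sum of squares: spread of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Within-group sum of squares: spread of observations around their group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Made-up measurements for three treatment groups (illustration only)
samples = [[1, 2, 3], [2, 3, 4], [5, 6, 7]]
f_stat, df_b, df_w = one_way_anova_f(samples)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A p-value would additionally require the F distribution with (df_between, df_within) degrees of freedom, which statistical software provides, and the result is only meaningful once the ANOVA assumptions (topic 11) have been checked.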

Requirements for awarding credit points

The independent work must be developed and defended in an oral project defence at the end of the course. At least three different statistical methods must be used.

Description of the organization and tasks of students’ independent work

Independent work is organised during the semester through independent literature study, supported by consultations with academic staff.

Criteria for Evaluating Learning Outcomes

The assessment of learning outcomes depends on how far the independent work has been developed. To obtain the minimum passing assessment, the student must formulate and test hypotheses using at least 3 statistical methods, based on the data of the doctoral thesis: parametric or non-parametric statistical methods for two or more samples, ANOVA, repeated measures ANOVA, or GLM.

Compulsory reading

1. Are Hugo Pripp (2013) Statistics in food science and nutrition. New York: Springer, 66 pp.
2. François Chollet & J.J. Allaire (2018) Deep learning with R. Shelter Island, NY: Manning Publications Co., 335 pp.
3. Gareth James, et al. (2017) An introduction to statistical learning: with applications in R. New York: Springer, 426 pp.
4. Joseph F. Hair, Jr., William C. Black, Barry J. Babin, Rolph E. Anderson (2014) Multivariate data analysis. Harlow, Essex: Pearson, 734 pp.
5. Robert I. Kabacoff (2015) R in action: data analysis and graphics with R. Shelter Island, NY: Manning, 579 pp.
6. Siegmund Brandt (2014) Data analysis: statistical and computational methods for scientists and engineers. Springer, 523 pp.

Further reading

1. John H. Schuenemeyer, Lawrence J. Drew. (2011). Statistics for Earth and Environmental Scientists. Hoboken, New Jersey: John Wiley & Sons, 407 pp.
2. Joseph F. Hair [et al.] (2010) Multivariate data analysis: a global perspective. Upper Saddle River [etc.]: Pearson, 800 pp.
3. Klaus Backhaus, Bernd Erichson, Wulff Plinke, Rolf Weiber (2000) Multivariate Analysemethoden: eine anwendungsorientierte Einführung. Berlin etc.: Springer, 661 pp.
4. Massimiliano Bonamente. (2013) Statistics and analysis of scientific data. New York: Springer, 301 pp.
5. Nathabandu T. Kottegoda, Renzo Rosso (2008) Applied statistics for civil and environmental engineers. Oxford; Malden, MA: Blackwell Publishing, 718 pp.

Periodicals and other sources

1. Statistical Analysis and Data Mining. Hoboken, Wiley-Blackwell. ISSN: 1932-1864. E-ISSN: 1932-1872
2. Electronic Journal of Applied Statistical Analysis. ESE - Salento University Publishing. ISSN: 2070-5948

Notes

Elective course for all LLU doctoral programs.