Conduct the following analysis for the dataset:
- Exploratory Data Analysis: Explore the statistical aspects of the dataset. Analyze the distributions and provide summaries of the relevant statistics. Perform any cleaning, transformations, interpolations, smoothing, outlier detection/removal, etc. required on the data. Include figures and descriptions of this exploration and a short description of what you concluded (e.g. nature of the distribution, indication of suitable model approaches you would try, etc.). Min. 1 page of text + graphics (required).
- Model Development, Validation and Optimization: Develop and evaluate three (4000-level) or four (6000-level) or more models. If possible, these models should cover more than one objective, i.e. regression, classification, clustering. Consider the effect of dimension reduction of the dataset on model performance. Different models means different combinations of an algorithm and a formula (input and output features). The choice of independent and response variables is up to you; explain why you chose them. Construct the models and test/validate them. Briefly explain the validation approach. You can use any method(s) covered in the course. Include your code in your submission. Compare model results if applicable. Report the results of the models (fits, coefficients, sample trees, other measures of fit/importance, etc., predictors and summary statistics). Min. 2 pages of text + graphics (required).
- Decisions: Describe your conclusions from the model fits and predictions, and explain how well (or not) the models could be used for decisions and why. Min. 1/2 page of text + graphics.
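As a rough illustration of the first two items above, the sketch below removes outliers with the 1.5 × IQR rule and then compares two candidate models by k-fold cross-validation. Everything in it is an assumption made for illustration: the data are synthetic, the helper names (`iqr_filter`, `fit_mean`, `fit_line`, `k_fold_rmse`) are hypothetical, and the two models (a mean-only baseline and a one-feature least-squares line) are stand-ins for whatever algorithms and formulas you choose. It uses only the Python standard library so that it runs anywhere; it is not a template for the required analysis.

```python
import random
import statistics


def iqr_filter(rows, column):
    """Keep rows whose value in `column` lies within 1.5 * IQR of the quartiles."""
    values = [row[column] for row in rows]
    q1, _, q3 = statistics.quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - 1.5 * spread, q3 + 1.5 * spread
    return [row for row in rows if low <= row[column] <= high]


def fit_mean(train):
    """Baseline model: always predict the mean response of the training set."""
    mean_y = statistics.mean(y for _, y in train)
    return lambda x: mean_y


def fit_line(train):
    """One-feature least-squares line y = intercept + slope * x."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mx, my = statistics.mean(xs), statistics.mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: intercept + slope * x


def k_fold_rmse(data, fit, k=5, seed=0):
    """Root-mean-square error pooled over k held-out folds."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]
    squared_errors = []
    for i, test in enumerate(folds):
        # Train on every fold except the one held out for testing.
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        model = fit(train)
        squared_errors.extend((model(x) - y) ** 2 for x, y in test)
    return (sum(squared_errors) / len(squared_errors)) ** 0.5


# Synthetic (x, y) pairs: a noisy line plus one injected gross outlier.
rng = random.Random(42)
raw = [(float(x), 2.0 * x + 1.0 + rng.gauss(0, 1)) for x in range(50)]
raw.append((25.0, 500.0))

clean = iqr_filter(raw, column=1)  # filter on the response value
rmse_mean = k_fold_rmse(clean, fit_mean)
rmse_line = k_fold_rmse(clean, fit_line)

print(f"rows kept: {len(clean)} of {len(raw)}")
print(f"baseline RMSE: {rmse_mean:.2f}")
print(f"linear RMSE:   {rmse_line:.2f}")
```

Reporting the held-out error of each model side by side, as the last lines do, is one simple way to support the model comparison and the decision discussion; other validation schemes covered in the course would serve equally well.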