Conduct the following analysis for the dataset:
1. Exploratory Data Analysis
Explore the statistical aspects of the dataset. Analyze the
distributions and provide summaries of the relevant statistics. Perform any cleaning,
transformations, interpolations, smoothing, outlier detection/removal, etc. required on the
data. Include figures and descriptions of this exploration and a short description of what
you concluded (e.g. nature of the distributions, indication of suitable modeling approaches you
would try, etc.). Min. 1 page of text + graphics (required).
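As a rough illustration of the summary-statistics and outlier-detection steps named above, the following minimal sketch uses only the Python standard library. The function names `summarize` and `iqr_outliers`, the 1.5 x IQR rule, and the `sample` values are assumptions chosen for illustration; they are not prescribed by the assignment, and any method covered in the course could be used instead.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4)
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Return the values lying more than k * IQR outside the quartiles."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical measurements (not from any real dataset).
sample = [12.1, 11.8, 12.4, 12.0, 11.9, 12.3, 30.5, 12.2]

summary = summarize(sample)
outliers = iqr_outliers(sample)
print(summary)
print("flagged as outliers:", outliers)
```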
2. Model Development, Validation and Optimization
Develop and evaluate at least three (4000-level) or four (6000-level) models. If possible,
these models should cover more than one objective, i.e. regression, classification,
clustering. Consider the effect of dimension reduction of the dataset on model
performance. "Different models" means different combinations of an algorithm and a
formula (input and output features). The choice of independent and response variables is
up to you. Explain why you chose them. Construct the models and test/validate them. Briefly explain the
validation approach. You can use any method(s) covered in the course. Include your code
in your submission. Compare model results if applicable. Report the results of the models
(fits, coefficients, sample trees, other measures of fit/importance, etc., predictors and
summary statistics). Min. 2 pages of text + graphics (required).
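One possible validation approach for a regression model is k-fold cross-validation. The sketch below is only an illustration under stated assumptions: it fits a one-predictor least-squares line in plain Python and reports the held-out RMSE per fold. The names `fit_line` and `k_fold_rmse`, the choice of k = 5, and the toy data are hypothetical and not required by the assignment.

```python
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def k_fold_rmse(xs, ys, k=5, seed=0):
    """Return the held-out RMSE of the fitted line for each of k folds."""
    indices = list(range(len(xs)))
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]
    scores = []
    for fold in folds:
        held_out = set(fold)
        train = [i for i in indices if i not in held_out]
        a, b = fit_line([xs[i] for i in train], [ys[i] for i in train])
        squared_errors = [(ys[i] - (a + b * xs[i])) ** 2 for i in fold]
        scores.append((sum(squared_errors) / len(squared_errors)) ** 0.5)
    return scores


# Hypothetical toy data: y = 2x + 1 with a small alternating perturbation.
xs = [float(i) for i in range(20)]
ys = [2.0 * x + 1.0 + (0.1 if i % 2 else -0.1) for i, x in enumerate(xs)]

scores = k_fold_rmse(xs, ys, k=5)
mean_rmse = sum(scores) / len(scores)
print("RMSE per fold:", scores)
print("mean RMSE:", mean_rmse)
```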
3. Decisions
Describe your conclusions from the model fits and predictions, and explain how well
(or not) the models could be used for decisions and why. Min. 1/2 page
of text + graphics.