This repository has been archived on 2026-05-09. You can view files and clone it. You cannot open issues or pull requests or push a commit.
Conduct the following analysis for the dataset:
1. Exploratory Data Analysis
Explore the statistical aspects of the dataset. Analyze the
distributions and provide summaries of the relevant statistics. Perform any cleaning,
transformations, interpolations, smoothing, outlier detection/removal, etc. required on the
data. Include figures and descriptions of this exploration and a short description of what
you concluded (e.g. nature of the distribution, indication of suitable model approaches you
would try, etc.). Min. 1 page of text + graphics (required).
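
As one possible starting point for the exploration described above, the following minimal sketch computes summary statistics, fills a gap by linear interpolation, and flags outliers with the 1.5 × IQR rule. It uses only the Python standard library; the helper names (`summarize`, `iqr_outliers`, `interpolate_missing`) and the sample column `raw` are hypothetical and not part of the assignment, and the IQR rule is just one of several reasonable outlier criteria.

```python
import statistics


def summarize(values):
    """Return basic summary statistics for a list of numbers."""
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return {
        "n": len(values),
        "mean": statistics.mean(values),
        "stdev": statistics.stdev(values),
        "min": min(values),
        "q1": q1,
        "median": median,
        "q3": q3,
        "max": max(values),
    }


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def interpolate_missing(values):
    """Fill interior None gaps by linear interpolation between known neighbours."""
    filled = list(values)
    known = [i for i, v in enumerate(filled) if v is not None]
    for left, right in zip(known, known[1:]):
        step = (filled[right] - filled[left]) / (right - left)
        for i in range(left + 1, right):
            filled[i] = filled[left] + step * (i - left)
    return filled


# Hypothetical column with one missing value and one suspicious value.
raw = [10.0, 12.0, None, 16.0, 15.0, 14.0, 13.0, 95.0, 12.0, 11.0]
clean = interpolate_missing(raw)
stats = summarize(clean)
outliers = iqr_outliers(clean)

print(stats)
print("outliers:", outliers)
```

Whether a flagged value should be removed, capped, or kept is a judgment call that belongs in the written description; the sketch only identifies candidates.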
2. Model Development, Validation and Optimization
Develop and evaluate at least three (4000-level) or four (6000-level) models. If possible,
these models should cover more than one objective, i.e. regression, classification,
clustering. Consider the effect of dimension reduction of the dataset on model
performance. "Different models" means different combinations of an algorithm and a
formula (input and output features). The choice of independent and response variables is
up to you. Explain why you chose them. Construct the models and test/validate them. Briefly explain the
validation approach. You can use any method(s) covered in the course. Include your code
in your submission. Compare model results if applicable. Report the results of the models
(fits, coefficients, sample trees, other measures of fit/importance, etc., predictors and
summary statistics). Min. 2 pages of text + graphics (required).
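
To illustrate what comparing models under a common validation approach can look like, here is a minimal sketch of k-fold cross-validation applied to two regression models: a mean-only baseline and a one-predictor least-squares line. It assumes synthetic data and uses only the standard library; the function names (`fit_mean`, `fit_line`, `kfold_rmse`) are hypothetical, and a real submission would more likely rely on the methods and libraries covered in the course. Dimension reduction is not covered by this sketch.

```python
import math
import random


def fit_mean(xs, ys):
    """Baseline model: always predict the training mean of y."""
    mean_y = sum(ys) / len(ys)
    return lambda x: mean_y


def fit_line(xs, ys):
    """Ordinary least squares for one predictor: y = intercept + slope * x."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: intercept + slope * x


def kfold_rmse(fit, xs, ys, k=5, seed=0):
    """Average held-out RMSE over k shuffled folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    scores = []
    for test_idx in folds:
        held_out = set(test_idx)
        train_idx = [i for i in idx if i not in held_out]
        model = fit([xs[i] for i in train_idx], [ys[i] for i in train_idx])
        sq_err = [(model(xs[i]) - ys[i]) ** 2 for i in test_idx]
        scores.append(math.sqrt(sum(sq_err) / len(sq_err)))
    return sum(scores) / k


# Synthetic data (hypothetical): y is roughly 2*x + 1 plus Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(40)]
ys = [2.0 * x + 1.0 + rng.gauss(0.0, 1.0) for x in xs]

# Same folds (same seed) for both models, so the comparison is like for like.
results = {
    "mean baseline": kfold_rmse(fit_mean, xs, ys),
    "linear regression": kfold_rmse(fit_line, xs, ys),
}
for name, rmse in results.items():
    print(f"{name}: RMSE = {rmse:.2f}")
```

Scoring every model on the same held-out folds is what makes the reported errors comparable; the baseline gives a reference point for judging whether a fitted model adds anything.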
3. Decisions
Describe your conclusions from the model
fits and predictions, and how well (or not) they could be used for decisions, and why. Min. 1/2 page
of text + graphics.