apply computational statistics methods to a
topic of your choice using an analysis-friendly dataset.
Analysis ideas:
Below are some suggestions to help you get started:
•Use a bootstrap algorithm to estimate model coefficients or population statistics
•Implement a random variable simulation algorithm (such as Metropolis-Hastings)
using different types of proposal functions
•Use non-parametric methods to estimate an optimal curve for bivariate data
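To make the second idea concrete, here is a minimal sketch in base R of a random-walk sampler run with two different proposal types. Everything in it is an illustrative assumption rather than a required design: the standard normal target, the function names (log_target, run_mh, gaussian_step, uniform_step), the step sizes, and the number of iterations. Both proposals are symmetric, so the Hastings correction cancels and the acceptance ratio reduces to the ratio of target densities (the Metropolis special case).

```r
# Random-walk Metropolis-Hastings sketch with a standard normal target.
# All names and tuning values here are hypothetical, chosen for illustration.
set.seed(1)

# Log density of the target distribution (assumed: standard normal)
log_target <- function(x) dnorm(x, mean = 0, sd = 1, log = TRUE)

run_mh <- function(n_iter, rproposal, x0 = 0) {
  draws <- numeric(n_iter)
  accepted <- 0
  x <- x0
  for (i in seq_len(n_iter)) {
    candidate <- x + rproposal()  # symmetric random-walk step
    # Symmetric proposal, so only the target densities enter the ratio
    log_ratio <- log_target(candidate) - log_target(x)
    if (log(runif(1)) < log_ratio) {
      x <- candidate
      accepted <- accepted + 1
    }
    draws[i] <- x  # a rejected candidate repeats the current value
  }
  list(draws = draws, accept_rate = accepted / n_iter)
}

# Two proposal types: Gaussian increments and uniform increments
gaussian_step <- function() rnorm(1, mean = 0, sd = 1)
uniform_step  <- function() runif(1, min = -2, max = 2)

fit_gauss <- run_mh(5000, gaussian_step)
fit_unif  <- run_mh(5000, uniform_step)

# Compare acceptance rates and sample means across the two proposals
c(gaussian = fit_gauss$accept_rate, uniform = fit_unif$accept_rate)
c(gaussian = mean(fit_gauss$draws), uniform = mean(fit_unif$draws))
```

A project along these lines could compare acceptance rates, trace plots, and autocorrelation across proposal types and step sizes. An asymmetric proposal would also need the proposal density ratio in log_ratio.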
Dataset ideas:
Here are some ideas for datasets:
•Kaggle
•UCI Machine Learning repository
•Pre-loaded datasets in R
R package references:
Below are some examples of R packages that could be applied in your final project.
Look at examples that use these packages for more ideas:
•R package for kernel density estimation: https://cran.r-project.org/web/packages/ks/vignettes/kde.pdf
•R package for Markov chain Monte Carlo simulation: https://cran.r-project.org/web/packages/mcmc/mcmc.pdf
•R package for bootstrap algorithm: https://cran.r-project.org/web/packages/boot/index.html
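As a rough illustration of how one of these packages might be used, the sketch below bootstraps linear regression coefficients with the boot package on the pre-loaded mtcars dataset. The choice of model (mpg regressed on wt), the helper name coef_stat, the seed, and the number of replicates are all assumptions made for this example, not requirements of the project.

```r
library(boot)  # 'boot' is a recommended package shipped with standard R installations

# Statistic function: refit the model on the resampled rows and return its coefficients.
# boot() calls it with the data and a vector of resampled row indices.
coef_stat <- function(data, indices) {
  fit <- lm(mpg ~ wt, data = data[indices, ])
  coef(fit)
}

set.seed(42)
boot_out <- boot(data = mtcars, statistic = coef_stat, R = 2000)

boot_out$t0                                   # coefficients from the original sample
apply(boot_out$t, 2, sd)                      # bootstrap standard errors
boot.ci(boot_out, type = "perc", index = 2)   # percentile interval for the slope
```

Comparing the bootstrap standard errors with those reported by summary(lm(mpg ~ wt, data = mtcars)) is one simple way to discuss when the resampling approach is advantageous.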
Detailed guideline:
While you are not expected to build a computationally complex model, your work needs
to show a logical flow and demonstrate the Bayesian analysis concepts discussed in the
course. This includes the following:
1. Description of the problem: What is the problem you are trying to solve? What is the
motivation and significance behind this? Why might your approach be useful here?
2. Description of your data: What are the variables of interest, and how are they summarized? What
are some caveats of the data (such as data quality issues) that we need to be aware of,
if any?
3. Formulation of your analysis approach: How is the model or estimation algorithm
defined?
4. Computational approach: What methods are you using to analyze the data? You are
encouraged to use existing R packages.
5. Results and conclusion: What is the takeaway from your analysis? What makes your
approach advantageous (or challenging) in your problem? What are the next steps in
your analysis?