STAT 330 Home Page

Here are links to the two code files I mentioned today, MCMC4.R and MCMC5.R; the problem with running the first file turned out to be due to a stray invisible character that managed to get into the file.

Here is a link to the code file for the Poisson problem we discussed on 10/11. I have modified it slightly so that the selection of u=b+s is conditional on u≥b, that is, s=u-b≥0. I think the code I showed in class today was not quite correct, but this code is correct.
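The constraint u≥b described above can be illustrated with a minimal Python sketch. This is not the course's R file. It assumes a hypothetical on/off counting setup: n_off counts in time t_off from the background rate b alone, and n_on counts in time t_on from the total rate u=b+s, with flat priors. Under those assumptions each Gibbs conditional is a gamma distribution truncated by u≥b. The function name, the data values, and the use of simple rejection for the truncation are all assumptions made for illustration.

```python
import random


def gibbs_signal_background(n_on, t_on, n_off, t_off, n_samples, seed=0):
    """Gibbs sampler for (b, u) with u = b + s, enforcing u >= b.

    Assumed model (not taken from the course file):
      n_off ~ Poisson(b * t_off), n_on ~ Poisson(u * t_on), flat priors.
    Each full conditional is then a gamma distribution truncated by u >= b.
    """
    rng = random.Random(seed)
    # Starting value for the total rate, chosen to be strictly positive.
    u = max((n_on + 1) / t_on, (n_off + 1) / t_off)
    draws = []
    for _ in range(n_samples):
        # b | u: Gamma(shape n_off + 1, rate t_off) restricted to b <= u,
        # sampled by simple rejection (adequate when the constraint is
        # not very restrictive; it can be slow otherwise).
        while True:
            b = rng.gammavariate(n_off + 1, 1.0 / t_off)
            if b <= u:
                break
        # u | b: Gamma(shape n_on + 1, rate t_on) restricted to u >= b.
        while True:
            u = rng.gammavariate(n_on + 1, 1.0 / t_on)
            if u >= b:
                break
        draws.append((b, u - b))  # background b and signal s = u - b >= 0
    return draws


if __name__ == "__main__":
    # Hypothetical data: 30 on-source counts in 1 time unit,
    # 50 off-source counts in 5 time units.
    samples = gibbs_signal_background(30, 1.0, 50, 5.0, n_samples=5000)
    mean_s = sum(s for _, s in samples) / len(samples)
    print("posterior mean of s:", mean_s)
```

Because u is only ever accepted when u≥b, every stored signal value s=u-b is nonnegative by construction, which is the point of the conditioning.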

Problem Sets

Examples from Class (Number indicates corresponding chart set)

Syllabus

This is a course in Bayesian statistics. Bayesian inference is a powerful and increasingly popular statistical approach, which allows one to deal with complex problems in a conceptually simple and unified way. The recent introduction of Markov Chain Monte Carlo (MCMC) simulation methods has made possible the solution of large problems in Bayesian inference that were formerly intractable. This course will introduce the student to the basic methods and techniques of modern Bayesian inference, including parameter estimation, MCMC simulation, hypothesis testing, and model selection/model averaging in the context of practical problems.

Books

Bayesian Data Analysis, Second Edition (Andrew Gelman, John B. Carlin, Hal S. Stern, and Donald B. Rubin. London: Chapman and Hall)

Introduction to Statistical Thought (Michael Lavine), available here as a free web download.

Topics (not necessarily in this order; subtopics will be presented as appropriate)

Review of probability calculus. Interpretations of probability (e.g., frequency, degree-of-belief). Coherence. Bayes's Theorem. Joint, conditional, and marginal distributions. Independence. Prior distribution, likelihood, and posterior distribution. Bayesian estimation and inference on discrete state spaces. Likelihoods, odds, and Bayes factors. Simple and composite alternatives.

Markov Chain Monte Carlo (MCMC) simulation as a method for practical calculation of Bayesian results. The Gibbs sampler. Metropolis-Hastings sampling. Metropolis-within-Gibbs sampling. Computer tools, e.g., BUGS, S+, R.

Bayesian point and interval parameter estimation. Bayesian credible intervals. Comparison with frequentist parameter estimation and confidence intervals. Bayesian inference on Gaussian distributions. Maximum likelihood estimation as a Bayesian approximation. Laplace's approximation. Bayesian inference in non-Gaussian cases, e.g., Poisson, Cauchy, and arbitrary distributions. Linear and nonlinear models. Errors-in-variables models. Selection models. Hierarchical models.

Prior selection. Subjective and objective priors. Priors as a way to encode actual prior knowledge. Sensitivity of the posterior distribution to the prior. Priors for hierarchical models.

Bayesian hypothesis testing. Comparison with frequentist hypothesis testing. Model selection and model averaging. Reversible jump MCMC for models of variable size. Approximations, e.g., AIC, BIC. Philosophical issues, likelihood principle, and the Bayesian Ockham's Razor.
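As a small illustration of the Metropolis-Hastings sampling listed among the topics above, here is a minimal Python sketch (the course's own examples are in R). It runs a random-walk Metropolis sampler for a Poisson rate under a flat prior. The data values, function names, step size, and burn-in length are assumptions chosen for illustration. Under a flat prior the exact posterior is a gamma distribution with shape sum(counts)+1 and rate len(counts), so the sampler's output can be checked against a known answer.

```python
import math
import random


def log_posterior(lam, counts):
    """Unnormalized log posterior for a Poisson rate under a flat prior."""
    if lam <= 0:
        return -math.inf  # the rate must be positive
    return sum(counts) * math.log(lam) - len(counts) * lam


def metropolis(counts, n_iter, step=1.0, start=1.0, seed=0):
    """Random-walk Metropolis sampler with a Gaussian proposal."""
    rng = random.Random(seed)
    lam = start
    logp = log_posterior(lam, counts)
    chain = []
    for _ in range(n_iter):
        proposal = lam + rng.gauss(0.0, step)
        logp_new = log_posterior(proposal, counts)
        # The proposal is symmetric, so the acceptance probability is
        # min(1, posterior ratio); nonpositive proposals are never accepted.
        if rng.random() < math.exp(min(0.0, logp_new - logp)):
            lam, logp = proposal, logp_new
        chain.append(lam)
    return chain


if __name__ == "__main__":
    counts = [3, 5, 4, 6, 2, 4, 5, 3]  # hypothetical observed counts
    chain = metropolis(counts, n_iter=20000)
    kept = chain[2000:]  # discard an assumed burn-in period
    print("MCMC posterior mean: ", sum(kept) / len(kept))
    print("exact posterior mean:", (sum(counts) + 1) / len(counts))
```

Comparing the two printed means is a quick sanity check of the kind that is useful before trusting a sampler on a problem with no closed-form answer.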

Grading

The course grade will be based 80% on the assignments and 20% on class participation. Class participation works as follows: the notes posted on the web will often contain unanswered questions. You should read the notes in advance and attempt to answer these questions for yourselves, and I will ask students for their answers in class. I will also sometimes ask students to explain how they approached the assignments.

In general, I encourage students to work on the assignments in small groups of two or three (maximum). Statistics is by nature a cooperative enterprise: statisticians act as experts in their field and advise clients (who are experts in their own fields) on how to apply statistics to their problems. By having you work in groups, I hope to foster this sort of cooperative attitude among students in the class. If a group works on an assignment, I would like one paper turned in for the group, with everyone's name at the top. It goes without saying that I expect everyone who works in a group to contribute roughly equally to the final result. For example, in a programming assignment, each member of the group should attempt to program the problem, and the group should then try to work out any differences. If different students arrive at different results, the group should try to locate the sources of the discrepancies and fix them. If no resolution can be found, the group should turn in a paper that displays the several attempts, with a discussion explaining the group's best understanding of the reasons for the discrepancy. Similarly, if a problem is worked and different members of the group obtain different answers, the group should attempt the same kind of resolution and, if no agreement is reached, present a discussion. My role will be to examine what each group presents, comment on it, and provide a grade.

Office Hours

Bill Jefferys' information: Office hours 11:30-12:30, 107 Lord (Math Department). Email bill@bayesrules.net

Web Resources

Tom Loredo's Bayesian Inference in the Physical Sciences (BIPS) website has a lot of useful information about Bayesian inference. Note particularly the first four items in his Bayesian Reprints page, which are very nice tutorials on practical application of Bayesian inference. He also has extensive pointers to other websites including software, reprint archives, etc.

First Bayes is a software package that is intended to help students with the first steps in understanding Bayesian inference. It runs under Windows. It concentrates on simple, closed-form examples but may be helpful to you.

The International Society for Bayesian Analysis (ISBA) is the international Bayesian organization. It sponsors meetings and publishes a newsletter. Dues are not expensive, and for students are set at a reduced rate of $15/year.

Bayesians Worldwide contains links to the home pages of a large number of Bayesians. Many of these individuals maintain collections of their reprints. Most of the prominent Bayesians are listed.

The Bayesian Songbook contains songs that have been presented at various Bayesian meetings over the years. Just for fun. There are also links to pictures of the infamous "Cabarets" at which these songs were sung.

Free Software

Carnegie Mellon University's statistics group has a library of many different statistics packages, including Bayesian packages. It can be accessed here.

Although CMU archives the R package, it's best to go to the R Project (CRAN) homepage, since you'll probably get the most recent version of it. Click here. R runs on Windows, Linux, UNIX and Macintosh (OS 9 or higher). The introductory tutorial for R can be found here. Many add-on packages for R are available at CRAN.

The BUGS project at the University of Cambridge offers the BUGS (Bayesian inference Using Gibbs Sampling) package. It does both Gibbs and Metropolis-Hastings sampling, and the software can be downloaded here. It runs on Windows, and the "classic" version runs on UNIX. There is no Mac version of BUGS. You can run it on a Mac under Virtual PC (at reduced speed), but Virtual PC, which is sold by Microsoft, is not free. Another (cheap) system is sold by iEmulator. The new Macintosh computers (based on Intel chips) can run BUGS at full speed using Boot Camp and your own copy of Windows. Other systems, for example Parallels, will also run Windows on the new Macintoshes.

Not so free software

S Plus is not free, but there is a fairly inexpensive student package. It is sold by Mathsoft. Most of the functionality of S Plus can be found in the free R package (above), so unless you need something not available in R, you don't need to buy S Plus. Also, S Plus has had some memory management problems that cause trouble in large simulations. R does not have these problems.

Matlab is another software package that has become popular for MCMC simulations. It is faster than R or S Plus. A student version is available. Matlab can be instructed to produce C or C++ code, which will run very fast.

Another software package that has been used successfully in MCMC simulations is Gauss, sold by Aptech.

SAS is extremely powerful. It is reputed to be the most difficult of the popular packages to learn. There are versions for Windows and UNIX. The recent Intel Macintoshes can run it under Boot Camp.

Yet another popular package is SPSS. There are versions for Windows and Macintosh.

Welcome to the Stat 330 Home Page

This cartoon is from Mike West's website at Duke University

