CSE675: Stochastic Simulation and Inference

                           Instructor: Wen Dong (UBIT: wendong)

                                             Abstract

An emerging application for the stochastic process is analyzing complex natural and social systems using data collected from sensor networks and the World Wide Web that tracks the temporal evolution of those systems. By applying the stochastic process to data, we can improve commuter traffic, tune organizational efficiency, and control the spread of epidemics. In this course, we will learn how to construct and simulate the Markov processes that are the common abstractions of many dynamic systems, and learn the mathematical properties of these processes. Overall, this hands-on course trains students to apply stochastic-process thinking to the temporal dynamics of complex systems, as captured by data.

Course Outline

The stochastic process is a language for studying the statistics of how systems with randomness evolve over time. This process has found application in statistical mechanics, systems biology, finance, and artificial intelligence – to name only a few examples. However, an emerging application for the stochastic process is analyzing complex natural and social systems using the data collected from sensor networks and the World Wide Web. By applying the stochastic process to data, we can do things from improving commuter traffic and tuning organizational efficiency to controlling the spread of epidemics.

This hands-on course trains students to apply stochastic-process thinking to the temporal dynamics of complex systems, and emphasizes the value of this kind of thinking. The stochastic process is a wide topic that has developed for more than a century in math and applied fields, so we will confine our math to the undergraduate non-math-major level but keep our simulation and inference algorithms parallel to the corresponding proofs that our mathematics colleagues are familiar with. We will learn how to construct and simulate the Markov processes that are the common abstractions of many dynamic systems, and learn the mathematical properties of these processes. We will also sample some of the most significant breakthroughs in the field in order to inspire students to build useful and innovative applications for complex systems and data. Ultimately, this course will explore the novel applications of stochastic processes for studying the dynamics of our social systems – as captured by textual and sensor data through Markov chain Monte Carlo, mean field approximation, and Gaussian process approximation.

Goals

By the end of this course, students should be able to do the following:
*      Simulate Markov processes and understand their properties
*      Use Markov processes to study the dynamics of complex systems
*      Make inferences about complex systems from data that tracks the temporal evolution of those systems with Markov processes.

Lecture Schedule

Part A: Simulation of Markov Processes
01 Introduction: examples, why we care, what we can do, review of probability theory
02 Simulating random variables from different probability distributions
03 Simulating Markov processes

Part B: Bayesian Inference of Markov Processes
04 Exact Inference: message passing and expectation maximization
05 MCMC: Gibbs sampler, Metropolis-Hastings algorithm, and hybrid MCMC schemes
06 Legendre-Fenchel transform, mean field approximation, Gaussian process approximation

Part C: Complex Systems as Markov Processes
07 The dynamics of network formation
08 Diffusion in social networks: voting process and contact process
09 Urban dynamics
10 Non-equilibrium statistical mechanics

Part D: Properties of Markov Processes
11 Markov chain: absorbing, regular and ergodic Markov chains, stopping time, strong Markov property
12 Random walks: first passage time distribution, stopping time and strong Markov property, Wald’s identity, strong law of large numbers
13 Queuing process
14-15 Research paper presentations: poster or talk

Assessment Overview

Assessment will be based on four problem sets (15% each) plus your final research paper (30%). Late submissions are worth only half credit. Each student is supposed to share with the class detailed notes of 1-2 lectures (10%).