Posterior concentration for Bayesian regression trees and their ensembles - VVSOR

23 February 2018

We cordially invite you to the next Bayes club meeting, taking place on Friday, the 23rd of February in Amsterdam (UvA, Science Park 904). 

Speaker: Stéphanie van der Pas (Leiden University)
Time: 16:00-17:00 on the 23rd of February, 2018
Location: room D1.115, Science Park 904, Amsterdam

Abstract: Since their inception in the 1980s, regression trees have been one of the more widely used nonparametric prediction methods. Tree-structured methods yield a histogram reconstruction of the regression surface, where the bins correspond to terminal nodes of recursive partitioning. Trees are powerful, yet susceptible to overfitting. Strategies against overfitting have traditionally relied on pruning greedily grown trees. The Bayesian framework offers an alternative remedy against overfitting through priors. Roughly speaking, a good prior charges smaller trees where overfitting does not occur. In this paper, we take a step towards understanding why and when Bayesian trees and their ensembles do not overfit. We study the speed at which the posterior concentrates around the true smooth regression function. We propose a spike-and-tree variant of the popular Bayesian CART prior and establish new theoretical results showing that regression trees (and their ensembles) (a) are capable of recovering smooth regression surfaces, achieving optimal rates up to a log factor, (b) can adapt to the unknown level of smoothness and (c) can perform effective dimension reduction when $p > n$. These results provide a piece of missing theoretical evidence explaining why Bayesian trees (and additive variants thereof) have worked so well in practice.
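As a toy illustration of the "histogram reconstruction" view in the abstract (not the paper's method, and all names below are illustrative), here is a minimal one-dimensional regression tree grown by greedy recursive partitioning: each terminal node is a bin whose fitted value is the mean response inside it. Growing the tree deeper adds bins, which is exactly where overfitting can creep in and why a prior favouring smaller trees helps.

```python
def sse(ys):
    """Sum of squared errors of responses around their mean."""
    if not ys:
        return 0.0
    m = sum(ys) / len(ys)
    return sum((y - m) ** 2 for y in ys)

def fit_tree(xs, ys, depth=0, max_depth=2, min_leaf=1):
    """Greedy 1-D regression tree; leaves store the mean response of their bin."""
    pairs = sorted(zip(xs, ys))
    xs = [p[0] for p in pairs]
    ys = [p[1] for p in pairs]
    n = len(xs)
    if depth == max_depth or n < 2 * min_leaf:
        return {"leaf": sum(ys) / n}  # terminal node = histogram bin
    # pick the split point minimising total within-bin squared error
    best = None
    for i in range(min_leaf, n - min_leaf + 1):
        score = sse(ys[:i]) + sse(ys[i:])
        if best is None or score < best[0]:
            best = (score, i)
    i = best[1]
    thr = (xs[i - 1] + xs[i]) / 2
    return {"thr": thr,
            "left": fit_tree(xs[:i], ys[:i], depth + 1, max_depth, min_leaf),
            "right": fit_tree(xs[i:], ys[i:], depth + 1, max_depth, min_leaf)}

def predict(tree, x):
    """Walk down to a terminal node and return its bin mean (piecewise constant)."""
    while "leaf" not in tree:
        tree = tree["left"] if x < tree["thr"] else tree["right"]
    return tree["leaf"]

# Step-function data: the tree recovers the jump at x = 0.5 with a single split.
xs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
ys = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 1.0, 1.0]
tree = fit_tree(xs, ys, max_depth=1)
print(tree["thr"], predict(tree, 0.25), predict(tree, 0.75))  # 0.5 0.0 1.0
```

With a large `max_depth`, every observation ends up in its own bin and the fit interpolates the noise; classical remedies prune such trees back, while the Bayesian route sketched in the abstract instead places prior mass on small trees from the start.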

This is joint work with Veronika Rockova.

For the list of upcoming talks and further information about the seminar, please visit the (relocated) seminar website: bszabo/bayes_club