Abstract
We propose subject matter expert refined topic (SMERT) allocation, a generative probabilistic model for clustering freestyle text. SMERT models are three-level hierarchical Bayesian models in which each item is modeled as a finite mixture over a set of topics. In addition to discrete data inputs, we introduce binomial inputs. These 'high-level' data inputs permit the 'boosting' or affirming of terms in the topic definitions and the 'zapping' of other terms. We also present a collapsed Gibbs sampler for efficient estimation. The methods are illustrated using real-world data from a call center. Finally, we compare SMERT with three alternative approaches using two criteria.
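
For readers unfamiliar with the estimation technique named above, the following is a minimal sketch of a collapsed Gibbs sampler for plain latent Dirichlet allocation with symmetric Dirichlet priors. It is not the SMERT sampler: the abstract does not specify how the binomial 'boosting' and 'zapping' inputs enter the model, so they are omitted here. The function name `collapsed_gibbs_lda`, its parameters, and the default hyperparameter values are illustrative assumptions, not taken from the paper.

```python
import random


def collapsed_gibbs_lda(docs, n_topics, vocab_size, alpha=0.1, beta=0.01,
                        n_iters=200, seed=0):
    """Collapsed Gibbs sampler for plain LDA (an illustration, not SMERT).

    docs: list of documents, each a list of integer word ids in [0, vocab_size).
    Returns (doc_topic_counts, topic_word_counts, assignments).
    """
    rng = random.Random(seed)
    doc_topic = [[0] * n_topics for _ in docs]                # tokens in doc d assigned to topic k
    topic_word = [[0] * vocab_size for _ in range(n_topics)]  # times word w is assigned to topic k
    topic_total = [0] * n_topics                              # total tokens assigned to topic k
    assignments = []

    # Randomly initialise one topic assignment per token.
    for d, doc in enumerate(docs):
        z_d = []
        for w in doc:
            k = rng.randrange(n_topics)
            z_d.append(k)
            doc_topic[d][k] += 1
            topic_word[k][w] += 1
            topic_total[k] += 1
        assignments.append(z_d)

    for _ in range(n_iters):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                # Remove the current token from all counts.
                k_old = assignments[d][i]
                doc_topic[d][k_old] -= 1
                topic_word[k_old][w] -= 1
                topic_total[k_old] -= 1

                # Unnormalised full conditional for the token's topic,
                # with the topic and word distributions integrated out.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][w] + beta)
                    / (topic_total[k] + vocab_size * beta)
                    for k in range(n_topics)
                ]
                k_new = rng.choices(range(n_topics), weights=weights)[0]

                # Add the token back under its newly sampled topic.
                assignments[d][i] = k_new
                doc_topic[d][k_new] += 1
                topic_word[k_new][w] += 1
                topic_total[k_new] += 1

    return doc_topic, topic_word, assignments


# Toy corpus (made-up word ids, for illustration only).
toy_docs = [[0, 1, 0, 2], [3, 4, 3, 5], [0, 2, 1, 1], [4, 5, 3]]
doc_topic, topic_word, assignments = collapsed_gibbs_lda(
    toy_docs, n_topics=2, vocab_size=6, n_iters=50)
```

Each sweep resamples one token's topic at a time from its conditional distribution given all other assignments, so only count tables need to be stored. A model with expert-supplied term constraints would modify the word-topic factor in `weights`, but the exact form used by SMERT is given in the paper, not here.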
Field | Value |
---|---|
Original language | English |
Pages (from-to) | 57-73 |
Number of pages | 17 |
Journal | Applied Stochastic Models in Business and Industry |
Volume | 32 |
Issue number | 1 |
DOIs | |
Publication status | Published - 1 Jan 2016 |
Externally published | Yes |
Keywords
- Bayesian modeling
- Gibbs sampling
- latent Dirichlet allocation