Custom LDA

March 22, 2019

Custom LDA topic modeling template based on the R tm package. Unlike the existing VW LDA model, this module creates a top-topics dataset.
## Gibbs Sampling parameters ##

    burnin <- 4000
    iter <- 2000
    thin <- 500
    seed <- list(2003, 5, 63, 100001, 765)
    nstart <- 5
    best <- TRUE

## Input ports ##

- dataset1 - Corpus with the document body as **prepReview** and the document name as **InternalId**
- dataset2 - A dataset containing a single column and a single row: **nb_topics**. The number of topics is passed as a separate dataset rather than as a module parameter because this makes it easier to launch an experiment from Azure Functions.

## Output ports ##

1. Top terms per topic
2. Top topics per document
3. Gamma distribution

The source code is available [here][1]

[1]:
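For orientation, here is a minimal sketch of how the Gibbs sampling parameters and the two input ports might fit together. It is not the module's actual source. It assumes the model is fitted with `LDA()` from the topicmodels package (tm builds the corpus and document-term matrix but has no LDA function of its own). The helper names `get_nb_topics` and `fit_custom_lda`, the `gibbs_control` list, and the `n_terms` default are all hypothetical.

```r
# Gibbs sampling settings copied from the module description.
burnin <- 4000
iter   <- 2000
thin   <- 500
seed   <- list(2003, 5, 63, 100001, 765)
nstart <- 5
best   <- TRUE

# Assumption: the settings are passed to topicmodels::LDA through its `control` list.
# One seed is needed per random start, so length(seed) must equal nstart.
gibbs_control <- list(nstart = nstart, seed = seed, best = best,
                      burnin = burnin, iter = iter, thin = thin)

# Hypothetical helper: read the topic count from the 1 x 1 dataset2.
get_nb_topics <- function(dataset2) {
  stopifnot(nrow(dataset2) == 1, ncol(dataset2) == 1)
  as.integer(dataset2$nb_topics[1])
}

# Hypothetical helper: fit the model and assemble the three outputs.
# Requires the tm, slam and topicmodels packages to be installed.
fit_custom_lda <- function(dataset1, dataset2, n_terms = 10) {
  k <- get_nb_topics(dataset2)

  corpus <- tm::VCorpus(tm::VectorSource(as.character(dataset1$prepReview)))
  dtm <- tm::DocumentTermMatrix(corpus)

  # LDA rejects documents with no terms, so drop empty rows and their ids.
  keep <- slam::row_sums(dtm) > 0
  dtm  <- dtm[keep, ]
  ids  <- as.character(dataset1$InternalId)[keep]

  fit <- topicmodels::LDA(dtm, k = k, method = "Gibbs", control = gibbs_control)

  list(
    top_terms  = as.data.frame(topicmodels::terms(fit, n_terms)),  # output 1
    top_topics = data.frame(InternalId = ids,
                            topic = topicmodels::topics(fit, 1)),  # output 2
    gamma      = data.frame(InternalId = ids, fit@gamma)           # output 3
  )
}
```

With data frames shaped like the input ports, a call such as `fit_custom_lda(dataset1, data.frame(nb_topics = 5))` would return a list with one element per output port. In this sketch the `gamma` element holds the fitted per-document topic proportions, one column per topic.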