Topic Classification of Texts Locally Using BERTopic

I’ve recently been working on survey response data that, in addition to aggregatable question types like Likert-scale and multiple-choice questions, includes optional free-text questions. We are lucky that thousands of respondents take the time to elaborate on questions and leave comprehensive free-text responses, but getting insights from these responses is challenging. While investigating how to enrich this text data with metadata about its topics, I came across BERTopic, which introduces itself as a topic modeling technique that creates clusters allowing for easily interpretable topics. In this post, I’ll explore BERTopic and walk through an example to explain which adjustments worked for me. ...

2023-09-12 · 7 min · Saeed Esmaili