Techniques for Chat Data Analytics with Python | by Robin von Malottki | Nov, 2024
In the first part of this series, I introduced you to my artificially created friend John, who was nice enough to provide us with his chats with five of the closest people in his life. We used just the metadata, such as who sent messages at what time, to visualize when John met his girlfriend, when he had fights with one of his best friends, and which family members he should write to more often. If you didn't read the first part of the series, you can find it here.
What we haven't covered yet, but will dive deeper into now, is an analysis of the actual messages. To do that, we will use the chat between John and Maria to identify the topics they discuss. And of course, we will not go through the messages one by one and classify them. Instead, we will use the Python library BERTopic to extract the topics that the chats revolve around.
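As a rough preview of that workflow, here is a minimal sketch. It assumes the chats are available as a list of dictionaries with the hypothetical keys `chat` and `message`; the helper names `select_messages` and `fit_topics`, the three-word minimum, and the `min_topic_size` value are illustrative choices, not taken from the original analysis. Only `BERTopic(...)` and `fit_transform` are actual BERTopic calls.

```python
from __future__ import annotations

from typing import Iterable


def select_messages(records: Iterable[dict], chat_partner: str, min_words: int = 3) -> list[str]:
    """Return cleaned message texts from the chat with `chat_partner`.

    Assumes each record is a dict with hypothetical keys "chat" and "message".
    """
    docs = []
    for record in records:
        if record.get("chat") != chat_partner:
            continue
        # Collapse repeated whitespace and treat missing messages as empty.
        text = " ".join(str(record.get("message") or "").split())
        # Very short messages ("ok", "lol") carry little topical signal;
        # the threshold is an assumption, not a BERTopic requirement.
        if len(text.split()) >= min_words:
            docs.append(text)
    return docs


def fit_topics(docs: list[str], min_topic_size: int = 10):
    """Fit a BERTopic model on the messages and return (model, topic ids)."""
    # Imported lazily so select_messages works without bertopic installed.
    from bertopic import BERTopic

    topic_model = BERTopic(language="english", min_topic_size=min_topic_size)
    topics, _ = topic_model.fit_transform(docs)
    return topic_model, topics
```

With a real chat export, `fit_topics(select_messages(records, "Maria"))` would return the fitted model, whose `get_topic_info()` method gives an overview table of the discovered topics. BERTopic generally needs far more than a handful of messages to produce meaningful clusters, so this is only a sketch of the shape of the pipeline.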
What is BERTopic?
BERTopic is a topic modeling technique introduced by Maarten Grootendorst that uses transformer-based embeddings, specifically BERT embeddings, to generate coherent and interpretable topics from large collections of documents. It was designed to overcome the limitations of traditional topic modeling approaches like LDA (Latent Dirichlet Allocation), which often struggle to handle short…