Thinking Allowed

medical / technology / education / art / flub

Gaining insights with Natural Language Processing of Reddit Data to Evaluate Dermatology Patient Experiences and Therapeutics.

"There is a lack of research studying patient-generated data on Reddit, one of the world’s most popular forums with active users interested in dermatology. Techniques within natural language processing, a field of artificial intelligence, can analyze large amounts of text information and extract insights. ... Reddit data has viability and utility for dermatologic research and engagement with the public, especially for common dermatology topics such as tanning, acne, and psoriasis." Edidiong Okon, BSE, Vishnutheja Rachakonda, BS, Hyo Jung Hong, AB, Chris Callison-Burch, PhD, Jules Lipoff, MD. Journal of the American Academy of Dermatology.

Researchers used Latent Dirichlet Allocation (LDA) to test whether Reddit is a viable source of information on common dermatology conditions. They found that it can yield insights into patients' concerns and information needs.

Whilst this sort of analysis is not new, it is a good example of unsupervised machine learning applied to a specialist topic area.

We were big fans of LDA at onexamination, having used it as one method of automatically clustering questions and finding related topics for learners. The challenge lies in its unsupervised nature: it can be hard to work out exactly what the machine has used to cluster certain documents. It also requires a large corpus, a big collection of words and documents, to work convincingly.
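One common way to see what the machine has latched onto is to list the most heavily weighted words in each topic. The sketch below is a toy illustration only: it is a small collapsed Gibbs sampler written from scratch, not the implementation used in the paper or at onexamination. The names `fit_lda` and `top_words`, the hyperparameter values and the six tiny "posts" are all invented for the example.

```python
import random

def fit_lda(docs, n_topics, n_iter=200, alpha=0.1, beta=0.01, seed=0):
    """Toy collapsed Gibbs sampler for LDA. docs is a list of token lists."""
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    n_words = len(vocab)

    # Count tables that the sampler keeps up to date.
    doc_topic = [[0] * n_topics for _ in docs]
    topic_word = [[0] * n_words for _ in range(n_topics)]
    topic_total = [0] * n_topics

    # Start by giving every token a random topic.
    assignments = []
    for d, doc in enumerate(docs):
        zs = []
        for w in doc:
            z = rng.randrange(n_topics)
            zs.append(z)
            doc_topic[d][z] += 1
            topic_word[z][word_id[w]] += 1
            topic_total[z] += 1
        assignments.append(zs)

    for _ in range(n_iter):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                wid = word_id[w]
                z = assignments[d][i]
                # Take the token out of the counts...
                doc_topic[d][z] -= 1
                topic_word[z][wid] -= 1
                topic_total[z] -= 1
                # ...then resample its topic given everything else.
                weights = [
                    (doc_topic[d][k] + alpha)
                    * (topic_word[k][wid] + beta)
                    / (topic_total[k] + n_words * beta)
                    for k in range(n_topics)
                ]
                z = rng.choices(range(n_topics), weights=weights)[0]
                assignments[d][i] = z
                doc_topic[d][z] += 1
                topic_word[z][wid] += 1
                topic_total[z] += 1

    return vocab, doc_topic, topic_word

def top_words(vocab, topic_word, n=3):
    """Return the n highest-count words for each topic."""
    result = []
    for row in topic_word:
        ranked = sorted(range(len(vocab)), key=lambda j: -row[j])
        result.append([vocab[j] for j in ranked[:n]])
    return result

# Made-up miniature corpus, NOT real Reddit data.
docs = [
    "acne cleanser breakout skin".split(),
    "acne breakout scars skin".split(),
    "cleanser acne skin routine".split(),
    "psoriasis plaques flare scalp".split(),
    "psoriasis flare steroid scalp".split(),
    "plaques psoriasis steroid flare".split(),
]

vocab, doc_topic, topic_word = fit_lda(docs, n_topics=2)
for k, words in enumerate(top_words(vocab, topic_word)):
    print(f"topic {k}: {' '.join(words)}")
```

With a corpus this small the topics can shift from one random seed to the next, which is the large-corpus problem in miniature.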

Source: www.sciencedirect.com
