Skip to main content

One post tagged with "data science"

View All Tags

Understanding Latent Semantic Analysis

ยท 15 min read
Nick Silva
College Rising Junior

Introduction to LSA:#

LSA#

A classical problem in the field of natural language processing (NLP) is that of finding and matching semantic meaning from large bodies of text. Some classical solutions involve stemming, controlled vocabularies, and using thesauri 1. A different approach to this problem is Latent Semantic Analysis (LSA). LSA is an application of dimension reduction to the fields of natural language processing and information retrieval 2. Mathematically, it involves (1) creating a vector encoding of a large body of text, (2) computing a truncated singular value decomposition on the matrix, and (3) retrieving word and vector similarity in the dimension-reduced space. LSA is a well-studied model that remains an important tool and benchmark in the task of semantic similarity.