Machine Learning and Friends Lunch






Trend Detection and Entity Identification


Gary Huang
UMass

Abstract


During my internship at IBM Almaden in summer 2001, another intern and I experimented with simple heuristics for detecting trends in the text of corporate websites. On each crawl of a website, we built term-frequency vectors recording the words and bigrams on each source page along with their frequencies of occurrence. We attempted to identify person, place, company, and possible product names using word lists and capitalization. I will present some of our experimental results. Joint work with Eugene Shvets (now at Microsoft) and Wayne Niblack (IBM).
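
For readers unfamiliar with the terms, the following is a minimal Python sketch of the kind of heuristics the abstract describes; it is not the code used in the project. The function names, the tokenizer, the `min_gain` threshold, and the idea of comparing counts between two crawls are all assumptions made for illustration.

```python
import re
from collections import Counter

# Assumed tokenizer: alphabetic words, allowing inner apostrophes and hyphens.
TOKEN_RE = re.compile(r"[A-Za-z][A-Za-z'-]*")


def term_frequencies(text):
    """Count lowercased words and adjacent-word bigrams in one page's text."""
    words = [w.lower() for w in TOKEN_RE.findall(text)]
    counts = Counter(words)
    counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return counts


def rising_terms(old_counts, new_counts, min_gain=2):
    """Return terms whose count grew by at least min_gain between two crawls.

    Hypothetical trend heuristic: a plain difference of counts, sorted by
    largest gain first and alphabetically within ties.
    """
    gains = {t: new_counts[t] - old_counts[t] for t in new_counts}
    rising = [t for t, g in gains.items() if g >= min_gain]
    return sorted(rising, key=lambda t: (-gains[t], t))


def name_candidates(text, common_words):
    """Return runs of capitalized words that are not in a common-word list.

    Limitation of this sketch: punctuation between tokens is ignored, so
    names separated only by a comma or period would be merged into one run.
    """
    candidates, run = [], []
    # The trailing "" is a sentinel that flushes the final run.
    for token in TOKEN_RE.findall(text) + [""]:
        if token[:1].isupper() and token.lower() not in common_words:
            run.append(token)
        else:
            if run:
                candidates.append(" ".join(run))
            run = []
    return candidates
```

Calling `term_frequencies` on the text of each page at each crawl gives one vector per page, `rising_terms` compares two such vectors, and `name_candidates` needs a word list (here just a set of lowercased common words) to filter out sentence-initial capitals such as "The".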
