Machine Learning and Friends Lunch

Trend Detection and Entity Identification

Gary Huang


During my internship at IBM Almaden in summer 2001, another intern and I experimented with simple heuristics for detecting trends in the text of corporate websites. For each crawl of a website, we built term-frequency vectors recording the words and bigrams on each source page along with their frequencies of occurrence. We also attempted to identify the names of people, places, companies, and possible products using word lists and capitalization. I will present some of our experimental results. This is joint work with Eugene Shvets (now at Microsoft) and Wayne Niblack (IBM).
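
As a rough illustration of the kind of heuristics described above, the following Python sketch counts words and bigrams on a page, compares counts between two crawls, and picks out capitalized phrases as candidate names. It is not the code used in the project: the function names (term_frequencies, rising_terms, candidate_names), the min_increase threshold, and the common_words list are all assumptions made for this example.

```python
import re
from collections import Counter


def term_frequencies(text):
    """Count words and adjacent-word bigrams in one page of text.

    Assumption: tokens are runs of letters, digits, and apostrophes,
    lowercased. Punctuation is ignored, so bigrams may span sentences.
    """
    words = re.findall(r"[A-Za-z0-9']+", text.lower())
    counts = Counter(words)
    counts.update(" ".join(pair) for pair in zip(words, words[1:]))
    return counts


def rising_terms(previous, current, min_increase=2):
    """Return terms whose count grew by at least min_increase between crawls.

    The threshold is a hypothetical stand-in for a trend heuristic.
    """
    return {
        term: current[term] - previous[term]
        for term in current
        if current[term] - previous[term] >= min_increase
    }


def candidate_names(text, common_words):
    """Count runs of capitalized words as candidate entity names.

    common_words is a hypothetical word list used to discard ordinary
    words that are capitalized only because they start a sentence.
    """
    runs = re.findall(r"[A-Z][a-z]+(?:\s+[A-Z][a-z]+)*", text)
    return Counter(run for run in runs if run.lower() not in common_words)


old_page = "Acme ships widgets. Widgets are popular."
new_page = (
    "Acme ships the new Turbo Widget. Turbo Widget sales rose. "
    "Turbo Widget reviews are good."
)

trends = rising_terms(term_frequencies(old_page), term_frequencies(new_page))
names = candidate_names(new_page, common_words={"the"})
```

Here trends contains the terms "turbo", "widget", and "turbo widget", each up by 3 between the two crawls, and names counts "Turbo Widget" three times and "Acme" once. Telling a product name from a company or person name would need further word lists, which this sketch leaves out.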
