UMass Machine Learning and Friends Lunch | Main / Tracking The World State With Recurrent Entity Networks

Abstract: We introduce a new model, the Recurrent Entity Network (EntNet). It is equipped with a dynamic long-term memory which allows it to maintain and update a representation of the state of the world as it receives new data. For language understanding tasks, it can reason on-the-fly as it reads text, not just when it is required to answer a question or respond as is the case for a Memory Network (Sukhbaatar et al., 2015). Like a Neural Turing Machine or Differentiable Neural Computer (Graves et al., 2014; 2016) it maintains a fixed size memory and can learn to perform location and content-based read and write operations. However, unlike those models it has a simple parallel architecture in which several memory locations can be updated simultaneously. The EntNet sets a new state-of-the-art on the bAbI tasks, and is the first method to solve all the tasks in the 10k training examples setting. We also demonstrate that it can solve a reasoning task which requires a large number of supporting facts, which other methods are not able to solve, and can generalize past its training horizon. It can also be practically used on large scale datasets such as Children's Book Test, where it obtains competitive performance, reading the story in a single pass.

Bio: I am a Ph.D student at NYU, advised by Yann LeCun. During my Ph.D I have interned at Facebook AI Research and worked on projects involving natural language, reasoning and memory. Prior to that I worked at the NYU Center for Health Informatics and Bioinformatics, applying machine learning methods to problems in biomedicine. I received my undergraduate degree in math from the University of Texas at Austin.