Ramesh, Sree Harsha, and Krishna Prasad Sankaranarayanan. "Neural Machine Translation for Low Resource Languages using Bilingual Lexicon Induced from Comparable Corpora." In Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Student Research Workshop, pp. 112-119. 2018.
Evaluating Deep Learning Approaches for Character Identification in Multiparty Dialogues
Character identification is an entity linking task that maps each mention in a
multiparty dialogue to a specific character. Mentions are typically nominals
referring to a person, and the entities may be the speakers themselves or
external characters. Identifying such mentions as real characters requires cross-document
entity resolution, which makes the task challenging. The task involves coreference
resolution, which clusters together the mentions that correspond to the same
referent, followed by an entity linking stage in which the clusters of mentions are
mapped to their corresponding entities. Historically, coreference models have
been trained on the NewsWire dataset, whose coreferences are less complex
than those in multiparty dialogues. As on many other
natural language processing tasks, deep-learning models are
the state of the art in coreference resolution. However, coreference resolution
systems have been shown not to handle dialogues as well. This
motivates us to extend and evaluate existing coreference systems, including
rule-based, statistical, and deep-learning-based models, on the annotated TV
show transcripts dataset released as part of SemEval 2018 Task 4.
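The two-stage pipeline described above (coreference clustering followed by entity linking) can be sketched as follows. This is only an illustration of the second stage under stated assumptions, not the system evaluated in this project: the function name link_clusters, the mention ids, the character names, and the majority-vote linking rule are all hypothetical.

```python
from collections import Counter


def link_clusters(clusters, known_labels, default="UNKNOWN"):
    """Map every mention to one entity by majority vote within its cluster.

    clusters: list of lists of mention ids (assumed output of a
        coreference step, which is not implemented here).
    known_labels: dict of mention id -> entity name for the mentions whose
        entity is already known (hypothetical, e.g. an exact name match).
    """
    linked = {}
    for cluster in clusters:
        # Count the known entity labels that occur inside this cluster.
        votes = Counter(known_labels[m] for m in cluster if m in known_labels)
        entity = votes.most_common(1)[0][0] if votes else default
        # Propagate the winning entity to every mention in the cluster.
        for mention in cluster:
            linked[mention] = entity
    return linked


# Hypothetical toy input: three clusters, three mentions with known labels.
clusters = [["m1", "m2", "m3"], ["m4", "m5"], ["m6"]]
known_labels = {"m1": "Character A", "m3": "Character A", "m4": "Character B"}
result = link_clusters(clusters, known_labels)
```

A cluster with no labeled mention falls back to the default label, which is one simple way to represent an unresolved or external character.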
Neural machine translation for low resource languages using bilingual lexicon induced by
Both neural and statistical machine translation approaches are data-hungry and are known to
perform poorly in low-resource settings. Recent crowd-sourcing efforts and workshops on
machine translation have produced small amounts of parallel text for building viable machine
translation systems for low-resource pairs. However, these systems have been shown to suffer from low
accuracy (incorrect translations) and low coverage (high out-of-vocabulary rates) due to
insufficient training data.
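The coverage problem mentioned above can be made concrete by measuring the out-of-vocabulary (OOV) rate of a test set against the training vocabulary. The following is a minimal sketch assuming whitespace tokenization; the function name oov_rate and the toy sentences are illustrative and do not come from the project's data.

```python
def oov_rate(train_sentences, test_sentences):
    """Fraction of test tokens that never occur in the training data."""
    # Training vocabulary, built with plain whitespace tokenization (assumption).
    vocab = {tok for sent in train_sentences for tok in sent.split()}
    test_tokens = [tok for sent in test_sentences for tok in sent.split()]
    if not test_tokens:
        return 0.0
    unseen = sum(1 for tok in test_tokens if tok not in vocab)
    return unseen / len(test_tokens)


# Hypothetical toy corpora.
train = ["the cat sat", "the dog ran"]
test = ["the cat ran fast", "a bird sat"]
rate = oov_rate(train, test)  # 3 of the 7 test tokens are unseen
```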
Comparable corpora, such as Wikipedia, contain collections of topic-aligned but non-sentence-aligned
multilingual documents. We propose to create additional training data by inducing the
scarce bilingual lexicon with sentence pairs automatically extracted from Wikipedia. Subsequently, we will report on how the dataset affects NMT and phrase-based
The problem is interesting because it addresses the scarcity of parallel corpora for
low-resource languages, such as the morphologically rich Indian languages, and it also has potential
downstream benefits for multilingual natural language processing.
Crop Monitoring and Recommendation System using Machine Learning Techniques
This project proposes a crop recommendation system using spectral-spatial classification and a Support Vector Machine (SVM). The farmer provides an image of the crop field as input to the application. In the pre-processing stage, the image is denoised using multiple morphological component analysis (MMCA), which filters the image while retaining its necessary portions. An SVM preceded by Spatial-Spectral Schrödinger Eigenmaps (SSSE) is used as the classification method, in which partial knowledge propagation is leveraged to improve classification accuracy.
The classified image, along with ground-truth statistical data covering weather, crop yield, and state- and county-wise crops, is used to predict the yield of a particular crop under a particular weather condition. This predictive model uses an AdaBoost classifier. Crop recommendation is then facilitated by collaborative filtering. The future scope of the project extends to predictive analytics on the commodity market for the goods grown in the agricultural fields, to predict rises and falls in that market.
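The collaborative filtering step can be illustrated with a small user-based sketch. This is not the project's implementation: the suitability scores, the farm names, and the functions cosine and recommend are hypothetical, and cosine similarity is an assumed choice of similarity measure.

```python
import math


def cosine(u, v):
    """Cosine similarity between two sparse score dictionaries."""
    shared = set(u) & set(v)
    dot = sum(u[k] * v[k] for k in shared)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


def recommend(target, ratings, top_n=1):
    """Rank crops the target farm has not scored yet, weighted by the
    similarity of the farms that did score them."""
    scores = {}
    for other, other_ratings in ratings.items():
        if other == target:
            continue
        sim = cosine(ratings[target], other_ratings)
        for crop, score in other_ratings.items():
            if crop not in ratings[target]:
                scores[crop] = scores.get(crop, 0.0) + sim * score
    return sorted(scores, key=scores.get, reverse=True)[:top_n]


# Hypothetical crop suitability scores per farm.
ratings = {
    "farm_a": {"wheat": 5.0, "corn": 4.0},
    "farm_b": {"wheat": 5.0, "corn": 4.0, "soy": 5.0},
    "farm_c": {"rice": 5.0, "cotton": 4.0},
}
top = recommend("farm_a", ratings)
```

Here farm_a resembles farm_b and shares nothing with farm_c, so the crop that only farm_b has scored ranks first.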
University department automation system
This project deals with the design and implementation of a surveillance
system that uses a security camera to remotely open and close a Raspberry Pi door
lock system for laboratories in educational institutions via a mobile application.
The proposed security system captures video footage and transmits it over Wi-Fi
to a static IP address, where it is viewed in a web browser on a dedicated smart
device in the admin console. The camera streams live video and records the
segments in which motion is detected to the cloud and/or a shared system folder for video
analysis. The result of the video analysis triggers a notification to the mobile
application, which in turn controls the Raspberry Pi to open or close the door lock
system. A door lock fitted with a Raspberry Pi is used so that the opening
and closing mechanism can be controlled remotely via the mobile application. A security camera is used
for surveillance of the labs, from which live footage is sent to the admin console.
Video analysis is performed on the footage using MATLAB to detect motion and
human presence. A notification is sent to the mobile application, at any time of day, whenever
motion is detected.
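The motion detection step can be illustrated with simple frame differencing. The project performed its video analysis in MATLAB and does not state which method it used, so the Python sketch below is an assumed stand-in: the function motion_detected, both threshold values, and the synthetic frames are hypothetical.

```python
def motion_detected(prev_frame, curr_frame, pixel_threshold=25, min_changed=0.01):
    """Report motion when enough pixels differ between two grayscale frames.

    Frames are lists of rows of integer intensities (0-255). A pixel counts
    as changed when its absolute difference exceeds pixel_threshold; motion
    is reported when the changed fraction reaches min_changed.
    """
    total = 0
    changed = 0
    for prev_row, curr_row in zip(prev_frame, curr_frame):
        for p, c in zip(prev_row, curr_row):
            total += 1
            if abs(p - c) > pixel_threshold:
                changed += 1
    return total > 0 and changed / total >= min_changed


# Synthetic 8x8 frames: a static scene, then the same scene with two
# bright pixels standing in for a moving object.
still = [[10] * 8 for _ in range(8)]
moved = [row[:] for row in still]
moved[3][4] = 200
moved[3][5] = 200

no_motion = motion_detected(still, still)
has_motion = motion_detected(still, moved)
```

In a deployed system, a positive result would be the point at which the notification to the mobile application is triggered.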
Learning Collective Behavior using Variant of K means
In this project, I collected social media data and feeds from Twitter, Facebook, and Reddit. I analyzed the collective behavior of certain topics, such as social activism and politics, to predict the likelihood of their being frequently discussed. I used R and Octave to create a map of all participating topics and their collective likelihood parameter, with 92% accuracy in live testing.
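For reference, the standard k-means procedure that the project's variant builds on can be sketched as below. This is plain k-means (Lloyd's algorithm), not the variant used in the project, and it is written in Python rather than R or Octave. The function name, the fixed initial centroids, and the toy points are assumptions for illustration.

```python
def kmeans(points, centroids, iterations=10):
    """Plain k-means on tuples of floats, starting from given centroids."""

    def nearest(p, cents):
        # Index of the centroid with the smallest squared Euclidean distance.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in cents]
        return dists.index(min(dists))

    for _ in range(iterations):
        # Assignment step: group each point with its nearest centroid.
        groups = [[] for _ in centroids]
        for p in points:
            groups[nearest(p, centroids)].append(p)
        # Update step: move each centroid to the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            tuple(sum(dim) / len(g) for dim in zip(*g)) if g else c
            for g, c in zip(groups, centroids)
        ]
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Hypothetical 2-D feature vectors, e.g. two features per topic.
points = [(1.0, 1.0), (1.0, 2.0), (9.0, 9.0), (10.0, 9.0)]
centroids, labels = kmeans(points, [(0.0, 0.0), (10.0, 10.0)])
```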