The two-day RE•WORK Deep Learning Summit Boston 2019 gathered more than 60 speakers from top AI labs such as MIT CSAIL, Uber AI Labs and Adobe Research, along with experts from the AI healthcare industry, for high-level technical discussions of deep learning and insights into industry applications. There were two separate tracks this year: Deep Learning and Healthcare.
Machines are using deep neural networks to solve previously challenging problems such as Go and Atari games. Speaking in the Deep Learning track, Uber AI Labs Co-Founder and Research Scientist Jason Yosinski said that although researchers can train more complex networks than ever on today’s large-scale datasets, the gap between what they can build and what they can understand is widening.
Yosinski believes this understanding gap hinders progress toward overall AI system competency and bodes poorly for a future world increasingly reliant on algorithms with decreasing interpretability. He surveyed research aimed at closing this gap, including interactive model exploration, optimization, and visualization.
Computational models of attention process images or video and can help guide image processing algorithms. For example, they can direct a model to compose more meaningful image captions or provide feedback within graphic design tools. Adobe Research Scientist Zoya Bylinskii discussed human attention and three approaches researchers use to capture it: eye tracking with dedicated hardware, camera-based eye tracking, and cursor-based tracking.
Bylinskii said cursor-based tracking can be used as a proxy for eye movement when predicting attention on data visualizations, and that the resulting attention maps can be used to create thumbnails or facilitate database search. She wrapped up her talk with a quote from designer Jason Tselentis: “Today, we’re on the verge of another revolution, as artificial intelligence and machine learning turn the graphic design field on its head again.”
Machine learning is widely used at Twitter for personalized search, recommendations, content monitoring and ads. Twitter Cortex Senior Machine Learning Engineer Jay Baxter shared some deep learning approaches for improving recommender systems, including using neural networks to train co-embeddings of new users and items and serving them efficiently at runtime via approximate nearest neighbor algorithms such as LSH or HNSW. He also discussed the difficulties of evaluating such models both offline and online in A/B tests, such as the scarcity of positive examples (likes, replies, retweets) relative to negative examples.
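To make the retrieval step concrete, here is a minimal sketch of approximate nearest neighbor search with random-hyperplane LSH over item embeddings. It is not the system Baxter described: the `LSHIndex` class, its parameters and the toy vectors are all assumptions made for illustration, and a production service would use a tuned library rather than pure Python.

```python
import random


def make_hyperplanes(num_bits, dim, seed):
    """Draw `num_bits` random hyperplanes (normal vectors) in `dim` dimensions."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(num_bits)]


def signature(vec, hyperplanes):
    """One bit per hyperplane: which side of the plane the vector falls on."""
    return tuple(
        1 if sum(h_i * v_i for h_i, v_i in zip(h, vec)) >= 0 else 0
        for h in hyperplanes
    )


def cosine(a, b):
    """Cosine similarity, returning 0.0 for zero-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = sum(x * x for x in a) ** 0.5
    norm_b = sum(y * y for y in b) ** 0.5
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


class LSHIndex:
    """Hypothetical toy index: several hash tables, each with its own hyperplanes."""

    def __init__(self, dim, num_bits=8, num_tables=4, seed=0):
        self.tables = [
            (make_hyperplanes(num_bits, dim, seed + t), {})
            for t in range(num_tables)
        ]
        self.vectors = {}

    def add(self, item_id, vec):
        """Store the vector and file its id under one bucket per table."""
        self.vectors[item_id] = vec
        for planes, buckets in self.tables:
            buckets.setdefault(signature(vec, planes), []).append(item_id)

    def query(self, vec, k=3):
        """Collect items sharing a bucket with `vec`, then rank them exactly."""
        candidates = set()
        for planes, buckets in self.tables:
            candidates.update(buckets.get(signature(vec, planes), []))
        ranked = sorted(
            candidates, key=lambda i: cosine(vec, self.vectors[i]), reverse=True
        )
        return ranked[:k]


if __name__ == "__main__":
    index = LSHIndex(dim=4)
    index.add("item_a", [1.0, 0.0, 0.0, 0.0])
    index.add("item_b", [0.0, 1.0, 0.0, 0.0])
    index.add("item_c", [0.9, 0.1, 0.0, 0.0])
    # A user embedding pointing in roughly the same direction as item_a.
    print(index.query([1.0, 0.0, 0.0, 0.0], k=2))
```

The trade-off is that only items sharing a hash bucket with the query are scored, so lookups avoid a full scan at the cost of occasionally missing a true neighbor. Adding tables raises recall, while adding bits per table makes buckets smaller and more selective.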
On the healthcare track, CVS Pharmacy Principal Data Scientist Janos Perge discussed real-life applications such as predicting costly clinical outcomes from health insurance claims data in low back surgery models and kidney failure models. Perge introduced three methods for improving deep learning sequence model performance: embedding medical codes for input to the model; transfer learning from a pretrained general language model to improve model performance on small and context-specific datasets; and using an attention mechanism to make neural networks more transparent. These methods have been implemented to train deep learning models on massive claims datasets and are currently used by one of the largest players in the health insurance industry.
Deep learning methods can reduce reliance on clinical input, eliminate manual feature engineering, predict multiple outcomes simultaneously, impute missing predictors, and track a user’s “health journey.”
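As a rough illustration of how embedded medical codes and an attention mechanism fit together, and why attention weights can make a model more transparent, the sketch below pools a short sequence of code embeddings using dot-product attention. The code names, the two-dimensional vectors and the query vector are hypothetical and hand-picked; in a trained model they would be learned, and nothing here reflects Perge's actual implementation.

```python
import math

# Hypothetical embedding table: each claim code maps to a small vector.
# Real models learn these vectors; the values here are made up.
CODE_EMBEDDINGS = {
    "code_back_pain": [1.0, 0.0],
    "code_checkup": [0.0, 0.1],
    "code_kidney": [0.0, 1.0],
}


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(code_sequence, query):
    """Return (pooled_vector, weights) for a sequence of claim codes.

    Each code is scored by the dot product of its embedding with `query`;
    the softmax of those scores gives one weight per code, and the pooled
    vector is the weighted sum of the embeddings.
    """
    vectors = [CODE_EMBEDDINGS[code] for code in code_sequence]
    scores = [sum(v_i * q_i for v_i, q_i in zip(v, query)) for v in vectors]
    weights = softmax(scores)
    pooled = [
        sum(w * v[d] for w, v in zip(weights, vectors))
        for d in range(len(query))
    ]
    return pooled, weights


if __name__ == "__main__":
    history = ["code_back_pain", "code_checkup", "code_kidney"]
    # A query vector standing in for "what matters for back surgery risk".
    pooled, weights = attention_pool(history, query=[2.0, 0.0])
    for code, weight in zip(history, weights):
        print(f"{code}: {weight:.3f}")
```

Because the weights sum to one and each is tied to a specific claim code, they can be inspected to see which events in a member's history contributed most to a given prediction.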
The same disease can go by different names in the biomedical literature, so developing efficient disease named entity recognition (NER) is a critical task for biomedical natural language processing (NLP) applications. Disease annotation in biomedical articles helps search engines index them accurately, making it easier for clinicians to find relevant articles.
Philips Research Senior Scientist Sadid Hasan discussed his company’s recently proposed domain knowledge-enhanced long short-term memory network-conditional random field (LSTM-CRF) model for disease named entity recognition, which also uses a character-level convolutional neural network (CNN) and a character-level LSTM network for input embedding. Experimental results showed that the proposed model achieves new state-of-the-art results in disease named entity recognition.
Featured presentations from RE•WORK Deep Learning Summit Boston 2019 are available here.
Author: Yuqing Li | Editor: Michael Sarazen