Designing accurate and efficient CNNs for mobile devices is challenging due to the large design space and expensive computational methods. Although many mobile CNNs are available for developers to train and deploy to mobile devices, existing CNN architectures may not achieve the best results for some tasks on mobile devices.
Researchers from Facebook, the National University of Singapore, and the Qihoo 360 AI Institute have jointly proposed OctConv (Octave Convolution), a promising new alternative to traditional convolution operations. Akin to a “compressor” for Convolutional Neural Networks (CNN), the OctConv method saves computational resources while boosting effectiveness.
It is no secret that deep neural networks (DNNs) can achieve state-of-the-art performance in a wide range of complicated tasks. DNN models such as BigGAN, BERT, and GPT 2.0 have proved the high potential of deep learning. Deploying DNNs on mobile devices, consumer devices, drones, and vehicles, however, remains a bottleneck for researchers.
In his 1988 IEEE paper Cellular Neural Networks: Theory, UC Berkeley PhD student Lin Yang proposed Cellular Neural Network theory, a predecessor of the Convolutional Neural Networks (CNNs) that would later revolutionize machine learning. Based on this theory, Yang blueprinted a 20×20 parallel simulated circuit chip in the university lab.
The internet loves those little looping action images we call GIFs. They can tell a short visual story in a small, highly portable file. However, the visual quality of GIFs is usually low compared with the videos they were sourced from. If you are sick of fuzzy, low-resolution GIFs, then researchers from Stony Brook University, UCLA, and Megvii Research have just the thing for you: “the first learning-based method for enhancing the visual quality of GIFs in the wild.”
The proliferation of social media in our daily lives has profoundly changed the way we work and play with others. It has also created an entirely new job: thousands of people worldwide now work for “Community Operations Teams” at Google, Facebook, and Twitter. Whenever a user flags content as offensive, it is sent to these teams for review.
Text-based CAPTCHAs remain one of the most visible and commonly used mechanisms for website security. As a sort of online gatekeeper that distinguishes between humans and bots, the little solvable image fields have critical commercial applications in blocking automated spam and preventing e-transfer fraud, and they can also stop bots from spreading fraudulent information.
The dearth of AI talent capable of manually designing neural architectures such as AlexNet and ResNet has spurred research in automatic architecture design. Google’s Cloud AutoML is an example of a system that enables developers with limited machine learning expertise to train high-quality models. The trade-off, however, is AutoML’s high computational cost.
From Hayao Miyazaki’s Spirited Away to Satoshi Kon’s Paprika, Japanese anime has made it okay for adults everywhere to enjoy cartoons again. Now, a team of researchers from Tsinghua University and Cardiff University has introduced CartoonGAN, an AI-powered technique that renders snapshots of real-world scenery in the styles of Japanese anime masters.
To advance research aimed at giving robots better generalization capabilities, Yi Wu from UC Berkeley and Yuxin Wu, Georgia Gkioxari, and Yuandong Tian from Facebook AI Research recently published the paper Building Generalizable Agents with a Realistic and Rich 3D Environment.
ShuffleNet uses pointwise group convolution and channel shuffle to reduce computation cost while maintaining accuracy. It obtains a lower top-1 error than MobileNet on ImageNet classification, and achieves a ~13x actual speedup over AlexNet while maintaining comparable accuracy.
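The two operations named above can be illustrated with a minimal sketch in plain Python. It is not the ShuffleNet authors' implementation: the function names `channel_shuffle` and `pointwise_conv_weights` are hypothetical, channels are represented as a simple list rather than a tensor, and the weight count ignores biases. The idea is that a grouped 1x1 convolution only connects channels within the same group, so the channels are interleaved afterward to let information flow between groups.

```python
def channel_shuffle(channels, groups):
    """Interleave channels so each output group draws from every input group.

    `channels` is a list in group-major order (group 0 first, then group 1, ...).
    Equivalent to viewing it as (groups, per_group), transposing, and flattening.
    """
    count = len(channels)
    if count % groups != 0:
        raise ValueError("channel count must be divisible by groups")
    per_group = count // groups
    return [
        channels[g * per_group + i]
        for i in range(per_group)
        for g in range(groups)
    ]


def pointwise_conv_weights(c_in, c_out, groups=1):
    """Weight count of a 1x1 convolution (biases ignored, an assumption here).

    With grouping, each output channel sees only c_in / groups input channels.
    """
    if c_in % groups != 0 or c_out % groups != 0:
        raise ValueError("channel counts must be divisible by groups")
    return (c_in // groups) * c_out


if __name__ == "__main__":
    # Six channels in two groups: [0, 1, 2] and [3, 4, 5].
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
    # Dense versus 3-group pointwise convolution on 240 channels.
    print(pointwise_conv_weights(240, 240), pointwise_conv_weights(240, 240, groups=3))
```

In this sketch, grouping a 1x1 convolution into `groups` groups divides its weight count by the same factor, and the shuffle itself adds no weights because it only reorders channels.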
Compared with statistical machine translation (SMT), neural machine translation (NMT) can train multiple features jointly and does not need prior domain knowledge, which enables zero-shot translation. In addition to achieving higher BLEU scores and better sentence structure, NMT can also help reduce the morphology, syntax, and word-order errors that SMT makes.
PixelGAN is an autoencoder for which the generative path is a convolutional autoregressive neural network on pixels, conditioned on a latent code, and the recognition path uses a generative adversarial network (GAN) to impose a prior distribution on the latent code.