MonoLayout | Bird’s-Eye Layout Estimation from A Single Image
MonoLayout, a practical deep neural architecture that takes just a single image of a road scene as input and outputs an amodal scene layout in bird’s-eye view.
AI Technology & Industry Review
Joseph Redmon, creator of the popular object detection algorithm YOLO, tweeted last week that he had ceased his computer vision research to avoid enabling potential misuse of the tech.
The tool enables researchers to try, compare, and evaluate models to decide which work best on their datasets or for their research purposes.
Thanks to AI technologies such as image recognition and machine learning, people can now save time, food and money in the kitchen while discovering creative and tasty recipes and even generating their own new and personalized flavours.
The study introduces an Event Recognition in Aerial video (ERA) dataset comprising 2,866 aerial videos collected from YouTube and annotated with labels from 25 different classes corresponding to an event that can be seen unfolding over a period of five seconds.
In a bid to simplify 3D deep learning and improve processing performance and efficiency, Facebook recently introduced an open-source framework for 3D computer vision.
Researchers from Beijing’s National Laboratory of Pattern Recognition (NLPR), SenseTime Research, and Nanyang Technological University have taken the tech one step further with a new framework that enables totally arbitrary audio-video translation.
AI systems are already helping farmers with soil analysis, planting, animal husbandry, water conservation and more.
A team of researchers from Institut de Robòtica i Informàtica Industrial and Harvard University recently introduced 3DPeople, a large-scale comprehensive dataset with specific geometric shapes of clothes that is suitable for many computer vision tasks involving clothed humans.
Behavioural analysis technology from Fujitsu Laboratories and its R&D Center identifies suspicious activity by analysing complex combinations of human actions and movements — and does so with minimal training data.
A new study from Peking University and Microsoft Research Asia proposes a novel two-phase framework, FaceShifter, that aims for high-fidelity and occlusion-aware face exchange.
Although “Sudoku” grid-based number puzzles are no match for today’s artificial intelligence systems, a novel approach to the challenge is trending on GitHub due to its practical integration of computer vision technologies.
In the conclusion to our year-end series, Synced spotlights ten datasets that were open-sourced in 2019.
There is increasing attention on machine learning, deep learning, IoT and computer vision technologies in attempts to reduce the damage done by alcohol and improve the safety of drinkers.
The model reduces the number of parameters from some 3 billion to 270 million while improving task performance by an average of 2.05 points.
Synced 10 AI Failures of 2019
Researchers from Beijing University of Posts and Telecommunication have introduced a novel visual dialogue state tracking (VDST) model that performs impressively on the visual dialogue guessing game “GuessWhat?!”
Face recognition will continue to play a big role as short video platforms develop functions to appeal to new target groups and provide more innovative, personalized and engaging content to users.
In fact, the accuracy of results in few-shot learning, both with and without labeled data, is very high.
In a bid to improve object localization in less-than-ideal circumstances, an MIT and IBM research group has proposed a cross-modal auditory localization framework that can effectively locate objects using stereo sound.
Much of the compute for such services is done on the cloud, but ideally these applications would be light enough to run directly on devices without an Internet connection.
ICCV 2019 received 4,303 papers — more than twice the number submitted to ICCV 2017 — and accepted 1,075, for an acceptance rate of roughly 25 percent.
Inspired by OpenCV, Kornia is based on PyTorch and designed to solve generic computer vision problems.
For better or worse, AI can now figure out what you’re doing even without “seeing” you. The MIT Computer Science & AI Lab (CSAIL) has unveiled a neural network model that can detect human actions through walls or in extremely dark places.
Since the business mindset is to focus on short-term feasible technologies, the lack of serious buyers is the real problem for the LiDAR industry.
A group of Chinese researchers has come up with a novel method for identifying mirrors in images that outperforms state-of-the-art detection and segmentation methods on targeted baselines.
The traditional retail industry is facing challenges as the rapid development and continuous improvement of AI tools and techniques ushers in the era of New Retail.
The Neural Architects Workshop gathers experts and researchers in the field of deep neural network (DNN) design to share their insights and experiences working in this domain.
Artificial intelligence is changing the operation modes of industrial production, and has become one of the key tools in the new era of global manufacturing.
In the late 2000s Fortune Global 500 healthcare companies ramped up AI deployment in the industry, from in-hospital diagnosis and treatment to drug supply chain and out-of-hospital scenarios.
Synced Global AI Weekly June 23rd
The traditional retail industry is undergoing a significant reinvention and upgrade as more and more brick and mortar stores boost business by adopting e-commerce platforms powered by cutting-edge tech.
In a new paper accepted at CVPR 2019, researchers from the Max Planck Institute for Intelligent Systems introduce a unique 4D (moving 3D images) face dataset and its learned model VOCA — Voice Operated Character Animation.
Facebook AI’s new “Inverse Cooking” system reverse-engineers recipes from food images, predicting both the ingredients in a dish and the preparation and cooking instructions.
The 2019 IEEE Conference on Computer Vision and Pattern Recognition (CVPR) kicked off today in Long Beach, California. CVPR is one of the world’s top three academic conferences in the field of computer vision (along with ICCV and ECCV).
Building large datasets is a time-consuming and labor-intensive task which challenges entities with limited budgets. There are hundreds of open visual datasets out there, but searching across them and their millions of entries is not a simple task.
Google AI has introduced a deep learning based approach that generates depth prediction from videos where both camera and subject are in motion.
Google’s deep learning TensorFlow platform has added Differentiable Graphics Layers with TensorFlow Graphics, a combination of computer graphics and computer vision. Google says TensorFlow Graphics can solve data labeling challenges for complex 3D vision tasks by leveraging a self-supervised training approach.
Google has released its updated open-source image dataset Open Images V5 and announced the second Open Images Challenge for this autumn’s 2019 International Conference on Computer Vision (ICCV 2019).
Microsoft Build 2019 is around the corner. From May 6 to 8, developers and software engineers will fill Seattle’s Washington State Convention Center, where Microsoft is expected to announce updates to Windows, Office 365, its Azure cloud computing platform, and other company platforms and services.