A broken computer, appliance or car used to mean a visit to a technician or mechanic, but the recent proliferation of DIY videos has more people performing such repair jobs themselves. Now, a pair of IBM researchers has taken instructional video to the next level with a new fine-grained visual recognition approach and augmented reality (AR) system that can look at the actual piece of hardware being worked on and deliver real-time, step-by-step tech support and guidance.
The researchers say the proposed method can increase the rate of first-time fixes and reduce hardware disruption recovery time by automatically detecting the state of an object and presenting the right set of information in the right context.
AR basically overlays media and graphics on what we see in the real world. Major technological advances and the increased availability of AR software development kits (SDKs) such as ARKit and ARCore over the last decade have lowered the entry barrier for AR developers. In recent years, machine learning has informed the emergence of intelligent systems that further enhance the AR experience.
Despite all the progress, the IBM researchers say most AR experiences remain primitive and lack intelligence and automation, which results in unintuitive user interactions. As users seek increased automation and new and more natural interactions in AR, new visual recognition techniques will have to be developed.
“Our research addresses this gap and provides enriched AR user experiences by enabling a more fine-grained visual recognition feature in AR, which is desirable in a wide range of application scenarios including technical support,” the researchers say.
The proposed solution leverages AR-specific data such as real-time generated 3D feature points and camera pose to complement images captured by the camera for fine-grained visual recognition.
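To illustrate how camera pose can complement raw images, the sketch below projects a known 3D anchor point (say, the centre of a connector on the device) into the current camera frame using a standard pinhole model. This is a generic illustration under assumed intrinsics and poses, not the paper's actual pipeline; all names and numbers are hypothetical.

```python
# Hypothetical sketch: using AR camera pose to locate a 3D point of interest
# in the current image, which could then guide RoI cropping.

def project_point(point_world, rotation, translation, fx, fy, cx, cy):
    """Project a 3D world point into 2D pixel coordinates.

    rotation: 3x3 world-to-camera rotation matrix (list of rows),
    translation: camera translation vector,
    (fx, fy, cx, cy): pinhole camera intrinsics.
    Returns None if the point lies behind the camera.
    """
    # World -> camera coordinates: X_cam = R @ X_world + t
    x, y, z = (
        sum(rotation[i][j] * point_world[j] for j in range(3)) + translation[i]
        for i in range(3)
    )
    if z <= 0:  # behind the camera plane, not visible
        return None
    # Pinhole projection into pixel coordinates
    return (fx * x / z + cx, fy * y / z + cy)

# Identity pose: camera at the origin looking down +Z
R = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
t = [0.0, 0.0, 0.0]
u, v = project_point([0.1, 0.0, 0.5], R, t, fx=800, fy=800, cx=320, cy=240)
```

In an ARKit session the rotation and translation would come from the frame's camera transform rather than being hard-coded as here.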
The researchers first used a set of training videos to learn Regions of Interest (RoIs) where appearance changes can distinguish different states. Because these videos included frames that were motion-blurred or shot from unhelpful viewing angles, the researchers actively tracked device movement and rotation speed to build a filtered RoI image set, minimizing occlusions and other noise in the images used to train the visual recognition model.
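The motion-based filtering step can be sketched as follows. Given per-frame camera poses from the AR session, frames captured while the device was translating or rotating too fast (and thus likely motion-blurred) are dropped. The thresholds and the exact filtering rule are assumptions for illustration; the paper does not necessarily use these values.

```python
# Hypothetical sketch: dropping likely-blurred frames based on device motion.
import math

def motion_ok(prev_pos, pos, prev_yaw, yaw, dt,
              max_speed=0.25, max_rot_speed=0.5):
    """True if translational speed (m/s) and rotational speed (rad/s)
    between two consecutive frames stay below the given thresholds."""
    speed = math.dist(prev_pos, pos) / dt
    rot_speed = abs(yaw - prev_yaw) / dt
    return speed <= max_speed and rot_speed <= max_rot_speed

def filter_frames(frames, dt=1 / 30):
    """Keep frames whose motion relative to the previous frame is slow enough.

    frames: list of (position, yaw) tuples, one per video frame,
    dt: assumed inter-frame interval (30 fps here).
    """
    kept = [frames[0]]  # always keep the first frame as a reference
    for prev, cur in zip(frames, frames[1:]):
        if motion_ok(prev[0], cur[0], prev[1], cur[1], dt):
            kept.append(cur)
    return kept
```

A real implementation would read positions and orientations from the AR framework's tracked camera transform each frame.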
To improve recognition robustness, the researchers developed a Discrete-CNN-LSTM (DCL) model comprising a discrete multi-stream convolutional neural network with bi-directional long short-term memory. The model can extract both spatial and temporal features to predict state changes.
The researchers compared their DCL model with LW (a naive lightweight CNN model) and the representative object recognition model VGG16. In RoI capturing, the DCL model achieved the highest accuracy, at 99.87 percent.
The researchers say their fine-grained visual recognition method will enable AR systems to provide a more immersive and intuitive user experience. Its ability to detect very subtle changes to tiny connectors etc. and guide users reliably through repair processes could open the door to a wide range of automatic and immersive AR-based self-assist user experiences.
The researchers have built an iOS application using ARKit and TensorFlow to demonstrate the effectiveness of their solution and provided comprehensive evaluations in a hardware maintenance application scenario.
The paper Fine-Grained Visual Recognition in Mobile Augmented Reality for Technical Support is on IEEE Xplore.
Reporter: Yuan Yuan | Editor: Michael Sarazen