Google has released its machine learning-based, on-device hand tracking method to researchers and developers, something Google Research calls a “new approach to hand perception.”

First unveiled at CVPR 2019 back in June, Google’s on-device, real-time hand tracking method is now available for developers to explore—implemented in MediaPipe, an open source cross-platform framework for developers looking to build processing pipelines to handle perceptual data, like video and audio.

The approach is said to provide high-fidelity hand and finger tracking via machine learning, inferring 21 3D ‘keypoints’ of a hand from just a single frame.

“Whereas current state-of-the-art approaches rely primarily on powerful desktop environments for inference, our method achieves real-time performance on a mobile phone, and even scales to multiple hands,” the researchers say in a blog post.

Google Research hopes its hand-tracking methods will spark in the community “creative use cases, stimulating new applications and new research avenues.”

The researchers explain that there are three primary systems at play in their hand tracking method: a palm detector model (called BlazePalm), a ‘hand landmark’ model that returns high-fidelity 3D hand keypoints, and a gesture recognizer that classifies the keypoint configuration into a discrete set of gestures.

Here are a few salient bits, boiled down from the full blog post:

  • The BlazePalm palm detector achieves an average precision of 95.7%, the researchers claim.
  • The model learns a consistent internal hand pose representation and is robust even to partially visible hands and self-occlusions.
  • The existing pipeline supports counting gestures from multiple cultures, e.g. American, European, and Chinese, and various hand signs including “Thumb up”, closed fist, “OK”, “Rock”, and “Spiderman”.
  • Google is open sourcing its hand tracking and gesture recognition pipeline in the MediaPipe framework, accompanied with the relevant end-to-end usage scenario and source code, here.
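To make the last stage of the pipeline concrete, here is a minimal sketch of how a gesture recognizer can classify a discrete gesture from keypoint configuration. This is not Google’s implementation: the 21-landmark index layout (wrist at index 0, four joints per finger) and the “fingertip farther from the wrist than the preceding joint” heuristic are assumptions for illustration; the blog post describes angle-based finger-state heuristics in a similar spirit.

```python
import math

# Assumed 21-keypoint layout: 0 = wrist, then four joints per finger
# (thumb 1-4, index 5-8, middle 9-12, ring 13-16, pinky 17-20).
# These indices are an illustrative assumption, not Google's spec.
FINGER_TIPS = {"thumb": 4, "index": 8, "middle": 12, "ring": 16, "pinky": 20}
FINGER_PIPS = {"thumb": 3, "index": 6, "middle": 10, "ring": 14, "pinky": 18}

def finger_states(keypoints):
    """Classify each finger as extended (True) or curled (False) by
    checking whether its tip lies farther from the wrist than the
    preceding joint -- a crude stand-in for joint-angle heuristics."""
    wrist = keypoints[0]
    return {
        name: math.dist(keypoints[tip], wrist) > math.dist(keypoints[FINGER_PIPS[name]], wrist)
        for name, tip in FINGER_TIPS.items()
    }

def classify_gesture(keypoints):
    """Map the 5-finger extended/curled state to a discrete gesture label."""
    extended = {f for f, up in finger_states(keypoints).items() if up}
    if extended == set(FINGER_TIPS):
        return "open palm"
    if extended == {"thumb"}:
        return "thumbs up"
    if not extended:
        return "fist"
    if extended == {"thumb", "index", "pinky"}:
        return "spiderman"
    return "unknown"

# Synthetic "thumbs up": thumb tip far from the wrist, other tips
# pulled back behind their preceding joints.
kp = [(0.0, 0.0, 0.0)] * 21
kp[3], kp[4] = (0.0, 3.0, 0.0), (0.0, 5.0, 0.0)   # thumb extended
for pip, tip in [(6, 8), (10, 12), (14, 16), (18, 20)]:
    kp[pip], kp[tip] = (3.0, 0.0, 0.0), (1.0, 0.0, 0.0)  # curled
print(classify_gesture(kp))  # prints "thumbs up"
```

Real systems classify over normalized keypoints and typically smooth across frames, but the core idea — reduce 21 3D points to a small discrete state, then match states to gestures — is the same.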

Google Research says it plans to continue its hand tracking work in the future, aiming for more robust and stable tracking and a larger set of gestures it can reliably detect. Moreover, the team hopes to also support dynamic gestures, which could be a boon for machine learning-based sign language translation and fluid hand gesture controls.

Not only that, but having more reliable on-device hand tracking is a necessity for AR headsets moving forward; as long as headsets rely on outward-facing cameras to visualize the world, understanding that world will continue to be a problem for machine learning to address.

This article may contain affiliate links. If you click an affiliate link and buy a product we may receive a small commission which helps support the publication. See here for more information.


  • starchaser28

    I would love to see a demo of this implemented on the Oculus Quest!

  • I’ve tried it on my phone. I am not super-impressed: the tracking is not reliable with artificial light and cluttered environments, and the demo doesn’t work with multiple hands. Anyway, it’s a good job; when it works, the 3D tracking is performed very well.

  • Greyl

    Kinda makes the Index controllers look like a good solution in the here and now, but somewhat redundant as camera finger tracking improves.

    • George Stewart

      But how do you integrate haptics with camera based solutions? It’s exciting for sure, but it isn’t right for every application.

      • Ratm

        Current haptics are a bit trashy to begin with, only good wheels provide decent feedback.
        Gloves using this Google tech are probably the future.
        I wish we had that tracking even without the feedback, it looks good enough to even paint with.

      • Greyl

        I guess they could make a small and cheap Bluetooth PC peripheral you strap to your palms, to send haptic feedback. It could even emit IR light to be used to further improve tracking.

  • Eric Draven

    I can’t see how this is different from the Leap Motion device, leaving the “mobile” aside… I tried using Leap with my rift, and it was really cool, but haptics and movement (without a joystick) were the main issues for me, so I moved on to touch controllers

  • Sponge Bob

    wait a minute…

    How can they track in 3D with just one camera ?

    Impossible by definition

    LeapMotion uses sophisticated system of IR ray projections from 2 angles, not just one

    There is no 3D tracking with just one camera, Period.

    • Lachlan Sleight

      There are loads of ways to infer 3D information from a single camera feed. Multiple cameras are used to detect depth using stereo disparity, which is certainly more robust and accurate than many of the one-camera methods.

      How do you think Rift DK2, ARKit / ARCore, PS Move, Vuforia etc perform their 3D tracking? Those are all single-camera 3D tracking systems. They all achieve their tracking in different ways.