DiscoBand: Multiview Depth-Sensing Smartwatch Strap for Hand, Body and Environment Tracking
Hands are the chief appendage with which humans manipulate the world around them, and for this reason, digitization of the hands for use in interactive computing systems has been sought after for half a century. Applications vary tremendously, including smart environments with hand tracking, sign language recognition, gesture-sensing smartwatches, and whole-hand replication in virtual reality for object manipulation. Approaches generally fall into one of three categories. First, we can instrument both the user and the environment, for example with optical, acoustic, magnetic or other markers worn by the user and sensed by external sensors. Second, it is possible to instrument only the environment, for example with cameras, and use computer vision to track a user’s body pose. Finally, we can instrument only the user, for example with body-worn IMUs, cameras, and many other types of active and passive devices. This last category has the significant benefit of being mobile (i.e., self-contained and not limited to a specially-instrumented room), and generally encounters less occlusion from objects in the environment (e.g., furniture) and the user themselves, depending on the instrumentation point.
The wrist is a particularly popular instrumentation point from which to sense the hands for three key reasons. First and foremost, it is a common place to wear jewelry, watches, and other bands. Second, it is a highly practical location to affix a small device to the body. And third, wrists are proximate to a user’s hands, offering the potential for superior data capture. For these same reasons, we too focus on the wrist location. Also similar to prior work, we employ optical sensors to capture the hand pose. However, as we will discuss, many prior systems employing such sensors have had to elevate components centimeters above the skin in order to achieve reliable line-of-sight, which results in form factors less amenable to consumer adoption. Additionally, techniques using cameras tend to raise privacy concerns (unless otherwise noted, when we discuss "cameras" with regard to related work, we are referring to commonly-used, high-resolution, RGB or infrared cameras).
In response, we set out to create a sensor band that is comparable to a smartwatch strap: thin and self-contained. For this, we use low-resolution (8×8 pixels), ultra-small (4.9×2.5×1.6 mm) depth cameras. To mitigate natural occlusion (e.g., fingers blocking line-of-sight to other features), we use eight sensors distributed around the periphery of the wrist. When one view is occluded, the others are generally not, and in this manner they can work together to composite a live 3D point cloud and resolve a probable hand pose. Our band also features eight additional depth sensors facing outwards (i.e., normal to the skin), which we use to track the arm and upper body. Taken together, these sixteen depth cameras capture a fisheye-like 1024-point cloud (16 sensors × 64 pixels each), the rays of which are reminiscent of a disco ball reflecting light, and so we dubbed our prototype DiscoBand. The sensors we use have a range of 4 m, allowing us to also capture the proximate environment, opening other application areas discussed later.
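As a rough illustration of how sixteen 8×8 depth frames can be composited into a single 1024-point cloud, the following minimal sketch back-projects each pixel along a ray and transforms it by its sensor's pose. It is not the authors' implementation: the 45° field of view, the pinhole-style ray model, the treatment of each depth value as range along the ray, and the names `pixel_rays` and `composite_cloud` are all assumptions made for illustration. Sensor rotations and positions would come from the band's actual geometry, which is not specified here.

```python
import numpy as np

PIXELS = 8         # 8x8 depth pixels per sensor (from the text)
NUM_SENSORS = 16   # 8 hand-facing + 8 outward-facing sensors (from the text)
FOV_DEG = 45.0     # ASSUMPTION: square field of view, not stated in the text


def pixel_rays(pixels=PIXELS, fov_deg=FOV_DEG):
    """Unit ray direction for every pixel in the sensor frame (+z is forward)."""
    half = np.tan(np.radians(fov_deg) / 2.0)
    # Pixel centres spread evenly across an image plane at z = 1.
    coords = ((np.arange(pixels) + 0.5) / pixels * 2.0 - 1.0) * half
    u, v = np.meshgrid(coords, coords)
    rays = np.stack([u, v, np.ones_like(u)], axis=-1).reshape(-1, 3)
    return rays / np.linalg.norm(rays, axis=1, keepdims=True)


def composite_cloud(depths, rotations, positions):
    """Merge per-sensor depth frames into one point cloud in the band frame.

    depths:    (S, P, P) range in metres along each pixel ray (assumed)
    rotations: (S, 3, 3) sensor-to-band rotation matrices (hypothetical poses)
    positions: (S, 3)    sensor origins in the band frame (hypothetical poses)
    returns:   (S * P * P, 3) points
    """
    num_sensors = depths.shape[0]
    rays = pixel_rays(depths.shape[1])
    # Scale each unit ray by its measured range: (S, P*P, 3) in sensor frames.
    local = rays[None, :, :] * depths.reshape(num_sensors, -1, 1)
    # Rotate into the band frame, then translate by each sensor's origin.
    world = np.einsum("sij,spj->spi", rotations, local) + positions[:, None, :]
    return world.reshape(-1, 3)
```

Called with sixteen frames, `composite_cloud` returns an array of shape `(1024, 3)`. Pixels from an occluded sensor simply contribute nearby points, while the remaining views still cover the hand, which is the intuition behind the multi-view design.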
Overall, DiscoBand offers a unique combination of features and properties that differentiate it from prior work. First and foremost, our band is thin, and could plausibly be integrated into future smartwatches. Second, our multi-view approach is inherently more robust to occlusion than single-view methods. Third, our low-resolution depth data is more privacy preserving than conventional camera-based wrist systems. Finally, our band's unique design and data open entirely new capabilities not previously demonstrated with wrist-worn setups, including the ability to estimate user upper body pose, detect held objects, and scan the environment for obstacles and contextual clues. In this paper, we document the implementation of DiscoBand and different example applications we explored, as well as report results from a series of user studies that underscore the potential of our approach.
Research Team: Nathan Devrio, Chris Harrison
Citation
Nathan Devrio and Chris Harrison. 2022. DiscoBand: Multiview Depth-Sensing Smartwatch Strap for Hand, Body and Environment Tracking. In Proceedings of the 35th Annual ACM Symposium on User Interface Software and Technology (UIST '22). Association for Computing Machinery, New York, NY, USA, Article 56, 1–13. https://doi.org/10.1145/3526113.3545634