
Inside PyTouch, Facebook’s ML Library For Touch Processing


Facebook AI recently launched an open-source machine learning library, PyTouch, to process touch sensing signals. It provides state-of-the-art touch processing capabilities as a service to unify the tactile sensing community and help build scalable, proven, performance-validated modules. The library is currently available on GitHub.

With the increased availability of tactile sensors, the sense of touch is becoming a new paradigm in robotics and machine learning. However, ready-to-use touch processing software is limited, resulting in a high entry barrier for budding developers. The processing of raw sensor measurements into high-level features is challenging.

On the other hand, computer vision has mature algorithmic and programmatic methods for understanding images and videos. Popular open-source libraries such as Google’s TensorFlow, PyTorch, Caffe, and OpenCV have further accelerated research by providing unified interfaces, algorithms, and platforms.

Even though tools like PyTorch and Caffe can be used for touch processing, considerable precursor development is needed to adapt their algorithms to experimental and research needs. PyTouch provides an entry point here. The library has been designed to support beginners as well as experts.

With PyTouch, Facebook aims to help researchers develop machine learning models that seamlessly process touch sensing signals. “Sensing the world through touch opens exciting new challenges and opportunities to measure, understand and interact with the world around us,” said Facebook. 

“We believe that similar to computer vision, the availability of open-source and maintained software libraries for processing touch reading would lessen the barrier of entry to tactile based tasks, experimentation, and research in the touch sensing domain,” said Facebook.  

PyTouch architecture 

In a paper titled ‘PyTouch: A Machine Learning Library for Touch Processing’, co-authored by Mike Lambeta, Huazhe Xu, Jingwei Xu, Po-Wei Chou, Shaoxiong Wang, Trevor Darrell, and Roberto Calandra, the researchers describe the architectural choices of the library and demonstrate its capabilities and benefits through several experiments.

The image depicts PyTouch architecture, where tactile touch processing is delivered to the end application ‘as a service’ through released pre-trained models. (Source: arXiv.org) 

As shown in the image above, the software library modularises a set of commonly used tactile-processing functions valuable for various downstream tasks like tactile manipulation, object recognition based on touch, slip detection, etc. With this architecture, PyTouch is stepping up efforts to standardise robotics and machine learning research, enabling better benchmarks and more reproducible results.

Most importantly, the library aims to standardise how touch-based experiments are designed and looks to reduce the amount of one-off software developed, positioning PyTouch as a foundation for expanding future research applications.

Highlights: 

  • PyTouch is built on the machine learning framework PyTorch. 
  • Built on a library of pre-trained models, PyTouch provides real-time touch processing functionalities. 
  • Provides functions such as contact classification, slip detection, and contact area estimation, along with interfaces for training and transfer learning. 
  • The library can train models using data from other vision or non-vision based tactile sensors. 
  • PyTouch enables performance benchmarking of real-world experiments, creating baselines for tactile tasks. 
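Since PyTouch is built on PyTorch, its pre-trained-model workflow can be illustrated with a minimal transfer-learning sketch: a pretrained backbone is frozen and only a small classification head is fine-tuned for binary touch detection. This is a hypothetical sketch in plain PyTorch, not PyTouch’s actual API; the tiny CNN below merely stands in for a real pre-trained feature extractor.

```python
import torch
import torch.nn as nn

class TouchDetector(nn.Module):
    """Pretrained backbone plus a small binary touch/no-touch head."""
    def __init__(self, backbone: nn.Module, feat_dim: int):
        super().__init__()
        self.backbone = backbone             # pretrained feature extractor
        self.head = nn.Linear(feat_dim, 2)   # touch / no-touch logits

    def forward(self, x):
        return self.head(self.backbone(x))

# Stand-in backbone: reduce a 3x64x64 tactile image to an 8-dim feature.
backbone = nn.Sequential(
    nn.Conv2d(3, 8, 3, stride=2, padding=1), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
)
model = TouchDetector(backbone, feat_dim=8)

# Transfer learning: freeze the backbone, fine-tune only the head.
for p in model.backbone.parameters():
    p.requires_grad = False

imgs = torch.randn(4, 3, 64, 64)   # batch of synthetic tactile frames
logits = model(imgs)               # shape: (4, 2)
pred = logits.argmax(dim=1)        # 0 = no touch, 1 = touch
```

In a real setup, only `model.head.parameters()` would be passed to the optimiser, which is what makes fine-tuning on small tactile datasets cheap.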

“Finally, in hand with the framework, we have released a set of pre-trained models which PyTouch uses in the background for tactile based tasks,” said Facebook.

Performance

Facebook has evaluated the performance of machine learning models trained across different models of vision-based tactile sensors, including DIGIT, OmniTact and GelSight.

(Source: arXiv.org)

The above table shows the classification accuracy [%] of touch detection (mean and standard deviation) using cross-validation (k = 5). The joint models are trained with data from all three sensors: DIGIT, OmniTact and GelSight. The cross-validation accuracy with varying training dataset size for single and joint models is shown below.

(Source: arXiv.org)
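The k = 5 cross-validation protocol behind these numbers can be sketched as follows. The labels and per-fold predictions here are synthetic placeholders; in the real experiments, a model is trained on four folds and evaluated on the held-out fold, and the mean and standard deviation of the fold accuracies are reported.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 100, 5
labels = rng.integers(0, 2, n)                  # synthetic touch labels
folds = np.array_split(rng.permutation(n), k)   # k disjoint held-out folds

accs = []
for fold in folds:
    # Stand-in "predictions" for the held-out fold; a real run would
    # train a touch-detection model on the remaining folds first.
    preds = rng.integers(0, 2, len(fold))
    accs.append(np.mean(preds == labels[fold]) * 100)

print(f"accuracy: {np.mean(accs):.1f}% ± {np.std(accs):.1f}%")
```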

The experiments showed that, for the same amount of data, training a joint model on data from multiple sensors (DIGIT, OmniTact and GelSight) results in better model performance than training on data from a single sensor.
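The joint-training setup amounts to pooling samples from each sensor into one training set. A minimal sketch in PyTorch, using synthetic stand-ins for the per-sensor datasets (the sensor names are just labels here, not PyTouch APIs):

```python
import torch
from torch.utils.data import TensorDataset, ConcatDataset, DataLoader

def fake_sensor_data(n):
    """Synthetic stand-in for one sensor's tactile dataset."""
    x = torch.randn(n, 3, 32, 32)      # tactile images
    y = torch.randint(0, 2, (n,))      # touch / no-touch labels
    return TensorDataset(x, y)

# One dataset per sensor type, then pooled into a single joint dataset
# so one model sees the distributions of all three sensors.
digit, omnitact, gelsight = (fake_sensor_data(n) for n in (100, 80, 60))
joint = ConcatDataset([digit, omnitact, gelsight])
loader = DataLoader(joint, batch_size=16, shuffle=True)

xb, yb = next(iter(loader))            # one mixed-sensor training batch
```

Shuffling across the concatenated dataset is what mixes sensors within each batch, which is the key difference from training three single-sensor models.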

Examples of data used in training touch prediction models. The dataset includes data from several DIGIT, OmniTact and GelSight sensors, showing different lighting conditions and objects at various spatial resolutions. (Source: arXiv.org)

Road ahead

Facebook is looking to create an extendable library for touch processing similar to what PyTorch and OpenCV are for computer vision.

PyTouch is still in its early days. With multiple pre-trained models in place, it will allow researchers to focus on rapid prototyping. “We believe that this would beneficially impact the robotic and machine learning community by enabling new capabilities and accelerate research,” concluded Facebook.

PS: The story was written using a keyboard.

Amit Raja Naik

Amit Raja Naik is a seasoned technology journalist who covers everything from data science to machine learning and artificial intelligence for Analytics India Magazine, where he examines the trends, challenges, ideas, and transformations across the industry.