PyRobot: An Open-source Robotics Framework for Research and Benchmarking

Adithyavairavan Murali, Tao Chen, Kalyan Vasudev Alwala, Dhiraj Gandhi, Lerrel Pinto, Saurabh Gupta, Abhinav Gupta

Introduction

Over the last few years there have been significant advances in AI, specifically in the fields of machine learning, computer vision, natural language processing and speech. Most of these advancements have been fueled by high-capacity neural networks and the availability of large-scale datasets. However, an often overlooked reason for this fast-paced progress has been the development of a conducive research ecosystem. Platforms such as Caffe, PyTorch and TensorFlow have lowered the entry barrier, which has democratized and accelerated research in these fields. For example, a new researcher in computer vision can get started with training state-of-the-art detectors using PyTorch and MSCOCO in less than a day. Common platforms and datasets have also led to standardized evaluations and benchmarks, which help quantify progress in these areas.

The field of data-driven robotics has also seen tremendous excitement and energy in the past several years. However, compared to other areas in AI, it has been relatively hard for a new researcher to get started and contribute to the progress in robotics. Why is that the case? One obvious reason is that researchers have to set up significant hardware infrastructure. This creates a high entry barrier for researchers, both in terms of financial cost and development time. Fortunately, there has been substantial progress on this front with the development of low-cost robots such as Blue, LoCoBot and others. In fact, the cost of a robot is now comparable to that of a GPU! However, even with these low-cost robots, getting started in robotics is still hard due to the lack of research platforms and a self-sustaining ecosystem.

Frameworks such as ROS have made setting up robots substantially easier by providing a common mid-level communication layer and tools that are agnostic to low-level hardware and program context. However, there are two issues with such open-source frameworks:

ROS requires expertise: Dominant robotic software packages like ROS and MoveIt! are complex and require a substantial breadth of knowledge to understand the full stack of planners, kinematics libraries and low-level controllers. Most new users, however, have neither the expertise nor the time to acquire a thorough understanding of the software stack. A lightweight, high-level interface would ease the learning curve for AI practitioners, students and hobbyists interested in getting started in robotics.

Lack of hardware-independent APIs: Writing hardware-independent software is extremely challenging. In the ROS ecosystem, this was partly handled by encapsulating hardware-specific details in the Universal Robot Description Format (URDF), which other downstream services could read from. Yet, from the perspective of high-level AI applications, most robotics code is still hardware dependent. As a community, we lack a research platform and a common API that we can use to share code, datasets and models.

In this white paper, we attempt to tackle these challenges via an open-source research platform, PyRobot. PyRobot is a lightweight, high-level interface on top of ROS that provides hardware-independent mid-level APIs and high-level examples for manipulation and navigation. PyRobot also provides libraries for hand-eye calibration, tele-operation, trajectory tracking, and SLAM-based navigation. We believe PyRobot, combined with the recently released LoCoBot robot, will reduce both the financial cost and development time, leading to the democratization of data-driven robotics. The hardware-independent API will lead to the development of code and datasets that can be shared across the community. While the current PyRobot release interfaces with LoCoBot and Sawyer, we plan to release integration with several new robots like the UR5 and Franka, and simulator platforms like MuJoCo and Habitat.

PyRobot Framework

PyRobot is a Python-based robotics framework that isolates the ROS system from the user end and supports the same API across different robots (see Figure 2 for an overview). Essentially, it provides a Python wrapper around the mid-level features provided by ROS and the low-level C++/C controllers and driver backends. PyRobot has common utility functions for all robots, such as joint position control, joint velocity control, joint torque control, Cartesian path planning, forward kinematics and inverse kinematics (based on the robot URDF file), path planning, and visual SLAM, among other features. Though it abstracts away the complexity of the underlying software stack, users still have the flexibility to use components at varying levels of the hierarchy, such as commanding low-level velocities and torques while bypassing a planner. We summarize the design philosophy behind PyRobot below.

Beginner-friendly. Ideally, new users should be able to start commanding a robot in just a few lines of code, as shown in Listing 1, without learning ROS or the underlying software and firmware stack.

Hardware-agnostic design. PyRobot is designed to easily accommodate common robotic manipulators and mobile bases. Currently, it supports LoCoBot, a low-cost mobile robot with a 5-DOF manipulator, and the Sawyer robot. Each robot has a YACS configuration file that specifies the necessary robot-specific parameters: joint names, ROS topics to get state and set commands, base frame, end-effector frame, planner configuration, inverse kinematics solution tolerance, whether it has an arm, base or camera, etc. A PyRobot object requires the config file for initialization. As shown in Listing 1, the Sawyer robot can be commanded in a manner identical to that of LoCoBot.
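To make the role of the per-robot configuration concrete, the following is a minimal, self-contained sketch of a config-driven robot object. It is not PyRobot's implementation: the class names (`RobotConfig`, `Arm`, `Robot`), the joint and frame names, and the capability flags are all hypothetical placeholders chosen for illustration, and a plain dataclass stands in for the YACS file.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class RobotConfig:
    """Hypothetical stand-in for a per-robot configuration file."""
    name: str
    arm_joint_names: List[str]
    base_frame: str
    ee_frame: str
    has_arm: bool
    has_base: bool
    has_camera: bool


# Placeholder values for illustration only; real joint and frame names differ.
CONFIGS: Dict[str, RobotConfig] = {
    "locobot": RobotConfig(
        name="locobot",
        arm_joint_names=[f"joint_{i}" for i in range(1, 6)],  # 5-DOF arm
        base_frame="base_link", ee_frame="gripper_link",
        has_arm=True, has_base=True, has_camera=True),
    "sawyer": RobotConfig(
        name="sawyer",
        arm_joint_names=[f"right_j{i}" for i in range(7)],  # 7-DOF arm
        base_frame="base", ee_frame="right_hand",
        has_arm=True, has_base=False, has_camera=False),
}


class Arm:
    """Arm interface whose behavior depends only on the config it is given."""

    def __init__(self, cfg: RobotConfig):
        self._cfg = cfg
        self.joint_positions = [0.0] * len(cfg.arm_joint_names)

    def set_joint_positions(self, positions: List[float]) -> None:
        if len(positions) != len(self._cfg.arm_joint_names):
            raise ValueError(
                f"{self._cfg.name} expects {len(self._cfg.arm_joint_names)} joints")
        # A real backend would send the command over the topic named in the config.
        self.joint_positions = list(positions)


class Robot:
    """Builds only the modules that the selected config declares."""

    def __init__(self, name: str):
        self.config = CONFIGS[name]
        self.arm: Optional[Arm] = Arm(self.config) if self.config.has_arm else None
```

With this structure, user code such as `Robot("locobot").arm.set_joint_positions([0.0] * 5)` and `Robot("sawyer").arm.set_joint_positions([0.0] * 7)` differs only in the robot name and the number of joints, which is the property the hardware-agnostic design aims for.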

Open Source. Robotics systems development has typically been limited to robotics experts in academia and industry with access to expensive and niche robotics systems. However, the extensive scope of artificial intelligence requires strong collaboration between researchers to build and maintain these large systems, and open sourcing allows anyone to contribute to all layers of the stack. Apart from the open software, LoCoBot serves as affordable open hardware that can be easily assembled for use with PyRobot. While simulation is useful for software testing and running experiments, writing software that works on the real robot is the eventual goal of the field and poses severe challenges. As more developers gain access to both open hardware and software, high-quality applications tested on real robots can be publicly shared.

Supported Hardware and Simulators

PyRobot is currently integrated with the following robots. In addition to real robots, PyRobot can also be used to control robots in simulators like Gazebo.

Sawyer: The Sawyer is a 7-DOF collaborative robot arm from Rethink Robotics. PyRobot interfaces with the Intera SDK provided with the Sawyer.

Simulators: PyRobot currently supports the Gazebo simulator, a 3D rigid body simulator popular in the robotics community. For LoCoBot and LoCoBot-Lite, PyRobot supports tight integration with Gazebo, i.e., the same code can be run on both Gazebo and the real robot.

PyRobot Controllers

While a number of robots come with their own implementations for low-level control, PyRobot implements basic controllers for differential drive bases. It also interfaces with planners such as MoveIt! and Movebase. We measure the performance of these controllers and planners implemented in PyRobot for the LoCoBot base and arm.

PyRobot implements position controllers to command the robot base to a desired target position, parameterized as a 3-DOF pose $[x,y,\theta]$: the $(x,y)$ location of the base and its heading $\theta$. We implement the following three controllers:

DWA Controller from Movebase: We implemented the Dynamic Window Approach (DWA) controller for our robot through the Movebase navigation engine. In this approach, we repeatedly sample discrete candidate sequences in the robot's control space, select the one with the highest score, and execute it, until the target is reached.

Proportional Controller: We decompose the motion into an on-spot rotation, a linear motion and a final on-spot rotation at the target location. Each segment of this motion is executed using a proportional controller that applies velocities proportional to the tracking error. For smooth motion, we bound the velocities and the change in velocities.

Linear Quadratic Regulator: We analytically compute a trajectory (a sharp one that breaks the motion into an on-spot rotation, a straight motion and a final on-spot rotation; or a smooth one obtained by fitting a Bézier curve between the starting state and the ending state). We sample this trajectory to obtain a state trajectory using constraints on maximum linear and angular velocities. We linearize the dynamics of the robot (assumed to be a bicycle model) around this state trajectory, and construct an LQR feedback controller to track this state trajectory.
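As an illustration of the proportional controller described above, the sketch below simulates the three-segment motion (rotate, drive straight, rotate) on an idealized differential drive base. It is a simplified stand-in rather than PyRobot's controller: the gains, velocity limits, acceleration limits, time step, and the perfect-odometry kinematic model are all assumptions made for the example.

```python
import math


def wrap_angle(a):
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def clip(value, limit):
    """Clamp value to [-limit, limit]."""
    return max(-limit, min(limit, value))


def p_step(err, prev_cmd, kp, v_lim, a_lim, dt):
    """One proportional step with bounded velocity and bounded velocity change."""
    cmd = clip(kp * err, v_lim)
    return prev_cmd + clip(cmd - prev_cmd, a_lim * dt)


def rotate_to(th, target, dt, tol, kp=2.0, w_lim=1.0, aw_lim=3.0, max_steps=10000):
    """On-spot rotation until the heading error is within tol."""
    w = 0.0
    for _ in range(max_steps):
        err = wrap_angle(target - th)
        if abs(err) <= tol:
            break
        w = p_step(err, w, kp, w_lim, aw_lim, dt)
        th = wrap_angle(th + w * dt)
    return th


def go_to_pose(start, goal, dt=0.05, tol=1e-3, kp=1.0, v_lim=0.3, av_lim=0.5,
               max_steps=10000):
    """Drive an ideal base from start to goal, both given as (x, y, theta)."""
    x, y, th = start
    gx, gy, gth = goal
    if math.hypot(gx - x, gy - y) > tol:
        # Segment 1: rotate on the spot to face the goal location.
        th = rotate_to(th, math.atan2(gy - y, gx - x), dt, tol)
        # Segment 2: drive straight; the error is the remaining distance
        # measured along the current heading.
        v = 0.0
        for _ in range(max_steps):
            err = (gx - x) * math.cos(th) + (gy - y) * math.sin(th)
            if abs(err) <= tol:
                break
            v = p_step(err, v, kp, v_lim, av_lim, dt)
            x += v * math.cos(th) * dt
            y += v * math.sin(th) * dt
    # Segment 3: rotate on the spot to the goal heading.
    th = rotate_to(th, gth, dt, tol)
    return x, y, th
```

For example, `go_to_pose((0.0, 0.0, 0.0), (1.0, 1.0, math.pi / 2))` ends within a few millimeters of the target in this idealized simulation. On a real base the loop would be closed on measured odometry instead of the integrated state used here.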

Table 1 reports translation and rotation errors for the different controllers on the two robots across the different test cases. We generally note that errors are lower for LoCoBot than for LoCoBot-Lite. Additionally, the LQR and proportional controllers generally perform better than the DWA controller from Movebase. As all these controllers close the loop on the base odometry, we additionally include errors with respect to base odometry in the right part of the table. We observe that the LQR controller is more effective at closing the loop.

PyRobot also implements trajectory tracking (using feedback controllers as described above). We show qualitative comparisons between different controllers in Figure 5.

Repeatability Tests for Manipulator

High-Level AI Applications

We discuss the implementation of a few example high-level AI applications through the PyRobot API.

Visual SLAM algorithms provide more accurate odometry than odometry derived purely from inertial sensors on the base. We deployed ORB-SLAM2, a leading visual SLAM system, in the PyRobot library. ORB-SLAM2 is a feature-based indirect visual SLAM system that uses ORB features to perform tracking, mapping, and loop closing. We adapt the open-source ORB-SLAM2 code into a ROS package. This package saves RGB and depth images of the keyframes and continuously publishes the camera trajectory and camera pose. PyRobot uses this published pose information to return the robot base state and trajectory. This state derived from visual SLAM can be used in downstream controllers or algorithms for more accurate behavior. PyRobot also supports dense map reconstruction by integrating depth image observations using the camera pose estimated by ORB-SLAM2. This can be used for motion planning in navigation tasks.

Navigation via SLAM and Path Planning

We deployed the Movebase ROS package on LoCoBot and LoCoBot-Lite for safe navigation in environments with obstacles. We use the occupancy map obtained from visual SLAM to compute a 2D cost-map that denotes regions of the environment where the robot is safe to move. Movebase uses this cost-map to generate collision-free trajectories to goals specified in the environment. These trajectories can be executed using any of the controllers implemented in PyRobot. These steps are run continuously, and the plan is updated if it becomes infeasible as the robot perceives previously unseen parts of the environment.
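The idea of turning an occupancy map into a cost-map of safe regions can be sketched with a simple obstacle inflation step. This is a deliberately simplified, binary illustration and not the cost-map computation used by Movebase: the function name, the grid representation (nested lists with 1 for occupied cells), and the circular robot footprint are assumptions made for the example.

```python
import math


def inflate_obstacles(occupancy, robot_radius_m, resolution_m):
    """Mark every cell within one robot radius of an occupied cell as unsafe.

    occupancy: 2D list, 1 = occupied, 0 = free.
    Returns a 2D list of the same shape, 1 = unsafe, 0 = safe.
    """
    rows, cols = len(occupancy), len(occupancy[0])
    radius_cells = math.ceil(robot_radius_m / resolution_m)
    cost = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            if not occupancy[r][c]:
                continue
            # Stamp a disc of unsafe cells around each occupied cell.
            for dr in range(-radius_cells, radius_cells + 1):
                for dc in range(-radius_cells, radius_cells + 1):
                    if dr * dr + dc * dc > radius_cells * radius_cells:
                        continue
                    rr, cc = r + dr, c + dc
                    if 0 <= rr < rows and 0 <= cc < cols:
                        cost[rr][cc] = 1
    return cost
```

A planner can then treat the robot as a point and search only over cells whose cost is 0. Practical cost-maps usually assign graded costs that decay with distance from obstacles rather than a binary value.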

Learned Visual Navigation

We deploy learned policies for visual navigation on LoCoBot using the PyRobot API. We work with the cognitive mapping and planning (CMP) policy from Gupta et al. Given an input goal location, the CMP policy takes in the current image from the on-board camera and outputs one of four macro-actions (stop, turn left, turn right or go straight). We use the base position control interface in the PyRobot API to execute these actions. Listing 9 shows simplified code, and Figure 7 shows frames from a sample execution.
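One way to picture how discrete macro-actions map onto base position control is to express each action as a relative pose target and compose it with the current pose. The sketch below does this in simulation only; the step sizes, the action names, and the helper functions are assumptions for illustration, and no policy or robot interface is involved.

```python
import math

FORWARD_STEP_M = 0.25        # assumed forward step size
TURN_STEP_RAD = math.pi / 2  # assumed turn angle

# Each macro-action is a relative target (dx, dy, dtheta) in the robot frame.
MACRO_ACTIONS = {
    "stop": None,
    "turn_left": (0.0, 0.0, TURN_STEP_RAD),
    "turn_right": (0.0, 0.0, -TURN_STEP_RAD),
    "go_straight": (FORWARD_STEP_M, 0.0, 0.0),
}


def wrap_angle(a):
    """Wrap an angle to the interval [-pi, pi)."""
    return (a + math.pi) % (2.0 * math.pi) - math.pi


def apply_relative(pose, delta):
    """Compose a robot-frame relative target with a world-frame pose."""
    x, y, th = pose
    dx, dy, dth = delta
    return (x + dx * math.cos(th) - dy * math.sin(th),
            y + dx * math.sin(th) + dy * math.cos(th),
            wrap_angle(th + dth))


def rollout(pose, actions):
    """Apply macro-actions in order, ending the episode at the first 'stop'."""
    for name in actions:
        delta = MACRO_ACTIONS[name]
        if delta is None:
            break
        pose = apply_relative(pose, delta)
    return pose
```

On a robot, each relative target would be handed to the base position controller, and the next image would be fed to the policy once the motion completes.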

Grasping

We deploy a learning-based grasping algorithm to grasp objects placed on the ground from RGB images using the PyRobot API. The model is trained on data from people’s homes and is robust to a wide variety of objects and backgrounds. This model outputs a grasp in the image space. This grasp is parameterized by a 2D location in the image and the gripper orientation. We convert this 2D location and orientation into the grasp position (3D location and orientation) using known camera parameters and the depth image. We command the robot to the pre-grasp location, which is a few centimeters above the grasp position, lower the arm to reach the object, and close the gripper to grasp the object. Listing 10 shows simplified code, and Figure 8 shows sample grasps using the LoCoBot.
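The conversion from an image-space grasp to a 3D grasp location can be illustrated with the standard pinhole camera model. The sketch below is an assumption-laden example rather than the code used in the paper: the intrinsics, the camera-to-base rotation and translation, and the 5 cm pre-grasp clearance are made-up values, and the gripper orientation is left out for brevity.

```python
def deproject(u, v, depth_m, fx, fy, cx, cy):
    """Back-project pixel (u, v) with metric depth into the camera frame."""
    x = (u - cx) * depth_m / fx
    y = (v - cy) * depth_m / fy
    return (x, y, depth_m)


def to_base_frame(p_cam, rotation, translation):
    """Apply p_base = R * p_cam + t, with R given as a 3x3 nested list."""
    return tuple(
        sum(rotation[i][j] * p_cam[j] for j in range(3)) + translation[i]
        for i in range(3)
    )


def pregrasp_from(grasp_xyz, clearance_m=0.05):
    """Pre-grasp location a few centimeters above the grasp location."""
    x, y, z = grasp_xyz
    return (x, y, z + clearance_m)


# Hypothetical setup: a camera looking straight down, mounted 0.6 m above
# the ground and 0.3 m in front of the base origin.
R_BASE_CAM = [[1.0, 0.0, 0.0],
              [0.0, -1.0, 0.0],
              [0.0, 0.0, -1.0]]
T_BASE_CAM = (0.3, 0.0, 0.6)

p_cam = deproject(380, 240, 0.5, fx=600.0, fy=600.0, cx=320.0, cy=240.0)
grasp = to_base_frame(p_cam, R_BASE_CAM, T_BASE_CAM)
pregrasp = pregrasp_from(grasp)
```

The arm would first move to `pregrasp`, then descend to `grasp` before closing the gripper. In practice the extrinsics come from hand-eye calibration rather than hand-written constants.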

Pushing

We deploy a heuristic-based pushing algorithm using PyRobot. It relies on the depth sensor, and thus the quality of the pushing depends on how well the stereo-based depth sensor behaves against different backgrounds. For best performance, the robot should be placed on a floor with non-uniform texture.

The algorithm can be summarized with the following steps: (1) Move the arm out of the camera’s field of view. (2) Filter the point cloud seen by the RGBD camera, specifically removing points that are too far away and those that correspond to the floor, by coordinate thresholding. (3) Project the remaining point cloud onto the xy-plane and use the DBSCAN algorithm to automatically cluster the projected points. (4) Randomly select one cluster and choose a random push-start point on the enclosing bounding box of the cluster. (5) Move the gripper to the push-start point and then move it horizontally towards the center of the cluster. Listing 11 shows simplified code.
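Steps (2) to (5) above can be sketched on a synthetic point cloud as follows. This is an illustrative reimplementation under stated assumptions, not the code from the paper: the thresholds, the small pure-Python DBSCAN, the use of the cluster centroid as its center, and all function names are choices made for the example.

```python
import math
import random


def filter_points(points, max_range, floor_z):
    """Step 2: drop far-away points and floor points by coordinate thresholds."""
    return [p for p in points
            if math.hypot(p[0], p[1]) <= max_range and p[2] > floor_z]


def dbscan(points_2d, eps, min_pts):
    """Minimal O(n^2) DBSCAN; returns one label per point (-1 = noise)."""
    n = len(points_2d)
    labels = [None] * n

    def neighbors(i):
        xi, yi = points_2d[i]
        return [j for j in range(n)
                if math.hypot(xi - points_2d[j][0], yi - points_2d[j][1]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbors(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins but is not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbors(j)
            if len(j_nbrs) >= min_pts:  # core point: keep growing the cluster
                queue.extend(j_nbrs)
    return labels


def plan_push(points, rng, max_range=1.0, floor_z=0.02, eps=0.05, min_pts=3):
    """Steps 2-5: return a push-start point and a push target, or None."""
    kept = filter_points(points, max_range, floor_z)
    xy = [(p[0], p[1]) for p in kept]  # Step 3: project onto the xy-plane
    labels = dbscan(xy, eps, min_pts)
    cluster_ids = sorted({label for label in labels if label >= 0})
    if not cluster_ids:
        return None
    chosen = rng.choice(cluster_ids)  # Step 4: pick a random cluster
    members = [p for p, label in zip(xy, labels) if label == chosen]
    xs = [p[0] for p in members]
    ys = [p[1] for p in members]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)
    side = rng.choice(["left", "right", "bottom", "top"])
    if side == "left":
        start = (x_min, rng.uniform(y_min, y_max))
    elif side == "right":
        start = (x_max, rng.uniform(y_min, y_max))
    elif side == "bottom":
        start = (rng.uniform(x_min, x_max), y_min)
    else:
        start = (rng.uniform(x_min, x_max), y_max)
    # Step 5: the gripper moves from start towards the cluster centroid.
    target = (sum(xs) / len(xs), sum(ys) / len(ys))
    return {"start": start, "target": target,
            "bbox": (x_min, x_max, y_min, y_max)}
```

Calling `plan_push(cloud, random.Random(0))` on a list of `(x, y, z)` points returns a start point on the bounding box of one detected object and a target at its centroid. A practical version would likely pad the bounding box so that the gripper starts clear of the object.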

Related Work

Robotics Software Design. The robotics community has embraced a layered hierarchical software design from the early days, and re-usability has been a core design principle. We refer readers to Tsardoulias and Mitkas for a comprehensive review. There have been several motion planning libraries, such as OpenRave, MoveIt! and OMPL, which provide hardware-agnostic core functionalities that can be compiled for each specific robot. Like ROS, there have also been robotics ecosystems, such as OROCOS and the Microsoft Robotics Studio, that support kinematic libraries, distributed processes, and state machines for the real-time control of robots.

Low-cost Mobile Manipulators. There has been very limited research on learning with low-cost robots, given that most researchers use standard industrial or collaborative robots. Deisenroth et al. used model-based RL to teach a cheap, inaccurate 6-DOF robot to stack multiple blocks, and a previous iteration of LoCoBot was used in Gupta et al. to learn visual grasping policies with real data collected in people’s homes. Recently, Gealy et al. proposed a compliant low-cost arm using quasi-direct drive actuation.

Open Source Manipulators. There has been very limited work on open-source manipulators. Raven is an open-architecture surgical research robot. Recently, the Open Manipulator project from Robotis has allowed users to build their own low-cost robot with custom kinematics and design.

Research Ecosystems in AI Fields. Research in a number of AI fields has benefited from there being common tasks (such as object detection in computer vision or parsing in NLP), common datasets (such as BSDS, ImageNet, PASCAL VOC and MSCOCO in computer vision, or Penn Tree Bank, GLUE, SentEval and WMT in NLP, etc.), and common code bases to experiment with (DPMs, Caffe, Stanford CoreNLP, spaCy, etc.). While some people argue that such use of common tasks and datasets can prevent creative progress, it has also led to rapid progress in these fields, as researchers can quickly replicate results and build upon each other's work.

Benchmarking in Robotics. Benchmarking in robotics is extremely challenging given the vast scope of applications and diversity of physical test conditions (hardware, objects, environment, etc.). It is a well-acknowledged concern within the robotics community that we are yet to develop reliable benchmarking metrics that can be widely adopted to quantify research progress. Several workshops have tried to stimulate discourse towards this end, and different task-specific metrics have been proposed for grasping, gripper design, SLAM, etc. Research has also benefited from the creation of object datasets with shape and grasp information, such as the Columbia Grasp Database, DexNet and KIT Object Models, which can be used for perception and motion planning. The YCB dataset went a step further by distributing a physical dataset of household and kitchen objects with corresponding metadata (shape, RGBD scans, etc.). While there is no consensus yet on benchmarking in robotics, we hope that the combination of PyRobot and LoCoBot will facilitate further discussion.

Discussion

In this paper, we describe the PyRobot framework, which provides a high-level, hardware-independent API to control different robots. We believe PyRobot, when combined with low-cost robots such as LoCoBot, will reduce the barrier to entry into robotics. In the immediate future, we will continue to grow the functionality in PyRobot, for example by interfacing with simulators (like AI Habitat, Gibson and MuJoCo) and by improving controllers, such as by implementing gravity compensation for LoCoBot. But more broadly, we believe PyRobot will lead to the development of a research and teaching ecosystem.

PyRobot for robotics instruction. Having a beginner-friendly and open architecture is great for robotics education, as affordable robotic setups with LoCoBot and PyRobot can easily be assembled and scaled for hands-on instruction. Ten LoCoBots were used in the Spring 2019 offering of 16-662: Robot Autonomy (by Professor Oliver Kroemer) in the Robotics Institute at CMU, to support homework assignments and projects. We believe many more such courses will follow.

PyRobot as a research ecosystem. Compared to other fields, benchmarking in robotics is challenging for several reasons. PyRobot’s unified API and LoCoBot’s standard hardware will allow researchers to share their high-level algorithmic implementations, models and datasets collected on a real robot. This will allow researchers to collaborate and iterate faster on robotics applications. We will continue to expand the set of pre-trained models. Hopefully, other researchers will find the PyRobot framework useful and contribute their models for others to use as well.

Acknowledgements

We would like to thank Soumith Chintala for countless discussions and providing software engineering guidance. We would also like to thank Deepak Pathak and Shubham Tulsiani for testing, advice and discussions. Finally, we would like to thank Oliver Kroemer, Timothy Lee and Mohit Sharma for introducing LoCoBots in teaching 16-662: Robot Autonomy at CMU, and Justin MacEy for helping with the motion capture experiments.