RSS2017

Abstract:
In this paper, we first use the Hypercube Diagonal Experiment to investigate the convergence rates of sampling-based path-planning algorithms as a function of the dimensionality of the search space. We show that the probability of sampling a point that improves the solution decreases exponentially with the dimension of the problem. We then analyze how the samples can be repositioned in the search space in order to minimize the approximation error. Finally, we present the DRRT (Deformable Rapidly Exploring Random Tree) algorithm, which optimizes sample locations within the framework of RRT algorithms to improve convergence. It is shown that the DRRT algorithm significantly outperforms all current sampling-based algorithms in terms of convergence.
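
The claim that improving samples become rare in high dimensions can be illustrated with a small Monte Carlo sketch. This is not the paper's Hypercube Diagonal Experiment, whose exact setup the abstract does not give. It assumes start and goal at opposite corners of the unit hypercube, a current path cost of 1.1 times the diagonal (the `slack` parameter is hypothetical), and uses the standard fact that a sample x can only shorten a path of cost c if |x - start| + |x - goal| < c.

```python
import math
import random


def improving_fraction(dim, slack=1.1, n_samples=20000, seed=0):
    """Estimate the fraction of the unit hypercube that can improve a path.

    Start and goal sit at opposite corners of [0, 1]^dim. The current path
    cost is assumed to be slack times the diagonal length (illustrative
    assumption, not a value from the paper).
    """
    rng = random.Random(seed)
    start = [0.0] * dim
    goal = [1.0] * dim
    cost = slack * math.dist(start, goal)
    hits = 0
    for _ in range(n_samples):
        x = [rng.random() for _ in range(dim)]
        # A path through x is at least |x - start| + |x - goal| long.
        if math.dist(start, x) + math.dist(x, goal) < cost:
            hits += 1
    return hits / n_samples


# Fraction of useful samples for a few dimensions.
fractions = {dim: improving_fraction(dim) for dim in (2, 4, 8)}
```

Printing `fractions` shows how quickly uniform sampling stops producing useful points as the dimension grows, which is the motivation for repositioning samples rather than drawing more of them.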

Abstract:
We address the problem of uncertainty-aware local collision avoidance within the context of time-to-collision based navigation of multiple agents. We consider two specific models that account for uncertainty in the future trajectories of interacting agents: an isotropic model which conservatively considers all possible errors, and an adversarial model that assumes the error is towards a head-on collision. We compare the two models experimentally via a number of simulation scenarios, and also provide theoretical guarantees about the collision avoidance behavior of the agents.
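
For readers unfamiliar with time-to-collision, the quantity can be sketched for two disc-shaped agents under linear extrapolation of their current velocities. This is a minimal sketch of the textbook definition only; the isotropic and adversarial uncertainty models from the abstract are not reproduced, and the function name and argument layout are hypothetical.

```python
import math


def time_to_collision(p_a, v_a, p_b, v_b, combined_radius):
    """Return the earliest t >= 0 at which two discs touch, or None.

    Both agents are assumed to keep their current velocity, and
    combined_radius is the sum of the two disc radii.
    """
    # Relative position and velocity of agent B as seen from agent A.
    px, py = p_b[0] - p_a[0], p_b[1] - p_a[1]
    vx, vy = v_b[0] - v_a[0], v_b[1] - v_a[1]

    # Solve |p + v t|^2 = R^2, i.e. a t^2 + b t + c = 0.
    a = vx * vx + vy * vy
    b = 2.0 * (px * vx + py * vy)
    c = px * px + py * py - combined_radius ** 2

    if c <= 0.0:
        return 0.0  # discs already overlap
    if a == 0.0:
        return None  # no relative motion, so the gap never closes
    disc = b * b - 4.0 * a * c
    if disc < 0.0:
        return None  # closest approach stays outside combined_radius
    t = (-b - math.sqrt(disc)) / (2.0 * a)
    return t if t >= 0.0 else None  # negative root: agents are separating
```

For example, two agents 10 m apart approaching head-on at 1 m/s each, with a combined radius of 1 m, have to close a 9 m gap at 2 m/s, so the function returns 4.5 s.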

Abstract:
Unlike traditional third-person cameras mounted on robots, a first-person camera captures a person's visual sensorimotor object interactions from up close. In this paper, we study the tight interplay between our momentary visual attention and motor action with objects from a first-person camera. We propose the concept of action-objects: the objects that capture a person's conscious visual (watching a TV) or tactile (taking a cup) interactions. Action-objects may be task-dependent, but since many tasks share common person-object spatial configurations, action-objects exhibit a characteristic 3D spatial distance and orientation with respect to the person. We design a predictive model that detects action-objects using EgoNet, a joint two-stream network that holistically integrates visual appearance (RGB) and 3D spatial layout (depth and height) cues to predict the per-pixel likelihood of action-objects. Our network also incorporates a first-person coordinate embedding, which is designed to learn a spatial distribution of the action-objects in the first-person data. We demonstrate EgoNet's predictive power by showing that it consistently outperforms previous baseline approaches. Furthermore, EgoNet also exhibits a strong generalization ability, i.e., it predicts semantically meaningful objects in novel first-person datasets. Our method's ability to effectively detect action-objects could be used to improve robots' understanding of human-object interactions.

Abstract:
We study the effectiveness of metrics for Multi-Robot Motion-Planning (MRMP) when using RRT-style sampling-based planners. These metrics play the crucial role of determining the nearest neighbors of configurations, and in doing so they regulate the connectivity of the underlying roadmaps produced by the planners and other properties like the quality of solution paths. After screening over a dozen different metrics, we focus on the five most promising ones: two more traditional metrics, and three novel ones which we propose here, adapted from the domain of shape-matching. In addition to the novel multi-robot metrics, a central contribution of this work is a set of tools to analyze and predict the effectiveness of metrics in the MRMP context. We identify a suite of possible substructures in the configuration space, for which it is fairly easy (i) to define a so-called natural distance, which allows us to predict the performance of a metric by comparing the distribution of its values for sampled pairs of configurations to the distribution induced by the natural distance; and (ii) to define equivalence classes of configurations and test how well a metric covers the different classes. We provide experiments that attest to the ability of our tools to predict the effectiveness of metrics: those metrics that qualify in the analysis yield a higher success rate of the planner with fewer vertices in the roadmap. We also show how combining several metrics leads to better results (success rate and size of roadmap) than using a single metric.

Abstract:
This paper explores the application of Koopman operator theory to the control of robotic systems. The operator is introduced as a method to generate data-driven models that have utility for model-based control methods. We then motivate the use of the Koopman operator towards augmenting model-based control. Specifically, we illustrate how the operator can be used to obtain a linearizable data-driven model for an unknown dynamical process that is useful for model-based control synthesis. Simulated results show that with increasing complexity in the choice of the basis functions, a closed-loop controller is able to invert and stabilize cart- and VTOL-pendulum systems. Furthermore, the specification of the basis functions is shown to be of importance when generating a Koopman operator for specific robotic systems. Experimental results with the Sphero SPRK robot explore the utility of the Koopman operator in a reduced state representation setting where increased complexity in the basis functions improves open- and closed-loop controller performance in various terrains, including sand.
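
A common way to obtain a finite-dimensional approximation of the Koopman operator from data is a least-squares fit over lifted snapshot pairs. The sketch below shows that idea on a toy scalar system. The basis functions `[x, x^2]`, the toy dynamics `x_next = 0.9 x`, and all function names are assumptions for illustration; the abstract does not state which basis functions or fitting procedure the authors use.

```python
def lift(x):
    """Hypothetical basis functions psi(x) = [x, x^2]."""
    return [x, x * x]


def fit_koopman(xs, ys):
    """Least-squares K with lift(y) ~= lift(x) @ K over all snapshot pairs."""
    # Accumulate the 2x2 normal equations G K = A.
    G = [[0.0, 0.0], [0.0, 0.0]]
    A = [[0.0, 0.0], [0.0, 0.0]]
    for x, y in zip(xs, ys):
        px, py = lift(x), lift(y)
        for i in range(2):
            for j in range(2):
                G[i][j] += px[i] * px[j]
                A[i][j] += px[i] * py[j]
    det = G[0][0] * G[1][1] - G[0][1] * G[1][0]
    if abs(det) < 1e-12:
        raise ValueError("snapshot data do not excite both basis functions")
    inv = [[G[1][1] / det, -G[0][1] / det],
           [-G[1][0] / det, G[0][0] / det]]
    # K = G^{-1} A
    return [[sum(inv[i][k] * A[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


xs = [0.5, 1.0, -1.5, 2.0]
ys = [0.9 * x for x in xs]  # snapshots of the toy system x_next = 0.9 x
K = fit_koopman(xs, ys)
```

For this toy system the lifted dynamics are exactly linear, so `K` comes out close to `diag(0.9, 0.81)`: the state scales by 0.9 and its square by 0.81. For real robotic systems the fit is only approximate, which is why the choice of basis functions matters.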

Abstract:
Simulators are powerful tools for reasoning about a robot's interactions with its environment. However, when simulations diverge from reality, that reasoning becomes less useful. In this paper, we show how to close the loop between liquid simulation and real-time perception. We use observations of liquids to correct errors when tracking the liquid's state in a simulator. Our results show that closed-loop simulation is an effective way to prevent large divergence between the simulated and real liquid states. As a direct consequence of this, our method can enable reasoning about liquids that would otherwise be infeasible due to large divergences, such as reasoning about occluded liquid.

Abstract:
Many practical tasks in robotic systems, such as cleaning windows, writing or grasping, are inherently constrained. Learning policies subject to constraints is a challenging problem. We propose a locally weighted constrained projection learning method (LWCPL) that first estimates the constraint and then exploits this estimate across multiple observations of the constrained motion to learn an unconstrained policy. The generalization is achieved by projecting the unconstrained policy onto a new, previously unseen, constraint. We do not require any prior knowledge about the task or the policy, so we can use generic regressors to model the task and the policy. However, any prior beliefs about the structure of the motion can be expressed by choosing task-specific regressors. In particular, we can use robot kinematics and motion priors to improve the accuracy. Our evaluation results show that LWCPL outperforms the state-of-the-art method in accuracy of learning the constraints as well as the unconstrained policy, even in noisy conditions. We have validated our method by learning a wiping task from human demonstration on flat surfaces and reproducing it on an unknown curved surface using a force/torque based controller to achieve tool alignment. We show that, despite the differences between the training and validation scenarios, we learn a policy that still provides the desired wiping motion.

Abstract:
In the adaptive information gathering problem, a policy is required to select an informative sensing location using the history of measurements acquired thus far. While there is an extensive amount of prior work investigating effective practical approximations using variants of Shannon's entropy, the efficacy of such policies heavily depends on the geometric distribution of objects in the world. On the other hand, the principled approach of employing online POMDP solvers is rendered impractical by the need to explicitly sample online from a posterior distribution of world maps. We present a novel data-driven imitation learning framework to efficiently train information gathering policies. The policy imitates a clairvoyant oracle - an oracle that at train time has full knowledge about the world map and can compute maximally informative sensing locations. We analyze the learnt policy by showing that offline imitation of a clairvoyant oracle is implicitly equivalent to online oracle execution in conjunction with posterior sampling. This observation allows us to obtain powerful near-optimality guarantees for information gathering problems possessing an adaptive sub-modularity property. As demonstrated on a spectrum of 2D and 3D exploration problems, the trained policies enjoy the best of both worlds - they adapt to different world map distributions while being computationally inexpensive to evaluate.

Abstract:
Reward function design and exploration time are arguably the biggest obstacles to the deployment of reinforcement learning (RL) agents in the real world. In many real-world tasks, designing a reward function takes considerable hand engineering and often requires additional and potentially visible sensors to be installed just to measure whether the task has been executed successfully. Furthermore, many interesting tasks consist of multiple implicit intermediate steps that must be executed in sequence. Even when the final outcome can be measured, it does not necessarily provide feedback on these intermediate steps or sub-goals. To address these issues, we propose leveraging the abstraction power of intermediate visual representations learned by deep models to quickly infer perceptual reward functions from small numbers of demonstrations. We present a method that is able to identify key intermediate steps of a task from only a handful of demonstration sequences, and automatically identify the most discriminative features for identifying these steps. This method makes use of the features in a pre-trained deep model, but does not require any explicit specification of sub-goals. The resulting reward functions, which are dense and smooth, can then be used by an RL agent to learn to perform the task in real-world settings. To evaluate the learned reward functions, we present qualitative results on two real-world tasks and a quantitative evaluation against a human-designed reward function. We also demonstrate that our method can be used to learn a complex real-world door opening skill using a real robot, even when the demonstration used for reward learning is provided by a human using their own hand. To our knowledge, these are the first results showing that complex robotic manipulation skills can be learned directly and without supervised labels from a video of a human performing the task. Supplementary material and dataset are available at sermanet.github.io/rewards

Abstract:
Our goal is to efficiently learn reward functions encoding a human's preferences for how a dynamical system should act. There are two challenges with this. First, in many problems it is difficult for people to provide demonstrations of the desired system trajectory (like a high-DOF robot arm motion or an aggressive driving maneuver), or to even assign how much numerical reward an action or trajectory should get. We build on work in label ranking and propose to learn from preferences (or comparisons) instead: the person provides the system a relative preference between two trajectories. Second, the learned reward function strongly depends on what environments and trajectories were experienced during the training phase. We thus take an active learning approach, in which the system decides on what preference queries to make. A novel aspect of our work is the complexity and continuous nature of the queries: continuous trajectories of a dynamical system in environments with other moving agents (humans or robots). We contribute a method for actively synthesizing queries that satisfy the dynamics of the system. Further, we learn the reward function from a continuous hypothesis space by maximizing the volume removed from the hypothesis space by each query. We assign weights to the hypothesis space in the form of a log-concave distribution and provide a bound on the number of iterations required to converge. We show that our algorithm converges faster to the desired reward compared to approaches that are not active or that do not synthesize queries in an autonomous driving domain. We then run a user study to put our method to the test with real people.

Abstract:
Trajectory generation approaches for mobile robots generally aim to optimize with respect to a cost function such as energy, execution time, or other mission-relevant parameters within the constraints of vehicle dynamics and obstacles in the environment. We propose to add the cost of state observability to the trajectory optimization in order to ensure fast and accurate state estimation throughout the mission while still respecting the constraints of vehicle dynamics and the environment. Our approach finds a dynamically feasible estimation-optimized trajectory in a sequence of connected convex polytopes representing free space in the environment. In addition, we show a statistical procedure that enables observability-aware trajectory optimization for heterogeneous states in the system both in magnitude and units, which was not supported in previous formulations. We validate our approach with extensive simulations of a visual-inertial state estimator on an aerial platform as a specific realization of our general method. We show that the optimized trajectories lead to more accurate navigation while eliminating the need for a separate calibration procedure.

Abstract:
The overarching goal of this work is to efficiently enable end-users to correctly anticipate a robot's behavior in novel situations. Since a robot's behavior is often a direct result of its underlying objective function, our insight is that end-users need to have an accurate mental model of this objective function in order to understand and predict what the robot will do. While people naturally develop such a mental model over time through observing the robot act, this familiarization process may be lengthy. Our approach reduces this time by having the robot model how people infer objectives from observed behavior, and then it selects those behaviors that are maximally informative. The problem of computing a posterior over objectives from observed behavior is known as Inverse Reinforcement Learning (IRL), and has been applied to robots learning human objectives. We consider the problem where the roles of human and robot are swapped. Our main contribution is to recognize that unlike robots, humans will not be exact in their IRL inference. We thus introduce two factors to define candidate approximate-inference models for human learning in this setting, and analyze them in a user study in the autonomous driving domain. We show that certain approximate-inference models lead to the robot generating example behaviors that better enable users to anticipate what it will do in novel situations. Our results also suggest, however, that additional research is needed in modeling how humans extrapolate from examples of robot behavior.

Abstract:
XPose is a new touch-based interactive system for photo taking, designed to take advantage of the autonomous flying capability of a drone-mounted camera. It enables the user to interact with photos directly and focus on taking photos instead of piloting the drone. XPose introduces a two-stage eXplore-and-comPose approach to photo taking in static scenes. In the first stage, the user explores the "photo space" through predefined interaction modes: Orbit, Pano, and Zigzag. Under each mode, the camera visits many points of view (POVs) and takes exploratory photos through autonomous drone flying. In the second stage, the user restores a selected POV with the help of a gallery preview and uses direct manipulation gestures to refine the POV and compose a final photo. Our prototype implementation, based on a Parrot Bebop quadcopter, relies mainly on a single monocular camera and works reliably in a GPS-denied environment. A systematic user study indicates that XPose results in more successful user performances in photo-taking tasks than the touchscreen joystick interface widely used in commercial drones today.

Abstract:
As drones and autonomous cars become more widespread it is becoming increasingly important that robots can operate safely under realistic conditions. The noisy information fed into real systems means that robots must use estimates of the environment to plan navigation. Efficiently guaranteeing that the resulting motion plans are safe under these circumstances has proved difficult. We examine how to guarantee that a trajectory or policy is safe with only imperfect observations of the environment. We examine the implications of various mathematical formalisms of safety and arrive at a mathematical notion of safety of a long-term execution, even when conditioned on observational information. We present efficient algorithms that can prove that trajectories or policies are safe with much tighter bounds than in previous work. Notably, the complexity of the environment does not affect our method's ability to evaluate if a trajectory or policy is safe. We then use these safety checking methods to design a safe variant of the RRT planning algorithm.

Abstract:
Time series classification is an important task in robotics that is often solved using supervised machine learning. However, classifier models are typically not "readable" in the sense that humans cannot intuitively learn useful information about the relationship between inputs and outputs. In this paper, we address the problem of rich time series classification where we propose a novel framework for finding a temporal logic classifier specified in a human-readable form. The classifier is represented as a signal temporal logic (STL) formula that is expressive in capturing spatial, temporal and logical relations from a continuous-valued dataset over time. In the framework, we first find a set of representative logical formulas from the raw dataset, and then construct an STL classifier using a tree-based clustering algorithm. We show that the framework runs in polynomial time and validate it using simulated examples where our framework is significantly more efficient than the closest existing framework (up to 920 times faster).
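
To make the idea of a human-readable STL classifier concrete, the sketch below evaluates two elementary STL formulas on a sampled signal using the usual quantitative (robustness) semantics: "always" takes the worst-case margin and "eventually" the best-case margin. The formulas, the threshold value, and the function names are hypothetical; the formula search and tree-based clustering described in the abstract are not reproduced.

```python
def always_less_than(signal, threshold):
    """Robustness of G (x < threshold): worst-case margin over the signal."""
    return min(threshold - x for x in signal)


def eventually_greater_than(signal, threshold):
    """Robustness of F (x > threshold): best-case margin over the signal."""
    return max(x - threshold for x in signal)


def classify(signal, threshold=5.0):
    """Label a signal +1 if it satisfies G (x < threshold), else -1.

    The formula and the default threshold are illustrative assumptions.
    """
    return 1 if always_less_than(signal, threshold) > 0 else -1
```

A positive robustness value means the formula holds with that much margin, so `classify([1, 2, 3])` gives `1` while `classify([1, 6, 3])` gives `-1`. The classifier can be read directly as "the signal always stays below 5".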

Abstract:
Real world scenarios contain many structural patterns that, if appropriately extracted and modeled, can be used to reduce problems associated with sensor failure and occlusions, while improving planning methods in tasks such as navigation and grasping. This paper devises a novel unsupervised procedure that is able to learn 3D structures from unorganized point clouds as occupancy maps. Our framework enables the learning of unique and arbitrarily complex features using a Bayesian Convolutional Variational Auto-Encoder that compresses local information into a latent low-dimensional representation and then decodes it back in order to reconstruct the original scene. This reconstructive model is trained on features obtained automatically from a wide variety of scenarios to improve its generalization and interpolative powers. We show that the proposed framework is able to recover partially missing structures and reason over occlusion with high accuracy, while maintaining a detailed reconstruction of observed areas. To seamlessly combine this localized feature information into a single global structure, we employ a Hilbert Map, recently proposed as a robust and efficient occupancy mapping technique. Experimental tests are conducted in large-scale 2D and 3D datasets, and a study on the impact of various accuracy/speed trade-offs is provided to assess the limits of the proposed framework.

Abstract:
Pose estimation is central to several robotics applications such as registration, hand-eye calibration, and SLAM. Online pose estimation methods typically use Gaussian distributions to describe the uncertainty in the pose parameters. Such a description can be inadequate when using parameters such as unit-quaternions that are not unimodally distributed. A Bingham distribution can effectively model the uncertainty in unit-quaternions, as it has antipodal symmetry and is defined on a unit-hypersphere. A combination of Gaussian and Bingham distributions is used to develop a linear filter that accurately estimates the distribution of the pose parameters, in their true space. To the best of our knowledge, our approach is the first implementation to use a Bingham distribution for 6 DoF pose estimation. Experiments show that this approach is robust to initial estimation errors as well as sensor noise. Compared to state-of-the-art methods, our approach takes fewer iterations to converge onto the correct pose estimate. The efficacy of the formulation is illustrated with a number of simulated examples on standard datasets as well as real-world experiments.
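
The reason unit quaternions are not unimodally distributed is that q and -q encode the same rotation, so any density over rotations must place equal mass on both. The sketch below checks this antipodal symmetry numerically. It illustrates the property only and is not the filter from the paper; the helper names are hypothetical.

```python
import math


def quat_mul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )


def rotate(q, v):
    """Rotate 3-vector v by unit quaternion q via q * (0, v) * conj(q)."""
    w, x, y, z = q
    conj = (w, -x, -y, -z)
    _, rx, ry, rz = quat_mul(quat_mul(q, (0.0, *v)), conj)
    return (rx, ry, rz)


half = math.pi / 4.0  # half of a 90-degree rotation about the z-axis
q = (math.cos(half), 0.0, 0.0, math.sin(half))
neg_q = tuple(-c for c in q)

v = (1.0, 0.0, 0.0)
r_pos = rotate(q, v)      # approximately (0, 1, 0)
r_neg = rotate(neg_q, v)  # same result: the two sign flips cancel
```

A Gaussian centered on `q` would assign very different density to `neg_q` even though both describe the same orientation, which is the inadequacy the abstract refers to.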

Abstract:
We provide a unified probabilistic framework for trajectory estimation and planning. The key idea is to view these two problems, usually considered separately, as a single problem. At each time-step the robot is tasked with finding the complete continuous-time trajectory from start to goal. This can be quite difficult; the robot must contend with a potentially high-degree-of-freedom (DOF) trajectory space, uncertainty due to limited sensing capabilities, model inaccuracy, and the stochastic effect of executing actions, and the robot must find the solution in (faster than) real time. To overcome these challenges, we build on recent probabilistic inference approaches to continuous-time localization and mapping and continuous-time motion planning. We solve the joint problem by iteratively recomputing the maximum a posteriori trajectory conditioned on all available sensor data and cost information. Finally, we evaluate our framework empirically in both simulation and on a mobile manipulator.

Abstract:
Robust robotic perception and manipulation of household objects requires the ability to detect, localize and manipulate a wide variety of objects, which may be mirror reflective like polished metal, glossy like smooth plastic, or transparent like glass; for example, picking a metal fork out of a sink full of running water or screwing a metal nut onto a bolt. Existing perceptual approaches based on photographs only take into account the average intensity of light arriving at each pixel from one direction, which limits their ability to account for these non-Lambertian scenes. To address this problem, we demonstrate time-lapse light field photography with an eye-in-hand camera of a manipulator robot. An eye-in-hand robot can capture both the intensity of rays, as in a conventional photograph, as well as the direction of the rays. We present a formal model for robotic light-field photography that fits into a probabilistic robotics framework. Using this model, we can synthesize orthographic photographs, remove specular highlights from those photographs, and perform 3D reconstruction with a monocular camera by finding approximate maximum-likelihood estimates. This information can be used to detect, localize and manipulate non-Lambertian objects in non-Lambertian scenes: our approach enables the Baxter robot to pick a shiny metal fork out of a sink filled with running water 24/25 times, as well as to localize objects well enough to screw a nut onto a quarter inch bolt. The techniques in this paper point the way toward new approaches to robotic perception that leverage a robot's ability to move its camera to infer the state of the external world.

Abstract:
Autonomous navigation of miniaturized robots (e.g., nano/pico aerial vehicles) is currently a grand challenge for robotics research, due to the need for processing a large amount of sensor data (e.g., camera frames) with limited on-board computational resources. In this paper we focus on the design of a visual-inertial odometry (VIO) system in which the robot estimates its ego-motion (and a landmark-based map) from on-board camera and IMU data. We argue that scaling down VIO to miniaturized platforms (without sacrificing performance) requires a paradigm shift in the design of perception algorithms, and we advocate a co-design approach in which algorithmic and hardware design choices are tightly coupled. Our contribution is four-fold. First, we discuss the VIO co-design problem, in which one tries to attain a desired resource-performance trade-off, by making suitable design choices (in terms of hardware, algorithms, implementation, and parameters). Second, we characterize the design space, by discussing how a relevant set of design choices affects the resource-performance trade-off in VIO. Third, we provide a systematic experiment-driven way to explore the design space, towards a design that meets the desired trade-off. Fourth, we demonstrate the result of the co-design process by providing a VIO implementation on specialized hardware and showing that such an implementation has the same accuracy and speed as a desktop implementation, while requiring a fraction of the power.

Abstract:
A fundamental requirement for legged robots is to maintain balance and prevent potentially damaging falls whenever possible. As a response to outside disturbances, fall prevention can be achieved by a combination of active balancing actions, e.g. through ankle torques and upper-body motion, and through reactive step placement. While it is widely accepted that stepping is required to respond to large disturbances, the limits of active motions on balancing and step recovery are only well understood for the simplest of walking models. Recent advances in convex optimization-based verification and control techniques enable a more complete understanding of the limits and capabilities of more complex models. In this work, we present an algorithmic approach for formal analysis of the viable-capture basins of walking robots, calculating both inner and outer approximations and corresponding push recovery control strategies. Extending beyond the classic Linear Inverted Pendulum Model (LIPM), we analyze a series of centroidal momentum based planar walking models, examining the effects of center of mass height, angular momentum, and impact dynamics during stepping on capturability. This formal analysis enables an explicit calculation of the differences between these models, and assessment of whether the simplest models ultimately sacrifice capability, and thus stability, when designing push recovery control policies.
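
For the simplest model mentioned above, the Linear Inverted Pendulum Model, capturability has a well-known closed form: the instantaneous capture point, i.e. the location where the robot would have to step to come to rest. The sketch below computes it for a planar LIPM. It covers only this classic special case; the inner and outer approximations for the richer centroidal models in the abstract require the optimization-based analysis and are not reproduced. The function name and default gravity value are assumptions.

```python
import math


def capture_point(x, x_dot, com_height, gravity=9.81):
    """Instantaneous capture point of the Linear Inverted Pendulum Model.

    x and x_dot are the horizontal center-of-mass position and velocity;
    com_height is the constant pendulum height.
    """
    omega = math.sqrt(gravity / com_height)  # natural frequency of the LIPM
    return x + x_dot / omega
```

For instance, `capture_point(0.0, 1.0, 0.981)` evaluates to roughly 0.316 m ahead of the center of mass. If that point lies beyond the reachable step length, a single step cannot stop the fall under this model.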

Abstract:
There has been a great deal of progress in developing probabilistically complete methods that move beyond motion planning to multi-modal problems including various forms of task planning. This paper presents a general-purpose formulation of a large class of discrete-time planning problems, with hybrid state and action spaces. The formulation characterizes conditions on the submanifolds in which solutions lie, leading to a characterization of robust feasibility that incorporates dimensionality-reducing constraints. It then connects those conditions to corresponding conditional samplers that are provided as part of a domain specification. We present domain-independent sample-based planning algorithms and show that they are both probabilistically complete and computationally efficient on a set of challenging benchmark problems.

Abstract:
In this paper, we develop an algorithm for intent inference via goal disambiguation with a shared-control assistive robotic arm. Assistive systems are often required to infer human intent, and this is often a bottleneck for providing assistance quickly and accurately. We introduce the notion of inverse legibility in which the human-generated actions are legible enough for the robot to infer the human intent confidently and accurately. The proposed disambiguation paradigm seeks to elicit legible control commands from the human by selecting control modes, for the robotic arm, in which human-directed motion will maximally disambiguate between multiple goals. We present simulation results which look into the robustness of our algorithm and the impact of the choice of confidence functions on the performance of the system. Our simulation results suggest that the choice of confidence function is a critical factor in determining the disambiguation algorithm's capability to capture human intent. We also present a pilot study that explores the efficacy of the algorithm on real hardware with promising preliminary results.

Abstract:
We present a technique for learning control Lyapunov (potential) functions, which are used in turn to synthesize controllers for nonlinear dynamical systems. The learning framework uses a demonstrator that implements a black-box, untrusted strategy presumed to solve the problem of interest, a learner that poses finitely many queries to the demonstrator to infer a candidate function and a verifier that checks whether the current candidate is a valid control Lyapunov function. The overall learning framework is iterative, eliminating a set of candidates on each iteration using the counterexamples discovered by the verifier and the demonstrations over these counterexamples. We prove its convergence using ellipsoidal approximation techniques from convex optimization. We also implement this scheme using nonlinear MPC controllers to serve as demonstrators for a set of state and trajectory stabilization problems for nonlinear dynamical systems. Our approach is able to synthesize relatively simple polynomial control Lyapunov functions, and in that process replace the MPC using a guaranteed and computationally less expensive controller.

Abstract:
The literature on Inverse Reinforcement Learning (IRL) typically assumes that humans take actions in order to minimize the expected value of a cost function, i.e., that humans are risk neutral. Yet, in practice, humans are often far from being risk neutral. To fill this gap, the objective of this paper is to devise a framework for risk-sensitive IRL in order to explicitly account for an expert's risk sensitivity. To this end, we propose a flexible class of models based on coherent risk metrics, which allow us to capture an entire spectrum of risk preferences from risk-neutral to worst-case. We propose efficient algorithms based on Linear Programming for inferring an expert's underlying risk metric and cost function for a rich class of static and dynamic decision-making settings. The resulting approach is demonstrated on a simulated driving game with ten human participants. Our method is able to infer and mimic a wide range of qualitatively different driving styles from highly risk-averse to risk-neutral in a data-efficient manner. Moreover, comparisons of the Risk-Sensitive (RS) IRL approach with a risk-neutral model show that the RS-IRL framework more accurately captures observed participant behavior both qualitatively and quantitatively.
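
Conditional Value-at-Risk (CVaR) is a standard example of a coherent risk metric that interpolates between the two extremes named above. The sketch below computes a sample-based CVaR for equally likely cost outcomes. It is an illustration of the concept, not the paper's model class or inference algorithm, and the `tail_fraction` parameterization is an assumption.

```python
import math


def cvar(costs, tail_fraction):
    """Mean of the worst tail_fraction share of equally likely cost samples.

    tail_fraction = 1.0 gives the plain expectation (risk-neutral), while
    small values approach the maximum cost (worst-case). The result is exact
    when tail_fraction * len(costs) is an integer and an estimate otherwise.
    """
    if not 0.0 < tail_fraction <= 1.0:
        raise ValueError("tail_fraction must lie in (0, 1]")
    k = max(1, math.ceil(tail_fraction * len(costs)))
    worst = sorted(costs, reverse=True)[:k]
    return sum(worst) / k
```

With costs `[1, 2, 3, 10]`, a risk-neutral agent (`tail_fraction=1.0`) evaluates the outcome at 4.0, an agent with `tail_fraction=0.5` at 6.5, and a worst-case agent (`tail_fraction=0.25`) at 10.0. Inferring which of these an expert behaves like is the kind of question risk-sensitive IRL addresses.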

Abstract:
We introduce Bayesian Eigenobjects (BEOs), a novel object representation that is the first technique able to perform joint classification, pose estimation, and 3D geometric completion on previously unencountered and partially observed query objects. BEOs employ Variational Bayesian Principal Component Analysis (VBPCA) directly on 3D object representations to create generative and compact probabilistic models for classes of 3D objects. Using only depth information, we significantly outperform the current state-of-the-art method for joint classification and 3D completion in both accuracy and query time. Additionally, we show that BEOs are well suited for the extremely challenging task of joint classification, completion, and pose estimation on a large dataset of household objects.

Abstract:
3D scene understanding is important for robots to interact with the 3D world in a meaningful way. Most previous works on 3D scene understanding focus on recognizing geometrical or semantic properties of the scene independently. In this work, we introduce Data Associated Recurrent Neural Networks (DA-RNNs), a novel framework for joint 3D scene mapping and semantic labeling. DA-RNNs use a new recurrent neural network architecture for semantic labeling on RGB-D videos. The output of the network is integrated with mapping techniques such as KinectFusion in order to inject semantic information into the reconstructed 3D scene. Experiments conducted on a real world dataset and a synthetic dataset with RGB-D videos demonstrate the ability of our method in semantic 3D scene mapping.

Abstract:
This paper proposes a novel approach to in-grasp manipulation planning: the problem of moving an object with reference to the palm from an initial pose to a goal pose without breaking or making contacts. Our method uses kinematic trajectory optimization, which requires no knowledge of the dynamic properties of the object or the robot. We define a cost function that attempts to maintain the initial grasp points, while relaxing the constraint that the contacts between finger and object remain rigid. Hence, we name this new formulation relaxed-rigidity constraints. We implement our approach on an Allegro robot hand and perform experiments on 10 objects from the YCB dataset. However, the implementation would work for any object the robot can grasp. We perform a thorough analysis and compare to alternative optimization formulations. Our method reaches the desired object pose with a median position error of 13 mm across all of the 500 trials without ever dropping the object.

Abstract:
Relying on reduced models is nowadays a standard technique for tackling the computational complexity of multi-contact locomotion. To be truly effective, reduced models must respect feasibility constraints with regard to the full model. However, such constraints are either only partially considered or simply neglected in existing reduced problem formulations. This work presents a systematic approach to incorporating feasibility constraints into trajectory optimization problems. In particular, we show how to learn the kinematic feasibility of the centre of mass, so that it remains achievable by the whole-body model. We validate the proposed method in the context of multi-contact locomotion: we perform two stair-climbing experiments on two humanoid robots, namely the HRP-2 robot and the new TALOS platform.

Abstract:
Deep reinforcement learning has emerged as a promising and powerful technique for automatically acquiring control policies that can process raw sensory inputs, such as images, and perform complex behaviors. However, extending deep RL to real-world robotic tasks has proven challenging, particularly in safety-critical domains such as autonomous flight, where a trial-and-error learning process is often impractical. In this paper, we explore the following question: can we train vision-based navigation policies entirely in simulation, and then transfer them into the real world to achieve real-world flight without a single real training image? We propose a learning method that we call CAD2RL, which can be used to perform collision-free indoor flight in the real world while being trained entirely on 3D CAD models. Our method uses single RGB images from a monocular camera, without needing to explicitly reconstruct the 3D geometry of the environment or perform explicit motion planning. Our learned collision avoidance policy is represented by a deep convolutional neural network that directly processes raw monocular images and outputs velocity commands. This policy is trained entirely on simulated images, with a Monte Carlo policy evaluation algorithm that directly optimizes the network's ability to produce collision-free flight. By highly randomizing the rendering settings for our simulated training set, we show that we can train a policy that generalizes to the real world, without requiring the simulator to be particularly realistic or high-fidelity. We evaluate our method by flying a real quadrotor through indoor environments, and further evaluate the design choices in our simulator through a series of ablation studies on depth prediction. For supplementary video see: https://youtu.be/nXBWmzFrj5s

Abstract:
Rovers operating on Mars have been delayed, diverted, and trapped by loose granular materials. Vision-based mobility prediction approaches often fail because hazardous sand is difficult to distinguish from safe sand based on surface appearance alone. Unlike surface appearance, the thermal inertia of terrain is directly correlated to the same geophysical properties that control slip. This paper presents a quantitative analysis showing that considering thermal inertia improves rover slip prediction on Mars using in-situ data from the Curiosity rover. Thermal inertia is estimated for each slip measurement in sand using both on-board and orbital instruments. Slip models are learned using a mixture of experts approach where the experts are identified using thermal inertia. Two-expert models are compared to a single-expert, vision-only model to show that slip predictions are improved by separating high-slip, low thermal inertia sand from low-slip, high thermal inertia sand. These results support the hypothesis that the consideration of thermal inertia improves mobility estimates for rovers on Mars.

Abstract:
Detection of objects in cluttered indoor environments is one of the key enabling functionalities for service robots. The best-performing object detection approaches in computer vision exploit deep Convolutional Neural Networks (CNNs) to simultaneously detect and categorize the objects of interest in cluttered scenes. Training such models typically requires large amounts of annotated training data, which is time-consuming and costly to obtain. In this work we explore the use of synthetically generated composite images for training state-of-the-art object detectors, especially for object instance detection. We superimpose 2D images of textured object models into images of real environments at a variety of locations and scales. Our experiments evaluate different superimposition strategies ranging from purely image-based blending all the way to depth- and semantics-informed positioning of the object models in real scenes. We demonstrate the effectiveness of these object detector training strategies on two publicly available datasets, the GMU-Kitchens and the Washington RGB-D Scenes v2. As one observation, augmenting some hand-labeled training data with synthetic examples carefully composed onto scenes yields object detectors with performance comparable to using much more hand-labeled data. Broadly, this work charts new opportunities for training detectors for new objects by exploiting existing object model repositories in either a purely automatic fashion or with only a very small number of human-annotated examples.

Abstract:
We present a motion planning algorithm to compute collision-free and smooth trajectories for high-DOF robots interacting with humans in a shared workspace. Our approach uses offline learning of human actions along with temporal coherence to predict the human actions. Our intention-aware online planning algorithm uses the learned database to compute a reliable trajectory based on the predicted actions. We represent the predicted human motion using a Gaussian distribution and compute tight upper bounds on collision probabilities for safe motion planning. We highlight the performance of our planning algorithm in complex simulated scenarios and real world benchmarks with 7-DOF robot arms operating in a workspace with a human performing complex tasks. We demonstrate the benefits of our intention-aware planner in terms of computing safe trajectories in such uncertain environments.

Abstract:
This paper presents a framework for optimizing both the shape and the motion of a planar rigid end-effector to satisfy a desired manipulation task. We frame this design problem as a nonlinear optimization program, where shape and motion are decision variables represented as splines. The task is represented as a series of constraints, along with a fitness metric, which force the solution to be compatible with the dynamics of frictional hard contact while satisfying the task. We illustrate the approach with the example problem of moving a disk along a desired path or trajectory, and we verify it by applying it to three classical design problems: the rolling brachistochrone, the design of teeth of involute gears, and the pitch curve of rolling cams. We conclude with a case study involving the optimization and real implementation of the shape and motion of a dynamic throwing arm.

Abstract:
Humans can ground natural language commands to tasks at both abstract and fine-grained levels of specificity. For instance, a human forklift operator can be instructed to perform a high-level action, like 'grab a pallet' or a low-level action like 'tilt back a little bit.' While robots are also capable of grounding language commands to tasks, previous methods implicitly assume that all commands and tasks reside at a single, fixed level of abstraction. Additionally, methods that do not use multiple levels of abstraction encounter inefficient planning and execution times as they solve tasks at a single level of abstraction with large, intractable state-action spaces closely resembling real world complexity. In this work, by grounding commands to all the tasks or subtasks available in a hierarchical planning framework, we arrive at a model capable of interpreting language at multiple levels of specificity ranging from coarse to more granular. We show that the accuracy of the grounding procedure is improved when simultaneously inferring the degree of abstraction in language used to communicate the task. Leveraging hierarchy also improves efficiency: our proposed approach enables a robot to respond to a command within one second on 90% of our tasks, while baselines take over twenty seconds on half the tasks. Finally, we demonstrate that a real, physical robot can ground commands at multiple levels of abstraction allowing it to efficiently plan different subtasks within the same planning hierarchy.

Abstract:
Robots that use learned perceptual models in the real world must be able to safely handle cases where they are forced to make decisions in scenarios that are unlike any of their training examples. However, state-of-the-art deep learning methods are known to produce erratic or unsafe predictions when faced with novel inputs. Furthermore, recent ensemble, bootstrap and dropout methods for quantifying neural network uncertainty may not efficiently provide accurate uncertainty estimates when queried with inputs that are very different from their training data. Rather than unconditionally trusting the predictions of a neural network for unpredictable real-world data, we use an autoencoder to recognize when a query is novel, and revert to a safe prior behavior. With this capability, we can deploy an autonomous deep learning system in arbitrary environments, without concern for whether it has received the appropriate training. We demonstrate our method with a vision-guided robot that can leverage its deep neural network to navigate 50% faster than a safe baseline policy in familiar types of environments, while reverting to the prior behavior in novel environments so that it can safely collect additional training data and continually improve. A video illustrating our approach is available at: http://groups.csail.mit.edu/rrg/videos/safe_visual_navigation.
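
The gating idea described above (trust the learned model only on familiar inputs, otherwise fall back to a safe prior) can be sketched as a threshold on autoencoder reconstruction error. This is a minimal sketch under stated assumptions: the names `reconstruction_error`, `choose_action` and `toy_reconstruct` are hypothetical, the "autoencoder" is a toy projection rather than a trained network, and the threshold value is arbitrary.

```python
def reconstruction_error(x, reconstruct):
    """Squared error between an input and its autoencoder reconstruction."""
    x_hat = reconstruct(x)
    return sum((a - b) ** 2 for a, b in zip(x, x_hat))

def choose_action(x, reconstruct, threshold, learned_policy, safe_policy):
    """Use the learned policy only when the input reconstructs well."""
    if reconstruction_error(x, reconstruct) > threshold:
        return safe_policy(x)    # novel input: revert to the prior behavior
    return learned_policy(x)     # familiar input: trust the learned model

# Toy stand-in for a trained autoencoder (assumption): it projects onto the
# line x0 == x1, so inputs near that line reconstruct well and others do not.
def toy_reconstruct(x):
    m = (x[0] + x[1]) / 2.0
    return [m, m]

learned = lambda x: "fast"   # hypothetical learned policy
safe = lambda x: "slow"      # hypothetical safe prior policy

familiar = choose_action([1.0, 1.1], toy_reconstruct, 0.05, learned, safe)
novel = choose_action([1.0, -1.0], toy_reconstruct, 0.05, learned, safe)
```

Here the input close to the training-like line is handled by the learned policy, while the input far from it triggers the fallback.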

Abstract:
This work addresses the problem of efficient online exploration and mapping using multi-robot teams via a distributed planning algorithm---distributed sequential greedy assignment (DSGA)---based on the sequential greedy assignment (SGA) algorithm. SGA permits bounds on suboptimality but requires all robots to plan in series. Rather than plan for robots sequentially as in SGA, DSGA assigns plans to subsets of robots during a fixed number of rounds. DSGA retains the same suboptimality bounds as SGA with the addition of a term describing suboptimality introduced due to redundant sensor information. We use this result to extend a single-robot planner based on Monte-Carlo tree search to the multi-robot domain and evaluate the resulting planner in simulated exploration of a confined and cluttered environment. The experimental results show that suboptimality due to redundant sensor information introduced by the distributed planning rounds remains near zero in practice when using as few as two or three distributed planning rounds, and that DSGA achieves objective values and entropy reduction similar to or better than those of SGA while providing a 2--6 times computational speedup for multi-robot teams ranging from 4 to 32 robots.
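
To make the sequential baseline concrete, the sketch below shows plain sequential greedy assignment: robots plan one after another, each picking the candidate plan with the largest marginal gain given the plans already chosen. It illustrates SGA only, not DSGA. The set-coverage objective is an assumed stand-in for the information-based objective, and the function name and plan data are hypothetical.

```python
def sequential_greedy_assignment(candidate_plans):
    """candidate_plans[i] lists the plans available to robot i; each plan is
    the set of cells it would observe. Robots choose in order, each maximizing
    the number of cells not already covered by earlier robots' choices."""
    covered = set()
    chosen = []
    for plans in candidate_plans:
        # Marginal gain of a plan = newly observed cells only.
        best = max(plans, key=lambda plan: len(plan - covered))
        chosen.append(best)
        covered |= best
    return chosen, covered

plans = [
    [{1, 2, 3}, {4, 5}],        # robot 0 (hypothetical candidates)
    [{1, 2, 3, 4}, {5, 6, 7}],  # robot 1 (hypothetical candidates)
]
chosen, covered = sequential_greedy_assignment(plans)
```

Robot 1 skips its larger plan because most of it duplicates what robot 0 already observes, which is the redundancy that the serial ordering avoids and that parallel planning rounds can reintroduce.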

Abstract:
Motivated by applications in robotics and computer vision, we study problems related to spatial reasoning of a 3D environment using sublevel sets of polynomials. These include: tightly containing a cloud of points (e.g., representing an obstacle) with convex or nearly-convex basic semialgebraic sets, computation of Euclidean distance between two such sets, separation of two convex basic semialgebraic sets that overlap, and tight containment of the union of several basic semialgebraic sets with a single convex one. We use algebraic techniques from sum of squares optimization that reduce all these tasks to semidefinite programs of small size and present numerical experiments in realistic scenarios.

Abstract:
An inspiration for developing a bipedal walking system is the ability to navigate rough terrain with discrete footholds like stepping stones. In this paper, we present a novel methodology to overcome the problem of dynamic walking over stepping stones with significant random changes to step length and step height at each step. Using a 2-step gait optimization, we not only consider the desired location of the next footstep but also the current configuration of the robot, thereby resolving the problem of step transition when we switch between different walking gaits. We then use gait interpolation to generate the desired walking gait in real-time. We demonstrate the method on a planar dynamical walking model of ATRIAS, an underactuated bipedal robot, walking over randomly generated stepping stones with step length and step height changing in the range of [30:80] (cm) and [-30:30] (cm) respectively. Experimental validation on the real robot was also successful for the problem of dynamic walking on stepping stones with step lengths varied within [23:78] (cm) and average walking speed of 0.6 (m/s).

Abstract:
In highly constrained settings, e.g., a tentacle-like medical robot maneuvering through narrow cavities in the body for minimally invasive surgery, it may be difficult or impossible for a robot with a generic kinematic design to reach all desirable targets while avoiding obstacles. We introduce a design optimization method to compute kinematic design parameters that enable a single robot to reach as many desirable goal regions as possible while avoiding obstacles in an environment. We focus on the kinematic design of piecewise cylindrical robots, robotic manipulators whose shape can be modeled via cylindrical components. Our method appropriately integrates sampling-based motion planning in configuration space into stochastic optimization in design space so that, over time, our evaluation of a design's ability to reach goals increases in accuracy and our selected designs approach global optimality. We prove the asymptotic optimality of our method and demonstrate performance in simulation for (i) a serial manipulator and (ii) a concentric tube robot, a tentacle-like medical robot that can bend around anatomical obstacles to safely reach clinically-relevant goal regions.

Abstract:
Many critical robotics applications require robustness to disturbances arising from unplanned forces, state uncertainty, and model errors. Motion planning algorithms that explicitly reason about robustness require a coupling of trajectory optimization and feedback design, where the system's closed-loop response to bounded disturbances is optimized. Due to the often-heavy computational demands of solving such problems, the practical application of robust trajectory optimization in robotics has so far been limited. We derive a tractable robust optimization algorithm that combines direct transcription with linear-quadratic control design to reason about closed-loop responses to disturbances. In the case of ellipsoidal disturbance sets, the state and input deviations along a nominal trajectory can be computed locally in closed form, thus allowing for fast evaluations of robust cost and constraint functions. The resulting algorithm, called DIRTREL, is an extension of classical direct transcription that demonstrably improves tracking performance over non-robust formulations while incurring only a modest increase in computational cost. We evaluate the algorithm in several simulated robot control tasks.

Abstract:
Design and control of a novel extra robotic arm attached to the shoulder of a worker for performing tasks in the overhead area are presented. The wearable robot, called Supernumerary Robotic Limbs (SRL), can lift an object and hold it while the wearer is securing the object using a tool with both hands. The worker does not have to hold a laborious posture for a long time, reducing fatigue and injuries. Furthermore, a single worker can execute the task, which would otherwise require two workers. Two technical challenges and novel solutions are presented. One is to make the wearable robot simple and lightweight through the use of a new type of granular jamming gripper that can grasp diverse objects from an arbitrary direction. This eliminates the need for orienting the gripper against the object with three-axis wrist joints, reducing the number of degrees of freedom (DOF) from 6 to 3. The other is an effective control algorithm that allows the wearer to move freely while the robot on the shoulder is holding an object. Unlike a robot sitting on a floor, the SRL worn by a human is disturbed by the movement of the wearer. An admittance-based control algorithm allows the robot to hold the object stably and securely despite the human movement and changes in posture. A 3-DOF prototype robot with a new granular jamming gripper and an ergonomic body mounting gear is developed and tested. It is demonstrated that the robot can hold a large object securely in the overhead area despite the movement of the wearer while performing assembly work.

Abstract:
This paper presents a novel unified theoretical framework for differential kinematics and dynamics for complex robot motion optimization. By introducing an 18 x 18 comprehensive motion transformation matrix (CMTM), forward differential kinematics and dynamics, including velocity and acceleration, can be written as simple chain products, like those of ordinary rotation matrices. This formulation enables efficient analytical computation of the derivatives of various physical quantities, including joint forces or torques, with respect to joint coordinate variables and their derivatives for a robot trajectory (O(Nj), where Nj is the number of the robot's DOF), which is useful for motion optimization.

Abstract:
Developing robots capable of making sense of their environment requires the ability to learn from observations. An important paradigm that allows for robots to both imitate humans and gain an understanding of the tasks people perform is that of action primitive discovery. Action primitives have been used as a representation of the main building blocks that compose motion. Automatic primitive discovery is an active area of research, with existing methods that can provide viable solutions for learning primitives from demonstrations. However, when we learn primitives directly from raw data, we need a mechanism to determine those primitives that are appropriate for the task at hand: is "brushing one's teeth" a suitable primitive or are the actions of "grabbing the toothbrush", "adding toothpaste onto it", and "executing the brushing motion" better suited? It is this level of granularity that is important for determining well-suited primitives for applications. Existing methods for learning primitives do not provide a solution for discovering their granularity. Rather, these techniques stop at arbitrarily chosen levels, and often use clear, repetitive actions in order to easily label the primitives. Our contribution provides a framework for discovering the appropriate granularity level of learned primitives for a task. We apply our framework to action primitives learned from a set of motion capture data obtained from human demonstrations that includes hand and object motions. This helps find a well-suited granularity level for our task, avoiding the use of low levels that don't capture the necessary core pattern in the actions, or high levels that miss important differences between actions. Our results show that this framework is able to discover the best suited primitive granularity level for a specific application.

Abstract:
This paper presents the design and optimization of a self-adaptive, a.k.a. underactuated, finger intended for use with collaborative robots. Typical robots, whether collaborative or not, mostly rely on standard translational grippers for pick-and-place operations. These grippers consist of an actuated motion platform to which a set of jaws is rigidly attached. These jaws are often designed to secure a precise and limited range of objects through the application of pinching forces. In this paper, the design of a self-adaptive robotic finger is presented which can be attached to these typical translational grippers to replace the common monolithic jaws and provide the gripper with shape-adaptation capabilities without any control or sensors. A new design is introduced here and specially optimized for collaborative robots. The kinetostatic analysis of this new design is briefly discussed and then followed by the optimization of relevant geometric parameters. Finally, a practical prototype attached to a very common collaborative robot is demonstrated. While the resulting finger design could be attached to any translational gripper, specifically targeting collaborative robots as an application imposes certain requirements on the choice of the design parameters, as will be shown, and the optimized parameters that are found match them.

Abstract:
In this paper we present a system for the state estimation of a dynamically walking and trotting quadruped. The approach fuses four heterogeneous sensor sources (inertial, kinematic, stereo vision and LIDAR) to maintain an accurate and consistent estimate of the robot's base link velocity and position in the presence of disturbances such as slips and missteps. We demonstrate the performance of our system, which is robust to changes in the structure and lighting of the environment, as well as the terrain over which the robot crosses. Our approach builds upon a modular inertial-driven Extended Kalman Filter which incorporates a rugged, probabilistic leg odometry component with additional inputs from stereo visual odometry and LIDAR registration. The simultaneous use of both stereo vision and LIDAR helps combat operational issues which occur in real applications. To the best of our knowledge, this paper is the first to discuss the complexity of consistent estimation of pose and velocity states, as well as the fusion of multiple exteroceptive signal sources at largely different frequencies and latencies, in a manner which is acceptable for a quadruped's feedback controller. A substantial experimental evaluation demonstrates the robustness and accuracy of our system, achieving continuously accurate localization and drift per distance traveled below 1 cm/m.

Abstract:
Robots designed to interact with humans in realistic environments must be able to handle uncertainty with respect to the identities and properties of the people, places, and things found in their environments. When humans refer to these entities using under-specified language, robots must often generate clarification requests to determine which entities were meant. In this paper, we present recommendations for designers of robots needing to generate such requests, and show how a Dempster-Shafer theoretic pragmatic reasoning component capable of generating requests to clarify pragmatic uncertainty can also generate requests to resolve referential uncertainty when integrated with a probabilistic reference resolution component.

Abstract:
This paper presents a novel geometric approach for learning and reproducing trajectory-based skills from human demonstrations. Our approach models a skill as a Generalized Cylinder, a geometric representation composed of an arbitrary space curve called the spine and a smoothly varying cross-section. While this model has been utilized to solve other robotics problems, this is the first application of Generalized Cylinders to manipulation. The strengths of our approach are the model's ability to identify and extract the implicit characteristics of the demonstrated skill, support for reproducing multiple trajectories that maintain those characteristics, generalization to new situations through nonrigid registration, and interactive human refinement of the resulting model through kinesthetic teaching. We validate our approach through several real-world experiments with a Jaco 6-DOF robotic arm.

Abstract:
In this work we present a novel, inductance-based system to measure and control the motion of bellows-driven continuum joints in soft robots. The sensing system relies on coils of wire wrapped around the minor diameters of each bellows on the joint. As the bellows extend, these coils of wire become more distant, decreasing their mutual inductance. Measuring this change in mutual inductance allows us to measure the motion of the joint. By dividing the sensing of the joint into two sections and measuring the motion of each section independently, we are able to measure the overall deformation of the joint with a piece-wise constant-curvature approximation. This technique allows us to measure lateral displacements that would be otherwise unobservable. When measuring bending, the inductance sensors measured the joint orientation with an RMS error of 1.1 degrees. The inductance sensors were also successfully used as feedback to control the orientation of the joint. The sensors proposed and tested in this work provided accurate motion feedback that would be difficult to achieve robustly with other sensors. This sensing system enables the creation of robust, self-sensing, and soft robots based on bellows-driven continuum joints.

Abstract:
Executing agile quadrotor maneuvers with cable-suspended payloads is a challenging problem, and complications induced by the dynamics typically require trajectory optimization. State-of-the-art approaches often need significant computation time and complex parameter tuning. We present a novel dynamical model and a fast trajectory optimization algorithm for quadrotors with a cable-suspended payload. Our first contribution is a new formulation of the suspended payload behavior, modeled as a link attached to the quadrotor with a combination of two revolute joints and a prismatic joint, all being passive. Unlike the state of the art, we do not require the use of hybrid modes depending on the cable tension. Our second contribution is a fast trajectory optimization technique for the aforementioned system. Our model enables us to pose the trajectory optimization problem as a Mathematical Program with Complementarity Constraints (MPCC). Desired behaviors of the system (e.g., obstacle avoidance) can easily be formulated within this framework. We show that our approach outperforms the state of the art in terms of computation speed and guarantees feasibility of the trajectory with respect to both the system dynamics and control input saturation, while utilizing far fewer tuning parameters. We experimentally validate our approach on a real quadrotor, showing that our method generalizes to a variety of tasks, such as flying through desired waypoints while avoiding obstacles, or throwing the payload toward a desired target. To the best of our knowledge, this is the first time that three-dimensional, agile maneuvers exploiting the system dynamics have been achieved on quadrotors with a cable-suspended payload.

Abstract:
Modern perception systems are notoriously complex, featuring dozens of interacting parameters that must be tuned to achieve good performance. Conventional tuning approaches require expensive ground truth, while heuristic methods are difficult to generalize. In this work, we propose an introspective ground-truth-free approach to evaluating the performance of a generic perception system. By using the posterior distribution estimate generated by a Bayesian estimator, we show that the expected performance can be estimated efficiently and without ground truth. Our simulated and physical experiments in a demonstrative indoor ground robot state estimation application show that our approach can order parameters similarly to using a ground-truth system, and is able to accurately identify top-performing parameters in varying contexts. In contrast, baseline approaches that reason only about observation log-likelihood fail in the face of challenging perceptual phenomena.

Abstract:
We present a novel approach to shared control of human-machine systems. Our method assumes no a priori knowledge of the system dynamics. Instead, we learn both the dynamics and information about the user's interaction from observation through the use of the Koopman operator. Using the learned model, we define an optimization problem to compute the optimal policy for a given task, and compare the user input to the optimal input. We demonstrate the efficacy of our approach with a user study. We also analyze the individual nature of the learned models by comparing the effectiveness of our approach when the demonstration data comes from a user's own interactions, from the interactions of a group of users and from a domain expert. Positive results include statistically significant improvements on task metrics when comparing a user-only control paradigm with our shared control paradigm. Surprising results include findings that suggest that individualizing the model based on a user's own data does not affect the ability to learn a useful dynamic system. We explore this tension as it relates to developing human-in-the-loop systems further in the discussion.

Abstract:
This paper studies the underlying combinatorial structure of a class of object rearrangement problems, which appear frequently in applications. The problems involve multiple, similar-geometry objects placed on a flat, horizontal surface, where a robot can approach them from above and perform pick-and-place operations to rearrange them. The paper considers both the case where the start and goal object poses overlap, and where they do not. For overlapping poses, the primary objective is to minimize the number of pick-and-place actions and then to minimize the distance traveled by the end-effector. For the non-overlapping case, the objective is solely to minimize the end-effector distance. While such problems do not involve all the complexities of general rearrangement, they remain computationally hard challenges in both cases. This is shown through two-way reductions between well-understood, hard combinatorial challenges and these rearrangement problems. The benefit of the reduction is that there are well studied algorithms for solving these well-established combinatorial challenges. These algorithms can be very efficient in practice despite the hardness results. The paper builds on these reduction results to propose an algorithmic pipeline for dealing with the rearrangement problems. Experimental evaluation shows that the proposed pipeline achieves high-quality paths with regards to the optimization objectives. Furthermore, it exhibits highly desirable scalability as the number of objects increases in both the overlapping and non-overlapping setups.

Abstract:
This paper derives nonlinear feedback control synthesis for general control affine systems using second-order actions---the needle variations of optimal control---as the basis for choosing each control response to the current state. A second result of the paper is that the method provably exploits the nonlinear controllability of a system by virtue of an explicit dependence of the second-order needle variation on the Lie bracket between vector fields. As a result, each control decision necessarily decreases the objective when the system is nonlinearly controllable using first-order Lie brackets. Simulation results using a differential drive cart, an underactuated kinematic vehicle in three dimensions, and an underactuated dynamic model of an underwater vehicle demonstrate that the method finds control solutions when the first-order analysis is singular. Moreover, the simulated examples demonstrate superior convergence when compared to synthesis based on first-order needle variations. Lastly, the underactuated dynamic underwater vehicle model demonstrates the convergence even in the presence of a velocity field.

Abstract:
We present an extension to Experience-driven Predictive Control (EPC) that leverages a Gaussian belief propagation strategy to compute an uncertainty set bounding the evolution of the system state in the presence of time-varying state uncertainty. This uncertainty set is used to tighten the constraints in the predictive control formulation via a chance constrained approach, thereby providing a probabilistic guarantee of constraint satisfaction. The parameterized form of the controllers produced by EPC coupled with online uncertainty estimates ensures this robust constraint satisfaction property persists even as the system switches controllers and experiences variations in the uncertainty model. We validate the online performance and robust constraint satisfaction of the proposed Robust EPC algorithm through a series of experimental trials with a small quadrotor platform subjected to changes in state estimate quality.
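The constraint tightening idea can be sketched for a single linear constraint under a Gaussian state estimate. This is a generic chance-constraint calculation, not the Robust EPC algorithm itself: the function name `tightened_bound`, the constraint a^T x <= b, and the numbers used are assumptions for illustration.

```python
import math
from statistics import NormalDist


def tightened_bound(a, b, cov, eps):
    """Return b' such that a^T mean <= b' implies P(a^T x <= b) >= 1 - eps.

    Assumes x is Gaussian with covariance `cov` (a list of rows).
    """
    n = len(a)
    variance = sum(a[i] * cov[i][j] * a[j] for i in range(n) for j in range(n))
    quantile = NormalDist().inv_cdf(1.0 - eps)  # standard normal quantile
    return b - quantile * math.sqrt(variance)


# Hypothetical example: bound the first state component by 1.0 with 95% confidence.
a = [1.0, 0.0]
cov_good = [[0.04, 0.0], [0.0, 0.01]]   # good state estimate
cov_poor = [[0.16, 0.0], [0.0, 0.01]]   # degraded state estimate

bound_good = tightened_bound(a, 1.0, cov_good, eps=0.05)
bound_poor = tightened_bound(a, 1.0, cov_poor, eps=0.05)
```

As the covariance grows, the tightened bound shrinks, so the controller keeps a larger margin when the state estimate degrades.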

Abstract:
This article analyzes two classes of job selection policies that control how a network of autonomous aerial vehicles delivers goods from depots to customers. Customer requests (jobs) occur according to a spatio-temporal stochastic process not known by the system. If job selection uses a policy in which the first job (FJ) is served first, the system may collapse to instability by removing just one vehicle. Policies that serve the nearest job (NJ) first show such threshold behavior only in some settings and can be implemented in a distributed manner. The timing of job selection has significant impact on delivery time and stability for NJ while it has no impact for FJ. Based on these findings we introduce a methodological approach for decision-making support to set up and operate such a system, taking into account the trade-off between monetary cost and service quality. In particular, we compute a lower bound for the infrastructure expenditure required to achieve a certain expected delivery time. The approach includes three time horizons: long-term decisions on the number of depots to deploy in the service area, mid-term decisions on the number of vehicles to use, and short-term decisions on the policy to operate the vehicles.

Abstract:
As the spatial scale of robots decreases in multi-robot systems, collisions cease to be catastrophic events that need to be avoided at all costs. This implies that less conservative, coordinated control strategies can be employed, where collisions are not only tolerated, but can potentially be harnessed as an information source. In this paper, we follow this line of inquiry by employing collisions as a sensing modality that provides information about the robots' surroundings. We envision a collection of robots moving around with no sensors other than binary, tactile sensors that can determine if a collision occurred, and let the robots use this information to determine their locations. We apply a probabilistic localization technique based on mean-field approximations that allows each robot to maintain and update a probability distribution over all possible locations. Simulations and real multi-robot experiments illustrate the feasibility of the proposed approach, and demonstrate how collisions in multi-robot systems can indeed be employed as useful information sources.

Abstract:
For co-manipulation involving humans and robots, robot controllers that are based on human-human behavior should allow more comfortable and coordinated movement between the human-robot dyad. In this paper, we describe an experiment between human-human dyads where we recorded the force and motion data as leader-follower dyads moved in translation and rotation. The force/motion data was then analyzed for patterns found during lateral translation only. For extended objects, lateral translation and in-place rotation are ambiguous, but this paper determines a way to characterize lateral translation triggers for future use in human-robot interaction. The study has 4 main results. First, interaction forces are non-negligible and are necessary for co-manipulation. Second, minimum-jerk trajectories are found in the lateral direction only for lateral movement. Third, the beginning of a lateral movement is characterized by distinct force triggers by the leader. Fourth, there are different metrics that can be calculated to determine which dyads moved most effectively in the lateral direction.
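For readers unfamiliar with minimum-jerk trajectories, the standard point-to-point profile (zero velocity and acceleration at both ends) can be written in a few lines. This is the textbook fifth-order polynomial, not a model fitted to the study's data; the function name and the example values are assumptions.

```python
def minimum_jerk(x0, xf, duration, t):
    """Position at time t on a minimum-jerk path from x0 to xf."""
    tau = min(max(t / duration, 0.0), 1.0)            # normalized time in [0, 1]
    shape = 10 * tau**3 - 15 * tau**4 + 6 * tau**5    # smooth 0 -> 1 profile
    return x0 + (xf - x0) * shape


# Hypothetical lateral move of 0.6 m in 2 s, sampled at start, middle, and end.
start = minimum_jerk(0.0, 0.6, 2.0, 0.0)
middle = minimum_jerk(0.0, 0.6, 2.0, 1.0)
end = minimum_jerk(0.0, 0.6, 2.0, 2.0)
```

The profile is symmetric, so the midpoint in time is also the midpoint in position.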

Abstract:
We present a new method of learning control policies that successfully operate under unknown dynamic models. We create such policies by leveraging a large number of training examples that are generated using a physical simulator. Our system is made of two components: a Universal Policy (UP) and a function for Online System Identification (OSI). We describe our control policy as 'universal' because it is trained over a wide array of dynamic models. These variations in the dynamic model may include differences in mass and inertia of the robot's components, variable friction coefficients, or unknown mass of an object to be manipulated. By training the Universal Policy with this variation, the control policy is prepared for a wider array of possible conditions when executed in an unknown environment. The second part of our system uses the recent state and action history of the system to predict the dynamics model parameters mu. The value of mu from the Online System Identification is then provided as input to the control policy (along with the system state). Together, UP-OSI is a robust control policy that can be used across a wide range of dynamic models, and that is also responsive to sudden changes in the environment. We have evaluated the performance of this system on a variety of tasks, including the problem of cart-pole swing-up, the double inverted pendulum, locomotion of a hopper, and block-throwing of a manipulator. UP-OSI is effective at these tasks across a wide range of dynamic models. Moreover, when tested with dynamic models outside of the training range, UP-OSI outperforms the Universal Policy alone, even when UP is given the actual value of the model dynamics. In addition to the benefits of creating more robust controllers, UP-OSI also holds out promise of narrowing the Reality Gap between simulated and real physical systems.

Abstract:
A novel method is presented for efficiently testing the stability of an object under gravity and contact forces, that accommodates empirical determination of the set of admissible forces exerted at contacting pairs of surfaces. These admissible force volumes may exhibit a wide variety of geometries, including anisotropy, adhesion, and even non-convexity. The method discretizes the contact region into patches, performs a convex decomposition of a polyhedral approximation to each admissible force volume, and then formulates the problem as a mixed integer linear program. The model can also accommodate articulated robot hands with joint torques, joint frictions, and spring preloads. Predictions of our method are evaluated experimentally in object lifting tasks using a gripper that exploits microspines to exert strongly anisotropic forces.

Abstract:
This paper presents a tool for addressing a key component in many algorithms for planning robot trajectories under uncertainty: evaluation of the safety of a robot whose actions are governed by a closed-loop feedback policy near a nominal planned trajectory. We describe an adaptive importance sampling Monte Carlo framework that evaluates whether a given control policy satisfies a probabilistic collision avoidance constraint and provides an associated certificate of accuracy (in the form of a confidence interval). In particular, this adaptive technique is well-suited to addressing the complexities of rigid-body collision checking applied to non-linear robot dynamics. As a Monte Carlo method it is amenable to parallelization for computational tractability, and is generally applicable to a wide gamut of simulatable systems, including alternative noise models. Numerical experiments demonstrating the effectiveness of the adaptive importance sampling procedure are presented and discussed.
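The following sketch shows plain (non-adaptive) importance sampling with a normal-approximation confidence interval on a toy rare-event problem. It stands in for the idea only: the "collision" event is replaced by a standard normal variable exceeding a threshold, the proposal is fixed rather than adapted, and all names and parameters are assumptions.

```python
import math
import random
from statistics import NormalDist


def importance_sampling_estimate(threshold, n_samples, seed=0):
    """Estimate P(X > threshold) for X ~ N(0, 1) by sampling from N(threshold, 1)."""
    rng = random.Random(seed)
    total = 0.0
    total_sq = 0.0
    for _ in range(n_samples):
        x = rng.gauss(threshold, 1.0)  # draw from the shifted proposal
        if x > threshold:
            # Likelihood ratio of N(0, 1) to N(threshold, 1) evaluated at x.
            weight = math.exp(-threshold * x + 0.5 * threshold**2)
        else:
            weight = 0.0
        total += weight
        total_sq += weight * weight
    mean = total / n_samples
    variance = max(total_sq / n_samples - mean * mean, 0.0)
    half_width = 1.96 * math.sqrt(variance / n_samples)  # approx. 95% interval
    return mean, (mean - half_width, mean + half_width)


estimate, interval = importance_sampling_estimate(3.0, 20000)
exact = 1.0 - NormalDist().cdf(3.0)  # reference value for this toy problem
```

Shifting the proposal toward the rare event makes roughly half the samples informative, which is why the interval is far narrower than naive sampling would give for the same sample count.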

Abstract:
We present a novel computational approach to optimizing the morphological design of robotic devices. Our framework takes as input a parameterized robot design, and a motion plan consisting of end-effector trajectories and/or a body trajectory. The algorithm we propose is used to optimize a set of design parameters including link lengths and actuator layout, while concurrently adjusting motion parameters such as joint trajectories, actuator forces, and contact forces. Our key insight is that the complex relationship between design and motion parameters can be made explicit through the implicit function theorem if movements are modeled as spatio-temporal solutions to optimal control problems. This explicitly modeled relationship allows us to formulate the task of optimizing robot designs using quadratic programming. We evaluate the model we propose by optimizing two simulated robots that employ linear actuators: a manipulator and a large quadruped. We further validate our framework by optimizing the design of a small quadrupedal robot and testing its performance using a hardware implementation.

Abstract:
Human-machine teaming has, for decades, been conceptualized as a function allocation (FA) or levels of autonomy (LOA) process: the human is suited for some tasks, while the machine is suitable for others, and as machines improve they take over duties previously assigned to humans. A wide variety of methods (including adaptive, adjustable, blended, supervisory and mixed initiative control, implemented discretely or continuously, as potential fields, as virtual fixture interfaces, or haptic interfaces) are derivatives of FA/LOA. We formalize FA/LOA (and all their derivatives) under a single mathematical formulation called classical shared control (CSC). Despite the widespread adoption of CSC, we prove that it fails to optimize human and robot agreement and intent if either the human or robot model displays 'intention ambiguity' (e.g., the human's intended goal is unclear or the robot finds multiple viable solutions). Practically, this suboptimality can manifest as unnecessary and unresolvable disagreement (an unnecessary deadlock). For instance, if the robot chooses to go left around an obstacle and the human chooses to go right, CSC only provides two solutions: freeze in place or collide with the obstacle (we provide a wide variety of failure examples in [52], https://arxiv.org/abs/1611.09490). We find that CSC suboptimality stems from arbitrating over model samples, rather than over models. Our key insight is thus to arbitrate over human and robot distributions; we prove this method optimizes human and robot agreement and intent and resolves deadlocking. Our key contribution is computationally efficient distribution arbitration: if the human and robot carry N^h_t and N^R_t 'intentions,' the joint (naively) has N^h_t N^R_t intentions. In our approach, deadlock solutions have vanishingly small coefficients and only Nmin = arg min{N^h_t, N^R_t} non-zero coefficients remain: our joint has fewer modes than the individual agent models. We call our approach Nmin-sparse generalized shared control.

Abstract:
We present a theoretical analysis of a recent whole body motion planning method, the Randomized Possibility Graph, which uses a high-level decomposition of the feasibility constraint manifold in order to rapidly find routes that may lead to a solution. These routes are then examined by lower-level planners to determine feasibility. In this paper, we show that this approach is probabilistically complete for bipedal robots performing quasi-static walking in "semi-unstructured" environments. Furthermore, we show that the decomposition into higher and lower level planners allows for a considerably higher rate of convergence in the probability of finding a solution when one exists. We illustrate this convergence with a series of simulated scenarios.

Abstract:
In this paper, we propose a solution to the problem of herding by caging: given a set of mobile robots (called herders) and a group of moving agents (called sheep), we move the latter to some predefined location in such a way that they cannot escape from the robots while moving. We model the interaction between the herders and the sheep by assuming that the former exert virtual "repulsive forces" pushing the sheep away from them. These forces induce a potential field, in which the sheep move in a way that does not increase their potential. This enables the robots to partially control the motion of the sheep. We formalize this behavior geometrically by applying the notion of caging, widely used in robotic grasping. We show that our approach is provably correct in the sense that the sheep cannot escape from the robots. We propose an RRT-based motion planning algorithm, demonstrate its probabilistic completeness, and evaluate it in simulations.

Abstract:
We present a methodology for fast prototyping of morphologies and controllers for robot locomotion. Going beyond simulation-based approaches, we argue that the form and function of a robot, as well as their interplay with real-world environmental conditions are critical. Hence, fast design and learning cycles are necessary to adapt robot shape and behavior to their environment. To this end, we present a combination of laminate robot manufacturing and sample-efficient reinforcement learning. We leverage this methodology to conduct an extensive robot learning experiment. Inspired by locomotion in sea turtles, we design a low-cost crawling robot with variable, interchangeable fins. Learning is performed using both bio-inspired and original fin designs in an artificial indoor environment as well as a natural environment in the Arizona desert. The findings of this study show that static policies developed in the laboratory do not translate to effective locomotion strategies in natural environments. In contrast to that, sample-efficient reinforcement learning can help to rapidly accommodate changes in the environment or the robot.

Abstract:
Being inherently compliant, robotic artificial muscles are increasingly popular in applications such as safe human-robot interaction, legged robotics, prostheses and orthoses, and soft robotics. Their full utilization is often challenged by the coupled hysteresis among input, strain, and tension force. Although conventional two-dimensional hysteresis models are available, no prior studies on three-dimensional hysteresis models with coupled inputs have been reported for robotic artificial muscles. This paper presents a new approach to capturing the three-dimensional hysteresis of robotic artificial muscles by embedding a two-stage Preisach model. The proposed method is applied to shape memory alloy (SMA) actuators. Since direct temperature measurement of the SMA actuator is not available, the concept of temperature surrogate, representing the constant voltage value in Joule heating that would result in a given temperature at the steady-state, is adopted. The proposed approach is utilized to capture the hysteresis among temperature surrogate, contraction length, and force of an SMA actuator. Model verification is further conducted. For comparison purposes, two modeling approaches, namely, the Summed Preisach and the Linear Preisach, are also realized. Experimental results demonstrate that the proposed scheme can effectively characterize and estimate the three-dimensional hysteresis in SMA actuators. This study can be applied towards other robotic artificial muscles such as McKibben actuators and Super-coiled Polymer actuators.

Abstract:
Based on the convex force-motion polynomial model for quasi-static sliding, we derive the kinematic contact model to determine the contact modes and instantaneous object motion on a supporting surface given a position controlled manipulator. The inherently stochastic object-to-surface friction distribution is modelled by sampling physically consistent parameters from appropriate distributions, with only one parameter to control the amount of noise. Thanks to the high fidelity and smoothness of convex polynomial models, the mechanics of patch contact is captured while being computationally efficient without mode selection at support points. The motion equations for both single and multiple frictional contacts are given. Simulation based on the model is validated with robotic pushing and grasping experiments.

Abstract:
In this paper, we present an algorithm for planning and control of legged robot locomotion. Given the desired contact sequence, this method generates gaits and dynamic motions for legged robots without resorting to simplified stability criteria. The method uses direct collocation for searching for solutions within the constraint consistent subspace defined by the robot's contact configuration. For the differential equation constraints of the collocation algorithm, we use the so-called direct dynamics of a constrained multibody system. The dynamics of a legged robot is different for each contact configuration. Our method deals with such a hybrid nature, and it allows for velocity discontinuities when contacts are made. We introduce the projected impact dynamics constraint to enforce consistency during mode switching. We stabilize the plan using an inverse dynamics controller compatible with the optimal feed-forward control of the motion plan. As a whole, this approach reduces the complexity associated with specifying dynamic motions of a floating-base robot under the constant influence of contact forces. We apply this method on a hydraulically-actuated quadruped robot. We show two types of gaits (walking and trotting) as well as diverse jumping motions (forward, sideways, turning) on the real system. The results presented here are one of the few examples of an optimal control problem solved and satisfactorily transferred to a real torque-controlled legged robot.

Abstract:
This paper explores the quasi-static motion of a planar slider being pushed or pulled through a single contact point assumed not to slip. The main contribution is to derive a method for computing exact bounds on the object's motion for classes of pressure distributions where the center of pressure is known but the distribution of support forces is unknown. The second contribution is to show that the exact motion bounds can be used to plan robotic pulling trajectories that guarantee convergence to the final pose. The planner was tested on the task of pulling an acrylic rectangle to random locations within the robot workspace. The generated plans were accurate to 4.00 mm ± 3.02 mm of the target position and 4.35 degrees ± 3.14 degrees of the target orientation.

Abstract:
To reduce data collection time for deep learning of robust robotic grasp plans, we explore training from a synthetic dataset of 6.7 million point clouds, grasps, and robust analytic grasp metrics generated from thousands of 3D models from Dex-Net 1.0 in randomized poses on a table. We use the resulting dataset, Dex-Net 2.0, to train a Grasp Quality Convolutional Neural Network (GQ-CNN) model that rapidly predicts the probability of success of grasps from depth images, where grasps are specified as the planar position, angle, and depth of a gripper relative to an RGB-D sensor. Experiments with over 1,000 trials on an ABB YuMi comparing grasp planning methods on singulated objects suggest that a GQ-CNN trained with only synthetic data from Dex-Net 2.0 can be used to plan grasps in 0.8s with a success rate of 93% on eight known objects with adversarial geometry and is 3x faster than registering point clouds to a precomputed dataset of objects and indexing grasps. The Dex-Net 2.0 grasp planner is also the highest performing method on a dataset of 10 novel rigid objects and achieves 99% precision (one false positive out of 69 grasps classified as robust) on a dataset of 40 novel household objects, some of which are articulated or deformable.

Abstract:
In this paper, we extend the concept of control barrier functions, developed initially for continuous time systems, to the discrete-time domain. We demonstrate safety-critical control for nonlinear discrete-time systems with applications to 3D bipedal robot navigation. Particularly, we mathematically analyze two different formulations of control barrier functions, based on their continuous-time counterparts, and demonstrate how these can be applied to discrete-time systems. We show that the resulting formulation is a nonlinear program in contrast to the quadratic program for continuous-time systems and under certain conditions, the nonlinear program can be formulated as a quadratically constrained quadratic program. Furthermore, using the developed concept of discrete control barrier functions, we present a novel control method to address the problem of navigation of a high-dimensional bipedal robot through environments with moving obstacles that present time-varying safety-critical constraints.
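To make the discrete-time barrier idea concrete, the sketch below uses one commonly cited condition, h(x_{k+1}) - h(x_k) >= -gamma * h(x_k) with 0 < gamma <= 1, on a toy one-dimensional system. The condition's exact form, the system x_{k+1} = x_k + u_k, the barrier h(x) = x_max - x, and all names are assumptions for illustration and are not taken from the abstract.

```python
def barrier(x, x_max):
    """Hypothetical barrier function: non-negative exactly when x <= x_max."""
    return x_max - x


def safe_input(x, u_desired, x_max, gamma):
    """Largest input not exceeding u_desired that satisfies the barrier condition.

    For x_next = x + u and h(x) = x_max - x, the condition
    h(x_next) - h(x) >= -gamma * h(x) reduces to u <= gamma * (x_max - x).
    """
    return min(u_desired, gamma * barrier(x, x_max))


# Simulate a controller that always wants to move right by 0.4 per step.
x, x_max, gamma = 0.0, 1.0, 0.5
trajectory = [x]
for _ in range(10):
    x = x + safe_input(x, 0.4, x_max, gamma)
    trajectory.append(x)
```

Because the condition implies h(x_{k+1}) >= (1 - gamma) * h(x_k), the barrier value stays non-negative and the state approaches the limit without crossing it. In this scalar example the filter has a closed form; for general nonlinear dynamics it becomes an optimization problem.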