RSS2015

Abstract:
Multi-robot networks use wireless communication to provide wide-ranging services such as aerial surveillance and unmanned delivery. However, effective coordination between multiple robots requires trust, making them particularly vulnerable to cyber-attacks. Specifically, such networks can be gravely disrupted by the Sybil attack, where even a single malicious robot can spoof a large number of fake clients. This paper proposes a new solution to defend against the Sybil attack, without requiring expensive cryptographic key-distribution. Our core contribution is a novel algorithm implemented on commercial Wi-Fi radios that can "sense" spoofers using the physics of wireless signals. We derive theoretical guarantees on how this algorithm bounds the impact of the Sybil Attack on a broad class of robotic coverage problems. We experimentally validate our claims using a team of AscTec quadrotor servers and iRobot Create ground clients, and demonstrate spoofer detection rates over 96%.

Abstract:
In shared autonomy, user input and robot autonomy are combined to control a robot to achieve a goal. Often, the robot does not know a priori which goal the user wants to achieve, and must both predict the user's intended goal, and assist in achieving that goal. We formulate the problem of shared autonomy as a Partially Observable Markov Decision Process with uncertainty over the user's goal. We utilize maximum entropy inverse optimal control to estimate a distribution over the user's goal based on the history of inputs. Ideally, the robot assists the user by solving for an action which minimizes the expected cost-to-go for the (unknown) goal. As solving the POMDP to select the optimal action is intractable, we use hindsight optimization to approximate the solution. In a user study, we compare our method to a standard predict-then-blend approach. We find that our method enables users to accomplish tasks more quickly while utilizing less input. However, when asked to rate each system, users were mixed in their assessment, citing a tradeoff between maintaining control authority and accomplishing tasks quickly.
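
The two steps described above (inferring a goal distribution from user inputs, then choosing the action that minimizes expected cost-to-go under that distribution) can be sketched roughly as follows. This is an illustration only, not the authors' implementation: the quadratic user-input cost, the Euclidean cost-to-go, the `beta` parameter, and the function names `update_goal_belief` and `assist_action` are all assumptions made for this example.

```python
import numpy as np

def update_goal_belief(belief, user_input, state, goals, beta=1.0):
    """Maximum-entropy style update: b'(g) is proportional to b(g) * exp(-beta * cost_g(u)).

    Assumed cost: squared deviation of the user input from the unit
    direction pointing at each goal (a stand-in for a learned cost).
    """
    log_post = np.log(np.asarray(belief, dtype=float))
    for i, g in enumerate(goals):
        direction = g - state
        direction = direction / (np.linalg.norm(direction) + 1e-9)
        log_post[i] -= beta * np.sum((user_input - direction) ** 2)
    log_post -= log_post.max()          # numerical stability before exponentiating
    post = np.exp(log_post)
    return post / post.sum()

def assist_action(belief, state, goals, candidate_actions, step=0.1):
    """Hindsight-optimization style choice: minimize sum_g b(g) * V_g(next state).

    Assumed cost-to-go V_g: Euclidean distance from the next state to goal g.
    """
    best, best_cost = None, np.inf
    for a in candidate_actions:
        nxt = state + step * a
        expected = sum(b * np.linalg.norm(g - nxt) for b, g in zip(belief, goals))
        if expected < best_cost:
            best, best_cost = a, expected
    return best
```

Under these assumptions, an input pointing at one goal shifts the belief toward it, and the assistance action follows the belief-weighted costs rather than committing to a single predicted goal.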

Abstract:
In this work, we develop a monocular SLAM-aware object recognition system that is able to achieve considerably stronger recognition performance, as compared to classical object recognition systems that function on a frame-by-frame basis. By incorporating several key ideas including multi-view object proposals and efficient feature encoding methods, our proposed system is able to detect and robustly recognize objects in its environment using a single RGB camera in near-constant time. Through experiments, we illustrate the utility of using such a system to effectively detect and recognize objects, incorporating multiple object viewpoint detections into a unified prediction hypothesis. The performance of the proposed recognition system is evaluated on the UW RGB-D Dataset, showing strong recognition performance and scalable run-time performance compared to current state-of-the-art recognition systems.

Abstract:
Inverse Optimal Control (IOC) has strongly impacted the systems engineering process, enabling automated planner tuning through straightforward and intuitive demonstration. The most successful and established applications, though, have been in lower dimensional problems such as navigation planning where exact optimal planning or control is feasible. In higher dimensional systems, such as humanoid robots, research has made substantial progress toward generalizing the ideas to model free or locally optimal settings, but these systems are complicated to the point where demonstration itself can be difficult. Typically, real-world applications are restricted to at best noisy or even partial or incomplete demonstrations that prove cumbersome in existing frameworks. This work derives a very flexible method of IOC based on a form of Structured Prediction known as Direct Loss Minimization. The resulting algorithm is essentially Policy Search on a reward function that rewards similarity to demonstrated behavior (using Covariance Matrix Adaptation (CMA) in our experiments). Our framework blurs the distinction between IOC, other forms of Imitation Learning, and Reinforcement Learning, enabling us to derive simple, versatile, and practical algorithms that blend imitation and reinforcement signals into a unified framework. Our experiments analyze various aspects of its performance and demonstrate its efficacy on conveying preferences for motion shaping and combined reach and grasp quality optimization.

Abstract:
As intelligent robots become more prevalent, methods to make interaction with the robots more accessible are increasingly important. Communicating the tasks that a person wants the robot to carry out via natural language, and training the robot to ground the natural language through demonstration, are especially appealing approaches for interaction, since they do not require a technical background. However, existing approaches map natural language commands to robot command languages that directly express the sequence of actions the robot should execute. This sequence is often specific to a particular situation and does not generalize to new situations. To address this problem, we present a system that grounds natural language commands into reward functions using demonstrations of different natural language commands being carried out in the environment. Because language is grounded to reward functions, rather than explicit actions that the robot can perform, commands can be high-level, carried out in novel environments autonomously, and even transferred to other robots with different action spaces. We demonstrate that our learned model can be both generalized to novel environments and transferred to a robot with a different action space than the action space used during training.

Abstract:
In this paper we point out an overlooked structure of SLAM that distinguishes it from a generic nonlinear least squares problem. The measurement function in most common forms of SLAM is linear with respect to robot and feature positions. Therefore, given an estimate for robot orientation, the conditionally optimal estimate for the rest of the state variables can be easily obtained by solving a sparse linear-Gaussian estimation problem. We propose an algorithm to exploit this intrinsic property of SLAM by stripping the problem down to its nonlinear core, while maintaining its natural sparsity. Our algorithm can be used together with any Newton-based iterative solver and is applicable to 2D/3D pose-graph and feature-based problems. Our results suggest that iteratively solving the nonlinear core of SLAM leads to fast and reliable convergence compared to state-of-the-art back-ends.
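
The conditional linearity described above can be illustrated with a small 2D pose-graph sketch: once the orientations are fixed, each relative-position measurement becomes a linear constraint on the positions. This is not the paper's algorithm. It assumes unit measurement covariance, uses a dense solver for brevity where a real back-end would use a sparse one, and the names `rot` and `solve_positions` are hypothetical.

```python
import numpy as np

def rot(theta):
    """2D rotation matrix R(theta)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s], [s, c]])

def solve_positions(thetas, edges, n):
    """Least-squares positions given fixed orientations.

    edges: list of (i, j, z_ij) with model z_ij = R(theta_i)^T (p_j - p_i),
    which rearranges to the linear constraint p_j - p_i = R(theta_i) z_ij.
    """
    rows, rhs = [], []
    for i, j, z in edges:
        A = np.zeros((2, 2 * n))
        A[:, 2 * j:2 * j + 2] = np.eye(2)
        A[:, 2 * i:2 * i + 2] = -np.eye(2)
        rows.append(A)
        rhs.append(rot(thetas[i]) @ np.asarray(z, dtype=float))
    # Anchor the first pose at the origin to remove the translational gauge freedom.
    A0 = np.zeros((2, 2 * n))
    A0[:, 0:2] = np.eye(2)
    rows.append(A0)
    rhs.append(np.zeros(2))
    p, *_ = np.linalg.lstsq(np.vstack(rows), np.concatenate(rhs), rcond=None)
    return p.reshape(n, 2)
```

In this sketch only the orientations would remain for a nonlinear solver to iterate on; the positions are recovered in one linear solve per orientation estimate.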

Abstract:
We propose a layered street view model to encode both depth and semantic information on street view images for autonomous driving. Recently, stixels, stix-mantics, and tiered scene labeling methods have been proposed to model street view images. We propose a 4-layer street view model, a compact representation over the recently proposed stix-mantics model. Our layers encode semantic classes like ground, pedestrians, vehicles, buildings, and sky in addition to the depths. The only input to our algorithm is a pair of stereo images. We use a deep neural network to extract the appearance features for semantic classes. We use a simple and efficient inference algorithm to jointly estimate both semantic classes and layered depth values. Our method outperforms other competing approaches on the Daimler urban scene segmentation dataset. Our algorithm is massively parallelizable, allowing a GPU implementation with a processing speed of about 9 fps.

Abstract:
Handovers of objects are critical interactions that frequently arise in physical collaborations. In such interactions, humans naturally monitor the pace and workload of their partners and adapt their handovers accordingly. In this paper, we investigate how robots designed to engage in physical collaborations may achieve similar adaptivity in performing handovers. To that end, we collected and analyzed data from human dyads performing a common household task, unloading a dish rack, where receivers had different levels of task demands. We identified two coordination strategies that enabled givers to adapt to receivers' task demands. We then formulated and implemented these strategies on a robotic manipulator. The implemented autonomous system was evaluated in a human-robot interaction study against two baselines that use proactive and reactive coordination methods. The results show a tradeoff between team performance and user experience when human receivers had greater task demands. In particular, the proactive method provided the greatest levels of team performance but offered the poorest user experience compared to the reactive and adaptive methods. The reactive method, while improving user experience over the proactive method, resulted in the poorest team performance. Our adaptive method maintained this improved user experience while offering an improved team performance compared to the reactive method. Our findings offer insights into the tradeoffs involved in the use of these methods and inform the future design of handover interactions for robots.

Abstract:
We present a novel approach to real-time dense visual SLAM. Our system is capable of capturing comprehensive dense globally consistent surfel-based maps of room scale environments explored using an RGB-D camera in an incremental online fashion, without pose graph optimisation or any post-processing steps. This is accomplished by using dense frame-to-model camera tracking and windowed surfel-based fusion coupled with frequent model refinement through non-rigid surface deformations. Our approach applies local model-to-model surface loop closure optimisations as often as possible to stay close to the mode of the map distribution, while utilising global loop closure to recover from arbitrary drift and maintain global consistency.

Abstract:
Long-term operations of resource-constrained robots typically require hard decisions be made about which data to process and/or retain. The question then arises of how to choose which data is most useful to keep to achieve the task at hand. As spatial scale grows, the size of the map will grow without bound, and as temporal scale grows, the number of measurements will grow without bound. In this work, we present the first known approach to tackle both of these issues. The approach has two stages. First, a subset of the variables (focused variables) is selected that are most useful for a particular task. Second, a task-agnostic and principled method (focused inference) is proposed to select a subset of the measurements that maximizes the information over the focused variables. The approach is then applied to the specific task of robot navigation in an obstacle-laden environment. A landmark selection method is proposed to minimize the probability of collision and then select the set of measurements that best localizes those landmarks. It is shown that the two-stage approach outperforms both only selecting measurements and only selecting landmarks in terms of minimizing the probability of collision. The performance improvement is validated through detailed simulation and real experiments on a Pioneer robot.

Abstract:
We introduce a principled method for multi-robot coordination based on a generic model (termed a MacDec-POMDP) of multi-robot cooperative planning in the presence of stochasticity, uncertain sensing and communication limitations. We present a new MacDec-POMDP planning algorithm that searches over policies represented as finite-state controllers, rather than the existing policy tree representation. Finite-state controllers can be much more concise than trees, are much easier to interpret, and can operate over an infinite horizon. The resulting policy search algorithm requires a substantially simpler simulator that models only the outcomes of executing a given set of motor controllers, not the details of the executions themselves, and can solve significantly larger problems than existing MacDec-POMDP planners. We demonstrate significantly improved performance over previous methods and application to a cooperative multi-robot bartending task, showing that our method can be used for actual multi-robot systems.

Abstract:
Unmanned aerial vehicle (UAV) capability is currently limited by the amount of energy that can be stored onboard. Airborne docking, for mid-air refueling, is a viable solution that has been implemented with manned aircraft for decades, but has yet to be achieved with their unmanned counterparts. The prohibitive challenge is the highly accurate and reliable relative positioning performance that is necessary to dock with a small target, in the air, amidst external disturbances. This paper presents a complete solution to airborne docking, which includes vision-aided unscented Kalman filters for leader-relative navigation and docking appendage motion estimation; and guidance that is suitable for all phases of the mission. The work concludes by demonstrating the proposed algorithms in what is thought to be the first UAV airborne docking.

Abstract:
This paper presents a collaborative control strategy designed to enable a team of robots to track attracting Lagrangian coherent structures (LCS) and unstable manifolds in two-dimensional flows. Tracking LCS in dynamical systems is important for many applications such as planning energy optimal paths in the ocean and predicting various physical and biological processes in the ocean. Similar to existing approaches, the proposed strategy does not require global information about the dynamics of the surrounding flow, and is based on local sensing, prediction, and correction. Different from existing approaches, the proposed strategy has the ability to track attracting LCS and unstable manifolds in real time through direct computation of the local finite time Lyapunov exponent field. The collaborative control strategy is implemented on a team of robots and the theoretical guarantees of the tracking strategy are briefly discussed. We demonstrate the tracking strategy in simulation using static and time dependent flows and experimentally validate the strategy using a team of micro autonomous surface vehicles (mASVs) in an actual fluid environment.
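
As a rough illustration of the quantity mentioned above, the finite-time Lyapunov exponent (FTLE) at a point is commonly defined from the flow-map gradient F over an integration time T as ln(sqrt(lambda_max(F^T F))) / |T|. The sketch below evaluates it at a single point by central differences. It assumes a callable flow map is available, which a robot team would instead have to approximate from local measurements; `flow_map_jacobian`, `ftle`, and the step size `h` are hypothetical names and values chosen for this example.

```python
import numpy as np

def flow_map_jacobian(flow_map, x, h=1e-4):
    """Central-difference gradient of the flow map at point x."""
    x = np.asarray(x, dtype=float)
    n = x.size
    J = np.zeros((n, n))
    for k in range(n):
        e = np.zeros(n)
        e[k] = h
        J[:, k] = (flow_map(x + e) - flow_map(x - e)) / (2 * h)
    return J

def ftle(flow_map, x, T):
    """FTLE = ln(sqrt(largest eigenvalue of the Cauchy-Green tensor)) / |T|."""
    J = flow_map_jacobian(flow_map, x)
    C = J.T @ J                         # Cauchy-Green deformation tensor
    lam_max = np.linalg.eigvalsh(C)[-1] # eigvalsh returns eigenvalues in ascending order
    return np.log(np.sqrt(lam_max)) / abs(T)
```

For the linear saddle flow dx/dt = x, dy/dt = -y, the flow map over time T is (x e^T, y e^-T), so this function should return a value close to 1 everywhere.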

Abstract:
Event-based vision sensors, such as the Dynamic Vision Sensor (DVS), do not output a sequence of video frames like standard cameras, but a stream of asynchronous events. An event is triggered when a pixel detects a change of brightness in the scene. An event contains the location, sign, and precise timestamp of the change. The high dynamic range and temporal resolution of the DVS, which is in the order of micro-seconds, make this a very promising sensor for high-speed applications, such as robotics and wearable computing. However, due to the fundamentally different structure of the sensor's output, new algorithms that exploit the high temporal resolution and the asynchronous nature of the sensor are required. In this paper, we address ego-motion estimation for an event-based vision sensor using a continuous-time framework to directly integrate the information conveyed by the sensor. The DVS pose trajectory is approximated by a smooth curve in the space of rigid-body motions using cubic splines and it is optimized according to the observed events. We evaluate our method using datasets acquired from sensor-in-the-loop simulations and onboard a quadrotor performing flips. The results are compared to the ground truth, showing the good performance of the proposed technique.

Abstract:
One of the main challenges in autonomous manipulation is to generate appropriate multi-modal reference trajectories that enable feedback controllers to compute control commands that compensate for unmodeled perturbations and therefore to achieve the task at hand. We propose a data-driven approach to incrementally acquire reference signals from experience and decide online when and to which successive behavior to switch, ensuring successful task execution. We reformulate this online decision making problem as a pair of related classification problems. Both process the current sensor state, composed from multiple sensor modalities, in real-time (at 30 Hz). Our approach exploits that movement generation can dictate sensor feedback. Thus, enforcing stereotypical behavior will yield stereotypical sensory events which can be accumulated and stored along with the movement plan. Such movement primitives, augmented with sensor experience, are called Associative Skill Memories (ASMs). Sensor experience consists of (real) sensors, including haptic, auditory information and visual information, as well as additional (virtual) features. We show that our approach can be used to teach dexterous tasks, e.g. a bimanual manipulation task on a real platform that requires precise manipulation of relatively small objects. Task execution is robust against perturbation and sensor noise, because our method decides online whether or not to switch to alternative ASMs due to unexpected sensory signals.

Abstract:
Rearranging multiple objects is a critical skill for robots so that they can effectively deal with clutter in human spaces. This is a challenging problem as it involves combinatorially large, continuous C-spaces involving multiple movable bodies and complex kinematic constraints. This work initially revisits an existing search-based approach, which solves monotone challenges, i.e., when objects need to be grasped only once so as to be rearranged. The first contribution is the extension of this technique to a method that addresses many non-monotone challenges. The second contribution is the use of either the monotone or of the new non-monotone method as a local planner in the context of a higher-level task planner that searches the space of object placements and which provides stronger guarantees. The paper aims to emphasize the benefit of using more powerful motion primitives in the context of task planning for object rearrangement than an individual pick-and-place. Experiments in simulation using a model of a Baxter robot arm show the capability of solving difficult instances of rearrangement problems and evaluate the methods in terms of success ratio, running time, scalability and path quality.

Abstract:
In this work, we present an approach to topological motion planning which is fully data-driven in nature and which relies solely on the knowledge of samples in the free configuration space. For this purpose, we discuss the use of persistent cohomology with coefficients in a finite field to compute a basis which allows us to efficiently solve the path planning problem. The proposed approach can be used both in the case where a part of a configuration space is well-approximated by samples and, more generally, with arbitrary filtrations arising from real-world data sets. Furthermore, our approach can generate motions in a subset of the configuration space specified by the sub- or superlevel set of a filtration function such as a cost function or probability distribution. Our experiments show that our approach is highly scalable in low dimensions and we present results on simulated PR2 arm motions as well as GPS trace and motion capture data.

Abstract:
We study the problem of path planning for unlabeled (indistinguishable) unit-disc robots in a planar environment cluttered with polygonal obstacles. We introduce an algorithm which minimizes the total path length, i.e., the sum of lengths of the individual paths. Our algorithm is guaranteed to find a solution if one exists, or report that none exists otherwise. It runs in time $\tilde{O}(m^4 + m^2 n^2)$, where m is the number of robots and n is the total complexity of the workspace. Moreover, the total length of the returned solution is at most $O(\text{OPT} + O(m))$, where OPT is the optimal-solution cost. To the best of our knowledge this is the first algorithm for the problem that has such guarantees. The algorithm has been implemented in an exact manner and we present experimental results that attest to its efficiency.

Abstract:
Place recognition has long been an incompletely solved problem in that all approaches involve significant compromises. Current methods address many but never all of the critical challenges of place recognition: viewpoint-invariance, condition-invariance and minimizing training requirements. Here we present an approach that adapts state-of-the-art object proposal techniques to identify potential landmarks within an image for place recognition. We use the astonishing power of convolutional neural network features to identify matching landmark proposals between images to perform place recognition over extreme appearance and viewpoint variations. Our system does not require any form of training; all components are generic enough to be used off-the-shelf. We present a range of challenging experiments in varied viewpoint and environmental conditions. We demonstrate superior performance to current state-of-the-art techniques. Furthermore, by building on existing and widely used recognition frameworks, this approach provides a highly compatible place recognition system with the potential for easy integration of other techniques such as object detection and semantic scene interpretation.

Abstract:
This paper shows how to define quantitative measures of a robot's ability to balance itself actively on a single point of support. These measures are expressed as ratios of velocities, and are called velocity gains. This paper builds on earlier work in this area by showing how these gains can be defined and calculated for the case of a general planar robot balancing on a general rolling-contact point in the plane, and the case of a general spatial robot balancing on a general rolling-contact point in 3D space. The case of balancing on a contact area with compliance is also considered. The paper concludes with two examples showing how to use velocity gains in the design of a triple pendulum and the analysis of a hydraulic quadruped.

Abstract:
Robot teleoperation systems face a common set of challenges including latency, low-dimensional user commands, and asymmetric control inputs. User control with Brain-Computer Interfaces (BCIs) exacerbates these problems through especially noisy and erratic low-dimensional motion commands due to the difficulty in decoding neural activity. We introduce a general framework to address these challenges through a combination of computer vision, user intent inference, and arbitration between the human input and autonomous control schemes. Adjustable levels of assistance allow the system to balance the operator's capabilities and feelings of comfort and control while compensating for a task's difficulty. We present experimental results demonstrating significant performance improvement using the shared-control assistance framework on adapted rehabilitation benchmarks with two subjects implanted with intracortical brain-computer interfaces controlling a seven degree-of-freedom robotic manipulator as a prosthetic. Our results further indicate that shared assistance mitigates perceived user difficulty and even enables successful performance on previously infeasible tasks. We showcase the extensibility of our architecture with applications to quality-of-life tasks such as opening a door, pouring liquids from containers, and manipulation with novel objects in densely cluttered environments.

Abstract:
In unlabeled multi-robot motion planning several interchangeable robots operate in a common workspace. The goal is to move the robots to a set of target positions such that each position will be occupied by some robot. In this paper, we study this problem for the specific case of unit-square robots moving amidst polygonal obstacles and show that it is PSPACE-hard. We also consider three additional variants of this problem and show that they are all PSPACE-hard as well. To the best of our knowledge, this is the first hardness proof for the unlabeled case. Furthermore, our proofs can be used to show that the labeled variant (where each robot is assigned a specific target position), again, for unit-square robots, is PSPACE-hard as well, which sets another precedent, as previous hardness results require the robots to be of different shapes.

Abstract:
We propose an information-theoretic planning approach that enables mobile robots to autonomously construct dense 3D maps in a computationally efficient manner. Inspired by prior work, we accomplish this task by formulating an information-theoretic objective function based on Cauchy-Schwarz quadratic mutual information (CSQMI) that guides robots to obtain measurements in uncertain regions of the map. We then contribute a two-stage approach for active mapping. First, we generate a candidate set of trajectories using a combination of global planning and generation of local motion primitives. From this set, we choose a trajectory that maximizes the information-theoretic objective. Second, we employ a gradient-based trajectory optimization technique to locally refine the chosen trajectory such that the CSQMI objective is maximized while satisfying the robot's motion constraints. We evaluated our approach through a series of simulations and experiments on a ground robot and an aerial robot mapping unknown 3D environments. Real-world experiments suggest our approach reduces the time to explore an environment by 70% compared to a closest frontier exploration strategy and 57% compared to an information-based strategy that uses global planning, while simulations demonstrate the approach extends to aerial robots with higher-dimensional state.

Abstract:
Recent results in monocular visual-inertial navigation (VIN) have shown that optimization-based approaches outperform filtering methods in terms of accuracy due to their capability to relinearize past states. However, the improvement comes at the cost of increased computational complexity. In this paper, we address this issue by preintegrating inertial measurements between selected keyframes. The preintegration allows us to accurately summarize hundreds of inertial measurements into a single relative motion constraint. Our first contribution is a preintegration theory that properly addresses the manifold structure of the rotation group and carefully deals with uncertainty propagation. The measurements are integrated in a local frame, which eliminates the need to repeat the integration when the linearization point changes while leaving the opportunity for belated bias corrections. The second contribution is to show that the preintegrated IMU model can be seamlessly integrated in a visual-inertial pipeline under the unifying framework of factor graphs. This enables the use of a structureless model for visual measurements, further accelerating the computation. The third contribution is an extensive evaluation of our monocular VIN pipeline: experimental results confirm that our system is very fast and demonstrates superior accuracy with respect to competitive state-of-the-art filtering and optimization algorithms, including off-the-shelf systems such as Google Tango.
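
The idea of summarizing many inertial measurements into one relative motion constraint can be sketched as below. This is a heavily simplified illustration, not the paper's theory: it omits noise and covariance propagation, bias-correction Jacobians, and gravity handling, treats the biases as known constants, and uses simple Euler integration. The names `so3_exp` and `preintegrate` are hypothetical.

```python
import numpy as np

def so3_exp(phi):
    """Exponential map on SO(3) via Rodrigues' formula."""
    angle = np.linalg.norm(phi)
    K = np.array([[0.0, -phi[2], phi[1]],
                  [phi[2], 0.0, -phi[0]],
                  [-phi[1], phi[0], 0.0]])
    if angle < 1e-10:
        return np.eye(3) + K            # first-order approximation near zero rotation
    return (np.eye(3)
            + (np.sin(angle) / angle) * K
            + ((1.0 - np.cos(angle)) / angle ** 2) * (K @ K))

def preintegrate(gyro, accel, dt, gyro_bias=None, accel_bias=None):
    """Accumulate relative rotation, velocity and position between two keyframes.

    All quantities are expressed in the body frame of the first keyframe,
    so they do not depend on the global state estimate.
    """
    bg = np.zeros(3) if gyro_bias is None else np.asarray(gyro_bias, dtype=float)
    ba = np.zeros(3) if accel_bias is None else np.asarray(accel_bias, dtype=float)
    dR, dv, dp = np.eye(3), np.zeros(3), np.zeros(3)
    for w, a in zip(gyro, accel):
        a_c = np.asarray(a, dtype=float) - ba
        dp = dp + dv * dt + 0.5 * (dR @ a_c) * dt ** 2
        dv = dv + (dR @ a_c) * dt
        dR = dR @ so3_exp((np.asarray(w, dtype=float) - bg) * dt)
    return dR, dv, dp
```

Because the accumulated quantities are relative to the first keyframe, they would not need to be recomputed when the linearization point of the global states changes, which is the property the abstract highlights.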

Abstract:
This paper presents the Rapidly-exploring Adaptive Search and Classification (ReASC) algorithm, a sampling-based algorithm for planning the trajectories of mobile robots performing real-time target search and classification tasks in the field. The proposed algorithm incrementally builds up a tree of solutions and evaluates the utility of each solution for identifying targets in an environment. An optimistic approximation for the classification utility is used, which reduces the computational cost of evaluating trajectories and makes real-time adaptive planning feasible. The proposed algorithm is tested on an autonomous aquatic vehicle and is shown to outperform myopic methods by up to 36% in a lake monitoring scenario.

Abstract:
Designing robotic controllers for tasks with complex non-linear dynamics is extremely challenging, time-consuming, and in many cases, infeasible. This difficulty is exacerbated in tasks such as robotic food-cutting, in which dynamics might vary both with environmental properties, such as material and tool class, and with time while acting. In this work, we present DeepMPC, an online real-time model-predictive control approach designed to handle such difficult tasks. Rather than hand-design a dynamics model for the task, our approach uses a novel deep architecture and learning algorithm, learning controllers for complex tasks directly from data. We validate our method in experiments on a large-scale dataset of 1488 material cuts for 20 diverse classes, and in 450 real-world robotic experiments, demonstrating significant improvement over several other approaches.

Abstract:
Optimization is often difficult to apply to robots due to the presence of significant errors in the optimization model, which may cause constraints to be violated during execution on a real robot. This work presents a method to optimize trajectories with large modeling errors using a combination of robust optimization and parameter learning. In particular it considers the problem of computing a dynamically-feasible trajectory along a fixed path under frictional contact, where friction is uncertain and actuator effort is noisy. It introduces a robust time-scaling method that is able to accept confidence intervals on uncertain parameters, and uses a convex parameterization that allows dynamically-feasible motions under contact to be computed in seconds. This is combined with an iterative learning method that uses feedback from execution to learn confidence bounds on modeling parameters. Experiments on a manipulator performing a "waiter" task, in which an object is moved on a carried tray as quickly as possible, demonstrate this method can compensate for modeling uncertainties within a handful of iterations.

Abstract:
Stochastic optimal control problems frequently arise as motion control problems in the context of robotics. Unfortunately, all existing approaches that guarantee arbitrary precision suffer from the curse of dimensionality: the computational effort invested by the algorithm grows exponentially fast with increasing dimensionality of the state space of the underlying dynamic system governing the robot. In this paper, we propose a novel algorithm that utilizes compressed representations to efficiently solve stochastic optimal control problems with arbitrary precision. The running time of the new algorithm scales linearly with increasing dimensionality of the state space! The running time also depends polynomially on the rank of the value function, a measure that quantifies the intrinsic geometric complexity of the value function, due to the geometry and physics embedded in the problem instance at hand. The new algorithm is based on the recent analysis and algorithms for tensor decomposition, generalizing matrix decomposition algorithms, e.g., the singular value decomposition, to three or more dimensions. In computational experiments, we show the computational effort of the new algorithm also scales linearly with the discretization resolution of the state space. We also demonstrate the new algorithm on a problem involving the perching of an aircraft, represented by a nonlinear non-holonomic longitudinal model with a seven-dimensional state space, the full numerical solution to which was not obtained before. In this example, we estimate that the proposed algorithm runs more than seven orders of magnitude faster, when compared to the naive value iteration.

Abstract:
The Gaussian Filter (GF) is one of the most widely used filtering algorithms; instances are the Extended Kalman Filter, the Unscented Kalman Filter and the Divided Difference Filter. GFs represent the belief of the current state by a Gaussian with the mean being an affine function of the measurement. We show that this representation can be too restrictive to accurately capture the dependences in systems with nonlinear observation models, and we investigate how the GF can be generalized to alleviate this problem. To this end, we view the GF from a variational-inference perspective. We analyse how restrictions on the form of the belief can be relaxed while maintaining simplicity and efficiency. This analysis provides a basis for generalizations of the GF. We propose one such generalization which coincides with a GF using a virtual measurement, obtained by applying a nonlinear function to the actual measurement. Numerical experiments show that the proposed Feature Gaussian Filter (FGF) can have a substantial performance advantage over the standard GF for systems with nonlinear observation models.
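
The statement that the GF posterior mean is an affine function of the measurement, and the idea of replacing the measurement with a nonlinear feature of it, can be sketched as follows. This is only an illustration under stated assumptions: the joint moments are estimated from Monte Carlo samples rather than by any particular filter's approximation, and the function name `gaussian_filter_update` and the `feature` argument are hypothetical rather than taken from the paper.

```python
import numpy as np

def gaussian_filter_update(x_samples, y_samples, y_obs, feature=None):
    """Moment-matched Gaussian update from joint samples of state x and measurement y.

    x_samples: (N, dx) array, y_samples: (N, dy) array, y_obs: (dy,) array.
    If `feature` is given, it is applied to both the sampled and the actual
    measurement, i.e. the update runs on a virtual measurement feature(y).
    """
    if feature is not None:
        y_samples, y_obs = feature(y_samples), feature(y_obs)
    n = x_samples.shape[0]
    mu_x, mu_y = x_samples.mean(axis=0), y_samples.mean(axis=0)
    dx, dy = x_samples - mu_x, y_samples - mu_y
    S_xx, S_xy, S_yy = dx.T @ dx / n, dx.T @ dy / n, dy.T @ dy / n
    K = np.linalg.solve(S_yy, S_xy.T).T     # gain K = S_xy S_yy^{-1} (S_yy is symmetric)
    mean = mu_x + K @ (y_obs - mu_y)        # affine in y_obs (or in feature(y_obs))
    cov = S_xx - K @ S_xy.T
    return mean, cov
```

Without a feature, the returned mean is affine in `y_obs` for fixed samples; passing a nonlinear `feature` (for instance `np.tanh`, chosen here arbitrarily) makes the mean a nonlinear function of the actual measurement while keeping the same update equations.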

Abstract:
We present a novel trajectory optimization framework to address the issues of robustness, scalability and efficiency in optimal control and reinforcement learning. Based on prior work in Cooperative Stochastic Differential Game (CSDG) theory, our method performs local trajectory optimization using cooperative controllers. The resulting framework is called Cooperative Game-Differential Dynamic Programming (CG-DDP). Compared to related methods, CG-DDP exhibits improved performance in terms of robustness and efficiency. The proposed framework is also applied in a data-driven fashion for belief space trajectory optimization under learned dynamics. We present experiments showing that CG-DDP can be used for optimal control and reinforcement learning under external disturbances and internal model errors.

Abstract:
This paper proposes an efficient and effective scheme for applying the sliding window approach popular in computer vision to 3D data. Specifically, the sparse nature of the problem is exploited via a voting scheme to enable a search through all putative object locations at any orientation. We prove that this voting scheme is mathematically equivalent to a convolution on a sparse feature grid and thus enables the processing, in full 3D, of any point cloud irrespective of the number of vantage points required to construct it. As such it is versatile enough to operate on data from popular 3D laser scanners such as a Velodyne as well as on 3D data obtained from increasingly popular push-broom configurations. Our approach is "embarrassingly parallelisable" and capable of processing a point cloud containing over 100K points at eight orientations in less than 0.5s. For the object classes car, pedestrian and bicyclist the resulting detector achieves best-in-class detection and timing performance relative to prior art on the KITTI dataset as well as compared to another existing 3D object detection approach.
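
The equivalence between voting and sliding-window convolution can be checked on a toy grid. The sketch below is an illustration under simplifying assumptions, not the authors' detector: each occupied cell carries a single scalar feature, the classifier is a linear 2x2x2 window, and all names and values are hypothetical.

```python
from collections import defaultdict
from itertools import product


def dense_scores(features, weights, origins):
    """Sliding window: evaluate the linear window at every candidate origin."""
    scores = {}
    for o in origins:
        s = 0.0
        for off, w in weights.items():
            cell = (o[0] + off[0], o[1] + off[1], o[2] + off[2])
            s += w * features.get(cell, 0.0)  # empty cells contribute nothing
        scores[o] = s
    return scores


def voting_scores(features, weights):
    """Voting: each occupied cell votes for every window origin that contains it."""
    votes = defaultdict(float)
    for cell, f in features.items():
        for off, w in weights.items():
            origin = (cell[0] - off[0], cell[1] - off[1], cell[2] - off[2])
            votes[origin] += w * f
    return dict(votes)


# Hypothetical sparse feature grid: only three occupied cells.
features = {(2, 3, 1): 1.0, (3, 3, 1): 2.0, (7, 0, 4): 0.5}

# Hypothetical linear window weights, indexed by offset inside the window.
weights = {
    off: 1.0 + 0.5 * off[0] + 0.25 * off[1] - 0.125 * off[2]
    for off in product(range(2), repeat=3)
}

origins = list(product(range(-1, 9), repeat=3))
dense = dense_scores(features, weights, origins)
votes = voting_scores(features, weights)

# Voting touches only origins near occupied cells, yet the scores agree everywhere.
print(len(origins), len(votes))
print(all(abs(dense[o] - votes.get(o, 0.0)) < 1e-12 for o in origins))
```

The dense loop visits every origin, whereas the voting loop does work proportional to the number of occupied cells times the window size, which is where sparsity pays off.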

Abstract:
Accurately estimating a robot's pose relative to a global scene model and precisely tracking the pose in real-time is a fundamental problem for navigation and obstacle avoidance tasks. Due to the computational complexity of localization against a large map and the memory consumed by the model, state-of-the-art approaches are either limited to small workspaces or rely on a server-side system to query the global model while tracking the pose locally. The latter approaches face the problem of smoothly integrating the server's pose estimates into the trajectory computed locally to avoid temporal discontinuities. In this paper, we demonstrate that large-scale, real-time pose estimation and tracking can be performed on mobile platforms with limited resources without the use of an external server. This is achieved by employing map and descriptor compression schemes as well as efficient search algorithms from computer vision. We derive a formulation for integrating the global pose information into a local state estimator that produces much smoother trajectories than current approaches. Through detailed experiments, we evaluate each of our design choices individually and document its impact on the overall system performance, demonstrating that our approach outperforms state-of-the-art algorithms for localization at scale.

Abstract:
In recent years, several direct (i.e. featureless) monocular SLAM approaches have appeared showing impressive semi-dense or dense scene reconstructions. These works have questioned the need for features, on which the consolidated SLAM techniques of the last decade were based. In this paper we present a novel feature-based monocular SLAM system that is more robust, gives more accurate camera poses, and obtains comparable or better semi-dense reconstructions than the current state of the art. Our semi-dense mapping operates over keyframes, optimized by local bundle adjustment, allowing us to obtain accurate triangulations from wide baselines. Our novel method to search correspondences, the measurement fusion and the inter-keyframe depth consistency tests allow us to obtain clean reconstructions with very few outliers. Against the current trend in direct SLAM, our experiments show that by decoupling the semi-dense reconstruction from the trajectory computation, the results obtained are better. This opens the discussion on the benefits of features even if a semi-dense reconstruction is desired.

Abstract:
To operate reliably in real-world traffic, an autonomous car must evaluate the consequences of its potential actions by anticipating the uncertain intentions of other traffic participants. This paper presents an integrated behavioral inference and decision-making approach that models vehicle behavior for both our vehicle and nearby vehicles as a discrete set of closed-loop policies that react to the actions of other agents. Each policy captures a distinct high-level behavior and intention, such as driving along a lane or turning at an intersection. We first employ Bayesian changepoint detection on the observed history of states of nearby cars to estimate the distribution over potential policies that each nearby car might be executing. We then sample policies from these distributions to obtain high-likelihood actions for each participating vehicle. Through closed-loop forward simulation of these samples, we can evaluate the outcomes of the interaction of our vehicle with other participants (e.g., a merging vehicle accelerates and we slow down to make room for it, or the vehicle in front of ours suddenly slows down and we decide to pass it). Based on those samples, our vehicle then executes the policy with the maximum expected reward value. Thus, our system is able to make decisions based on coupled interactions between cars in a tractable manner. This work extends our previous multipolicy system by incorporating behavioral anticipation into decision-making to evaluate sampled potential vehicle interactions. We evaluate our approach using real-world traffic-tracking data from our autonomous vehicle platform, and present decision-making results in simulation involving highway traffic scenarios.

Abstract:
The vast amount of data robots can capture today motivates the development of fast and scalable statistical tools to model the environment the robot operates in. We devise a new technique for environment representation through continuous occupancy mapping that improves on the popular occupancy grid maps in two fundamental aspects: 1) it does not assume an a priori discretisation of the world into grid cells and therefore can provide maps at an arbitrary resolution; 2) it captures statistical relationships between measurements naturally, thus being more robust to outliers and possessing better generalisation performance. The technique, named Hilbert maps, is based on the computation of fast kernel approximations that project the data into a Hilbert space where a logistic regression classifier is learnt. We show that this approach allows for efficient stochastic gradient descent optimisation where each measurement is only processed once during learning or online updates. We present results with three types of kernel approximations: Random Fourier, Nyström and a novel sparse projection. We also show how to extend the approach to accept probability distributions as inputs, i.e. when there is uncertainty over the position of laser scans due to sensor or localisation errors. Experiments demonstrate the benefits of the approach in popular benchmark datasets with several thousand laser scans.
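
The pipeline described here (approximate kernel features followed by logistic regression trained with one stochastic gradient step per measurement) can be sketched in a few lines. This is a minimal illustration assuming random Fourier features for an RBF kernel; the class name, the lengthscale, the learning rate, and the toy half-plane world are all assumptions rather than details from the paper.

```python
import math
import random


def make_rff(dim, n_features, lengthscale, rng):
    """Random Fourier features approximating an RBF kernel with the given lengthscale."""
    freqs = [[rng.gauss(0.0, 1.0 / lengthscale) for _ in range(dim)]
             for _ in range(n_features)]
    phases = [rng.uniform(0.0, 2.0 * math.pi) for _ in range(n_features)]
    scale = math.sqrt(2.0 / n_features)

    def phi(x):
        return [scale * math.cos(sum(wk * xk for wk, xk in zip(w, x)) + b)
                for w, b in zip(freqs, phases)]

    return phi


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


class HilbertMapSketch:
    """Logistic regression on kernel features, updated one measurement at a time."""

    def __init__(self, phi, n_features, lr=0.5):
        self.phi = phi
        self.w = [0.0] * n_features
        self.lr = lr

    def predict(self, x):
        """Occupancy probability at an arbitrary continuous location x."""
        return sigmoid(sum(wi * fi for wi, fi in zip(self.w, self.phi(x))))

    def update(self, x, occupied):
        """Single SGD step on the logistic loss for one labelled point."""
        f = self.phi(x)
        p = sigmoid(sum(wi * fi for wi, fi in zip(self.w, f)))
        g = p - (1.0 if occupied else 0.0)
        self.w = [wi - self.lr * g * fi for wi, fi in zip(self.w, f)]


# Toy world (assumed): everything with x < 0 is occupied, the rest is free.
rng = random.Random(0)
n_features = 100
model = HilbertMapSketch(make_rff(2, n_features, 0.5, rng), n_features)
for _ in range(2000):
    point = (rng.uniform(-1.0, 1.0), rng.uniform(-1.0, 1.0))
    model.update(point, occupied=point[0] < 0.0)  # each sample is seen once

p_occ = model.predict((-0.5, 0.0))
p_free = model.predict((0.5, 0.0))
print(p_occ, p_free)
```

Because the map is a function of continuous coordinates, it can be queried at any resolution without committing to a grid beforehand.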

Abstract:
For assistive robots, anticipating the future actions of humans is an essential task. This requires modeling both the evolution of the activities over time and the rich relationships between humans and the objects. Since the future activities of humans are quite ambiguous, robots need to assess all the future possibilities in order to choose an appropriate action. Therefore, a successful anticipation algorithm needs to compute all plausible future activities and their corresponding probabilities. In this paper, we address the problem of efficiently computing beliefs over future human activities from RGB-D videos. We present a new recursive algorithm that we call Recursive Conditional Random Field (rCRF) which can compute an accurate belief over a temporal CRF model. We use the rich modeling power of CRFs and describe a computationally tractable inference algorithm based on Bayesian filtering and structured diversity. In our experiments, we show that incorporating belief, computed via our approach, significantly outperforms the state-of-the-art methods, both in terms of accuracy and computation time.

Abstract:
Task-Space Inverse Dynamics (TSID) is a well-known optimization-based technique for the control of highly-redundant mechanical systems, such as humanoid robots. One of its main flaws is that it does not take into account any of the uncertainties affecting these systems: poor torque tracking, sensor noise, delays and model uncertainties. As a consequence, the resulting control trajectories may be feasible for the ideal system, but not for the real one. We propose to improve the robustness of TSID by modeling uncertainties in the joint torques as additive white random noise (similarly to LQG). This results in a stochastic optimization problem, in which we can maximize the probability of satisfying the inequality constraints (i.e. of being feasible). Since computing this probability is computationally expensive, we propose three ways to approximate it that are much faster to compute and that we can then use for online control (resolution time below 1 ms). Simulation results show that taking robustness into account greatly increases the chances of obtaining feasible control trajectories (even when the uncertainties affecting the system are not the ones modeled in the controller).

Abstract:
We consider an optimal stopping formulation of the mission monitoring problem, where a monitor vehicle must remain in close proximity to an autonomous robot that stochastically follows a pre-planned trajectory. This problem arises when autonomous underwater vehicles are monitored by surface vessels, and in a diverse range of other scenarios. The key problem characteristics we consider are that the monitor must remain stationary while observing the robot, and that the robot motion is modelled in general as a stochastic process. We propose a resolution-complete algorithm for this problem that runs in polynomial time. The algorithm is based on a sweep-plane approach and generates a motion plan that maximises the expected observation time. A variety of stochastic models may be used to represent the expected robot trajectory. We present results drawn from real AUV trajectories and Monte Carlo simulations that validate the correctness of our algorithm and its feasibility in practice.

Abstract:
The application of autonomous robots to efficiently locate small wildlife species has the potential to provide significant ecological insights not previously possible using traditional land-based survey techniques, and a basis for improved conservation policy and management. We present an approach for autonomously localizing radio-tagged wildlife using a small aerial robot. We present a novel two-point phased array antenna system that yields unambiguous bearing measurements and an associated uncertainty measure. Our estimation and information-based planning algorithms incorporate this bearing uncertainty to choose observation points that improve confidence in the location estimate. These algorithms run online in real time and we report experimental results that show successful autonomous localization of stationary radio tags and live radio-tagged birds.

Abstract:
This paper presents a new framework for the generation of high-speed running jumps to clear terrain obstacles in quadrupedal robots. Our methods enable the quadruped to autonomously jump over obstacles up to 40 cm in height within a single control framework. Specifically, we propose new control system components, layered on top of a low-level running controller, which actively modify the approach and select stance force profiles as required to clear a sensed obstacle. The approach controller enables the quadruped to end in a preferable state relative to the obstacle just before the jump. This multi-step gait planning is formulated as a multiple-horizon model predictive control problem and solved at each step through quadratic programming. Ground reaction force profiles to execute the running jump are selected through constrained nonlinear optimization on a simplified model of the robot that possesses polynomial dynamics. Exploiting the simplified structure of these dynamics, the presented method greatly accelerates the computation of otherwise costly function and constraint evaluations that are required during optimization. With these considerations, the new algorithms allow for online planning that is critical for reliable response to unexpected situations. Experimental results, for a stand-alone quadruped with on-board power and computation, show the viability of this approach, and represent important steps towards broader dynamic maneuverability in experimental machines.

Abstract:
In this paper we present a novel approach for the parameterization of the trajectory of a moving platform, which facilitates the development of real-time pose-estimation methods. The key idea of the proposed approach is the decoupling of the parameterization of the trajectory estimate from the parameterization of the error in this estimate. Specifically, we represent the trajectory estimate as usual, via a set of pose states, each associated with a sensor reading (e.g., a laser scan or an image). The novelty of our approach lies in the representation of the estimation errors, for which we employ B-splines. This decoupled formulation, which we term Decoupled Estimate-Error Parameterization (DEEP), offers two key advantages. First, the use of a pose-based representation of the trajectory allows us to represent arbitrarily complex trajectories. Second, the use of B-splines for error representation allows us to control the computational complexity of an estimator, by selecting the density of the knots of the B-spline. We empirically demonstrate that, in the problem of visual-inertial localization, the DEEP formulation leads to substantial computational gains, while incurring only a small loss of estimation performance.

Abstract:
When making contact with an object, a robot can use a tactile sensor consisting of a heating element and a temperature sensor to recognize the object's material based on conductive heat transfer from the tactile sensor to the object. When this type of tactile sensor has time to fully reheat prior to contact and the duration of contact is long enough to achieve a thermal steady state, numerous methods have been shown to perform well. In order to enable robots to more efficiently sense their environments and take advantage of brief contact events over which they lack control, we focus on the problem of material recognition from heat transfer given varying initial conditions and short-duration contact. We present both model-based and data-driven methods. For the model-based method, we modeled the thermodynamics of the sensor in contact with a material as contact between two semi-infinite solids. For the data-driven methods, we used three machine learning algorithms (SVM+PCA, k-NN+PCA, HMMs) with time series of raw temperature measurements and temperature change estimates. When recognizing 11 materials with varying initial conditions and 3-fold cross-validation, SVM+PCA outperformed all other methods, achieving 84% accuracy with 0.5 s of contact and 98% accuracy with 1.5 s of contact.
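
For intuition about the model-based method, the classical result for two semi-infinite solids brought into contact can be written down directly: the interface settles at an effusivity-weighted average of the two initial temperatures, and the temperature inside each body follows an error-function profile. The sketch below uses these textbook formulas; the function names and the numerical values are illustrative assumptions, not parameters from the paper.

```python
import math


def effusivity(k, rho, c):
    """Thermal effusivity sqrt(k * rho * c) from conductivity, density and specific heat."""
    return math.sqrt(k * rho * c)


def contact_temperature(t_sensor, e_sensor, t_material, e_material):
    """Interface temperature of two semi-infinite solids in perfect contact."""
    return (e_sensor * t_sensor + e_material * t_material) / (e_sensor + e_material)


def temperature_at_depth(t_initial, t_contact, depth, alpha, time):
    """Temperature inside one solid at `depth` (m) after `time` (s) of contact.

    alpha is the thermal diffusivity (m^2/s); time must be positive.
    """
    return t_contact + (t_initial - t_contact) * math.erf(
        depth / (2.0 * math.sqrt(alpha * time))
    )


# Hypothetical values: a heated sensor at 35 C touches a material at 25 C.
e_sensor, e_material = 500.0, 1500.0
t_c = contact_temperature(35.0, e_sensor, 25.0, e_material)
print(t_c)  # 27.5

# Temperature 0.1 mm inside the sensor after 0.5 s, with an assumed diffusivity.
print(temperature_at_depth(35.0, t_c, 1e-4, 1e-7, 0.5))
```

A material with higher effusivity pulls the interface temperature closer to its own temperature, which is the cue that lets different materials be told apart from the sensor's temperature trace.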

Abstract:
The performance of a state lattice motion planning algorithm depends critically on the resolution of the lattice to ensure a balance between solution quality and computation time. There is currently no theoretical basis for selecting the resolution because of its dependence on the robot dynamics and the distribution of obstacles. In this paper, we examine the problem of motion planning on a resolution-constrained lattice for a robot with non-linear dynamics operating in an environment with randomly generated disc-shaped obstacles sampled from a homogeneous Poisson process. We present a unified framework for computing explicit solutions to two problems: (i) the critical planning resolution, which guarantees the existence of an infinite collision-free trajectory in the search graph, and (ii) the critical speed limit, which guarantees infinite collision-free motion. In contrast to the techniques used by Karaman and Frazzoli, we use a novel approach that maps the problem to parameters of directed asymmetric hexagonal lattice bond percolation. Since standard percolation theory offers no results for this lattice, we map the lattice to an infinite absorbing Markov chain and use results pertaining to its survival to obtain bounds on the parameters. As a result, we are able to derive theoretical expressions that relate the non-linear dynamics of a robot, the resolution of the search graph and the density of the Poisson process. We validate the theoretical bounds using Monte Carlo simulations for single integrator and curvature-constrained systems, and are able to independently validate the previous results presented by Karaman and Frazzoli using the novel connections introduced in this paper.

Abstract:
In this paper, we present a square-root inverse sliding window filter (SR-ISWF) for vision-aided inertial navigation systems (VINS). While regular inverse filters suffer from numerical issues, employing their square-root equivalent enables the usage of single-precision number representations, thus achieving considerable speedups as compared to double-precision alternatives on resource-constrained mobile platforms. Besides a detailed description of the SR-ISWF for VINS, which focuses on the numerical procedures that exploit the problem's structure to gain efficiency, this paper presents a thorough validation of the algorithm's processing requirements and achieved accuracy. In particular, experiments are conducted using a commercial-grade cell phone, where the proposed algorithm is shown to achieve the same level of estimation accuracy as state-of-the-art VINS algorithms, with significantly higher speed.

Abstract:
This paper builds on recent work on rapidly exponentially stabilizing control Lyapunov functions (RES-CLF) and control Lyapunov function based quadratic programs (CLF-QP) for underactuated hybrid systems. The primary contribution of this paper is a robust control technique for underactuated hybrid systems, with application to bipedal walking, that is able to track desired trajectories under significant model perturbation (mass and inertia increased by up to 200%). We evaluate our proposed control design on a model of RABBIT, a five-link planar bipedal robot.

Abstract:
We build on previous works advocating the use of the Gravito-Inertial Wrench Cone (GIWC) as a general contact stability criterion (a ZMP for non-coplanar contacts). We show how to compute this wrench cone from the friction cones of contact forces by using an intermediate representation, the surface contact wrench cone, which is the minimal representation of contact stability for each surface contact. The observation that the GIWC needs to be computed only once per stance leads to particularly efficient algorithms, as we illustrate in two important problems for humanoids: testing robust static equilibrium and time-optimal path parameterization. We show, through theoretical analysis and in physical simulations, that our method is more general and/or outperforms existing ones.

Abstract:
Natural substrates are often composed of particulates of varying size, from fine sand to pebbles and boulders. Robot locomotion on such heterogeneous substrates is complicated in part due to large force and kinematic fluctuations introduced by heterogeneities. To systematically explore how heterogeneity affects locomotion, we study the movement of a hexapedal robot (15 cm, 150 g) in a trackway filled with ~1 mm sand, with a larger convex boulder of various shape and roughness embedded within. We investigate how the presence of the boulder affects the robot's trajectory. To do so we develop a fully-automated terrain creation system, the SCATTER (Systematic Creation of Arbitrary Terrain and Testing of Exploratory Robots), to control the initial conditions of the substrate, including sand compaction, boulder distribution, and substrate inclination. Analysis of the robot's trajectory indicates that the interaction with a boulder can be modeled as a scatterer with attractive and repulsive features. Depending on the contact position on the boulder, the robot will be scattered in different directions after the interaction. The trajectory of an individual interaction depends sensitively on the initial conditions, but remarkably this dependence of scattering angle upon initial contact location is universal over a wide range of boulder properties. For a larger heterogeneous field with multiple scatterers, the trajectory of the robot can be estimated using a superposition of the scattering angles from each scatterer. This scattering superposition can be applied to a variety of complex terrains, including heterogeneities of different geometry, orientation, and texture. Our results can aid in the development of both deterministic and statistical descriptions of robot locomotion, control and path planning in complex terrain.

Abstract:
We describe Chisel, a system for real-time house-scale (300 square meters or more) dense 3D reconstruction onboard a Google Tango mobile device by using a dynamic spatially-hashed truncated signed distance field for mapping, and visual-inertial odometry for localization. By aggressively culling parts of the scene that do not contain surfaces, we avoid needless computation and wasted memory. Even under very noisy conditions, we produce high-quality reconstructions through the use of space carving. We are able to reconstruct and render very large scenes at a resolution of 2-3 cm in real time on a mobile device without the use of GPU computing. The user is able to view and interact with the reconstruction in real time through an intuitive interface. We provide both qualitative and quantitative results on publicly available RGB-D datasets, and on datasets collected in real time from two devices.

Abstract:
In everyday applications of robotics, people will likely interact with groups of robots. Most human-robot interaction (HRI) research to date, however, has studied humans interacting with individual robots. Initial research suggests that humans respond differently to individual robots and robots in groups, making responses to groups of robots critical to study. This paper presents a study performed in a public setting familiar to participants (university cafeterias) to examine how participants respond when robots, individually and in groups, enter their space. We examined participant survey and behavioral responses to different numbers of robots (Single or Group) with different behaviors (Social or Functional). Because robots will be used across cultures, we performed the study in Japan and the USA. Across cultures, we found that people interact more with robots in groups than single robots, yet report similar levels of liking for both; participants also rated social robots as more friendly and helpful than functional robots in general. They rated single social robots more positively than a group of social robots, but a group of functional robots more positively than single functional robots. Japanese participants reported liking the robots more than USA participants. This suggests that researchers and designers should be aware of how robot characteristics influence group effects.