Companies like Tesla, Uber, Cruise, and Waymo promise a future where cars are essentially mobile robots that can take us anywhere with a few taps on a smartphone. But a new category of vehicles is about to overtake self-driving cars in that leap into the future. Autonomous trucks have been quietly making just as much, if not more, progress toward commercial deployment, and their impact on the transportation of goods will no doubt be profound.
Among nearly a dozen companies developing autonomous trucking, San Diego–based TuSimple is trying to get ahead by combining unique technology with a series of strategic partnerships. Working with truck manufacturer Navistar as well as shipping giant UPS, TuSimple is already conducting test operations in Arizona and Texas, including depot-to-depot autonomous runs. These are being run under what’s known as “supervised autonomy,” in which somebody rides in the cab and is ready to take the wheel if needed. Sometime in 2021, the startup plans to begin doing away with human supervision, letting the trucks drive themselves from pickup to delivery without anybody on board.
Both autonomous cars and autonomous trucks rely on similar underlying technology: Sensors—typically cameras, lidars, and radars—feed data to a computer, which in turn controls the vehicle using skills learned through a massive amount of training and simulation. In principle, developing an autonomous truck can be somewhat easier than developing an autonomous car. That’s because unlike passenger vehicles, trucks—in particular long-haul tractor-trailers—generally follow fixed routes and spend most of their time on highways that are more predictable and easier to navigate than surface streets. Trucks are also a better platform for autonomy, with their large size providing more power for computers and an improved field of view for sensors, which can be mounted higher off the ground.
TuSimple claims that its approach is unique because its equipment is purpose built from the ground up for trucks. “Most of the other companies in this space got the seeds of their ideas from the DARPA Grand and Urban Challenges for autonomous vehicles,” says Chuck Price, chief product officer at TuSimple. “But the dynamics and functional behaviors of trucks are very different.”
The biggest difference is that trucks need to be able to sense conditions farther in advance, to allow for their longer stopping distance. The 200-meter practical range of lidar that most autonomous cars use as their primary sensor is simply not good enough for a fully loaded truck traveling at 120 kilometers per hour. Instead, TuSimple relies on multiple HD cameras that are looking up to 1,000 meters ahead whenever possible. The system detects other vehicles and calculates their trajectories at that distance, which Price says is approximately twice as far out as professional truck drivers look while driving.
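The range argument can be checked with rough arithmetic. The sketch below is illustrative only: the reaction delay and the braking deceleration are assumed values chosen for the example, not figures from TuSimple, and the function names are hypothetical.

```python
def stopping_distance_m(speed_kmh, reaction_time_s, decel_mps2):
    """Distance covered during the reaction delay plus braking to a full stop."""
    v = speed_kmh / 3.6  # convert km/h to m/s
    return v * reaction_time_s + v ** 2 / (2.0 * decel_mps2)


def lookahead_time_s(sensor_range_m, speed_kmh):
    """Seconds of warning a sensor with the given range provides at this speed."""
    return sensor_range_m / (speed_kmh / 3.6)


# Assumed values for illustration: 1.5 s reaction delay, 3.0 m/s^2 deceleration.
needed = stopping_distance_m(120, reaction_time_s=1.5, decel_mps2=3.0)
print(f"stopping distance: {needed:.0f} m")
print(f"200 m lidar range:   {lookahead_time_s(200, 120):.0f} s of warning")
print(f"1,000 m camera range: {lookahead_time_s(1000, 120):.0f} s of warning")
```

Under these assumptions the truck needs more than 200 meters to stop, so a 200-meter sensor leaves no margin, while a 1,000-meter view gives roughly five times as much warning.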
“I think there’s a big wave coming in the logistics industry that’s not necessarily well appreciated,” says Tasha Keeney, an analyst at ARK Invest who specializes in autonomous technology. She explains that electrified autonomous trucks have the potential to reduce shipping expenses not only when compared with those of traditional trucking but also with those of rail, while offering the door-to-door service that rail cannot. “The relationships that TuSimple has made within the trucking industry are interesting—in the long term, vertically integrated, purpose-built vehicles will have a lot of advantages.”
By 2024, TuSimple plans to achieve Level 4 autonomy, meaning that its trucks will be able to operate without a human driver under limited conditions that may include time of day, weather, or premapped routes. At that point, TuSimple would start selling the trucks to fleet operators. Along the way, however, there are several other milestones the company must hit, beginning with its first “driver out” test in 2021, which Price describes as a critical real-world demonstration.
“This is no longer a science project,” he says. “It’s not research. It’s engineering. The driver-out demonstration is to prove to us, and to prove to the public, that it can be done.”
This article appears in the January 2021 print issue as “Robot Trucks Overtake Robot Cars.”
Uber is selling its robocar development operation to Aurora and calling the move a step forward in its quest for autonomous driving technology.
True, it will now have a big stake in the two companies’ combined robocar projects. But this is the latest wrinkle in the consolidation that is under way throughout the self-driving business. Uber itself has in the past been among the acquisitive companies.
But this news is not something Uber’s founder would have welcomed. And it puts the lie to two verities common just a few years back: that every big player in road transport needed its own robocar research unit and that the payoff would come soon—like, now.
The sale is valued at US $4 billion—a far cry, as Reuters reports, from the $7.5-billion valuation implicit in Uber’s deal last year to raise $1 billion from investors, including Toyota. But it’s still a hefty chunk of change, and that means Uber still has access to a trove of IP and, perhaps as important, robotics talent.
Indeed, when the Pittsburgh company’s disgraced founder and former CEO, Travis Kalanick, first bet big on self-driving tech in early 2015, he went on a hiring spree that gutted the robotics department of Carnegie Mellon University (CMU), also in Pittsburgh. Aurora is based there, too, so there won’t be an out-migration of roboticists from the area.
“Uber was losing $1 million to $2 million a day with no returns in sight in the near future,” says Raj Rajkumar, a professor of electrical and computer engineering at CMU. “Aurora has deep pockets; if it wins, Uber still wins. If not, at least this will cut the bleeding.”
Kalanick spent money like water, then was forced out in 2017 amid allegations of turning a blind eye to sexual harassment. The company’s stock remained a favorite with investors, hitting an all-time high just last week. At today’s market close the market capitalization stood at just under $94 billion, about half again as much as GM’s value. Yet Uber has never turned a profit.
Not turning a profit was once a feature rather than a bug. In the late 1990s, investors were able to put whatever valuation they pleased on a startup so long as that startup did not have any earnings. Without earnings, there can be no price-to-earnings ratio, and without that, my opinion of a company’s value is as good as yours.
In that tech bubble, and perhaps again in today’s mini-bubble, the value comes from somewhere else. It stems from matters unquantifiable—a certain feeling, a je ne sais quoi. In a word, hype.
Those 1990s dot-com startups were on to something, as we in these pandemic times, living off deliveries from FreshDirect, can testify. But they were somewhat ahead of their time. So is robocar tech.
Self-driving technology is clearly transforming the auto industry. Right now, you can buy cars that hew to their lanes, mitigate head-on crashes, warn of things lurking in your blind spots, and even check your eyes to make sure you are looking at the road. It was all too easy to project the trendline all the way to truly autonomous vehicles.
That’s what Uber (and a lot of other companies) did in 2016 when it said it would field such a true robocar in 2021. That’s three and a half weeks from now.
It was Uber, more than any other player, that showed the hollowness of such hype in 2018 when one of its robocars got into an industry-shaking crash in Arizona. The car, given control by a driver who wasn’t paying attention, killed a pedestrian. Across the country, experimental autonomous car fleets were temporarily grounded, and states that had offered lenient regulation began to tighten up. The coming of the driverless era suddenly got set back by years. It was a hard blow for those vendors of high-tech sensors and other equipment that can only make economic sense if manufactured at scale.
The industry is now in long-haul mode, and that requires it to concentrate its resources in a handful of places. Waymo, Aurora, GM, and a handful of big auto companies are obvious choices, and among the suppliers there is the odd clear success, like Luminar. The lidar maker had a stellar coming-out party on Wall Street last week, one that put Austin Russell, its 25-year-old founder, in the billionaires’ club.
But a shakeout is a shakeout. Despite its stellar market cap, Uber is still a ride-hailing app, not General Motors. The company was wise to get out now.
When race car drivers take tight turns at high speeds, they rely on their experience and gut feeling to hit the gas pedal without spinning out. But how does an autonomous race car make the same decision?
Currently, many autonomous cars rely on expensive external sensors to calculate a vehicle’s velocity and chance of sideslipping on the racetrack. In a different approach, one research team in Switzerland has recently developed a novel machine learning algorithm that harnesses measurements from simpler sensors. They describe their design in a study published August 14 in IEEE Robotics and Automation Letters.
As a race car takes a turn around the track, its forward and lateral velocity determines how well the tires grip the road—and how much sideslip occurs.
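For readers who want the quantity made concrete, sideslip is commonly expressed as the angle between the direction the car points and the direction it actually travels. The sketch below uses that textbook definition. It is not taken from the study, and the function names, sign convention, and the rear-axle variant are assumptions for illustration.

```python
import math


def sideslip_angle_deg(v_long, v_lat):
    """Angle between the car's heading and its velocity vector, in degrees."""
    return math.degrees(math.atan2(v_lat, v_long))


def rear_axle_sideslip_deg(v_long, v_lat, yaw_rate, dist_cg_to_rear):
    """Sideslip at the rear axle (assumed rigid-body kinematics).

    The lateral velocity at a point dist_cg_to_rear meters behind the center
    of gravity is reduced by yaw_rate * dist_cg_to_rear (yaw rate in rad/s).
    """
    v_lat_rear = v_lat - yaw_rate * dist_cg_to_rear
    return math.degrees(math.atan2(v_lat_rear, v_long))


# A car moving 15 m/s forward and 1 m/s sideways at its center of gravity.
print(sideslip_angle_deg(15.0, 1.0))
print(rear_axle_sideslip_deg(15.0, 1.0, yaw_rate=0.8, dist_cg_to_rear=0.8))
```

Both inputs are exactly the lateral and longitudinal velocities that the expensive sensors measure directly and that the learned estimator has to infer.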
“(Autonomous) race cars are typically equipped with special sensors that are very accurate, exhibit almost no noise, and measure the lateral and longitudinal velocity separately,” explains Victor Reijgwart, of the Autonomous Systems Lab at ETH Zurich and a co-creator of the new design.
These state-of-the-art sensors only require simple filters (or calculations) to estimate velocity and control sideslip. But, as Reijgwart notes, “Unfortunately, these sensors are heavy and very expensive—with single sensors often costing as much as an entry-level consumer car.”
His group, whose Formula Student team is named AMZ Racing, sought a novel solution. Their resulting machine learning algorithm relies on measurements from two ordinary inertial measurement units, the rotation speeds and motor torques at all four wheels, and the steering angle. They trained their model using real data from racing cars on flat, gravel, bumpy, and wet road surfaces.
In their study, the researchers compared their approach to the external velocity sensors that have been commonly used at multiple Formula Student Driverless events across Europe in 2019. Results show that the new approach demonstrates comparable performance when the cars are undergoing a high level of sideslip (at 10° at the rear axle), but offers several advantages. For example, the new approach is better at rejecting biases and outlier measurements. The results also show that the machine learning approach is 15 times better than using just simple algorithms with non-specialized sensors.
“But learning from data is a two-edged sword,” says Sirish Srinivasan, another AMZ Racing member at ETH Zurich. “While the approach works well when it has been used under circumstances that are similar to the data it was trained on, safe behavior of the [model] cannot yet be guaranteed when it is used in conditions that significantly differ from the training data.”
Some examples include unusual weather conditions, changes in tire pressure, or other unexpected events.
The AMZ Racing team participates in yearly Formula Student Driverless engineering competitions, and hopes to apply this technique in the next race.
In the meantime, the team is interested in further improving their technique. “Several open research questions remain, but we feel like the most central one would be how to deal with unforeseen circumstances,” says Reijgwart. “This is, arguably, a major open question for the machine learning community in general.”
He notes that adding more “common sense” to the model, which would give it more conservative but safe estimates in unforeseen circumstances, is one option. In a more complex approach, the model could perhaps be taught to predict its own uncertainty, so that it hands over control to a simpler but more reliable mode of calculation when the AI encounters an unfamiliar scenario.
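The second option, handing over to a simpler estimator when the model is unsure, might look roughly like the following. This is a minimal sketch of the idea as described, not the team's implementation. The names, the use of a standard deviation as the uncertainty measure, and the threshold are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class LearnedEstimate:
    velocity: float  # m/s, predicted by the learned model
    std_dev: float   # the model's own estimate of its uncertainty, m/s


def select_velocity(learned, fallback_velocity, max_std_dev):
    """Trust the learned estimate only while its self-reported uncertainty is low."""
    if learned.std_dev <= max_std_dev:
        return learned.velocity, "learned"
    # Unfamiliar conditions: fall back to the simpler, more conservative estimate.
    return fallback_velocity, "fallback"


# Familiar conditions: low uncertainty, so the learned value is used.
print(select_velocity(LearnedEstimate(14.2, 0.1), fallback_velocity=13.8, max_std_dev=0.5))
# Unfamiliar conditions: high uncertainty, so the simple estimator takes over.
print(select_velocity(LearnedEstimate(14.2, 2.0), fallback_velocity=13.8, max_std_dev=0.5))
```

The hard part, which this sketch takes for granted, is getting the model to report an uncertainty that is itself trustworthy outside the training data.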
No longer a rare sight, amphibious buses can now be found making a splash around the globe by providing tourists with a different view of local attractions. Even Naganohara, a small town in Gunma Prefecture, Japan, population 5,600, operates an amphibious tourist bus daily in and alongside the Yanba Dam nine months of the year.
And that’s the problem—the experience is less of a thrill year by year. So the town, an hour’s train journey northwest of Tokyo, hit on the idea of making the amphibious bus self-driving.
The amphibious bus, the property of the town, comprises a converted truck design combined with a ship’s bottom and carries 40 passengers. It uses the truck’s diesel engine on land and a separate ship engine to travel in the dam at 3.6 knots.
The Saitama Institute of Technology (SIT) is developing the self-driving technologies for both land and water, which are based on the open-source Autoware platform for autonomous cars and on controllers for modified Joy Cars.
“Joy Cars are joystick-controlled cars for disabled people that have been retrofitted with actuators and a joystick controller system,” says Daishi Watabe, director, Center for Self-Driving Technologies at SIT, who is heading the Yanba Smart Mobility Project. “They are the development of an industry-SIT collaboration.”
Tatsuma Okubo, a general manager at ITbook Holdings, is the project’s manager and describes the autonomous technology set-up as follows. A PC with the Autoware software installed takes in data from the various sensors including Lidar, cameras, and the Global Navigation Satellite System. The software uses a controller area network (CAN bus) to communicate the data to a vehicle motion controller that in turn controls two Joystick-controlled Joy System sub-control units: one for steering and one for accelerating and braking.
“Basically, our autonomous bus system substitutes voltage data from the joystick interface with voltage data from the Autoware electronic unit,” says Watabe. “We are developing two sets of remodeled Joy Car actuators for retrofitting in the Naganohara amphibious bus—one set for use on land, the other for water, which are remodeled land actuators.”
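The substitution Watabe describes can be pictured as a switch in front of the sub-control unit: the unit always receives a voltage, and only the source changes. The sketch below is purely illustrative. The voltage range, the neutral point, and every name in it are hypothetical and do not describe the actual Joy System or Autoware interfaces.

```python
def command_to_voltage(command, v_min=0.5, v_neutral=2.5, v_max=4.5):
    """Map a normalized command in [-1, 1] to a joystick-style voltage.

    The voltage values are assumed for illustration only.
    """
    command = max(-1.0, min(1.0, command))  # clamp out-of-range commands
    if command >= 0:
        return v_neutral + command * (v_max - v_neutral)
    return v_neutral + command * (v_neutral - v_min)


def signal_for_control_unit(autonomous, joystick_voltage, planner_command):
    """Pass through the human's joystick voltage, or substitute the planner's."""
    if autonomous:
        return command_to_voltage(planner_command)
    return joystick_voltage


# Manual mode passes the joystick voltage through unchanged.
print(signal_for_control_unit(False, joystick_voltage=3.1, planner_command=0.0))
# Autonomous mode replaces it with a voltage derived from the planner's command.
print(signal_for_control_unit(True, joystick_voltage=3.1, planner_command=0.5))
```

Because the control unit sees the same kind of signal either way, the existing Joy Car actuators can be reused without knowing whether a person or software is steering.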
He says the autonomous control system will manage four major areas of control: vehicle water-in/water-out location-recognition; sensor-stabilization to counter ship rolling; self-localization techniques to manage changes in surrounding 3D views, given the water height in the dam can dramatically change; and a sonar-based obstacle-avoidance scheme. AI is also used to assist in obstacle detection, self-localization, and path planning.
When the dam was created, buildings and trees were left as they were.
“Given the height of the lake can change as much as 30 meters, we have to recognize underwater obstacles and driftwood to avoid any collisions,” says Watabe. “But because water permeability is low, cameras are not suitable. And Lidar doesn’t function well underwater. So we need to use sonar sensors for obstacle detection and path planning.”
What’s more, 3D views change according to the water level, while Lidar has no surrounding objects to reflect from when the bus is in the middle of the lake. “This means a simple scan-matching algorithm is not sufficient for self-localization,” explains Watabe. “So we’ll also use global navigation satellite data enhanced through real-time kinematic positioning and a gyro-based localization scheme.”
The biggest difficulty the project faces, according to Okubo, is the short construction period available, as they only have the off-season—December to March—this year and next to install and field test autonomous functionality.
Another challenge: Because winds and water flows can affect vehicle guidance, subtle handling of the vehicle is required when entering and exiting the water to ensure the underwater guardrails do not cause damage. Consequently, the group is developing a precise control system to govern the rudder and propulsion system.
“We’ll install and fine-tune the autonomous functionality during two off-season periods in 2020-21 and 2021-22,” says Watabe. “Then the plan is to conduct field tests with the public in February and March 2022.”
Besides tourism, Okubo says the technology has a huge potential to “revolutionize logistics” to Japan’s remote islands, which are facing a survival crisis due to declining populations. As an example, he says only a single driver (or no driver once full automation is introduced) would be necessary during goods transshipments to such islands. This should reduce costs and enable more frequent operations.
A four-lane street narrows to two to accommodate workers repairing a large pothole. One worker holds a stop sign loosely in his left hand as he waves cars through with his right. Human drivers don’t think twice about whether to follow the gesture or the sign; they move smoothly forward without stopping.
This situation, however, would likely stop an autonomous vehicle in its tracks. It would understand the stop sign and how to react, but that hand gesture? That’s a lot more complicated.
And drivers, human and computer, daily face this and far more complex situations in which reading body language is the key. Consider a city street corner: A pedestrian, poised to cross with the light, stops to check her phone and waves a right-turning car forward. Another pedestrian lifts a hand up to wave to a friend across the way, but keeps moving. A human driver can decode these gestures with a glance.
Navigating such challenges safely and seamlessly, without interrupting the flow of traffic, requires that autonomous vehicles understand the common hand motions used to guide human drivers through unexpected situations, along with the gestures and body language of pedestrians going about their business. These are signals that humans react to without much thought, but they present a challenge for a computer system that’s still learning about the world around it.
Autonomous-vehicle developers around the world have been working for several years to teach self-driving cars to understand at least some basic hand gestures, initially focusing on signals from cyclists. Generally, developers rely on machine learning to improve vehicles’ abilities to identify real-world situations and understand how to deal with them. At Cruise we gather that data from our fleet of more than 200 self-driving cars. These vehicles have logged hundreds of thousands of miles every year for the past seven years; before the pandemic hit, they were on the road around the clock, taking breaks only to recharge (our cars are all-electric) and for regular maintenance. Our cars are learning fast because they are navigating the hilly streets of San Francisco, one of the most complex driving environments in the United States.
But we realized that our machine-learning models don’t always have enough training data because our cars don’t experience important gestures in the real world often enough. Our vehicles need to recognize each of these situations from different angles and distances and under different lighting conditions—a combination of constraints that produces a huge number of possibilities. It would take us years to gain enough information on these events if we relied only on the real-world experiences of our vehicles.
We at Cruise found a creative solution to the data gap: motion capture (or mo-cap) of human gestures, a technique that game developers use to create characters. Cruise has been hiring game developers—including me—for expertise in simulating detailed worlds, and some of us took on the challenge of capturing data to use in teaching our vehicles to understand gestures.
First, our data-collection team set out to build a comprehensive list of the ways people use their bodies to interact with the world and with other people—when hailing a taxi, say, talking on a phone while walking, or stepping into the street to dodge sidewalk construction. We started with movements that an autonomous vehicle might misconstrue as an order meant for itself—for example, that pedestrian waving to a friend. We then moved on to other gestures made in close proximity to the vehicle but not directed at it, such as parking attendants waving cars in the lane next to the vehicle into a garage and construction workers holding up a sign asking cars to stop temporarily.
Ultimately, we came up with an initial list of five key messages that are communicated using gestures: stop, go, turn left, turn right, and what we call “no”—that is, common motions that aren’t relevant to a passing car, like shooting a selfie or removing a backpack. We used the generally accepted American forms of these gestures, assuming that cars will be driving on the right, because we’re testing in San Francisco.
Of course, the gestures people use to send these messages aren’t uniform, so we knew from the beginning that our data set would have to contain a lot more than just five examples. Just how many, we weren’t sure.
Creating that data set required the use of motion-capture technology. There are two types of mo-cap systems—optical and nonoptical. The optical version of mo-cap uses cameras distributed over a large gridlike structure that surrounds a stage; the video streams from these cameras can be used to triangulate the 3D positions of visual markers on a full-body suit worn by an actor. There are several variations of this system that can produce extremely detailed captures, including those of facial expressions. That’s the kind that allows movie actors to portray nonhuman characters, as in the 2009 movie Avatar, and lets the gaming industry record the movements of athletes for the development of sports-themed video games.
Optical motion capture, however, must be performed in a studio setting with a complex multicamera setup. So Cruise selected a nonoptical, sensor-based version of motion capture instead. This technology, which relies on microelectromechanical systems (MEMS), is portable, wireless, and doesn’t require dedicated studio space. That gives us a lot of flexibility and allows us to take it out of the studio and into real-world locations.
Our mo-cap suits each incorporate 19 sensor packages attached at key points of the body, including the head and chest and each hip, shoulder, upper arm, forearm, and leg. Each package is about the size of a silver dollar and contains an accelerometer, a gyroscope, and a magnetometer. These are all wired to a belt containing a battery pack, a control bus, and a Wi-Fi radio. The sensor data flows wirelessly to a laptop running dedicated software, which lets our engineers view and evaluate the data in real time.
We recruited five volunteers of varying body characteristics—including differences in height, weight, and gender—from the Cruise engineering team, had them put the suits on, and took them to places that were relatively free from electronic interference. Each engineer-actor began by assuming a T-pose (standing straight, with legs together and arms out to the side) to calibrate the mo-cap system. From there, the actor made one gesture after another, moving through the list of gestures our team had created from our real-world data. Over the course of seven days, we had these five actors run through this gesture set again and again, using each hand separately and in some cases together. We also asked our actors to express different intensities. For example, the intensity would be high for a gesture signaling an urgent stop to a car that’s driving too fast in a construction zone. The intensity would be lower for a movement indicating that a car should slow down and come to a gradual stop. We ended up with 239 thirty-second clips.
Then our engineers prepared the data to be fed into machine-learning models. First, they verified that all gestures had been correctly recorded without additional noise and that no incorrectly rotated sensors had provided bad data. Then the engineers ran each gesture sequence through software that identified the joint position and orientation of each frame in the sequence. Because these positions were originally captured in three dimensions, the software could calculate multiple 2D perspectives of each sequence; that capability allowed us to expand our gesture set by incrementally rotating the points to simulate 10 different viewpoints. We created even more variations by randomly dropping various points of the body—to simulate the real-world scenarios in which something is hiding those points from view—and again incrementally rotating the remaining points to create different viewing angles.
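The augmentation described above can be sketched in a few lines. This is a simplified illustration rather than the software we used: rotating about the vertical axis, the orthographic projection, the evenly spaced angles, and all the names are assumptions made for the example.

```python
import math
import random


def rotate_about_vertical(joints, angle_deg):
    """Rotate named 3D joint positions (x, y, z) about the vertical (y) axis."""
    a = math.radians(angle_deg)
    c, s = math.cos(a), math.sin(a)
    return {name: (c * x + s * z, y, -s * x + c * z)
            for name, (x, y, z) in joints.items()}


def project_to_2d(joints):
    """Orthographic projection: keep x and y, discard depth."""
    return {name: (x, y) for name, (x, y, _z) in joints.items()}


def make_views(joints, n_views=10, drop_prob=0.0, seed=0):
    """Produce n_views 2D poses, optionally hiding joints to mimic occlusion."""
    rng = random.Random(seed)
    visible = {name: p for name, p in joints.items() if rng.random() >= drop_prob}
    step = 360.0 / n_views
    return [project_to_2d(rotate_about_vertical(visible, i * step))
            for i in range(n_views)]


# One frame of a pose with two joints, positions in meters.
frame = {"head": (0.0, 1.7, 0.0), "right_hand": (0.8, 1.4, 0.0)}
for view in make_views(frame):
    print(view["right_hand"])
```

Each recorded frame yields many training samples this way, which is how a few hundred clips can stand in for a far larger set of real-world sightings.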
Besides giving us a broad range of gestures performed by different people and seen from different perspectives, motion capture also gave us remarkably clean data: The skeletal structure of human poses is consistent no matter what the style or color of clothing or the lighting conditions may be. This clean data let us train our machine-learning system more efficiently.
Once our cars are trained on our motion-captured data, they will be better equipped to navigate the various scenarios that city driving presents. One such case is road construction. San Francisco always has a plethora of construction projects under way, which means our cars face workers directing traffic very often. Using our gesture-recognition system, our cars will be able to maneuver safely around multiple workers while comprehending their respective hand gestures.
Take, for example, a situation in which three road workers are blocking the lane that a self-driving car was planning to take. One of the workers is directing traffic and the other two are assessing road damage. The worker directing traffic has a sign in one hand; it has eight sides like a stop sign but reads “SLOW.” With the other hand he motions to traffic to move forward. To cross the intersection safely, our self-driving vehicle will recognize the person as someone controlling traffic. The vehicle will correctly interpret his gestures to mean that it should shift into the other lane, move forward, and ignore the car that’s coming to a stop at the opposite side of the intersection but appears to have the right-of-way.
In another situation, our vehicles will realize that someone entering an intersection and ignoring the flashing “Don’t Walk” sign is in fact directing traffic, not a pedestrian crossing against the light. The car will note that the person is facing it, rather than presenting his side, as someone preparing to cross the street would do. It will note that one of the person’s arms is up and the other is moving so as to signal a vehicle to cross. It will even register assertive behavior. All these things together enable our car to understand that it can continue to move forward, even though it sees someone in the intersection.
Training our self-driving cars to understand gestures is only the beginning. These systems must be able to detect more than just the basic movements of a person. We are continuing to test our gesture-recognition system using video collected by our test vehicles as they navigate the real world. Meanwhile, we have started training our systems to understand the concept of humans carrying or pushing other objects, such as a bicycle. This is important because a human pushing a bicycle usually behaves differently from a human riding a bicycle.
We’re also planning to expand our data set to help our cars better understand cyclists’ gestures—for example, a left hand pointing up, with a 90-degree angle at the elbow, means the cyclist is going to turn right; a right arm pointing straight out means the same thing. Our cars already recognize cyclists and automatically slow down to make room for them. Knowing what their gestures mean, however, will allow our cars to make sure they give cyclists enough room to perform a signaled maneuver without stopping completely and creating an unnecessary traffic jam. (Our cars still look out for unexpected turns from cyclists who don’t signal their intent, of course.)
Self-driving cars will change the way we live our lives in the years to come. And machine learning has taken us a long way in this development. But creative use of technologies like motion capture will allow us to more quickly teach our self-driving fleet to better coexist in cities—and make our roads safer for all.
This article appears in the September 2020 print issue as “The New Driver’s Ed.”
Getting a car to drive itself is undoubtedly the most ambitious commercial application of artificial intelligence (AI). The research project was kicked into life by the 2004 DARPA Grand Challenge and then taken up as a business proposition, first by Alphabet, and later by the big automakers.
The industry-wide effort vacuumed up many of the world’s best roboticists and set rival companies on a multibillion-dollar acquisitions spree. It also launched a cycle of hype that paraded ever more ambitious deadlines—the most famous of which, made by Alphabet’s Sergey Brin in 2012, was that full self-driving technology would be ready by 2017. Those deadlines have all been missed.
Much of the exhilaration was inspired by the seeming miracles that a new kind of AI—deep learning—was achieving in playing games, recognizing faces, and transcribing speech. Deep learning excels at tasks involving pattern recognition—a particular challenge for older, rule-based AI techniques. However, it now seems that deep learning will not soon master the other intellectual challenges of driving, such as anticipating what human beings might do.
Among the roboticists who have been involved from the start are Gill Pratt, the chief executive officer of Toyota Research Institute (TRI), formerly a program manager at the Defense Advanced Research Projects Agency (DARPA); and Wolfram Burgard, vice president of automated driving technology for TRI and president of the IEEE Robotics and Automation Society. The duo spoke with IEEE Spectrum’s Philip Ross at TRI’s offices in Palo Alto, Calif.
This interview has been condensed and edited for clarity.
IEEE Spectrum: How does AI handle the various parts of the self-driving problem?
Gill Pratt: There are three different systems that you need in a self-driving car: It starts with perception, then goes to prediction, and then goes to planning.
The one that by far is the most problematic is prediction. It’s not prediction of other automated cars, because if all cars were automated, this problem would be much more simple. How do you predict what a human being is going to do? That’s difficult for deep learning to learn right now.
Spectrum: Can you offset the weakness in prediction with stupendous perception?
Wolfram Burgard: Yes, that is what car companies basically do. A camera provides semantics, lidar provides distance, radar provides velocities. But all this comes with problems, because sometimes you look at the world from different positions—that’s called parallax. Sometimes you don’t know which range estimate that pixel belongs to. That might make the decision complicated as to whether that is a person painted onto the side of a truck or whether this is an actual person.
With deep learning there is this promise that if you throw enough data at these networks, it’s going to work—finally. But it turns out that the amount of data that you need for self-driving cars is far larger than we expected.
Spectrum: When do deep learning’s limitations become apparent?
Pratt: The way to think about deep learning is that it’s really high-performance pattern matching. You have input and output as training pairs; you say this image should lead to that result; and you just do that again and again, for hundreds of thousands, millions of times.
Here’s the logical fallacy that I think most people have fallen prey to with deep learning. A lot of what we do with our brains can be thought of as pattern matching: “Oh, I see this stop sign, so I should stop.” But it doesn’t mean all of intelligence can be done through pattern matching.
For instance, when I’m driving and I see a mother holding the hand of a child on a corner and trying to cross the street, I am pretty sure she’s not going to cross at a red light and jaywalk. I know from my experience being a human being that mothers and children don’t act that way. On the other hand, say there are two teenagers—with blue hair, skateboards, and a disaffected look. Are they going to jaywalk? I look at that, you look at that, and instantly the probability in your mind that they’ll jaywalk is much higher than for the mother holding the hand of the child. It’s not that you’ve seen 100,000 cases of young kids—it’s that you understand what it is to be either a teenager or a mother holding a child’s hand.
You can try to fake that kind of intelligence. If you specifically train a neural network on data like that, you could pattern-match that. But you’d have to know to do it.
Spectrum: So you’re saying that when you substitute pattern recognition for reasoning, the marginal return on the investment falls off pretty fast?
Pratt: That’s absolutely right. Unfortunately, we don’t have the ability to make an AI that thinks yet, so we don’t know what to do. We keep trying to use the deep-learning hammer to hammer more nails—we say, well, let’s just pour more data in, and more data.
Spectrum: Couldn’t you train the deep-learning system to recognize teenagers and to assign the category a high propensity for jaywalking?
Burgard: People have been doing that. But it turns out that these heuristics you come up with are extremely hard to tweak. Also, sometimes the heuristics are contradictory, which makes it extremely hard to design these expert systems based on rules. This is where the strength of the deep-learning methods lies, because somehow they encode a way to see a pattern where, for example, here’s a feature and over there is another feature; it’s about the sheer number of parameters you have available.
Our separation of the components of a self-driving AI eases the development and even the learning of the AI systems. Some companies even think about using deep learning to do the job fully, from end to end, not having any structure at all—basically, directly mapping perceptions to actions.
Pratt: There are companies that have tried it; Nvidia certainly tried it. In general, it’s been found not to work very well. So people divide the problem into blocks, where we understand what each block does, and we try to make each block work well. Some of the blocks end up more like the expert system we talked about, where we actually code things, and other blocks end up more like machine learning.
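As a rough illustration of the block structure Pratt describes, the sketch below chains three stages: perception, prediction, and planning. It is a toy under stated assumptions, not Toyota's or anyone else's actual stack. The names `Detection`, `perceive`, `predict`, `plan`, and `drive_step`, the two-second horizon, and the 10-meter gap are all hypothetical. The prediction block uses constant-velocity extrapolation as a stand-in for a learned model, and the planning block is a hand-coded rule of the expert-system kind.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple


@dataclass
class Detection:
    """Output of the perception block: one road user ahead of our car."""
    kind: str          # e.g. "pedestrian" or "vehicle"
    distance_m: float  # current gap to our car, in meters
    speed_mps: float   # rate of change of the gap; negative means closing


def perceive(raw_frames: List[Dict]) -> List[Detection]:
    """Perception block. A real system would run a learned model on sensor
    data; this toy version only unpacks already-labeled records."""
    return [Detection(f["kind"], f["distance_m"], f["speed_mps"]) for f in raw_frames]


def predict(detections: List[Detection], horizon_s: float = 2.0) -> List[Tuple[Detection, float]]:
    """Prediction block. Constant-velocity extrapolation of the gap
    (an assumed placeholder for a learned behavior model)."""
    return [(d, max(0.0, d.distance_m + d.speed_mps * horizon_s)) for d in detections]


def plan(predictions: List[Tuple[Detection, float]], safe_gap_m: float = 10.0) -> str:
    """Planning block. A hand-written rule: brake if any predicted gap
    falls below the safety margin, otherwise keep cruising."""
    for _detection, future_gap_m in predictions:
        if future_gap_m < safe_gap_m:
            return "brake"
    return "cruise"


def drive_step(raw_frames: List[Dict]) -> str:
    """One pass through the pipeline: perception, then prediction, then planning."""
    return plan(predict(perceive(raw_frames)))


if __name__ == "__main__":
    frames = [{"kind": "pedestrian", "distance_m": 15.0, "speed_mps": -4.0}]
    print(drive_step(frames))
```

Because each block has a narrow interface, the extrapolation in `predict` could be swapped for a trained model without touching the rule in `plan`, which is the practical appeal of dividing the problem this way.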
Spectrum: So, what’s next—what new technique is in the offing?
Pratt: If I knew the answer, we’d do it. [Laughter]
Spectrum: You said that if all cars on the road were automated, the problem would be easy. Why not “geofence” the heck out of the self-driving problem, and have areas where only self-driving cars are allowed?
Pratt: That means putting in constraints on the operational design domain. This includes the geography—where the car should be automated; it includes the weather, it includes the level of traffic, it includes speed. If the car is going slow enough to avoid colliding without risking a rear-end collision, that makes the problem much easier. Street trolleys operate with traffic still in some parts of the world, and that seems to work out just fine. People learn that this vehicle may stop at unexpected times. My suspicion is, that is where we’ll see Level 4 autonomy in cities. It’s going to be in the lower speeds.
That’s a sweet spot in the operational design domain, without a doubt. There’s another one at high speed on a highway, because access to highways is so limited. But unfortunately there is still the occasional debris that suddenly crosses the road, and the weather gets bad. The classic example is when somebody irresponsibly ties a mattress to the top of a car and it falls off; what are you going to do? And the answer is that terrible things happen—even for humans.
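The operational design domain Pratt lists (geography, weather, traffic level, and speed) can be pictured as a set of limits that must all hold before automation is allowed to engage. The sketch below is a minimal illustration under assumed names and units. `OperationalDesignDomain`, `Conditions`, `violations`, and `may_engage` are hypothetical, as are the zone labels and threshold values. It is not drawn from any production system.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class OperationalDesignDomain:
    """Limits within which the automated system is designed to operate."""
    allowed_zones: FrozenSet[str]      # geography, e.g. named geofenced areas
    allowed_weather: FrozenSet[str]    # e.g. {"clear", "overcast"}
    max_speed_kph: float               # upper speed limit for automated driving
    max_traffic_density: float         # vehicles per km of road (assumed unit)


@dataclass(frozen=True)
class Conditions:
    """A snapshot of what the vehicle currently faces."""
    zone: str
    weather: str
    speed_kph: float
    traffic_density: float


def violations(odd: OperationalDesignDomain, now: Conditions) -> List[str]:
    """Return the names of every ODD constraint the current conditions break."""
    problems = []
    if now.zone not in odd.allowed_zones:
        problems.append("geography")
    if now.weather not in odd.allowed_weather:
        problems.append("weather")
    if now.traffic_density > odd.max_traffic_density:
        problems.append("traffic")
    if now.speed_kph > odd.max_speed_kph:
        problems.append("speed")
    return problems


def may_engage(odd: OperationalDesignDomain, now: Conditions) -> bool:
    """Automation may engage only when no constraint is violated."""
    return not violations(odd, now)


if __name__ == "__main__":
    low_speed_urban = OperationalDesignDomain(
        allowed_zones=frozenset({"downtown-loop"}),
        allowed_weather=frozenset({"clear", "overcast"}),
        max_speed_kph=30.0,
        max_traffic_density=50.0,
    )
    rainy = Conditions(zone="downtown-loop", weather="rain", speed_kph=25.0, traffic_density=20.0)
    print(violations(low_speed_urban, rainy))
```

Returning the list of broken constraints, rather than a bare yes or no, makes it easier to explain to a rider or an operator why the system declined to drive.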
Spectrum: Learning by doing worked for the first cars, the first planes, the first steam boilers, and even the first nuclear reactors. We ran risks then; why not now?
Pratt: It has to do with the times. During the era where cars took off, all kinds of accidents happened, women died in childbirth, all sorts of diseases ran rampant; the expected characteristic of life was that bad things happened. Expectations have changed. Now the chance of dying in some freak accident is quite low because of all the learning that’s gone on, the OSHA [Occupational Safety and Health Administration] rules, UL code for electrical appliances, all the building standards, medicine.
Furthermore—and we think this is very important—we believe that empathy for a human being at the wheel is a significant factor in public acceptance when there is a crash. We don’t know this for sure—it’s a speculation on our part. I’ve driven, I’ve had close calls; that could have been me that made that mistake and had that wreck. I think people are more tolerant when somebody else makes mistakes, and there’s an awful crash. In the case of an automated car, we worry that that empathy won’t be there.
Spectrum: Toyota is building a system called Guardian to back up the driver, and a more futuristic system called Chauffeur, to replace the driver. How can Chauffeur ever succeed? It has to be better than a human plus Guardian!
Pratt: In the discussions we’ve had with others in this field, we’ve talked about that a lot. What is the standard? Is it a person in a basic car? Or is it a person with a car that has active safety systems in it? And what will people think is good enough?
These systems will never be perfect—there will always be some accidents, and no matter how hard we try there will still be occasions where there will be some fatalities. At what threshold are people willing to say that’s okay?
Spectrum: You were among the first top researchers to warn against hyping self-driving technology. What did you see that so many other players did not?
Pratt: First, in my own case, during my time at DARPA I worked on robotics, not cars. So I was somewhat of an outsider. I was looking at it from a fresh perspective, and that helps a lot.
Second, [when I joined Toyota in 2015] I was joining a company that is very careful—even though we have made some giant leaps—with the Prius hybrid drive system as an example. Even so, in general, the philosophy at Toyota is kaizen—making the cars incrementally better every single day. That care meant that I was tasked with thinking very deeply about this thing before making prognostications.
And the final part: It was a new job for me. The first night after I signed the contract I felt this incredible responsibility. I couldn’t sleep that whole night, so I started to multiply out the numbers, all using a factor of 10. How many cars do we have on the road? Cars on average last 10 years, though ours last 20, but let’s call it 10. They travel on an order of 10,000 miles per year. Multiply all that out and you get 10 to the 10th miles per year for our fleet on Planet Earth, a really big number. I asked myself, if all of those cars had automated drive, how good would they have to be to tolerate the number of crashes that would still occur? And the answer was so incredibly good that I knew it would take a long time. That was five years ago.
Burgard: We are now in the age of deep learning, and we don’t know what will come after. We are still making progress with existing techniques, and they look very promising. But the gradient is not as steep as it was a few years ago.
Pratt: There isn’t anything that’s telling us that it can’t be done; I should be very clear on that. Just because we don’t know how to do it doesn’t mean it can’t be done.
The idea of tomorrow’s car as a computer on wheels is ripening apace. Today, Daimler and Nvidia announced that Daimler’s carmaking arm, Mercedes-Benz, will drop Nvidia’s newest computerized driving system into every car it sells, beginning in 2024.
The system, called the Drive AGX Orin, is a system-on-a-chip that was announced in December and is planned to ship in 2022. It’s an open system, but as adapted for Mercedes, it will be laden with specially designed software. The result, say the two companies, will be a software-defined car: Customers will buy a car, then periodically download new features, among them some that were not known at the time of purchase. This capability will be enhanced by using software instead of dedicated hardware in the form of a constellation of electronic control units, or ECUs.
“In modern cars, there can be 100, up to 125, ECUs,” said Danny Shapiro, Nvidia’s senior director of automotive. “Many of those will be replaced by software apps. That will change how different things function in the car—from windshield wipers to door locks to performance mode.”
The plan is to give cars a degree of self-driving competence comparable to Level 2 (where the car assists the driver) and Level 3 (where the driver can do other things as the car drives itself, while remaining ready to take back the wheel). The ability to park itself will be Level 4 (where there’s no need to mind the car at all, so long as it’s operating in a predefined comfort zone).
In all these matters, the Nvidia-Daimler partnership is following a trail that Tesla blazed and others have followed. Volkswagen plainly emulated Tesla in a project that is just now yielding its first fruit: an all-electric car that will employ a single master electronic architecture that will eventually power all of VW’s electric and self-driving cars. The rollout was slightly delayed because of glitches in software, as we reported last week.
Asked whether Daimler would hire a horde of software experts, as Volkswagen is doing, and thus become something of a software company, Bernhard Wardin, spokesman for Daimler’s autonomous driving and artificial intelligence division, said he had no comment. (Sounds like a “yes.”) He added that though a car model had been selected for the debut of the new system, its name was still under wraps.
One thing that strikes the eye is the considerable muscle of the system. The AGX Orin has 17 billion transistors, incorporating what Nvidia says are new deep learning and computer vision accelerators that “deliver 200 trillion operations per second—nearly 7 times the performance of Nvidia’s previous generation Xavier SoC.”
It’s interesting to note that the Xavier itself began to ship only last year, after being announced toward the end of 2016. On its first big publicity tour, at the 2017 Consumer Electronics Show, the Xavier was touted by Jen-Hsun Huang, the CEO of Nvidia. He spoke together with the head of Audi, which was going to use the Xavier in a “Level 4” car that was supposed to hit the road in three years. Yes, that would be now.
So, a chip 7 times more powerful, with far more advanced AI software, is now positioned to power a car with mere Level 2/3 capabilities—in 2024.
None of this is meant to single out Nvidia or Daimler for ridicule. It’s meant to ridicule the entire industry, both the tech people and the car people, who have been busy walking back expectations for self-driving for the past two years. And to ridicule us tech journalists, who have been backtracking right alongside them.
Cars that help the driver do the driving are here already. More help is on the way. Progress is real. But the future is still in the future: Robocar tech is harder than we thought.
In March, because of the coronavirus, self-driving car companies, including Argo, Aurora, Cruise, Pony, and Waymo, suspended vehicle testing and operations that involved a human driver. Around the same time, Waymo and Ford released open data sets of information collected during autonomous-vehicle tests and challenged developers to use them to come up with faster and smarter self-driving algorithms.
These developments suggest the self-driving car industry still hopes to make meaningful progress on autonomous vehicles (AVs) this year. But the industry is undoubtedly slowed by the pandemic and facing a set of very hard problems that have gotten no easier to solve in the interim.
Five years ago, several companies including Nissan and Toyota promised self-driving cars in 2020. Lauren Isaac, the Denver-based director of business initiatives at the French self-driving vehicle company EasyMile, says AV hype was “at its peak” back then—and those predictions turned out to be far too rosy.
Now, Isaac says, many companies have turned their immediate attention away from developing fully autonomous Level 5 vehicles, which can operate in any conditions. Instead, the companies are focused on Level 4 automation, which refers to fully automated vehicles that operate within very specific geographical areas or weather conditions. “Today, pretty much all the technology developers are realizing that this is going to be a much more incremental process,” she says.
For example, EasyMile’s self-driving shuttles operate in airports, college campuses, and business parks. Isaac says the company’s shuttles are all Level 4. Unlike Level 3 autonomy (which relies on a driver behind the wheel as its backup), the backup driver in a Level 4 vehicle is the vehicle itself.
“We have levels of redundancy for this technology,” she says. “So with our driverless shuttles, we have multiple levels of braking systems, multiple levels of lidars. We have coverage for all systems looking at it from a lot of different angles.”
Another challenge: There’s no consensus on the fundamental question of how an AV looks at the world. Elon Musk has famously said that any AV manufacturer that uses lidar is “doomed.” A 2019 Cornell research paper seemed to bolster the Tesla CEO’s controversial claim by developing algorithms that can derive from stereo cameras 3D depth-perception capabilities that rival those of lidar.
However, open data sets have called lidar doomsayers into doubt, says Sam Abuelsamid, a Detroit-based principal analyst in mobility research at the industry consulting firm Navigant Research.
Abuelsamid highlighted a 2019 open data set from the AV company Aptiv, which the AI company Scale then analyzed using two independent sources: The first considered camera data only, while the second incorporated camera plus lidar data. The Scale team found camera-only (2D) data sometimes drew inaccurate “bounding boxes” around vehicles and made poorer predictions about where those vehicles would be going in the immediate future—one of the most important functions of any self-driving system.
“While 2D annotations may look superficially accurate, they often have deeper inaccuracies hiding beneath the surface,” software engineer Nathan Hayflick of Scale wrote in a company blog about the team’s Aptiv data set research. “Inaccurate data will harm the confidence of [machine learning] models whose outputs cascade down into the vehicle’s prediction and planning software.”
Abuelsamid says Scale’s analysis of Aptiv’s data brought home the importance of building AVs with redundant and complementary sensors—and shows why Musk’s dismissal of lidar may be too glib. “The [lidar] point cloud gives you precise distance to each point on that vehicle,” he says. “So you can now much more accurately calculate the trajectory of that vehicle. You have to have that to do proper prediction.”
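To make Abuelsamid's point concrete, the sketch below shows one simple way that per-point distances could feed a trajectory estimate: average the lidar points on a vehicle into a centroid at two timestamps, difference the centroids to get a velocity, and extrapolate. This is an assumed, simplified method for illustration only, not the approach used by Scale or Aptiv. The function names, the 2D ground-plane coordinates, and the sample points are hypothetical.

```python
from typing import List, Tuple

Point = Tuple[float, float]  # (x, y) position in meters on the ground plane


def centroid(points: List[Point]) -> Point:
    """Average position of the lidar returns that fall on one vehicle."""
    if not points:
        raise ValueError("need at least one lidar point")
    n = len(points)
    return (sum(p[0] for p in points) / n, sum(p[1] for p in points) / n)


def estimate_velocity(points_t0: List[Point], points_t1: List[Point], dt_s: float) -> Point:
    """Velocity in m/s from the centroid shift between two scans dt_s apart."""
    if dt_s <= 0:
        raise ValueError("dt_s must be positive")
    x0, y0 = centroid(points_t0)
    x1, y1 = centroid(points_t1)
    return ((x1 - x0) / dt_s, (y1 - y0) / dt_s)


def extrapolate(position: Point, velocity: Point, horizon_s: float) -> Point:
    """Constant-velocity guess at where the vehicle will be after horizon_s."""
    return (position[0] + velocity[0] * horizon_s,
            position[1] + velocity[1] * horizon_s)


if __name__ == "__main__":
    scan_a = [(10.0, 0.0), (12.0, 0.0), (10.0, 2.0), (12.0, 2.0)]
    scan_b = [(11.0, 0.5), (13.0, 0.5), (11.0, 2.5), (13.0, 2.5)]
    velocity = estimate_velocity(scan_a, scan_b, dt_s=0.5)
    print(extrapolate(centroid(scan_b), velocity, horizon_s=2.0))
```

Any error in the measured range shifts the centroid, and the error in the velocity grows as the interval between scans shrinks, which is one reason precise distance matters so much for prediction.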
So how soon might the industry deliver self-driving cars to the masses? Emmanouil Chaniotakis is a lecturer in transport modeling and machine learning at University College London. Earlier this year, he and two researchers at the Technical University of Munich published a comprehensive review of all the studies they could find on the future of shared autonomous vehicles (SAVs).
They found the predictions—for robo-taxis, AV ride-hailing services, and other autonomous car-sharing possibilities—to be all over the map. One forecast had shared autonomous vehicles driving just 20 percent of all miles driven in 2040, while another model forecast them handling 70 percent of all miles driven by 2035.
So autonomous vehicles (shared or not), by some measures at least, could still be many years out. And it’s worth remembering that previous predictions proved far too optimistic.
This article appears in the May 2020 print issue as “The Road Ahead for Self-Driving Cars.”
For the past two months, the vegetables have arrived on the back of a robot. That’s how 16 communities in Zibo, in eastern China, have received fresh produce during the coronavirus pandemic. The robot is an autonomous van that uses lidars, cameras, and deep-learning algorithms to drive itself, carrying up to 1,000 kilograms in its cargo compartment.
The unmanned vehicle provides a “contactless” alternative to regular deliveries, helping reduce the risk of person-to-person infection, says Professor Ming Liu, a computer scientist at the Hong Kong University of Science and Technology (HKUST) and cofounder of Unity Drive Innovation, or UDI, the Shenzhen-based startup that developed the self-driving van.
Neolix, a maker of urban robo-delivery trucks, made an interesting claim recently. The Beijing-based company said orders for its self-driving delivery vehicles were soaring because the coronavirus epidemic had both cleared the roads of cars and opened the eyes of customers to the advantages of driverlessness. The idea is that when the epidemic is over, the new habits may well persist.
Neolix last week told Automotive News it had booked 200 orders in the past two months after having sold just 159 in the eight months before. And on 11 March, the company confirmed that it had raised US $29 million in February to fund mass production.
Of course, this flurry of activity could merely be coincidental to the epidemic, but Tallis Liu, the company’s manager of business development, maintains that it reflects changing attitudes in a time of plague.
“We’ve seen a rise in both acceptance and demand both from the general public and from the governmental institutions,” he tells IEEE Spectrum. The sight of delivery bots on the streets of Beijing is “educating the market” about “mobility as a service” and on “how it will impact people’s day-to-day lives during and after the outbreak.”
During the epidemic, Neolix has deployed 50 vehicles in 10 major cities in China to do mobile delivery and also disinfection service. Liu says that many of the routes were chosen because they include public roads that the lockdown on movement has left relatively empty.
The company’s factory has a production capacity of 10,000 units a year, and most of the factory staff has returned to their positions, Liu adds. “Having said that, we are indeed facing some delays from our suppliers given the ongoing situation.”
Neolix’s deliverybots are adorable—a term this site once used to describe a strangely similar-looking rival bot from the U.S. firm Nuro. The bots are the size of a small car, and they’re each equipped with cameras, three 16-channel lidar laser sensors, and one single-channel lidar. The low-speed version also has 14 ultrasonic short-range sensors; on the high-speed version, the ultrasonic sensors are supplanted by radars.
If self-driving technology benefits from the continued restrictions on movement in China and around the world, it wouldn’t be the first time that necessity had been the mother of invention. An intriguing example is furnished by a mere two-day workers’ strike on the London Underground in 2014. Many commuters, forced to find alternatives, ended up sticking with those workarounds even after Underground service resumed, according to a 2015 analysis by three British economists.
One of the researchers, Tim Willems of Oxford University, tells Spectrum that disruptions can induce permanent changes when three conditions are met. First, “decision makers are lulled into habits and have not been able to achieve their optimum (close to our Tube strike example).” Second, “there are coordination failures that make it irrational for any one decision maker to deviate from the status quo individually” and a disruption “forces everybody away from the status quo at the same time.” And third, the reluctance to pay the fixed costs required to set up a new way of doing things can be overcome under crisis conditions.
By that logic, many workers sent home for months on end to telecommute will stay on their porches or in their pajamas long after the all-clear signal has sounded. And they will vastly accelerate the move to online shopping, with package delivery of both the human and the nonhuman kind.
On Monday, New York City’s mayor, Bill de Blasio, said he was suspending his long-running campaign against e-bikes. “We are suspending that enforcement for the duration of this crisis,” he said. And perhaps forever.
As a transportation technology journalist, I’ve ridden in a lot of self-driving cars, both with and without safety drivers. A key part of the experience has always been a laptop or screen showing a visualization of other road users and pedestrians, using data from one or more laser-ranging lidar sensors.
Ghostly three-dimensional shapes made of shimmering point clouds appear at the edge of the screen, and are often immediately recognizable as cars, trucks, and people.
At first glance, the screen in Echodyne’s Ford Flex SUV looks like a lidar visualization gone wrong. As we explore the suburban streets of Kirkland, Washington, blurry points and smeary lines move across the display, changing color as they go. They bear little resemblance to the vehicles and cyclists I can see out of the window.
A lot of people in the auto industry talked for way too long about the imminent advent of fully self-driving cars.
In 2013, Carlos Ghosn, now very much the ex-chairman of Nissan, said it would happen in seven years. In 2016, Elon Musk, then chairman of Tesla, implied his cars could basically do it already. In 2017 and right through early 2019 GM Cruise talked 2019. And Waymo, the company with the most to show for its efforts so far, is speaking in more measured terms than it used just a year or two ago.
It’s all making Gill Pratt, CEO of the Toyota Research Institute in California, look rather prescient. A veteran roboticist who joined Toyota in 2015 with the task of developing robocars, Pratt from the beginning emphasized just how hard the task would be and how important it was to aim for intermediate goals—notably by making a car that could help drivers now, not merely replace them at some distant date.
That helpmate, called Guardian, is set to use a range of active safety features to coach a driver and, in the worst cases, to save him from his own mistakes. The more ambitious Chauffeur will one day really drive itself, though in a constrained operating environment. The constraints on the current iteration will be revealed at the first demonstration at this year’s Olympic games in Tokyo; they will certainly involve limits to how far afield and how fast the car may go.
Earlier this week, at TRI’s office in Palo Alto, Calif., Pratt and his colleagues gave Spectrum a walkaround look at the latest version of the Chauffeur, the P4; it’s a Lexus with a package of sensors neatly merging with the roof. Inside are two lidars from Luminar, a stereo camera, a mono camera (just to zero in on traffic signs), and radar. At the car’s front and corners are small Velodyne lidars, hidden behind a grille or folded smoothly into small protuberances. Nothing more could be glimpsed, not even the electronics that no doubt filled the trunk.
Pratt and his colleagues had a lot to say on the promises and pitfalls of self-driving technology. The easiest to excerpt is their view on the difficulty of the problem.
“There isn’t anything that’s telling us it can’t be done; I should be very clear on that,” Pratt says. “Just because we don’t know how to do it doesn’t mean it can’t be done.”
That said, though, he notes that early successes (using deep neural networks to process vast amounts of data) led researchers to optimism. In describing that optimism, he does not object to the phrase “irrational exuberance,” made famous during the 1990s dot-com bubble.
It turned out that the early successes came in those fields where deep learning, as it’s known, was most effective, like artificial vision and other aspects of perception. Computers, long held to be particularly bad at pattern recognition, were suddenly shown to be particularly good at it—even better, in some cases, than human beings.
“The irrational exuberance came from looking at the slope of the [graph] and seeing the seemingly miraculous improvement deep learning had given us,” Pratt says. “Everyone was surprised, including the people who developed it, that suddenly, if you threw enough data and enough computing at it, the performance would get so good. It was then easy to say that because we were surprised just now, it must mean we’re going to continue to be surprised in the next couple of years.”
The mindset was one of permanent revolution: The difficult, we do immediately; the impossible just takes a little longer.
Then came the slow realization that AI not only had to perceive the world—a nontrivial problem, even now—but also to make predictions, typically about human behavior. That problem is more than nontrivial. It is nearly intractable.
Of course, you can always use deep learning to do whatever it does best, and then use expert systems to handle the rest. Such systems use logical rules, input by actual experts, to handle whatever problems come up. That method also enables engineers to tweak the system—an option that the black box of deep learning doesn’t allow.
Putting deep learning and expert systems together does help, says Pratt. “But not nearly enough.”
Day-to-day improvements will continue no matter what new tools become available to AI researchers, says Wolfram Burgard, Toyota’s vice president for automated driving technology.
“We are now in the age of deep learning,” he says. “We don’t know what will come after—it could be a rebirth of an old technology that suddenly outperforms what we saw before. We are still in a phase where we are making progress with existing techniques, but the gradient isn’t as steep as it was a few years ago. It is getting more difficult.”
The facets of autonomous car development that automakers tend to get excited about are things like interpreting sensor data, decision making, and motion planning.
Unfortunately, if you want to make self-driving cars, there’s all kinds of other stuff that you need to get figured out first, and much of it is really difficult but also absolutely critical. Things like, how do you set up a reliable network inside of your vehicle? How do you manage memory and data recording and logging? How do you get your sensors and computers to all talk to each other at the same time? And how do you make sure it’s all stable and safe?
In robotics, the Robot Operating System (ROS) has offered an open-source solution for many of these challenges. ROS provides the groundwork for researchers and companies to build off of, so that they can focus on the specific problems that they’re interested in without having to spend time and money on setting up all that underlying software infrastructure first.
Apex.ai’s Apex.OS, which is having its version 1.0 release today, extends this idea from robotics to autonomous cars. It promises to help autonomous carmakers shorten their development timelines, and if it has the same effect on autonomous cars as ROS has had on robotics, it could help accelerate the entire autonomous car industry.
For more about what this 1.0 software release offers, we spoke with Apex.ai CEO Jan Becker.
IEEE Spectrum: What exactly can Apex.OS do, and what doesn’t it do?
Jan Becker: Apex.OS is a fork of ROS 2 that has been made robust and reliable so that it can be used for the development and deployment of highly safety-critical systems such as autonomous vehicles, robots, and aerospace applications. Apex.OS is API-compatible with ROS 2. In a nutshell, Apex.OS is an SDK for autonomous driving software and other safety-critical mobility applications. The components enable customers to focus on building their specific applications without having to worry about message passing, reliable real-time execution, hardware integration, and more.
Apex.OS is not a full [self-driving software] stack. Apex.OS enables customers to build their full stack based on their needs. We have built an automotive-grade 3D point cloud/lidar object detection and tracking component and we are in the process of building a lidar-based localizer, which is available as Apex.Autonomy. In addition, we are starting to work with other algorithmic component suppliers to integrate Apex.OS APIs into their software. These components make use of Apex.OS APIs, but are available separately, which allows customers to assemble a customized full software stack from building blocks such that it exactly fits their needs. The algorithmic components re-use the open architecture which is currently being built in the open source Autoware.Auto project.
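For readers unfamiliar with the message-passing pattern Becker mentions, the sketch below shows the general idea in plain Python: components publish to named topics and subscribe to the topics they care about, so a detector and a planner can be built and swapped independently. This is a generic in-process illustration only. It is not the Apex.OS or ROS 2 API, and the `MessageBus` class, the topic names, and the toy detector are all hypothetical.

```python
from collections import defaultdict
from typing import Any, Callable, DefaultDict, List

Callback = Callable[[Any], None]


class MessageBus:
    """A minimal in-process publish/subscribe bus (illustrative only)."""

    def __init__(self) -> None:
        self._subscribers: DefaultDict[str, List[Callback]] = defaultdict(list)

    def subscribe(self, topic: str, callback: Callback) -> None:
        """Register a callback to run for every message on the topic."""
        self._subscribers[topic].append(callback)

    def publish(self, topic: str, message: Any) -> None:
        """Deliver the message synchronously to each subscriber, in order."""
        for callback in list(self._subscribers[topic]):
            callback(message)


def make_detector(bus: MessageBus, max_range_m: float = 30.0) -> Callback:
    """Toy detection component: keeps lidar ranges inside max_range_m and
    republishes them on a hypothetical 'objects' topic."""
    def on_point_cloud(ranges_m: List[float]) -> None:
        nearby = [r for r in ranges_m if r <= max_range_m]
        bus.publish("objects", nearby)
    return on_point_cloud


if __name__ == "__main__":
    bus = MessageBus()
    planner_inbox: List[List[float]] = []

    bus.subscribe("lidar/points", make_detector(bus))
    bus.subscribe("objects", planner_inbox.append)  # stand-in for a planner

    bus.publish("lidar/points", [5.0, 42.0, 18.5])
    print(planner_inbox)
```

A production framework adds what this toy leaves out, such as delivery across processes and machines, bounded latency, and typed messages, which is the kind of underlying work the SDK is meant to take off a developer's plate.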
Spectrum: So if every autonomous vehicle company started using Apex.OS, those companies would still be able to develop different capabilities?
Becker: Apex.OS is an SDK for autonomous driving software and other safety-critical mobility applications. Just like iOS SDK provides an SDK for iPhone app developers enabling them to focus on the application, Apex.OS provides an SDK to developers of safety-critical mobility applications.
Every autonomous mobility system deployed into a public environment must be safe. We enable customers to focus on their application without having to worry about the safety of the underlying components. Organizations will differentiate themselves through performance, discrete features, and other product capabilities. By adopting Apex.OS, we enable them to focus on developing these differentiators.
Spectrum: What’s the minimum viable vehicle that I could install Apex.OS on and have it drive autonomously?
Becker: In terms of compute hardware, we showed Apex.OS running on a Renesas R-Car H3 and on a Quanta V3NP at CES 2020. The R-Car H3 contains just four ARM Cortex-A57 cores and four ARM Cortex-A53 cores and is the smallest ECU for which our customers have requested support. You can install Apex.OS on much smaller systems, but this is the smallest one we have tested extensively so far, and which is also powering our vehicle.
We are currently adding support for the Renesas R-Car V3H, which contains four ARM Cortex-A53 cores (and no ARM Cortex-A57 cores) and an additional image processing processor.
Spectrum: You suggest that Apex.OS is also useful for other robots and drones, in addition to autonomous vehicles. Can you describe how Apex.OS would benefit applications in these spaces?
Becker: Apex.OS provides a software framework that enables reading, processing, and outputting data on embedded real-time systems used in safety-critical environments. That pertains to robotics and aerospace applications just as much as to automotive applications. We simply started with automotive applications because of the stronger market pull.
Industrial robots today often run ROS for the perception system and a non-ROS embedded controller for highly accurate position control, because ROS cannot run the real-time controller with the necessary precision. Drones often run PX4 for the autopilot and ROS for the perception stack. Apex.OS combines the capabilities of ROS with the requirements of mobility systems, specifically regarding real-time performance, reliability, and the ability to run on embedded compute systems.
Spectrum: How will Apex contribute back to the open-source ROS 2 ecosystem that it’s leveraging within Apex.OS?
Becker: We have contributed back to the ROS 2 ecosystem from day one. Any and all bugs that we find in ROS 2 get fixed in ROS 2 and thereby contributed back to the open-source codebase. We also provide a significant amount of funding to Open Robotics to do this. In addition, we are on the ROS 2 Technical Steering Committee to provide input and guidance to make ROS 2 more useful for automotive applications. Overall we have a great deal of interest in improving ROS 2 not only because it increases our customer base, but also because we strive to be a good open-source citizen.
The features we keep in house pertain to making ROS 2 real-time, deterministic, tested, and certified on embedded hardware. Our goals are therefore somewhat orthogonal to the goals of an open-source project aiming to address as many applications as possible. We therefore live in a healthy symbiosis with ROS 2.
Yesterday I drove from Silicon Valley to San Francisco. It started raining on the way and I hadn’t thought to take an umbrella. No matter—I had the locations of two parking garages, just a block or so from my destination, preloaded into my navigation app. But both were full, and I found myself driving in stop-and-go traffic around crowded, wet, hilly, construction-heavy San Francisco, hunting for street parking or an open garage for nearly an hour. It was driving hell.
So when I finally arrived at a launch event hosted by Cruise, I couldn’t have been more receptive to the company’s pitch for Cruise Origin, a new vehicle that, Cruise executives say, is intended to ensure I won’t need to drive or park in a city ever again.
These systems don’t take control, not even in anticipation of a crash, as they do in many advanced driver assistance systems in cars. They leave a motorcyclist fully in command while offering the benefit of an extra pair of eyes.
Why drape high tech “rubber padding” over the motorcycle world? Because that’s where the danger is: Motorcyclists are 27 times more likely to die in a crash than are passengers in cars.
“It’s not a matter of if you’ll have an accident on a motorbike, but when,” says Damon chief executive Jay Giraud. “Nobody steps into motorbiking knowing that, but they learn.”
The Hypersport’s sensor suite includes cameras, radar, GPS, solid-state gyroscopes and accelerometers. It does not include lidar–“it’s not there yet,” Giraud says–but it does open the door a crack to another way of seeing the world: wireless connectivity.
The bike’s brains note everything that happens when danger looms, including warnings issued and evasive maneuvers taken, then shunts the data to the cloud via 4G wireless. For now that data is processed in batches, to help Damon refine its algorithms, a practice common among self-driving car researchers. Some day, it will share such data with other vehicles in real-time, a strategy known as vehicle-to-everything, or V2x.
But not today. “That whole world is 5-10 years away, at least,” Giraud grouses. “I’ve worked on this for over a decade, and we’re no closer today than we were in 2008.”
The bike has an onboard neural net whose settings are fixed at any given time. When the net up in the cloud comes up with improvements, these are sent as over-the-air updates to each motorcycle. The updates have to be approved by each owner before going live onboard.
When the AI senses danger it gives warning. If the car up ahead suddenly brakes, the handlebars shake, warning of a frontal collision. If a vehicle coming from behind enters the biker’s blind spot, LEDs flash. That saves the rider the trouble of constantly having to look back to check the blind spot.
The patterns the bike’s AI teases out from the data are not always comparable to those a self-driving car would care about. A motorcycle shifts from one half of a lane to the other; it leans down, sometimes getting fearsomely close to the pavement; and it is often hard for drivers in other vehicles to see.
One motorbike-centric problem is the high risk a biker takes just by entering an intersection. Some three-quarters of motorcycle accidents happen there, and of that number about two-thirds are caused by a car’s colliding from behind or from the side. The side collision, called a T-bone, is particularly bad because there’s nothing at all to shield the rider.
Certain traffic patterns increase the risk of such collisions. “Patterns that repeat allow our system to predict risk,” Giraud says. “As the cloud sees the tagged information again and again, we can use it to make predictions.”
Damon is taking pre-orders now and expects to start shipping in mid-2021. Like Tesla, it will deliver straight to the customer, with no dealers to get in the way.
Researchers have developed a new technique for tracking the hand movements of a non-attentive driver, to calculate how long it would take the driver to assume control of a self-driving car in an emergency.
If manufacturers can overcome the final legal hurdles, cars with Level 3 autonomous vehicle technology will one day be chauffeuring people from A to B. These cars allow a driver to have his or her eyes off the road and the freedom to do minor tasks (such as texting or watching a movie). However, these cars need a way of knowing how quickly—or slowly—a driver can respond when taking control during an emergency.
For decades, anyone who wanted to know whether a new car was safe to drive could simply put it through its paces, using tests established through trial and error. Such tests might investigate whether the car can take a sharp turn while keeping all four wheels on the road, brake to a stop over a short distance, or survive a collision with a wall while protecting its occupants.
But as cars take an ever greater part in driving themselves, such straightforward testing will no longer suffice. We will need to know whether the vehicle has enough intelligence to handle the same kind of driving conditions that humans have always had to manage. To do that, automotive safety-assurance testing has to become less like an obstacle course and more like an IQ test.
One obvious way to test the brains as well as the brawn of autonomous vehicles would be to put them on the road along with other traffic. This is necessary if only because the self-driving cars will have to share the road with the human-driven ones for many years to come. But road testing brings two concerns. First, the safety of all concerned can’t be guaranteed during the early stages of deployment; self-driving test cars have already been involved in fatal accidents. Second is the sheer scale that such direct testing would require.
That’s because most of the time, test vehicles will be driven under typical conditions, and everything will go as it normally does. Only in a tiny fraction of cases will things take a different turn. We call these edge cases, because they concern events that are at the edge of normal experience. Example: A truck loses a tire, which hops the median and careens into your lane, right in front of your car. Such edge cases typically involve a concurrence of failures that are hard to conceive of and are still harder to test for. This raises the question, How long must we road test a self-driving, connected vehicle before we can fairly claim that it is safe?
The answer to that question may never be truly known. What’s clear, though, is that we need other strategies to gauge the safety of self-driving cars. And the one we describe here—a mixture of physical vehicles and computer simulation—might prove to be the most effective way there is to evaluate self-driving cars.
A fatal crash occurs only once in about 160 million kilometers of driving, according to statistics compiled by the U.S. National Highway Traffic Safety Administration [PDF]. That’s a bit more than the distance from Earth to the sun (and 10 times as much as has been logged by the fleet of Google sibling Waymo, the company that has the most experience with self-driving cars). To travel that far, an autonomous car driving at highway speeds for 24 hours a day would need almost 200 years. It would take even longer to cover that distance on side streets, passing through intersections and maneuvering around parking lots. It might take a fleet of 500 cars 10 years to finish the job, and then you’d have to do it all over again for each new design.
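The arithmetic behind these figures can be checked with a short sketch. The 100 km/h highway speed is an assumption (the article does not state the speed it used), and the helper names are purely illustrative.

```python
HOURS_PER_YEAR = 24 * 365

def years_to_cover(distance_km, speed_kmh, fleet_size=1):
    """Years of round-the-clock driving for a fleet to log distance_km in total."""
    return distance_km / (speed_kmh * HOURS_PER_YEAR * fleet_size)

def implied_average_speed(distance_km, fleet_size, years):
    """Average speed (km/h, counted over every hour of the day) that a fleet
    would need in order to log distance_km in the given number of years."""
    return distance_km / (fleet_size * years * HOURS_PER_YEAR)

FATAL_CRASH_INTERVAL_KM = 160e6  # figure quoted in the article

# Assumption: 100 km/h stands in for "highway speeds".
single_car_years = years_to_cover(FATAL_CRASH_INTERVAL_KM, speed_kmh=100)
print(f"One car, nonstop at 100 km/h: about {single_car_years:.0f} years")

# The article's fleet scenario: 500 cars for 10 years.
fleet_speed = implied_average_speed(FATAL_CRASH_INTERVAL_KM, fleet_size=500, years=10)
print(f"500 cars over 10 years: about {fleet_speed:.2f} km/h averaged around the clock")
```

Under that assumed speed, the single-car result lands a little above 180 years, consistent with "almost 200 years." The low round-the-clock average in the fleet scenario reflects slower side-street driving and the hours when test cars are not on the road at all.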
Clearly, the industry must augment road testing with other strategies to bring out as many edge cases as possible. One method now in use is to test self-driving vehicles in closed test facilities where known edge cases can be staged again and again. Take, as an example, the difficulties posed by cars that run a red light at high speed. An intersection can be built as if it were a movie set, and self-driving cars can be given the task of crossing when the light turns green while at the same time avoiding vehicles that illegally cross in front of them.
While this approach is helpful, it also has limitations. Multiple vehicles are typically needed to simulate edge cases, and professional drivers may well have to pilot them. All this can be costly and difficult to coordinate. More important, no one can guarantee that the autonomous vehicle will work as desired, particularly during the early stages of such testing. If something goes wrong, a real crash could happen and damage the self-driving vehicle or even hurt people in other vehicles. Finally, no matter how ingenious the set designers may be, they cannot be expected to create a completely realistic model of the traffic environment. In real life, a tree’s shadow can confuse an autonomous car’s sensors and a radar reflection off a manhole cover can make the radar see a truck where none is present.
Computer simulation provides a way around the limitations of physical testing. Algorithms generate virtual vehicles and then move them around on a digital map that corresponds to a real-world road. If the data thus generated is then broadcast to an actual vehicle driving itself on the same road, the vehicle will interpret the data exactly as if it had come from its own sensors. Think of it as augmented reality tuned for use by a robot.
Although the physical test car is driving on empty roads, it “thinks” that it is surrounded by other vehicles. Meanwhile, it sends information that it is gathering—both from augmented reality and from its sensing of the real-world surroundings—back to the simulation platform. Real vehicles, simulated vehicles, and perhaps other simulated objects, such as pedestrians, can thus interact. In this way, a wide variety of scenarios can be tested in a safe and cost-effective way.
The idea for automotive augmented reality came to us by the back door: Engineers had already improved certain kinds of computer simulations by including real machines in them. As far back as 1999, Ford Motor Co. used measurements of an actual revving engine to supply data for a computer simulation of a power train. This hybrid simulation method was called hardware-in-the-loop, and engineers resorted to it because mimicking an engine in software can be very difficult. Knowing this history, it occurred to us that it would be possible to do the opposite—generate simulated vehicles as part of a virtual environment for testing actual cars.
In June 2017, we implemented an augmented-reality environment in Mcity, the world’s first full-scale test bed for autonomous vehicles. It occupies 32 acres on the North Campus of the University of Michigan, in Ann Arbor. Its 8 lane-kilometers (5 lane-miles) of roadway are arranged in sections having the attributes of a highway, a multilane arterial road, or an intersection.
Here’s how it works. The autonomous test car is equipped with an onboard device that can broadcast vehicle status, such as location, speed, acceleration, and heading, doing so every tenth of a second. It does this wirelessly, using dedicated short-range communications (DSRC), a standard similar to Wi-Fi that has been earmarked for mobile users. Roadside devices distributed around the testing facility receive this information and forward it to a traffic-simulation model, one that can simulate the testing facility by boiling it down to an equivalent network geometry that incorporates the actions of traffic signals. Once the computer model receives the test car’s information, it creates a virtual twin of that car. Then it updates the virtual car’s movements based on the movements of the real test car.
Feeding data from the real test vehicle into the computer simulation constitutes only half of the loop. We complete the other half by sending information about the various vehicles the computer has simulated to the test car. This is the essence of the augmented-reality environment. Every simulated vehicle also generates vehicle-status messages at a frequency of 10 hertz, which we forward to the roadside devices, which in turn broadcast them in real time. When the real test car receives that data, its vehicle-control system uses it to “see” all the virtual vehicles. To the car, these simulated entities are indistinguishable from the real thing.
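The two halves of the loop can be sketched in a few lines. This is a minimal illustration, not the Mcity software: the `VehicleStatus` and `TrafficSimulation` names, the message fields, and the straight-line motion model are all assumptions, and the DSRC radio link is left out entirely.

```python
import math
from dataclasses import dataclass

MESSAGE_PERIOD_S = 0.1  # one status message every tenth of a second (10 Hz)

@dataclass
class VehicleStatus:
    vehicle_id: str
    x: float        # local coordinates, meters
    y: float
    speed: float    # meters per second
    heading: float  # radians, 0 = +x direction
    simulated: bool

class TrafficSimulation:
    def __init__(self):
        self.vehicles = {}

    def add_simulated(self, status):
        self.vehicles[status.vehicle_id] = status

    def receive_from_real(self, status):
        # First half of the loop: create or refresh the real car's virtual twin.
        self.vehicles[status.vehicle_id] = status

    def step(self, dt=MESSAGE_PERIOD_S):
        # Advance only the simulated vehicles; the twin follows the real car.
        for v in self.vehicles.values():
            if v.simulated:
                v.x += v.speed * math.cos(v.heading) * dt
                v.y += v.speed * math.sin(v.heading) * dt

    def messages_for_real_car(self):
        # Second half of the loop: status of every simulated vehicle.
        return [v for v in self.vehicles.values() if v.simulated]

sim = TrafficSimulation()
sim.add_simulated(VehicleStatus("virtual-1", 0.0, 0.0, 10.0, 0.0, True))
sim.receive_from_real(VehicleStatus("test-car", -20.0, 0.0, 8.0, 0.0, False))
sim.step()
for msg in sim.messages_for_real_car():
    print(msg)
```

In a real deployment, each call to `receive_from_real` would be triggered by a message arriving from a roadside device, and the list returned by `messages_for_real_car` would be handed back to those devices for broadcast.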
By having vehicles pass messages through the roadside devices—that is, by substituting “vehicle-to-infrastructure” connections for direct “vehicle-to-vehicle” links—real vehicles and virtual vehicles can sense one another and interact accordingly. In the same fashion, traffic-signal status is also synchronized between the real and the simulated worlds. That way, real and virtual vehicles can each “look” at a given light and see whether it is green or red.
The status messages passed between real and simulated worlds include, of course, vehicle positions. This allows actual vehicles to be mapped onto the simulated road network, and simulated vehicles to be mapped into the actual road network. The positions of actual vehicles are represented with GPS coordinates—latitude, longitude, and elevation—and those of simulated vehicles with local coordinates—x, y, and z. An algorithm transforms one system of coordinates into the other.
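One simple way to convert between the two coordinate systems over an area as small as a test track is a flat-earth (equirectangular) approximation around a fixed origin. The sketch below assumes that approach; the article does not say which transformation is actually used, and the origin coordinates are placeholders rather than a surveyed Mcity reference point.

```python
import math

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius

def gps_to_local(lat, lon, elev, origin):
    """Convert latitude, longitude (degrees), and elevation (m) to local x, y, z in meters."""
    lat0, lon0, elev0 = origin
    x = math.radians(lon - lon0) * EARTH_RADIUS_M * math.cos(math.radians(lat0))
    y = math.radians(lat - lat0) * EARTH_RADIUS_M
    return x, y, elev - elev0

def local_to_gps(x, y, z, origin):
    """Inverse of gps_to_local under the same flat-earth approximation."""
    lat0, lon0, elev0 = origin
    lat = lat0 + math.degrees(y / EARTH_RADIUS_M)
    lon = lon0 + math.degrees(x / (EARTH_RADIUS_M * math.cos(math.radians(lat0))))
    return lat, lon, elev0 + z

# Hypothetical origin for illustration only.
ORIGIN = (42.3000, -83.7000, 270.0)

x, y, z = gps_to_local(42.3010, -83.6990, 271.5, ORIGIN)
print(f"local: x={x:.1f} m, y={y:.1f} m, z={z:.1f} m")
print("back to GPS:", local_to_gps(x, y, z, ORIGIN))
```

The approximation loses accuracy as distances from the origin grow, so a larger facility would more likely call for a proper map projection.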
But that mathematical transformation isn’t all that’s needed. There are small GPS and map errors, and they sometimes prevent a GPS position, forwarded from the actual test car and translated to the local system of coordinates, from appearing on a simulated road. We correct these errors with a separate mapping algorithm. Also, when the test car stops, we must lock it in place in the simulation, so that fluctuations in its GPS coordinates do not cause it to drift [PDF] out of position in the simulation.
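Both corrections can be illustrated with a small sketch: snapping a noisy position onto the nearest road segment, and freezing the virtual twin while the car is stopped. The segment representation, the `TwinPosition` class, and the 0.2 m/s stop threshold are assumptions made for illustration and are not taken from the authors' mapping algorithm.

```python
import math

def snap_to_segment(p, a, b):
    """Return the point on segment a-b that is closest to p (all 2D tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return a
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))  # clamp so the result stays on the segment
    return (ax + t * dx, ay + t * dy)

def snap_to_network(p, segments):
    """Snap p onto whichever road segment lies closest to it."""
    candidates = [snap_to_segment(p, a, b) for a, b in segments]
    return min(candidates, key=lambda q: math.dist(p, q))

class TwinPosition:
    """Tracks where the real car's virtual twin sits in the simulated network."""

    def __init__(self, segments, stop_speed=0.2):
        self.segments = segments
        self.stop_speed = stop_speed  # m/s; below this the car counts as stopped
        self.locked = None

    def update(self, raw_xy, speed):
        if speed < self.stop_speed:
            # Lock the twin at the first stopped position and ignore GPS jitter.
            if self.locked is None:
                self.locked = snap_to_network(raw_xy, self.segments)
            return self.locked
        self.locked = None
        return snap_to_network(raw_xy, self.segments)

road = [((0.0, 0.0), (100.0, 0.0))]
twin = TwinPosition(road)
print(twin.update((50.0, 1.5), speed=5.0))  # moving: snapped onto the road
print(twin.update((60.0, -1.0), speed=0.0))  # stopped: position is locked
print(twin.update((60.4, 0.3), speed=0.0))   # jitter while stopped is ignored
```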
Everything here depends on wireless communication. To ensure that it was reliable, we installed four roadside radios in Mcity, enough to cover the entire testing facility. The DSRC wireless standard, which operates in the 5.9-gigahertz band, gives us high data-transmission rates and very low latency. These are critical to safety at high speeds and during stop-on-a-dime maneuvers. DSRC is in wide use in Japan and Europe; it hasn’t yet gained much traction in the United States, although Cadillac is now equipping some of its cars with DSRC devices.
Whether DSRC will be the way cars communicate with one another is uncertain, though. Some people have argued that cellular communications, particularly in the coming 5G implementation, might offer equally low latency with a greater range. Whichever standard wins out, the communications protocols used in our system can easily be adapted to it.
We expect that the software framework we used to build our system will also endure, at least for a few years. We constructed our simulation with PTV Vissim, a commercial package developed in Germany to model traffic flow “microscopically,” that is, by simulating the behavior of each individual vehicle.
One thing that can be expected to change is the test vehicle, as other companies begin to use our system to put their own autonomous vehicles through their paces. For now, our one test vehicle is a Lincoln MKZ Hybrid, which is equipped with DSRC and thus fully connected. Drive-by-wire controls that we added to the car allow software to command the steering wheel, throttle, brake, and transmission. The car also carries multiple radars, lidars, cameras, and a GPS receiver with real-time kinematic positioning, which improves resolution by referring to a signal from a ground-based radio station.
We have implemented two testing scenarios. In the first one, the system generates a virtual train and projects it into the augmented reality perceived by the test car as the train approaches a mock-up of a rail crossing in Mcity. The point is to see whether the test car can stop in time and then wait for the train to pass. We also throw in other virtual vehicles, such as cars that follow the test car. These strings of cars—actual and virtual—can be formally arranged convoys (known as platoons) or ad hoc arrangements: perhaps cars queuing to get onto an entry ramp.
The second, more complicated testing scenario involves the case we mentioned earlier—running a red light. In the United States, cars running red lights cause more than a quarter of all the fatalities that occur at an intersection, according to the American Automobile Association. This scenario serves two purposes: to see how the test car reacts to traffic signals and also how it reacts to red-light-running scofflaws.
Our test car should be able to tell whether the signal is red or green and decide accordingly whether to stop or to go. It should also be able to notice that the simulated red-light runner is coming, predict its trajectory, and calculate when and where the test car might be when it crosses that trajectory. The test car ought to be able to do all these things well enough to avoid a collision.
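The core of that prediction can be sketched as a comparison of arrival times at the point where the two paths cross. This is a deliberately simplified illustration that assumes both vehicles hold a constant speed along straight paths; the function names and the two-second safety margin are hypothetical, and a real planner would use far richer trajectory models.

```python
def time_to_conflict(distance_m, speed_ms):
    """Seconds until a vehicle reaches the conflict point at constant speed."""
    if speed_ms <= 0:
        return float("inf")  # a stopped vehicle never arrives
    return distance_m / speed_ms

def should_yield(car_dist_m, car_speed_ms, runner_dist_m, runner_speed_ms, margin_s=2.0):
    """True if both vehicles would reach the conflict point within margin_s of each other."""
    t_car = time_to_conflict(car_dist_m, car_speed_ms)
    t_runner = time_to_conflict(runner_dist_m, runner_speed_ms)
    return abs(t_car - t_runner) < margin_s

# Test car 30 m from the crossing point at 10 m/s; red-light runner 45 m away at 15 m/s.
print(should_yield(30, 10, 45, 15))   # both arrive after 3 s
# Same test car, but the runner is still 150 m away.
print(should_yield(30, 10, 150, 15))  # runner arrives 7 s after the test car
```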
Because the computer running the simulation can fully control the actions of the red-light runner, it can generate a wide variety of testing parameters in successive iterations of the experiment. This is precisely the sort of thing a computer can do much more accurately than any human driver. And of course, the entire experiment can be done in complete safety because the lawbreaker is merely a virtual car.
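Generating those variations amounts to sweeping a grid of parameters. The sketch below shows the idea; the parameter names and values are invented for illustration and do not come from the authors' test plan.

```python
from itertools import product

# Hypothetical parameters for the virtual red-light runner.
runner_speeds_kmh = [40, 60, 80]
arrival_offsets_s = [-1.0, 0.0, 1.0]  # runner's arrival time relative to the test car
approach_sides = ["left", "right"]

scenarios = [
    {"runner_speed_kmh": speed, "arrival_offset_s": offset, "approach_side": side}
    for speed, offset, side in product(runner_speeds_kmh, arrival_offsets_s, approach_sides)
]

print(len(scenarios), "scenarios")
print(scenarios[0])
```

Each dictionary would configure one run of the experiment, so repeating a troublesome case exactly is just a matter of replaying the same entry.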
There is a lot more of this kind of edge-case simulation that can be done. For example, we can use the augmented-reality environment to evaluate the ability of test cars to handle complex driving situations, like turning left from a stop sign onto a major highway. The vehicle needs to seek gaps in traffic going in both directions, meanwhile watching for pedestrians who may cross at the sign. The car can decide to make a stop in the median first, or instead simply drive straight into the desired lane. This involves a decision-making process of several stages, all while taking into account the actions of a number of other vehicles (including predicting how they will react to the test car’s actions).
Another example involves maneuvers at roundabouts—entering, exiting, and negotiation for position with other cars—without help from a traffic signal. Here the test car needs to predict what other vehicles will do, decide on an acceptable gap to use to merge, and watch for aggressive vehicles. We can also construct augmented-reality scenarios with bicyclists, pedestrians, and other road users, such as farm machinery. The less predictable such alternative actors are, the more intelligence the self-driving car will need.
Ultimately, we would like to put together a large library of test scenarios including edge cases, then use the augmented-reality testing environment to run the tests repeatedly. We are now building up such a library with data scoured from reports of actual crashes, together with observations by sensor-laden vehicles of how people drive when they don’t know they’re part of an experiment. By putting together disparate edge conditions, we expect to create artificial edge cases that are particularly challenging for the software running in self-driving cars.
Thus armed, we ought to be able to see just how safe a given autonomous car is without having to drive it to the sun and back.
This article appears in the December 2019 print issue as “Augmented Reality for Robocars.”
About the Authors
Henry X. Liu is a professor of civil and environmental engineering at the University of Michigan, Ann Arbor, and a research professor at the University of Michigan Transportation Research Institute. Yiheng Feng is an assistant research scientist in the university’s Engineering Systems Group.
The three laws of robotic safety in Isaac Asimov’s science fiction stories seem simple and straightforward, but the ways the fictional tales play out reveal unexpected complexities. Writers of safety standards for self-driving cars express their goals in similarly simple terms. But several groups now developing standards for how autonomous vehicles will interact with humans and with each other face real-world issues much more complex than science fiction.
Advocates of autonomous cars claim that turning the wheel over to robots could slash the horrific toll of 1.3 million people killed around the world each year by motor vehicles. Yet the public has become wary because robotic cars also can kill. Documents released last week by the U.S. National Transportation Safety Board blame the March 2018 death of an Arizona pedestrian struck by a self-driving Uber on safety failures by the car’s safety driver, the company, and the state of Arizona. Even less-deadly safety failures are damning, like the incident where a Tesla in Autopilot mode wasn’t smart enough to avoid crashing into a stopped fire engine whose warning lights were flashing.
Safety standards for autonomous vehicles “are absolutely critical” for public acceptance of the new technology, says Greg McGuire, associate director of the Mcity autonomous vehicle testing lab at the University of Michigan. “Without them, how do we know that [self-driving cars] are safe, and how do we gain public trust?” Earning that trust requires developing standards through an open process that the public can scrutinize, and may even require government regulation, he adds.
Companies developing autonomous technology have taken notice. Earlier this year, representatives from 11 companies including Aptiv, Audi, Baidu, BMW, Daimler, Infineon, Intel, and Volkswagen collaborated to write a wide-ranging whitepaper titled “Safety First for Automated Driving.” They urged designing safety features into the automated driving function, and using heightened cybersecurity to assure the integrity of vital data including the locations, movement, and identification of other objects in the vehicle environment. They also urged validating and verifying the performance of robotic functions in a wide range of operating conditions.
On 7 November, the International Telecommunications Union announced the formation of a focus group called AI for Autonomous and Assisted Driving. Its aim: to develop performance standards for artificial intelligence (AI) systems that control self-driving cars. (The ITU has come a long way since its 1865 founding as the International Telegraph Union, with a mandate to standardize the operations of telegraph services.)
ITU intends the standards to be “an equivalent of a Turing Test for AI on our roads,” says focus group chairman Bryn Balcombe of the Autonomous Drivers Alliance. A computer passes a Turing Test if it can fool a person into thinking it’s a human. The AI test is vital, he says, to assure that human drivers and the AI behind self-driving cars understand each other and predict each other’s behaviors and risks.
A planning document says AI development should match public expectations so:
• AI never engages in careless, dangerous, or reckless driving behavior
• AI remains aware, willing, and able to avoid collisions at all times
• AI meets or exceeds the performance of a competent, careful human driver
These broad goals for automotive AI algorithms resemble Asimov’s laws, which bar robots from hurting humans and demand that they obey human commands and protect their own existence. But the ITU document includes a list of 15 “deliverables” including developing specifications for evaluating AIs and drafting technical reports needed for validating AI performance on the road.
A central issue is convincing the public to entrust the privilege of driving—a potentially life-and-death activity—to a technology which has suffered embarrassing failures like the misidentification of minorities that led San Francisco to ban the use of facial recognition by police and city agencies.
Testing how well an AI can drive is vastly complex, says McGuire. Human adaptability makes us fairly good drivers. “We’re not perfect, but we are very good at it, with typically a hundred million miles between fatal traffic crashes,” he says. Racking up that much distance in real-world testing is impractical, and it is but a fraction of the billions of vehicle miles needed for statistical significance. That’s a big reason developers have turned to simulations. Computers can help them run up the virtual mileage needed to find potential safety flaws that might arise only in rare situations, like in a snowstorm or heavy rain, or on a road under construction.
It’s not enough for an automotive AI to assure the vehicle’s safety, says McGuire. “The vehicle has to work in a way that humans would understand.” Self-driving cars have been rear-ended when they stopped in situations where most humans would not have expected a driver to stop. And a truck can be perfectly safe even when close enough to unnerve a bicyclist.
How long will it take to develop standards? “This is a research process,” says McGuire. “It takes as long as it takes” to establish public trust and social benefit. In the near term, Mcity has teamed with the city of Detroit, the U.S. Department of Transportation, and Verizon to test autonomous vehicles for transporting the elderly on city streets. But he says the field “needs to be a living thing that continues to evolve” over a longer period.
The Uber car that hit and killed Elaine Herzberg in Tempe, Ariz., in March 2018 could not recognize all pedestrians, and was being driven by an operator likely distracted by streaming video, according to documents released by the U.S. National Transportation Safety Board (NTSB) this week.
But while the technical failures and omissions in Uber’s self-driving car program are shocking, the NTSB investigation also highlights safety failures that include the vehicle operator’s lapses, lax corporate governance of the project, and limited public oversight.
This week, the NTSB released over 400 pages ahead of a 19 November meeting aimed at determining the official cause of the accident and reporting on its conclusions. The Board’s technical review of Uber’s autonomous vehicle technology reveals a cascade of poor design decisions that led to the car being unable to properly process and respond to Herzberg’s presence as she crossed the roadway with her bicycle.
A radar on the modified Volvo XC90 SUV first detected Herzberg roughly six seconds before the impact, followed quickly by the car’s laser-ranging lidar. However, the car’s self-driving system did not have the capability to classify an object as a pedestrian unless it was near a crosswalk.
For the next five seconds, the system alternated between classifying Herzberg as a vehicle, a bike and an unknown object. Each inaccurate classification had dangerous consequences. When the car thought Herzberg a vehicle or bicycle, it assumed she would be travelling in the same direction as the Uber vehicle but in the neighboring lane. When it classified her as an unknown object, it assumed she was static.
Worse still, each time the classification flipped, the car treated her as a brand new object. That meant it could not track her previous trajectory and calculate that a collision was likely, and thus did not even slow down. Tragically, Volvo’s own City Safety automatic braking system had been disabled because its radars could have interfered with Uber’s self-driving sensors.
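Why discarding an object's history matters can be shown with a toy tracker. This sketch is not Uber's software: the `Track` class, the labels, and the observation values are all invented for illustration. It only contrasts a tracker that wipes its history whenever the classification changes with one that keeps it.

```python
class Track:
    """Toy object track that estimates lateral velocity from its position history."""

    def __init__(self, reset_on_reclassify):
        self.reset_on_reclassify = reset_on_reclassify
        self.label = None
        self.history = []  # list of (time_s, lateral_position_m)

    def observe(self, t, lateral_m, label):
        if self.reset_on_reclassify and label != self.label:
            self.history = []  # treat the object as brand new
        self.label = label
        self.history.append((t, lateral_m))

    def lateral_velocity(self):
        """Meters per second toward the car's lane, or None without enough history."""
        if len(self.history) < 2:
            return None
        (t0, y0), (t1, y1) = self.history[0], self.history[-1]
        return (y1 - y0) / (t1 - t0)

# Invented observations: the object steadily closes on the car's lane
# while its classification flips at every step.
observations = [
    (0.0, 6.0, "vehicle"),
    (1.0, 4.5, "unknown"),
    (2.0, 3.0, "bicycle"),
    (3.0, 1.5, "unknown"),
]

resetting = Track(reset_on_reclassify=True)
persistent = Track(reset_on_reclassify=False)
for t, lateral_m, label in observations:
    resetting.observe(t, lateral_m, label)
    persistent.observe(t, lateral_m, label)

print("resetting tracker:", resetting.lateral_velocity())
print("persistent tracker:", persistent.lateral_velocity())
```

With these made-up numbers, the resetting tracker never accumulates two points for the same object and so cannot estimate any motion, while the persistent one sees a steady approach of 1.5 meters per second.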
By the time the XC90 was just a second away from Herzberg, the car finally realized that whatever was in front of it could not be avoided. At this point, it could have still slammed on the brakes to mitigate the impact. Instead, a system called “action suppression” kicked in.
This was a feature Uber engineers had implemented to avoid unnecessary extreme maneuvers in response to false alarms. It suppressed any planned braking for a full second, while simultaneously alerting and handing control back to its human safety driver. But it was too late. The driver began braking after the car had already hit Herzberg. She was thrown 23 meters (75 feet) by the impact and died of her injuries at the scene.
Four days after the crash, at the same time of night, Tempe police carried out a rather macabre re-enactment. While an officer dressed as Herzberg stood with a bicycle at the spot she was killed, another drove the actual crash vehicle slowly towards her. The driver was able to see the officer from at least 194 meters (638 feet) away.
Key duties for Uber’s 254 human safety drivers in Tempe were actively monitoring the self-driving technology and the road ahead. In fact, recordings from cameras in the crash vehicle show that the driver spent much of the ill-fated trip looking at something placed near the vehicle’s center console, and occasionally yawning or singing. The cameras show that she was looking away from the road for at least five seconds directly before the collision.
Police investigators later established that the driver had likely been streaming a television show on her personal smartphone. Prosecutors are reportedly still considering criminal charges against her.
Uber’s Tempe facility, nicknamed “Ghost Town,” did have strict prohibitions against using drugs, alcohol or mobile devices while driving. The company also had a policy of spot-checking logs and in-dash camera footage on a random basis. However, Uber was unable to supply NTSB investigators with documents or logs that revealed if and when phone checks were performed. The company also admitted that it had never carried out any drug checks.
Originally, the company had required two safety drivers in its cars at all times, with operators encouraged to report colleagues who violated its safety rules. In October 2017, it switched to having just one.
The investigation also revealed that Uber didn’t have a comprehensive policy on vigilance and fatigue. In fact, the NTSB found that Uber’s self-driving car division “did not have a standalone operational safety division or safety manager. Additionally, [it] did not have a formal safety plan, a standardized operations procedure (SOP) or guiding document for safety.”
Instead, engineers and drivers were encouraged to follow Uber’s core values or norms, which include phrases such as: “We have a bias for action and accountability”; “We look for the toughest challenges, and we push”; and, “Sometimes we fail, but failure makes us smarter.”
NTSB investigators found that the state of Arizona had a similarly relaxed attitude to safety. A 2015 executive order from governor Doug Ducey established a Self-Driving Vehicle Oversight Committee. That committee met only twice, with one of its representatives telling NTSB investigators that “the committee decided that many of the [laws enacted in other states] stifled innovation and did not substantially increase safety. Further, it felt that as long as the companies were abiding by the executive order and existing statutes, further actions were unnecessary.”
When investigators inquired whether the committee, the Arizona Department of Transportation, or the Arizona Department of Public Safety had sought any information from autonomous driving companies to monitor the safety of their operations, they were told that none had been collected.
As it turns out, the fatal collision was far from the first crash that Uber’s 40 self-driving cars in Tempe had been involved in. Between September 2016 and March 2018, the NTSB learned there had been 37 other crashes and incidents involving Uber’s test vehicles in autonomous mode. Most were minor rear-end fender-benders, but on one occasion, a test vehicle drove into a bicycle lane bollard. Another time, a safety driver had been forced to take control of the car to avoid a head-on collision. The result: the car struck a parked vehicle.