Interview with Professor Davide Scaramuzza, director of the Robotics and Perception Group at the University of Zurich in Switzerland and leader of the EU-funded AGILEFLIGHT project.
How good do you think uncrewed aerial vehicle [UAV] systems are at the moment?
Prof. Davide Scaramuzza: Human pilots can navigate drones fast, but autonomous drones often look clumsy at present: they take off and land vertically, move slowly and are far from human pilot performance.
Many drone systems also only work as long as there is GPS coverage, which becomes a problem when the signal is obstructed, for example when flying at low altitude close to buildings or indoors.
So, if you want an autonomous drone that can operate regardless of external infrastructure, you need to use cameras and other on-board sensors for camera-based navigation, which we can now find in some commercial products. These use vision, much like humans and animals, with eyes and a brain. And they are properly autonomous because they don’t rely on GPS – though if it’s available, they can use it.
I would say vision-based navigation is mature, but these drones are still far from human pilot performance. This also matters because their battery life is generally limited to about 30 minutes, so the only way to accomplish more within that time is to fly faster – for example, to save more people in search-and-rescue applications.
Is it possible to ultimately get drones to perform as well autonomously as with a human pilot, and how are you seeking to resolve such challenges?
Prof. Davide Scaramuzza: I think so. I’ve been working on what’s called agile navigation – navigating faster and in a more agile way to approach the performance of human pilots. If we can do that, it will open many doors for applications of drones in search and rescue, inspection and delivery, and even for flying cars and space.
The drone we’re using is about 20 centimetres in diameter and weighs less than a kilo. Its ‘eyes’ consist of a stereoscopic camera system similar to human eyes. Then we have the ‘brain’, which is a graphics processing unit – essentially the same computer that you find in smartphones.
We are using more and more artificial intelligence by training vision algorithms in computer simulations, rather than through the long and difficult process of going out into the field to collect images. The idea is to design neural networks that take as inputs images from the camera and measurements from other sensors, and then output the commands for the drone.
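To make that idea concrete, here is a minimal sketch in Python (PyTorch) of a sensorimotor policy of the kind he describes: a network that fuses a camera image with inertial measurements and outputs low-level commands. The layer sizes, names and the command format (collective thrust plus body rates) are illustrative assumptions, not the project's actual architecture.

```python
# Sketch of a policy network: camera image + IMU in, drone commands out.
# All sizes and the command convention are hypothetical.
import torch
import torch.nn as nn

class DronePolicy(nn.Module):
    def __init__(self):
        super().__init__()
        # Convolutional encoder for the camera image.
        self.vision = nn.Sequential(
            nn.Conv2d(1, 16, kernel_size=5, stride=2), nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, stride=2), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )
        # Small encoder for other on-board sensors (e.g. IMU: gyro + accel).
        self.imu = nn.Sequential(nn.Linear(6, 32), nn.ReLU())
        # Fused features -> 4 commands: collective thrust and 3 body rates.
        self.head = nn.Sequential(
            nn.Linear(32 + 32, 64), nn.ReLU(), nn.Linear(64, 4)
        )

    def forward(self, image, imu):
        feats = torch.cat([self.vision(image), self.imu(imu)], dim=1)
        return self.head(feats)

policy = DronePolicy()
image = torch.rand(1, 1, 128, 128)  # grayscale camera frame
imu = torch.rand(1, 6)              # angular velocity + acceleration
commands = policy(image, imu)       # [thrust, roll, pitch, yaw rates]
```

In the workflow he outlines, such a network would be trained on simulated imagery first and only then deployed on the real drone.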
My understanding is that your drones combine standard cameras and so-called ‘event cameras’. Can you explain a bit about how that system works?
Prof. Davide Scaramuzza: The algorithms decide whether to use one of these types of camera or both, depending on the application. Event cameras – or ‘smart eyes’ – don’t output full images: each pixel reports changes in brightness, so what the camera streams out is motion.
With normal cameras, there can be a delay from one to hundreds of milliseconds, depending on light, while the image forms – which is detrimental for time-critical robotics applications. But with an event camera, ‘events’ are streamed continuously, so you don’t need to wait for the next frame to appear.
That means microsecond resolution, so you can make decisions much faster. This is also useful in the automotive domain; some companies we’re working with are interested because the same algorithms we’re developing for drones can be used there.
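As an illustration of that data model, here is a minimal sketch assuming events arrive as (timestamp, x, y, polarity) tuples, which is how event cameras are commonly represented. The microsecond timestamps and the simple event-count trigger are hypothetical, chosen only to contrast event-driven decisions with waiting for a full frame.

```python
# Sketch: reacting to an asynchronous event stream instead of frames.
from collections import namedtuple

Event = namedtuple("Event", ["t_us", "x", "y", "polarity"])  # polarity: +1/-1

def detect_motion(events, window_us=1000, threshold=20):
    """Fire as soon as enough events land in a short time window,
    instead of waiting ~10-100 ms for the next full frame."""
    recent = []
    for ev in events:  # events arrive one by one, microsecond-stamped
        recent.append(ev)
        # Keep only events inside the sliding time window.
        recent = [e for e in recent if ev.t_us - e.t_us <= window_us]
        if len(recent) >= threshold:
            return ev.t_us  # decision latency: microseconds, not frame time
    return None

# A burst of events, e.g. from an object suddenly moving in view.
stream = [Event(t_us=i * 50, x=64 + i % 3, y=64, polarity=1) for i in range(40)]
print(detect_motion(stream))  # fires well before a 33 ms frame would arrive
```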
We are developing algorithms that can react fast to changes in the environment, such as pedestrians or other vehicles moving. However, these are not yet as accurate at detection as algorithms using standard cameras.
What types of tests are you doing using the drones?
Prof. Davide Scaramuzza: We’ve created a test bed in three focus areas: acrobatic manoeuvres; navigation in the ‘wild’, or unknown environments; and drone racing.
In the acrobatics tests, the drone mimics a human pilot performing acrobatic manoeuvres, using only on-board cameras and no GPS. We simulated hundreds of drones performing hundreds of autonomous manoeuvres, with the ability to process the equivalent of about 50 flight hours in just four hours. We then tested this on the real drone.
For navigating in unknown environments like forests, we’ve done tests in which our autonomous drone even beat the best commercial drone in terms of the speed at which it can fly without crashing. We trained the algorithms in Unreal Engine, a simulation engine used to make video games.
In autonomous drone racing, meanwhile, we showed in summer 2022 for the first time that we can outperform some of the best drone-racing pilots when we compare lap times. However, this does not mean the game is over: humans still have the advantage over computer vision and control algorithms in coping with adverse environmental conditions, such as changes in wind or brightness.
What other goals do you have?
Prof. Davide Scaramuzza: Another goal is to demonstrate autonomous exploration of an unknown building as fast as possible, as the entry point to using drones in search-and-rescue applications.
If we are successful, we can deploy these drones, for example, in a mock-up search-and-rescue operation. We have access to a huge search-and-rescue training location with a mock-up disaster area built to train firefighters: there are collapsed and intact buildings, and even derailed trains. It’s an interesting test bed for trying out our algorithms, which we are already doing.
How long do you think it will be before the types of drone system you’re developing could be used in real-life situations such as emergency response?
Prof. Davide Scaramuzza: It depends what we’re talking about. Some things are already there, but what’s not there yet is drones that can reason as well as human pilots. Also, a real search-and-rescue environment is nothing like a normal environment. What’s really difficult in robotics is to navigate in environments that violate the usual assumptions, such as that there is light and no dust, fire or fog.
It’s difficult to research such environments because there’s not enough data at present. But we’re recording data sets to share and organising competitions to put researchers together so such algorithms can be made robust. Using drones also depends on having the regulations in place, but these are getting better and better, especially for search and rescue.
If you’re asking about state-of-the-art vision-based navigation, I would say for moving slowly, you can already use them. If you’re asking me about completely replacing a human, it’s difficult to predict, but I think within five years we’ll have drones that will reach human-level performance in certain areas.
What do you hope your research into such technology achieves in the long term?
Prof. Davide Scaramuzza: My hope is that it can help save people’s lives in the aftermath of a disaster or on the roads. There are many fatalities caused by human error, so having highly reactive detection systems can help save millions of people.
But also, I hope our algorithms can be used to explore other planets. We’re working with NASA on exploring the use of event cameras for such purposes because they can see better in low light as well.
There are also possibilities in areas such as robotics for precision agriculture. In general, we are helping the market whenever anyone needs a drone or robot that’s able to see.