Professor Mel Slater is a Distinguished Investigator at the University of Barcelona and the coordinator of the European Metaverse Research Network. He is a key figure shaping the rapidly advancing field of virtual and augmented, or extended, reality (VR/AR/XR).
These technologies are important for several topics recently discussed by the European Parliament’s Panel for the Future of Science and Technology at workshops on ‘Human-computer confluence in education’ (recorded) and ‘Neurotechnology and neurorights – Privacy’s last frontier’ (recorded).
Given the rather long tradition of XR/telepresence research, how would you define the progress in the field, and what are the main research questions to which we don’t have answers?
Then occasionally we need to do some motion capture for animation, and this requires a device that costs approximately EUR 20 000. Plus, there are the obvious costs of PCs for development. Moreover, we no longer need a fixed large lab (except for the motion capture) because the HMDs are standalone and we can use them anywhere. For example, we can run studies by giving them out to people at home.
So, this type of progress is ‘enabling’; we could say that VR use has been ‘democratised’. The cost for a single device is less than the cost of many brands of smartphone.
Of course, there are several problems:
The software for development remains unsatisfactory. Both Unity and Unreal have long learning curves and problematic hybrid models of visual programming and actual programming (using C-family languages). A uniform system is needed, just as OpenGL was once the de facto standard for 3D computer graphics.
For users, a big problem is ‘onboarding’. Many find it difficult to figure out what to do after they first put on an HMD. It is somewhat like when the ‘mouse’ was first introduced and users found it complex to understand what it was for and how to use it.
XR is a totally different medium to 2D, but nevertheless 2D thinking prevails with the use of interaction items, such as menus. XR needs its own unique paradigm, just as interactive computing needed the ‘windows’ point-and-click paradigm years ago. This was largely pioneered by Apple, arising out of original research at the Xerox Palo Alto Research Center. This story is instructive – Xerox got a group of experts together for years with the mission of coming up with a new way of interacting with computers, and we are still using what they came up with today. A similar initiative is needed for VR.
While visual and auditory displays have caused no special problems – although there is always room for improvement in aspects such as visual field of view and resolution – there is still a long way to go with haptics. Any visual or auditory display can be used to display anything; we don’t need a different type of display to render a house compared to a mountain.
However, a different system is needed for each different type of haptics. There are excellent devices for specific purposes, but, for example, a device that can push a needle through different materials cannot also be used to simulate a brush on the shoulder as a virtual human character collides with you in VR. A generalised haptics device, and one that can be used simply and at home, is still a long way off, if it will ever be possible.
There is a lot of speculation about the ethics of XR, but very little data. Empirical studies are needed to find out where the ethical problems may be, along with ideas about how to overcome them.
On the theoretical side, while a lot of progress has been made with respect to the ‘being there’ aspect of presence, there is still a lot of work to do on the ‘plausibility’ aspect (i.e. the illusion that virtual events are happening). Both ‘being there’ and plausibility are essential for virtual environments to have a useful effect.
The integration of large language models (LLMs) into XR applications is in its early days and needs to be encouraged. At the same time, the ethical considerations must be taken into account, and ways of overcoming these should be investigated and tested empirically.
Could you please comment on new trends/directions in XR research? Is there anything new that was not anticipated 20-30 years ago?
Mel Slater: All that I wrote above applies here. It had not been anticipated a few years ago that people would have XR devices in their homes. The whole area of integration with LLMs obviously did not exist. Where a few years ago we had to use ‘Wizard of Oz’ methods to enable a conversation with a virtual human character, now it can be done very well by an LLM.
Virtual human body representations are important for many applications in XR. They have been improving in quality over the years, and today it is quite straightforward to make a virtual body that looks like a specific person. This trend should continue and, with the potential advent of the ‘metaverse’, will become essential.
The ideas behind the ‘metaverse’ (multiple people in the same environment) were implemented in EU projects and systems such as DIVE, in the 1990s. Today this is moving onto a massive scale with companies providing nascent ‘metaverses’ where millions of people could potentially be joining in real time simultaneously. Of course, this raises issues of security, where people need to be sure that the individuals with whom they are interacting are who they profess to be. Here, methods such as blockchain may be useful to solve such problems.
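A minimal challenge-response identity check of the kind alluded to can be sketched as follows; here a shared-secret HMAC stands in for the public-key signatures a real blockchain-based scheme would use, so that the example runs with only the Python standard library:

```python
# Simplified challenge-response identity check: a stand-in for the
# public-key / blockchain-style verification discussed in the text.
# A shared-secret HMAC replaces a real digital signature so the sketch
# needs nothing beyond the standard library.
import hmac
import hashlib
import secrets

def sign_challenge(secret: bytes, challenge: bytes) -> bytes:
    """Produce a response to a platform-issued challenge."""
    return hmac.new(secret, challenge, hashlib.sha256).digest()

def verify(secret: bytes, challenge: bytes, response: bytes) -> bool:
    """Check the response in constant time."""
    expected = sign_challenge(secret, challenge)
    return hmac.compare_digest(expected, response)

# The platform issues a fresh random challenge; only the holder of the
# registered credential can produce a matching response.
credential = b"avatar-registered-secret"
challenge = secrets.token_bytes(16)
response = sign_challenge(credential, challenge)

assert verify(credential, challenge, response)
assert not verify(b"impostor-secret", challenge, response)
```

A fresh challenge per session prevents replay of an old response, which is the core property any such identity scheme, blockchain-backed or not, must provide.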
Research in augmented reality / mixed reality (AR/MR) has become more possible, but AR is lagging behind, as the development of the hardware has been slower. Now various companies are producing or planning to produce a form factor for AR/MR, much like a normal pair of glasses. If this happens at a low enough cost, it will be a big boost to development.
People have been predicting that AR will overtake VR in consumer uptake, but it is more likely to be MR. But here we will face the problem of security. If I am walking through the streets wearing lightweight MR glasses, I do not want to be bombarded by hundreds of virtual people trying to sell me goods or services unless I pay for a subscription. This would be tantamount to extortion (pay to use the MR or get bombarded with ads that intrude into your everyday life).
When this happens with web pages and apps, it is bearable because they can be ignored to some extent, as they are not built into our physical lives. But if MR becomes part of our physical lives, this will become intolerable.
What are the major bottlenecks, in your opinion, for a wider uptake of the Metaverse (or other XR platforms/technologies)? Are these of technological or behavioural (e.g. acceptance, lack of the ‘killer apps’) nature?
Mel Slater: Wider uptake of the metaverse requires mass diffusion of XR technology and a need for people to use it in their everyday lives. This requires a solution to the ‘onboarding’ problem mentioned above. It is not really a technological problem, but what might be referred to as a ‘user interface’ problem.
I don’t believe in the idea of a ‘killer app’. What’s the ‘killer app’ for Microsoft Excel? It is a system that anyone can use quite easily to collect and retain data and perform useful calculations. What is the equivalent statement for XR? It is probably going to be communication at a distance: entering an environment to meet other people who may be thousands of kilometres away, while evoking the same feelings as actually being with them. This is not a ‘killer app’ but a paradigm.
As indicated above, security problems have to be solved, as well as ethical and legal issues. But technically there is no intrinsic reason why such systems cannot be deployed massively today. It is also a question of the ‘form factor’ – it is not so ‘trendy’ to wear a strange looking contraption on the head. But with the right form factor, one can imagine people walking through the streets wearing glasses that provide access to a metaverse, but one that is a mix between reality and virtual reality.
I think the blend of the real and the virtual will, and should, become commonplace; I walk through ‘this’ physical door and I am in another place. I walk through ‘that’ virtual door and I am in yet another (maybe entirely virtual) place. This will need to become ‘normal’, an everyday activity without it being anything special. Instead of getting on a train to go to another place, I just select the right location on my device (the glasses) and I am there – all my sensory systems indicate that, even though I might actually be lying in bed at home.
How will AI (including generative AI), in your opinion, influence the development of the XR field?
Mel Slater: There are at least two ways that generative AI is influencing the XR field.
The first is the creation of scenarios and applications. LLMs are quite good at generating code, and this is leading to a different way of programming. A human can now specify what the code should do, in what language and on what system (e.g. in Xcode using Swift for the iPhone, or in Unity using C# for a Meta Quest), and then be responsible for checking and testing the code. The role of the human is no longer to write the code directly but to supervise an AI that writes it. This approach can save a huge amount of time in the development cycle and can lead to a whole new paradigm for ‘programming’ – one in which the task is to devise the appropriate ‘prompts’ rather than the actual code.
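This supervisory workflow can be sketched in a few lines of Python; the prompt template, the `generate_code` placeholder and the acceptance checks are all hypothetical, standing in for whatever LLM service and test suite a real project would use:

```python
# Sketch of the "supervise the AI" workflow: the human writes the
# specification and the acceptance tests; the model writes the code;
# the harness runs the tests before the code is accepted.

def generate_code(prompt: str) -> str:
    """Placeholder for an LLM call; returns a canned answer so the
    sketch runs without any external service."""
    return "def clamp(x, lo, hi):\n    return max(lo, min(x, hi))\n"

def supervise(spec: str, tests) -> str:
    """Ask the model for code, then accept it only if it passes
    the human-written acceptance checks."""
    prompt = f"Write Python code for this specification:\n{spec}"
    code = generate_code(prompt)
    namespace = {}
    exec(code, namespace)  # run the generated code in a fresh namespace
    for check in tests:
        assert check(namespace), "generated code failed a check"
    return code

spec = "clamp(x, lo, hi) keeps x within the range [lo, hi]"
tests = [
    lambda ns: ns["clamp"](5, 0, 3) == 3,
    lambda ns: ns["clamp"](-1, 0, 3) == 0,
    lambda ns: ns["clamp"](2, 0, 3) == 2,
]
accepted = supervise(spec, tests)
```

The point of the sketch is the division of labour: the human owns the specification and the checks, while the generation step is delegated.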
The second way is that LLMs can be used as part of the XR scenario – intelligent objects that know how to respond to participant actions. I mentioned LLMs driving a virtual human character above, for example answering questions and maintaining a dialogue. But it goes further than that, since LLMs can integrate many different types of information, such as physiological measures, learn about the state of participants and alter the scenario accordingly in real time. So direct multilevel and multimodal interaction between humans and the AI becomes possible. Along the same lines, it is possible that the AI may help people engaged in activities together in XR to reach their objectives – this is being developed in the EU-funded project GuestXR.
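As an illustration of the closed loop this implies, the following Python sketch adapts a scenario’s intensity from a (simulated) heart-rate stream; the thresholds and the intensity scale are invented purely for the example:

```python
# Minimal sketch of real-time scenario adaptation driven by a
# physiological signal (heart rate here). The thresholds and the
# 0-5 intensity scale are invented for illustration only.

def adapt_scenario(heart_rate_bpm: float, current_intensity: int) -> int:
    """Lower the scenario intensity when arousal is high, raise it
    when the participant seems under-stimulated."""
    if heart_rate_bpm > 110:           # high arousal: calm things down
        return max(0, current_intensity - 1)
    if heart_rate_bpm < 70:            # low arousal: increase challenge
        return min(5, current_intensity + 1)
    return current_intensity           # comfortable range: no change

# Simulated stream of readings feeding the loop in real time
readings = [65, 72, 95, 120, 130, 100]
intensity = 3
for hr in readings:
    intensity = adapt_scenario(hr, intensity)
```

A real system would feed the signal (and other modalities) to an LLM that rewrites the scenario itself, rather than stepping a single parameter, but the control loop has the same shape.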
What areas, in your opinion, would be impacted by XR developments: education, supernatural sensing/sensory augmentation, mental health and well-being and/or other areas?
Mel Slater: All of these areas of course. I’ve mentioned meetings at a distance above and all the applications that go along with that. Another important area is sport and exercise.
Do you see any new challenges related to new developments in XR?
Mel Slater: There are all the ethical and regulatory aspects that need to be considered, as mentioned above. There is a real problem here. Suppose you are in a dialogue with a virtual human character controlled by an AI. The conversation is highly likely to be very realistic and compelling, and this will become increasingly likely.
Now if we have some marker on the virtual human (e.g. an arrow above the head) that indicates ‘this is an AI’, then it will defeat the purpose of XR. So we need informative and regulatory practices that still allow XR to be XR, while ensuring that it is safe for the participant. This is a problem that is neither easy nor obvious to solve. Yet, on the other hand, if the EU starts putting more and more restrictions on what we are allowed to do in research, then we will never solve this problem. Above all, we need empirical data (experimentation) that allows us to address these problems.
Is EU/international legislation prepared for these, or are there, in your opinion, some new legal aspects that need re-thinking or protection (e.g. body transformation technologies, avatar ‘rights’, or new multisensory stimulation/manipulation techniques such as ultrasound haptics)?
Mel Slater: Yes, but as I have argued in some papers, and as we are doing now on a Spanish Government grant, the fundamental necessity is to move away from speculation. It is true that XR might cause all sorts of problems. But we do not know! We need empirical research to be able to explore these issues and provide advice to policymakers, based on data and consequent theoretical understanding, rather than only on speculation.

