Progress Report ①

Project Title: Making Science More Comprehensible with the Use of Extra Dimensions

November 2021 by Yuko Suzuki

My Background

I’m Yuko, a new PhD student at the IET. As I’m new here, let me briefly touch on my background before presenting my research idea. This is the progress I have made in just over a month since I started the programme in October, so I’d appreciate it if you could be gentle with me :)

I’ve been a TV producer/director for the past decade or so. Before that, I studied the history of science, and science and technology studies. My main research area was early 20th-century physics.

I would say I’m a content producer by trade, so my fundamental interests lie in the visualisation, conception and storytelling of science. But at the same time, I have a very strong desire to produce content that is truly ‘comprehensible’ for a wider audience.

Do Extra Dimensions Help?

While often contemplating how we could make scientific knowledge more comprehensible for our audience, I have come to realise the potential of extra dimensions. 3D models give an extra spatial dimension, and animation adds an extra temporal dimension. Furthermore, the exciting technology of extended reality (XR) is becoming widespread!

To be honest, as a resident of the TV industry, I’d been quite sceptical about 3D technologies. We are living with old 2D technology, and we still vividly remember the epic failure of 3D TV around 2010!

But I felt more potential in the use of 3D virtual space because physical phenomena occur in three-dimensional space, and it simply felt reasonable to depict these processes in the same coordinates.

Cognitive Load Theory

Limited Working Memory

To evaluate its potential, I took Cognitive Load Theory (CLT) and the Cognitive Theory of Multimedia Learning (CTML) as my research framework. Both theories were developed in the context of multimedia learning, and they are based on the idea that the capacity of our working memory, which we use to process information coming in from sensory memory before passing it on to long-term memory, is limited. When working memory is overloaded, the learning process slows down. Therefore, cognitive load needs to be managed.

Subtypes of Cognitive Load

CLT identifies three subtypes of cognitive load: intrinsic load, extraneous load and germane load. While intrinsic load is inherent to the complexity of the learning materials, extraneous and germane load can be managed through instructional design. Extraneous load is an unnecessary burden on the learning process, so we try to reduce it. Germane load is constructive for learning, so we try to increase it.

Principles of Cognitive Theory

CLT and CTML have led to practical recommendations for the instructional design of multimedia learning materials. They are best summarised in Richard E. Mayer’s book ‘Multimedia Learning.’ There were originally 12 principles when it was first published back in 2001, but with a few more added since, the 3rd edition lists 15 principles as instructional design guidelines.

Evaluation Methods

Traditional Methods

Within the framework of CLT, the efficacy of many novel multimedia learning materials has been tested. Traditionally, subjective methods such as questionnaires have been used for evaluation. NASA-TLX is one of the most established and widely used questionnaires for asking about experienced mental effort and cognitive demand.
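As a rough illustration of how such questionnaire responses become a single workload score, here is a minimal sketch of the unweighted ('Raw TLX') variant, which simply averages the six NASA-TLX subscales. It assumes each subscale is rated from 0 to 100. The function name and dictionary keys are my own labels for illustration, not part of any official tool.

```python
# Sketch only: unweighted "Raw TLX" score, assuming six subscales rated 0-100.
# The key names below are illustrative labels, not an official data format.
TLX_SUBSCALES = (
    "mental_demand",
    "physical_demand",
    "temporal_demand",
    "performance",
    "effort",
    "frustration",
)


def raw_tlx(ratings):
    """Return the mean of the six subscale ratings for one participant."""
    missing = [name for name in TLX_SUBSCALES if name not in ratings]
    if missing:
        raise ValueError(f"missing subscales: {missing}")
    for name in TLX_SUBSCALES:
        if not 0 <= ratings[name] <= 100:
            raise ValueError(f"{name} must be between 0 and 100")
    return sum(ratings[name] for name in TLX_SUBSCALES) / len(TLX_SUBSCALES)


# Example with made-up ratings for one participant.
example = {
    "mental_demand": 70,
    "physical_demand": 10,
    "temporal_demand": 40,
    "performance": 30,
    "effort": 60,
    "frustration": 30,
}
print(raw_tlx(example))
```

The original NASA-TLX procedure also includes a weighting step based on pairwise comparisons of the subscales, which this sketch leaves out.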

Other methods test subjects’ performance. Pre- and post-tests can show how much knowledge was acquired through the intervention with the learning material in question. Reaction time and the rhythm method also measure cognitive load through the performance of subjects.

  • Questionnaire
  • Interview
  • Transfer test
  • Pre & post test
  • Reaction time
  • Rhythm method

Physiological Methods

More recently, there has been increasing interest in physiological measurements of cognitive load. They answer the limitations of subjective methods, such as the varied perception of each subject and the lack of temporal data.

Physiological methods include eye-tracking, EEG, ECG, EMG, HRV, GSR and facial expressions. These are seen not only as parameters of cognitive load, but also as indicators of affective responses to the educational materials.

  • Eye-tracking
  • EEG
  • ECG
  • EMG
  • HRV
  • GSR
  • Facial expressions

However, physiological methods cannot distinguish between the three subtypes of cognitive load if you try to work strictly within the framework of CLT. So physiological methods are often used together with traditional methods. Alternatively, they are used as simple parameters of mental effort, or interpreted as non-CL metrics. I’m still thinking of using physiological methods together with subjective methods.



Eye-tracking is one of the most widely used physiological methods for measuring cognitive load. More concretely, cognitive studies using eye-tracking look at the following parameters.

  • Areas of Interest (gaze pattern)
  • Pupil (dilation)
  • Fixation (count & duration)
  • Blinking (rate & velocity)
  • Saccade (length & velocity)

These parameters are quite well associated with cognitive activities. As cognitive load increases, the pupil dilates, fixation duration lengthens, blink rate decreases, and saccade velocity increases.

However, there are some practical issues when it comes to implementing eye-tracking for the evaluation of learning materials. As immersive AR and VR are still relatively new to their users, the novelty effect can influence eye movement more than cognitive load does. It is also reported that blink rates with HMDs are generally lower than with conventional displays.

Gaze pattern is an interesting parameter, especially because of the increasing use of Gaze Transition Entropy (GTE). It is a method of quantifying eye movement transitions based on Markov chains and Shannon’s entropy.

Higher entropy means more chaotic visual inspection, and lower entropy means more systematic inspection. While one study reports that GTE increases with cognitive load, another showed a negative correlation between GTE and fixation duration. So further investigation is needed. In addition, calculating GTE for gaze mapped in 3D space may be a little complicated.
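To make the idea of GTE more concrete, here is a minimal sketch of how transition entropy could be computed from a sequence of fixated AOIs. It treats the sequence as a first-order Markov chain, estimates the probability of each source AOI from the observed transitions, and counts a fixation that stays within the same AOI as a transition. These are simplifying assumptions on my part, and the function name is hypothetical.

```python
import math
from collections import Counter, defaultdict


def gaze_transition_entropy(aoi_sequence):
    """Transition entropy in bits for an ordered sequence of AOI labels.

    Computes H = -sum_i pi_i * sum_j p_ij * log2(p_ij), where p_ij is the
    observed probability of moving from AOI i to AOI j, and pi_i is the
    observed share of transitions that start in AOI i (an assumption made
    here in place of a separately estimated stationary distribution).
    """
    if len(aoi_sequence) < 2:
        raise ValueError("need at least two fixations to form a transition")

    # Count how often each AOI is followed by each other AOI.
    counts = defaultdict(Counter)
    for source, target in zip(aoi_sequence, aoi_sequence[1:]):
        counts[source][target] += 1

    total_transitions = len(aoi_sequence) - 1
    entropy = 0.0
    for source, row in counts.items():
        row_total = sum(row.values())
        pi = row_total / total_transitions
        for n in row.values():
            p = n / row_total
            entropy -= pi * p * math.log2(p)
    return entropy


# A strictly alternating scan path is fully predictable.
systematic = ["A", "B", "A", "B", "A", "B"]
# Here every AOI is followed by either AOI equally often.
mixed = ["A", "A", "B", "B", "A"]
print(gaze_transition_entropy(systematic), gaze_transition_entropy(mixed))
```

A perfectly alternating sequence gives 0 bits because the next AOI is always predictable, while less predictable sequences give higher values.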


Although there are eye-tracking devices that have been used for cognitive load research, such as those from Tobii and Pupil Labs with their accompanying data analysis solutions, more recent HMDs are equipped with eye-tracking functions, which could potentially be used for research purposes.

However, the eye-tracking functions built into HMDs such as Microsoft HoloLens and Magic Leap are fundamentally intended for eye-gaze interaction, not eye-tracking research. Therefore, we need to come up with an appropriate solution to extract the data directly from the HMDs.

ARETT is a toolkit developed for AR eye-tracking with HMDs, particularly for research using HoloLens 2. It also comes with an R package for data analysis, and it looks like a promising solution. Otherwise, I would need to develop my own solution using the Mixed Reality Toolkit (MRTK).



EEG (electroencephalography) is another method of physiologically measuring cognitive activity. It records brainwave signals from the cerebral cortex with electrodes placed according to the international 10-20 system. This gives us access to temporal and spatial data from subjects’ brains as well as spectral information.

As EEG measures brainwaves of different frequencies, we can obtain data for the theta, alpha, beta and gamma bands separately. The theta and alpha frequency bands are linked with task difficulty, which is largely associated with cognitive load. When task difficulty increases, theta band power increases while alpha band power decreases.
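As a rough sketch of what this could look like in analysis, the code below estimates theta and alpha band power for a single EEG channel and takes their ratio, which should rise as task difficulty increases. The band boundaries (4-8 Hz and 8-13 Hz) are common conventions but an assumption here, and using the ratio as an index is my own illustration rather than a metric prescribed by the studies above. It uses a naive DFT from the standard library to stay self-contained; a real analysis would more likely rely on an established signal-processing library, plus artefact removal.

```python
import cmath
import math


def band_power(signal, sample_rate, low_hz, high_hz):
    """Mean spectral power of `signal` for frequencies in [low_hz, high_hz)."""
    n = len(signal)
    mean = sum(signal) / n
    centred = [x - mean for x in signal]  # remove the DC offset

    powers = []
    for k in range(1, n // 2 + 1):
        freq = k * sample_rate / n
        if low_hz <= freq < high_hz:
            # Naive DFT coefficient for frequency bin k.
            coeff = sum(
                x * cmath.exp(-2j * math.pi * k * t / n)
                for t, x in enumerate(centred)
            )
            powers.append(abs(coeff) ** 2 / n)
    if not powers:
        raise ValueError("no frequency bins fall inside the requested band")
    return sum(powers) / len(powers)


def theta_alpha_ratio(signal, sample_rate):
    """Theta (4-8 Hz) power divided by alpha (8-13 Hz) power (assumed bands)."""
    theta = band_power(signal, sample_rate, 4.0, 8.0)
    alpha = band_power(signal, sample_rate, 8.0, 13.0)
    return theta / alpha


# Synthetic two-second signal at 128 Hz: strong 6 Hz (theta), weak 10 Hz (alpha).
sample_rate = 128
synthetic = [
    2 * math.sin(2 * math.pi * 6 * t / sample_rate)
    + math.sin(2 * math.pi * 10 * t / sample_rate)
    for t in range(2 * sample_rate)
]
print(theta_alpha_ratio(synthetic, sample_rate))
```

The synthetic signal is only there to show the direction of the effect; real EEG would need filtering and averaging over many short windows before any such ratio is meaningful.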

Also available are metrics developed by EEG device manufacturers such as Emotiv. Their analytic solution offers performance metrics interpreted as affective states: stress, engagement, interest, excitement, focus, and relaxation. These can be used as indicators of experienced cognitive load as well, e.g., relaxation as a low cognitive load state and stress as a high cognitive load state.


As biometric devices become increasingly popular among consumers, EEG headsets with lower price tags are now available on the market. NeuroSky and Muse are probably the cheapest ones, but they are of little use for this research as their functions are limited. Emotiv Epoc X (or Flex) and OpenBCI would be a good bet, with price tags under $1000 for research-quality devices. (Emotiv says that their devices can be worn with HoloLens, as there are examples of research with VR HMDs.)

There are certainly many more options available when you look at devices in the under-$25,000 range, e.g., g.tec, mbt, ABM and Neuroelectrics, but the complexity and setup time required for higher-quality devices with many channels may not fit the purpose of this research.

Type of Media

When considering the efficacy of three-dimensional learning materials in the context of cognitive load, detailed research on each type of media with different devices, i.e., desktop, mobile phone, tablet, and HMD, is needed, as learners respond to various devices differently.

Many comparison studies among different types of media have been made over time, and some of them reported different results in terms of cognitive load. Although I haven’t got around to running a systematic review of them, I’m currently interested in the following reported features of each media type.

On Screen

In research comparing static illustration, self-paced animation, and interactive simulation using eye-tracking, it is reported that interactive simulation leads to ‘deeper’ and ‘more systematic’ cognitive thinking. This may indicate that the interactivity of XR and the possibility of inspecting the simulation could contribute to an increase in germane load.


Ronald T. Azuma defines augmented reality (AR) by the following three characteristics.

  • Combines real and virtual
  • Interactive in real time
  • Registered in 3D

Although much of the AR research in educational contexts is conducted with mobile phones and tablets rather than HMDs, there are reports that overlaying information on the real-world environment reduces cognitive load and helps learners organise the information. AR’s three-dimensional aspect also helps reduce cognitive effort by compensating for visuo-spatial abilities when learners face otherwise difficult-to-imagine spatial and temporal structures of scientific entities.


Virtual reality (VR) learning is sometimes reported to cause cognitive overload. Its strength seems to lie rather in the affective and sensory domain, such as motivation, excitement, interest, and presence, which could lead to better memory retention.

It is also reported that VR is more effective for learners with lower baseline knowledge and has positive impacts in K-12 education. This may lead to the assumption that VR-based learning materials are more suited to simpler memory-oriented learning objectives, rather than complex concepts that require heavy processing in working memory before the information is transferred to long-term memory.

If certain attributes of XR act like seductive details, which appeal to learners and arouse interest, a balance between positive affective state and cognitive load needs to be struck. I would need to delve a bit deeper into the affective side of the framework.

Type of Representation

We have to consider the fact that different subjects of science rely on different types of representation when certain learning objectives are explained. There are multiple sources in the literature that offer varied taxonomies, but I would like to look at three categories.


Conducting laboratory experiments or performing a physical task can be assisted by XR technologies. AR can, for example, show information on the function of a mechanical device in a laboratory environment, and give learners instructions for the next task to perform. VR can simulate experiments or training that cannot easily be accessed or repeated in real life for ethical or safety reasons.

In these categories, AR seems to be more successful at managing cognitive load than VR, as VR-based learning is sometimes reported to cause cognitive overload. However, as research shows that no difference in cognition is observed between real and virtual classrooms, the cognitive difference between AR and VR virtual laboratory environments could be overcome in some way.


Replacing traditional physical models, e.g., a model of human organs, with virtual ones seems to be more effective in terms of learning outcomes, especially in the fields of health science, biology and medicine. Virtual models can have more realistic texture and movement, which could give learners an enhanced experience.

However, I wonder what happens if the learning activities involve more complex operational processes, where a more realistic representation of the model causes cognitive overload that outweighs the benefit of interest and excitement.


XR helps learners imagine the spatial structure and function, and the temporal alignment, of abstract scientific concepts by showing them in three-dimensional coordinates and along a timeline.

I also wonder what the affective impact of immersion brought by XR technologies would be in this category. Do these abstract concepts feel more ‘real’ and ‘sensational’ to learners if they are experienced in immersive media? I would need to review more literature on this subject.


As I got into each element of the project, I realised how much technical knowledge I need to gain to be able to conduct a viable experiment by myself! I have to say it’s a bit overwhelming… Here are the skills I have identified so far that I would need:

  • Unity 3D for app development
  • C# for app development
  • R for data analysis
  • Experimental devices (Eye-tracking and EEG)
  • Neuroscience for EEG analysis
  • Information theory for gaze pattern analysis

Because of the diversity of media and representation combinations, and the skill sets required for the research, it feels like I need to focus on a narrower range of learning materials. Would it be reasonable to focus on HMD-based AR? Could I bring myself to the point where I can actually run the experiment in this area?

Also, there are other aspects of learning which I have not touched on so far. These are the communication aspects of XR, such as the instructor and interaction with other learners. To avoid overcomplicating the research, I would like to put them aside for now.

Provisional Research Questions

So, taking the discussions above into consideration, my provisional research questions are something along these lines:

  • What set-up of XR gives the best efficacy in learning?
  • What cognitive and affective impact does the use of AR have on learning?
  • What is the ideal instructional design for AR-based educational materials?

Am I heading in the right direction? I would appreciate any comments and suggestions. Thank you!