
Periodic Reporting for period 1 - MULTISSOUND (Multisensory-Based Spatial Sound)

Teaser

The worlds of multisensory research and multimedia technologies have originated and, for the most part, still remain largely independent. The first exists mostly within the field of experimental psychology, or psychophysics, while the second one is mostly studied and developed...

Summary

The worlds of multisensory research and multimedia technologies originated and, for the most part, still remain largely independent. The first exists mostly within the field of experimental psychology, or psychophysics, while the second is mostly studied and developed by engineers. This project aimed at bringing these two disciplines together in the field of acoustics. A series of psychophysical experiments was carried out in the Spatial Sound Group at the Aalto School of Engineering, aiming at detailed guidelines on how to reproduce sound for the best audiovisual experience. The ultimate goal was to reach every consumer, namely in television, cinema, gaming, computer streaming and teleconferencing applications.

This project examined how acoustic events spatialized inside and outside the field of view, congruently and incongruently, affect the perception of visual events. It was found that sounds can negatively affect the perception of visual events in audiovisual reproductions. In one experiment, subjects' gaze moved more when the soundtrack was more immersive, compared to stereo reproductions. The ability to count visual events was also affected by the different sound conditions. In another experiment, it was found that sounds do not need to be presented at the same place as light events, but they should occur within the same region. Visual lateralization was maximally affected by sounds occurring in a region slightly wider than the visual event region. These findings were replicated in an immersive environment, with a movie reproduction, where greater spatialization of sound sources led to slightly more dispersion of visual attention. Audiovisual demos were created to demonstrate the effect, and preliminary solutions were proposed and tested.

Work performed

The project was structured according to the following Work Packages (WP):
Work Package 1 (WP1): Sound width, position, and visual attention
This WP aimed at testing the gaze/attention control hypotheses and at obtaining a first set of sound parameters that can influence visual experience (specific goal 1).
A simplified experimental setup was assembled in the multichannel chamber, and a perceptual experiment was implemented to test the influence of sound on visual experience. Sounds varied randomly in position (foveal, parafoveal, peripheral) and width (point-like, broad, or diffuse). Subjects wore an eye-tracking device: gaze direction indicates visual attention, while pupil dilation indicates perceptual saliency. Data were collected in a light lateralization task: subjects reported whether the last flash in a sequence occurred in the left or the right hemifield.
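The design described above crosses sound position and width with flash side. As a rough illustration (not the project's code; the factor levels come from the report, while the number of repetitions and the counterbalancing are my own assumptions), such a trial list could be generated as:

```python
import itertools
import random

# Factor levels taken from the report; everything else is assumed.
POSITIONS = ["foveal", "parafoveal", "peripheral"]
WIDTHS = ["point-like", "broad", "diffuse"]

def make_trials(repeats=10, seed=1):
    """Fully crossed, randomized trial list: 3 positions x 3 widths x
    2 flash sides, each combination repeated `repeats` times."""
    trials = [{"position": p, "width": w, "flash_side": s}
              for p, w in itertools.product(POSITIONS, WIDTHS)
              for s in ("left", "right")] * repeats
    random.Random(seed).shuffle(trials)
    return trials
```

A fully crossed design of this kind allows each sound condition's effect on lateralization accuracy to be estimated independently.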
Results from this WP brought new insights on:
a) the influence of peripheral sound on processing of central visual events: peripheral sound does affect the visual processing of brief events
b) the spatial area within which such effects occur: sounds happening in an area 10 to 20 deg wider than the area where visual events occur produce the most disturbing effects, while sounds occurring behind the subject or at very different elevations have little influence.
c) the relevance of simultaneous co-localized sound on the processing of central visual events: sound co-localization was found to be surprisingly unimportant, so long as the sounds occurred within the same region as the visual events.
d) the importance of spatial sound definition in the processing of central audiovisual events: sound definition is crucial for the effect to be observed. Therefore, as a technological solution, we propose the use of diffuse sounds in the periphery and point-like sounds in the central area.
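The solution proposed in (d) amounts to a simple rendering rule. A minimal sketch, assuming a placeholder geometry (the 30-degree half-width of the visual region is my own assumption, not a value from the report):

```python
def rendering_mode(azimuth_deg, visual_half_width_deg=30.0):
    """Choose a rendering style for a source at a given azimuth
    (0 deg = straight ahead). Following the WP1 recommendation:
    point-like sounds inside the (assumed) visual region,
    diffuse sounds in the periphery."""
    if abs(azimuth_deg) <= visual_half_width_deg:
        return "point-like"
    return "diffuse"
```

In practice the boundary would be set from the actual screen geometry and viewing distance rather than a fixed angle.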

Work Package 2 (WP2): Practical applications – Guidelines for spatial sound
From WP1, a set of new parameters and new hypotheses was created. In this WP, the goal was to refine those hypotheses and test the parameters with natural stimuli. This stage used, for the first time, stimuli that occur frequently in video reproduction. The experiment in this WP was implemented in the audiovisual room, with the spherical loudspeaker setup. In a first step, the spatial audio coding for sound reproduction (DirAC), already implemented in the audiovisual room, was adapted to meet the criteria of direct and diffuse sound at different visual fields. Then, several soundtracks and reproduction setups were created. An experiment was implemented in which an immersive movie was reproduced with different sound conditions for different subjects; 50 subjects took part. The subjects were asked several questions about the events in the movie. Simultaneously, they wore eye-tracking glasses to monitor gaze direction and pupil dilation. Results were analyzed in terms of the correlation between eye movement, attention, and multisensory experience; the correlation between eye movements and visual and sound parameters; and the relationship between sound parameter combinations and the audiovisual experience.
As results, we obtained: a) a first implementation of spatial sound designed to optimize audiovisual experience; b) an identification of the best spatial sound parameters for event processing, speech processing, and overall audiovisual immersion; c) breakthrough data on the relation between eye movements, attention, and audiovisual experience; d) a list of factors influencing eye movement during audiovisual stimulation; and e) a quantification of the benefit of the new spatial sound for audiovisual reproduction.
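The DirAC processing mentioned above separates direct from diffuse sound by estimating, per time-frequency block, how diffuse the sound field is. As an illustration (my own sketch of the standard DirAC diffuseness estimator for B-format signals, not the project's implementation):

```python
import numpy as np

def dirac_diffuseness(w, x, y, z):
    """Estimate DirAC diffuseness (0 = a single point-like source,
    1 = fully diffuse) from a block of B-format samples, using the
    standard estimator:
    psi = 1 - sqrt(2) * ||<w*[x, y, z]>|| / <w^2 + (x^2+y^2+z^2)/2>."""
    intensity = np.array([np.mean(w * x), np.mean(w * y), np.mean(w * z)])
    energy = np.mean(w**2 + (x**2 + y**2 + z**2) / 2.0)
    return 1.0 - np.sqrt(2.0) * np.linalg.norm(intensity) / (energy + 1e-12)

rng = np.random.default_rng(0)
n = 48000
# A plane wave from the front: x = sqrt(2)*w, no lateral/vertical component.
w = rng.standard_normal(n)
psi_point = dirac_diffuseness(w, np.sqrt(2.0) * w, np.zeros(n), np.zeros(n))
# Four uncorrelated noise channels approximate a diffuse field.
psi_diffuse = dirac_diffuseness(rng.standard_normal(n), rng.standard_normal(n),
                                rng.standard_normal(n), rng.standard_normal(n))
```

Thresholding such an estimate per frequency band is one way the direct and diffuse streams could be steered to point-like versus spread rendering.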

WP3: A new computational model of auditory processes
In this WP, I worked on a computational model describing visual and auditory perception under different multisensory conditions. The model was tested and shown to predict audiovisual source localization in distance well.

WP4: Implementation and proof of concept
The goal of this WP was to implement the project findings in a proof-of-concept demonstration.

Final results

This project opened a new line of research by showing that spatial sound must be developed with its final application in mind. When sound reproduction is intended to be used in conjunction with video, the sound must be carefully designed, and practitioners in sound engineering should be aware of possible perceptual interactions. The main finding of this project was that sound reproduction setups slightly wider than the video reproduction area - which are common in home and professional environments - are the most harmful to the perception of visual content. We identified a list of relevant factors that can be accounted for to achieve the best audiovisual experience.
Additionally, a new computational model was implemented that can be used to predict human perception of audiovisual events.
Four peer-reviewed journal publications were obtained within this project. Three more peer-reviewed publications are expected as a result of this project, since data are still being analysed. An associated post-doctoral project and a Master's thesis were also completed.
Among other dissemination activities, a workshop involving industry helped ensure the societal impact of the project's findings.

Website & more info

More info: https://www.aalto.fi/news/marie-sklodowska-curie-fellowship-awarded-to-study-multisound.