
IRCAM

Institut de Recherche et Coordination Acoustique Musique
63 Projects, page 1 of 13
  • Funder: French National Research Agency (ANR) Project Code: ANR-18-CE33-0002
    Funder Contribution: 583,454 EUR

    While so-called Natural User Interfaces are becoming widespread in consumer devices, the use of expressive body movements remains limited in most HCI applications. Since memorizing and executing gestures remains challenging for users, most current approaches to movement-based interaction favour “intuitive” interfaces and trivial gesture vocabularies. While these facilitate adoption, they also limit users’ potential for more complex, expressive and truly embodied interactions. We propose to shift the focus from intuitiveness/naturalness towards learnability: new interaction paradigms might require users to develop specific sensorimotor skills that are compatible with, and transferable between, digital interfaces. With learnable embodied interactions, novice users should be able to approach a new system at a difficulty adapted to their expertise; the system should then carefully adapt to their improving motor skills and eventually enable complex, expressive and engaging interactions.

    Our project addresses both methodological and modelling issues. First, we need to elaborate methods for designing learnable movement vocabularies whose units are easy to learn and can be composed into richer, more expressive movement phrases. Since movement vocabularies proposed by novice users are often idiosyncratic and of limited expressive power, we propose to capitalize on the knowledge and experience of movement experts such as dancers and musicians. Second, we need to conceive computational models able to analyze users’ movements in real time and to provide various multimodal feedback and guidance mechanisms (e.g. visual and auditory). Importantly, these movement models must take into account the user’s expertise and learning development; we argue that computational movement models able to adapt to user-specific learning pathways are key to facilitating the acquisition of motor skills.

    We thus propose to address three main research questions. 1) How can body movement be designed as an input modality whose components are easy to learn, yet support complex and rich interaction techniques that go beyond simple commands? 2) What computational movement modelling can account for sensorimotor adaptation and/or learning in embodied interaction? 3) How can model-driven feedback and guidance be optimized to facilitate skill acquisition in embodied interaction? We will consider complementary use cases such as computer-mediated communication, assistive technologies and musical interfaces. The long-term aim is to foster innovation in multimodal interaction, from non-verbal communication to interaction with digital media and content in creative applications.
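
    The real-time movement analysis and expertise-adaptive modelling described above can be illustrated with a minimal, hypothetical sketch: template matching by dynamic time warping (DTW) against expert-recorded gestures, with an acceptance tolerance that starts loose for novices and tightens as the user succeeds. This is illustrative only, not the project's actual models; the class, thresholds and adaptation rule are all assumptions.

```python
import numpy as np

def dtw_distance(a: np.ndarray, b: np.ndarray) -> float:
    """Dynamic-time-warping distance between two trajectories, each an
    array of shape [time, features] (e.g. x/y hand positions)."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            cost = np.linalg.norm(a[i - 1] - b[j - 1])
            D[i, j] = cost + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    return D[n, m] / (n + m)  # length-normalized

class AdaptiveGestureMatcher:
    """Matches trajectories against expert-recorded templates; the
    acceptance tolerance shrinks as the user keeps succeeding, a crude
    stand-in for expertise-aware movement modelling."""
    def __init__(self, templates: dict, tolerance: float = 2.0):
        self.templates = templates   # gesture name -> template trajectory
        self.tolerance = tolerance   # loose for novices

    def classify(self, trajectory: np.ndarray):
        scores = {name: dtw_distance(trajectory, t)
                  for name, t in self.templates.items()}
        best = min(scores, key=scores.get)
        if scores[best] > self.tolerance:
            return None              # no confident match: offer guidance
        self.tolerance = max(0.5, 0.98 * self.tolerance)  # raise the bar
        return best

if __name__ == "__main__":
    angles = np.linspace(0, 2 * np.pi, 50)
    circle = np.stack([np.cos(angles), np.sin(angles)], axis=1)
    swipe = np.stack([np.linspace(-1, 1, 50), np.zeros(50)], axis=1)
    matcher = AdaptiveGestureMatcher({"circle": circle, "swipe": swipe})
    noisy = circle + 0.05 * np.random.default_rng(0).standard_normal(circle.shape)
    print(matcher.classify(noisy))   # -> "circle"
```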

  • Funder: French National Research Agency (ANR) Project Code: ANR-23-CE33-0012
    Funder Contribution: 564,525 EUR

    Noise pollution has a significant impact on quality of life. In the office, noise exposure creates stress that reduces performance, provokes annoyance and changes social behaviour. Headphones with excellent noise-cancelling processors can now be acquired to protect oneself from noise exposure. While these techniques have reached a high level of performance, residual noise remains and can be an important source of distraction and annoyance. We propose to study two augmented-reality approaches, mostly targeted at disturbance in open-plan offices, in which the added sound sources are at levels below or equal to that of the noise source. The first approach is to conceal the presence of an unpleasant source by adding spectro-temporal cues that seemingly convert it into a more pleasant one. Adversarial machine-learning techniques will be considered to learn correspondences between noise and pleasing sounds and to train a deep audio synthesiser able to generate an effective concealing sound of moderate loudness. The second approach tackles a common issue in open-plan offices, where concentrating on the task at hand is harder when people are speaking nearby. We propose to reduce the intelligibility of nearby speech by adding sound sources whose spectro-temporal properties are specifically designed, or synthesised with a generative model, to conceal important aspects of that speech. The expected outcomes of the project are 1) advances in the recent field of deep neural audio and speech synthesis and 2) innovative applications for mitigating noise in our daily lives.
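
    As an illustration of the adversarial approach described above, here is a minimal, hypothetical PyTorch sketch: a generator proposes an additive concealing signal of bounded level, while a critic scores frames against a corpus of pleasant sounds. It runs on random tensors as stand-ins for real spectrogram data; the architectures, loss weights and data are assumptions, not the project's models.

```python
import torch
import torch.nn as nn

N_BINS = 128  # toy spectrogram frames: 128 frequency bins each

class Concealer(nn.Module):
    """Generator: proposes an additive 'concealing' frame of bounded
    magnitude for a given noise frame."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_BINS, 256), nn.ReLU(),
            nn.Linear(256, N_BINS), nn.Sigmoid())  # output bounded to (0, 1)

    def forward(self, noise):
        return self.net(noise)

class PleasantnessCritic(nn.Module):
    """Discriminator: scores how much a frame resembles the corpus of
    'pleasant' sounds (higher logit = more pleasant-like)."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_BINS, 256), nn.ReLU(),
            nn.Linear(256, 1))

    def forward(self, frame):
        return self.net(frame)

gen, critic = Concealer(), PleasantnessCritic()
opt_g = torch.optim.Adam(gen.parameters(), lr=1e-4)
opt_d = torch.optim.Adam(critic.parameters(), lr=1e-4)
bce = nn.BCEWithLogitsLoss()

for step in range(100):                      # toy loop on random tensors
    noise = torch.rand(32, N_BINS)           # stand-in for office noise
    pleasant = torch.rand(32, N_BINS)        # stand-in for pleasant corpus
    mixed = noise + gen(noise)               # noise plus concealing signal

    # critic step: pleasant corpus -> 1, current mixtures -> 0
    d_loss = (bce(critic(pleasant), torch.ones(32, 1))
              + bce(critic(mixed.detach()), torch.zeros(32, 1)))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # generator step: make mixtures fool the critic, while keeping
    # the added signal at moderate level (loudness penalty)
    g_loss = (bce(critic(mixed), torch.ones(32, 1))
              + 0.1 * gen(noise).mean())
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
```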

  • Funder: French National Research Agency (ANR) Project Code: ANR-07-MDCO-0017
    Funder Contribution: 476,045 EUR

    Content-based indexing and search techniques are now well established for textual documents. However, a large number of data sources provide multimedia documents (sound, images, audio-visual documents), for which description techniques remain rudimentary, restricted to very specific types of sources (e.g. identity photos), and heterogeneous because they were built for particular needs. Our project consists in designing and experimenting with generic, flexible techniques for content-based indexing and search, dedicated to distributed sources of multimedia documents. The project relies on three complementary axes.

    1. The first axis studies low-level descriptors that can be generated automatically from multimedia documents and used as a support for content-based search. By "low-level descriptor" we mean a vector of values that characterizes the content of a document independently of any contextual information. We aim to characterize these descriptors, as well as the extraction algorithms that produce them, as generically as possible, in order to cover a large palette of audio, video and audio-visual documents. Our goal in this axis is to exploit our complementary competences in the processing of non-textual data.

    2. The second axis defines index structures and search operators for large collections of descriptors. By "index" we mean any structure (such as a search tree) or technique (such as hashing) that restricts the search space and avoids exhaustively scanning a data collection. Here too, we aim to factor out, as far as possible, techniques applicable to all the types of multimedia documents we consider. Our goal in this second axis is to complement the production of descriptors with the specification of a complete, generic toolkit for multimedia data processing; a content provider should be able to extend this toolkit to develop a search engine specific to its own collections.

    3. The third axis concerns the distributed aspects of content-based search. We consider institutions that wish to reference their collections and benefit from a common indexing and search system based on the sharing of their descriptors. In this axis we will study the extension of the search structures and algorithms to distributed sources, and we will also exploit distribution to manage system scalability.

    The project also includes the implementation of a platform for testing, on real data and in real environments, all the technical proposals resulting from the three axes above. We do not address the distribution of content itself, which would raise problems of access rights and ownership, but only that of references to content, each provider remaining free to define its own access-rights policy. The consortium is composed of five partners: three public laboratories (Wisdom, INRIA Lille, IRCAM) and two content providers with different profiles, European Web Archive (archiving of free audio and audio-visual content collected on the Web) and the photo agency of the Réunion des Musées Nationaux (RMN), which will provide its collection of images. The three laboratories bring complementary competences in the management of audio documents (INRIA, IRCAM), images (Wisdom) and distributed search systems (Wisdom).
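
    The hashing-based indexing mentioned in the second axis can be illustrated with a minimal sketch of random-hyperplane locality-sensitive hashing (LSH), one classical technique for restricting the search space over descriptor vectors. This is an illustrative example, not the project's toolkit; all names and parameters are assumptions.

```python
import numpy as np

class HyperplaneLSH:
    """Random-hyperplane locality-sensitive hashing: descriptors that
    point in similar directions tend to land in the same bucket, so a
    query inspects one bucket instead of scanning the whole collection."""

    def __init__(self, dim: int, n_bits: int = 12, seed: int = 0):
        rng = np.random.default_rng(seed)
        self.planes = rng.standard_normal((n_bits, dim))
        self.buckets = {}   # bucket key -> list of (doc_id, descriptor)

    def _key(self, v: np.ndarray) -> int:
        bits = (self.planes @ v) > 0               # which side of each plane
        return sum(1 << i for i, b in enumerate(bits) if b)

    def add(self, doc_id, descriptor: np.ndarray) -> None:
        self.buckets.setdefault(self._key(descriptor), []).append((doc_id, descriptor))

    def query(self, descriptor: np.ndarray, k: int = 5):
        bucket = self.buckets.get(self._key(descriptor), [])
        bucket = sorted(bucket, key=lambda it: np.linalg.norm(it[1] - descriptor))
        return [doc_id for doc_id, _ in bucket[:k]]

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    index = HyperplaneLSH(dim=64)
    docs = rng.standard_normal((1000, 64))         # stand-in descriptors
    for i, d in enumerate(docs):
        index.add(i, d)
    probe = docs[42] + 0.01 * rng.standard_normal(64)
    print(index.query(probe))                      # doc 42 very likely returned
```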

  • Funder: French National Research Agency (ANR) Project Code: ANR-11-BS09-0016
    Funder Contribution: 404,410 EUR

    Loudness (the subjective level of a sound) is a basic attribute of a sound. Models exist to describe this sensation from measurements of the signal, but they are restricted to particular cases: the signals must be monaural, or identical at the listener's two ears (presented through headphones, in free field at frontal incidence, or in a diffuse field). Moreover, if the level of the signal varies over time, these models cannot predict the overall loudness of the sound. Such limitations are major drawbacks because they prevent the use of these models for environmental sounds: sources in the environment often produce sounds whose features vary over time (for example, a passing car). The overall loudness depends on how the sound varies over time, and a predictive model should take this into account. Another important fact is that the position of the source, as well as the listener's body, modifies the sounds reaching the eardrums. These modifications differ between the two ears, and work remains to be done to understand how the listener combines the two signals into a single loudness sensation.

    The goal of this project is therefore to extend the validity of loudness models so that they can be used for environmental sounds. The study will consist of psychoacoustical experiments testing several hypotheses about the cognitive and perceptual phenomena that must be taken into account to adapt existing loudness models. Several cases will be studied. First, attention will be paid to the loudness of stationary sounds in binaural listening, in situations producing interaural differences. Published studies have shown very large inter-individual differences, though on rather small numbers of listeners (fewer than 10); the experiments planned in this project will involve more participants, making it possible to better understand this inter-individual variability. From the experimental results, a loudness model applicable to stationary binaural signals will be defined, and it will be verified that this model can be used with dummy-head recordings, which will ease its use by acousticians in industry. In parallel, the loudness of non-stationary headphone-presented sounds will be investigated. A preliminary study will characterize typical temporal profiles of environmental sounds, with special attention to two categories: sounds with increasing levels and sounds with decreasing levels. Signals with these typical characteristics will then be used as stimuli in psychoacoustical experiments to study how a listener evaluates overall loudness. Short-term memory effects will be carefully examined, and a loudness model will be proposed for such sounds. Finally, combining the previous results, the third part of the project will study the loudness of non-stationary signals with binaural differences. Experiments will place listeners in real listening conditions or use signals recorded with a dummy head, and the results will be compared with the predictions of a model combining the characteristics of the two previous ones. If necessary, this model will be adapted to provide accurate results, making it suitable for practical applications.
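
    A minimal sketch can make the modelling gap concrete. One common convention for time-varying sounds is to summarize instantaneous loudness by a high percentile (the level exceeded 5% of the time, often called N5), and a naive binaural rule simply sums the two monaural loudness values; both are exactly the kinds of assumption this project questions. The code below is illustrative only.

```python
import numpy as np

def overall_loudness(instantaneous_sones: np.ndarray,
                     percentile: float = 95.0) -> float:
    """Summarize instantaneous loudness (sones over time) as a single
    value. Taking a high percentile (the loudness exceeded 5% of the
    time) is one common convention for time-varying sounds; whether it
    suits environmental sounds is the kind of question the project
    investigates."""
    return float(np.percentile(instantaneous_sones, percentile))

def binaural_loudness(left_sones: float, right_sones: float) -> float:
    """Naive binaural combination: plain summation of the two monaural
    loudness values, a classical (and contested) assumption that
    experiments with interaural differences aim to refine."""
    return left_sones + right_sones

if __name__ == "__main__":
    t = np.linspace(0.0, 1.0, 1000)
    rising = 10.0 * t            # loudness ramping up
    falling = rising[::-1]       # same values, reversed in time
    # A percentile statistic returns the same value for both profiles,
    # whereas listeners typically judge rising and falling sounds
    # differently, which is why memory effects need dedicated study.
    print(overall_loudness(rising), overall_loudness(falling))
    print(binaural_loudness(4.0, 6.0))
```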

  • Funder: French National Research Agency (ANR) Project Code: ANR-10-VPTT-0010
    Funder Contribution: 666,040 EUR

    The ROADSENSE research project aims to define, design, deploy and experimentally validate a driving-assistance system for motorized road users. The assistance is delivered by audio-tactile lines installed on the road, which produce an audible and vibratory alert inside the vehicle when it passes over them. This alert is intended to correct lane departures by distracted, disoriented or tired drivers who have difficulty perceiving the road or positioning their vehicle, or more generally whose trajectory is hazardous or erratic relative to their lane on rural roads. To achieve this, the ROADSENSE project aims:
    • to propose a functional road-safety framework based on a state-of-the-art review and on the identification of the stakes and mechanisms of rural road crashes;
    • to characterize and identify the sound and vibration of existing alert lines;
    • to design and validate relevant sound signals through psychoacoustic surveys on a panel of users, and to recreate them on a digital synthesizer;
    • to implement the sound and vibration signals on a driving simulator and to set up reference scenarios;
    • to test the efficiency and acceptability of audio-tactile lane-delimiting alert lines through surveys of user panels (i) on a driving simulator, (ii) on a test track and (iii) on open roads;
    • to define and develop tools and methods for evaluating the efficiency of the road devices on three locally equipped road test sites, in preparation for an applied research project (FUI).
    The work on encoding sounds and vibration to design audio-tactile alert lines offers a great opportunity to rigorously validate new low-cost driving-assistance systems that can be deployed in the short term and should meet the needs of the entire fleet of vehicles already on the road.
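
    The physics of an audio-tactile line lends itself to a simple worked example: each strip crossed produces a thump, so the excitation frequency equals vehicle speed divided by strip spacing. The sketch below synthesizes such a signature; the spacing and resonance values are illustrative assumptions, not measured characteristics of real alert lines.

```python
import numpy as np

def rumble_alert(speed_mps: float, strip_spacing_m: float = 0.25,
                 duration_s: float = 1.0, sr: int = 44100) -> np.ndarray:
    """Sketch of a rumble-line audio signature: an impulse train at
    f = speed / spacing (e.g. 25 m/s over 0.25 m strips -> 100 Hz),
    each impulse shaped by a short decaying 'body' resonance."""
    f_pulse = speed_mps / strip_spacing_m          # strip-crossing rate (Hz)
    n = int(duration_s * sr)
    excitation = np.zeros(n)
    excitation[::max(1, int(sr / f_pulse))] = 1.0  # one impulse per strip
    # illustrative 120 Hz resonance with a 5 ms decay
    t = np.arange(int(0.02 * sr)) / sr
    thump = np.exp(-t / 0.005) * np.sin(2 * np.pi * 120.0 * t)
    return np.convolve(excitation, thump)[:n]

if __name__ == "__main__":
    audio = rumble_alert(speed_mps=25.0)           # roughly 90 km/h
    print(audio.shape, float(audio.max()))
```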

