A Paradigm of Interaction Context-Aware Pervasive Multimodal Multimedia Computing System
M. Hina
Communication
is a vital aspect of human life; it is communication that enables human
beings to connect with one another as individuals and as independent
groups. Communication is the fulcrum that drives human development in
all fields. In informatics, the very purpose of the computer's
existence is information dissemination: to be able to send and receive
information. Humans are quite successful at conveying ideas to one
another and reacting appropriately. This is because we share a rich
common language, a common understanding of how things work and an
implicit understanding of everyday situations. When humans communicate
with one another, they comprehend the information that is pertinent to
the current situation, or context, thereby increasing the
conversational bandwidth. This ability to convey ideas, however, does
not transfer to human-computer interaction.
On their own, computers do not understand our language, do not
understand how the world works and cannot sense information about the
current situation. In the typical computing set-up, with its
impoverished mechanisms for providing the computer with information via
mouse, keyboard and screen, the end result is that we must supply
information to computers explicitly, an effect contrary to the promise
of transparency and calm technology in Weiser's vision of ubiquitous
computing (Weiser 1991; Weiser and Brown 1996). To reverse this trend,
it is imperative that researchers find methodologies that give
computers access to context. It is through context awareness that we
can increase the richness of communication in human-computer
interaction, and thereby reap its most likely benefit: more useful
computational services.
Context is a subjective notion, as demonstrated by the state of the
art, in which each researcher has his or her own understanding of the
term, an understanding that continues to evolve. The acquisition of
contextual information is essential, but it is the end user who has the
final say as to whether the envisioned context has been captured
correctly. The current literature informs us that some contextual
information is predefined by researchers from the very beginning; this
is appropriate when the application domain is fixed, but not when we
consider that a typical user performs different computing tasks on
different occasions. Aiming at a more conclusive and inclusive design,
we posit that the choice of which contextual information to consider
should be left to the judgment of the end user, who has the authority
to determine which information is important to him and which is not.
This leads us to the concept of incremental acquisition of context, in
which context parameters are added, modified or deleted one parameter
at a time.
In conjunction with our idea of inclusive context, we broaden the
notion of context into the context of interaction. Interaction context
is the term we use to refer to the collective context of the user (i.e.
user context), of his working environment (i.e. environment context)
and of his computing system (i.e. system context). Logically and
mathematically, each of these interaction context elements – user
context, environment context and system context – is composed of
various parameters that describe the state of the user, of his
workplace and of his computing resources as he undertakes a computing
task, and each of these parameters may evolve over time. For example,
user location is a user context parameter whose value evolves as the
user moves from one place to another. The same can be said of noise
level as an environment context parameter, and of available bandwidth,
which continuously evolves and which we consider a system context
parameter. To realize the incremental definition of interaction
context, we have developed a tool called the layered virtual machine
for incremental interaction context. This tool can be used to add,
modify and delete a context parameter on the one hand, and to determine
sensor-based context (i.e. context based on parameters whose values are
obtained from raw sensor data) on the other.
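The incremental add/modify/delete of context parameters can be sketched as follows. This is a minimal illustration only: the class and method names are our assumptions for this sketch, not the interface of the actual layered virtual machine described above.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Sketch of an interaction context as a mutable set of named parameters.
// Parameter names group the three elements by prefix, e.g. "user.location",
// "environment.noiseDb", "system.bandwidthKbps" (names are hypothetical).
class InteractionContext {
    private final Map<String, String> parameters = new LinkedHashMap<>();

    // Add or modify one context parameter at a time (incremental acquisition).
    void put(String name, String value) {
        parameters.put(name, value);
    }

    // Delete a parameter the user judges no longer relevant.
    void remove(String name) {
        parameters.remove(name);
    }

    String get(String name) {
        return parameters.get(name);
    }

    int size() {
        return parameters.size();
    }
}
```

In this sketch the same call path serves both user-defined parameters and sensor-based ones: for instance, a location sensor would simply call `put("user.location", ...)` again as the user moves, so the parameter's value evolves over time.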
To obtain the full benefit of the richness of interaction context for
communication in human-machine interaction, the modality of interaction
should not be limited to the traditional mouse, keyboard and screen
alone. Multimodality allows a much wider range of modes and forms of
communication, selected and adapted to suit the user's context of
interaction, through which the end user can transmit data to the
computer and the computer can respond or yield results to the user's
queries. In multimodal communication, the weakness of one mode of
interaction, with regard to its suitability to a given situation, is
compensated by replacing it with another mode that is more suitable.
For example, when the environment becomes disturbingly noisy, voice may
not be the ideal input mode; instead, the user may opt to transmit text
or visual information. Multimodality also promotes inclusive
informatics, as those with permanent or temporary disabilities are
given the opportunity to use and benefit from advances in information
technology. For example, the work on the presentation of mathematical
expressions to visually impaired users (Awdé 2009) would not have been
possible without the advancement of multimodality. With mobile
computing in our midst, coupled with wireless communication that allows
access to information and services, pervasive and adaptive
multimodality is more apt than ever to enrich communication in
human-computer interaction and to provide the most suitable modes of
data input and output in relation to the evolving context of
interaction.
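The noise example above amounts to a substitution rule: an input mode unsuited to the current environment context is replaced by one that is. The rule below is a deliberately simplified sketch; the 70 dB threshold and the modality names are assumptions for illustration, not values used by the system.

```java
// Sketch of modality substitution driven by one environment context
// parameter (ambient noise). Threshold and names are illustrative only.
class ModalitySubstitution {
    enum InputModality { VOICE, TEXT, VISUAL }

    // Return the input modality to use, given the preferred modality and
    // the current noise level in decibels.
    static InputModality chooseInput(InputModality preferred, double noiseDb) {
        // Voice input degrades in a noisy environment: fall back to text.
        if (preferred == InputModality.VOICE && noiseDb > 70.0) {
            return InputModality.TEXT;
        }
        return preferred;
    }
}
```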
A look back at the state of the art informs us that a great amount of
effort has been expended on defining context, and on acquiring,
disseminating and exploiting context within systems that have a fixed
application domain (e.g. healthcare, education). A closer look also
tells us that much ubiquitous computing research has been devoted to
particular application domains (e.g. identifying the user's
whereabouts, identifying services and tools), but barely any effort, if
ever, has been made to make multimodality pervasive and accessible
across varying user situations. In this regard, our research work
provides the missing link. Our work – the paradigm of an interaction
context-sensitive pervasive multimodal multimedia computing system – is
an architectural design that adapts to a much larger context called the
interaction context. It is intelligent and pervasive, meaning it
remains functional even when the end user is
stationary, mobile or on the go. It is conceived with two purposes in
mind. First, given an instance of interaction context, one that evolves
over time, our system determines the optimal modalities suited to that
interaction context. By optimal, we mean a selection decision on
appropriate multimodality based on the given interaction context, the
available media devices that support the modalities and the user's
preferences. We designed a mechanism (i.e. a paradigm) to perform this
task and simulated its functionality successfully. The mechanism
employs machine learning (Mitchell 1997; Alpaydin 2004; Hina, Tadj et
al. 2006) and uses case-based reasoning with supervised learning
(Kolodner 1993; Lajmi, Ghedira et al. 2007). The input to this
decision-making component is an instance of interaction context, and
its output is the optimal modality and the associated media devices to
be activated. The mechanism continuously monitors the user's context of
interaction and, on the user's behalf, continuously adapts through
dynamic reconfiguration of the pervasive multimodal system's
architecture. Second, given an instance of interaction context and the
user's task and preferences, we designed a mechanism that automatically
selects the user's applications, the preferred suppliers of these
applications and the preferred quality of service (QoS) configurations
of these suppliers. This mechanism does its task in consultation with
the computing resources, sensing the available suppliers and any
configuration restrictions within the given computing set-up.
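The case-based flavour of the first mechanism can be sketched in miniature: stored cases pair an interaction-context instance with the modality that proved optimal, a new context retrieves the most similar case, and the supervised-learning step retains a case once the user confirms it. The similarity measure (counting matching parameter values) and all names here are our assumptions for the sketch, not the system's actual algorithm.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Sketch of case-based reasoning for modality selection. A case pairs an
// interaction-context instance with the modality found optimal for it.
class ModalityCaseBase {
    private static class Case {
        final Map<String, String> context;
        final String modality;
        Case(Map<String, String> context, String modality) {
            this.context = context;
            this.modality = modality;
        }
    }

    private final List<Case> cases = new ArrayList<>();

    // Supervised-learning step: retain a case once confirmed by the user.
    void learn(Map<String, String> context, String modality) {
        cases.add(new Case(context, modality));
    }

    // Retrieval: return the modality of the stored case that shares the
    // most parameter values with the current interaction context.
    String select(Map<String, String> current) {
        Case best = null;
        int bestScore = -1;
        for (Case c : cases) {
            int score = 0;
            for (Map.Entry<String, String> e : current.entrySet()) {
                if (e.getValue().equals(c.context.get(e.getKey()))) {
                    score++;
                }
            }
            if (score > bestScore) {
                bestScore = score;
                best = c;
            }
        }
        return best == null ? null : best.modality;
    }
}
```

A full implementation would also weigh device availability and user preferences in the selection decision, and would trigger the architectural reconfiguration once a modality is chosen; the sketch covers only the retrieve-and-retain core.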
Apart from the above-mentioned mechanisms, we also formulated scenarios
for how a computing system should present its user interface once the
optimal modalities suited to the user's context of interaction have
been identified. We present possible configurations of unimodal and
bimodal interfaces based on the given interaction context as well as on
user preferences.
Our work differs from previous works in that, while others capture,
disseminate and consume context to suit a preferred application domain,
ours captures the context of interaction and reconfigures its
architecture dynamically, in a generic fashion, so that the user can
continue working on his task anytime and anywhere, regardless of the
application domain he wishes to undertake. In effect, the system we
have designed, along with all of its mechanisms, is generic in design
and can be adapted or integrated, with ease or with very little
modification, into computing systems across various application
domains. This is our contribution to the domain.
Simulations and mathematical formulations are provided to support our
ideas and the concepts underlying the design of the paradigm. An actual
program, developed in Java, supports our concept of the layered virtual
machine for incremental interaction context.
Keywords: Human-machine interface, multimodal interface, pervasive computing, multimodal multimedia computing, software architecture.