The aim of pervasive multimodal multimedia computing is to realize anytime, anywhere computing using various modes of human-computer interaction. Current state-of-the-art pervasive systems and solutions, however, do not include applications related to pervasive multimodality. Moreover, current multimodal interfaces were designed with predefined modes of human-machine interaction that were not chosen based on the context of the user, the user's environment, and the user's computing system. This paper addresses these weaknesses by proposing a pervasive multimodal multimedia computing system whose modalities are chosen based on their suitability to the given interaction context. The system selects a multimodal (or unimodal) interface based on the given context, the available media devices, and user preferences. This paper discusses the challenges in designing the infrastructure of such a computing system and illustrates how we addressed those challenges. This work is our contribution to the ongoing research that aims at realizing pervasive multimodality.