2009, Lecture Notes in Computer Science
In this paper, we present a set of optimizations for a spoken language interface for mobile devices that can improve recognition accuracy and the user interaction experience. A comparison between a speech interface and a graphical interface, when used to accomplish the same task, is provided. The implications of developing a spoken language interface and integrating speech recognition and text-to-speech modules for European Portuguese in a mobile device are also discussed. The paper focuses on the speech recognition module and on an algorithm for name matching optimization that provides the user with a more comfortable interaction with the device. Usability evaluation trials have shown that spoken language interfaces can provide an easier and more efficient use of the device, especially within a community of users less experienced in handling mobile devices.
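The abstract does not spell out the name matching algorithm itself. As an illustrative sketch only (the paper's actual method may differ), edit distance is one common way to match a noisy recognition hypothesis against a contact list, tolerating small recognition errors; `best_contact` and the sample names below are hypothetical.

```python
# Hypothetical sketch: fuzzy contact-name matching via edit distance.
# The paper's actual algorithm is not given in the abstract; Levenshtein
# distance is simply one common choice for tolerating ASR errors.

def edit_distance(a: str, b: str) -> int:
    """Classic dynamic-programming Levenshtein distance."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                  # deletion
                           cur[j - 1] + 1,               # insertion
                           prev[j - 1] + (ca != cb)))    # substitution
        prev = cur
    return prev[-1]

def best_contact(recognized: str, contacts: list[str]) -> str:
    """Return the contact whose name is closest to the ASR hypothesis."""
    return min(contacts,
               key=lambda name: edit_distance(recognized.lower(), name.lower()))

print(best_contact("joao silva", ["Joana Silva", "Joao Sousa", "Joao Silva"]))
# -> Joao Silva
```

A real matcher would likely weight phonetic confusability rather than raw character edits, but the structure (score every candidate, pick the minimum) is the same.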
This paper studies the usability issues involved in the design of a speech-based mobile interface for textually low/non-literate users in the Indian context, with the goal of producing a set of heuristics for effective speech-based mobile interaction for this user base. A study of 15 low-literate subjects in Guwahati, Assam was conducted to identify the key usability issues involved, the focus group's mental model of the mobile phone, and the preferred alternative to textual interaction in a mobile interface. The user research methods adopted were contextual interviews, task analysis and retrospective interviews. The study indicated a preference for verbal interaction, a need for the interaction to be "humanlike", and for the speech recognition system to be tolerant of variations in spoken language. In accordance with the results of the study, a set of usability heuristics was developed for the design of a usable mobile interface for low-literate users, involving audio-visual interaction and operated through speech input/output. Certain specifications for the speech recognition system were also developed to ensure a satisfactory accuracy rate while keeping the input/output process easy to use for the focus group. A prototype was created to demonstrate the working of the interface as well as that of the speech recognition system.
2007
Developing a speech-based application for mobile devices requires work upfront, since mobile devices and speech recognition systems vary dramatically in their capabilities. While mobile devices can concisely be classified by their processing power, memory, operating system and wireless network speed, it is a bit trickier for speech recognition engines. This paper presents a comprehensive approach that comprises a profound classification of speech recognition systems for mobile applications and a framework for mobile and distributed speech recognition. The framework, called Gulliver, speeds up the development process with multi-modal components that can be easily used in a GUI designer and with abstraction layers that support the integration of various speech recognition engines depending on the user's needs. The framework itself provides the base for a model-driven development approach.
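The abstract does not show Gulliver's API. The following is a minimal illustrative sketch, under the assumption that an "abstraction layer" here means a common recognizer interface behind which embedded and network (distributed) engines can be swapped; all class and method names below are invented for illustration.

```python
# Illustrative sketch (not Gulliver's actual API): an abstraction layer
# that lets application code swap speech recognition engines freely.
from abc import ABC, abstractmethod

class RecognizerEngine(ABC):
    """Common interface every engine adapter must implement."""
    @abstractmethod
    def recognize(self, audio: bytes) -> str: ...

class EmbeddedEngine(RecognizerEngine):
    """Runs entirely on the device (small vocabulary, no network)."""
    def recognize(self, audio: bytes) -> str:
        return "call home"  # placeholder result for the sketch

class NetworkEngine(RecognizerEngine):
    """Sends audio to a server-side recognizer (distributed ASR)."""
    def recognize(self, audio: bytes) -> str:
        return "call home"  # placeholder result for the sketch

def transcribe(engine: RecognizerEngine, audio: bytes) -> str:
    # Application code depends only on the abstract interface,
    # so the engine can be chosen per device or per user need.
    return engine.recognize(audio)

print(transcribe(EmbeddedEngine(), b""))  # -> call home
```

The point of such a layer is exactly what the abstract claims: the GUI components and application logic stay unchanged when the underlying engine changes.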
Nowadays, the convergence of devices, electronic computing, and mass media produces huge volumes of information, which demands faster and more efficient interaction between users and information. How to make information access manageable, efficient, and easy becomes the major challenge for Human-Computer Interaction (HCI) researchers. The different types of computing devices, such as PDAs (personal digital assistants), tablet PCs, desktops, game consoles, and the next generation of phones, provide many different modalities for information access. This makes it possible to dynamically adapt application user interfaces to the changing context. However, as applications become more and more pervasive, these devices show their limited input/output capacity, caused by small visual displays, the use of hands to operate buttons, and the lack of an alphanumeric keyboard and mouse (Gu & Gilbert, 2004). Voice User Interface (VUI) systems are capable not only of recognizing the voice of their users, but also of understanding voice commands and providing responses to them, usually in real time. The state of the art in speech technology already allows the development of automatic systems designed to work in real conditions. VUI is perhaps the most critical factor in the success of any automated speech recognition (ASR) system, determining whether the user experience will be satisfying or frustrating, or even whether the customer will remain one. This chapter describes a practical methodology for creating an effective VUI design. The methodology is scientifically based on principles in linguistics, psychology, and language technology (Cohen et al., 2004; San-Segundo et al., 2005). Given the limited input/output capabilities of mobile devices, speech presents an excellent way to enter and retrieve information, either alone or in combination with other modalities.
Furthermore, people with disabilities should be provided with a wide range of alternative interaction modalities beyond the traditional screen-and-mouse desktop computing devices. Whether the disability is temporary or permanent, people with reading difficulty, visual impairment, and/or any difficulty using a keyboard or mouse can rely on speech as an alternative approach to information access.
2008 IEEE International Conference on Acoustics, Speech and Signal Processing, 2008
Live Search for Mobile is a cellphone application that allows users to interact with web-based information portals. Currently the implementation is focused on information related to local businesses: their phone numbers and addresses, directions, reviews, maps of the surrounding area, and traffic. This paper describes a speech-recognition interface recently developed for the application, which allows users to interact by voice. The paper presents the overall architecture, the user interface, the design and implementation of the speech recognition grammars, and initial performance results indicating that for sentence-level utterance recognition we achieve 60 to 65% of human capability.
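The paper's actual grammars are not reproduced in the abstract. As a toy illustration only, a local-search grammar typically pairs a business slot with a location slot, along the lines of "<business> in <city>"; the word lists and `parse_query` helper below are invented for the sketch.

```python
# Toy illustration of a slotted local-search grammar; not the grammars
# from the paper. Real ASR grammars (e.g. SRGS) constrain the recognizer
# itself, but the slot-filling idea can be shown with a regex.
import re

BUSINESSES = ["pizza", "coffee", "hardware store"]   # hypothetical lists
CITIES = ["seattle", "redmond", "bellevue"]

PATTERN = re.compile(
    r"^(?P<business>{}) in (?P<city>{})$".format(
        "|".join(BUSINESSES), "|".join(CITIES)))

def parse_query(utterance: str):
    """Return (business, city) if the utterance fits the grammar, else None."""
    m = PATTERN.match(utterance.lower())
    return (m.group("business"), m.group("city")) if m else None

print(parse_query("Coffee in Seattle"))   # -> ('coffee', 'seattle')
print(parse_query("weather tomorrow"))    # -> None
```

Constraining recognition to such a grammar is what makes sentence-level accuracy tractable: the recognizer only has to choose among utterances the application can actually serve.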
2007
The advent of mobile phones and the Internet opened the doors for an emerging class of applications that connect mobile users to online resources and information services available anytime and anywhere. VoiceXML is an enabling technology for creating a streamlined speech-based interface for web-based information services. In this paper, we describe VOICE, a framework used to develop mobile context-aware speech-enabled systems powered by VoiceXML. Restaurant Search Guide, a simple context-aware recommender system, has been built using VOICE to test the feasibility of the proposed framework.
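The VOICE framework's internals are not shown in the abstract. As a hedged sketch, a server-side framework of this kind typically generates VoiceXML documents dynamically; the `restaurant_prompt` function and its element layout below are assumptions for illustration, not the framework's actual output.

```python
# Illustrative sketch only: generating a minimal VoiceXML form of the
# kind a framework like VOICE might emit for a restaurant search dialog.
import xml.etree.ElementTree as ET

def restaurant_prompt(cuisines):
    """Build a VoiceXML document with one spoken-input field."""
    vxml = ET.Element("vxml", version="2.1")
    form = ET.SubElement(vxml, "form", id="search")
    field = ET.SubElement(form, "field", name="cuisine")
    ET.SubElement(field, "prompt").text = (
        "What kind of restaurant are you looking for? "
        + ", ".join(cuisines))
    return ET.tostring(vxml, encoding="unicode")

doc = restaurant_prompt(["Italian", "Thai"])
print(doc)
```

Because the document is generated per request, context (such as the user's location) can be folded into the prompt or the grammar before the VoiceXML browser ever sees it.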
2008
Even though natural language voice-only input applications may be quite successful in desktop or office environments, voice may not be an appropriate input modality in some mobile situations. This necessitates the addition of a secondary input modality, ideally one with which users can express the same amount of content as they can with natural language using a similar amount of effort. The work presented here describes one possible solution to this problem - leveraging existing help mechanisms in the voice-only application to support an additional non-voice input modality, in our case text input. The user can choose the speech or text modality according to their current situation (e.g. noisy environment) and have the same interaction experience.
Current voice translation tools and services use natural language understanding and natural language processing to convert words. However, these parsing methods concentrate on capturing keywords and translating them, largely neglecting the considerable processing time involved. In this paper, we suggest techniques that can optimize processing time and thereby increase the throughput of voice translation services. Techniques like template matching, indexing frequently used words using probability-based search, and session-based caching can considerably improve processing times. Moreover, these factors become all the more important when we need to achieve real-time translation on mobile phones.
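Of the optimizations the abstract names, session-based caching is the simplest to sketch. The code below is illustrative only: `translate_slow` is a stand-in for the real parse-and-translate pipeline, and the cache policy (an LRU of 256 entries) is an assumption, not the paper's design.

```python
# Illustrative sketch of session-based caching for voice translation:
# repeated phrases within a session skip the expensive NLU/translation
# pipeline entirely. translate_slow is a placeholder for that pipeline.
from functools import lru_cache

def translate_slow(phrase: str, target_lang: str) -> str:
    # Placeholder for the expensive parse-and-translate step.
    return f"[{target_lang}] {phrase}"

@lru_cache(maxsize=256)          # hypothetical per-session cache size
def translate_cached(phrase: str, target_lang: str) -> str:
    return translate_slow(phrase, target_lang)

translate_cached("hello", "pt")  # first call runs the full pipeline
translate_cached("hello", "pt")  # repeat is served from the cache
print(translate_cached.cache_info())
```

On a phone, the win is latency rather than throughput alone: greetings and other high-frequency phrases dominate conversational speech, so even a small cache absorbs a large share of requests.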
… Design and Evaluation for Mobile …, 2008
The use of a voice interface, along with textual, graphical, video, tactile, and audio interfaces, can improve the experience of the user of a mobile device. Many applications can benefit from voice input and output on a mobile device, including applications that provide travel directions, weather information, restaurant and hotel reservations, appointments and reminders, voice mail, and e-mail. We have developed a prototype system for a mobile device that supports client-side, voice-enabled applications. In fact, the prototype supports multimodal interactions but, here, we focus on voice interaction. The prototype includes six voice-enabled applications and a program manager that manages the applications. In this chapter we describe the prototype, including design issues that we faced, and evaluation methods that we employed in developing a voice-enabled user interface for a mobile device.
2004
State-of-the-art speech and language technology has already reached a level at which users can execute simple control commands to direct system operations, and also hold short conversations with the system to search for information. However, to enable natural multilingual interaction, the unique requirements that arise from the combination of adaptive interface design, dialogue research, and language processing have to be addressed in the system development.
1996
Pen and Speech Recognition in the User Interface for Mobile Multimedia Terminals. Shankar Narayanaswamy, Doctor of Philosophy in Engineering (Electrical Engineering and Computer Sciences), University of California at Berkeley.