
Developmental Word Grounding Through a Growing Neural Network With a Humanoid Robot



Abstract:

This paper presents an unsupervised approach to integrating speech and visual information without using any prepared data. The approach enables a humanoid robot, Incremental Knowledge Robot 1 (IKR1), to learn word meanings. It differs from most existing approaches in that the robot learns online from audio-visual input, rather than from stationary data provided in advance. In addition, the robot is capable of learning incrementally, which is considered indispensable to lifelong learning. A noise-robust, self-organized growing neural network is developed to represent the topological structure of unsupervised online data. We are also developing an active-learning mechanism, called "desire for knowledge," that lets the robot select, for subsequent learning, the object about which it possesses the least information. Experimental results show that the approach raises the efficiency of the learning process. Based on the audio and visual data, a mental model is constructed for the robot; it forms a basis for IKR1's inner world and builds a bridge connecting the learned concepts with current and past scenes.
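The "desire for knowledge" mechanism can be illustrated in outline: among the objects in view, the robot chooses the one about which it has accumulated the least information as its next learning target. The sketch below is an illustrative assumption, not the paper's actual formulation; the function name, the use of raw observation counts as the information measure, and the object labels are all hypothetical.

```python
# Hypothetical sketch of a "desire for knowledge" selection rule:
# pick the object with the least accumulated information, here
# approximated by the fewest audio-visual observations so far.
def select_next_object(observation_counts):
    """Return the object id with the smallest observation count."""
    return min(observation_counts, key=observation_counts.get)

# Example: "ball" has been observed least, so it is selected next.
counts = {"cup": 5, "ball": 2, "box": 7}
print(select_next_object(counts))  # -> ball
```

In practice the information measure would be derived from the learned audio-visual model rather than a bare count, but the selection principle is the same: direct attention to the least-known object.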
Page(s): 451 - 462
Date of Publication: 12 March 2007


PubMed ID: 17416171

I. Introduction

As human beings, we discriminate among various concepts by forming "iconic representations" of them; judgments of resemblance or difference are based on comparisons of these iconic representations [1]. We also interpret their meaning through language. Therefore, in a sense, we ground the meanings of language in their perceptual context. For a robot to be more like a human, it must recognize the sound patterns of words and understand their meanings; it must ground language in its world as mediated by its perceptual, motor, and cognitive capacities. Under such a scenario, the robot must analyze the current scene along with the associated utterance, integrate the extracted information, and finally acquire the word meanings.

References
1.
S. Harnad, "The symbol grounding problem", Physica D, vol. 42, no. 1-3, pp. 335-346, Jun. 1990.
2.
J. Siskind, "A computational study of cross-situational techniques for learning word-to-meaning mappings", Cognition, vol. 61, no. 1/2, pp. 39-91, Oct./Nov. 1996.
3.
J. Siskind, "Learning word-to-meaning mappings" in Models of Language Acquisition: Inductive and Deductive Approaches, U.K., London:Oxford Univ. Press, pp. 121-153, Jul. 2000.
4.
S. Wachsmuth, G. Socher, H. Brandt-Pook, F. Kummert and G. Sagerer, "Integration of vision and speech understanding using Bayesian networks", Videre: J. Comput. Vis. Res., vol. 1, no. 4, pp. 61-83, 2000.
5.
A. L. Gorin, D. Petrovska-Delacretaz, G. Riccardi and J. Wright, "Learning spoken language without transcriptions", Proc. IEEE Workshop Speech Recog. and Understanding, pp. 293-296, 1999.
6.
T. Regier, The Human Semantic Potential: Spatial Language and Constrained Connectionism, MA, Cambridge:MIT Press, 1996.
7.
T. Oates, Z. Eyler-Walker and P. R. Cohen, "Toward natural language interfaces for robotic agents: Grounding linguistic meaning in sensors", Proc. 4th Int. Conf. Auton. Agents, pp. 227-228, 2000.
8.
D. Roy and A. Pentland, "Learning words from sights and sounds: A computational model", Cogn. Sci., vol. 26, no. 1, pp. 113-146, 2002.
9.
N. Iwahashi, "Language acquisition through a human-robot interface by combining speech, visual, and behavioral information", Inf. Sci., vol. 156, no. 1/2, pp. 109-121, Nov. 2003.
10.
N. Iwahashi, "Active and unsupervised learning of spoken words through a multimodal interface", Proc. 13th IEEE Workshop Robot and Human Interactive Commun., pp. 437-442, 2004.
11.
C. Yu and D. Ballard, "On the integration of grounding language and learning objects", Proc. 19th Nat. Conf. Artif. Intell. (AAAI), pp. 488-494, Jul. 2004.
12.
L. Steels and P. Vogt, "Grounding adaptive language games in robotic agents", Proc. ECAL, pp. 474-482, 1997.
13.
L. Steels and J.-C. Baillie, "Shared grounding of event descriptions by autonomous robots", Robot. Auton. Syst., vol. 43, no. 2/3, pp. 163-173, 2003.
14.
L. Steels, F. Kaplan, A. McIntyre and J. Van Looveren, "Crucial factors in the origins of word-meaning" in The Transition to Language, U.K., Oxford:Oxford Univ. Press, pp. 252-271, 2002.
15.
P. Vogt, "The emergence of compositional structures in perceptually grounded language games", Artif. Intell., vol. 167, no. 1/2, pp. 206-242, Sep. 2005.
16.
J. Weng, J. McClelland, A. Pentland, O. Sporns, I. Stockman, M. Sur, et al., "Autonomous mental development by robots and animals", Science, vol. 291, no. 5504, pp. 599-600, 2001.
17.
J. Elman, "Learning and development in neural networks: The importance of starting small", Cognition, vol. 48, no. 1, pp. 71-99, 1993.
18.
S. Thrun and T. Mitchell, "Learning one more thing", Proc. IJCAI, pp. 1217-1223, Aug. 1995.
19.
S. Imai, T. Kobayashi, K. Tokuda, T. Masuko and K. Koishida, Speech Signal Processing Toolkit: SPTK Version 3.0, 2002.
20.
A. Gersho and R. M. Gray, Vector Quantization and Signal Compression, MA, Boston:Kluwer, 1992.
21.
C. S. Myers and L. R. Rabiner, "A comparative study of several dynamic time-warping algorithms for connected word recognition", Bell Syst. Tech. J., vol. 60, no. 7, pp. 1389-1409, Sep. 1981.
22.
F. Shen and O. Hasegawa, "An incremental network for on-line unsupervised classification and topology learning", Neural Netw., vol. 19, no. 1, pp. 90-106, Jan. 2006.
23.
D. Roy, K. Hsiao and N. Mavridis, "Mental imagery for a conversational robot", IEEE Trans. Syst. Man Cybern. B Cybern., vol. 34, no. 3, pp. 1374-1383, Jun. 2004.
24.
K. Hsiao, N. Mavridis and D. Roy, "Coupling perception and simulation: Steps towards conversational robotics", Proc. IEEE/RSJ Int. Conf. Intell. Robots Syst., pp. 928-933, 2003.
25.
J. Piaget and B. Inhelder, The Child's Conception of Space, U.K., London:Routledge and Kegan Paul, 1956.

