“Alarm Calls Evoke a Visual Search Image of a Predator in Birds”, Toshitaka N. Suzuki 2018:

One of the core features of human speech is that words cause listeners to retrieve corresponding visual mental images. However, whether vocalizations similarly evoke mental images in animal communication systems is surprisingly unknown.

Japanese tits (Parus minor) produce specific alarm calls when and only when encountering a predatory snake. Here, I show that simply hearing these calls causes tits to become more visually perceptive to objects resembling snakes.

During playback of snake-specific alarm calls, tits approach a wooden stick being moved in a snake-like fashion. However, tits do not respond to the same stick when hearing other call types or if the stick’s movement is dissimilar to that of a snake.

Thus, before detecting a real snake, tits retrieve its visual image from snake-specific alarm calls and use this to search out snakes. This study provides evidence for a call-evoked visual search image in a nonhuman animal, offering a paradigm to explore the cognitive basis for animal vocal communication in the wild.


NYT: …But critics objected that the calls might not have any properties of language at all. Instead of being intentional messages to communicate meaning to others, the calls might be involuntary, emotion-driven sounds, like the cry of a hungry baby. Such involuntary expressions can transmit rich information to listeners, but unlike words and sentences, they don’t allow for discussion of things separated by time and space. The barks of a vervet in the throes of leopard-induced terror could alert other vervets to the presence of a leopard—but couldn’t provide any way to talk about, say, “the really smelly leopard who showed up at the ravine yesterday morning.”

Toshitaka Suzuki, an ethologist at the University of Tokyo who describes himself as an animal linguist, struck upon a method to disambiguate intentional calls from involuntary ones while soaking in a bath one day. When we spoke over Zoom, he showed me an image of a fluffy cloud. “If you hear the word ‘dog’, you might see a dog”, he pointed out, as I gazed at the white mass. “If you hear the word ‘cat’, you might see a cat.” That, he said, marks the difference between a word and a sound. “Words influence how we see objects”, he said. “Sounds do not.” Using playback studies, Suzuki determined that Japanese tits, songbirds that live in East Asian forests and that he has studied for more than 15 years, emit a special vocalization when they encounter snakes. When other Japanese tits heard a recording of the vocalization, which Suzuki dubbed the “jar jar” call, they searched the ground, as if looking for a snake. To determine whether “jar jar” meant “snake” in Japanese tit, he added another element to his experiments: an 8-inch stick, which he dragged along the surface of a tree using hidden strings. Usually, Suzuki found, the birds ignored the stick. It was, by his analogy, a passing cloud. But then he played a recording of the “jar jar” call. In that case, the stick seemed to take on new importance: The birds approached the stick, as if examining whether it was, in fact, a snake. Like a word, the “jar jar” call had changed their perception.