Interconnections: Spatial and Temporal Representation within Image and Sound

UDC 78.04


Anastasya Koshkin – University of Wisconsin – Milwaukee, Department of Music, teaching assistant, Milwaukee, USA.


2400 E Kenwood Blvd, Milwaukee, WI 53211, USA,

tel: +1 414-229-1122.


Benjamin Ives Gilman’s ethnographic work “Hopi Songs” is prominent in ethnomusicology as an example of an imaginative interpretation of music. Gilman’s documentation of Hopi melody and his visual interpretation of the diatonic scale create an imaginative expression of the “Hopi Songs” alongside the Western musical tradition. This study takes inspiration from Gilman’s work, in which symbolism serves as an indication of content engaging the intercultural imagination. Here two distinct languages act as one within the materiality of musical expression, developing an instance of universality in language and sound. Musical and linguistic landscapes pertaining to the Russian and English languages serve as interconnections integrating language with music-based sound and sonic representation. The analysis focuses on language in the author’s musical work “Siberian Waves,” where an event of language-based expression precedes the musical event. By capturing music and language within systems of signs, such as musical transcription and notation, and with digital media, this study presents an effort to extend music toward wider notions of culture and identity.


Keywords: music; folklore; image; visual perception; audio perception.


Interconnections: Spatial and Temporal Representations in Image and Sound


Anastasya Koshkin – University of Wisconsin – Milwaukee, Department of Music, instructor, Milwaukee, United States of America.


2400 E Kenwood Blvd, Milwaukee, WI 53211, USA,

tel.: +1 414-229-1122.

Author’s Abstract

Benjamin Ives Gilman’s ethnographic work “Hopi Songs” is a significant contribution to the development of ethnomusicology and an example of an imaginative understanding of music. Gilman’s account of Hopi melody and his visual interpretation of the diatonic scale give the “Hopi Songs” an imaginative expressiveness on a par with the Western musical tradition. Following Gilman’s study, this work treats symbolism as an indicator of the engagement of intercultural expressiveness. In this study two different languages act together within a musical utterance, realizing the universality of language and sound. The musical and linguistic worldviews characteristic of the Russian and English languages serve as a means of mutual understanding that unites language and musical representation. In analyzing the author’s own composition “Siberian Waves,” the author examines how a musical phrase is constructed on the basis of a preceding linguistic expression. By considering music and language through sign systems such as musical transcription and notation, as well as through digital media, this study attempts to extend music toward broader notions of culture and identity.


Keywords: music; folk music; imagery; visual perception; auditory perception.


The Terrestrial Environment

Working with visuals and sounds as forms of embodied experience, we may communicate the presence of distance within the material world. Contours and boundaries of the external world may be brought toward human comprehension via visual and auditory stimuli. The material world is a terrestrial surface where earth and air meet. The environment between earth and air is filled with materials of light and sound, which are reflected and absorbed. The media of image and sound may be viewed within an ecological framework to explore avenues of engaging with the external world. Photography and film create imprints of places in time. The photograph may capture the presence of a particular place and the quality of a particular landscape. Similarly, a medium of sound, such as recorded sound, may be viewed as a temporal imprint of a certain place in time.


Psychologist James Gibson writes that granules of sand of more or less uniform size are scattered all over the terrestrial environment. Gibson draws a distinction between the notion of space as a uniform unit of measurement and the notion of space as a composition of vast diversity. Within the terrestrial environment, distances are filled with unique phenomena and change. The English word “distance” derives from a Latin origin meaning “to stand apart” [8, p. 131]. The terrestrial environment is composed of distances that stand apart as variables of unique material. Human perception of the visual and the sound-based is of particular interest here. Interdisciplinary fields of study concerned with the visual and with sound may be employed to address the embodied experience of distance.


The Embodied Experience within Contemporary Culture

Perceptions of distance and spaciousness are a source of knowledge and a form of embodied experience. With the advent of civilization, more complex social and political orders were formed, framing notions of distance and space in a particular way. In “The Mass Ornament: Weimar Essays” Siegfried Kracauer explores the concept of “mute nature” as a spatial environment that is seldom given a voice in contemporary culture [14, p. 61]. Environments of nature are often seen as unknown and unmanageable.


Distance within a spatial environment may be approached within an ecological framework. Contact with the terrestrial environment is ultimately an embodied experience. It is nested and embedded within the material world. The realm of digital media takes an interest in recreating embodied experience, such as the experience of distance. An increased focus on interactive and embodied technology, such as the World Wide Web and telepresence, has contributed to the collapse of “physical distances, uprooting the familiar patterns of perception” [16]. An understanding of the world on the basis of digital artifacts and user-centered interfaces strives to provide an embodied experience within the digital interface. Within digital constructions of spatio-temporal frameworks, “at least in principle, every point on Earth is now instantly accessible from any other point on Earth.”


Photography as an Imprint of the Material World

The photograph is an imprint of a certain place in time. In John Berger’s collection of essays “About Looking,” a photograph is described that depicts a peasant couple leaning over a fence – behind them a field is unfolding. Berger states in relation to the photograph, “it is not the substantiality of surfaces which fill every square inch but a Slav sense of distance, a sense of plains or hills that continue indefinitely” [1, p. 50]. The photograph takes on symbolic properties in relation to physical environments. The photograph may capture the presence of a particular place and the quality of a particular landscape.


I have photographed the Soins Forest, an ancient forest on the outskirts of Brussels. The Soins Forest stands at the edge of the city. Within the confines of the forest, distant landscapes are surrounded by towering trees and soft, hilly undergrowth. Green-filled distances stretch and rest before my eyes. A sense of distance can be felt within the presence of the Soins Forest. A certain distance unfolding out there is captured and present within the photograph as well. The photograph visually captures a vast expanse of the natural environment as it unfolds.


Along landscapes of Belarus, Israel, Canada, and the United States, I encountered places that have molded and shaped the quality of my voice. Currently, in a suburb of the Greater Toronto Area, a building stands at the edge of the Scarborough Bluffs. It is peering toward a great expanse of water. Lake Ontario is infinite and blue. The transition between Canadian and American land is as far as the eye can see. Distance unfolds from Toronto toward the shores of Buffalo. The sky is vast and unoccupied. Lake Ontario brings gusts of wind. Windows and balconies are lit with orange light. Light streams from windows and warm wind blows from the lake. The building stands. A distance of seventeen stories climbs up into the air in the night.


The “Slav sense of distance” that Berger describes treats the photograph as a symbol, a depiction of continuous space. For instance, as seen in Fig. 1, a sense of distance unfolds within the photographs titled “Views in the Ural Mountains and Western Siberia, survey of waterways, Russian Empire.” The photographs depict Western Siberia as a continuous distance. In one of the photographs, small houses and outlines of church spires are seen. The photographs bring knowledge about a faraway place and serve as symbolic objects representing it. Inhabitants of places composed of vast distances view the presence of those distances within the external world as part of their lives. As Berger states, “what informs the whole photograph – space – is part of the skin of their lives” [1, p. 50].


The Audiovisual in Film

Film constructs a sense of distance and space as a reflection of external reality. Holding the world within a unified perspective, visuals and sounds construct perceptions of space. The human system continuously perceives a world that exists on the surface of the terrestrial environment between earth and air. Navigating between distances and borders of land and developing a shared experience of the terrestrial environment are essential to human experience. As light and sound are propagated along boundaries and outlines of the material world, visuals and sounds serve as imprints of the material world.


A profound respect for distance and space within the external world may be found within the genre of documentary film. In the documentary film “Fata Morgana” the audiovisual environment maintains a continuous encounter with the distant land of the Sahara desert. The camera crosses uninhabited landscapes of the Sahara desert, presenting them to the viewer as a continuous encounter with distant, uninhabited land. Within the camera’s viewfinder, distance becomes an imprint within a continuum.


In “Fata Morgana” the voiceover is narrated by Werner Herzog, the director himself. The narrator speaks about landscapes of the desert, interpreting and bringing human presence to the visual world of uninhabited distances. The voice of the narrator interprets images of distance and becomes an important presence within landscapes of distance. As the narrator’s voice speaks, brimming with presence, the viewer finds zones of comprehension alongside images of uninhabited and distant land. Thus distances become present within the visual frame as they are given a voice and as time is spent with them.


The voice of the narrator drifts along desert sand; it approaches points in time as moments of encounter with human language and speech. “Textual speech – generally that of voiceover commentaries – inherits certain attributes of the intertitles of silent films, since unlike theatrical speech it acts upon the images” [5, p. 172]. Textual speech unifies durations of visual footage, acting upon the images invisibly yet audibly. Interpreting visuals of the desert, the narrator’s voice serves as a nondiegetic presence in film. Nondiegetic sound is a sound whose source remains off-screen. Media theoretician Michel Chion describes the nondiegetic as a particular presence within the film’s audiovisual domain. The word nondiegetic is used “to designate sound whose supposed source is not only absent from the image but is also external to the story world” [5, p. 73].


The external world of the desert is impervious, suspended before the viewer’s eyes. An immense distance of the desert unfolds. Sand stretches yellow and gold. The vast environments of the desert become presences, unfolding and self-contained. The presence of the narrator’s voice in “Fata Morgana” is important. The voice of the narrator drifts along desert sand, approaching points in time as moments of encounter. It acts independently of the visual content, unifying the plurality of the visual content into a coherent perspective.


Within the visual domain, objects such as trees within a forest are present and visible at once. Within the auditory domain, on the other hand, objects may be presented sequentially with the passing of time. For instance, a music composition or a narrator’s voiceover within a film communicates audible presences gradually, presenting them one at a time within a narrative arc. Whereas all objects may be visible at once within a visual environment, they may be articulated gradually with sound. This type of gradual articulation is reminiscent of perceiving places at a distance. As event-based environments of sound appear at a slower and more gradual pace, they expand as heterogeneous moments of time, separated by distances.


In the section “Music as Spatiotemporal Turntable” Chion writes, “Music can swing over from pit to screen at a moment’s notice, without in the least throwing into question the integrity of the diegesis, as a voiceover intervening in the action would. No other auditory element can claim this privilege. Out of time and out of space, music communicates with all times and all spaces of a film, even as it leaves them to their separate and distinct existences. Music can aid characters in crossing great distances and long stretches of time almost instantaneously” [5, p. 81].


In the documentary film “Fata Morgana,” across visuals of uninhabited space, the narrator’s commentary and the presence of music characterize the visual imprint of uninhabited space. The narrator’s voice acts invisibly yet audibly in relation to the landscapes of the desert, unifying visual durations into a coherent whole. As the medium of music unifies durations of film, it seamlessly crosses boundaries between one scene and another, crossing distances that are captured within the visual footage. The film captures a dynamic between a spatial environment as a homogeneous surface and the presence of sound as a heterogeneous temporal imprint. The relationship between visual objects and the material of sound is fundamental within expressive culture. The psychologist James Gibson views human perception in close relationship with surrounding environments of space. Gibson’s work “concentrated on what he called the direct perception of the environment” [10, p. 2]. Gibson advocated a view of human psychology striving toward the “sharing of the environment” [10, p. 2]. In the spirit of sharing, the medium of film combines visual and auditory modes of expression, reflecting human experience and the material world.


The Music Domain

“Music, like all the other arts, requires its own interpretation of the concept of distance” [20, p. 137]. Human perception of space and distance has been addressed by prominent electroacoustic music composers and theoreticians such as John Chowning. Chowning’s article “Perceptual Fusion and Auditory Perspective” addresses the human perception of sound as it is sculpted and framed by the material world. Furthermore, the musical work Chowning composed “demonstrated something profound, musically, technically, and about the human perceptual system.” The human perception of sound discerns audible aspects of the external world.


Electroacoustic music emerged from a focus on the material of sound and “the information contained in the spatial behavior of sounds.” Patterns of music serve as symbols, aligning our bodily experience of the external world with the material of sound. Electroacoustic music composition is particularly interesting as a contemporary avenue of interpreting and drawing wisdom from the terrestrial environment. Within music, presences within the material world, such as distance and time, become present within the material of sound. To convey a certain distance between one event and another, music composers have integrated sound objects, sculpting atmospheres of distance with sound. The perception of sound-based phenomena, such as recorded sound, is particularly interesting as it communicates the presence of space and distance within the physical world. As a consequence of the surrounding environment, perceptual awareness of sound and music may be developed.


Sculpting particular properties of the sound object as a musical moment, we may work with the properties of sound, such as pitch and timbre, to “penetrate the internal organization of the sonic symbols” [24, p. 159]. Each note is a discrete articulation of a single pitch. Music composers choose between one note and the next, emphasizing and accentuating events, deciding on presences and absences of sound. The Japanese composer Toru Takemitsu utilized notions of space within music composition in terms of the Japanese “ma.” The Japanese “ma” is a kind of space, and a pause, between two presences or potential presences in the physical environment. As a presence and an absence, “ma” affords a consideration of a certain place at a distance. Within music composition Takemitsu approached presences and absences of sound in relation to spatial environments and aesthetics of distance. Music composers return to the concept of distance again and again. For instance, the electroacoustic works “Mint Cascade,” composed by Andy Dolphin, and “Givre,” composed by Monique Jean, sculpt sound-based presences, communicating a spatial environment flooded by sound and interpreting and presenting imagined environments.


The Domain of Recorded Sound

Contemporary practitioners have been considering music and sound art in relation to the surrounding environment. For instance, sound artist Jacob Kirkegaard created a sound work titled “4 Rooms.” Kirkegaard recorded sonic environments within four deserted acoustic spaces in alienated regions of Chernobyl, Ukraine. Kirkegaard fed the recorded environments back into the acoustic spaces several times in order to bring out the reverberant qualities of the empty rooms. He selected acoustic spaces that had a significant reverberant quality to capture the presence of empty space as an audible aura. The audible silences within the reverberant spaces, as durations of time, signal the presence of sound as an ongoing atmosphere. The sound of silence becomes an audible presence, an atmosphere filled with unwitnessed time and space. As one listens to the audible durations of emphasized silences within Kirkegaard’s “4 Rooms,” the presence of sound-based silence, as an aura of a place, becomes a recorded presence unique to each room.


“Field recording practices are considered by many soundscape artists as integral to engagement with specific places” [17, p. 1]. The practice of field recording may be approached as a way of learning about and considering the sounds of distant environments. An embodied sense of distance comes from human experience. My experience of field recording sculpts my understanding of specific places and provides many opportunities for reflection. I like to record environmental sounds on the periphery of nature and civilization and listen to them as evolving landscapes. At Brooklyn Army Terminal in Brooklyn, New York, I recorded the presence of sounds as they echoed in the distance, as birds sang, as ship engines came and went, and as people on the pier explored landscapes of land and water.


The medium of recorded sound presents an opportunity to explore a specific place and keep it as a record or an audible presence. “Creating sonic fantasies begins with recording, letting us ‘photograph’ real sounds and store their images on tape” [19, p. 350]. Environmental sound as a presence signifies a specific relationship to spatial environments. A sound recording captures an audible presence of a physical environment. A field of study termed “acoustic ecology” strives to capture and communicate the presence of unwitnessed sound embedded within remote environments. Practitioners of acoustic ecology encounter environments that are located at great distances from human culture, as presences heard within unwitnessed canopies of sound. “Acoustic ecology investigates the relationships between living beings and the acoustic environment or the soundscape.”


A sound recording inverts spatially occurring phenomena to a temporal imprint. Recorded sound is a temporal medium. It may point to a particular place at a distance from human perception. There is “the view that perceiving is a matter of constructing a mental representation from sensory inputs.” (Imagination article) A sound recording may be heard and interpreted as distance itself. The dynamic between a recorded duration of time and the spatial environment echoes the structure of the physical environment. The medium of sound-based recording, such as the field recording, translates spatially occurring phenomena to temporal imprints, resulting in a dynamic between spatial presence and temporal presence. This signifies a symbolic relationship between spatial and temporal environments, in which distance and aura coalesce to form the vast spatio-temporal environments in which we live. Thus, the material of recorded sound echoes the vastness and complexity of spatio-temporal environments as a point of value and deserved focus.



1. Berger J. About Looking. New York, Pantheon Books, 1980, 202 p.

2. Bird R. Andrei Tarkovsky: Elements of Cinema. London, Reaktion Books, 2008, 256 p.

3. Brower C. A Cognitive Theory of Musical Meaning. Journal of Music Theory, 2000, Vol. 44(2), pp. 323–379.

4. Bregman A. S. Auditory Scene Analysis. International Encyclopedia of the Social and Behavioral Sciences. Amsterdam, Pergamon, pp. 940–942.

5. Chion M., Gorbman C., Murch W. Audio-Vision: Sound on Screen. New York, Columbia University Press, 1994, 239 p.

6. Conway C. M. The Oxford Handbook of Qualitative Research in American Music Education. Oxford University Press, 2014, 696 p.

7. Cook P. R. Music, Cognition, and Computerized Sound: An Introduction to Psychoacoustics. Cambridge, MIT Press, 1999, 392 p.

8. Cresswell J. The Oxford Dictionary of Word Origins. Second Edition. Oxford, Oxford University Press, 2010, 512 p.

9. Herzog W. Fata Morgana (film), 1971.

10. Gibson J. J. The Senses Considered as Perceptual Systems. Westport, Greenwood Press, 1983, 335 p.

11. Kane B. L’Objet Sonore Maintenant: Pierre Schaeffer, Sound Objects and the Phenomenological Reduction. Organised Sound, 2007, Vol. 12(01), pp. 15–24.

12. Koshkin A. Along a Duration of Time: Extending Human Involvement to Remote Regions of the Earth. Leonardo Music Journal, 2014, Vol. 24, pp. 43–44.

13. Koshkin A. Ecological Validity. Music in the Social and Behavioral Sciences: An Encyclopedia, 2014, Vol. 5, pp. 356–358.

14. Kracauer S., Levin T. Y. The Mass Ornament: Weimar Essays. Cambridge, Harvard University Press, 1995, 403 p.

15. Kubik G. Theory of African Music. Chicago, University of Chicago, 2010, 464 p.

16. Manovich L. Cinema and Telecommunication / Distance and Aura. Available at: (accessed 01 June 2013).

17. McCartney A. Ecotonality and Listening Praxis in Sound Ecology, Ambiences, and Popular Music. Wi: Journal of Mobile Media, 2013, Vol. 7, No. 1. Available at: (accessed 01 June 2013).

18. Pacey A. Meaning in Technology. Cambridge, MIT Press, 1999, 264 p.

19. Roads C. Microsound. Cambridge, MIT Press, 2001, 409 p.

20. Rowell L. Thinking about Music: An Introduction to the Philosophy of Music. Amherst, University of Massachusetts Press, 304 p.

21. Schaeffer P., Brunet S. De la musique concrète à la musique même. Paris, Richard-Masse, 1977, 252 p.

22. Prokudin-Gorskiĭ S. M. Views in the Ural Mountains and Western Siberia, survey of waterways, Russian Empire. 1912, Library of Congress, 168 photographic prints.

23. Siddons J. Toru Takemitsu: A Bio-bibliography. Westport, Greenwood Press, 2001, 200 p.

24. Xenakis I. Formalized Music; Thought and Mathematics in Composition. Bloomington, Indiana University Press, 1971, 273 p.

Article citation:
Koshkin A. Interconnections: Spatial and Temporal Representation within Image and Sound. Philosophy and Humanities in Information Society, 2015, No. 4, pp. 54–61. URL:

© A. Koshkin, 2015