Synchronization as a Sound-Image Relationship
8 Non-linear and Digital
The SMPTE/EBU timecode marks tendencies in audiovisual processes that have increasingly taken effect since the 1970s: digitalization and non-linearity. The general functional principle of non-linear editing systems is based on a separation between the image and sound data on the one hand and their temporal organization during playback and editing on the other. Preliminary editing decisions can thus be stored independently of the material and used for a preview, and various editing alternatives can be compared with each other and, if necessary, rejected. This applies not only to the (horizontal) sequence of the images, but also to the (vertical) relationships of image tracks and soundtracks to one another.[32] Non-linear editing requires that image and sound can be accessed precisely and relatively quickly at any point in time for the preview. Accordingly, this involves separating these data from audiotape and film, because these recording media materially implement a specific temporal sequence: one has to wind through them to arrive at a particular location. The first non-linear editing systems of the 1970s were analog-digital hybrids: access to analog image and sound material was controlled by a computer.[33] But in the early 1990s, digital editing systems began to establish themselves that enabled digital storage and manipulation of the data themselves.[34]
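This principle of keeping the stored material and its temporal organization apart can be illustrated with a minimal sketch of a non-destructive edit decision list. All names and structures here (Clip, Edit, the preview function) are illustrative assumptions, not the data model of any actual editing system:

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Clip:
    """A piece of source material; the data themselves are never modified."""
    name: str
    duration: float  # seconds

@dataclass(frozen=True)
class Edit:
    """A reference into a clip: which span of it to play."""
    clip: Clip
    t_in: float   # start point within the clip (seconds)
    t_out: float  # end point within the clip (seconds)

def preview(track: list[Edit]) -> None:
    """Play back one track by following the edit decisions in order."""
    t = 0.0
    for e in track:
        print(f"{t:6.2f}s  play {e.clip.name} [{e.t_in:.2f}..{e.t_out:.2f}]")
        t += e.t_out - e.t_in

interview = Clip("interview.mov", 120.0)
street = Clip("street.mov", 45.0)

# Two alternative cuts over the same material: either can be
# previewed and rejected without altering the source clips.
version_a = [Edit(interview, 10.0, 20.0), Edit(street, 0.0, 5.0)]
version_b = [Edit(street, 5.0, 12.0), Edit(interview, 30.0, 40.0)]
preview(version_a)
```

The editing decisions are nothing more than lists of references; keeping or discarding a version touches only these lists, never the recorded material.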
With respect to production, processing, and distribution as well, digital video and sound formats are subject, in comparison to film and audiotape, to a more fluid and variable economy of transmission and computation speeds and storage capacities. The degree of compression — that is, the reduction of data according to a variety of (spatial, temporal, statistical, perception-oriented, etc.) redundancy and optimization models — can be adjusted to the available bandwidth or storage capacity of a wide variety of devices and systems. The critical differences here are found at the level of the structuring and regulation of this economy by means of formats, protocols, and interfaces.[35] Decoding image and sound data requires computing time and is therefore a real-time problem: the computing operations have to be completed at prescribed times in order for image and sound to be played back accurately. The problem of a horizontal synchronization of the processing times of the individual data streams is thus added to the vertical synchronization of sound and image: audiovisual container formats such as MPEG-2 — which can in turn contain various video and audio coding formats — therefore each include a logistics according to which these data can be read, buffered, synchronized, and presented. This occurs, for example, by means of so-called time stamps, which can be written during the encoding of the individual tracks (audio and video). Because of the different sampling rates of digital sound (e.g., 44,100 Hz) and video (e.g., 25 frames per second), these time stamps can, as a rule, not be regulated against one another directly, but only by a central system clock.[36]
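The arithmetic behind such time stamps can be sketched under the assumption of MPEG-2's conventions, where presentation time stamps are counted in units of a 90 kHz clock; the audio frame size used here is illustrative. The sketch shows why the two streams cannot simply be regulated against each other: 25 video frames per second divide the clock exactly, while 44,100 audio samples per second do not, so both sets of stamps refer to the shared clock rather than to one another:

```python
# Presentation time stamps (PTS) derived from one shared system clock,
# as in MPEG-2, whose PTS values are counted in 90 kHz ticks.
SYSTEM_CLOCK_HZ = 90_000

VIDEO_FPS = 25          # one frame every 3600 ticks: 90000 / 25
AUDIO_RATE = 44_100     # sample rate of the audio stream
AUDIO_FRAME = 1_152     # samples per coded audio frame (illustrative)

def video_pts(frame_index: int) -> int:
    # exact: 90000 is divisible by 25
    return frame_index * SYSTEM_CLOCK_HZ // VIDEO_FPS

def audio_pts(frame_index: int) -> int:
    # rounded: 90000 * 1152 / 44100 is not an integer, so audio
    # time stamps never coincide with the video raster; the two
    # streams meet on the system clock, not on each other
    return round(frame_index * AUDIO_FRAME * SYSTEM_CLOCK_HZ / AUDIO_RATE)

# The decoder presents whichever access unit's time stamp matches
# the current value of the system clock.
for i in range(3):
    print("video", i, video_pts(i), "audio", i, audio_pts(i))
```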
In contrast to television or cinema, which at least in retrospect appear as a field of relatively few institutionalized audiovisual forms, a highly differentiated range of forms of recording, manipulation, distribution, and presentation is gathered under the umbrella term digital audiovisual media: from the cell-phone video to digital cinema transmitted via satellite. It is not only for this reason that one can say that digital audiovision introduces a specific kind of dependence on the situation of presentation: because digital data have to be computed in order to be output by screens, loudspeakers, and other audiovisual interfaces, the general option arises of intervening in the program (which can in turn exist as data) of these computations.[37] A very simple case — simple because it is defined by a restrictive format standard — would be the option of switching between different soundtracks (German, English, commentary, sound effects only, etc.) when playing back a DVD, as sketched below. Computer games, which can hardly be excluded from a concept of digital audiovisual media, go even further, as do certain software environments such as Max/MSP/Jitter, which enable audio, video, and other data to be retrieved, combined, and computed against one another in different ways.
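How little such an intervention in the presentation requires can be shown in a deliberately toy-like sketch of switchable soundtracks; the container layout and the play function are hypothetical and do not quote any real DVD player's interface:

```python
# A toy player with switchable soundtracks: the stored data remain
# the same, while the presentation depends on a run-time decision.
container = {
    "video": "main_feature",
    "audio": {
        "de": "German dub",
        "en": "English original",
        "comment": "director's commentary",
        "fx": "sound effects only",
    },
}

def play(container: dict, audio_track: str) -> None:
    """Present the video together with one selected audio track."""
    try:
        sound = container["audio"][audio_track]
    except KeyError:
        raise ValueError(f"no such soundtrack: {audio_track!r}")
    print(f"presenting {container['video']} with {sound}")

play(container, "en")       # presenting main_feature with English original
play(container, "comment")  # presenting main_feature with director's commentary
```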