Digital Video Sources for Playback
by Garry Musgrave, CTS-D
Originally published: January, 1999
Last updated: January 2001
Hard-disk and EEPROM digital video sources are becoming more and more popular for use in themed projects, exhibits, and planetariums. This article examines the basis of this technology, and contains recommendations for presentation (non-broadcast) use.
CAVEAT: Any article written about digital video playback technology has a limited life; this technology can change significantly in a year. A two-year-old article may be out of date, so we will try to update this article periodically (check the masthead for the update date).
Analog video consists of a continuously changing voltage for each of three components of a video signal: luminance (black and white) and two colour signals. These voltages, along with synch information, are typically recorded onto magnetic tape. Analog signals are appropriate for recording and transmitting video and audio because human beings are analog devices existing in an analog environment.
It is important to understand that an analog signal has virtually infinite resolution. For illustration, assume that the voltage representing a full swing of the video luminance information were, for example, 1 volt. Thus, 0 volts would be black, 1 volt would be white, and voltages in between would be shades of grey. The number of actual grey levels possible is theoretically infinite, as minute changes in signal voltage can be recorded. It should be just as easy to record a change from 0.67 to 0.68 volts as it is to record a change from 0.60 to 0.70 volts or a change from 0.06 volts to 0.07 volts.
In practice, the recording medium will affect the accuracy (or resolution) of the analog recording process. Tape size and formulation, record head construction, and tape speed all affect the end result. This is part of the reason why Betacam SP yields a much higher quality image than VHS and why laser disk (which isn't magnetic at all) is superior to both. Similar resolution limitations are imposed by the transmission medium (e.g.: CATV) and the display device (e.g.: your television receiver or monitor).
How much resolution do we need to make a video image look good? Video resolution is specified as horizontal vs. vertical; the units are lines for analog video and pixels for digital video (example: 486 x 320). For analog video, the vertical resolution is fixed by the number of visible scan lines allowed by the governing video standard: 486 for NTSC and 576 for PAL. The horizontal resolution is largely determined by the recording medium, and, to a lesser degree, by the display device. Here is a breakdown of some common analog video sources:
Analog Video Media                  Horizontal Resolution
8mm video                           230 lines
Betacam SP (composite output)       320 lines
Betacam SP (component output)       400 lines
Laser disk                          425 lines
Table 1. Horizontal Resolution of Typical Analog Video Sources
Of course, resolution alone doesn't determine the final image quality; with an analog system, the encoding and storage of the luminance and chrominance signals can also have a dramatic effect. This is why Betacam SP is preferred for broadcast use over, say, S-VHS.
Digital signals are actually a numeric representation of an analog event. Thus, instead of recording a continually changing signal, we record its value at a moment in time. This snapshot is called a sample. The biggest problem with trying to digitise a rapid and complex analog signal is resolution. Resolution must be high enough in both the time domain (how frequently we sample) and the value domain (how many steps we allow between our maximum and minimum voltages).
You can see that if a very complex signal has changed many times in one second, sampling only once a second would give a rather poor representation of what happened. Similarly, if we only allow, say, ten steps between our example of 0 and 1 volt for luminance, we can record only fairly coarse jumps in grey level. An example of this is the remote volume control of your television: the sound level is too low, you increase it by one notch, and now it is too loud. This is a low resolution digital control. If it were an analog control (i.e.: the good ol' volume knob), you could set it anywhere between these two levels.
A mathematician named Nyquist determined that the highest frequency which can be accurately represented is one-half of the sampling rate. Thus, the minimum sampling rate we can use must be twice the frequency of the signal we want to digitise; in practice, we generally sample slightly higher than exactly this frequency. For example, audio is sampled at a minimum of 44.1KHz, and often at 48KHz (96KHz and higher sample rates are beginning to appear in professional audio applications). The value domain resolution is determined by the number of bits used to store the value of each sample. For example, an 8-bit sample gives 256 steps, while a 10-bit sample gives 1,024.
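These two resolution limits can be sketched in a few lines of Python (the function names are ours, purely for illustration):

```python
# Sketch of the two digitisation limits described above:
# time-domain resolution (sampling rate) and value-domain resolution (bits).

def nyquist_min_sample_rate(max_signal_hz: float) -> float:
    """Nyquist: the sampling rate must be at least twice the highest
    frequency to be represented."""
    return 2 * max_signal_hz

def quantisation_steps(bits: int) -> int:
    """Number of discrete levels available for each sample value."""
    return 2 ** bits

# Audio extends to roughly 20 KHz, so sampling must exceed 40 KHz;
# CD audio's 44.1 KHz leaves a small safety margin above that minimum.
print(nyquist_min_sample_rate(20_000))  # 40000
print(quantisation_steps(8))            # 256 grey levels
print(quantisation_steps(10))           # 1024 grey levels
```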
Why not simply use the highest possible sample frequency with a huge number of bits per sample? Storage space and transmission bandwidth. When a digital signal is stored, it requires a certain amount of disk space. When it is transmitted (over a network, for example) it will use bandwidth.
Video is different from most pure analog signals (like audio) in that it has an inherent rate of change the video frame rate. In NTSC video, a new frame is generated about every 33 milliseconds. During this time, we will record a snapshot of the frame at 720 x 486 resolution. The 720 comes from the currently adopted digital video standard, and 486 is the number of visible scan lines per NTSC frame. Thus, each of these 486 lines is sampled 720 times.
To compound the problem, remember that we have not only the luminance portion of the video signal to digitise, but the colour as well. Due to the way the human eye works, we can get away with sampling colour at half the rate of luminance (the eye is more sensitive to changes in brightness than colour). This is expressed as 4:2:2 sampling (for each line, we record 720 samples of luminance, but only 360 samples of the two colour signals). There are other sampling schemes the two most common being 4:2:0 and 4:1:1.
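As a rough sketch, the sample counts for these schemes can be compared with a little Python (the function name is ours; it uses the 720 x 486 NTSC raster discussed above):

```python
# Samples per frame for the chroma subsampling schemes mentioned above,
# using the 720 x 486 NTSC raster from the text.

WIDTH, LINES = 720, 486

def samples_per_frame(scheme: str) -> int:
    luma = WIDTH * LINES
    if scheme == "4:2:2":      # half-rate chroma, on every line
        chroma = 2 * (WIDTH // 2) * LINES
    elif scheme == "4:2:0":    # half-rate chroma, on every other line
        chroma = 2 * (WIDTH // 2) * (LINES // 2)
    elif scheme == "4:1:1":    # quarter-rate chroma, on every line
        chroma = 2 * (WIDTH // 4) * LINES
    else:
        raise ValueError(f"unknown scheme: {scheme}")
    return luma + chroma

for scheme in ("4:2:2", "4:2:0", "4:1:1"):
    print(scheme, samples_per_frame(scheme))
```

Note that 4:2:0 and 4:1:1 end up with the same total amount of chroma data per frame; they differ only in whether colour resolution is sacrificed vertically or horizontally, which is why their artefacts look different.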
If each sample is stored as an 8-bit value, we will need about 20 Megabytes to store each second of video. Thus, you would need over 35 Gigabytes of storage for 30 minutes of video material! Apart from the storage consideration, you still need to be able to move these signals around. The data rate required for this same signal is about 168 Megabits/Second. This is over 16 times faster than the average office LAN, and still faster than the 100 Mb/S fast Ethernet and ATM connections commonly used for network backbones. The highly touted Gigabit standard would be required to move uncompressed video streams with reliability.
HDTV, with its higher resolution image, will require even more storage capacity and much higher bandwidth. HDTV is a good example of what we will be facing in the future. While there are several flavours of HDTV (with varying resolutions), full resolution HDTV (1920 x 1152 at 60fps) sampled 22:11:11 (the HD equivalent of 4:2:2) at 10-bit resolution requires over 300 MB for one second of video (a data rate of over 2.5 Gigabits/Second).
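These storage and bandwidth figures are easy to verify with a back-of-envelope Python check, assuming 8-bit 4:2:2 NTSC at 30 fps and 10-bit 4:2:2-style HD at 60 fps as in the text:

```python
# Back-of-envelope check of the uncompressed data rates quoted above.

def bits_per_second(width: int, lines: int, fps: int, bits: int) -> int:
    # 4:2:2 sampling means one luma sample per pixel plus two half-rate
    # chroma samples: two samples per pixel in total.
    samples_per_frame = width * lines * 2
    return samples_per_frame * bits * fps

sd = bits_per_second(720, 486, 30, 8)     # NTSC, 8-bit
hd = bits_per_second(1920, 1152, 60, 10)  # HDTV, 10-bit

print(f"SD: {sd / 1e6:.0f} Mb/S ({sd / 8 / 1e6:.0f} MB per second)")
print(f"HD: {hd / 1e9:.2f} Gb/S ({hd / 8 / 1e6:.0f} MB per second)")
# SD: 168 Mb/S (21 MB per second)
# HD: 2.65 Gb/S (332 MB per second)
```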
The solution is to somehow reduce this quantity of data to a more manageable size; this is where compression comes in. Compression can be either lossless or lossy. Most computer users have experienced lossless compression when they have zipped a number of files. The resulting ZIP file is usually smaller than the total of the contained files, yet every bit of data can be recovered with no loss. Unfortunately, this method can only compress a typical video image by about 20 - 30%, which is not nearly enough. Further compression can only be accomplished with lossy techniques, in which part of the picture information is irrevocably discarded.
There are many ways of doing this. Transforms are used to eliminate high-frequency picture information (detail) that has been statistically determined as not being critical to the eye. Other algorithms analyse the video on a frame-by-frame basis, and discard anything that doesn't change from frame to frame (keeping it only in an initial key frame). More extreme compression methods involve reducing colour depth and lowering the frame rate, neither of which is at all desirable.
One of the most confusing aspects of digital video compression is that there are many types and flavours. The two most popular formats are MPEG and MJPEG (motion JPEG). The three main MPEG flavours are MPEG-1, MPEG-2, and MPEG-4 (MPEG-3, originally intended for HDTV, was absorbed into MPEG-2 and is no longer used). MPEG-1 was originally developed for CD-ROM and multi-media use, does not look quite as good as VHS tape on a good day, and is disappearing (thankfully). MPEG-4 is for low resolution, low bandwidth teleconferencing and internet use (essentially replacing MPEG-1). MPEG-2 is the MPEG standard for video.
Unfortunately, there is not even one single MPEG-2 standard, but five profiles with four levels each. You must be somewhat careful when evaluating a product that claims to be MPEG-2. It is important to find out what profile and level it supports; some are no better quality than MPEG-1. The two principal MPEG-2 standards are 4:2:2 Profile@Main Level and Main Profile@Main Level (limited to 15 Mb/S). There is also a High Profile for HDTV applications. To complicate things further, there is I MPEG, IB MPEG, and full IPB MPEG; the meanings of these terms are fodder for another entire article. Suffice it to say that they vary in efficiency (reduction of data) and in ease of access to individual frames.
MPEG-2 is rapidly becoming the de-facto standard for video transmission, as it is more specifically tuned to video than MJPEG (JPEG was designed to compress photographic images), and the MPEG-2 data structure is specifically designed to be delivered as a stream over ATM and other high-speed networks. That said, MJPEG works extremely well for exhibit playback (in some cases, superior to MPEG-2), supports higher data rates, and is well supported on a number of hard disk recorder/players.
Both digitisation and compression introduce distortions or artefacts into the video image. They are present in all digitised video; the degree to which they are visible depends on the amount of compression and the nature of the video content.
Digitisation can cause the following problems:
Aliasing occurs when the original signal contains frequencies that are too high to be sampled at a given sampling rate (remember Nyquist?). This shows up as pronounced vertical lines. It is often eliminated by stripping high frequencies from the video source before digitising; unfortunately, this also removes picture detail, resulting in a rather soft image.
Quantisation error results from reducing the sample size to a value that is too low. When there are not enough unique colours (e.g.: less than 24-bit colour) to represent the subtle changes in hue that are present in analog video, the image starts to look more like a painting than a photograph (similar to a posterisation effect).
A similar problem is caused by overload distortion. Here the full range of the analog signal cannot be represented by the sampling, and picture levels above a certain value all become white resulting in a bleached look. The opposite problem (wrap-around) occurs when all out of range values are mapped to black.
Bit errors can occur when there is a transmission error in delivering the digital signal over a transport medium. Unfortunately, digital video fails far less gracefully than analog: a single bit error can result in an oddly coloured block appearing in the image. You have likely seen this in television broadcasts that are transmitted by satellite: several multi-coloured rectangles pop into the picture and disappear again.
Compression can cause the following problems:
The Gibbs Effect is one of the most common problems with both JPEG and MPEG compression. This shows up as a blurring or halo around a sharply defined artificial object such as overlaid text or graphics. A similar problem, referred to as "mosquitoing", occurs on natural objects (typically people): the background adjacent to the person appears to shimmer as the person moves.
When video with a high degree of motion is digitised using JPEG or MPEG, blockiness can appear: the individual blocks that make up the digitised image can become visible.
When high compression ratios are used, the rules governing what is discarded during compression can start to interfere with significant detail. For example, a highly compressed baseball video may result in the ball itself being eliminated by the compression algorithm due to its size!
Remember that compression throws away picture information, and it can introduce problems into scenes with a lot of motion. Once you reach a certain level of compression, the degradation starts to become visible even to an untrained eye. Thus, you don't want to unilaterally use extremely stiff compression to save on storage space and transmission bandwidth. Beware of claims of "broadcast quality" video at compression rates from 40:1 to 100:1.
In our opinion, a data rate of 5 Megabits/Second is the minimum quality to use for presentation playback. This same performance can be specified as a compression ratio: in this case, about 35:1 compression. Subjectively, compression artefacts are invisible at 8:1 compression or less (i.e.: about 20 Mb/S or greater). In our opinion, higher compression is only suitable for multimedia or network use. A higher data rate (lower compression ratio) is recommended when warranted by the material or the image quality desired.
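Since the uncompressed 4:2:2 NTSC rate works out to roughly 168 Mb/S, converting between a data rate and a compression ratio is simple division. A small Python helper (names ours, for illustration):

```python
# Converting between data rate and compression ratio, assuming the
# ~168 Mb/S uncompressed 4:2:2 NTSC rate derived earlier in the article.

UNCOMPRESSED_MBPS = 168.0

def ratio_for_rate(rate_mbps: float) -> float:
    """Compression ratio implied by a given compressed data rate."""
    return UNCOMPRESSED_MBPS / rate_mbps

def rate_for_ratio(ratio: float) -> float:
    """Compressed data rate implied by a given compression ratio."""
    return UNCOMPRESSED_MBPS / ratio

print(ratio_for_rate(5))   # 33.6 -- the article's rounded "about 35:1"
print(rate_for_ratio(8))   # 21.0 Mb/S -- "about 20 Mb/S or greater"
```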
Also consider the sampling scheme being used: 4:2:0 can suffer from vertical colour smearing (it uses the same sampling frequency as 4:2:2, but only samples colour on every other line); and 4:1:1 will reduce colour detail and can introduce colour artefacts (it samples colour on every line, but colour is sampled half as often as in 4:2:2). We don't recommend either of these unless very high compression ratios are being used (i.e.: greater than 35:1).
If you want a spectacular image that goes beyond simply replacing laser disk, or if you have a complex image, you will want to use a higher data rate (i.e.: less compression). Generally speaking, the more complex detail or action (motion) you have in your video material, the higher the data rate needed. Here is a rough guide:
Typical exhibit use (direct LDP or S-VHS tape replacement):
    5 Mb/S to 10 Mb/S (35:1 to 16:1)

High detail or motion in program material (7:1 is comparable to Betacam SP):
    11 Mb/S to 24 Mb/S (16:1 to 7:1)

Spectacular imaging beyond analog; display equipment must support this
(5:1 is equivalent to DV, 3.3:1 to DVCPRO50 or DIGITAL-S, and 2.2:1 to
Digital Betacam):
    25 Mb/S to 160 Mb/S (7:1 to uncompressed)
Table 2. Recommended Compression Rates for Presentation Uses
MPEG-2 also has the option to use a variable bit rate instead of a constant bit rate. Variable bit rate MPEG has the advantage of optimising storage: higher compression is used when the content (as determined by the compression system) is less critical. For typical exhibit use, a variable bit rate of between 3 and 6 Mb/S should yield satisfactory results as a replacement for tape or laser disk.
If you are at all in doubt, preview critical sections of your material at the intended compression level.
It is appropriate at this point to explode a popular myth: digital is always superior to analog. This is simply not true. A live video feed directly from one room to an adjacent area would not benefit from digital encoding or compression technology. Unfortunately, we very seldom use live material transmitted over a short distance; we are more commonly recording, processing, and transmitting signals.

When an analog event has been digitised properly, the recorded quality can be spectacular. While digitisation will not improve the source (another myth), the result should be identical to our live direct feed. In the analog world, we would have suffered some degradation as soon as we stored the signal (e.g.: on a VTR). Digital signals have the great advantage of being largely non-degradable: once digitised, the signal should retain its original quality after repeated processing and transmission operations, assuming it has not been re-digitised anywhere along the way.

The phrase to be wary of is "its original quality". Consider MPEG-1 video. While this is undisputedly digital video, I certainly would not trade it for an analog format like Betacam SP (or possibly even VHS). Similarly, a digital audio recording made at a 10KHz sample rate with 8-bit samples could likely be surpassed in quality by your average portable cassette recorder.
There are a number of off-the-shelf units that can easily drop into existing systems as replacements for laser disk or Betacam players. Some well-known manufacturers of MJPEG and MPEG-2 players: Adtech, Akman, Alcorn McBride, Doremi, DPS, Fast Forward, and Visual Circuits. Most of these are self-contained, 19-inch rack-mount units. The exception is DPS, who also manufacture a PCI card to go into a customer-furnished computer. There may also be others we are not aware of; there certainly are other manufacturers of broadcast units or non-linear editors with features, performance, and corresponding pricing that don't make sense for exhibit or presentation playback.
Things to look for:
Does the unit perform its own digitising directly from a composite or Y-U-V input? Such units are recorders, like a VTR with instant random access; almost all MJPEG units do this. Some MPEG-2 units are players only, which means you must have the means to generate high-quality MPEG-2 files externally to the unit (remember: not just any old MPEG encoder will do). In either case, be aware of the garbage-in/garbage-out rule: your original program material should be produced to the highest quality standards prior to digitising (ideally at least Betacam SP quality). Digitising and then compressing the output from a VHS tape will yield less than desirable results. Noise also wreaks havoc with compression algorithms.
Does the unit have a component output? If you want to go beyond what was possible with laser disk and Betacam, and create a visually striking image, you will need a component video output (as opposed to composite video).
Does the unit have multiple outputs? Some units can generate two, three, or four completely independent outputs (i.e.: each output can be playing a different clip from the hard drive). This can be very cost effective in installations that require multiple players.
Is the unit serially controllable? For exhibit work, you must be able to have your show controller or interactive controller seek the unit to a specific frame and play. Some units have no serial control capabilities; these should be avoided. Most have RS-422 supporting the SONY P2 protocol, which is a broadcast standard for VTR control. Some have RS-232 control that may mimic a laser disk player or be proprietary. It is important that the control method is supported by your controller. Ensure that the encoding scheme supports frame-accurate searching (some forms of MPEG are not easily frame searchable).
Does the unit support off-the-shelf A/V hard drives, and what are the expansion limitations? Most units now support off-the-shelf hard drives, and you can often buy them without drives (drives are generally cheaper elsewhere than from the player manufacturer). Note that you must use special A/V drives such as the Seagate Barracuda; standard data drives will not work. If you don't feel comfortable installing your own hard drives, order them with the unit. Be sure to order sufficient hard drive capacity for at least the material you plan to load into the machine. As a rule-of-thumb, assume that it takes about 20 Megabytes to store one second of uncompressed 4:2:2 video. Thus, if you plan on 8:1 compression, and you have 20 minutes of video, you will need at least 3 GB of storage (20 x 60 x 20MB/8). In this case, we would specify a 4GB drive. CAUTION: a Gigabyte is technically 1,024 Megabytes, but most drive manufacturers assume that it is 1,000 MB. If you have a lot of video program material, ensure that the unit is expandable with either more internal drives or by using external drives.
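The rule-of-thumb calculation above can be wrapped in a short Python function (the name is ours) for quick sizing estimates:

```python
# Storage sizing per the rule of thumb above: ~20 MB per second of
# uncompressed 4:2:2 video, divided by the planned compression ratio.

def storage_mb(minutes: float, compression_ratio: float,
               uncompressed_mb_per_sec: float = 20.0) -> float:
    return minutes * 60 * uncompressed_mb_per_sec / compression_ratio

# The article's example: 20 minutes of material at 8:1 compression.
print(storage_mb(20, 8))  # 3000.0 MB, so specify at least a 4 GB drive
```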
While costs keep coming down, these units are certainly not yet an economic alternative to a single laser disk or DVD player. Multiple output units do start to make economic sense as replacements for multiple disk players. Where they really make sense, however, is when the desire is to go beyond the image quality that is possible with either DVD or laser disk, and you have a budget to complement this desire.
The other consideration is the availability of laser disk players in the future. Manufacturers are concentrating their efforts on DVD players, and will phase-out laser disk players.
DVD is also a digital playback medium, and it is fast replacing laser disk in exhibit projects. Ensure, however, that you use an industrial DVD player (such as the Pioneer DVD-V7400), rather than a consumer model. These units are designed as direct LDP replacements: they are frame addressable, are rugged and dependable in high-use situations, and are controlled via RS-232 using a similar protocol.
Remember that DVD was designed for consumer applications: to replace laser disk in home theatres and, ultimately, consumer video tape. While its quality is excellent compared with what it replaces, it is not on a par with full broadcast digital video. DVD has a horizontal resolution of about 500 lines: slightly better than laser disk and Betacam, but not as good as hard disk at high data rates. It uses a restricted version of MPEG-2 MP@ML with 4:2:0 sampling, and is also allowed to use MPEG-1 (watch out: this can be dreadful quality!). It has an average data rate of 3.5 Mb/S, to a maximum of just under 10 Mb/S.
On the plus side, DVD offers greatly increased storage capacity, and has added capabilities such as multiple language support, interactivity, multiple camera angles, and parental control, most of which are more suited to consumer applications. The interactive element is tempting, until you delve into it. The DVD disk can be programmed so that a mouse (connected directly to the player) can be used to interact with on-screen buttons. The drawbacks are that the programming is not easily done by the end user, and, like Level II LDP, it cannot be changed once burned onto the disk without repressing the disk. In our opinion, external interactive control always offers more flexibility and ease of change.
Video monitors also have a resolution. Every video monitor will, of course, have vertical resolution to match the governing video standard (486 lines for NTSC and 576 for PAL). Monitors will vary considerably, however, in horizontal resolution. Low priced or consumer units may have horizontal resolutions of 300 lines or less. Commercial monitors typically have horizontal resolutions of 500 to 700 lines, and broadcast monitors can go up to 900 lines or more. Unless you have a very low budget, or the display and content are not critical, you should be using commercial video monitors (not consumer) to display your material; broadcast monitors are overkill.
If you want a really high impact display:
Use a graphics (computer) monitor or CRT-based video projector and a scan doubler (or quadrupler). Be sure to connect the scan doubler/quadrupler to the component outputs of your hard disk playback unit. If you do this, you must use a high data rate: we recommend at least 25 Mbps (7:1 compression or less). The scan doubler will produce an image approximating VGA (640 x 480) resolution, and the quadrupler will approximate SXGA (1280 x 1024). Test the compression rate with the actual video material, doubler/quadrupler, and display; pay particular attention to sections with high detail or motion, and lower the compression as required to achieve a striking image.
If you are using an LCD graphics monitor, a plasma display, or a fixed-resolution video projector (e.g.: DLP or LCD), then a video scaler should be used instead of the doubler/quadrupler. Use a high quality unit with motion compensation, and set the output to exactly match the native resolution of the display.
For high-impact theatre presentations (e.g.: a presentation theatre in a museum) consider HDTV: the aspect ratio and image quality are comparable to film. Here are a few pointers:
the footage should either be shot on film and digitised or shot directly on HD (do not use standard video);
all post work must be done in the digital domain;
encode as 720p (ideally at 60Hz);
use a video projector with SXGA (1280 x 1024) resolution in 16:9 mode; this exactly matches the resolution of 720p (1280 x 720);
use a DLP or reflective LCD video projector, rather than an LCD projector; the contrast ratio is far superior;
use an uncompressed HD hard disk player as a source; suitable units are made by Alcorn McBride, Doremi, DVS, Electrosonic, QuVis, and Sencore.
Hard-disk based video playback is a viable alternative to DVD, laser disk or Betacam tape for exhibit, museum, and planetarium applications. It is more reliable than tape, has even faster access than laser disk or DVD, and, if the correct compression and display technology is used, can generate a higher quality, visually striking image. Consider HDTV for spectacular images in a theatre environment.
Conceptron Associates provide a total solution to the audio visual, presentation, and media design aspects of your projects. We are completely independent of any audio-visual manufacturer, A/V equipment vendor, or A/V contractor. As consultants, we design and specify video, presentation, multimedia, audio, and show control systems. Our audio visual design projects have included: exhibits, planetariums, mixed-media theatres, expositions, museums, theme parks, science centres, conference facilities, sports arenas, and institutions. Our principal has over 20 years of audio-visual design and consulting experience, including pioneering experience with interactive theatres. We can be reached at (800) 871-4161 or on the web at http://www.conceptron.com/.
Copyright © 1999, 2001 Conceptron Associates All rights reserved.