Resolution, Scaling, & Progressive vs. Interlaced

One of the most basic characteristics of a video or digital image is its resolution. Resolution is a measurement of how detailed a video or digital image is. Digital images and video are essentially colorful grids in which the individual cells, called pixels, are given different colors to create a picture.

Measuring Resolution

Since resolution is similar in principle to a grid, resolution is measured in columns by rows. A 1024x768 (pronounced ten-twenty-four by seven-sixty-eight) image, for instance, has 1,024 columns and 768 rows of pixels.

Resolution can also be measured by the total number of pixels. This can be determined by multiplying the number of columns by the number of rows. So, in the case of 1024x768, the total number of pixels is 786,432. This measurement is sometimes expressed in megapixels, where one megapixel equals one million pixels.
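The arithmetic above is easy to check in code. A quick sketch of the column-times-row calculation and the megapixel conversion:

```python
# Total pixel count and megapixel figure for a 1024x768 image.
cols, rows = 1024, 768
total = cols * rows
print(total)               # 786432
print(total / 1_000_000)   # 0.786432 -- roughly 0.8 megapixels
```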

Another way to measure resolution is to use the number of rows and stick an "i" or "p" at the end depending on whether you are referring to interlaced or progressive (more on that a little later). HD video and HDTVs typically have a max resolution of 1920x1080, which is often shortened to 1080i or 1080p depending on the situation. 525-line and 625-line video also refers to the number of rows, although as the TV Broadcast Formats article points out, the 525-line and 625-line names are misleading.

Resolution can also be measured by the number of columns, although this is really only done for 2K and 4K. 2K is 2048x1080 [1 page 31], while 4K is either 3840x2160 or 4096x2160 [1 page 31] [2]. The first consumer 4K display was a 3840x2400 display from IBM named the T220 released on June 27, 2001 for $22,000 [3] [4].

Finally, it is important to look at how resolution is measured in film. Film, an analog medium, doesn't actually have pixels, so technically speaking it doesn't have a resolution. However, a film's detail can still be measured in part by its film gauge, which is a measurement of how wide the film is. The most commonly used film gauge for recording is 35 millimeters (mm) [5]. Other film gauges have been used for recording as well [5], the most important of which are 8mm, 16mm, [6] and 65mm [7]. Film gauges smaller than 35mm are sometimes referred to as narrow gauge [6], while film gauges above 35mm are sometimes referred to as wide gauge [7].

Digital Display Resolution & Scaling

Digital displays are plasmas, LCDs, DLP projectors, and anything else that is not a CRT. Like digital images and videos, digital displays are also grids that consist of individual pixels. This, however, causes a bit of a problem.

For instance, suppose we have a 1024x768 display showing a 640x480 video. One approach would be to center the 640x480 video within the 1024x768 display, but if we do that then we are not using the whole display.

Another approach would be for either the video player or the display to convert, or scale, the 640x480 video to 1024x768. Scaling would use up the whole display, but it would also degrade the image. Scaling an image or video to a new resolution with larger dimensions is called upscaling, while scaling to a new resolution with smaller dimensions is called downscaling. These are also sometimes referred to as upconverting and downconverting.

Let's take a simplified example to demonstrate centering vs. upscaling. Suppose we have the following 3x3 grid:


Next, let's suppose that we want to show this 3x3 grid on a 5x5 grid:


As you can see, there is no way to get the 5x5 grid to have the same design as the 3x3 grid and still fill up the entire grid without the original image being altered. The first 5x5 grid has the 3x3 image centered but remains unfilled, the second 5x5 grid makes the blue sections larger than the yellow section, and the third 5x5 grid takes the rows that could be blue or yellow and combines them to make green (green is the color combination of blue and yellow).

By looking at the grids used to demonstrate the problems of upscaling, it is also easy to see the problems of downscaling. If you were to try and take one of the 5x5 grids and display it in a 3x3 grid, you'd have to toss out some pixels, blend colors, or both. For instance, for the 5x5 grid with white pixels on the outside to fit back into a 3x3 grid, the white pixels on the outside would either have to be dropped or blended with bordering blue and yellow pixels to create lighter shades of blue and yellow as shown below:


The only time upscaling does not alter the original image is when the new resolution's dimensions are whole-number multiples of the original's. For instance, going back to the original 3x3 grid, a 6x6 grid would be able to display it correctly since 6/3 is a whole number. This is demonstrated below:
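The trade-off can be sketched with nearest-neighbor scaling (one common scaling method; the blue/yellow stripe pattern here is an assumption standing in for the grid pictured):

```python
# Nearest-neighbor scaling of a striped 3x3 grid ('B' = blue, 'Y' = yellow).
src = [["B"] * 3, ["Y"] * 3, ["B"] * 3]

def upscale(grid, new_w, new_h):
    h, w = len(grid), len(grid[0])
    return [[grid[y * h // new_h][x * w // new_w] for x in range(new_w)]
            for y in range(new_h)]

# 3x3 -> 5x5: 5 is not a multiple of 3, so the stripes come out
# unequal (2 rows, 2 rows, and 1 row).
for row in upscale(src, 5, 5):
    print("".join(row))

# 3x3 -> 6x6: each pixel becomes an exact 2x2 block, so the design survives.
for row in upscale(src, 6, 6):
    print("".join(row))
```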


Similarly, downscaling does not alter an image if the image has sufficiently low detail to be fully represented at the smaller resolution. For instance, downscaling the 6x6 grid shown above to 3x3 would not cause any degradation or loss of detail, since that grid has low enough detail to fit within a 3x3 grid. However, if the 6x6 grid instead had, say, a different color in every row, it could no longer fit into a 3x3 grid without the original image being altered.
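Both downscaling cases can be sketched with a simple nearest-neighbor reduction that keeps every other row and column (the stripe pattern is an assumption standing in for the grids pictured):

```python
def downscale_2x(grid):
    """Keep every other row and column (nearest-neighbor 2x reduction)."""
    return [row[::2] for row in grid[::2]]

# A low-detail 6x6 grid: a 3x3 stripe pattern with every pixel doubled.
low_detail = ([list("BBBBBB")] * 2 + [list("YYYYYY")] * 2
              + [list("BBBBBB")] * 2)
print(downscale_2x(low_detail))   # the 3x3 stripe pattern survives intact

# A high-detail 6x6 grid: every row a different "color" (0-5).
high_detail = [[str(n)] * 6 for n in range(6)]
print(downscale_2x(high_detail))  # rows 1, 3, and 5 are thrown away entirely
```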

CRT Resolution & Scaling

Before discussing CRT resolution, it is important to clear something up: CRTs are NOT digital displays. Instead, CRTs are analog displays. You may hear about "digital" CRTs, but what this really refers to is a CRT with a digital video input and/or digital over-the-air TV reception. A similar term is "analog" LCD, which refers to an LCD without a digital video input.

The thing that makes CRTs analog and not digital is that they don't have a native resolution. Instead, CRTs have a max resolution. A CRT computer monitor, for instance, typically has a max resolution of at least 1280x1024 but can also display lower resolutions such as 640x480, 800x600, and 1024x768 without scaling. This is because rather than having a set number of physical pixels that get filled in with one color or another, a CRT is more like a blank canvas onto which various resolutions can be painted.

That being said, you still sometimes have to deal with scaling on a CRT. For example, High-Definition (HD) video is typically either 1280x720 or 1920x1080. Rather than showing 1280x720 video as 1280x720, most CRT HDTVs will instead scale it to their max resolution of roughly 1920x1080. This is done as a cost-cutting measure.

Progressive vs. Interlaced

Believe it or not, motion in films and video is actually represented by a series of still images. These images are called frames. Progressive video has frames that contain one moment in time, while interlaced video has frames that contain two moments in time. The picture below is a 640x480 progressive frame from the 2003 PC version of the 2001 Xbox game Halo:

Demonstrating an interlaced frame, however, is a bit more complicated. To do this, I'll use the 2008 Wii game Super Smash Bros. Brawl (SSBB). This video game has a short opening cutscene (a cutscene is a video clip in a video game) in which the game's characters are lined up from right to left. The 704x480 screenshot below recorded with the USBAV-191 ADS Tech Video Xpress is from this cutscene:

What the heck?! It’s like there are two images in one! In fact, that is exactly what is happening. The two images within an interlaced frame are known as fields. The odd-numbered rows, known as the top field, form one image, while the even-numbered rows, known as the bottom field, form another. Interlaced video in which the top field comes first is known as Top Field First (TFF), while interlaced video in which the bottom field comes first is known as Bottom Field First (BFF). The SSBB cutscene is TFF, so the pictures below show the top field followed by the bottom field:
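Splitting a frame into its fields amounts to slicing alternate rows. A minimal sketch, treating a frame as a list of labeled rows (the text counts rows from 1, Python from 0):

```python
# Split an interlaced frame (a list of pixel rows) into its two fields.
def split_fields(frame):
    top = frame[0::2]     # odd-numbered rows in 1-based counting: 1, 3, 5, ...
    bottom = frame[1::2]  # even-numbered rows: 2, 4, 6, ...
    return top, bottom

frame = ["row%d" % n for n in range(1, 7)]  # a toy 6-row frame
top, bottom = split_fields(frame)
print(top)     # ['row1', 'row3', 'row5']
print(bottom)  # ['row2', 'row4', 'row6']
```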

Displaying Interlaced Video

How interlaced video is displayed depends on whether you are dealing with an interlaced or progressive scan display. An interlaced display can show the fields as is while still drawing them at the frame's full height. However, this is something only a CRT television can do. All other types of displays require the fields to be deinterlaced, which is basically the process of taking the fields and scaling them to the size of the frame [8]. So, in the case of the SSBB fields, each field would be scaled to 704x480, doubling the original frame rate.

There are basically two ways to deinterlace: weave and bob deinterlacing. In the purest sense, weave deinterlacing is not really deinterlacing at all. Weave deinterlacing is basically just taking the two fields and "weaving" them together into a frame. This allows static images such as the skyline in the original frame to look as sharp as possible. However, any moving objects end up looking terrible. The other deinterlacing method is the bob method, which basically takes each field in isolation and scales it to the original frame size. This allows moving objects to appear better than in the weave method, but it also means that static parts of the image end up looking not as good as they could. It can also cause objects to "bob" up and down, hence the name bob deinterlacing. The best deinterlacing will turn each field into a frame by using a mix of weave and bob for different parts of the image [8].
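The two methods can be sketched in a few lines, again treating a frame as a list of rows. Real deinterlacers interpolate and adapt per pixel; the simple line doubling below just illustrates the idea:

```python
# Minimal weave and bob deinterlacing, with frames as lists of rows.
def weave(top_field, bottom_field):
    """Interleave two fields back into one full-height frame."""
    frame = []
    for t, b in zip(top_field, bottom_field):
        frame += [t, b]
    return frame

def bob(field):
    """Scale one field to full height by repeating each line, turning
    every field into its own frame (and doubling the frame rate)."""
    return [line for line in field for _ in range(2)]

top, bottom = ["T1", "T2"], ["B1", "B2"]
print(weave(top, bottom))  # ['T1', 'B1', 'T2', 'B2']
print(bob(top))            # ['T1', 'T1', 'T2', 'T2']
```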

On a somewhat related note, if you scale interlaced video to a new resolution with a different number rows, it is CRUCIAL that you deinterlace the video beforehand. Otherwise, the deinterlacing process won't work, because the deinterlacer will no longer know which rows belong to which field.
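The parity scramble is easy to demonstrate with nearest-neighbor vertical scaling (a sketch; real scalers interpolate, but the parity problem is the same):

```python
# Vertically scaling an interlaced frame mixes up the fields.
# Label each row by its field: T = top field, B = bottom field.
rows = ["T0", "B0", "T1", "B1"]                # a toy 4-row interlaced frame
scaled = [rows[y * 4 // 6] for y in range(6)]  # nearest-neighbor scale to 6 rows
print(scaled)        # ['T0', 'T0', 'B0', 'T1', 'T1', 'B1']
print(scaled[0::2])  # the would-be top field now holds a bottom-field row
```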

Measuring Temporal Rate

Film and video can be recorded using different numbers of frames per second (fps). However, since the existence of interlaced video means that a frame may end up showing two moments in time instead of one, the fps may not necessarily correspond with the number of moments of time per second (i.e. 30 interlaced frames would show 60 moments in time). Therefore, to avoid confusion, the term temporal rate is sometimes used to describe the number of moments in time that are shown in a video. There are, generally speaking, three different ways to measure temporal rate:

  1. One way is to use the fps measurement. However, as already discussed, this can be problematic for interlaced video.
  2. Another method is to stick an "i" or a "p" at the end of the fps, similar to sticking an "i" or a "p" at the end of the number of rows to state the resolution. However, this method can also be problematic for interlaced video, since there is disagreement about whether to use the frame rate or the temporal rate. For instance, some say a video with 30 interlaced frames per second should be described as 30i, while others say it should be described as 60i. These days, the most common method is to refer to an interlaced video by its temporal rate. To avoid confusion, keep in mind that 25i, 29.97i, and 30i refer to frames, while 50i, 59.94i, and 60i refer to fields. Finally, if you are describing video that may be a certain number of fields or frames per second, you can abbreviate this by sticking an "i/p" at the end. For example, TV in America is roughly 59.94 fields or frames per second, which can be abbreviated as 59.94i/p.
  3. The last method to measure the temporal rate of film or video is to use hertz (Hz), meaning times per second. The main problem with this method is that Hz is also used to measure refresh rate, so describing temporal rate in Hz can easily mix the other person up.
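The frame-vs.-field convention described above can be captured in a small helper. This is a hypothetical function (not from any standard library) encoding the rule of thumb that interlaced rates quoted at 30 or below count frames, while 50 and above count fields:

```python
# One interlaced frame holds two fields (two moments in time).
def fields_per_second(rate, scan):
    """Return the temporal rate (moments in time per second) for a
    quoted rate like 29.97i or 60p, per the common naming convention."""
    if scan == "p":
        return rate            # progressive: frames == moments in time
    if rate <= 30:
        return rate * 2        # e.g. 30i quoted as frames -> 60 fields/s
    return rate                # e.g. 60i already quoted as fields/s

print(fields_per_second(29.97, "i"))  # 59.94
print(fields_per_second(60, "i"))     # 60
print(fields_per_second(60, "p"))     # 60
```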

Measuring Both Resolution and Temporal Rate

If you want to say both the resolution and temporal rate of a video, you can write something like 1080p60 or 1080/60p, with the 1080 representing the 1920x1080 resolution and the 60 representing the temporal rate of 60p.

Finally, if you only see one number with an “i” or “p” at the end, you may be confused as to whether this number is referring to resolution or temporal rate. Generally speaking, if the number is 60 or below, it is probably referring to temporal rate. If it is 480 or higher, then it is probably referring to resolution. If it’s anything in the middle, you’ll just have to figure it out based on context.
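The rule of thumb above can be written as a toy classifier (a hypothetical helper for illustration, not a standard notation parser):

```python
def classify(spec):
    """Guess whether a lone 'i'/'p' number like '30i' or '1080p'
    refers to temporal rate or to resolution (number of rows)."""
    value = float(spec[:-1])  # strip the trailing 'i' or 'p'
    if value <= 60:
        return "temporal rate"
    if value >= 480:
        return "resolution (rows)"
    return "ambiguous"

print(classify("30i"))    # temporal rate
print(classify("1080p"))  # resolution (rows)
print(classify("240p"))   # ambiguous -- context needed
```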



1. Digital Cinema System Specification v.1.2. Oct. 10, 2012. ©2005-2012 Digital Cinema Initiatives, LLC.

2. 4K Support. ©2003-2013 HDMI Licensing, LLC.

3. IBM Introduces World's Highest-Resolution Computer Monitor. June 27, 2001. IBM.

4. T220 22.2-inch TFT LCD Flat Panel Monitor (9503) - Overview. ©2013 Lenovo.

5. Michael Rogge. More than one hundred years of Film Sizes. Dec. 6, 1996. Last updated Jan. 25, 2012. ©Michael Rogge 2012.

6. Martin B. Hart. Narrow Gauge - Wide Screen: 8MM and 16MM Lenses For The CinemaScope Effect. ©1995-2013 The American WideScreen Museum.

7. Martin B. Hart. Relative Frame Dimensions. 1997-2012. ©1995-2013 The American WideScreen Museum.

8. Don Munsil and Brian Florian. DVD Benchmark Part 5 - Progressive Scan DVD. ©2007 Secrets of Home Theater and High Fidelity.