How Big is Video

From Higher Intellect Vintage Wiki

By Chris Pirazzi.

Here are some handy pre-computed statistics to give you a sense of how big video is.

Conventions for This Summary

  • Kilobytes: x kb is x*1024 bytes

  • Megabytes: x Mb is x*1024*1024 bytes

  • Here are the four major video signal formats which we will deal with here, with the shorthand names we will use in this summary. Don't try and use these shorthand names outside this summary---you'll get in trouble:

    lines     horizontal sampling                  specification                          shorthand
    525-line  square-pixel (NTSC)                  ANSI/SMPTE 170M-1994                   NTSC
    525-line  non-square-pixel (Rec. 601 Digital)  ANSI/SMPTE 125M, 259M; ITU-R BT.601-4  525-dig
    625-line  square-pixel (PAL)                   ITU-R BT.470-3 (B,G,H,I,D,K,K1,L-PAL)  PAL
    625-line  non-square-pixel (Rec. 601 Digital)  ITU-R BT.656-2; ITU-R BT.601-4         625-dig

    See below for some notes about the specs.

How Many Bytes Per Pixel?

There are tons of possible ways of packing video data into memory, described in detail in The Pixel Rosetta Stone: Packings and Colorspaces. These are the most common:

  • If you represent video data with 4:2:2 sampled 8-bit-per-component YCrCb data, then you get one 8-bit luminance sample per pixel, one 8-bit U chrominance sample per two pixels, and one 8-bit V chrominance sample per two pixels. This gives you 2 bytes per pixel. In the VL, this is called VL_PACKING_YVYU_422_8. For more information on what YCrCb and 4:2:2 mean, see the Rosetta Stone page or "A Technical Introduction to Digital Video" by Charles A. Poynton (New York: Wiley, 1996).

  • If you represent video data with 32-bit RGBA or ABGR quantities (where the A may be a don't care or it may be an alpha channel, synthesized on the computer), then you get 4 bytes per pixel. In the VL, these packings are called VL_PACKING_RGBA_8, VL_PACKING_RGB_8, and VL_PACKING_ABGR_8.

  • A Rec. 601 digital video stream actually has 10 bits in each Y, U, or V sample, not 8. Often our software will only deal with 8 of the ten bits, as is assumed by VL_PACKING_YVYU_422_8. This sometimes works ok for video data, but in order to parse some forms of ancillary data (such as embedded audio data) out of a video stream, you must bring in all 10 bits of each component. 10 bits is an obnoxious quantity for computers, so the most common technique is to left-shift each 10-bit quantity out to 16 bits, resulting in a YCrCb-style capture with 4 bytes per pixel (2 bytes per component), VL_PACKING_YVYU_422_10. This is obviously quite wasteful of memory (and perhaps disk) bandwidth, but these disadvantages must be weighed against the cost of the bit twiddling that would be necessary to manipulate 10-bit packed data on the CPU.
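The bytes-per-pixel arithmetic and the 10-to-16-bit padding described above can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example and are not part of the VL, and the sketch assumes the 10-bit value is simply left-justified in a 16-bit word with the low 6 bits set to zero.

```python
def bytes_per_pixel_422(bytes_per_component):
    """4:2:2 sampling: two pixels share 2 Y + 1 U + 1 V = 4 components."""
    return 4 * bytes_per_component // 2

def pad_10_to_16(sample10):
    """Left-justify a 10-bit sample in a 16-bit word (assumed layout)."""
    if not 0 <= sample10 <= 0x3FF:
        raise ValueError("not a 10-bit value")
    return sample10 << 6

def unpad_16_to_10(word16):
    """Recover the 10-bit sample from a left-justified 16-bit word."""
    return (word16 >> 6) & 0x3FF

print(bytes_per_pixel_422(1))     # 8-bit components
print(bytes_per_pixel_422(2))     # 10-bit components padded to 16 bits
print(hex(pad_10_to_16(0x3FF)))   # largest 10-bit value after padding
```

The padded form costs twice the memory of the 8-bit packing, but every component sits on a 16-bit boundary, so no cross-word bit twiddling is needed to read it back.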

How Many Pixels In a Field or Frame?

That question depends on what part of the field or frame you want.

If you want just the visible picture, no VITC, not all of the closed captioning, and no other ancillary data of any kind, then you want the "active" region. For NTSC and PAL, the part of the signal which is "active" is a little ill-defined. We use the same definition adopted by all SGI VL devices (see Definitions: F1/F2, Interleave, Field Dominance, and More for the vertical definitions of active region). For the 525-line 601 digital format, the concept of active region is well-defined: we choose the "Active Video" region from 125M, not including the "Optional Blanking." For the 625-line digital format, we choose the same set of lines as with the 625-analog format.

If you want VITC or other data which lives in the vertical blanking interval (that data is called VANC (vertical ancillary data) when dealing with a 601 digital signal), you have to capture more lines of data than active video. If you have a digital signal, then ancillary data such as audio can also be stuck in the horizontal blank (this is called HANC), so to get this you will have to capture more pixels per line.

What if you want all the data? For the digital formats, this is well-defined: an image which represents every single bit of data transferred over a Rec. 601 digital video link (including the timing reference signals (called EAV and SAV)) is a "full-raster" image. If you really want all the data, you'll also have to capture at 10 bits (which ends up being 4 bytes per pixel due to padding) rather than 8 bits (which is 2 bytes per pixel). Nitpick: for the digital formats, the size of the two fields is not the same: one field has one more line than the other, and lasts one line time longer than the other. We use the average size for the "field" quantities below.

Sorted by video standard:


format   part         field/frame  x size  y size  total pixels
NTSC     active       frame        646     486     313956
NTSC     active       field        646     243     156978
525-dig  active       frame        720     486     349920
525-dig  active       field        720     243     174960
525-dig  full-raster  frame        858     525     450450
525-dig  full-raster  field        858     262.5   225225
PAL      active       frame        768     576     442368
PAL      active       field        768     288     221184
625-dig  active       frame        720     576     414720
625-dig  active       field        720     288     207360
625-dig  full-raster  frame        864     625     540000
625-dig  full-raster  field        864     312.5   270000
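The pixel counts above are just width times height, with a field counted as half the lines of a frame (the average of the two field sizes for the full-raster digital formats). A small Python sketch, using a made-up lookup table and function name, reproduces them:

```python
from fractions import Fraction

# (x size, lines per frame), copied from the table above
FRAME_SIZES = {
    ("NTSC", "active"): (646, 486),
    ("525-dig", "active"): (720, 486),
    ("525-dig", "full-raster"): (858, 525),
    ("PAL", "active"): (768, 576),
    ("625-dig", "active"): (720, 576),
    ("625-dig", "full-raster"): (864, 625),
}

def total_pixels(fmt, part, unit="frame"):
    """Pixels in one frame, or in one average field (half the frame lines)."""
    x_size, frame_lines = FRAME_SIZES[(fmt, part)]
    lines = Fraction(frame_lines) if unit == "frame" else Fraction(frame_lines, 2)
    return x_size * lines

for key in FRAME_SIZES:
    print(key, total_pixels(*key, "frame"), total_pixels(*key, "field"))
```

Using Fraction keeps the half-line field heights (262.5 and 312.5) exact rather than relying on floating point.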


How Many Bytes In a Field/Frame?

In order of decreasing size:


format   part         field/frame  bytes per pixel  total bytes
625-dig  full-raster  frame        4                2109.375kb
525-dig  full-raster  frame        4                1759.570kb
PAL      active       frame        4                1728.000kb
625-dig  active       frame        4                1620.000kb
525-dig  active       frame        4                1366.875kb
NTSC     active       frame        4                1226.391kb
625-dig  full-raster  frame        2                1054.688kb
625-dig  full-raster  field        4                1054.688kb
525-dig  full-raster  frame        2                879.785kb
525-dig  full-raster  field        4                879.785kb
PAL      active       frame        2                864.000kb
PAL      active       field        4                864.000kb
625-dig  active       frame        2                810.000kb
625-dig  active       field        4                810.000kb
525-dig  active       frame        2                683.438kb
525-dig  active       field        4                683.438kb
NTSC     active       frame        2                613.195kb
NTSC     active       field        4                613.195kb
625-dig  full-raster  field        2                527.344kb
525-dig  full-raster  field        2                439.893kb
PAL      active       field        2                432.000kb
625-dig  active       field        2                405.000kb
525-dig  active       field        2                341.719kb
NTSC     active       field        2                306.598kb
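Each entry above is width times height times bytes per pixel, divided by 1024 as per the conventions of this summary. A minimal Python sketch (the function name is invented for this example):

```python
def size_kb(x_size, y_size, bytes_per_pixel):
    """Size in kb (1 kb = 1024 bytes) of an x_size by y_size image."""
    return x_size * y_size * bytes_per_pixel / 1024

# 625-dig full-raster frame at 4 bytes per pixel
print(f"{size_kb(864, 625, 4):.3f}kb")
# 525-dig full-raster field (average of the two fields) at 2 bytes per pixel
print(f"{size_kb(858, 262.5, 2):.3f}kb")
# NTSC active field at 2 bytes per pixel
print(f"{size_kb(646, 243, 2):.3f}kb")
```

Note that a field at 4 bytes per pixel is exactly the same size as a frame at 2 bytes per pixel, which is why those rows appear in pairs.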

How Many Fields/Frames per Second?

The exact field rate of NTSC and 525-line digital video is (60000.0/1001.0) fields per second. This oddity is explained in "A Technical Introduction to Digital Video" by Charles A. Poynton (New York: Wiley, 1996). The chart below shows rounded figures.

The exact field rate of drop-frame timecode (which is a hack that was invented to get around the bizarre field rate of 525-line video) is 59.94 fields per second, which is not equal to (60000.0/1001.0). This oddity is explained in "Time Code: A User's Guide" by John Ratcliff (Oxford: Butterworth-Heinemann, 1993).
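To get a feel for how small that mismatch is, here is a rough Python sketch that takes the two rates quoted above at face value and works out how far they drift apart over 24 hours. The variable names are invented for this example.

```python
from fractions import Fraction

video_rate = Fraction(60000, 1001)   # fields per second of 525-line video
timecode_rate = Fraction(5994, 100)  # fields per second of drop-frame timecode

difference = video_rate - timecode_rate            # fields per second
drift_per_day = difference * 24 * 60 * 60          # fields per 24 hours

print(difference)                 # exact difference as a fraction
print(float(drift_per_day))       # a handful of fields per day
```

The two rates differ by only a few fields per day, which is why the distinction is easy to overlook.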

The exact field rate of PAL and 625-line digital video is 50 fields per second.

format            frames/second  ms/frame   fields/sec  ms/field
NTSC and 525-dig  29.97          33.3667ms  59.9401     16.6833ms
PAL and 625-dig   25             40ms       50          20ms
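The rounded figures in the chart follow directly from the exact field rates. A short Python sketch, with a made-up dictionary and function name, derives all four columns from the field rate alone:

```python
from fractions import Fraction

# Exact field rates from the text above
FIELD_RATE = {
    "NTSC and 525-dig": Fraction(60000, 1001),
    "PAL and 625-dig": Fraction(50),
}

def timing(fmt):
    """Exact frame rate, field rate, and durations in milliseconds."""
    fields_per_sec = FIELD_RATE[fmt]
    frames_per_sec = fields_per_sec / 2   # two fields per frame
    return {
        "frames_per_sec": frames_per_sec,
        "ms_per_frame": 1000 / frames_per_sec,
        "fields_per_sec": fields_per_sec,
        "ms_per_field": 1000 / fields_per_sec,
    }

for fmt in FIELD_RATE:
    print(fmt, {name: round(float(v), 4) for name, v in timing(fmt).items()})
```

Keeping the rates as fractions until the final print avoids accumulating rounding error if you multiply them by large field counts.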

How Many Bytes Per Second in Full-Rate Video?

In order of decreasing size:

format   part         bytes per pixel  total bytes per second
625-dig  full-raster  4                51.498Mb/sec
525-dig  full-raster  4                51.498Mb/sec
PAL      active       4                42.188Mb/sec
525-dig  active       4                40.005Mb/sec
625-dig  active       4                39.551Mb/sec
NTSC     active       4                35.894Mb/sec
625-dig  full-raster  2                25.749Mb/sec
525-dig  full-raster  2                25.749Mb/sec
PAL      active       2                21.094Mb/sec
525-dig  active       2                20.003Mb/sec
625-dig  active       2                19.775Mb/sec
NTSC     active       2                17.947Mb/sec
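These data rates are frame size times frame rate times bytes per pixel, expressed in units of 1024*1024 bytes. A minimal Python sketch (names invented for this example) that reproduces a few of the rows:

```python
from fractions import Fraction

MB = 1024 * 1024                          # "Mb" as defined in this summary
RATE_525 = Fraction(30000, 1001)          # frames/sec, NTSC and 525-dig
RATE_625 = Fraction(25)                   # frames/sec, PAL and 625-dig

def mb_per_sec(x_size, y_size, frames_per_sec, bytes_per_pixel):
    """Full-rate data rate in Mb/sec for an x_size by y_size frame."""
    return float(Fraction(x_size * y_size * bytes_per_pixel) * frames_per_sec / MB)

print(f"{mb_per_sec(720, 486, RATE_525, 4):.3f}Mb/sec")  # 525-dig active
print(f"{mb_per_sec(768, 576, RATE_625, 2):.3f}Mb/sec")  # PAL active
print(f"{mb_per_sec(646, 486, RATE_525, 2):.3f}Mb/sec")  # NTSC active
```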

You may have heard the figure "27 Million Per Second" associated with digital video. All forms of Rec. 601 video (whether 525- or 625-line) are 10-bit signals with a data rate of 27,000,000 words per second (a 27 MHz word clock). Since it takes four 10-bit words to encode two pixels (two Y samples, one U sample, and one V sample), this means that a full-raster 601 signal has 13,500,000 pixels per second. You can even verify this by multiplying out the full-raster sizes times the rates:

  • (858*525*30000/1001) == (864*625*25) == 13,500,000 pixels per second

A few of these "pixels" are reserved for use as timing reference signals (EAV and SAV).

The reason you don't see "27Mb/sec" as the data rate for 2-bytes-per-pixel full-raster digital video above is that we have defined a Mb as (1024*1024) bytes (as the computer memory geeks do) rather than (1000*1000) bytes, as the communications people do. Gotta love standards!
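The whole chain of reasoning, from full-raster sizes to 13,500,000 pixels per second to the two different "megabyte" figures, can be checked with exact arithmetic. A short Python sketch with made-up variable names:

```python
from fractions import Fraction

# Full-raster pixels per second for the two digital formats
pixels_525 = 858 * 525 * Fraction(30000, 1001)
pixels_625 = 864 * 625 * 25

# Four 10-bit words encode two pixels, so two words per pixel
words_per_sec = pixels_525 * 2

# At 2 bytes per pixel, the byte rate in each definition of a megabyte
bytes_per_sec = pixels_525 * 2
decimal_mb = bytes_per_sec / (1000 * 1000)        # communications convention
binary_mb = float(bytes_per_sec / (1024 * 1024))  # this summary's convention

print(pixels_525, pixels_625, words_per_sec)
print(decimal_mb, round(binary_mb, 3))
```

The same 27,000,000 bytes per second reads as 27 in decimal megabytes and as 25.749 in the 1024*1024 units used in the tables above.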

I Want More Detail!

The best place to go for more detail is the original specs, which we named above.

Some notes about those specifications:

  • the organization formerly known as the CCIR is now called the ITU.

  • for the digital video formats, ITU-R BT.601-4 (commonly referred to as Rec. 601) defines some basic properties common to both 525- and 625- line digital, regardless of how it is transmitted. Examples of these properties include pixel sampling rate and color space. Then, the more specific documents (125M, 259M, 656) define how the data format defined by Rec. 601 is to be transmitted over various kinds of links (serial, parallel) with various numbers of lines (525,625). Rec. 601 was formerly referred to as CCIR 601.

  • ITU-R BT.470-3 is the ITU spec formerly known as CCIR Report 624-1. There was a CCIR Recommendation 470-1, but it just said to read CCIR Report 624-1. Now we just refer to 470.

The next best place to go is "A Technical Introduction to Digital Video" by Charles A. Poynton (New York: Wiley, 1996).

"Video Demystified" by Keith Jack (United States: Brooktree, 1993) comes in a very distant third.