QuickTime VR delivers Virtual Reality in both panoramas and objects. In panoramas, you can look up and down, turn around, zoom in to see detail, or zoom out for a broader view. Objects are interactive. By clicking and dragging, you examine things like the sculptures in a museum or the merchandise in a store.
QuickTime VR -- short for virtual reality -- is software that lets a user view a photographic or rendered representation of a scene. The software offers two kinds of experiences: a panoramic technology that enables users to explore 360-degree scenes, and an interaction technology that allows users to pick up and interact with objects. Users can zoom in or out of a scene, navigate from one scene to another, and even pick up and inspect objects. As the user changes his or her view of the scene, correct perspective is maintained, creating the effect of being at the location and looking around. QuickTime VR is the first mainstream technology to enable immersive experiences based on real world scenes. QuickTime VR was recently awarded the 1995 MacUser magazine "Eddy" Award for Breakthrough Technology of the Year. QuickTime VR is a cross-platform technology designed to run on even entry-level Macintosh and Windows computers. QuickTime VR panoramic file sizes are small -- just 800K for one panoramic view.
QuickTime VR is a software technology that Apple will license to third parties in the form of cross-platform run-time software (for Apple Macintosh, Apple Power Macintosh computers and Microsoft Windows-based PCs), and authoring tools (Macintosh-based). Developers using QuickTime VR can author once on the Macintosh platform and deliver this content to run on both Macintosh and Windows-based PCs, enabling access to the vast market of personal computer users.
About QuickTime VR
QuickTime VR is an extension of the QuickTime technology developed by Apple Computer, Inc. that allows users to interactively explore and examine photorealistic, three-dimensional virtual worlds. Unlike many other virtual reality systems, QuickTime VR does not require the user to wear goggles or gloves. Instead, the user navigates in a virtual world using standard input devices (such as the mouse or keyboard) to change the image displayed by the QuickTime VR movie controller. Figure 1-1 shows a view of an object in a virtual world.
The data that comprises a QuickTime VR virtual world is stored in a QuickTime VR movie. A QuickTime VR movie contains a single scene, which is a collection of one or more nodes. QuickTime VR currently supports two types of nodes: object nodes and panoramic nodes.
Note: QuickTime uses the term movie to accentuate the time-based nature of QuickTime data (such as video and audio data streams). QuickTime VR uses the same term solely by analogy with QuickTime movies; in general, QuickTime VR data is not time based.
An object node (or, more briefly, an object) provides a view of a single object or a closely grouped set of objects. You can think of an object node as providing an "outside looking in" view of an object. The user can use the mouse or keyboard to change the horizontal and vertical viewing angles to move around the object. The user can also zoom in or out to enlarge or reduce the size of the displayed object. Object nodes are often designed to give the illusion that the user is picking up and turning an object and viewing it from all angles.
A panoramic node (or, more briefly, a panorama) provides a panoramic view of a particular location, such as you would get by turning around on a rotating stool. You can think of a panoramic node as providing an "inside looking out" view of a location. As with object nodes, the user can use the mouse (or keyboard) to navigate in the panorama and to zoom in and out.
The images displayed in QuickTime VR movies can be either captured photographically or rendered on a computer using a three-dimensional (3D) graphics package. The following sections describe the structure of QuickTime VR movies--including object and panoramic nodes--in greater detail.
A QuickTime VR scene is a set of one or more nodes. A node is a position in a virtual world at which an object or panorama can be viewed. For a panoramic node, the position of the node is the point from which the panorama is viewed. QuickTime VR scenes can contain any number of nodes, which can be either object or panoramic nodes.
A node in a QuickTime VR movie is identified by a unique node ID, a long integer that is assigned to the node at the time a VR movie is created (and which is stored in the movie file).
When a QuickTime VR movie contains more than one node, the user can move from one node to another if the author of the QuickTime VR movie has provided a link (or connection) between the source and destination nodes. A link between nodes is depicted graphically by a link hot spot, a type of hot spot that, when clicked, moves the user from one node in a scene to another node.
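The relationship between scenes, nodes, node IDs, and link hot spots can be modeled with a small data structure. The following Python sketch is only an illustration of the concepts described above; the class names `Node` and `Scene` and the method `follow_link` are hypothetical and are not part of the QuickTime VR API.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    node_id: int   # unique node ID assigned when the movie is created
    kind: str      # "object" or "panorama"
    # Hypothetical representation of link hot spots:
    # hot spot ID -> destination node ID
    links: dict = field(default_factory=dict)


@dataclass
class Scene:
    nodes: dict = field(default_factory=dict)  # node ID -> Node

    def add_node(self, node):
        # Node IDs must be unique within a movie.
        if node.node_id in self.nodes:
            raise ValueError(f"duplicate node ID {node.node_id}")
        self.nodes[node.node_id] = node

    def follow_link(self, current_id, hot_spot_id):
        """Return the destination node ID, or the current ID if the
        author provided no link for this hot spot."""
        return self.nodes[current_id].links.get(hot_spot_id, current_id)


scene = Scene()
scene.add_node(Node(1, "panorama", links={10: 2}))  # hot spot 10 leads to node 2
scene.add_node(Node(2, "object"))
```

In this model, the user can move from node 1 to node 2 only because the author supplied a link; node 2 has no links, so the user stays there.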
The data used to represent an object is stored in a QuickTime VR movie's video track as a sequence of individual frames, where each frame represents a single view of the object. An object view is completely determined by its node ID, field of view, view center, pan angle, tilt angle, view time, and view state.
In QuickTime VR, angles can be specified in either radians or degrees. (The default angular unit is degrees.) A view's pan angle typically ranges from 0 degrees to 360 degrees (that is, from 0 to 2π radians). When a user is looking directly at the equator of a multirow object, the tilt angle is 0. Increasing the tilt angle rotates the object down, while decreasing the tilt angle rotates the object up. Setting the tilt angle to 90 degrees results in a view that is looking straight down at the top of the object; setting the tilt angle to -90 degrees results in a view that is looking straight up at the bottom of the object. In general, the normal range for tilt angles is from -90 degrees to +90 degrees. You can, however, set the tilt angle to a value greater than 90 degrees if the movie contains upside-down views of the object.
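As a rough illustration of these angle conventions, the following Python sketch converts between angular units and checks the usual ranges. The helper names are hypothetical, and wrapping a pan angle into the 0 to 360 degree range is an assumption made for the example, not documented QuickTime VR behavior.

```python
import math


def to_degrees(angle, unit="degrees"):
    """Convert an angle in the given unit to degrees (the default unit)."""
    if unit == "degrees":
        return float(angle)
    if unit == "radians":
        return math.degrees(angle)
    raise ValueError(f"unknown angular unit: {unit}")


def normalize_pan(pan_degrees):
    """Map any pan angle into the typical range of 0 up to 360 degrees."""
    return pan_degrees % 360.0


def is_normal_tilt(tilt_degrees):
    """True if the tilt angle lies in the normal -90 to +90 degree range."""
    return -90.0 <= tilt_degrees <= 90.0
```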
The views that comprise an object node are stored sequentially, as a series of frames in the movie's video track. The authoring tools documentation currently recommends that the first frame be captured with a pan angle of 180 degrees and a tilt angle of 90 degrees. Subsequent frames at that tilt angle should be captured with a +10-degree increment in the pan angle. This scheme gives 36 frames at the starting tilt angle. Then the tilt angle is reduced 10 degrees and the panning process is repeated, resulting in another 36 frames. The tilt angle is gradually reduced until 36 frames are captured at tilt angle -90 degrees. In all, this process results in 684 (that is, 19 × 36) separate frames.
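The recommended capture scheme can be checked with a short calculation. The Python sketch below enumerates the (pan, tilt) pairs in capture order; the function name `capture_schedule` is hypothetical, and wrapping pan angles past 360 degrees back to 0 is an assumption made so that each row covers a full turn.

```python
def capture_schedule(start_pan=180, start_tilt=90, end_tilt=-90, step=10):
    """List (pan, tilt) pairs in capture order: one full row of pan
    angles per tilt angle, starting at the top and working down."""
    schedule = []
    tilt = start_tilt
    while tilt >= end_tilt:
        for i in range(360 // step):
            # Assumption: pan angles wrap modulo 360 degrees.
            schedule.append(((start_pan + i * step) % 360, tilt))
        tilt -= step
    return schedule


frames = capture_schedule()
rows = len({tilt for _, tilt in frames})  # number of distinct tilt angles
columns = len(frames) // rows             # frames captured per tilt angle
```

With the default values, `rows` is 19, `columns` is 36, and `len(frames)` is 684, which matches the frame count given above.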
IMPORTANT: The number of frames captured, the starting and ending pan and tilt angles, and the increments between frames are completely under the control of the author of a QuickTime VR movie.
The individual frames of the object can be interpreted as a two-dimensional object image array (or view array). For a simple object (that is, an object with no frame animation or alternate view states), the upper-left frame is the first captured image. A row of images contains the images captured at a particular tilt angle; a column of images contains the images captured at a particular pan angle. Accordingly, turning an object one step to the left is the same as moving one cell to the right in the image array, and turning an object one step down is the same as moving one cell down in the image array. As you'll see later, you can programmatically set the current view of an object either to a specific pan and tilt angle or to a view specified by its row and column in the object image array.
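The mapping between the two-dimensional object image array and the one-dimensional frame sequence can be sketched as follows. This is a minimal Python illustration for a simple object (one frame per view, one view state); the function names are hypothetical, and the wrap-around at the end of a row and the clamping at the last row are assumptions for the example.

```python
def frame_index(row, column, columns):
    """Index of a view in the frame sequence (row-major order)."""
    return row * columns + column


def row_column(index, columns):
    """Inverse of frame_index: recover (row, column) from a frame index."""
    return divmod(index, columns)


def turn_left(row, column, columns):
    """Turning the object one step left moves one cell to the right.
    Assumption: the row covers a full turn, so the column wraps."""
    return row, (column + 1) % columns


def turn_down(row, column, rows):
    """Turning the object one step down moves one cell down.
    Assumption: movement stops at the last row."""
    return min(row + 1, rows - 1), column
```

For a 19-row by 36-column array, `frame_index(1, 0, 36)` is 36, the first frame of the second tilt row.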
Note: QuickTime VR object nodes were originally designed as a means of showing a 3D object from different pan and tilt angles. However, there is no restriction on the content of the frames stored in an object image array. In other words, the individual frames do not have to be views of the same object from different pan and tilt angles. Some clever movie authors have used this fact to develop intriguing object nodes that are not simply movies of rotating objects. In these cases, the use of pan and tilt angles to specify a view is less meaningful than the use of row and column numbers. Nonetheless, you can always use either pan and tilt angles or row and column numbers to select a view.
Each view of an object occupies the same amount of time in the object node's video track. This amount of time (the view duration) is arbitrary, but it is stored in the movie file. When a view is associated with only one frame, the QuickTime VR movie controller displays that frame by changing the current time of the movie to the start time of that view.
It's possible, however, to have more than one frame in a particular object view. Moreover, the number of frames per view can be different from view to view. The only restriction imposed by QuickTime VR is that the view duration be constant throughout all views in a single object node.
Having multiple frames per view is useful in several cases. First, you might want to display one frame if the mouse button is up but a different frame if the mouse button is down. To support this, QuickTime VR allows the VR movie author to include more than one view state in an object movie. A view state is an alternate set of images that are displayed, depending on the state of the mouse button.
Note: Alternate view states are stored as separate object image arrays that immediately follow the preceding view state in the object image track. Each state does not need to contain the same number of frames. However, the total movie time of each view state in an object node must be the same.
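Because every view has the same duration and every view state has the same total movie time, the start time of a view can be computed directly. The Python sketch below shows one way to do that arithmetic; the function name and the use of integer time units are assumptions, and the layout (states stored one after another, views in row-major order within a state) follows the description above.

```python
def view_start_time(view_index, view_duration, state_index=0, views_per_state=1):
    """Movie time at which a view begins.

    view_index      -- position of the view within its object image array
    view_duration   -- constant duration of every view in the node
    state_index     -- 0 for the first view state, 1 for the next, and so on
    views_per_state -- number of views in one object image array
    """
    # Every view state occupies the same total movie time.
    state_duration = views_per_state * view_duration
    return state_index * state_duration + view_index * view_duration
```

For example, with 684 views per state and a view duration of 600 time units, view 0 of the second state starts at time 410400.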
Another reason to have multiple frames in a particular object view is to display a frame animation when that view is the current view. When frame animation is enabled, the QuickTime VR movie controller plays all frames, in sequence, in the current view. You could use frame animation, for instance, to display a flickering flame on a candle. The rate at which the frames are displayed depends on the view duration and the frame rate of the movie (which is stored in the movie file but can be changed programmatically). If the current play rate is nonzero, then the movie controller plays all frames in the view duration. If the current view has multiple states, then the movie controller plays all frames in the current state (which can be set programmatically).
Note: The frames in a frame animation are stored sequentially in each animated view of the object. Each view does not need to contain the same number of frames (so that a view that is not animated can contain only one frame). However, the view duration of each view in an object node must be the same. In some cases, it is best to duplicate the scene frame to get the same view durations and let the compressor remove the extra data.
An object movie can be set to play, in order, all the views in the current row of the object image array. This is view animation. For both view and frame animation, an object node has a set of animation settings that specify characteristics of the movie while it is playing. For example, if a movie's animate view frames flag is set and there are different frames in the current view duration, the movie controller plays an animation at the current view of the object. That is, the movie controller displays all frames in the appropriate portion of the view duration and, if the kQTVRWrapPan control setting is on, it starts over when it reaches the segment boundary. If the animate view frames flag is not set, the movie controller stops displaying frames when it reaches the segment boundary.
The data used to represent a panorama is stored as a single panoramic image that contains the entire panorama. The movie author creates this image by stitching together individual overlapping digitized photographs of the scene (or by using a 3D renderer to generate an artificial scene). In QuickTime VR version 2.0, these images are cylindrical projections of the panorama. Viewed by itself, the panoramic image appears distorted, but it is automatically corrected at runtime when it is displayed by the QuickTime VR movie controller.
A panorama view is completely described by its node ID, field of view, pan angle, and tilt angle. As with object nodes, a panoramic node's pan angle can range from 0 degrees to 360 degrees. Increasing the pan angle has the effect of turning one's view to the left. When the user is looking directly into the horizon, the tilt angle is 0. Increasing the tilt angle tilts one's view up, while decreasing the tilt angle tilts one's view down.
IMPORTANT: The current image-warping technology for panoramic nodes, using cylindrical projection, does not allow looking straight up or straight down. Future versions of QuickTime VR, however, will provide other methods of projection that do support looking straight up and straight down.
While a panorama is being displayed, it can be either at rest (static) or in motion. A panorama is in motion when being panned, tilted, or zoomed. A panorama is also in motion when a transition (that is, a movement between two items in a movie, such as from one view in a node to another view in the same node, or from one node to another) is occurring. At all other times, the panorama is static. You can change the imaging properties of a panorama to control the quality and speed of display during rest or motion states. By default, QuickTime VR sacrifices quality for speed during motion but displays at highest quality when at rest.
When a transition is occurring, you can specify that a special visual effect, called a transition effect, be displayed. The only transitional effect currently supported is a swing transition between two views in the same node. When the swing transition is enabled and a new pan angle, tilt angle, or field of view is set, the movie controller performs a smooth swing to the new view (rather than a simple jump to the new view). In the future, other transitional effects might be supported.
Both panoramic nodes and object nodes support arbitrarily shaped hot spots, regions in the movie image that permit user interaction. When the cursor is moved over a hot spot (and perhaps when the mouse button is also clicked), QuickTime VR changes the cursor as appropriate and performs certain actions. Which actions are performed depends on the type of the hot spot. For instance, clicking a link hot spot moves the user from one node in a scene to another.
Hot spots can be either enabled or disabled. When a hot spot is enabled, QuickTime VR changes the cursor as it moves in and out of hot spots and responds to mouse button clicks and other user actions. Your application can install callback procedures to respond to mouse actions. When a hot spot is disabled, however, it effectively doesn't exist as far as the user is concerned: QuickTime VR does not change the cursor or execute your callback procedures.
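The effect of enabling and disabling a hot spot can be modeled in a few lines. The Python class below is a hypothetical stand-in, not the QuickTime VR callback mechanism: it simply shows that a disabled hot spot ignores clicks and never runs the installed callback.

```python
class HotSpot:
    def __init__(self, hot_spot_id, on_click=None):
        self.hot_spot_id = hot_spot_id
        self.enabled = True       # hot spots start out enabled in this model
        self.on_click = on_click  # application-supplied callback (hypothetical)

    def click(self):
        """Run the callback only when the hot spot is enabled.
        Returns True if the callback ran, False otherwise."""
        if not self.enabled or self.on_click is None:
            return False
        self.on_click(self.hot_spot_id)
        return True


clicked = []                        # records the IDs passed to the callback
spot = HotSpot(7, on_click=clicked.append)
spot.click()                        # enabled: callback runs
spot.enabled = False
spot.click()                        # disabled: nothing happens
```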
Viewing Limits and Constraints
The data in a panoramic image and in an object image array imposes a set of viewing restrictions on the associated node. For example, a particular panoramic node might be a partial panorama (a panorama that is less than 360 degrees). Similarly, the object image array for a particular object node might include views for tilt angles only in a restricted range, say, +45 degrees to -45 degrees (instead of the more usual +90 degrees to -90 degrees). The allowable ranges of pan angles, tilt angles, and fields of view are the viewing limits for the node. Viewing limits are determined at the time a node is authored and are imposed by the data stored in the movie file.
It's possible to impose additional viewing restrictions at runtime. For instance, a game developer might want to limit the amount of a panorama visible to the user until the user achieves some goal (such as touching all the visible hot spots in the node). These additional restrictions are the viewing constraints for the node. As you might expect, a viewing constraint must always lie in the range established by the node's viewing limits. By default (that is, if the movie file doesn't contain any viewing constraint atoms, and no constraints have been imposed at runtime), a node's viewing constraints coincide with its viewing limits.
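The rule that a viewing constraint must lie within the viewing limits can be expressed as a simple range intersection. The Python sketch below is an illustration under that assumption; the function names are hypothetical, and how QuickTime VR itself treats an out-of-range request is not specified here.

```python
def clamp_constraint(limits, requested):
    """Restrict a requested (min, max) constraint to the node's limits."""
    lo_limit, hi_limit = limits
    lo, hi = requested
    lo = max(lo, lo_limit)
    hi = min(hi, hi_limit)
    if lo > hi:
        raise ValueError("requested constraint lies outside the viewing limits")
    return lo, hi


def effective_constraint(limits, requested=None):
    """With no constraint imposed, the constraint coincides with the limits."""
    if requested is None:
        return limits
    return clamp_constraint(limits, requested)
```

For a node whose tilt limits are -45 to +45 degrees, a requested constraint of -60 to +20 degrees would be narrowed to -45 to +20 degrees in this model.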
Each node also has a set of control settings, which determine the behavior of the QuickTime VR movie controller when the user reaches a viewing constraint. For example, the kQTVRWrapPan control setting determines whether the user can wrap around from the current pan constraint maximum value to the pan constraint minimum value (or vice versa) using the mouse or arrow keys. When this setting is enabled, panning past the maximum or minimum pan constraint is allowed. When this setting is disabled, the user cannot pan across the current viewing constraints; when the user reaches a viewing constraint, further panning in that direction is disabled.
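The difference between wrapping and stopping at a pan constraint can be sketched as follows. This Python function is a simplified model of the behavior described for the kQTVRWrapPan setting; the function name and parameters are hypothetical, and the real movie controller may differ in detail.

```python
def apply_pan(pan, pan_min, pan_max, wrap):
    """Return the pan angle that results when the user requests `pan`.

    wrap=True  -- panning past one constraint continues from the other
    wrap=False -- panning stops at the constraint that was reached
    """
    if pan_min <= pan <= pan_max:
        return pan
    if not wrap:
        return min(max(pan, pan_min), pan_max)
    span = pan_max - pan_min
    if span == 0:
        return pan_min
    # Wrap the overshoot around into the constrained range.
    return pan_min + (pan - pan_min) % span
```

With constraints of 0 and 360 degrees, a request for 370 degrees yields 10 degrees when wrapping is enabled and 360 degrees when it is disabled.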
QuickTime VR File Format
A QuickTime VR movie is stored on disk in a format known as the QuickTime VR file format. Beginning in QuickTime VR version 2.0, a VR movie file can contain one or more nodes. Each node is either a panorama or an object. In addition, a QuickTime VR movie can contain links between any two types of nodes.
All QuickTime VR movies contain a single QTVR track, a special type of QuickTime track that maintains a list of the nodes in the movie. Each individual sample in a QTVR track contains general information and hot spot information for a particular node. If a QuickTime VR movie contains any panoramic nodes, that movie also contains a single panorama track, and if it contains any object nodes, it also contains a single object track. The panorama and object tracks contain information specific to the panoramas or objects in the movie.
The actual image data for both panoramas and objects is stored in standard QuickTime video tracks, hereafter referred to as image tracks. The individual frames in the image track for a panorama make up the diced frames of the original single panoramic image. The frames for the image track of an object represent the many different views of the object. Hot spot image data is stored in parallel video tracks for both panoramas and objects.
As mentioned earlier, the QTVR track is a special type of QuickTime track that maintains a list of all the nodes in a movie. The media type for a QTVR track is 'qtvr'. All the media samples in a QTVR track share a common sample description, which contains the VR world atom container. The VR world atom container includes such information as the name for the entire scene, the default node ID, and default imaging properties, as well as a list of the nodes contained in the QTVR track.
Note: A VR world can also contain custom scene information. QuickTime VR ignores any atom types that it doesn't recognize, but you can extract those atoms from the VR world using standard QuickTime atom functions.
The QTVR track contains one media sample for each node in the movie. Each sample contains a node information atom container. The node information includes general information about the node such as the node's type, ID, and name. The node information atom container also contains the list of hot spots for the node.
Note: In QuickTime VR movie files, all angular values are stored as 32-bit floating-point values that specify degrees. In addition, all floating-point values conform to the IEEE Standard 754 for binary floating-point arithmetic.
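To illustrate what a 32-bit IEEE 754 angular value looks like on disk, the following Python sketch packs and unpacks an angle in degrees. The big-endian byte order used here is an assumption (the text above does not state the byte order), and the function names are hypothetical.

```python
import struct


def pack_angle(degrees):
    """Encode an angle in degrees as a 4-byte IEEE 754 single-precision
    value. Assumption: big-endian byte order."""
    return struct.pack(">f", degrees)


def unpack_angle(data):
    """Decode a 4-byte IEEE 754 single-precision value into degrees."""
    return struct.unpack(">f", data)[0]
```

Values such as 90.0 or -45.0 are exactly representable in single precision and survive a round trip unchanged; other angles may be rounded to the nearest representable value.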
A movie's panorama track is a track that contains information about the panoramic nodes in a scene. (The media type of the panorama track is 'pano'.) Each sample in a panorama track corresponds to a single panoramic node. This sample parallels the corresponding node in the QTVR track. Panorama tracks do not have a sample description (although QuickTime requires that you specify a "dummy" sample description when you call AddMediaSample to add a sample to a panorama track). The sample itself contains the data that specifies the track reference indexes of the scene and hot spot tracks. In addition, the sample contains information about how the panoramic image has been diced, as well as the viewing angle limits and default view angles for the panorama. The image and hot spot tracks for a panorama can be obtained using the QuickTime GetTrackReference function.
Note: The panorama image track is referenced by a track reference of type kQTVRImageTrackRefType; the hot spot image track is referenced by a track reference of type kQTVRHotSpotTrackRefType.
The actual panoramic image for a panoramic node is contained in a panorama image track, which is a standard QuickTime video track. The panoramic image can be created in many ways. You can use the stitcher provided by the QuickTime VR Authoring Suite to stitch together several photographs. Alternatively, you can use a graphics rendering application or a panoramic camera.
QuickTime VR version 2.0 requires the original panoramic image to be rotated 90 degrees counterclockwise. The rotated image must then be diced into smaller frames. Each diced frame is compressed and added to the video track as a video sample. Frames can be compressed using any spatial compressor; temporal compression is not allowed for panoramic movies.
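The rotate-then-dice preparation can be sketched on a tiny image represented as a list of pixel rows. The Python code below is only an illustration: the function names are hypothetical, and dicing the rotated image into equal-height strips from top to bottom is an assumption about how the frames are cut. Compression is omitted.

```python
def rotate_ccw(image):
    """Rotate a list-of-rows image 90 degrees counterclockwise."""
    # Transposing and then reversing the row order turns the
    # rightmost column of the original into the top row of the result.
    return [list(row) for row in zip(*image)][::-1]


def dice(image, frame_count):
    """Split an image into frame_count strips of equal height."""
    rows = len(image)
    if rows % frame_count:
        raise ValueError("image height is not divisible by frame count")
    height = rows // frame_count
    return [image[i * height:(i + 1) * height] for i in range(frame_count)]


# A 4-pixel-wide, 2-pixel-high stand-in for a panoramic image.
panorama = [[0, 1, 2, 3],
            [4, 5, 6, 7]]
rotated = rotate_ccw(panorama)  # now 2 pixels wide and 4 pixels high
tiles = dice(rotated, 2)        # two diced frames, each 2 by 2
```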
Note: A panorama sample atom (which contains information about a single panorama) contains a flag that indicates whether the diced panoramic image is oriented horizontally or vertically. Currently, only vertical orientation is supported. Future versions of QuickTime VR might support horizontal panoramic images.
It's possible to store one or more low-resolution versions of a panoramic image in a movie file (called low-resolution image tracks). If there is not enough memory at runtime to use the normal image track, QuickTime VR uses a lower resolution image track, if available. A low-resolution image track contains diced frames just like the higher resolution track, but the reconstructed panoramic image is half the height and half the width of the higher resolution image. The number of diced frames in the lower resolution image track is usually half that in the normal image track.
When a panorama contains hot spots, the movie file contains a hot spot image track, a video track that contains a parallel panorama with the hot spots designated by colored regions. Each diced frame of the hot spot panoramic image must be compressed with a lossless compressor (such as QuickTime's graphics compressor). The dimensions of the hot spot panoramic image are usually the same as those of the image track's panoramic image, but this is not required. The dimensions must, however, have the same aspect ratio as the image track's panoramic image. A hot spot image track should be 8 bits deep.
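The requirements on a hot spot image track lend themselves to a simple validation routine. The Python sketch below checks the three conditions stated above; the function name, parameters, and message strings are hypothetical, and the example dimensions are made up for illustration.

```python
def check_hot_spot_track(image_size, hot_spot_size, depth, lossless):
    """Return a list of problems found; an empty list means the hot spot
    track satisfies the conditions described in the text."""
    problems = []
    (w1, h1), (w2, h2) = image_size, hot_spot_size
    # Compare aspect ratios by cross-multiplying to avoid rounding errors.
    if w1 * h2 != w2 * h1:
        problems.append("aspect ratio differs from the image track")
    if not lossless:
        problems.append("hot spot frames must use a lossless compressor")
    if depth != 8:
        problems.append("hot spot image track should be 8 bits deep")
    return problems
```

A hot spot image of 1248 by 384 pixels paired with a 2496 by 768 panoramic image passes the aspect ratio check, since the dimensions need not be identical.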
IMPORTANT: In QuickTime VR version 2.0, the panoramic images in the lower resolution image tracks and the hot spot image tracks, if present, must be rotated 90 degrees counterclockwise (just like the image in the panorama image track).
A movie's object track is a track that contains information about the object nodes in a scene. (The media type of the object track is 'obje'.) Each sample in an object track corresponds to a single object node in the scene. The samples of the object track contain information describing the object images stored in the object image track. These object information samples parallel the corresponding node samples in the QTVR track and are equal in time and duration with a particular object node's hot spot samples in the object's hot spot track as well as the object node's image samples in the object's image track.
Note: In QuickTime VR version 2.0, all objects in a single movie must be of the same size (that is, the same number of pixels high and the same number of pixels wide).
Object tracks do not have a sample description (although QuickTime requires that you specify a "dummy" sample description when you call AddMediaSample to add a sample to an object track). Currently, the sample itself is an atom container that contains a single object sample atom; in the future, the sample might contain optional atoms describing new or private features of a particular object. In contrast with the panorama sample atom, track references in the object are not described in the object sample atom. The associated image and hot spot tracks for an object can be obtained using the QuickTime GetTrackReference function.
Note: The object image track is referenced by a track reference of type kQTVRImageTrackRefType; the hot spot image track is referenced by a track reference of type kQTVRHotSpotTrackRefType.
The actual views of an object for an object node are contained in an object image track, which is a standard QuickTime video track. As described earlier in the discussion of object nodes, these views are often captured by moving a camera around the object in a defined pattern of pan and tilt angles. The views must then be ordered into an object image array, which is stored as a one-dimensional sequence of frames in the movie's video track.
For object movies containing frame animation, each animated view in the object image array consists of the animating frames. It is not necessary that each view in the object image array contain the same number of frames, but the view duration of all views in the object movie must be the same.
For object movies containing alternate view states, alternate view states are stored as separate object image arrays that immediately follow the preceding view state in the object image track. Each state does not need to contain the same number of frames. However, the total movie time of each view state in an object node must be the same.
QuickTime VR Movie Files
A QuickTime VR movie file is a QuickTime movie file. The only differences between a QuickTime VR movie file and a typical time-based QuickTime movie file are the type and usage of the tracks contained in the movie and the necessary QuickTime user data attached to the movie. In particular, for the Mac OS, the file type should be 'MooV'; for Windows systems, the file extension should be .mov.
Also, as with any QuickTime movie that is intended to be played on both operating systems, a data fork version of the file should be created using the FlattenMovie function with the flattenAddMovieToDataFork flag set. Note that the resulting file is optimized for random access and might not perform well in an environment that requires streaming access (such as Web browsing).
For a single-node object movie, the QTVR track contains just one sample. There is a corresponding sample in the object track, whose time and duration are the same as the time and duration of the sample in the QTVR track. The time base of the movie is used to locate the proper video sample in the object image track. For an object movie, the frame corresponding to the first row and column in the object image array is located at the same time as the corresponding QTVR and object track samples. The duration of all the video samples is the same as the duration of the corresponding QTVR sample and the object sample.
In addition to these three required tracks, an object movie can also contain a hot spot image track and any number of standard QuickTime tracks (such as video, sound, and text tracks). A hot spot image track for an object is a QuickTime video track that contains images of colored regions delineating the hot spots; an image in the hot spot image track must be synchronized to match the appropriate image in the object image track. A hot spot image track should be 8 bits deep and can be compressed with any lossless compressor (including temporal compressors).
To play a time-based track with the object movie, you must synchronize the sample data of that track to the start and stop times of a view in the object image track. For example, to play a different sound with each view of an object, you might store a sound track in the movie file with each set of sound samples synchronized to play at the same time as the corresponding object's view image. (This technique also works for video samples.) Another way to add sound or video is simply to play a sound or video track during the object's view animation; to do this, you need to add an active track to the object that is equal in duration to the object's row duration.
For a single-node panoramic movie, the QTVR track contains just one sample. There is a corresponding sample in the panorama track, whose time and duration are the same as the time and duration of the sample in the QTVR track. The time base of the movie is used to locate the proper video sample in the panorama image track. For a panoramic movie, the video sample for the first diced frame of a node's panoramic image is located at the same time as the corresponding QTVR and panorama track samples. The duration of all the video samples is the same as the duration of the corresponding QTVR sample and the panorama sample.
Like an object movie, a panoramic movie can contain an optional hot spot image track and any number of standard QuickTime tracks. A panoramic movie can also contain panoramic image tracks with a lower resolution. The video samples in these low-resolution image tracks must be located at the same time and must have the same total duration as the QTVR track. Likewise, the video samples for a hot spot image track, if one exists, must be located at the same time and must have the same total duration as the QTVR track.
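The timing requirement on the optional tracks can be checked mechanically. In the Python sketch below, each track is summarized as a hypothetical (start time, total duration) pair, and the function reports which tracks do not line up with the QTVR track; the names and time values are illustrative only.

```python
def misaligned_tracks(qtvr_span, other_spans):
    """Return the names of tracks whose (start, duration) pair differs
    from that of the QTVR track, sorted alphabetically."""
    return sorted(name for name, span in other_spans.items()
                  if span != qtvr_span)


qtvr = (0, 3600)  # (start time, total duration) of the QTVR track
tracks = {
    "low-res image": (0, 3600),   # matches the QTVR track
    "hot spot image": (0, 1800),  # too short: violates the requirement
}
bad = misaligned_tracks(qtvr, tracks)
```

Here `bad` contains only the hot spot image track, whose total duration does not match that of the QTVR track.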