The Quicktime VR framework consists of the following parts:
Currently, Quicktime VR uses cylindrical projections, or panoramic images, to accomplish camera rotation. The environment maps for cylindrical projections are easier to create. Commercial cameras support vertical rotation. Image warping for object viewing is easier to perform, because cylindrical projections curve in one direction only. Quicktime VR also uses a real-time image processing engine for navigating in space and for authoring VR applications. This engine supports two players: a panorama player and an object player. The panorama player supports panning, zooming, navigating and reacting to HotSpots (places within a panorama where the developer can define entry points for interaction). The object player supports rotating objects and viewing them from different directions. The panorama authoring environment supports panoramic image stitching, HotSpot marking, linking, dicing and compression. Object movies are constructed with a motion-controllable camera. Players are available for Windows and Macintosh systems; the authoring system is available for Macintosh systems only.
Quicktime movies are sequential, one-dimensional movies that may contain multiple tracks. Panoramic movies are multi-dimensional, event-driven, spatially oriented movies in which panning, zooming and navigating are performed interactively. To fit multi-dimensional movies into the common linear movie framework, a new track type for panoramas is added. A panoramic movie consists of three tracks. The first track holds the nodes (the viewpoints of the panoramas) and stores the node data and the corresponding nodes. This track describes the directed node graph and can also be triggered by external events (e.g. mouse clicks). The second track stores the panoramic images, which are stored segmented and non-linearly. The third track holds the data for the HotSpots defined for every panorama. The image track and the HotSpot track are hidden from the player, so the player does not attempt to play them back sequentially. On slow storage media the tracks should be stored interleaved for faster access. Interframe compression is not used within panoramic movies.
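The directed node graph described above can be pictured with a small data structure. The following is only a sketch: the names Node, ViewState and follow_hotspot are hypothetical and do not reflect the actual track format; it merely illustrates nodes linked through HotSpot identifiers, with the viewing direction carried over when the user jumps to another node.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Node:
    """One viewpoint of a panoramic movie (hypothetical in-memory layout)."""
    node_id: int
    # HotSpot identifier -> identifier of the node that HotSpot links to.
    links: Dict[int, int] = field(default_factory=dict)


@dataclass
class ViewState:
    node_id: int
    pan_deg: float  # current horizontal viewing direction in degrees


def follow_hotspot(nodes: Dict[int, Node], view: ViewState, hotspot_id: int) -> ViewState:
    """Follow a directed edge of the node graph, if the HotSpot defines one."""
    target = nodes[view.node_id].links.get(hotspot_id)
    if target is None:
        return view  # this HotSpot is not a link; stay at the current node
    # Carry the viewing direction over to the new node.
    return ViewState(node_id=target, pan_deg=view.pan_deg)
```

For example, with nodes = {1: Node(1, {7: 2}), 2: Node(2)}, following HotSpot 7 from node 1 leads to node 2 and leaves pan_deg unchanged.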
An object movie is designed as a two-dimensional array of frames. Each frame corresponds to a viewing direction. Currently, the same number of frames is required for each direction. For time-varying objects the movie may have more than two dimensions. The images are stored linearly on one track. The number of rows and columns and the indices are stored in the movie track header. The organization of the images is optimized for horizontal rotation. Interframe compression may be used for the time-varying versions.
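How a two-dimensional array of views can live on one linear track may be easier to see in code. The sketch below assumes a row-major layout in which each row is one horizontal sweep, so that horizontal rotation touches neighbouring frames; this layout, the function name frame_index and its parameters are assumptions for illustration, not the documented file format.

```python
def frame_index(pan_deg: float, tilt_row: int, columns: int, rows: int) -> int:
    """Map a viewing direction to a position on the linear frame track.

    Assumes a row-major layout: all frames of one vertical level are stored
    consecutively, one frame per equal horizontal increment.
    """
    if not 0 <= tilt_row < rows:
        raise IndexError("tilt_row outside the captured vertical levels")
    step = 360.0 / columns                   # horizontal increment per frame
    col = int(round(pan_deg / step)) % columns  # wrap around the full circle
    return tilt_row * columns + col
```

With 36 columns (10° increments) and 3 rows, a pan of 10° on the first row maps to frame 1, and a pan of 350° on the second row maps to frame 71.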
Horizontal rotation is possible through the full 360°. Looking up and down is supported, but not all the way, because of the cylindrical projection. Rotating about the viewing axis is currently not supported. Moving is done by holding the mouse button and moving the mouse simultaneously (this is called panning). Zooming is supported through image magnification. If multiple resolutions are available, the player chooses the best one with respect to current memory usage, CPU performance and disk speed. Multiple-level zooming is not supported. The image is segmented and compressed on disk. Only the tiles overlapping the current view are decompressed and loaded into a read buffer in memory. The visible region of the current view is then taken from the read buffer, warped and displayed. The image warp projects the cylindrical map onto a planar view using a two-pass run-time algorithm. The performance reached is shown in the table below:
Processor        1D Panning    2D Panning
PowerPC601/80    29.5          11.6
MC68040/40       12.3          5.4
Pentium/90       11.4          7.5
486/33           5.9           3.6
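The geometry behind the cylindrical-to-planar warp can be illustrated with a direct per-pixel mapping. This is not the two-pass algorithm used by the player, only a plain sketch of the underlying projection; the function name view_to_panorama and the parameter focal_px (distance of the view plane from the viewpoint, in pixels) are assumptions introduced here.

```python
import math


def view_to_panorama(x, y, pan_rad, focal_px, pano_width, pano_height):
    """Find the panorama coordinate seen at view-plane pixel (x, y).

    (x, y) are measured from the centre of the planar view. The panorama is
    assumed to cover 360 degrees horizontally, so its width fixes the
    cylinder radius in pixels.
    """
    radius = pano_width / (2.0 * math.pi)
    # Horizontal angle of the viewing ray around the cylinder axis.
    angle = pan_rad + math.atan2(x, focal_px)
    # Height at which the ray hits the cylinder wall.
    height = radius * y / math.hypot(x, focal_px)
    u = (angle / (2.0 * math.pi) * pano_width) % pano_width
    v = pano_height / 2.0 + height
    return u, v
```

A returned v outside the range 0 to pano_height means the ray leaves the cylinder through its open top or bottom, which is why looking up and down is limited.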
While the user is panning, the player switches to a lower display quality. When the user moves to another node, the player must maintain the viewing direction of the current panorama. Panoramas are linked together manually in the authoring environment. Together with continuous zooming in and out and jumping to a HotSpot, this completes the user actions of the panoramic player.
While the panoramic player is designed for looking around in a virtual room, the object player supports viewing an object. Depending on the frames created for the object, it can be rotated horizontally or, additionally, vertically. The frames are stored in a two-dimensional array, with a black background for best contrast and constant colors. Multiple frames can be looped continuously for a look-around feeling (e.g. a flickering candle or an animated waterfall).
The figure below describes the environment components and the sequential use for constructing a virtual reality.
Nodes are defined within a space. They are later used as the viewpoints for all panoramas in the virtual scene. Panoramas are created by rendering, by panoramic photography or by stitching overlapping photographs. Adjacent images should overlap by at least 30%, and optimally by 50%. The success rate of stitching adjacent pictures together automatically is 80%. This rate can only be reached with near-perfect horizontal alignment, so precise photography is a requirement for automating the stitching process. The stitcher can correct under- or overexposed images to produce a balanced panorama without losing lighting effects. Images can also be stored on Kodak PhotoCD for further processing. Useful resolutions are 768x512 or 2500x768 pixels. The stitcher takes 5 minutes to stitch 12 pictures together on a PowerPC with an 80 MHz CPU.
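The overlap recommendation translates directly into the number of photographs needed for a full circle. The sketch below is a back-of-the-envelope calculation; the function name shots_for_full_circle and the 60° horizontal field of view used in the example are assumptions, since the lens is not specified here.

```python
import math


def shots_for_full_circle(fov_deg: float, overlap: float) -> int:
    """Number of photographs needed to cover 360 degrees.

    fov_deg is the horizontal field of view of one photograph; overlap is
    the fraction of each image shared with its neighbour (0.3 to 0.5 is the
    recommended range).
    """
    if not 0.0 <= overlap < 1.0:
        raise ValueError("overlap must be a fraction in [0, 1)")
    new_coverage = fov_deg * (1.0 - overlap)  # degrees added by each shot
    return math.ceil(360.0 / new_coverage)
```

Under the assumed 60° field of view, the optimal 50% overlap requires 12 shots and the minimum 30% overlap requires 9.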
4.4.1.1. HotSpot Marking
Each HotSpot within a panorama gets a unique identifier through its color code. On a typical 256-color display, a maximum of 256 HotSpots per panorama is therefore allowed. The resolution of the HotSpot image is independent of the resolution of the panorama. If HotSpots must be picked with high accuracy, a higher resolution is recommended.
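Picking a HotSpot then amounts to reading the color code under the cursor. The sketch below assumes the HotSpot image is available as a two-dimensional list of 8-bit color indices; the function name hotspot_at and this in-memory representation are hypothetical. Because the HotSpot image may have a different resolution than the panorama, the panorama coordinate is scaled first.

```python
def hotspot_at(hotspot_map, pano_x, pano_y, pano_width, pano_height):
    """Return the color code (HotSpot identifier) under a panorama pixel.

    hotspot_map is a list of rows of 8-bit color indices and may be coarser
    or finer than the panorama itself.
    """
    rows = len(hotspot_map)
    cols = len(hotspot_map[0])
    # Scale panorama coordinates to the HotSpot image, clamping at the edge.
    col = min(int(pano_x * cols / pano_width), cols - 1)
    row = min(int(pano_y * rows / pano_height), rows - 1)
    return hotspot_map[row][col]
```

With a coarse map, a click near a HotSpot border may resolve to the wrong identifier, which is why a higher resolution gives more accurate picking.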
4.4.1.2. Dicing and Compression
Images are segmented and stored on mass media. The segments covering the current view are loaded into a read buffer; the visible part of it is warped and displayed. A small number of segments per image demands a large screen buffer. Splitting a 2500x768 image into 24 vertical tiles is an optimal compromise between data loading and tile paging. A compressed panorama at this resolution takes 500 KB, so 1000 panoramas fit on one CD-ROM.
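Which tiles have to be loaded for a given view can be worked out from the tile width alone. The following sketch assumes equally wide vertical tiles and a view described by its left edge and width in panorama pixels; the function name visible_tiles and these parameters are assumptions for illustration.

```python
import math
from typing import List


def visible_tiles(left_px: float, view_width_px: float,
                  pano_width: int = 2500, tiles: int = 24) -> List[int]:
    """List the vertical tiles overlapping a horizontal span of the panorama.

    The span may run past the right edge, in which case it wraps around to
    the first tiles, since the panorama covers a full circle.
    """
    tile_w = pano_width / tiles  # about 104 pixels for 2500 / 24
    first = math.floor(left_px / tile_w)
    last = math.floor((left_px + view_width_px - 1) / tile_w)
    count = min(last - first + 1, tiles)  # never list a tile twice
    return [(first + i) % tiles for i in range(count)]
```

A 640-pixel-wide span starting at the left edge touches tiles 0 to 6, so only 7 of the 24 tiles need decompressing. As for storage, 1000 panoramas of 500 KB each come to roughly 500 MB, which is within the roughly 650 MB of a standard CD-ROM.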
Object movies require photographing the object from different views. The camera points at the object's center while orbiting it in constant increments. Computer-generated objects can easily be captured by defining the views in software. Real objects can be captured with a special device that orbits the camera and sends data to a frame grabber card after each increment. Automation is expensive ($10,000) and takes 1 hour for a 360° view at 10° increments. Multiple-pass capturing is necessary for photographing different vertical levels.