
CD-ROM Common Questions and Answers

                     Common Questions and Answers


 MSCDEX (the Microsoft CD-ROM Extensions) software standardizes the
 way in which all CD-ROM drives are accessed by a PC. On its release
 it could not fail to succeed, as it was the only independent software
 which allowed the CD-ROM disc to appear as one big floppy disc to the
 end user. The following versions are likely still to be in use. If
 you are not up to date then contact your drive supplier to receive a
 new copy.

 Version 1.1:   Supported reading High Sierra Group (HSG) files.
 Version 2.0:   Supported reading High Sierra and ISO 9660 discs and
                supported standard audio functions.
 Version 2.1:   Supports interleaved audio in CD-ROM XA; compatible
                with MS-DOS 4.0 and MS-NET.
 Version 2.2:   Supports MS-DOS version 5.0.

 High Sierra (HSG) and ISO 9660 formats

 The High Sierra Group (HSG) standard was the first attempt to lay down
 a formatting structure for a CD-ROM (much like the 1.4 Megabyte
 formatting standard for PC floppy disks). Its release allowed
 developers of CD products to ensure that as wide a range of computer
 users as possible could read their CDs.

 Later, a  few small  changes were  made for the sake of efficiency and
 computer compatibility and the ISO 9660 standard was born.

 ISO 9660 access software (or drivers, as they are known) is available
 for practically every computer platform, making CD-ROM the most
 platform-independent medium ever to exist.

 An ISO 9660 formatted disc can be put into a PC machine and a DIR
 command performed. In an Apple machine a desktop can be opened; on a
 Sun UNIX system a file list operation can be performed.

 Most PC-orientated CD-ROM discs are ISO 9660 these days.

 HFS and other proprietary formats

 A few inadequacies in the ISO 9660 format do exist. File names can
 only consist of the characters A to Z, the digits 0 to 9 and
 underscore. A dot is allowed as a name/extension separator. No
 allowance is made for other characters, including spaces and lower
 case letters, or for names longer than eight characters. Even worse,
 the standard does not easily accommodate file resources. This of
 course does not go down too well in the Apple world: for one thing
 you lose the pretty icons with ISO 9660 format.

 As  an   alternative  the   Apple  systems   can  cope  with  the  HFS
 (hierarchical file  system) format.   This  is,  basically,  a  direct
 sector for  sector copy  of an  Apple system hard disk and is the most
 common format used in the Apple world.  More recently a similar system
 has been  used in  the Sun/Unix  world.   VAX VMS  systems also  use a
 device format system.

 Nimbus Information  Systems can manufacture CD-ROM discs with both ISO
 9660 and Apple HFS images on the same disc, keeping both camps happy.

 To get around this problem of multiple standards, the Rock-Ridge
 format was devised as a form of ISO+. Whilst very similar to ISO
 9660, it allows extended name characters (much like the
 Commodore/CDTV version of ISO) and file resources. It also copes with
 the mixed mode and form disc formats used by the XA, CD-I and Bridge
 standards.

 How Big is a CD-ROM?

 Newcomers to the world of CD are often confused by the numerous and
 varying figures quoted for the capacity of a CD-ROM. This is mostly
 due to the fact that CD is a time-domain medium: size relates to
 playing time. Nimbus has the world's most accurate mastering lathe
 and can cut discs up to 79 minutes 35 seconds long. Allowing some
 overhead for directory structures, path tables, system areas and a
 little more besides, let's work on 79 minutes. The CD-ROM capacity is
 therefore:

            79 minutes x 60 seconds x 75 blocks x 2,048 bytes

 or                  728,064,000 bytes
                  =      711,000 kilobytes
                  =          694 megabytes

 Let that be it once and for all!
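
 The arithmetic above can be checked with a short Python sketch. The
 function and constant names are illustrative only and do not come
 from the original text; the figures (75 blocks per second, 2,048
 bytes per block) are the ones quoted above.

```python
# Capacity of a CD-ROM worked out from its playing time.
BLOCKS_PER_SECOND = 75
BYTES_PER_BLOCK = 2048


def cdrom_capacity_bytes(minutes, seconds=0):
    """Return the data capacity in bytes for a given playing time."""
    total_seconds = minutes * 60 + seconds
    return total_seconds * BLOCKS_PER_SECOND * BYTES_PER_BLOCK


capacity = cdrom_capacity_bytes(79)
print(capacity)                    # bytes
print(capacity // 1024)            # kilobytes (1K = 1,024 bytes)
print(capacity // (1024 * 1024))   # megabytes, rounded down
```

 Running it for 79 minutes reproduces the three figures given above.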

 Blocking factors and efficiency

 Having said that the capacity of a CD-ROM is 694 megabytes, the usable
 capacity may be less. This is due to the blocking structure of the
 disc. As on your hard disk, the smallest part of the disc that can be
 accessed is a sector. A one byte file on your hard disk will take up
 512 bytes of storage. A one byte file on a CD-ROM will consume 2,048
 bytes (the sector or block size). Thus if you have 1,000,000 one byte
 files you will not get them onto a CD-ROM, even though they total
 less than one megabyte! For this reason, efficient CD-ROMs have a
 small number of very large files instead of a large number of very
 small files.
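
 As a rough illustration of the blocking effect, the following Python
 sketch rounds every file up to whole 2,048 byte sectors and compares
 a million one byte files with a single file holding the same data.
 The names are hypothetical, and directory and path table overhead is
 ignored, so real discs would fare slightly worse.

```python
import math

SECTOR_BYTES = 2048                          # CD-ROM block size from the text
DISC_BYTES = 79 * 60 * 75 * SECTOR_BYTES     # 728,064,000 bytes


def bytes_consumed(file_sizes):
    """Total disc space used when each file occupies whole sectors."""
    return sum(math.ceil(size / SECTOR_BYTES) * SECTOR_BYTES
               for size in file_sizes)


many_small = bytes_consumed([1] * 1_000_000)   # a million one byte files
one_large = bytes_consumed([1_000_000])        # the same data as one file

print(many_small, many_small > DISC_BYTES)     # does not fit
print(one_large, one_large > DISC_BYTES)       # fits easily
```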

 Audio digitization levels

 The main levels of audio digitization that can be used on a CD-ROM
 (other than proprietary file formats and MPC standards) are CD-Digital
 Audio (PCM, pulse code modulation) and the CD-I ADPCM (adaptive delta
 pulse code modulation). These standards allow the following audio
 capacities on a CD-ROM:

              Sampling  Resolution  Bandwidth  Playing time            Equivalent
              rate                             (S = stereo, M = mono)  quality
     CD-DA    44.1 kHz  16 bit      20 kHz     1 hr S                  DA
     Level A  37.8 kHz   8 bit      17 kHz     2 hr S / 4 hr M         LP
     Level B  37.8 kHz   4 bit      17 kHz     4 hr S / 8 hr M         FM
     Level C  18.9 kHz   4 bit      8.5 kHz    8 hr S / 16 hr M        AM


 All about CD-audio

 CD-Audio can play on the vast majority of installed players (only
 some of the older models didn't have the audio circuits included) and
 is easily controlled by simple software. Users must install their
 CD-ROM drive with MSCDEX version 2.0 or greater (see section on
 MSCDEX). The two main problems with using this format to store audio
 information are that you can only play audio OR load data at any one
 time, and that you are limited to a maximum of one hour of stereo or
 two hours of mono sound. Playing audio whilst displaying images can
 only be done by:

      a)   buffering images in memory before playing an audio sequence,

      b)   pre-installing images onto the user's hard disk, so sourcing
           information from two media at once,

      c)   head-whizzing -  load an  image, play  audio, load an image,
           play audio etc...

 CD-Audio can be accessed in two ways through any standard CD-ROM drive
 with audio capabilities:

      a)   by track. Note that the CD-ROM track must be track one;
           audio tracks will then be available as tracks 2 to 99.

      b)   by time code. Note that a maximum resolution of 75 frames
           per second is accessible. In our experience you should aim
           for an accuracy of +/- 2 frames between players of the same
           model and +/- 4 frames between players of different
           manufacturers.

 In both  cases you  can choose  to play left, right channels or stereo
 (or even mute which you can use for synchronisation).
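
 Time code addressing amounts to counting frames at 75 per second. The
 Python sketch below converts between a minutes/seconds/frames time
 code and a plain frame count, and checks a start position against the
 tolerances suggested above. All names are hypothetical, and the frame
 count is simply measured from time 00:00:00; any offset a particular
 driver applies is not modelled.

```python
FRAMES_PER_SECOND = 75


def msf_to_frames(minutes, seconds, frames):
    """Convert a minutes/seconds/frames time code to a frame count."""
    if not (0 <= seconds < 60 and 0 <= frames < FRAMES_PER_SECOND):
        raise ValueError("seconds must be 0-59 and frames 0-74")
    return (minutes * 60 + seconds) * FRAMES_PER_SECOND + frames


def frames_to_msf(total_frames):
    """Convert a frame count back to (minutes, seconds, frames)."""
    minutes, rest = divmod(total_frames, 60 * FRAMES_PER_SECOND)
    seconds, frames = divmod(rest, FRAMES_PER_SECOND)
    return minutes, seconds, frames


def within_tolerance(requested, actual, tolerance_frames):
    """True if playback started close enough to the requested frame."""
    return abs(requested - actual) <= tolerance_frames


start = msf_to_frames(2, 30, 10)
print(start, frames_to_msf(start))
print(within_tolerance(start, start + 3, 2))   # same model: +/- 2 frames
print(within_tolerance(start, start + 3, 4))   # other makes: +/- 4 frames
```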

 Level A, B and C ADPCM

 Level A ADPCM is no longer recognized as a usable standard and is
 unlikely to be supported in future machines. Whilst the quality is
 quite high (roughly equivalent to that of LP records), the potential
 gain in additional audio time was thought not to be worth pursuing.

 Level B ADPCM is commonly used in CD-I and CD-ROM XA material as a
 reasonable quality compression standard for music, roughly equivalent
 to the sound you would obtain from an FM radio station (but with no
 horrible DJs!).

 Level C  ADPCM gives  you  a  great  quantity  of  audio  but  with  a
 substantial reduction  in quality  and is  usually only considered for
 speech - where a high bandwidth is not required.

 Level B  and C  decoding are  supported by  the  CD-I  and  CD-ROM  XA
 machines which  have yet  to penetrate  the market  to a large degree.
 Both levels of audio require that the disc be mastered in Mode 2, form
 2 (see discussion of standards and modes) so the mastering house needs
 to know.

 The big advantage that ADPCM audio has over CD-DA is that the sound
 can be interleaved with data, enabling your software to load images
 and text whilst simultaneously playing audio sections. This
 interleaving can also be performed between 'channels' of audio,
 allowing the software to switch between multilingual tracks or
 different backing tracks 'on the fly'. Most players will also play up
 to four channels simultaneously, allowing great flexibility in the
 sound presented to the end-user.

 Interleaving, how it works and what you lose

 Interleaving is  at the  expense of  audio capacity  and is  sometimes
 quite tricky to work out.  It is best considered by using an example:

      A sequence  requires two  different channels  of audio in level B
      mono, one a music background track, one a descriptive voice over.
      What is the data rate available for simultaneous images?

      Level B mono audio will consume one eighth of an available second
      per second  of CD  playing time.  Thus, two simultaneous channels
      will consume  one quarter  of a  second per  second playing time.
      Three quarters  of a  second per  second is  available for  other
      data.   Each second  of the  CD in Mode 2, Form 2 has 2,336 bytes
      times 75  blocks.  So three quarters of 2,336 times 75 is 131,400
      bytes per  second.  If your average SVGA image takes up 100K that
      means  you   can  pipe   off  1.3   images  per   second   whilst
      simultaneously playing one or both of the audio tracks.
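
 The worked example can be generalised with a small Python sketch. The
 share of playing time taken by each channel is inferred from the
 hours quoted in the audio capacity table (CD-DA stereo counted as the
 whole disc), and the 2,336 bytes per block figure is taken from the
 example as it stands. The names are hypothetical.

```python
from fractions import Fraction

BLOCKS_PER_SECOND = 75
BYTES_PER_BLOCK = 2336      # figure used in the worked example

# Share of each second of playing time used by one audio channel,
# inferred from the playing times in the audio capacity table.
AUDIO_SHARE = {
    ("A", "stereo"): Fraction(1, 2),
    ("A", "mono"): Fraction(1, 4),
    ("B", "stereo"): Fraction(1, 4),
    ("B", "mono"): Fraction(1, 8),
    ("C", "stereo"): Fraction(1, 8),
    ("C", "mono"): Fraction(1, 16),
}


def data_rate_left(channels):
    """Bytes per second left for other data after the audio channels."""
    used = sum(AUDIO_SHARE[channel] for channel in channels)
    if used > 1:
        raise ValueError("audio channels exceed the disc data rate")
    return (1 - used) * BLOCKS_PER_SECOND * BYTES_PER_BLOCK


# Two level B mono channels: background music plus a voice over.
rate = data_rate_left([("B", "mono"), ("B", "mono")])
print(int(rate))                        # bytes per second for images
print(float(rate) / (100 * 1024))       # 100K images per second
```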


                        Making Efficient CD-ROMs

 System Elements

 Optimising a CD-ROM product is not simply a case of analysing the
 CD-ROM architecture. The performance of a CD-ROM is governed as
 much by the system elements of the target machine as by the medium.
 A developer must understand each element of such a system and
 optimise accordingly. We will address all of these elements below:
 the computer's operating system, the operating system extension,
 the device driver, the CD-ROM drive hardware and, of course, the
 CD-ROM itself.

 We will approach this by analysing first the structure of a CD-ROM
 and its effect on performance with regard to the system.

 Directory Structures

 The logical  sequence of  events when  your programme  requests a
 file access  has to  be well  understood before  optimisation can
 occur. When a request is sent to open a file the operating system
 will first load the path table on the CD-ROM to find the location
 of the directory record. Next the directory record will be loaded
 to find  the file location, then the byte offset is calculated to
 allow the  operating system to finally read  the relevant section
 of the  datafile. It can be seen then that this can involve three
 disc accesses in order to read a file.

 All directory records and path tables are stored in 2K logical
 blocks. The optimum directory size is therefore related to this
 block size, such that a list of 40 files (this is roughly
 dependent on the length of the filenames) will fill one block. A
 list of 42 files will cover two blocks rather inefficiently.
 Loading a list of 80 files will take as long (2/75 second
 minimum) as a list of 42 files and requires the same amount of
 sector cache (4K). Within a path table you can pack between
 100 and 200 (again depending on name length) directory entries
 per block.
 All is not lost, however. Both the path table and the directory
 records can be buffered in 'sector cache memory' in one
 operation, allowing subsequent operations to fetch this
 information from memory rather than from disc, provided that you
 have defined enough memory. Allowing enough sector cache to cope
 with loading the path table is very efficient. All future requests
 for directory locations will then be handled within sector cache.
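
 The directory block arithmetic can be sketched in Python as follows.
 The figure of 40 files per block is the rough value quoted in the
 text and will vary with file name length; the function names are
 hypothetical.

```python
import math

FILES_PER_BLOCK = 40        # rough figure; depends on file name length
BLOCK_KILOBYTES = 2
BLOCKS_PER_SECOND = 75


def directory_blocks(file_count):
    """Number of 2K blocks needed to hold a directory listing."""
    return max(1, math.ceil(file_count / FILES_PER_BLOCK))


def min_load_seconds(file_count):
    """Shortest possible time to read the directory (no seek time)."""
    return directory_blocks(file_count) / BLOCKS_PER_SECOND


for count in (40, 42, 80):
    blocks = directory_blocks(count)
    print(count, "files:", blocks, "block(s),",
          blocks * BLOCK_KILOBYTES, "K of sector cache")
```

 A directory of 42 files costs exactly as much to load and to cache as
 one of 80 files, which is why directory sizes just over a multiple of
 40 are best avoided.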

 Improvements in access time can be made by grouping files that are
 opened together (or sequentially) in a single directory.
 Your operating  system will  then reference  the disc  cache when
 opening a  file rather  than  accessing  the  disc.  It  is  also
 possible to  perform  a  'dummy  access'  to  a  file  when  your
 programme is  sitting idle  so  that  the  next  file  access  is
 prepared by already loading in the relevant information.


 Buffering and MSCDEX

 Using MSCDEX you can define the disc cache memory using the '/M'
 option. If your directory covers 10 sectors and you only define 5
 sector cache blocks then you will end up reloading the directory
 every time you want to open a file, which can cause vast amounts
 of time to be expended hacking the CD-ROM disc. Define your
 sector caching to be the same as the largest directory record
 plus two sectors (the last two are used by MSCDEX for its own
 purposes). A simple calculation is shown here that allows you to
 make a recommendation to your CD-ROM users.

      /M :  (largest number of files in a directory / 40) +
            (number of directories / 100) + 4

 As a general rule then, it is better to keep your directory
 sizes small and a multiple of 40 files, and allow sufficient
 buffering to keep down the number of disc accesses.
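
 The /M recommendation can be turned into a short Python sketch. The
 formula does not say how to treat fractions, so rounding each term up
 to a whole sector is an assumption made here, as are the function and
 parameter names.

```python
import math


def recommended_sector_buffers(largest_directory_files, directory_count):
    """Apply the /M rule of thumb, rounding each term up (assumption)."""
    directory_sectors = math.ceil(largest_directory_files / 40)
    path_table_sectors = math.ceil(directory_count / 100)
    return directory_sectors + path_table_sectors + 4


# Example: the biggest directory holds 120 files and the disc has
# 250 directories in total.
buffers = recommended_sector_buffers(120, 250)
print("/M:" + str(buffers))
```

 For the example disc this suggests ten sector buffers: three for the
 largest directory, three for the path table and the fixed four.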

 The last  hint is 'do not rely on the operating system to do  the
 file searching for you'. It is far better to group your data into
 a few  big files  and use  your own  indexing system  to find the
 section of  data that you need than using the operating system.

 File Structures

 It is possible to use logical block sizes of between 0.5K and 2K
 on a CD-ROM. Usually this is done when you have many files all
 less than 2K in size. You can, for instance, halve the capacity
 of a CD-ROM disc if all your files are 1K in size and all of your
 logical blocks are 2K in size. Whatever is defined, the CD-ROM
 hardware will address a 2K block minimum, so unless you can VERY
 reliably predict the adjacent files to be opened within the same
 block this method of data storage proves rather inefficient. It
 is far better to concatenate your small files into one large file
 and hold an index of block offsets.
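
 A minimal Python sketch of the concatenation idea follows. It packs
 small files end to end into one large file and keeps an index of
 where each one starts; reads are then done in whole 2K blocks, as the
 hardware would do. The index layout (byte offset and length, from
 which the block number is derived) and all names are assumptions made
 for illustration, not a format described in the text.

```python
import io

BLOCK = 2048


def pack(files):
    """Concatenate files; return (blob, index of name -> (offset, length))."""
    blob = bytearray()
    index = {}
    for name, data in files.items():
        index[name] = (len(blob), len(data))
        blob += data
    return bytes(blob), index


def read_member(stream, index, name):
    """Fetch one member by reading only the 2K blocks that contain it."""
    offset, length = index[name]
    first_block = offset // BLOCK
    last_block = (offset + length - 1) // BLOCK
    stream.seek(first_block * BLOCK)
    chunk = stream.read((last_block - first_block + 1) * BLOCK)
    start = offset - first_block * BLOCK
    return chunk[start:start + length]


small_files = {"a.txt": b"A" * 1000, "b.txt": b"B" * 1500, "c.txt": b"C" * 10}
blob, index = pack(small_files)
disc = io.BytesIO(blob)             # stands in for the large file on disc

print(len(blob), index["b.txt"])
print(read_member(disc, index, "c.txt"))
```

 Three files that would occupy three separate 2K blocks as individual
 files fit into two blocks here, and finding one needs no directory
 search by the operating system.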

 Drive Performance

 Never rely on obtaining the theoretical data transfer rate of
 150K per second. Piping files through the .SYS driver, MSCDEX
 and MS-DOS will slow things down. This can be a major factor when
 addressing medium size files like images. The only real way
 around this problem is to by-pass all of these operations and
 allow your software to address the .SYS driver directly. This can
 be an arduous task; most developers opt for file compression to
 cut down the image retrieval time.

 We have mentioned already that the sector cache is very
 important when we are considering file access times. A few
 manufacturers have installed a considerable amount of cache
 memory within the CD-ROM drive hardware. This allows a request
 from the computer to read a path table to be satisfied from the
 drive's cache rather than by another access to the disc. This can
 result in a considerable apparent speed increase in your CD-ROM
 product.

 BE WARNED, it is not unknown for a developer to create a CD-ROM
 programme that is optimised for a particular drive, only to find
 that it works 100 times slower on other manufacturers' drives.

 As we are dealing with relatively slow hardware, keep the number
 of seeks to your data file to a minimum. Use indexing and hashing
 tables which let you approach your data in the most direct manner.

 It is obviously inefficient to swing from one end of the disc to
 the other and back again. Try to do some access analysis on your
 database and store sequentially accessed data blocks within the
 same vicinity. Sometimes it is even better to duplicate some data
 in order that the seek distances are kept to a minimum, at the
 expense of a little more disc estate.

 All drive manufacturers have their own seeking algorithms which
 affect how the drive performs (see plots attached). There is
 usually little point in optimising your long seek jumps to medium
 seek jumps; you will gain relatively little improvement in speed.
 If you can optimise to short seek jumps, however, you may well
 get a great improvement in data accessing.


 Disc Quality

 It is a surprise to most people to find that CD quality can be
 related to the retrieval time. You should find no problems with
 reputable manufacturers; however, a dusty or scratched disc will
 affect the retrieval time. A few of the CD-ROM drives in the
 market perform the final layer of error correction within the PC
 rather than the drive; this can result in long data retrieval
 times when a disc is giving a high error rate. The drive might
 also go back automatically to read the same block if error rates
 are high, and in very bad cases no data will be read at all. Treat
 CD-ROMs as carefully as you would any other very high density
 storage medium and you will obtain good performance for the
 lifetime of the data and more.

 Apple Discs and HFS

 Most of the rules and comments made above can also be applied to
 Apple and HFS CD-ROMs. One important note, however, is the purity
 of the data image. Apple discs can be made by mastering directly
 from a hard disk.

 If you have performed a number of file deletes and inserts you
 will have a poorly structured hard disk with sections of files
 dispersed over the hard disk surface. This rather messy structure
 will then be directly transferred to the CD-ROM, which can then
 result in very long access times. Take a fresh empty hard disk and
 copy your data in the order that you want it to appear on the
 CD-ROM. This will result in a clean efficient image.

 Access Time Analysis Plots for two CD-ROM Drives

 Two drive speed plots showing access times versus data location.
 Note that the 'Drive A' plot is very linear and approaches the
 theoretical access time as distance increases. Drive A has a
 powerful head positioning mechanism and a good seek algorithm.

 The 'Drive B' plot is more typical of current CD-ROM drives. It
 shows the difference between large access paths and medium
 access paths to be relatively small; optimise only for short
 access paths to gain a speed advantage.

 Drive A

 (starting block 0)

 End   Access       200     400     600     800    1000    1200
 Block Time(ms)                                                     

 0              330        *
 27000          490              *
 54000          540                 *
 81000          570                  *
 108000         610                    *
 135000         650                     *
 162000         690                       *
 189000         720                         *
 216000         770                          *
 243000         800                            *
 270000         840                              *
 297000         870                               *
 324000         910                                *

 Drive B

 (starting block 0)

 End   Access     200     400     600     800    1000    1200
 Block Time(ms)                                                   

 0              320      *
 27000          710                      *
 54000          820                          *
 81000          920                              *
 108000         990                                *
 135000         1040                                 *
 162000         1100                                    *
 189000         1150                                      *
 216000         1190                                        *
 243000         1240                                          *
 270000         1290                                            *
 297000         1330                                            *
 324000         1360                                             *
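
 The point about short versus medium seeks can be checked against the
 plotted figures with a short Python sketch. The tables simply repeat
 the access times listed above (seeks measured from block 0); the
 function name is hypothetical.

```python
# Access times in milliseconds, keyed by end block (start block 0).
DRIVE_A = {0: 330, 27000: 490, 54000: 540, 81000: 570, 108000: 610,
           135000: 650, 162000: 690, 189000: 720, 216000: 770,
           243000: 800, 270000: 840, 297000: 870, 324000: 910}
DRIVE_B = {0: 320, 27000: 710, 54000: 820, 81000: 920, 108000: 990,
           135000: 1040, 162000: 1100, 189000: 1150, 216000: 1190,
           243000: 1240, 270000: 1290, 297000: 1330, 324000: 1360}


def saving_percent(table, old_distance, new_distance):
    """Percentage of access time saved by shortening a seek."""
    before, after = table[old_distance], table[new_distance]
    return round(100 * (before - after) / before)


for name, table in (("Drive A", DRIVE_A), ("Drive B", DRIVE_B)):
    print(name,
          "long to medium:", saving_percent(table, 324000, 162000), "%,",
          "short to none:", saving_percent(table, 27000, 0), "%")
```

 On Drive B, halving the longest seek saves under a fifth of the
 access time, whereas removing a short 27,000 block seek altogether
 saves more than half.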

Reference: CD-ROM aneCDote from Nimbus Information Systems

edited by Armin DB8PP