"SfR Fresh" - the SfR Freeware/Shareware Archive

Member "ziplimit.txt" of archive unz552xN.exe:


    1 ziplimit.txt
    2 
    3 A) Hard limits of the Zip archive format:
    4 
    5    Number of entries in Zip archive:            64 k (2^16 - 1 entries)
    6    Compressed size of archive entry:            4 GByte (2^32 - 1 Bytes)
    7    Uncompressed size of entry:                  4 GByte (2^32 - 1 Bytes)
    8    Size of single-volume Zip archive:           4 GByte (2^32 - 1 Bytes)
    9    Per-volume size of multi-volume archives:    4 GByte (2^32 - 1 Bytes)
   10    Number of parts for multi-volume archives:   64 k (2^16 - 1 parts)
   11    Total size of multi-volume archive:          256 TByte (4G * 64k)
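
The figures in the table follow directly from the field widths. A minimal
Python sketch of the arithmetic (the constant names are chosen here for
illustration and do not come from the Info-ZIP sources):

```python
# Largest values that fit in 2-byte and 4-byte unsigned fields.
MAX_16BIT = 2**16 - 1
MAX_32BIT = 2**32 - 1

# Rounded total from the table: 4G * 64k, expressed in TByte (binary prefix).
rounded_total_tbyte = (2**32 * 2**16) // 2**40

# Exact upper bound when every part has the maximum size.
exact_total_bytes = MAX_32BIT * MAX_16BIT

print(MAX_16BIT, MAX_32BIT, rounded_total_tbyte, exact_total_bytes)
```

The exact bound is slightly below the rounded 256 TByte figure, because both
factors are one less than a power of two.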
   12 
   13    The number of archive entries and of multivolume parts are limited by
   14    the structure of the "end-of-central-directory" record, where these
   15    numbers are stored in 2-Byte fields.
   16    Some Zip and/or UnZip implementations (for example Info-ZIP's) allow
   17    handling of archives with more than 64k entries.  (The information
   18    from the "number of entries" field in the "end-of-central-directory"
   19    record is not really necessary to retrieve the contents of a Zip archive;
   20    it should rather be used for consistency checks.)
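
To make the 2-Byte fields concrete, the following Python sketch builds a tiny
archive in memory and unpacks its "end-of-central-directory" record. The
helper name read_eocd and the dictionary keys are hypothetical; the field
layout is the classic (non-Zip64) record, and the search for the signature is
simplified (it assumes the last occurrence of the signature starts the
record).

```python
import io
import struct
import zipfile

EOCD_SIGNATURE = b"PK\x05\x06"
# signature, four 2-byte fields, two 4-byte fields, 2-byte comment length
EOCD_FORMAT = "<4sHHHHIIH"

def read_eocd(data):
    """Locate and unpack the end-of-central-directory record."""
    pos = data.rfind(EOCD_SIGNATURE)
    if pos < 0:
        raise ValueError("no end-of-central-directory record found")
    fields = struct.unpack_from(EOCD_FORMAT, data, pos)
    names = ("signature", "disk_number", "cd_start_disk", "entries_on_disk",
             "entries_total", "cd_size", "cd_offset", "comment_length")
    return dict(zip(names, fields))

# Build a two-entry archive in memory for demonstration.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("a.txt", "hello")
    zf.writestr("b.txt", "world")

eocd = read_eocd(buf.getvalue())
print(eocd["entries_total"], eocd["disk_number"])
```

The entry count and the disk (volume) number are both unpacked with the "H"
format code, an unsigned 16-bit integer, which is where the limit of
2^16 - 1 comes from.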
   21 
   22    Length of an archive entry name:             64 kByte (2^16 - 1)
   23    Length of archive member comment:            64 kByte (2^16 - 1)
   24    Total length of "extra field":               64 kByte (2^16 - 1)
   25    Length of a single e.f. block:               64 kByte (2^16 - 1)
   26    Length of archive comment:                   64 kByte (2^16 - 1)
   27 
   28    Additional limitation claimed by PKWARE:
   29      Size of local-header structure (fixed fields of 30 Bytes + filename +
   30       local extra field):                     < 64 kByte
   31      Size of central-directory structure (46 Bytes + filename +
   32       central extra field + member comment):  < 64 kByte
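
As an illustration of the two structure-size limits, here is a small Python
sketch. The function names are hypothetical and only the fixed sizes of
30 and 46 Bytes are taken from the text above.

```python
LOCAL_FIXED = 30      # fixed fields of the local header, in Bytes
CENTRAL_FIXED = 46    # fixed fields of a central-directory entry, in Bytes
LIMIT = 1 << 16       # 64 kByte

def local_header_size(name_len, extra_len):
    """Size of the local-header structure for one entry."""
    return LOCAL_FIXED + name_len + extra_len

def central_entry_size(name_len, extra_len, comment_len):
    """Size of the central-directory structure for one entry."""
    return CENTRAL_FIXED + name_len + extra_len + comment_len

def within_pkware_limit(size):
    """True when the structure stays below 64 kByte."""
    return size < LIMIT

print(within_pkware_limit(local_header_size(12, 9)))
print(within_pkware_limit(central_entry_size(65535, 0, 0)))
```

A maximum-length entry name alone already pushes the central-directory
structure past the limit, so the per-field limits listed earlier cannot all
be exhausted at the same time.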
   33 
   34    Note:
   35    In 2001, PKWARE published version 4.5 of the Zip format specification
   36    (together with the release of PKZIP for Windows 4.5).  This specification
   37    defines new extra field blocks that allow breaking the size limits of the
   38    standard zipfile structures.  In this extended "Zip64" format, the limits
   39    on the size of zip entries and the size of the complete zip archive are
   40    extended to (2^64 - 1) Bytes; the maximum number of archive entries and
   41    split volumes are raised to (2^64 - 1) and (2^32 - 1), respectively.
   42    Currently, these extensions are not yet supported by the released Info-ZIP
   43    software. However, new major releases (Zip 3.0 and UnZip 6.0) are under
   44    development and will support Zip64 archives on selected environments.
   45    (Beta releases are already available for Unix, VMS and Win32.)
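
A rough way to decide whether an archive exceeds the standard limits and
would need the Zip64 extensions is sketched below in Python. The function
needs_zip64 is hypothetical, and the thresholds are the hard limits from
section A; real tools may switch to Zip64 slightly earlier.

```python
MAX_16BIT = 2**16 - 1
MAX_32BIT = 2**32 - 1

def needs_zip64(entry_sizes, archive_size):
    """entry_sizes: iterable of (compressed, uncompressed) sizes in Bytes."""
    sizes = list(entry_sizes)
    if len(sizes) > MAX_16BIT or archive_size > MAX_32BIT:
        return True
    return any(c > MAX_32BIT or u > MAX_32BIT for c, u in sizes)

print(needs_zip64([(10, 20)], 1000))
print(needs_zip64([(10, 2**32)], 1000))
```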
   46 
   47 B) Implementation limits of UnZip:
   48 
   49  1. Size limits caused by file I/O and decompression handling:
   50    Size of Zip archive:                 2 GByte (2^31 - 1 Bytes)
   51    Compressed size of archive entry:    2 GByte (2^31 - 1 Bytes)
   52 
   53    Note: On some systems, UnZip may support archive sizes up to 4 GByte.
   54          To get this support, the target environment has to meet the following
   55          requirements:
   56          a) The compiler's intrinsic "long" data types must be able to hold
   57             integer values of up to 2^32. In other words, the standard intrinsic
   58             integer types "long" and "unsigned long" have to be wider than
   59             32 bits.
   60          b) The system has to supply a C runtime library that is compatible
   61             with the more-than-32-bit-wide "long int" type of condition a).
   62          c) The standard file positioning functions fseek(), ftell() (and/or
   63             the Unix style lseek() and tell() functions) have to be capable
   64             of moving to absolute file offsets of up to 4 GByte from the file
   65             start.
   66          On 32-bit CPU hardware, you generally cannot expect that a C compiler
   67          provides a "long int" type that is wider than 32 bits. So, many of the
   68          most popular systems (i386, PowerPC, 680x0, et al.) are out of luck.
   69          You may find environments that meet all requirements on systems
   70          with 64-bit CPU hardware. Examples might be Cray number crunchers
   71          or Compaq (formerly DEC) Alpha AXP machines.
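
Condition a) can be probed from a script. The following Python sketch uses
the ctypes module to report the width of the platform's C "long" type; it
says nothing about conditions b) and c), which depend on the C runtime
library.

```python
import ctypes

# Width of the platform C compiler's "long" type, in bits.
long_bits = ctypes.sizeof(ctypes.c_long) * 8

print("C long is", long_bits, "bits wide")
print("condition a) met:", long_bits > 32)
```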
   72 
   73    The number of Zip archive entries is unlimited. The "number-of-entries"
   74    field of the "end-of-central-dir" record is checked against the "number
   75    of entries found in the central directory" modulo 64k (2^16).
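
The modulo check can be expressed in a few lines. This is a Python sketch of
the idea only; the function name is hypothetical and the code is not taken
from the UnZip sources.

```python
def entry_count_consistent(eocd_field, entries_found):
    """Compare the 2-Byte field with the real entry count modulo 2^16."""
    return eocd_field == entries_found % (1 << 16)

# An archive with 70000 entries stores 70000 - 65536 = 4464 in the field.
print(entry_count_consistent(4464, 70000))
```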
   76 
   77    Multi-volume archive extraction is not supported.
   78 
   79    Memory requirements are mostly independent of the archive size
   80    and archive contents.
   81    In general, UnZip needs a fixed amount of internal buffer space
   82    plus the size to hold the complete information of the currently
   83    processed entry's local header. Here, a large extra field
   84    (could be up to 64 kByte) may exceed the available memory
   85    for MSDOS 16-bit executables (when they were compiled in small
   86    or medium memory model, with a fixed 64kByte limit on data space).
   87 
   88    The other exception where memory requirements scale with "larger"
   89    archives is the "restore directory attributes" feature. Here, the
   90    directory attributes info for each restored directory has to be held
   91    in memory until the whole archive has been processed. So, the amount
   92    of memory needed to keep this info scales with the number of restored
   93    directories and may cause memory problems when a lot of directories
   94    are restored in a single run.
   95 
   96 C) Implementation limits of the Zip executables:
   97 
   98  1. Size limits caused by file I/O and compression handling:
   99    Size of Zip archive:                 2 GByte (2^31 - 1 Bytes)
  100    Compressed size of archive entry:    2 GByte (2^31 - 1 Bytes)
  101    Uncompressed size of entry:          2 GByte (2^31 - 1 Bytes),
  102                                         (could/should be 4 GBytes...)
  103    Multi-volume archive creation is not supported.
  104 
  105  2. Limits caused by handling of archive contents lists
  106 
  107  2.1. Number of archive entries (freshen, update, delete)
  108      a) 16-bit executable:              64k (2^16 -1) or 32k (2^15 - 1),
  109                                         (unsigned vs. signed type of size_t)
  110      a1) 16-bit executable:             <16k ((2^16)/4)
  111          (The smaller limit a1) results from the array size limit of
  112          the "qsort()" function.)
  113          32-bit executables             <1G ((2^32)/4)
  114          (usual system limit of the "qsort()" function on 32-bit systems)
  115 
  116      b) stack space needed by qsort to sort list of archive entries
  117 
  118      NOTE: In the current executables, overflows of limits a) and b) are NOT
  119            checked!
  120 
  121      c) amount of free memory to hold "central directory information" of
  122         all archive entries; one entry needs:
  123         96 bytes (32-bit) resp. 80 bytes (16-bit)
  124         + 3 * length of entry name
  125         + length of zip entry comment (when present)
  126         + length of extra field(s) (when present, e.g.: UT needs 9 bytes)
  127         + some bytes for book-keeping of memory allocation
  128 
  129    Conclusion:
  130      For systems with limited memory space (MSDOS, small AMIGAs, other
  131      environments without virtual memory), the number of archive entries
  132      is most often limited by condition c).
  133      For example, with approx. 100 kBytes of free memory after loading and
  134      initializing the program, a 16-bit DOS Zip cannot process more than 600
  135      to 1000 (+) archive entries.  (For the 16-bit Windows DLL or the 16-bit
  136      OS/2 port, limit c) is less important because Windows or OS/2 executables
  137      are not restricted to the 1024k area of real mode memory.  These 16-bit
  138      ports are limited by conditions a1) and b), say: at maximum approx.
  139      16000 entries!)
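
The estimate in the conclusion can be reproduced with the per-entry formula
from condition c). In this Python sketch the function names are hypothetical,
and the 8 bytes assumed for memory-allocation book-keeping are a guess, since
the text only says "some bytes".

```python
def bytes_per_entry(name_len, comment_len=0, extra_len=0,
                    base=80, bookkeeping=8):
    """Memory per archive entry; base is 80 (16-bit) or 96 (32-bit)."""
    return base + 3 * name_len + comment_len + extra_len + bookkeeping

def max_entries(free_bytes, name_len, **kwargs):
    """Number of entries that fit into free_bytes of memory."""
    return free_bytes // bytes_per_entry(name_len, **kwargs)

# 16-bit DOS Zip, 100 kBytes free, 12-character names, 9-byte UT extra field.
print(max_entries(100 * 1024, 12, extra_len=9))
# Same setup with longer names that include a directory path.
print(max_entries(100 * 1024, 24, extra_len=9))
```

Under these assumptions both results fall inside the range of 600 to 1000
entries quoted above.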
  140 
  141 
  142  2.2. Number of "new" entries (add operation)
  143      In addition to the restrictions above (2.1.), the following limits
  144      caused by the handling of the "new files" list apply:
  145 
  146      a) 16-bit executable:              <16k ((2^16)/4)
  147 
  148      b) stack size required for "qsort" operation on "new entries" list.
  149 
  150      NOTE: In the current executables, the overflow checks for these limits
  151            are missing!
  152 
  153      c) amount of free memory to hold the directory info list for new entries;
  154         one entry needs:
  155         24 bytes (32-bit) resp. 22 bytes (16-bit)
  156         + 3 * length of filename
  157 
  158 D) Some technical remarks:
  159 
  160  1. The 2GByte size limit on archive files is a consequence of the portable
  161     C implementation of the Info-ZIP programs.
  162     Zip archive processing requires random access to the archive file for
  163     jumping between different parts of the archive's structure.
  164     In standard C, this is done via the stdio functions fseek()/ftell() or the
  165     Unix I/O functions lseek()/tell(). In many (most?) C implementations,
  166     these functions use "signed long" variables to hold offset pointers
  167     into sequential files. In most cases, this is a signed 32-bit number,
  168     which is limited to ca. 2E+09. There may be specific C runtime library
  169     implementations that interpret the offset numbers as unsigned, but for
  170     us, this is not reliable in the context of portable programming.
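
The effect of a signed 32-bit offset can be emulated in Python with ctypes.
This is only an illustration of the number range on typical two's-complement
systems; the helper as_signed32 is hypothetical, and the sketch does not call
fseek() or lseek().

```python
import ctypes

def as_signed32(offset):
    """Value seen after storing offset in a signed 32-bit variable."""
    return ctypes.c_int32(offset).value

largest_ok = 2**31 - 1
print(as_signed32(largest_ok))      # still a valid positive offset
print(as_signed32(largest_ok + 1))  # wraps around to a negative number
```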
  171 
  172  2. The 2GByte limit on the size of a single compressed archive member
  173     is again a consequence of the implementation in C.
  174     The variables used internally to count the size of the compressed
  175     data stream are of type "long", which is guaranteed to be at least
  176     32 bits wide on all supported environments.
  177 
  178     But why do we use "signed long" and not "unsigned long"?
  179 
  180     Throughout the I/O handling of the compressed data stream, the
  181     sign bit of the "long" numbers is (mis-)used as a kind of overflow
  182     detection. In the end, this is caused by the fact that standard C
  183     lacks any overflow checking on integer arithmetic and does not
  184     support access to the underlying hardware's overflow detection
  185     (the status bits, especially "carry" and "overflow" of the CPU's
  186     flags-register) in a system-independent manner.
  187 
  188     So, we "misuse" the most-significant bit of the compressed data
  189     size counters as carry bit for efficient overflow/underflow detection.
  190     We could change the code to a different method of overflow detection,
  191     by using a bunch of "sanity" comparisons (kind of "is the calculated
  192     result plausible when compared with the operands"). But, this would
  193     "blow up" the code of the "inner loop", with a noticeable loss of
  194     processing speed. Or, we could reduce the number of consistency checks
  195     of the compressed data (e.g. detection of premature end of stream) to
  196     an absolute minimum, at the cost of the programs' stability when
  197     processing corrupted data.
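
The sign-bit technique described above can be illustrated with a small Python
emulation of 32-bit counters. This is only a sketch: the function name
consume is hypothetical, and the actual Zip/UnZip inner loops are written in
C and are not reproduced here.

```python
import ctypes

def consume(remaining, chunk):
    """Subtract chunk from a signed 32-bit byte counter.

    Returns the new counter value and a flag that is True when the
    sign bit is set, i.e. when more bytes were requested than remained.
    """
    result = ctypes.c_int32(remaining - chunk).value
    return result, result < 0

# Signed counter: the underflow shows up as a negative value.
print(consume(100, 4096))

# Unsigned counter: the same subtraction wraps to a large positive value,
# so a plain sign test can no longer detect the underflow.
wrapped = ctypes.c_uint32(100 - 4096).value
print(wrapped)
```

With the unsigned counter, detecting the same condition would need an
explicit comparison of the operands before the subtraction, which is the
kind of extra "sanity" check discussed above.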
  198 
  199     Summary: Changing the compression/decompression core routines to
  200     be "unsigned safe" would require excessive recoding, with little
  201     gain on maximum processable uncompressed size (a gain can only be
  202     expected for hardly compressible data), but at severe costs on
  203     performance, stability and maintainability.  Therefore, it is
  204     quite unlikely that this will ever happen for Zip/UnZip.
  205 
  206     The argument above is somewhat outdated. The new releases
  207     Zip 3 and UnZip 6 will support archive sizes larger than 4GB on
  208     systems where the required underlying support for 64-bit file offsets
  209     and file sizes is available from the OS (and the C runtime environment).
  210     However, this new support will partially break compatibility with
  211     older "legacy" systems.  And it should be expected that the portability
  212     and readability of the UnZip and Zip code may be reduced due to the
  213     extensive use of non-standard language extensions needed for 64-bit
  214     support on the major target systems.
  215 
  216 Please report any problems to:  Zip-Bugs at www.info-zip.org
  217 
  218 Last updated:  22 February 2005, Christian Spieler