CP/M can handle many different compression and archive formats, which was important given the limited capacity of floppy disks and the cost of downloading and uploading files on BBSs. Each format has its pros and cons, and this article explores some of the most common ones and where you can find programs on the Walnut Creek CD to handle them.
Compression Only
The first compression formats on CP/M only compressed single files and would change the middle letter of the file extension to signify that the file had been compressed.
- .?Q?
- Squeeze was an early compression format that used Huffman encoding to compress files. These can be squeezed (compressed) with sq and unsqueezed (decompressed) with usq.
- .?Z?
- Crunch brought LZW compression to CP/M and these files can be handled with crunch.
- .?Y?
- These files, which use LZH compression (the method behind LHA), were relatively uncommon. They can be handled with crlzh or, my favourite when I only need to decompress, uncr.
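The renaming convention described above can be sketched in a few lines of Python. This is only an illustration of the naming rule, not of the compression itself; the function name `compressed_name` is made up for this example, and it assumes a full three-letter extension (files with shorter or missing extensions are not covered here).

```python
def compressed_name(filename: str, marker: str) -> str:
    """Return the name a single-file compressor would give `filename`.

    `marker` replaces the middle letter of the extension:
    "Q" for squeeze, "Z" for crunch, "Y" for CrLZH.
    """
    stem, dot, ext = filename.upper().rpartition(".")
    if not dot or len(ext) != 3:
        # Assumption: only NAME.EXT with a three-letter extension is handled.
        raise ValueError(f"expected a three-letter extension: {filename!r}")
    return f"{stem}.{ext[0]}{marker}{ext[2]}"


for name in ("ED.COM", "TAO.TXT"):
    print(name, "->", compressed_name(name, "Q"), "or", compressed_name(name, "Z"))
```

The names it produces for ED.COM and TAO.TXT match the squeezed and crunched filenames used in the benchmarks later in this article.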
Archive Only
- .LBR
- LBR was an early CP/M format that allowed you to combine multiple files into a single archive. These files would often have been compressed with tools such as squeeze or crunch. Because it was so common it was well supported by other tools such as QL, LRUN, LSWEEP and others which can look into a .LBR archive and use individual files without having to separately extract first. These files can be handled using nulu or if you just want to extract files, delbr. For more information have a look at our article: Working with .LBR on CP/M.
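To give an idea of why tools can look inside a .LBR without extracting it, here is a rough Python sketch that lists the members of an archive from a modern machine. It is based on my understanding of the LBR directory layout (32-byte entries holding a status byte, an 8.3 name, and a start position and length counted in 128-byte records, with the first entry describing the directory itself). Treat the layout, the status values and the function name `list_lbr` as assumptions to check against the format definition, not as a reference implementation.

```python
import struct

RECORD = 128  # CP/M record size in bytes
ENTRY = 32    # assumed size of one directory entry
HEAD = "<B8s3sHH"  # status, name, extension, first record, record count


def list_lbr(data: bytes) -> list[tuple[str, int, int]]:
    """Return (name, first_record, record_count) for each active member."""
    # Assumption: the first entry describes the directory itself, and its
    # length field is the number of records the directory occupies.
    *_, dir_records = struct.unpack_from(HEAD, data, 0)
    members = []
    for offset in range(ENTRY, dir_records * RECORD, ENTRY):
        status, name, ext, first, count = struct.unpack_from(HEAD, data, offset)
        if status != 0x00:  # assumed: 0xFE marks deleted, 0xFF unused
            continue
        full = name.decode("ascii").rstrip() + "." + ext.decode("ascii").rstrip()
        members.append((full, first, count))
    return members
```

With an archive copied off the CP/M disk, it could be called as `list_lbr(open("BOTH.LBR", "rb").read())`.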
Multiple File Archives with Compression
Later on CP/M adopted formats from other platforms, such as MS-DOS, which integrated file compression and archiving into a single format.
Compress and Decompress
- .ARC/.ARK
- This is the most common compressed archive format on CP/M. For each file it is asked to compress, it tries to find the best compression method, such as squeezing or crunching. Archives can be created using arc and extracted using unarc.
- .LZH/.LHA
- A format that was once common on MS-DOS and still is on the Amiga. These can be handled using crlzh.
- .PMA
- This is a variant of LHA and as far as I'm aware was only used on CP/M. These files can be handled using PMarc.
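The idea of trying several methods per file and keeping the best result, as described for .ARC/.ARK above, can be illustrated with a short Python sketch. The codecs here are stand-ins from Python's standard library and are not the methods ARC actually uses; the names `METHODS` and `best_method` are invented for this example.

```python
import bz2
import lzma
import zlib

# Stand-in codecs only. These are NOT ARC's methods; they just show the
# "try each method, keep the smallest output" idea.
METHODS = {
    "stored": lambda data: data,  # keep the file as-is
    "zlib": zlib.compress,
    "bz2": bz2.compress,
    "lzma": lzma.compress,
}


def best_method(data: bytes) -> tuple[str, bytes]:
    """Run every method and return (name, output) for the smallest output."""
    results = {name: pack(data) for name, pack in METHODS.items()}
    winner = min(results, key=lambda name: len(results[name]))
    return winner, results[winner]


sample = b"CP/M archives on the Walnut Creek CD. " * 100
name, packed = best_method(sample)
print(f"{len(sample)} bytes -> {len(packed)} bytes using {name}")
```

Keeping a "stored" option matters because very small or already compressed files can grow when compressed again.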
Decompress Only
CP/M can also decompress formats that were common on other platforms such as MS-DOS and Windows and, in the case of .ZIP, still are. They can't be created under CP/M, but it is useful to be able to decompress them so that you can read files created on other systems. Unfortunately, the unzip utilities I've found only unzip files created with PKZIP 1.x and therefore can't handle the DEFLATE algorithm introduced by Phil Katz's 1993 release of PKZIP 2.04g.
- .ZIP
- There are lots of .ZIP files on the Walnut Creek CD, so although modern .ZIP files can't be decompressed under CP/M, it is still useful to be able to unzip these older ones. They can be unzipped with unzip.
- .ARJ
- This was pretty common at one time but got overtaken by .ZIP. To decompress use unarj.
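Because of the DEFLATE limitation mentioned above, it can be handy to check on a modern machine which method each file in a .ZIP uses before copying it over. The following Python sketch reads the method number of each member with the standard `zipfile` module. Method numbers 0 to 6 are the ones that predate DEFLATE (method 8) in the ZIP format; whether a particular CP/M unzip supports all of them is an assumption you would need to test. The names `PRE_DEFLATE` and `classify_zip` are made up for this example.

```python
import io
import zipfile

# ZIP method numbers that predate DEFLATE (method 8).
PRE_DEFLATE = {0: "stored", 1: "shrunk", 2: "reduced-1", 3: "reduced-2",
               4: "reduced-3", 5: "reduced-4", 6: "imploded"}


def method_label(code: int) -> str:
    """Turn a ZIP compression method number into a readable label."""
    if code in PRE_DEFLATE:
        return PRE_DEFLATE[code]
    if code == zipfile.ZIP_DEFLATED:
        return "deflated"
    return f"method {code}"


def classify_zip(source) -> dict[str, str]:
    """Map each member name in the archive to its compression method."""
    with zipfile.ZipFile(source) as archive:
        return {info.filename: method_label(info.compress_type)
                for info in archive.infolist()}


# Demonstration with a small archive built in memory.
demo = io.BytesIO()
with zipfile.ZipFile(demo, "w") as archive:
    archive.writestr("OLD.TXT", b"hello", compress_type=zipfile.ZIP_STORED)
    archive.writestr("NEW.TXT", b"hello" * 50, compress_type=zipfile.ZIP_DEFLATED)
demo.seek(0)
for member, label in classify_zip(demo).items():
    print(member, label)
```

Any member reported as "deflated" is one the PKZIP 1.x-era unzip tools described here would not be expected to handle.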
Self-Extracting Archives
The PMarc tool mentioned above can also create self-extracting .com files, which makes it really easy to distribute multiple files, although this adds extra overhead and reduces flexibility.
Benchmarks
The various compression formats produce different results. To compare them I have taken some of the most common formats and used them to compress two files: ED.COM and TAO.TXT. These files can be seen in the first two rows of the table, followed by various compressed versions of them.
Filename | Size (KB) | Size (Records) | Description |
---|---|---|---|
ED.COM | 10 | 73 | Original binary file (CP/M Plus Editor) |
TAO.TXT | 27 | 214 | Original text file |
ED.CQM | 8 | 63 | Squeezed version of ED.COM |
TAO.TQT | 14 | 110 | Squeezed version of TAO.TXT |
ED.CZM | 7 | 54 | Crunched version of ED.COM |
TAO.TZT | 11 | 86 | Crunched version of TAO.TXT |
BOTH.LBR | 36 | 288 | LBR archive containing files: ED.COM and TAO.TXT (no compression) |
BOTHS.LBR | 22 | 174 | LBR archive containing files: ED.CQM and TAO.TQT (squeezed) |
BOTHC.LBR | 18 | 141 | LBR archive containing files: ED.CZM and TAO.TZT (crunched) |
ED.ARK | 7 | 56 | Ark version of ED.COM (Ark crunched this file) |
TAO.ARK | 11 | 88 | Ark version of TAO.TXT (Ark crunched this file) |
BOTH.ARK | 18 | 142 | Ark version containing files: ED.COM and TAO.TXT (Ark crunched both files) |
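To make the figures easier to compare, here is a small Python sketch that works out the savings from the record counts in the table. It assumes the standard 128-byte CP/M record, and it also checks an observation of mine rather than anything stated by the tools: the kilobyte column is consistent with the record count rounded up to whole 1 KB blocks. The helper names are invented for this example.

```python
import math

RECORD_BYTES = 128  # size of one CP/M record

# (filename, size in KB, size in records), copied from the table above.
ROWS = [
    ("ED.COM", 10, 73), ("TAO.TXT", 27, 214),
    ("ED.CQM", 8, 63), ("TAO.TQT", 14, 110),
    ("ED.CZM", 7, 54), ("TAO.TZT", 11, 86),
    ("BOTH.LBR", 36, 288), ("BOTHS.LBR", 22, 174), ("BOTHC.LBR", 18, 141),
    ("ED.ARK", 7, 56), ("TAO.ARK", 11, 88), ("BOTH.ARK", 18, 142),
]
RECORDS = {name: records for name, _, records in ROWS}


def kb_from_records(records: int) -> int:
    """Round a record count up to whole kilobytes (8 records per KB)."""
    return math.ceil(records * RECORD_BYTES / 1024)


def saving(original: str, packed: str) -> float:
    """Percentage of records saved by `packed` relative to `original`."""
    return 100 * (1 - RECORDS[packed] / RECORDS[original])


for original, packed in [("ED.COM", "ED.CQM"), ("ED.COM", "ED.CZM"),
                         ("TAO.TXT", "TAO.TQT"), ("TAO.TXT", "TAO.TZT")]:
    print(f"{packed}: {saving(original, packed):.1f}% smaller than {original}")

# Each LBR is one record larger than the sum of its members.
print(RECORDS["BOTH.LBR"] - (RECORDS["ED.COM"] + RECORDS["TAO.TXT"]))
```

By these numbers, crunching beats squeezing on both files, and the text file compresses far better than the binary. The one extra record in each .LBR would be consistent with a single-record directory at the start of the archive.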
Video of Compression and Archiving Tools
You can see some of the tools in action below.