LHA (file format)

From Wikipedia, the free encyclopedia
LHA
Other names: LHarc, LHx, LH
Original author(s): Haruyasu Yoshizaki
Stable release: 2.13 / 20 July 1991
Preview release: 2.55b / 24 November 1992
Written in: Assembly language, C
Operating system: DOS
Successor: LHA32
License: Permissive license
Website: https://www.vector.co.jp/vpack/browse/person/an000224.html

LZH
Filename extension: .lzh, .lha
Internet media type: application/x-lzh-compressed
Type code: "LHA␣" (L-H-A-SPACE)
Uniform Type Identifier (UTI): public.archive.lha
Developed by: Haruyasu Yoshizaki (Yoshi)
Type of format: Data compression
Extended from: LArc

LHA or LZH is a freeware compression utility and associated file format. It was created in 1988 by Haruyasu Yoshizaki (吉崎栄泰, Yoshizaki Haruyasu), a doctor, and was originally named LHarc. A complete rewrite of LHarc, tentatively named LHx, was eventually released as LH. It was then renamed LHA to avoid a conflict with the then-new MS-DOS 5.0 LH ("load high") command. The original LHA and its Windows port, LHA32, are no longer in development because Yoshizaki is busy at work.[1]

Although it is no longer widely used in the West, LHA remained popular in Japan until the 2000s.[2] id Software used it to compress installation files for its earlier games, including Doom and Quake. Because some versions of LHA have been distributed with source code under a permissive license, LHA has been ported to many operating systems. It is still the main archiving format on the Amiga, although it competed with LZX in the mid-1990s. Its position there is due to Aminet, the world's largest archive of Amiga-related software and files, standardising on Stefan Boberg's implementation of LHA for the Amiga.

Microsoft released the Microsoft Compressed (LZH) Folder Add-on, which was designed for the Japanese version of Windows XP.[3] The Japanese version of Windows 7 ships with the LZH folder add-on built in.[4] Users of non-Japanese versions of Windows 7 Enterprise and Ultimate can also install the LZH folder add-on by installing the optional Japanese language pack from Windows Update.

Compression methods

In an LZH archive, the compression method is stored as a five-byte text string, e.g. -lz1-. These are the third through seventh bytes of the file (offsets 2 to 6, counting from zero).
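
As a minimal sketch of the layout just described, the following Python function pulls the method string out of the first bytes of a file. The function name `read_method_id` and the sample bytes are illustrative only, and the check for hyphen delimiters is an assumption based on the `-lz1-` example above.

```python
def read_method_id(header: bytes) -> str:
    """Return the three-character method name (e.g. 'lh5') from an LZH header.

    Assumes the five-byte method string sits at offsets 2..6 and is
    delimited by hyphens, as in the "-lz1-" example.
    """
    if len(header) < 7:
        raise ValueError("need at least 7 bytes to read the method ID")
    method = header[2:7]  # third through seventh bytes
    if method[:1] != b"-" or method[-1:] != b"-":
        raise ValueError(f"unexpected method field: {method!r}")
    return method[1:4].decode("ascii")


# Hypothetical header fragment: two placeholder bytes, then the method string.
sample = bytes([0x20, 0x00]) + b"-lh5-" + b"\x00" * 4
print(read_method_id(sample))
```

A real reader would go on to parse the rest of the header; this sketch only identifies which decompressor would be needed.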

Canonical LZH

LHarc compresses files using an algorithm from Yoshizaki's earlier LZHUF product, which was modified from LZARI, developed by Haruhiko Okumura (奥村晴彦, Okumura Haruhiko), but uses Huffman coding instead of arithmetic coding. LZARI uses Lempel–Ziv–Storer–Szymanski with arithmetic coding.

lh0
No compression method is applied to the source data.
lh1
This method was introduced in LHarc version 1.
It supports a 4 KiB sliding window, with a maximum match length of 60 bytes. Dynamic Huffman encoding is used.
lh2
lh1 variant. This method supports an 8 KiB sliding window, with a maximum match length of 256 bytes. Dynamic Huffman encoding is used.
lh3
lh2 variant with static Huffman encoding.
lh4, lh5, lh6, lh7
Methods 4, 5, 6 and 7 support 4, 8, 32 and 64 KiB sliding windows respectively, with a maximum match length of 256 bytes. Static Huffman encoding is used. lh5 was first introduced in LHarc 2, followed by lh6 in LHA 2.66 (MS-DOS) and lh7 in LHA 2.67 beta (MS-DOS). LHA itself never compresses into lh4.
lhd
Technically this is not a compression method, but it is used in .LZH archives to indicate that the compressed object is an empty directory.
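
The parameters listed above can be collected into a lookup table, which is roughly what a decoder needs before it can allocate its sliding window. The sketch below is illustrative only: the names `Method`, `CANONICAL_METHODS` and `describe` are hypothetical, and the lh3 entry assumes that "lh2 variant" means the same window and match length as lh2.

```python
from typing import NamedTuple, Optional

KIB = 1024


class Method(NamedTuple):
    window: Optional[int]     # sliding window size in bytes, None if unused
    max_match: Optional[int]  # maximum match length in bytes, None if unused
    huffman: Optional[str]    # "dynamic", "static", or None


CANONICAL_METHODS = {
    "lh0": Method(None, None, None),          # data stored uncompressed
    "lh1": Method(4 * KIB, 60, "dynamic"),
    "lh2": Method(8 * KIB, 256, "dynamic"),
    "lh3": Method(8 * KIB, 256, "static"),    # assumed to share lh2's sizes
    "lh4": Method(4 * KIB, 256, "static"),
    "lh5": Method(8 * KIB, 256, "static"),
    "lh6": Method(32 * KIB, 256, "static"),
    "lh7": Method(64 * KIB, 256, "static"),
    "lhd": Method(None, None, None),          # empty-directory marker
}


def describe(method: str) -> str:
    """Return a one-line summary of a canonical method."""
    info = CANONICAL_METHODS[method]
    if info.window is None:
        return f"{method}: no sliding window"
    return (f"{method}: {info.window // KIB} KiB window, "
            f"matches up to {info.max_match} bytes, {info.huffman} Huffman")


for name in CANONICAL_METHODS:
    print(describe(name))
```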

Joe Jared extensions

Joe Jared extended LZSS to use larger dictionaries.

lh8, lh9, lha, lhb, lhc, lhe
Dictionary (sliding window) sizes are 64, 128, 256, 512, 1024 and 2048 KiB respectively.

Jared ported LZH to Atari. The fact that lh8 is the same as lh7 was an oversight. Files using the higher-numbered methods are unlikely to exist in practice, as Jared considers them only planned features.[5]

UNLHA32 extensions

UNLHA32.DLL uses its own method for testing purposes.

lhx
It uses a 128–256 KiB dictionary.

PMarc extensions

These compression methods were created for PMarc, a CP/M archiver written by Miyo. The archive usually has a .PMA extension.

pc1
PopCom compressed executable archive. Details unknown.
pm0
No compression method is applied to the source data.
pm1
8 KB sliding window, static Huffman. Seldom generated; the decompressor was reverse-engineered.[6]
pm2
lh5 variant, 4 KB sliding window.
pms
Used to indicate a PMarc self-extracting archive. It should be skipped to reveal the real format.

LArc extensions

LArc uses the same file format as .LZH, but was written by Kazuhiko Miki, Haruhiko Okumura and Ken Masuyama, and uses the extension ".LZS".[7] The program appears to predate LZH. It uses a binary search tree for LZ matching.[8]

lzs
It supports a 2 KiB sliding window, with a maximum match length of 17 bytes.
lz2
It is similar to lzs, except that the dictionary size and match length can be changed.
lz3
Unknown.
lz4
No compression method is applied to the source data.
lz5
It supports a 4 KiB sliding window, with a maximum match length of 17 bytes.
lz7, lz8
Unknown.

Common implementations appear to support only lzs and lz5, plus the storage-only lz4.

Issues

LHICE/ICE

There are copies of LHICE marked as version 1.14. According to Okumura, LHICE was not written by Yoshizaki.[9]

Y2K11 bug

Because of a bug, DOS time stamps in Level 0 and Level 1 headers that fall after the year 2011 are set to 1980, so some utilities need to be patched. The bug is that the unsigned 7-bit year bitfield is interpreted as a 5-bit number. The maximum representable year should be 2107.[10][11]
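
The arithmetic behind this can be sketched in Python. The example assumes the standard MS-DOS date word, in which the year offset from 1980 occupies the top seven bits, followed by four bits of month and five bits of day. The helper names are hypothetical, and the 5-bit mask is only one plausible model of the bug, not code taken from any affected utility.

```python
def pack_dos_date(year: int, month: int, day: int) -> int:
    """Pack a date into a 16-bit MS-DOS date word."""
    if not (1980 <= year <= 2107 and 1 <= month <= 12 and 1 <= day <= 31):
        raise ValueError("date outside the range of the DOS format")
    return ((year - 1980) << 9) | (month << 5) | day


def dos_year(dos_date: int, year_bits: int = 7) -> int:
    """Extract the year, reading `year_bits` bits of the year field.

    year_bits=7 is the correct width; year_bits=5 models the bug.
    """
    offset = (dos_date >> 9) & ((1 << year_bits) - 1)
    return 1980 + offset


for year in (2011, 2012):
    word = pack_dos_date(year, 1, 1)
    print(year, "correct:", dos_year(word), "buggy:", dos_year(word, year_bits=5))

# Largest year the full 7-bit field can hold: 1980 + 127.
print("maximum year:", dos_year(0x7F << 9))
```

With a 5-bit field the largest offset is 31, so 2011 (1980 + 31) is the last year that survives; the offset for 2012 is 32, whose low five bits are zero, which yields 1980.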

The newer Level 2 and 3 headers use 32-bit Unix time instead, which suffers from the Year 2038 problem.[12]
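
For reference, the limit can be worked out directly. The sketch below assumes the time stamp is read as a signed 32-bit count of seconds since 1970-01-01 UTC, which is the usual source of the Year 2038 problem; whether a given LHA implementation reads the field as signed is not stated here. The helper names are illustrative.

```python
from datetime import datetime, timedelta, timezone

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)
INT32_MAX = 2**31 - 1


def wrap_int32(value: int) -> int:
    """Reduce an integer to the signed 32-bit range, as a fixed-width field would."""
    return (value + 2**31) % 2**32 - 2**31


def to_utc(seconds: int) -> datetime:
    """Convert seconds since the Unix epoch to an aware UTC datetime."""
    return EPOCH + timedelta(seconds=seconds)


print(to_utc(INT32_MAX).isoformat())                  # 2038-01-19T03:14:07+00:00
print(to_utc(wrap_int32(INT32_MAX + 1)).isoformat())  # 1901-12-13T20:45:52+00:00
```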

Header size

According to Micco, the author of the popular LHA library UNLHA32.DLL, many LHA implementations do not check the length of LHA file headers when reading an archive. Two problems can follow: naive implementations that assume the 4 KB maximum size from the original specification may suffer a buffer overrun, and antivirus software may skip over files with such large headers and so fail to scan them for viruses. A similar problem exists with ARJ. Micco reported this problem to Japanese authorities, but they did not consider it a valid vulnerability.[13]
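
A defensive reader would validate the declared length before using it. The following Python sketch shows the idea only: `read_header`, `HeaderTooLarge` and `MAX_HEADER_SIZE` are hypothetical names, the 4096-byte limit is an assumption based on the 4 KB figure above, and the header field layout is deliberately not modelled.

```python
import io

MAX_HEADER_SIZE = 4096  # assumed limit, taken from the 4 KB figure above


class HeaderTooLarge(ValueError):
    """Raised when a header declares a size beyond the accepted limit."""


def read_header(stream, declared_size: int, max_size: int = MAX_HEADER_SIZE) -> bytes:
    """Read exactly `declared_size` header bytes, refusing oversized headers."""
    if declared_size < 0 or declared_size > max_size:
        raise HeaderTooLarge(
            f"header declares {declared_size} bytes, limit is {max_size}"
        )
    data = stream.read(declared_size)
    if len(data) != declared_size:
        raise ValueError("archive is truncated inside the header")
    return data


# Hypothetical inputs: a well-formed header and an oversized declaration.
print(len(read_header(io.BytesIO(b"\x00" * 64), 64)))
try:
    read_header(io.BytesIO(b"\x00" * 8192), 8192)
except HeaderTooLarge as err:
    print("rejected:", err)
```

For a scanner, the point would be to treat a rejected header as a reason to flag the archive instead of silently skipping it, since skipping is the behaviour the report warns about.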

Micco went so far as to end development of UNLHA32 and advise people to abandon the format. Nevertheless, Micco returned in 2017 to fix a DLL hijacking issue.

References

  1. "LHA World by Dr. Haruyasu Yoshizaki". 1999-04-28. Archived from the original on 1999-04-28. Retrieved 2021-01-12.
  2. 吉澤, 亨史 (2010-06-07). "Development of 'LZH' discontinued: author warns companies and others not to use it". CNET Japan (in Japanese). Retrieved 2021-01-12.
  3. "Microsoft Compressed (LZH) Folder Add-on". Microsoft. Archived from the original on 2007-08-19. Retrieved 2007-10-05.
  4. "Cannot install the Microsoft Compressed (LZH) Folder add-in on Windows 7" (in Japanese). Support.microsoft.com. Retrieved 2016-07-17.
  5. Jared (1998). lzhformat.html
  6. "fragglet/lhasa". GitHub. 7 July 2022.
  7. "Compressed data file extension 'LZS' | Guide to compression and decompression software" (in Japanese). Lzh-zip.com. Retrieved 2016-07-17.
  8. "Data Compression Algorithms of LARC and LHarc". GameDev.net.
  9. "History of Data Compression in Japan". Oku.edu.mie-u.ac.jp. Retrieved 12 July 2016.
  10. "Aminet - util/arc/lha138pch.lha". Aminet.net. Retrieved 12 July 2016.
  11. "Aminet - util/arc/lha_68k.lha". Aminet.net. Retrieved 12 July 2016.
  12. Nifty's LHA Format Notes, Other data formats.
  13. "On the vulnerability in header processing of LZH archives (2010 edition)" (in Japanese). micco.mars.jp.
