Monday, May 30, 2022

Java Tutorial: Zip4j - A Java library for zip files/streams

Introduction

Zip4j is the most comprehensive Java library for zip files or streams. As of this writing, it is the only Java library that supports zip encryption, among several other features. It tries to make handling zip files/streams a lot easier. No more clunky boilerplate code with input streams and output streams.

Requirements
JDK 7 or later*

* Zip4j is written on JDK 8, as some of the features (NIO) that Zip4j supports require features available only in JDK 8. However, because Zip4j is widely used on Android, and to support older versions of Android, Zip4j supports JDK 7 as well. Where a feature/class from JDK 8 is missing, Zip4j falls back to the features available in JDK 7. In other words, when running on JDK 7, not all features will be supported.

Zip4j also supports Zip64. Zip64 removes some limitations of the ZIP format. Zip4j automatically creates the zip file in Zip64 format and adds the appropriate headers when it detects that the zip file crosses these limitations. You do not have to explicitly specify any flag for Zip4j to use this feature.
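
To make "these limitations" concrete, here is a minimal sketch of the classic ZIP limits that Zip64 lifts: 32-bit size/offset fields and a 16-bit entry count. The class and method names are hypothetical, and this only illustrates the format limits; it is not a copy of Zip4j's internal check.

```java
/** Illustrates the classic ZIP limits that Zip64 lifts (hypothetical helper, not Zip4j code). */
final class Zip64Limits {
    /** Value that fills a 32-bit ZIP size/offset field; the format reserves it as the "see Zip64 data" marker. */
    static final long MAX_32_BIT_VALUE = 0xFFFFFFFFL;
    /** Value that fills the 16-bit entry-count field; also reserved as a Zip64 marker. */
    static final long MAX_16_BIT_VALUE = 0xFFFFL;

    private Zip64Limits() {
    }

    /** Returns true if any of the given values no longer fits the classic (non-Zip64) fields. */
    static boolean needsZip64(long largestEntrySize, long archiveSize, long entryCount) {
        return largestEntrySize >= MAX_32_BIT_VALUE
                || archiveSize >= MAX_32_BIT_VALUE
                || entryCount >= MAX_16_BIT_VALUE;
    }
}
```

For example, an archive holding one 5 GiB file, or 70,000 small files, falls outside the classic limits, while a handful of small files does not.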

Note: If you're using Maven, add this to your pom.xml; you then don't need to add a new classpath in order to use Zip4j.
<dependency>
    <groupId>net.lingala.zip4j</groupId>
    <artifactId>zip4j</artifactId>
    <version>2.10.0</version>
</dependency>
Latest version can be found here.

If you're not using Maven, temporarily add the "src\main\java" folder of the zip4j source folder to the classpath before running the examples. Command syntax:
set classpath=[root]:\[path];
e.g.
set classpath=C:\test\zip4j-2.10.0-master\src\main\java;
Once the new classpath is set, we can run the examples. The setting only lasts for the current cmd/terminal session, so it is gone once we close the window.

Create a Zip File or Add a File to a Zip File

First off, let's create a zip file and add a single file in it.

View code with code highlight


ZipFile has add*() methods that create a zip file or add files/folders to an existing one, as well as methods that extract and remove files from a zip file. Initializing a ZipFile instance doesn't create a new zip file.

addFile(File fileToAdd) Adds input source file to the zip file with default zip parameters. If zip file does not exist, this method creates a new zip file. This method throws an exception if the file to be added doesn't exist.

ZipParameters encapsulates the parameters that control how Zip4j encodes data.

I don't think closing a ZipFile instance is strictly necessary, because the streams it uses appear to be closed automatically. I'm not completely sure, though, which is why I use a try-finally clause, even though the ZipFile examples in the documentation don't.

We can use addFiles(List<File> filesToAdd) to add multiple files to a zip file. This method adds the list of input files to the zip file with default zip parameters. Example:
import java.util.Arrays;
...
ZipFile zip = null;
...
zip = new ZipFile("myzip.zip");
zip.addFiles(Arrays.asList(
new File("img1.jpg"),
new File("img2.jpg")));
...
This method throws an exception if one of the files in the list doesn't exist.

We can use addFolder(File folderToAdd) to add a directory and all of its content to our zip file. This method adds the folder in the given file object to the zip file with default zip parameters. Example:
...
zip = new ZipFile("myzip.zip");
zip.addFolder(new File("folder1/folderA"));
//valid in windows
//new File("folder1\\folderA");
...
If we want to filter files that can be put in a zip file, we can use setExcludeFileFilter(ExcludeFileFilter excludeFileFilter) method from ZipParameters. Example:

View code with code highlight


In the example above, a file with .JPEG file extension won't be included in the zip file. ZipParameters() creates a ZipParameters instance with default parameters. ExcludeFileFilter is a functional interface.
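
Because ExcludeFileFilter is a functional interface, the filter itself is just a yes/no test on a File. As a rough sketch of that test, the hypothetical helper below builds a case-insensitive extension check with java.util.function.Predicate (so it needs JDK 8). How it is plugged into ZipParameters is shown only in a comment, under the assumption that the filter's single method takes a File and returns a boolean.

```java
import java.io.File;
import java.util.Locale;
import java.util.function.Predicate;

/** Hypothetical helper that builds extension-based file tests. */
final class ExtensionFilters {
    private ExtensionFilters() {
    }

    /** Returns a predicate that is true for files whose name ends with ".<extension>", ignoring case. */
    static Predicate<File> hasExtension(String extension) {
        final String suffix = "." + extension.toLowerCase(Locale.ROOT);
        return file -> file.getName().toLowerCase(Locale.ROOT).endsWith(suffix);
    }

    // Assumption: ExcludeFileFilter's single method accepts a File and returns a boolean,
    // so the predicate could be adapted with a method reference, for example:
    //   zp.setExcludeFileFilter(ExtensionFilters.hasExtension("jpeg")::test);
}
```

With this predicate, "photo.JPEG" and "photo.jpeg" are both matched, while "photo.jpg" is not.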

Create a Zip File or Add a File to a Zip File Using a Stream

If you need to use an input stream to add data to your zip file, you can use addStream(InputStream inputStream, ZipParameters parameters). For example:

View code with code highlight


setFileNameInZip(String fileNameInZip) sets the name of the file where the stream data will be stored. It's required to set the name of the destination file if we're using streams to put data to our zip file. The file extension of the destination file should be equivalent to the intended file extension of the stream data. The path name must be relative and use "/" forward slash as directory separator.
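
The path rules above (relative path, "/" as separator) are easy to get wrong when the name comes from a Windows-style path. The sketch below is a hypothetical helper, not part of Zip4j, that turns a file system path into a name that follows those rules before it is handed to setFileNameInZip.

```java
/** Hypothetical helper that converts file system paths into zip entry names. */
final class ZipEntryNames {
    private ZipEntryNames() {
    }

    /** Returns the path with "/" separators and without a drive prefix or leading slashes. */
    static String toEntryName(String path) {
        // Zip entry names use forward slashes, regardless of the platform.
        String name = path.replace('\\', '/');
        // Drop a Windows drive prefix such as "C:".
        if (name.length() >= 2 && name.charAt(1) == ':' && Character.isLetter(name.charAt(0))) {
            name = name.substring(2);
        }
        // Drop leading slashes so that the name is relative.
        int start = 0;
        while (start < name.length() && name.charAt(start) == '/') {
            start++;
        }
        return name.substring(start);
    }
}
```

For example, "docs\images\img.jpg" becomes "docs/images/img.jpg", and "C:\test\a.txt" becomes "test/a.txt".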

Create a Zip File with STORE Compression Mode

There are two compression methods available in zip4j: DEFLATE and STORE. DEFLATE uses the Deflate compression algorithm and is the default compression method used by zip4j.

STORE denotes uncompressed entries. This method just puts files in a zip file without any compression. This example demonstrates using STORE compression mode.

View code with code highlight


In the example above, "folder" and all of its content use the STORE compression method whereas "img.jpg" uses the default compression method, which is DEFLATE. setCompressionMethod sets the ZIP compression method. CompressionMethod is an enum class that contains compression methods.

There are three compression methods in this enum. However, we can only use two because "AES_INTERNAL_ONLY" is for internal use only.

Create a Password Protected Zip File

We can also create a password protected zip file using zip4j library. EncryptionMethod is an enum class that contains encryption methods. There are three encryption methods that we can use. In this example I'm gonna use AES encryption method. This example demonstrates creating a password protected zip file.

View code with code highlight


ZipFile(String zipFile, char[] password) Creates a new ZipFile instance with the zip file at the location specified in zipFile parameter. password parameter is the password of our zip file.

setEncryptFiles(boolean encryptFiles) Set the flag indicating that files are to be encrypted. We need to invoke this method in order to enable/disable zip encryption. Once the zip encryption is enabled, we add an encryption method. setEncryptionMethod(EncryptionMethod encryptionMethod) sets the encryption method used to encrypt files.

setAesKeyStrength(AesKeyStrength aesKeyStrength) sets the key strength of the AES encryption key. AesKeyStrength is an enum class that contains AES encryption key length.

There are three available key lengths that we can use. However, KEY_STRENGTH_256 is the best key length that we can use in zip4j. KEY_STRENGTH_128 is too low and KEY_STRENGTH_192 is supported only for extracting.

In the example above, "folder" is password protected inside the zip file. However, "img.jpg" is not. Add zp to the addFile argument list to make "img.jpg" password protected. For example:
...
zip.addFile(new File("img.jpg"), zp);
...
If you didn't add the ZipParameters instance with setEncryptFiles(true) and setEncryptionMethod(EncryptionMethod.AES) to one of the add*() methods that you're gonna invoke, your zip file won't be password protected.

Create a Split Zip File

To store files in a split zip file, we can use createSplitZipFile to add a list of files to a split archive and createSplitZipFileFromFolder to add a folder to a split archive. Take a look at this example.

View code with code highlight


If we want to split files/folders, we need to put them in a single folder and invoke createSplitZipFileFromFolder method. Take a look at this example.

View code with code highlight


Now, let's take a look at the methods' forms:

createSplitZipFile(List<File> filesToAdd, ZipParameters parameters, boolean splitArchive, long splitLength)
createSplitZipFileFromFolder(File folderToAdd, ZipParameters parameters, boolean splitArchive, long splitLength)

filesToAdd parameter is the list of files that is gonna be added to our split zip file. folderToAdd is the folder that is gonna be added to our split zip file. parameters parameter consists of parameters that will be applied to a zip file.

splitArchive parameter is a flag that enables/disables split zip file mode. splitLength parameter is the split size in bytes. Note that zip file format has a minimum split size of 65536 bytes (64KB)(1024*64=65536). An exception will be thrown if we choose a split size lower than 64KB.
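
As a quick illustration of the 64KB rule, the sketch below validates a split size and gives a rough estimate of how many parts an archive would be split into. The names are hypothetical, and the estimate is based only on the total number of bytes; the real number of parts depends on the compressed size and the zip headers.

```java
/** Hypothetical helper for choosing a split size; not part of Zip4j. */
final class SplitPlan {
    /** Minimum split size allowed by the zip file format: 64KB. */
    static final long MIN_SPLIT_LENGTH = 64L * 1024L; // 65536 bytes

    private SplitPlan() {
    }

    /** Estimates the number of parts for totalBytes of archive data, rounding up. */
    static long estimateParts(long totalBytes, long splitLength) {
        if (splitLength < MIN_SPLIT_LENGTH) {
            throw new IllegalArgumentException(
                    "Split length must be at least " + MIN_SPLIT_LENGTH + " bytes, got " + splitLength);
        }
        if (totalBytes <= 0) {
            return 1; // even an empty archive occupies one file
        }
        // Ceiling division: a partially filled last part still counts as a part.
        return (totalBytes + splitLength - 1) / splitLength;
    }
}
```

For instance, 200,000 bytes with a split size of 65,536 bytes gives an estimate of four parts, while a split size of 1,000 bytes is rejected up front.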

If we want to create a password protected split zip file, we instantiate a ZipParameters instance and set the necessary parameters to create a password protected zip file.

View code with code highlight


Extracting Zip File

To extract all files in a zip file, we use extractAll method. Take a look at this example.

View code with code highlight


extractAll(String destinationPath) takes one parameter. destinationPath is the destination directory. The extractAll method has another form:

extractAll(String destinationPath, UnzipParameters unzipParameters)

We use this form if we're dealing with symbolic links. As of this writing, UnzipParameters is not well-documented. I guess this class refers to symbolic links extraction in a zip file.

To extract a single file/directory in a zip file, we use extractFile method. Take a look at this example.

View code with code highlight


extractFile(String fileName, String destinationPath) has two parameters. The fileName parameter refers to the path of the zip entry. When referring to a zip entry, the directory separator must be a forward slash ("/") and the path must be relative. An entry name that ends with "/" denotes a directory.

Folder extraction using the extractFile method is available in v2.6.0 and above. destinationPath is the destination path of the extracted file. Remember that the destination path must be a directory/folder. Zip4j will create the destination directory if it doesn't exist.

If we want to extract a single file and give it a new name once it's extracted, we use this form of extractFile method.

extractFile(String fileName, String destinationPath, String newFileName)

For example:
...
ZipFile zip = new ZipFile
("myzip.zip", password.toCharArray());
zip.extractFile("img.jpg", "extracted", "image.jpg");
...
fileName parameter is the path name of the file in the zip file. destinationPath parameter is the destination directory. newFileName parameter is the name given to the file once it's extracted.

Take note that the path in fileName parameter should follow zip specification. It means that the directory separator must be "/" and the path must be relative.

If we want to stream file data in a zip entry, we can get an input stream for an entry. With this, we can read data from the input stream and write the data in an output stream. To do this, we use getInputStream(FileHeader fileHeader) method. For example, we want to get the bytes of an image.

View code with code highlight


FileHeader is a class that contains file headers of a zip entry. getFileHeader(String fileName) returns FileHeader of a zip entry if a file header with the given path equivalent to fileName parameter exists in the zip model. Otherwise, returns null.

Take note that the path in fileName parameter should follow zip specification. It means that the directory separator must be "/" and the path must be relative.
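
Reading an entry through an input stream comes down to a plain copy loop. The sketch below is a hypothetical helper that drains any java.io.InputStream into a byte array, so it should work with the stream returned by getInputStream as well. It wraps IOException in UncheckedIOException, which needs JDK 8; on JDK 7 the method could declare "throws IOException" instead.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

/** Hypothetical helper that reads a whole stream into memory. */
final class StreamBytes {
    private StreamBytes() {
    }

    /** Reads the stream until end of data and returns everything that was read. The caller closes the stream. */
    static byte[] readAll(InputStream in) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        byte[] buffer = new byte[4096];
        try {
            int read;
            // read() returns -1 once the end of the stream has been reached.
            while ((read = in.read(buffer)) != -1) {
                out.write(buffer, 0, read);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }
}
```

Keep in mind that this holds the whole entry in memory, so for large entries it is better to write each buffer straight to an output stream.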

extractFile has other forms that you can check out in the documentation.

If we want to extract a password-protected zip file, we use one of ZipFile constructors:
ZipFile(File zipFile, char[] password)
ZipFile(String zipFile, char[] password)
Example:
...
ZipFile zip = new ZipFile
("myzip.zip", password.toCharArray());
zip.extractAll("destination-dir");
...
Using extractFile method.
...
ZipFile zip = 
new ZipFile
("myzip.zip", password.toCharArray());
zip.extractFile("myfile.txt", "destination-dir");
...

Rename Zip Entry

To rename an entry in a zip file, we can use the renameFile(String fileNameToRename, String newFileName) method from the ZipFile class.

View code with code highlight


We can use renameFile method to move an entry to another directory entry. For example:
...
zip.renameFile("image1.jpg", "folder/image1.jpg");
...
In the example above, "image1.jpg" will be moved to "folder" directory entry. We can also move and rename file at the same time. For example:
...
zip.renameFile("image1.jpg", "folder/moved-image1.jpg");
...
In the example above, "image1.jpg" will be moved to "folder" directory entry and will be renamed as "moved-image1.jpg". If the directory where a file is going to be moved doesn't exist, java will create one and place the file there.

If we want to rename multiple files, we can use the renameFiles(Map<String,String> fileNamesMap) method.

View code with code highlight


A map consists of key-value pairs. In the example above, the keys are the current path name of entries and the values are the new path name of entries.
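
As a small illustration of such a map, the sketch below builds one that moves a list of entries into a folder while keeping their base names. The helper is hypothetical and only prepares the map; the result is what would be passed to renameFiles.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical helper that prepares rename maps (current entry path -> new entry path). */
final class RenameMaps {
    private RenameMaps() {
    }

    /** Maps each entry to "<folder>/<base name>", using "/" as the directory separator. */
    static Map<String, String> moveInto(String folder, List<String> entryNames) {
        String prefix = folder.endsWith("/") ? folder : folder + "/";
        // LinkedHashMap keeps the entries in the order they were given.
        Map<String, String> renames = new LinkedHashMap<>();
        for (String entryName : entryNames) {
            // The base name is whatever follows the last "/" (or the whole name if there is none).
            String baseName = entryName.substring(entryName.lastIndexOf('/') + 1);
            renames.put(entryName, prefix + baseName);
        }
        return renames;
    }
}
```

For example, moving "image1.jpg" and "docs/a.txt" into "folder" yields the pairs "image1.jpg" to "folder/image1.jpg" and "docs/a.txt" to "folder/a.txt".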

Note that zip entries can have equivalent file paths. If we rename a file in a zip file, all zip entries that have file names that are equivalent to the target file will be renamed. Also, we can rename a directory. Renaming a directory in a zip entry will update all file paths of entries in the directory.

Note that entry paths should follow zip specification. It means that the directory separator must be "/" and the path must be relative. Zip file format does not allow modifying split zip files, and Zip4j will throw an exception if an attempt is made to rename files in a split zip file.

Remove Zip Entry

If we want to remove an entry from a zip file, we can use removeFile(String fileName) method.

View code with code highlight



If we want to check if the file that we wanna remove exists in a zip file, we can get a FileHeader instance from a zip entry and check if the instance is null or not. If it's null, the file that we wanna delete doesn't exist in the zip file.

View code with code highlight


In the example above, we use another form of removeFile method which is removeFile(FileHeader fileHeader)

If we want to remove multiple files using a single method, we use removeFiles(List<String> fileNames) method. Since v2.5.0 of zip4j, we can include a directory in the fileNames list and all of its content will be removed. This example demonstrates removeFiles method.

View code with code highlight


Working with ZipInputStream and ZipOutputStream

If we want more control over how we compress/extract zip files, we can use ZipInputStream and ZipOutputStream instead of the ZipFile class. ZipInputStream and ZipOutputStream in Zip4j are very similar to ZipInputStream and ZipOutputStream in the java.util.zip package.

If you're not familiar with ZipInputStream and ZipOutputStream, you should read this blogpost that I've created. The blogpost contains tutorial about java.util.zip package.

One of the differences between ZipInputStream and ZipOutputStream of the java.util.zip package and those of Zip4j is that the zip input and output streams of Zip4j have constructors that support password-protected zip files. The java.util.zip package doesn't support password-protected zip files. This example demonstrates creating a password-protected zip file using ZipOutputStream of Zip4j.

View code with code highlight


Next, this example demonstrates extracting password-protected zip file using ZipInputStream of Zip4j.

View code with code highlight


One of the differences between FileHeader and LocalFileHeader is that FileHeader represents an entry's record in the central directory of the zip file, whereas LocalFileHeader represents the local header that is stored right before the entry's data.

ProgressMonitor

If we want to monitor the progress of a single action, we can use ProgressMonitor. This class can monitor the progress of some methods from the ZipFile class such as addFolder, addFiles, removeFiles and extractFile. This example demonstrates ProgressMonitor.

View code with code highlight


Take note that this is just a demonstration. That's why the result is not very pretty. We need to put more time and effort on the example above to make a pretty result. Also take note that ProgressMonitor instance from ZipFile class may not be thread-safe. Therefore, proceed with caution when you want multiple threads to access ProgressMonitor instance from ZipFile class.

Alright, let's discuss the example above. First off, we need to invoke setRunInThread(boolean runInThread) method and set its flag to true.

This enables a background thread that monitors some actions happening in ZipFile class. setRunInThread is used in conjunction with ProgressMonitor. Thus, we need to get a ProgressMonitor instance from ZipFile to manage the progress of a task in ZipFile.

To do that, we use getProgressMonitor method. This method returns a ProgressMonitor instance from a ZipFile instance.

ProgressMonitor monitors results and tasks of an action. ProgressMonitor.State has two states: BUSY and READY. READY means that ProgressMonitor is idle and ready to monitor an action. BUSY means that ProgressMonitor is already monitoring an action.

ProgressMonitor.Task contains constants that denote tasks that may occur during compression and extraction. ProgressMonitor.Result contains constants that denote the result of an operation in ZipFile class.

getPercentDone returns the progress of an action in percentage form. getFileName method from ProgressMonitor class returns the absolute path of a file being processed in our file system. getCurrentTask method returns ProgressMonitor.Task task that is currently monitored.
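
To make the progress output a bit prettier, the percentage can be turned into a simple text bar. The sketch below is a hypothetical formatting helper and has nothing Zip4j-specific in it; it only assumes that the percentage is an int between 0 and 100, and clamps anything outside that range.

```java
/** Hypothetical helper that renders a percentage as a text progress bar. */
final class ProgressBars {
    private ProgressBars() {
    }

    /** Renders e.g. "[#####     ] 50%" for percentDone = 50 and width = 10. */
    static String render(int percentDone, int width) {
        // Clamp to 0..100 so that an unexpected value cannot break the layout.
        int clamped = Math.max(0, Math.min(100, percentDone));
        int filled = clamped * width / 100;
        StringBuilder bar = new StringBuilder("[");
        for (int i = 0; i < width; i++) {
            bar.append(i < filled ? '#' : ' ');
        }
        return bar.append("] ").append(clamped).append('%').toString();
    }
}
```

Inside a polling loop, printing "\r" followed by the rendered bar redraws it on the same console line on most terminals.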

Some Helpful Methods of ZipFile Class

ZipFile class has some helpful methods that can come in handy.

isSplitArchive() returns true if a zip file is a split zip file. Otherwise, returns false.
...
ZipFile zip = new ZipFile("myzip.zip");
...
System.out.println(zip.isSplitArchive());
...
getSplitZipFiles() returns a list of split zip files.
...
ZipFile zip = new ZipFile("myzip.zip");
...
if (zip.isSplitArchive()) {
  // a declaration needs its own block to compile
  List<File> splitZip = zip.getSplitZipFiles();
}
...
mergeSplitFiles(File outputZipFile) Merges split zip files into a single zip file without the need to extract the files in the archive. This method doesn't delete the split zip file.
...
ZipFile zip = new ZipFile("myzip.zip");
...
if(zip.isSplitArchive())
  zip.mergeSplitFiles(new File("merged.zip"));
...
isEncrypted() Checks to see if the zip file is encrypted.
...
ZipFile zip = new ZipFile("myzip.zip");
...
System.out.println(zip.isEncrypted());
...
isValidZipFile() Checks to see if the input zip file is a valid zip file. Note this method only checks for the validity of the headers and not the validity of each entry in the zip file.
 ...
ZipFile zip = new ZipFile("myzip.zip");
...
System.out.println(zip.isValidZipFile());
... 
setComment(String comment) Sets comment for the Zip file. Note that the zip file must exist in our file system first before we can set comments on it.
 ...
ZipFile zip = new ZipFile("myzip.zip");
...
zip.setComment("Comment1" + "\n" + "Comment2");
... 
To remove a comment, use empty string "" as argument for setComment method. getComment() returns the comment set for the Zip file.

getFileHeaders() Returns the list of file headers in the zip file. We can use file headers to list every entry in a zip file.
 ...
ZipFile zip = new ZipFile("myzip.zip");
...
List<FileHeader> fileHeaders = 
zip.getFileHeaders();
fileHeaders.stream().
forEach(fileHeader -> 
        System.out.println
        (fileHeader.getFileName()));
... 
ZipParameters

ZipParameters contains parameters that define the structure of a zip file. If we instantiate ZipParameters using its default constructor, the default values of the parameters are used.

These are the default values of zip parameters.
CompressionMethod.DEFLATE
CompressionLevel.NORMAL
EncryptionMethod.NONE
AesKeyStrength.KEY_STRENGTH_256
AesVersion.TWO
SymbolicLinkAction.INCLUDE_LINKED_FILE_ONLY
readHiddenFiles is true
readHiddenFolders is true
includeRootFolder is true
writeExtendedLocalFileHeader is true
CompressionMethod.DEFLATE is the default compression method. We can change this value by calling setCompressionMethod(CompressionMethod compressionMethod). Refer to CompressionMethod class for compression method types.

CompressionLevel.NORMAL is the compression level. This parameter is only applicable to DEFLATE compression method. We can change this value by calling setCompressionLevel(CompressionLevel compressionLevel) method. Refer to CompressionLevel class for compression level types.

EncryptionMethod.NONE is the default encryption method. We change this value if we want to create a password-protected zip file. To change this value, we call setEncryptionMethod(EncryptionMethod encryptionMethod) method. Refer to EncryptionMethod class for encryption method types.

AesKeyStrength.KEY_STRENGTH_256 is the default key length of AES encryption method. To change this value, we call setAesKeyStrength(AesKeyStrength aesKeyStrength) method. Refer to AesKeyStrength for available AES key length.

AesVersion.TWO is the default version of AES encryption method. To change this value, call setAesVersion(AesVersion aesVersion) method. Refer to AesVersion for AES versions.

SymbolicLinkAction.INCLUDE_LINKED_FILE_ONLY is the default action for symbolic links. To change this value, we call setSymbolicLinkAction(ZipParameters.SymbolicLinkAction symbolicLinkAction) method. Refer to ZipParameters.SymbolicLinkAction for actions for symbolic links.

readHiddenFiles parameter default value is true. To change this value, we call setReadHiddenFiles(boolean readHiddenFiles) method.

readHiddenFolders parameter default value is true. To change this value, we call setReadHiddenFolders(boolean readHiddenFolders) method.

includeRootFolder parameter default value is true. To change this value, we call setIncludeRootFolder(boolean includeRootFolder) method. You can see the effect of this parameter when you add a folder to a zip file. For example, you add "folder", which contains "file.txt", to your zip file. If includeRootFolder is true, the entry is stored as "folder/file.txt". If it is false, only "file.txt" is stored in your zip file.

writeExtendedLocalFileHeader parameter default value is true. To change this value, we call setWriteExtendedLocalFileHeader(boolean writeExtendedLocalFileHeader) method. I assume this parameter refers to extra field added to local file header. More information about extra fields can be found here.
