Today’s article demonstrates how to create a tar.gz file in a single pass in Java. While a number of websites provide instructions for creating a gzip or tar archive in Java, few explain how to make a tar.gz file without writing the data twice: once to build the tar archive and again to compress it.
Reviewing Tar and Gzip Compression
First, download the Apache Commons Compress library. It is essentially a subset of the code found in the Ant JAR, intended for those whose compression work does not require all of Ant’s many features. Below is the code to create a tar archive and a gzip file, respectively. The tar example uses Commons Compress; the gzip example uses GZIPOutputStream, which ships with the JDK in java.util.zip.
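    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;

    TarArchiveOutputStream out = null;
    try {
        // Buffer the file stream, then layer the tar format on top of it
        out = new TarArchiveOutputStream(
            new BufferedOutputStream(new FileOutputStream("myFile.tar")));
        // Add data to out and flush stream
        ...
    } finally {
        if (out != null) out.close();
    }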
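    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.util.zip.GZIPOutputStream;

    GZIPOutputStream out = null;
    try {
        // Buffer the file stream, then compress everything written to it
        out = new GZIPOutputStream(
            new BufferedOutputStream(new FileOutputStream("myFile.gz")));
        // Add data to out and flush stream
        ...
    } finally {
        if (out != null) out.close();
    }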
One subtlety in these examples is that we wrap the file stream in a BufferedOutputStream for performance reasons. Archive files are often large, so buffering the output is desirable. Another good practice is to always close your resources in a finally block when you are done with them.
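On Java 7 or later, try-with-resources gives the same guarantee as the finally block with less boilerplate. The following is a minimal sketch using only JDK classes; the class name GzipWriteSketch, the method writeGzip, and the file name are illustrative assumptions rather than part of the examples above.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

class GzipWriteSketch {
    // Hypothetical helper: writes text to a gzip file through a buffered stream.
    // The stream chain is closed automatically, even if write() throws.
    static void writeGzip(Path target, String text) throws IOException {
        try (OutputStream out = new GZIPOutputStream(
                new BufferedOutputStream(Files.newOutputStream(target)))) {
            out.write(text.getBytes(StandardCharsets.UTF_8));
        }
    }

    public static void main(String[] args) throws IOException {
        writeGzip(Paths.get("myFile.gz"), "hello, gzip");
    }
}
```

Closing the outermost stream closes every stream it wraps, so only one resource needs to be declared here.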
The Solution
The solution is to wrap the tar stream around a gzip stream, since data written to a chain of streams flows from the outermost stream to the innermost. In the code below, everything written to the tar stream passes straight into the gzip stream, which compresses it on the fly. The compressed output is then buffered and written to disk.
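    import java.io.BufferedOutputStream;
    import java.io.FileOutputStream;
    import java.util.zip.GZIPOutputStream;
    import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;

    TarArchiveOutputStream out = null;
    try {
        // tar (outermost) -> gzip -> buffer -> file (innermost)
        out = new TarArchiveOutputStream(
            new GZIPOutputStream(
                new BufferedOutputStream(new FileOutputStream("myFile.tar.gz"))));
        // Add data to out and flush stream
        ...
    } finally {
        if (out != null) out.close();
    }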
You can then treat the stream as a tar file using the TarArchiveEntry API to add entries and write data directly to the stream. The gzip compression will happen automatically as the stream is written.
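As a rough illustration of adding an entry, the sketch below writes one in-memory file into a tar.gz archive. It assumes Apache Commons Compress is on the classpath; the class name TarGzSketch, the method writeTarGz, and the entry name hello.txt are hypothetical and chosen only for this example.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.zip.GZIPOutputStream;

import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;

class TarGzSketch {
    // Hypothetical helper: writes a single in-memory entry to a tar.gz file.
    static void writeTarGz(Path target, String entryName, byte[] data) throws IOException {
        try (TarArchiveOutputStream out = new TarArchiveOutputStream(
                new GZIPOutputStream(
                    new BufferedOutputStream(Files.newOutputStream(target))))) {
            TarArchiveEntry entry = new TarArchiveEntry(entryName);
            entry.setSize(data.length);  // the size must be known before the entry is written
            out.putArchiveEntry(entry);  // writes the tar header for this entry
            out.write(data);             // entry contents, compressed as they pass through gzip
            out.closeArchiveEntry();     // completes the current entry
            out.finish();                // writes the end-of-archive records
        }
    }

    public static void main(String[] args) throws IOException {
        writeTarGz(Paths.get("myFile.tar.gz"), "hello.txt",
                "hello, tar.gz".getBytes(StandardCharsets.UTF_8));
    }
}
```

To add files from disk instead, the same put/write/close sequence applies per file; only the source of the bytes and the entry size change.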