How to create tar.gz in Java
In Java, we have ZipOutputStream to create a zip file and GZIPOutputStream to compress a file using Gzip, but the JDK has no built-in API to create a tar.gz file.
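For comparison, here is what the JDK can do on its own: Gzip-compress a single stream, with no notion of an archive holding several files. This is a minimal sketch; the class name GzipRoundTrip and the sample text are made up for this illustration, and readAllBytes assumes Java 9 or later.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.zip.GZIPInputStream;
import java.util.zip.GZIPOutputStream;

// hypothetical class name, used only for this sketch
public class GzipRoundTrip {

    // compress a byte[] with the JDK's GZIPOutputStream
    public static byte[] gzip(byte[] input) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (GZIPOutputStream gzOut = new GZIPOutputStream(bytes)) {
            gzOut.write(input);
        }
        // the Gzip trailer is written when gzOut is closed
        return bytes.toByteArray();
    }

    // decompress a Gzip byte[] with the JDK's GZIPInputStream
    public static byte[] gunzip(byte[] compressed) throws IOException {
        try (GZIPInputStream gzIn =
                     new GZIPInputStream(new ByteArrayInputStream(compressed))) {
            return gzIn.readAllBytes();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] original = "hello tar.gz".getBytes(StandardCharsets.UTF_8);
        byte[] restored = gunzip(gzip(original));
        System.out.println(new String(restored, StandardCharsets.UTF_8));
    }
}
```

Gzip compresses one stream of bytes; bundling several files into that stream is the job of the tar format, which is why an extra library is needed.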
We can use Apache Commons Compress (still under active development) to create a .tar.gz file. Add it as a Maven dependency:
<dependency>
<groupId>org.apache.commons</groupId>
<artifactId>commons-compress</artifactId>
<version>1.20</version>
</dependency>
Notes
The code snippet below creates a tar.gz file.
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
//...
try (OutputStream fOut = Files.newOutputStream(Paths.get("output.tar.gz"));
BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {
TarArchiveEntry tarEntry = new TarArchiveEntry(file,fileName);
tOut.putArchiveEntry(tarEntry);
// copy file to TarArchiveOutputStream
Files.copy(path, tOut);
tOut.closeArchiveEntry();
tOut.finish();
}
The code snippet below decompresses a .tar.gz file.
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
//...
try (InputStream fi = Files.newInputStream(Paths.get("input.tar.gz"));
BufferedInputStream bi = new BufferedInputStream(fi);
GzipCompressorInputStream gzi = new GzipCompressorInputStream(bi);
TarArchiveInputStream ti = new TarArchiveInputStream(gzi)) {
ArchiveEntry entry;
while ((entry = ti.getNextEntry()) != null) {
// build the target path; remember to guard against the zip slip attack
Path newPath = filename(entry, targetDir);
// ... validate newPath here before writing ...
// copy TarArchiveInputStream to newPath
Files.copy(ti, newPath);
}
}
1. Add two files to tar.gz
This example shows how to add two files into a tar.gz
file.
package com.mkyong.io.howto.compress;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
public class TarGzipExample1 {
public static void main(String[] args) {
try {
Path path1 = Paths.get("/home/mkyong/test/sitemap.xml");
Path path2 = Paths.get("/home/mkyong/test/file.txt");
Path output = Paths.get("/home/mkyong/test/output.tar.gz");
List<Path> paths = Arrays.asList(path1, path2);
createTarGzipFiles(paths, output);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Done");
}
// tar.gz a few files
public static void createTarGzipFiles(List<Path> paths, Path output)
throws IOException {
try (OutputStream fOut = Files.newOutputStream(output);
BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {
for (Path path : paths) {
if (!Files.isRegularFile(path)) {
throw new IOException("Only regular files are supported: " + path);
}
TarArchiveEntry tarEntry = new TarArchiveEntry(
path.toFile(),
path.getFileName().toString());
tOut.putArchiveEntry(tarEntry);
// copy file to TarArchiveOutputStream
Files.copy(path, tOut);
tOut.closeArchiveEntry();
}
tOut.finish();
}
}
}
Output: It adds sitemap.xml and file.txt to a single tar archive and compresses it with Gzip; the result is output.tar.gz.
$ tar -tvf /home/mkyong/test/output.tar.gz
-rw-r--r-- 0/0 396719 2020-08-12 14:02 sitemap.xml
-rw-r--r-- 0/0 15 2020-08-11 17:56 file.txt
2. Add a directory to tar.gz
This example adds a directory, including its files and sub-directories, to a single archive and compresses it with Gzip into a .tar.gz file.
The idea is to use Files.walkFileTree to walk the file tree and add the files one by one to the TarArchiveOutputStream.
package com.mkyong.io.howto.compress;
import org.apache.commons.compress.archivers.tar.TarArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveOutputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorOutputStream;
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.*;
import java.nio.file.attribute.BasicFileAttributes;
public class TarGzipExample2 {
public static void main(String[] args) {
try {
// tar.gz a folder
Path source = Paths.get("/home/mkyong/test");
createTarGzipFolder(source);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Done");
}
// generate .tar.gz file at the current working directory
// tar.gz a folder
public static void createTarGzipFolder(Path source) throws IOException {
if (!Files.isDirectory(source)) {
throw new IOException("Please provide a directory.");
}
// use the folder name as the tar.gz file name
String tarFileName = source.getFileName().toString() + ".tar.gz";
try (OutputStream fOut = Files.newOutputStream(Paths.get(tarFileName));
BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {
Files.walkFileTree(source, new SimpleFileVisitor<>() {
@Override
public FileVisitResult visitFile(Path file,
BasicFileAttributes attributes) {
// only copy files, no symbolic links
if (attributes.isSymbolicLink()) {
return FileVisitResult.CONTINUE;
}
// get filename
Path targetFile = source.relativize(file);
try {
TarArchiveEntry tarEntry = new TarArchiveEntry(
file.toFile(), targetFile.toString());
tOut.putArchiveEntry(tarEntry);
Files.copy(file, tOut);
tOut.closeArchiveEntry();
System.out.printf("file : %s%n", file);
} catch (IOException e) {
System.err.printf("Unable to tar.gz : %s%n%s%n", file, e);
}
return FileVisitResult.CONTINUE;
}
@Override
public FileVisitResult visitFileFailed(Path file, IOException exc) {
System.err.printf("Unable to tar.gz : %s%n%s%n", file, exc);
return FileVisitResult.CONTINUE;
}
});
tOut.finish();
}
}
}
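The entry names in the archive come from source.relativize(file), so each file is stored under its path relative to the folder being archived. The sketch below isolates only that naming step, using the JDK alone and no Commons Compress; the class name EntryNameSketch and the temporary files are made up for this illustration, and Files.walk is used here instead of Files.walkFileTree purely for brevity.

```java
import java.io.File;
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// hypothetical class name, used only for this sketch
public class EntryNameSketch {

    // returns the archive entry names for all regular files under source:
    // paths relative to source, with '/' as the separator
    public static List<String> entryNames(Path source) throws IOException {
        try (Stream<Path> walk = Files.walk(source)) {
            return walk
                    .filter(Files::isRegularFile)
                    .map(file -> source.relativize(file).toString()
                            .replace(File.separatorChar, '/'))
                    .sorted()
                    .collect(Collectors.toList());
        }
    }

    public static void main(String[] args) throws IOException {
        // build a small throwaway directory tree
        Path dir = Files.createTempDirectory("tar-sketch");
        Files.createDirectories(dir.resolve("sub"));
        Files.write(dir.resolve("a.txt"), new byte[]{'a'});
        Files.write(dir.resolve("sub").resolve("b.txt"), new byte[]{'b'});

        System.out.println(entryNames(dir)); // [a.txt, sub/b.txt]
    }
}
```

Relative names keep the archive portable: extracting it recreates sub/b.txt under whatever target folder is chosen, rather than under the original absolute path.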
3. Add String to tar.gz
This example wraps a String in a ByteArrayInputStream and writes it into the TarArchiveOutputStream directly. In other words, it creates an archive entry without first saving the file to the local disk.
public static void createTarGzipFilesOnDemand() throws IOException {
String data1 = "Test data 1";
String fileName1 = "111.txt";
String data2 = "Test data 2 3 4";
String fileName2 = "folder/222.txt";
String outputTarGzip = "/home/mkyong/output.tar.gz";
try (OutputStream fOut = Files.newOutputStream(Paths.get(outputTarGzip));
BufferedOutputStream buffOut = new BufferedOutputStream(fOut);
GzipCompressorOutputStream gzOut = new GzipCompressorOutputStream(buffOut);
TarArchiveOutputStream tOut = new TarArchiveOutputStream(gzOut)) {
createTarArchiveEntry(fileName1, data1, tOut);
createTarArchiveEntry(fileName2, data2, tOut);
tOut.finish();
}
}
private static void createTarArchiveEntry(String fileName,
String data,
TarArchiveOutputStream tOut)
throws IOException {
// encode with an explicit charset rather than the platform default
byte[] dataInBytes = data.getBytes(java.nio.charset.StandardCharsets.UTF_8);
// create a byte[] input stream
ByteArrayInputStream baOut1 = new ByteArrayInputStream(dataInBytes);
TarArchiveEntry tarEntry = new TarArchiveEntry(fileName);
// the entry size (in bytes) must be set, otherwise writing the data fails
tarEntry.setSize(dataInBytes.length);
// tarEntry.setSize(baOut1.available()); alternative
tOut.putArchiveEntry(tarEntry);
// copy ByteArrayInputStream to TarArchiveOutputStream
byte[] buffer = new byte[1024];
int len;
while ((len = baOut1.read(buffer)) > 0) {
tOut.write(buffer, 0, len);
}
tOut.closeArchiveEntry();
}
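One detail worth spelling out: the size passed to setSize is a byte count, not a character count, and the two differ as soon as the text contains non-ASCII characters. The sketch below uses only the JDK; the class name EntrySizeSketch and the sample strings are made up for this illustration, and UTF-8 is an assumed encoding.

```java
import java.nio.charset.StandardCharsets;

// hypothetical class name, used only for this sketch
public class EntrySizeSketch {

    // size in bytes of the text once encoded as UTF-8;
    // this is the kind of value to pass to TarArchiveEntry.setSize(...)
    public static long entrySize(String data) {
        return data.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(entrySize("Test data 1")); // 11
        System.out.println(entrySize("caf\u00e9"));   // 5, although length() is 4
    }
}
```

Using data.length() as the size would therefore be wrong for such text; taking the length of the encoded byte[] avoids the mismatch.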
4. Decompress file – tar.gz
This example shows how to decompress and extract a tar.gz file; it also guards against the zip slip vulnerability.
package com.mkyong.io.howto.compress;
import org.apache.commons.compress.archivers.ArchiveEntry;
import org.apache.commons.compress.archivers.tar.TarArchiveInputStream;
import org.apache.commons.compress.compressors.gzip.GzipCompressorInputStream;
import java.io.BufferedInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardCopyOption;
public class TarGzipExample4 {
public static void main(String[] args) {
try {
// decompress .tar.gz
Path source = Paths.get("/home/mkyong/test/output.tar.gz");
Path target = Paths.get("/home/mkyong/test2");
decompressTarGzipFile(source, target);
} catch (IOException e) {
e.printStackTrace();
}
System.out.println("Done");
}
public static void decompressTarGzipFile(Path source, Path target)
throws IOException {
if (Files.notExists(source)) {
throw new IOException("File doesn't exist!");
}
try (InputStream fi = Files.newInputStream(source);
BufferedInputStream bi = new BufferedInputStream(fi);
GzipCompressorInputStream gzi = new GzipCompressorInputStream(bi);
TarArchiveInputStream ti = new TarArchiveInputStream(gzi)) {
ArchiveEntry entry;
while ((entry = ti.getNextEntry()) != null) {
// build the target path and validate it against zip slip
Path newPath = zipSlipProtect(entry, target);
if (entry.isDirectory()) {
Files.createDirectories(newPath);
} else {
// check parent folder again
Path parent = newPath.getParent();
if (parent != null) {
if (Files.notExists(parent)) {
Files.createDirectories(parent);
}
}
// copy TarArchiveInputStream to Path newPath
Files.copy(ti, newPath, StandardCopyOption.REPLACE_EXISTING);
}
}
}
}
private static Path zipSlipProtect(ArchiveEntry entry, Path targetDir)
throws IOException {
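// normalize the target directory first, so the prefix check below
// also works for targets such as "./out" or "."
Path normalizedTarget = targetDir.toAbsolutePath().normalize();
Path targetDirResolved = normalizedTarget.resolve(entry.getName());
// make sure the normalized file still has the target directory as its prefix,
// else throw an exception
Path normalizePath = targetDirResolved.normalize();
if (!normalizePath.startsWith(normalizedTarget)) {
throw new IOException("Bad entry: " + entry.getName());
}
return normalizePath;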
}
}
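To see the zip slip check in isolation, here is a small sketch that uses plain entry names instead of ArchiveEntry objects, so it runs with the JDK alone. The class name ZipSlipSketch, the method safeResolve, and the sample entry names are made up for this illustration.

```java
import java.io.IOException;
import java.nio.file.Path;
import java.nio.file.Paths;

// hypothetical class name, used only for this sketch
public class ZipSlipSketch {

    // resolve an entry name against targetDir and reject names that escape it
    public static Path safeResolve(Path targetDir, String entryName)
            throws IOException {
        Path base = targetDir.toAbsolutePath().normalize();
        Path resolved = base.resolve(entryName).normalize();
        if (!resolved.startsWith(base)) {
            throw new IOException("Bad entry: " + entryName);
        }
        return resolved;
    }

    public static void main(String[] args) throws IOException {
        Path target = Paths.get("/home/mkyong/test2");

        // a normal entry stays inside the target directory
        System.out.println(safeResolve(target, "folder/222.txt"));

        // an entry that climbs out of the target directory is rejected
        try {
            safeResolve(target, "../../etc/passwd");
        } catch (IOException e) {
            System.out.println(e.getMessage()); // Bad entry: ../../etc/passwd
        }
    }
}
```

The check compares whole path elements rather than raw strings, so a sibling folder such as test2-evil is not mistaken for a child of test2.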
Further Reading
Please check the official Apache Commons Compress examples.
Download Source Code
$ git clone https://github.com/mkyong/core-java
$ cd java-io
Hi and thank you for this solution!
Just one note:
I used your example for “1. Add two files to tar.gz”
Even though it works and produces the intended file, when I tried to open the produced file, I kept getting the following error:
tar: Unexpected EOF in archive
tar: Error is not recoverable: exiting now
What is needed is this:
after: tOut.finish();
this line must be added: tOut.close();
(TarArchiveOutputStream must be closed)
I don’t know if anybody else had this problem, but I’m mentioning it, just in case someone did..
Thanks again!!!
Hi Team
We have an issue where the tar file created by our Spring Boot code sometimes contains an additional PAX file when we untar it, and the archive is then rejected by the agencies. How can we create a tar file without this PAX file being added automatically?
How would one add a file to an existing tar.gz?