Arbitrary file write during tarfile extraction¶
ID: py/tarslip Kind: path-problem Severity: error Precision: medium Tags: - security - external/cwe/cwe-022 Query suites: - python-security-extended.qls - python-security-and-quality.qls
Extracting files from a malicious tar archive without validating that the destination file path is within the destination directory can cause files outside the destination directory to be overwritten, due to the possible presence of directory traversal elements (
..) in archive paths.
Tar archives contain archive entries representing each file in the archive. These entries include a file path for the entry, but these file paths are not restricted and may contain unexpected special elements such as the directory traversal element (
..). If these file paths are used to determine an output file to write the contents of the archive item to, then the file may be written to an unexpected location. This can result in sensitive information being revealed or deleted, or an attacker being able to influence behavior by modifying unexpected files.
For example, if a tar archive contains a file entry
..\sneaky-file, and the tar archive is extracted to the directory
c:\output, then naively combining the paths would result in an output file path of
c:\output\..\sneaky-file, which would cause the file to be written to
Ensure that output paths constructed from tar archive entries are validated to prevent writing files to unexpected locations.
The recommended way of writing an output file from a tar archive entry is to check that
".." does not occur in the path.
In this example an archive is extracted without validating file paths. If
archive.tar contained relative paths (for instance, if it were created by something like
tar -cf archive.tar ../file.txt) then executing this code could write to locations outside the destination directory.
import tarfile with tarfile.open('archive.zip') as tar: #BAD : This could write any file on the filesystem. for entry in tar: tar.extract(entry, "/tmp/unpack/")
To fix this vulnerability, we need to check that the path does not contain any
".." elements in it.
import tarfile import os.path with tarfile.open('archive.zip') as tar: for entry in tar: #GOOD: Check that entry is safe if os.path.isabs(entry.name) or ".." in entry.name: raise ValueError("Illegal tar archive entry") tar.extract(entry, "/tmp/unpack/")