Managing Disk Space
Lesson 5: File archives
Objective: Describe an archive.

File archives

One way of managing disk space is to create an archive. An archive is a set of files packaged into a single, larger file. An archive can contain a few files, one or more directories, or an entire directory tree. Archives are useful for making backup copies of your data. For example, you can store an archive[1] on a different computer or on removable media such as magnetic tape. An archive is easy to manage because you treat it as a single file. In addition, compressing an archive usually saves more space than compressing the same files individually. You may be familiar with archives in a Windows or Macintosh environment. Programs such as WinZip or pkzip create archives that end with a .zip extension. You might hear these archives referred to as zip files.
On UNIX, you create archives by using the tar command, whose name is short for tape archive. The tar command was designed for archiving data to tape, but you can also use it to archive data to a file, which is often called a tar file. Tar files typically end with a .tar extension. The extension is not required, but this convention lets people identify the file as an archive. Zip files are created in a compressed format by default, but tar files are not. If you want to compress a tar file, you must run a separate compression command such as compress. You can use the tar command to create an archive, to list the file names in an archive, or to extract[2] files from an archive. The next three lessons describe these tasks.
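The three tar tasks and the separate compression step can be tried in a scratch directory. The following is a minimal sketch, not a recommended backup procedure. The directory and file names are hypothetical, and gzip stands in for compress on the assumption that gzip is installed (compress is missing from many current systems).

```shell
#!/bin/sh
# Sketch: create, list, and extract a tar archive, then compress it separately.
# All paths are hypothetical and live under a scratch directory.
set -eu

work="${TMPDIR:-/tmp}/tar_lesson_demo"
rm -rf "$work"
mkdir -p "$work/project/docs" "$work/restore"
echo "first file"  > "$work/project/notes.txt"
echo "second file" > "$work/project/docs/readme.txt"

cd "$work"

# c = create an archive, f = write it to the named file instead of a tape device
tar -cf project.tar project

# t = list the file names stored in the archive
tar -tf project.tar

# x = extract; -C changes to the target directory before extracting
tar -xf project.tar -C restore

# tar does not compress by itself, so compression is a separate step.
# Assumption: gzip is used here in place of compress.
gzip project.tar            # replaces project.tar with project.tar.gz
ls -l project.tar.gz
```

After the script runs, the restore directory holds a copy of the project tree, and only the compressed project.tar.gz remains next to it.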

How "File Archives" are used in Unix

File archives in Unix are used to consolidate multiple files and directories into a single file for easier management, storage, and transfer. The primary utility for creating archives in Unix is the `tar` command, and compressed archives often combine `tar` with compression utilities like `gzip`, `bzip2`, or `xz`.
Here’s how file archives are used in Unix:
  1. Backup and Restore
    • Purpose: To create backups of files and directories.
    • Usage:
      • Create an archive:
        tar -cvf backup.tar /path/to/directory
                    
      • Extract from an archive:
        tar -xvf backup.tar
                    
  2. Compression and Decompression
    • Purpose: Reduce the size of an archive for storage or transmission.
    • Usage:
      • Create a compressed archive:
        tar -cvzf archive.tar.gz /path/to/files
                    
      • Extract a compressed archive:
        tar -xvzf archive.tar.gz
                    
  3. Software Distribution
    • Purpose: Distribute software packages as a single archive.
    • Usage:
      • Source code or binaries are packaged into .tar.gz or .tar.bz2 files for download and installation.
      • Example:
        tar -xvzf software.tar.gz
        cd software
        ./configure
        make
        sudo make install
                    
  4. File Organization
    • Purpose: Group related files for organization and versioning.
    • Usage:
      • Create an archive of log files for long-term storage:
        tar -cvf logs.tar /var/log/
                    
      • Move or store the archive for reference.
  5. Remote Transfers
    • Purpose: Simplify file transfers across systems.
    • Usage:
      • Use scp or rsync to transfer the archive:
        scp archive.tar.gz user@remote:/path/to/destination
                    
  6. Extracting Specific Files
    • Purpose: Extract only certain files from a large archive.
    • Usage:
      tar -xvf archive.tar path/to/file
              
  7. Inspecting Archive Contents
    • Purpose: View the contents of an archive without extracting.
    • Usage:
      tar -tvf archive.tar
              
  8. Incremental Backups
    • Purpose: Archive only files that have changed since the last backup.
    • Usage:
      • Using the --newer or --listed-incremental options (available in GNU tar):
        tar --newer='2024-12-01' -cvf incremental_backup.tar /path/to/directory
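
Several of the uses above (creating a compressed archive, inspecting its contents, and extracting a single file) can be combined in one short session. This is a sketch under stated assumptions: the site directory and its files are made up for illustration, and the z option relies on a tar that can call gzip, as GNU tar and bsdtar do.

```shell
#!/bin/sh
# Sketch: compressed archive, listing, and selective extraction.
# The "site" tree below is hypothetical sample data.
set -eu

work="${TMPDIR:-/tmp}/tar_usage_demo"
rm -rf "$work"
mkdir -p "$work/site/logs" "$work/out"
echo "index page" > "$work/site/index.html"
echo "GET / 200"  > "$work/site/logs/access.log"

cd "$work"

# Create a gzip-compressed archive in one step (z = filter through gzip)
tar -czf site.tar.gz site

# Inspect the contents without extracting anything
tar -tzf site.tar.gz

# Extract only one member into ./out; other members are left in the archive
tar -xzf site.tar.gz -C out site/logs/access.log

ls -R out
```

The member name given on the command line has to match the path as it is stored in the archive, which is why listing the archive first is a useful habit.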
                    
Tools Commonly Used with File Archives:
  1. gzip, bzip2, xz: For compression (.gz, .bz2, .xz).
  2. cpio: Another archiving tool, often used with pipelines.
  3. zip/unzip: For creating .zip archives, common for cross-platform compatibility.

File archives streamline handling large numbers of files, ensure efficient storage, and facilitate easy recovery, making them a fundamental tool in Unix systems.
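
The earlier point about compression, that compressing one archive tends to save more space than compressing each file on its own, can be checked with a small experiment. The sketch below generates hypothetical sample files that share most of their content, so it shows a favorable case; the exact byte counts depend on the data and on the tar and gzip versions in use.

```shell
#!/bin/sh
# Sketch: compare per-file compression with archive-then-compress.
# The generated record files are made-up sample data.
set -eu

work="${TMPDIR:-/tmp}/tar_size_demo"
rm -rf "$work"
mkdir -p "$work/files"
cd "$work"

# Generate 100 small files that differ only in their first line
i=1
while [ "$i" -le 100 ]; do
    {
        echo "record id: $i"
        echo "status: active"
        echo "owner: example-user"
        echo "comment: the quick brown fox jumps over the lazy dog"
    } > "files/record_$i.txt"
    i=$((i + 1))
done

# Approach 1: compress every file on its own (originals are kept)
for f in files/*.txt; do
    gzip -c "$f" > "$f.gz"
done
individual_bytes=$(cat files/*.gz | wc -c)

# Approach 2: archive the originals first, then compress the archive once
tar -cf all.tar files/*.txt
gzip -c all.tar > all.tar.gz
archive_bytes=$(wc -c < all.tar.gz)

echo "compressed individually: $individual_bytes bytes"
echo "tar, then gzip:          $archive_bytes bytes"
```

Each separately compressed file carries its own header and cannot reuse patterns found in its neighbors, while the single compressed archive can, which is where the difference comes from.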

Filesystem Types

Before any disk partition can be used, a filesystem must be built on it. When a filesystem is made, certain data structures are written to disk that will be used to access and organize the physical disk space into files. Table 6-5 lists the most important filesystem types available on the various systems we are considering.
Use            AIX            FreeBSD  HP-UX          Linux           Solaris  Tru64
-------------  -------------  -------  -------------  --------------  -------  ------------
Default local  jfs or jfs2    ufs      vxfs           ext3, reiserfs  ufs      ufs or advfs
NFS            nfs            nfs      nfs            nfs             nfs      nfs
CD-ROM         cdrfs          cd9660   cdfs           iso9660         hsfs     cdfs
Swap           not needed     swap     swap, swapfs   swap            swap     not needed
DOS            not supported  msdos    not supported  msdos           pcfs     pcfs
/proc          procfs         procfs   not supported  procfs          procfs   procfs
RAM-based      not supported  mfs      not supported  ramfs, tmpfs    tmpfs    mfs
Other          union          union    hfs            ext2            cachefs  cachefs
Table 6-5. Important filesystem types

Unix Filesystems: Moments from History
In the beginning, there was the System V filesystem, and that is where we will start. This filesystem type once dominated System V–based operating systems. The superblock of standard System V filesystems contained information about currently available free space in the filesystem in addition to information about how the space in the filesystem is allocated. It held
  1. the number of free inodes and data blocks,
  2. the first 50 free inode numbers, and
  3. the addresses of the first 100 free disk blocks.

After the superblock came the inodes, followed by the data blocks. The System V filesystem was designed for storage efficiency. It generally used a small filesystem block size: 2 KB or less. Traditionally, a block is the basic unit of disk storage; all files consume space in multiples of the block size, and any excess space in the last block cannot be used by other files and is therefore wasted. If a filesystem has a lot of small files, a small block size minimizes waste. However, small block sizes are much less efficient when transferring large files.
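
The trade-off between block size and wasted space comes down to simple arithmetic: a file occupies a whole number of blocks, so the allocation is the file size rounded up to the next multiple of the block size. The sketch below uses a hypothetical helper function, allocated_bytes, and a made-up 2500-byte file; it models data blocks only and ignores inodes and other metadata.

```shell
#!/bin/sh
# Sketch: space consumed by one file for several block sizes.
# allocated_bytes is a hypothetical helper, not a standard command.

# Usage: allocated_bytes FILE_SIZE BLOCK_SIZE
allocated_bytes() {
    size=$1
    block=$2
    # Round up to a whole number of blocks
    blocks=$(( (size + block - 1) / block ))
    echo $(( blocks * block ))
}

file_size=2500
for block_size in 512 2048 8192; do
    used=$(allocated_bytes "$file_size" "$block_size")
    echo "block=$block_size allocated=$used wasted=$(( used - file_size ))"
done
# Prints:
# block=512 allocated=2560 wasted=60
# block=2048 allocated=4096 wasted=1596
# block=8192 allocated=8192 wasted=5692
```

With many small files, the wasted tail of each file's last block adds up quickly as the block size grows, which is the cost that a small block size avoids.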

System V filesystem

The System V filesystem type is obsolete at this point. It is still supported on some systems for backward compatibility purposes only. The BSD Fast File System (FFS) was designed to remedy the performance limitations of the System V filesystem. It supports filesystem block sizes of up to 64 KB. Because merely increasing the block size to this level would have had a horrendous effect on the amount of wasted space, the designers introduced a subunit to the block known as the fragment. While the block remains the I/O transfer unit, the fragment becomes the disk storage unit (although only the final chunk of a file can be a fragment). Each block may be divided into one, two, four, or eight fragments. Whatever its absolute performance status, the BSD filesystem is an unequivocal improvement over System V. For this reason, it was included in the System V.4 standard as the UFS filesystem type. This is its name on Solaris and Tru64 systems (as well as under FreeBSD). For a while, this filesystem dominated in the Unix arena.
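
The effect of fragments can be estimated with the same kind of arithmetic. The sketch below is a simplified model, not the actual FFS allocator: it assumes that full blocks hold everything except the final chunk of the file, and that the final chunk is stored in as many fragments as it needs. The function name ffs_allocated, the 8 KB block size, and the file sizes are illustrative assumptions.

```shell
#!/bin/sh
# Sketch: space consumed when the last chunk of a file may use fragments.
# ffs_allocated is a hypothetical helper implementing a simplified model.

# Usage: ffs_allocated FILE_SIZE BLOCK_SIZE FRAGMENTS_PER_BLOCK
ffs_allocated() {
    size=$1
    block=$2
    per_block=$3
    frag_size=$(( block / per_block ))
    full_blocks=$(( size / block ))      # whole blocks, the I/O transfer unit
    tail_bytes=$(( size % block ))       # final chunk of the file
    tail_frags=$(( (tail_bytes + frag_size - 1) / frag_size ))
    echo $(( full_blocks * block + tail_frags * frag_size ))
}

file_size=2500
block_size=8192
for n in 1 2 4 8; do
    used=$(ffs_allocated "$file_size" "$block_size" "$n")
    echo "fragments/block=$n allocated=$used wasted=$(( used - file_size ))"
done
# Prints:
# fragments/block=1 allocated=8192 wasted=5692
# fragments/block=2 allocated=4096 wasted=1596
# fragments/block=4 allocated=4096 wasted=1596
# fragments/block=8 allocated=3072 wasted=572
```

In this model, an 8 KB block split into eight fragments wastes about as little space on a small file as a 1 KB block would, while large files still move through the system in 8 KB transfers.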

In the next lesson, you will learn to create an archive.
[1] archive: An archive is a set of files that are packaged as a single, large file.
[2] extract: To extract files from an archive means to copy them out of an archive and onto the filesystem.
