File size is a measure of how much data a computer file contains or, alternatively, how much storage it consumes. Typically, file size is expressed in units of measurement based on the byte. By convention, file size units use either a metric prefix (as in megabyte and gigabyte) or a binary prefix (as in mebibyte and gibibyte).[1]
When a file is written to a file system, which is the case in most modern devices, it may consume slightly more disk space than the file requires. This is because the file system rounds the size up to include any unused space left over in the last disk sector used by the file. (A sector is the smallest amount of space addressable by the file system. The size of a disk sector ranges from several hundred to several thousand bytes.) The unused space is called slack space or internal fragmentation.[2] Although smaller sector sizes allow for denser use of disk space, they decrease the operational efficiency of the file system.
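The rounding described above can be illustrated with a short calculation. This is a minimal sketch, not the behavior of any particular file system: the function names `allocated_size` and `slack_space` are hypothetical, and the 4096-byte default unit is an assumed value chosen only for illustration.

```python
def allocated_size(file_size: int, unit_size: int = 4096) -> int:
    """Round file_size up to a whole number of allocation units.

    unit_size is the smallest addressable amount of space (the text
    above calls this a sector); 4096 is an assumed example value.
    """
    if file_size < 0 or unit_size <= 0:
        raise ValueError("file_size must be >= 0 and unit_size must be > 0")
    units = -(-file_size // unit_size)  # ceiling division on integers
    return units * unit_size


def slack_space(file_size: int, unit_size: int = 4096) -> int:
    """Unused bytes left over in the last unit occupied by the file."""
    return allocated_size(file_size, unit_size) - file_size


print(allocated_size(10_000))  # 12288
print(slack_space(10_000))     # 2288
print(slack_space(8_192))      # 0, the file fills its units exactly
```

With these assumed values, a 10,000-byte file occupies three 4096-byte units, so 2288 bytes are slack space. A smaller unit reduces the slack per file but increases the number of units to track.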
Maximum size
The maximum file size a file system supports depends not only on the capacity of the file system, but also on the number of bits reserved for the storage of file size information. The maximum file size in the FAT32 file system, for example, is 4,294,967,295 bytes (2^32 − 1), which is one byte less than four gibibytes (4 GiB, often written as 4 GB).[3] The table below details the maximum file size for a number of common or historical file systems.
File system | Maximum size[a] |
---|---|
APFS | 8 EB |
exFAT | 16 EB - 1 byte |
FAT12 | 16 MB (4 KB clusters) or 32 MB (8 KB clusters) |
FAT16B | 2 GB (without LFS) or 4 GB - 1 byte (with LFS) |
FAT32 | 4 GB - 1 byte |
HFS | 2 GB |
HFS+ | 8 EB |
HPFS | 2 GB |
NTFS | 16 EB - 1 KB |
Btrfs | 16 EB |
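The relationship between the width of the size field and the largest representable file size can be checked with simple arithmetic. The sketch below assumes the size is stored as an unsigned integer byte count, as in the 32-bit FAT32 case; the helper name `max_file_size` is hypothetical, and other file systems may impose further limits unrelated to the field width.

```python
def max_file_size(size_field_bits: int) -> int:
    """Largest byte count an unsigned field of the given width can hold."""
    return 2 ** size_field_bits - 1


fat32_max = max_file_size(32)
print(fat32_max)              # 4294967295
print(4 * 2**30 - fat32_max)  # 1, i.e. one byte short of 4 GiB
```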
Units of information
The byte is the typical base unit of information. Larger files usually have their sizes expressed in kilobytes, megabytes or gigabytes, depending on how large the file is. Sizes shown in these larger units are rounded and therefore less precise than a byte count, but most operating systems expose the exact size in bytes in the file's properties. Command-line tools can also report the exact byte size.
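As an illustration of reading the exact byte count programmatically, the following sketch uses Python's standard `os.stat` call. The helper name `exact_size` and the temporary file with 1500 bytes of content are assumptions made only so the example is self-contained.

```python
import os
import tempfile


def exact_size(path: str) -> int:
    """Return the exact size of a file in bytes, as reported by the OS."""
    return os.stat(path).st_size


# Create a throwaway file with a known length so the example is runnable.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1500)
    tmp_path = tmp.name

size = exact_size(tmp_path)
os.remove(tmp_path)  # clean up the temporary file
print(size)  # 1500
```

The value returned is the amount of data in the file, not the possibly larger amount of storage the file consumes.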
Some operating systems display all sizes using metric (decimal) units, in which case the lowercase 'kB' shown for small files may be the only visible indication of this. Others display all sizes using the binary units traditionally used on computers, e.g. 'KB'. Hard disk manufacturers use metric units (e.g. 1 GB = 1,000,000,000 bytes and 1 TB = 1000 GB).
The kilobyte (KB) in the JEDEC sense, equal to 1024 bytes, is referred to unambiguously as the kibibyte (KiB) under the IEC convention. The symbol kB, with the lowercase SI prefix 'k' for kilo (1000), is also used, and then always denotes 1000 bytes.
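The difference between the two conventions is easiest to see by formatting the same byte count both ways. The following is a minimal sketch; the function name `format_size`, the two-decimal rounding, and the choice of IEC symbols (KiB, MiB) for the binary case are assumptions for illustration, not the behavior of any particular operating system.

```python
def format_size(num_bytes: int, binary: bool = False) -> str:
    """Format a byte count with metric (1000) or binary (1024) prefixes."""
    base = 1024 if binary else 1000
    units = (["B", "KiB", "MiB", "GiB", "TiB"] if binary
             else ["B", "kB", "MB", "GB", "TB"])
    value = float(num_bytes)
    index = 0
    # Divide by the base until the value fits the current unit.
    while value >= base and index < len(units) - 1:
        value /= base
        index += 1
    if index == 0:
        return f"{num_bytes} B"
    return f"{value:.2f} {units[index]}"


print(format_size(1_000_000_000))               # 1.00 GB
print(format_size(1_000_000_000, binary=True))  # 953.67 MiB
print(format_size(1536, binary=True))           # 1.50 KiB
```

A drive sold as 1 GB in metric units therefore appears as roughly 953.67 MiB when the same number of bytes is shown with binary prefixes.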
File transfers (e.g. "downloads") may report rates in units of bytes (e.g. MB/s), sometimes using binary rather than metric prefixes, while networking hardware, such as Wi-Fi, always uses metric units of bits (Mbit/s, Gbit/s, etc.). A network link also needs to send more than the files themselves, so some overhead needs to be factored in. As a result, superficially similar terms can be very incompatible.
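The gap between a link rate in bits and a transfer rate in bytes can be made concrete with a short conversion. This sketch uses hypothetical helper names, and the 5% overhead figure is an arbitrary assumed value, not a measured property of any network.

```python
def mbit_to_mib_per_s(mbit_per_s: float) -> float:
    """Convert a metric bit rate (Mbit/s) to a binary byte rate (MiB/s)."""
    bytes_per_s = mbit_per_s * 1_000_000 / 8
    return bytes_per_s / 2**20


def transfer_seconds(file_bytes: int, link_mbit_per_s: float,
                     overhead: float = 0.0) -> float:
    """Estimate transfer time; overhead is the assumed fraction of link
    capacity not available for file data (0.05 means 5%)."""
    usable_bits_per_s = link_mbit_per_s * 1_000_000 * (1 - overhead)
    return file_bytes * 8 / usable_bits_per_s


print(round(mbit_to_mib_per_s(100), 2))                      # 11.92
print(transfer_seconds(1_000_000_000, 100))                  # 80.0
print(round(transfer_seconds(1_000_000_000, 100, 0.05), 2))  # 84.21
```

Under these assumptions, a 100 Mbit/s link moves at most about 11.92 MiB/s, so a 1 GB (metric) file takes at least 80 seconds, and longer once overhead is included.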
Notes
- [a] Figures are based on the format standard; individual implementations may have different limits. See the respective file system article for details.
References
- [1] JEDEC Solid State Technology Association (November 2019). "Terms, Definitions, and Letter Symbols for Microprocessors, and Memory Integrated Circuits". JESD 100B.01. p. 8. Retrieved 2009-04-05.
- [2] "What is Slack Space?". IT Pro. 2010-01-19. Retrieved 2018-02-17.
- [3] "Microsoft Extensible Firmware Initiative FAT32 File System Specification, FAT: General Overview of On-Disk Format". Microsoft. 2000-12-06. Retrieved 2011-07-03.