Not all files are created equal. At the most fundamental level, every file on your computer is just bytes – but what those bytes mean is what separates a program from a photo.
Text Files vs Binary Files
Text files like JSON, Python scripts, and HTML are human-readable. Their bytes decode to characters (plus line breaks and tabs) under a text encoding such as ASCII or UTF-8. You can open them in any text editor and understand them directly.
Binary files – executables, images, databases – store raw structured data. Bytes represent integers, compressed data, memory addresses, and checksums. Printing them to a terminal produces garbled noise, because terminals expect text, not raw bytes.
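The text/binary split can be approximated in a few lines of Python. This is only a heuristic sketch, and the function names (looks_like_text, is_text_file) are made up for illustration: it treats a sample as text if it contains no NUL bytes and decodes cleanly as UTF-8. Real tools use more elaborate checks.

```python
import codecs


def looks_like_text(sample: bytes) -> bool:
    """Heuristic: no NUL bytes and valid UTF-8 means 'probably text'."""
    if b"\x00" in sample:
        return False
    # An incremental decoder tolerates a multi-byte character that was
    # cut off at the end of the sample (final=False).
    decoder = codecs.getincrementaldecoder("utf-8")()
    try:
        decoder.decode(sample, final=False)
    except UnicodeDecodeError:
        return False
    return True


def is_text_file(path: str, sample_size: int = 4096) -> bool:
    """Apply the heuristic to the first sample_size bytes of a file."""
    with open(path, "rb") as f:
        return looks_like_text(f.read(sample_size))
```

A JSON snippet passes, while a PNG header fails on its very first byte (0x89 is not a valid start of a UTF-8 character).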
What Makes a Binary Executable?
Two things must be true for an OS to run a file:
- The execute permission bit must be set (chmod +x)
- Magic bytes at the start must identify a known executable format: on Linux, \x7FELF; on Windows, MZ
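Once the permission check passes, the kernel reads those first bytes before doing anything else. A PNG or PDF will be rejected with an "Exec format error" even if its execute bit is set; without the execute bit, the attempt fails earlier with "Permission denied".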
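The two conditions above can be checked from user space before you ever try to run a file. The sketch below is a rough pre-flight check, not what the kernel actually does: the helper names are hypothetical, the table only contains the two signatures mentioned in this article, and it asks whether any execute bit is set rather than whether the current user may execute the file.

```python
import os
import stat

# Only the signatures named in the article; a real kernel knows more.
EXECUTABLE_MAGIC = {
    b"\x7fELF": "ELF",
    b"MZ": "MZ",
}


def executable_format(head: bytes):
    """Return the name of the matching executable signature, or None."""
    for magic, name in EXECUTABLE_MAGIC.items():
        if head.startswith(magic):
            return name
    return None


def has_execute_bit(mode: int) -> bool:
    """True if any of the user/group/other execute bits is set."""
    return bool(mode & (stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH))


def explain(path: str) -> str:
    """Describe which of the two conditions a file satisfies."""
    mode = os.stat(path).st_mode
    with open(path, "rb") as f:
        head = f.read(4)
    if not has_execute_bit(mode):
        return "execute bit not set"
    fmt = executable_format(head)
    if fmt is None:
        return "execute bit set, but no known executable magic bytes"
    return f"execute bit set and {fmt} magic found"
```

Calling explain() on a PNG that has been marked executable reports the second case, which is the situation where the kernel would answer with "Exec format error".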
The Shebang Bridge
Python scripts are text, yet they run. The trick is the shebang (#!/usr/bin/env python3) on line one. The kernel recognises #! as its own magic number and hands the file off to the named interpreter, here /usr/bin/env, which in turn looks up python3 on your PATH. The execute bit still has to be set. The script never runs directly – Python does.
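To make the hand-off concrete, here is a small sketch of how a shebang line turns into the command that actually gets executed. The function names are invented for this example, and the parsing mirrors Linux behaviour, where everything after the interpreter path is passed as a single argument; other systems may split it differently.

```python
def parse_shebang(first_line: bytes):
    """Return (interpreter, optional_argument), or None if there is no shebang."""
    if not first_line.startswith(b"#!"):
        return None
    rest = first_line[2:].strip()
    if not rest:
        return None
    # Split once: the interpreter path, then the remainder as one argument.
    parts = rest.split(None, 1)
    interpreter = parts[0].decode()
    argument = parts[1].decode() if len(parts) > 1 else None
    return interpreter, argument


def effective_argv(script_path: str, first_line: bytes):
    """Build the argument list the interpreter is started with."""
    parsed = parse_shebang(first_line)
    if parsed is None:
        return None
    interpreter, argument = parsed
    argv = [interpreter]
    if argument is not None:
        argv.append(argument)
    argv.append(script_path)
    return argv
```

For a script.py starting with #!/usr/bin/env python3, effective_argv gives ["/usr/bin/env", "python3", "script.py"]: the script's path is just an argument to the interpreter.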
How to Inspect a File Yourself
You don’t have to guess – there are simple tools to peek inside any file:
file – identify the format
file peers.dat # peers.dat: data
file image.png # PNG image data, 800 x 600
file /bin/bash # ELF 64-bit LSB pie executable
The file command reads the magic bytes and tells you what it thinks the file is. If it prints data, no known magic number matched and the contents did not look like plain text either.
cat – dump raw contents
cat peers.dat
If your terminal fills with garbled symbols and strange characters, it’s binary. If you can read it, it’s text. Be warned — printing binary to a terminal can scramble your prompt; run reset to fix it.
xxd – view the actual bytes in hex
xxd peers.dat | head -20
xxd image.png | head -4
This is the safe way to inspect binary files. Each byte is shown as a hex value alongside any printable ASCII. The very first bytes reveal the magic number – look for 7f 45 4c 46 (ELF) or 89 50 4e 47 (PNG).
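If xxd is not installed, the same idea is easy to sketch in Python. The hexdump function below is a hypothetical helper, not a clone of xxd: it prints one byte per hex pair (xxd groups them in twos by default) with the printable ASCII alongside and a dot for everything else.

```python
def hexdump(data: bytes, width: int = 16) -> str:
    """Render bytes as offset, hex values, and printable ASCII."""
    pad = width * 3 - 1  # width of a full hex column, spaces included
    lines = []
    for offset in range(0, len(data), width):
        chunk = data[offset:offset + width]
        hex_part = " ".join(f"{b:02x}" for b in chunk)
        # Non-printable bytes are shown as a dot, as xxd does.
        ascii_part = "".join(chr(b) if 32 <= b < 127 else "." for b in chunk)
        lines.append(f"{offset:08x}: {hex_part:<{pad}}  {ascii_part}")
    return "\n".join(lines)


if __name__ == "__main__":
    # The 8-byte PNG signature: 89 50 4e 47 0d 0a 1a 0a
    print(hexdump(b"\x89PNG\r\n\x1a\n"))
```

To dump a real file, read the first few bytes in binary mode, for example open("image.png", "rb").read(64), and pass them to hexdump.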
strings – extract readable text from a binary
strings peers.dat
It pulls out runs of printable characters embedded in the file (by default, sequences of at least four) – useful for spotting version numbers, paths, or format identifiers hidden inside a binary blob.
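The core of strings fits in a single regular expression. This is a simplified sketch with a made-up function name: it only looks for runs of printable ASCII (0x20 to 0x7e) of a minimum length, whereas the real tool has more options and handles other encodings.

```python
import re


def extract_strings(data: bytes, min_len: int = 4):
    """Return runs of printable ASCII that are at least min_len bytes long."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.group().decode("ascii") for match in pattern.finditer(data)]


if __name__ == "__main__":
    blob = b"\x00\x01version 1.2\x00ab\x00/usr/lib\xff"
    print(extract_strings(blob))
```

In the sample blob, "version 1.2" and "/usr/lib" are found, while "ab" is dropped because it is shorter than four characters.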
MIME Types
A MIME type is a standardised label that describes what kind of data a file contains. Originally invented for email attachments, it is now used across the web, operating systems, and APIs. It always follows a two-part format: a type and a subtype, like image/png, text/html, or application/json.
MIME types solve the same problem as magic numbers but from a different angle. Magic numbers are baked into the file itself. MIME types are declared externally, by a server, an email client, or an application passing the file around. When a web server sends a file to your browser, it includes a Content-Type header with the MIME type so the browser knows whether to render it as an image, play it as audio, or display it as text.
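A server has to get that label from somewhere, and a common shortcut is the file name. The sketch below uses Python's standard mimetypes module, which guesses from the extension alone and never opens the file; the helper name content_type_header and the fallback to application/octet-stream (the generic type for arbitrary binary data) are choices made for this example.

```python
import mimetypes


def content_type_header(filename: str) -> str:
    """Build a Content-Type header line from the file name alone."""
    mime, _encoding = mimetypes.guess_type(filename)
    if mime is None:
        # Unknown extension: fall back to the generic binary type.
        mime = "application/octet-stream"
    return f"Content-Type: {mime}"


if __name__ == "__main__":
    for name in ("image.png", "index.html", "no_extension"):
        print(name, "->", content_type_header(name))
```

Because nothing here inspects the bytes, renaming a file changes the declared type. That is the sense in which MIME types are external to the file.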
Finding the MIME Type of a File
The file command can return a MIME type directly using the -i flag (long form --mime on Linux; on macOS the equivalent flag is -I):
file -i image.png
# image.png: image/png; charset=binary
file -i peers.dat
# peers.dat: application/octet-stream; charset=binary
application/octet-stream is the catch-all MIME type for an unknown binary file, the same way data is the catch-all for the plain file command. Both are powered by the same magic number database under the hood, just expressed in different formats.
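One way to picture that shared database is a single table where each signature carries both a description and a MIME type. The toy version below is an illustration only: the table holds just two entries, the names are hypothetical, and the real database used by file is far larger and more subtle.

```python
# (magic bytes, human-readable description, MIME type)
SIGNATURES = [
    (b"\x89PNG\r\n\x1a\n", "PNG image data", "image/png"),
    (b"%PDF-", "PDF document", "application/pdf"),
]


def identify(head: bytes, mime: bool = False) -> str:
    """Look up the leading bytes; choose the output style with mime."""
    for magic, description, mime_type in SIGNATURES:
        if head.startswith(magic):
            return mime_type if mime else description
    # Nothing matched: each output style has its own catch-all.
    return "application/octet-stream" if mime else "data"


if __name__ == "__main__":
    png_head = b"\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR"
    print(identify(png_head))             # description style
    print(identify(png_head, mime=True))  # MIME style
    print(identify(b"\x01\x02\x03\x04"), identify(b"\x01\x02\x03\x04", mime=True))
```

The lookup is identical in both modes; only the column that gets returned differs.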
Key Takeaway
As far as the kernel is concerned, file extensions are cosmetic. What really matters when a file is executed is the bytes inside – magic numbers identify the format, and the execute bit grants permission. At that level, content wins over naming.