Security Fundamentals September 15, 2025 10 min read By Brickell Technologies

Encoding vs Encryption vs Hashing: A Security Practitioner's Guide

These three concepts are constantly confused - even by developers who should know better. This guide breaks down what each one does, when to use it, and the security implications of choosing wrong.

In security assessments, we routinely find critical vulnerabilities caused by confusing these three concepts. Passwords stored with Base64 "encryption." Sensitive data "protected" with MD5 hashing. JWTs with the algorithm set to "none." These aren't obscure edge cases - they're common mistakes born from fundamental misunderstanding.

Let's fix that. Here's the definitive breakdown of encoding, encryption, and hashing - what they are, how they work, and when to use each.

The TL;DR Comparison

Property	Encoding	Encryption	Hashing
Purpose	Data format transformation	Confidentiality (hiding data)	Integrity verification
Reversible?	Yes, by anyone	Yes, with the key	No (one-way)
Uses a key?	No	Yes	No (but can be keyed: HMAC)
Security goal?	None	Confidentiality	Integrity, password storage
Same input = same output?	Yes	Depends (IV/nonce)	Yes

Encoding: Format Transformation, Not Security

Encoding in One Sentence

Encoding transforms data into a different format for compatibility or transmission - it provides zero security because anyone can decode it without any secret.

Encoding exists to solve format problems, not security problems. When you need to transmit binary data through a text-only channel (like email or URLs), you encode it. When you need to represent special characters safely in HTML, you encode them. When you need to store binary in JSON, you encode it.

Common Encoding Schemes

Base64: Converts binary to ASCII text using 64 characters (A-Z, a-z, 0-9, +, /), with = as padding. Output is ~33% larger than input. Used for embedding images in HTML, email attachments (MIME), and data URLs.

Base64 Example
# Encoding
$ echo -n "Hello, World!" | base64
SGVsbG8sIFdvcmxkIQ==

# Decoding
$ echo "SGVsbG8sIFdvcmxkIQ==" | base64 -d
Hello, World!

# Python
import base64
encoded = base64.b64encode(b"Hello, World!")  # b'SGVsbG8sIFdvcmxkIQ=='
decoded = base64.b64decode(encoded)            # b'Hello, World!'
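
The ~33% overhead follows from Base64 mapping every 3 input bytes to 4 output characters, padded up to a multiple of 4. A minimal check of that arithmetic, where the helper name `base64_length` and the sample sizes are illustrative choices rather than anything from a library:

```python
import base64
import math

def base64_length(n_bytes: int) -> int:
    """Length of padded Base64 output for n_bytes of input."""
    return 4 * math.ceil(n_bytes / 3)

for n in (1, 2, 3, 300):
    actual = len(base64.b64encode(bytes(n)))
    print(n, actual, base64_length(n))
# 1 4 4
# 2 4 4
# 3 4 4
# 300 400 400
```

For short inputs the padding dominates; the overhead only settles near one third once the input is more than a few bytes long.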

URL Encoding (Percent Encoding): Represents unsafe characters in URLs as %XX hexadecimal escapes. A space becomes %20, or + in form-encoded query strings (application/x-www-form-urlencoded). Required for query parameters containing special characters.

URL Encoding Example
# Original
Hello World & Goodbye

# URL Encoded
Hello%20World%20%26%20Goodbye

# Python
from urllib.parse import quote, unquote
encoded = quote("Hello World & Goodbye")  # 'Hello%20World%20%26%20Goodbye'
decoded = unquote(encoded)                 # 'Hello World & Goodbye'
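
The two space conventions map to different standard-library functions in Python. A short sketch, using the same sample string as above (the query parameter name `q` is just an illustration):

```python
from urllib.parse import quote, quote_plus, unquote_plus, urlencode

text = "Hello World & Goodbye"

path_style = quote(text)        # spaces become %20
form_style = quote_plus(text)   # spaces become +
query = urlencode({"q": text})  # builds key=value pairs, form style

print(path_style)  # Hello%20World%20%26%20Goodbye
print(form_style)  # Hello+World+%26+Goodbye
print(query)       # q=Hello+World+%26+Goodbye

# Decode with the matching function, or a literal + will be misread.
print(unquote_plus(form_style) == text)  # True
```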

HTML Encoding: Represents characters that have special meaning in HTML. Prevents the browser from interpreting user input as markup.

HTML Encoding Example
# Original (XSS payload)
<script>alert('xss')</script>

# HTML Encoded (safe to display)
&lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;

# Key entities:
# <   becomes  &lt;
# >   becomes  &gt;
# &   becomes  &amp;
# "    becomes  &quot;
# '    becomes  &#x27;
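
In Python, the standard library's `html.escape` applies these substitutions. This is a minimal sketch for text placed in element content or quoted attributes; other contexts (inline JavaScript, URLs) need their own encoders, and most template engines escape automatically.

```python
import html

payload = "<script>alert('xss')</script>"
escaped = html.escape(payload)  # quote=True by default, so " and ' are escaped too

print(escaped)
# &lt;script&gt;alert(&#x27;xss&#x27;)&lt;/script&gt;

# Like every encoding, it is trivially reversible.
print(html.unescape(escaped) == payload)  # True
```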

Hex Encoding: Represents each byte as two hexadecimal characters. Common in debugging, shellcode, and low-level data representation.

Hex Encoding Example
# ASCII "ABC" in hex
41 42 43

# Python
data = b"ABC"
hex_encoded = data.hex()           # '414243'
decoded = bytes.fromhex('414243')  # b'ABC'

Security Anti-Pattern

Base64 is not encryption. We regularly find applications that "protect" API keys, passwords, or sensitive configuration with Base64 encoding. This provides zero security - anyone can decode Base64 instantly without any key or secret. If you're Base64 encoding something to "hide" it, you're doing it wrong.

Encryption: Confidentiality Through Keys

Encryption in One Sentence

Encryption transforms readable data (plaintext) into unreadable data (ciphertext) using a key - only someone with the correct key can reverse the process.

Encryption provides confidentiality. It ensures that even if an attacker intercepts the data, they cannot read it without the key. Unlike encoding, encryption is specifically designed to resist unauthorized reversal.

Symmetric Encryption

Symmetric encryption uses the same key for encryption and decryption. It's fast and efficient for large amounts of data.

AES-256-GCM Encryption (Python)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
import os

# Generate a random 256-bit key (store this securely!)
key = os.urandom(32)  # 32 bytes = 256 bits

# Encrypt
plaintext = b"Sensitive data here"
nonce = os.urandom(12)  # 96-bit nonce for GCM
aesgcm = AESGCM(key)
ciphertext = aesgcm.encrypt(nonce, plaintext, None)

# Decrypt
decrypted = aesgcm.decrypt(nonce, ciphertext, None)
# decrypted == b"Sensitive data here"

# The ciphertext is meaningless without the key
# Even with the algorithm known, brute-forcing 256-bit AES is infeasible

Key symmetric algorithms:

AES-256-GCM: Current gold standard. Authenticated encryption provides confidentiality AND integrity. Use this.
ChaCha20-Poly1305: Modern alternative to AES. Faster in software without hardware acceleration. Used by WireGuard and TLS 1.3.
AES-256-CBC: Older mode. Requires a separate MAC for integrity. Vulnerable to padding oracle attacks if implemented incorrectly.

Avoid Deprecated Algorithms

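DES, 3DES, RC4, and Blowfish are obsolete. DES has a 56-bit key (breakable in hours). 3DES and Blowfish use 64-bit blocks, which exposes them to birthday attacks such as Sweet32 on long-lived connections. RC4 has known keystream biases. If you see these in production code, they need to be replaced.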

Asymmetric Encryption

Asymmetric encryption uses a key pair: a public key for encryption and a private key for decryption. Anyone can encrypt with your public key, but only you can decrypt with your private key.

RSA Encryption (Python)
from cryptography.hazmat.primitives.asymmetric import rsa, padding
from cryptography.hazmat.primitives import hashes

# Generate key pair
private_key = rsa.generate_private_key(
    public_exponent=65537,
    key_size=4096  # Use 4096 bits for long-term security
)
public_key = private_key.public_key()

# Encrypt with public key (anyone can do this)
plaintext = b"Secret message"
ciphertext = public_key.encrypt(
    plaintext,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)

# Decrypt with private key (only key owner)
decrypted = private_key.decrypt(
    ciphertext,
    padding.OAEP(
        mgf=padding.MGF1(algorithm=hashes.SHA256()),
        algorithm=hashes.SHA256(),
        label=None
    )
)

Key asymmetric algorithms:

RSA: The classic. Use 4096-bit keys for new applications. Slower than symmetric encryption, but it solves the key distribution problem.
ECDH (Elliptic Curve Diffie-Hellman): Key exchange protocol. Used to establish shared secrets for symmetric encryption.
ECIES: Elliptic curve encryption scheme. Combines ECDH key exchange with symmetric encryption.
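
As a rough illustration of the ECDH entry, the sketch below uses X25519 (ECDH over Curve25519) from the `cryptography` package, followed by HKDF to turn the raw shared secret into a symmetric key. The helper name `derive_key` and the `info` label are hypothetical, chosen for this example only.

```python
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric.x25519 import X25519PrivateKey
from cryptography.hazmat.primitives.kdf.hkdf import HKDF

def derive_key(private_key, peer_public_key) -> bytes:
    """ECDH exchange, then HKDF to derive a 256-bit symmetric key."""
    shared_secret = private_key.exchange(peer_public_key)
    return HKDF(
        algorithm=hashes.SHA256(),
        length=32,
        salt=None,
        info=b"example handshake",  # hypothetical context label
    ).derive(shared_secret)

alice_private = X25519PrivateKey.generate()
bob_private = X25519PrivateKey.generate()

# Each side combines its own private key with the other's public key.
alice_key = derive_key(alice_private, bob_private.public_key())
bob_key = derive_key(bob_private, alice_private.public_key())

print(alice_key == bob_key)  # True
```

On its own this exchange is unauthenticated: nothing here proves whose public key was received, so real protocols pair it with signatures or certificates.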

Hybrid Encryption

In practice, asymmetric and symmetric encryption are combined. Asymmetric encryption is slow and limited in message size. The common pattern: generate a random symmetric key, encrypt the data with it (AES), then encrypt the symmetric key with the recipient's public key (RSA). This gives you the best of both worlds.
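
The pattern described above can be sketched with the same `cryptography` primitives used elsewhere in this guide. The function names `hybrid_encrypt` and `hybrid_decrypt` are illustrative, and the 2048-bit key is only there to keep the demo fast; treat this as a sketch of the idea, not a vetted message format.

```python
import os
from cryptography.hazmat.primitives import hashes
from cryptography.hazmat.primitives.asymmetric import padding, rsa
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

OAEP = padding.OAEP(
    mgf=padding.MGF1(algorithm=hashes.SHA256()),
    algorithm=hashes.SHA256(),
    label=None,
)

def hybrid_encrypt(public_key, plaintext: bytes):
    data_key = AESGCM.generate_key(bit_length=256)  # random one-time key
    nonce = os.urandom(12)
    ciphertext = AESGCM(data_key).encrypt(nonce, plaintext, None)
    wrapped_key = public_key.encrypt(data_key, OAEP)  # RSA protects only the key
    return wrapped_key, nonce, ciphertext

def hybrid_decrypt(private_key, wrapped_key, nonce, ciphertext) -> bytes:
    data_key = private_key.decrypt(wrapped_key, OAEP)
    return AESGCM(data_key).decrypt(nonce, ciphertext, None)

# Demo-sized key; use a larger key size in production.
private_key = rsa.generate_private_key(public_exponent=65537, key_size=2048)
message = b"x" * 10_000  # far larger than RSA-OAEP could encrypt directly
parts = hybrid_encrypt(private_key.public_key(), message)

print(hybrid_decrypt(private_key, *parts) == message)  # True
```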

Critical Encryption Concepts

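Initialization Vector (IV) / Nonce: A per-message value, usually random, that ensures the same plaintext encrypts to different ciphertext each time. Without it, attackers can detect repeated messages. It does not need to be secret, but never reuse one with the same key; in GCM, a repeated nonce undermines both confidentiality and integrity.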
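
One common way to handle this, sketched here with AES-GCM from the `cryptography` package, is to generate a fresh nonce inside the encrypt function and store it in front of the ciphertext. The `encrypt` and `decrypt` helpers and the nonce-prefix layout are assumptions made for this example.

```python
import os
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

NONCE_SIZE = 12  # 96 bits, the usual GCM nonce length

def encrypt(key: bytes, plaintext: bytes) -> bytes:
    nonce = os.urandom(NONCE_SIZE)  # fresh for every message
    return nonce + AESGCM(key).encrypt(nonce, plaintext, None)

def decrypt(key: bytes, blob: bytes) -> bytes:
    nonce, ciphertext = blob[:NONCE_SIZE], blob[NONCE_SIZE:]
    return AESGCM(key).decrypt(nonce, ciphertext, None)

key = AESGCM.generate_key(bit_length=256)
first = encrypt(key, b"same message")
second = encrypt(key, b"same message")

print(first != second)                              # True
print(decrypt(key, first) == decrypt(key, second))  # True
```

Because callers never pass a nonce in, they cannot accidentally reuse one.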

Authenticated Encryption: Modes like GCM and ChaCha20-Poly1305 verify that ciphertext hasn't been tampered with. Plain CBC mode doesn't - an attacker can flip bits in ciphertext and produce valid (but corrupted) plaintext. Always use authenticated encryption.
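
A small demonstration of that tamper detection, assuming the `cryptography` package; the plaintext and the choice of which bit to flip are arbitrary.

```python
import os
from cryptography.exceptions import InvalidTag
from cryptography.hazmat.primitives.ciphers.aead import AESGCM

key = AESGCM.generate_key(bit_length=256)
nonce = os.urandom(12)
aesgcm = AESGCM(key)
ciphertext = aesgcm.encrypt(nonce, b"amount=100", None)

# Flip one bit in the first ciphertext byte, as an attacker might.
tampered = bytes([ciphertext[0] ^ 0x01]) + ciphertext[1:]

try:
    aesgcm.decrypt(nonce, tampered, None)
    tamper_detected = False
except InvalidTag:
    tamper_detected = True  # decryption refuses to return any plaintext

print(tamper_detected)  # True
```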

Key Management: The algorithm doesn't matter if your keys are poorly managed. Keys should never be hardcoded, committed to git, or stored alongside encrypted data. Use dedicated secrets management (HashiCorp Vault, AWS KMS, etc.).

Hashing: One-Way Fingerprints

Hashing in One Sentence

A hash function takes input of any size and produces a fixed-size output (digest) that cannot be reversed - even tiny input changes produce completely different outputs.

Hashing is fundamentally different from encryption: you cannot recover the original input from a hash. This isn't a limitation - it's the point. Hashes verify integrity and store passwords without keeping the actual password.

Hash Function Properties

Deterministic: Same input always produces same hash
One-way: Cannot derive input from hash (preimage resistance)
Collision resistant: Infeasible to find two inputs with the same hash
Avalanche effect: Small input change = completely different hash
Fixed output size: Regardless of input length

SHA-256 Hashing
import hashlib

# Same input = same hash (deterministic)
hash1 = hashlib.sha256(b"Hello").hexdigest()
hash2 = hashlib.sha256(b"Hello").hexdigest()
# hash1 == hash2 == '185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969'

# Tiny change = completely different hash (avalanche effect)
hash3 = hashlib.sha256(b"hello").hexdigest()  # lowercase 'h'
# hash3 == '2cf24dba5fb0a30e26e83b2ac5b9e29e1b161e5c1fa7425e73043362938b9824'

# No way to reverse the hash back to "Hello"
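
To put a rough number on the avalanche effect, the sketch below counts how many of the 256 output bits differ between the two digests. The helper `differing_bits` is written for this example.

```python
import hashlib

def differing_bits(a: bytes, b: bytes) -> int:
    """Count bit positions where two equal-length byte strings differ."""
    return sum(bin(x ^ y).count("1") for x, y in zip(a, b))

d1 = hashlib.sha256(b"Hello").digest()
d2 = hashlib.sha256(b"hello").digest()

print(len(d1) * 8)             # 256 (fixed output size)
print(differing_bits(d1, d2))  # expected to land near half of the 256 bits
```

A good hash function should flip each output bit with probability close to one half, which is why the count clusters around 128 rather than tracking the size of the input change.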

Common Hash Functions

Algorithm	Output Size	Status	Use Case
MD5	128 bits	Broken	Checksums only (non-security)
SHA-1	160 bits	Broken	Legacy systems only
SHA-256	256 bits	Secure	General purpose, integrity
SHA-384/512	384/512 bits	Secure	High-security applications
SHA-3	Variable	Secure	Alternative to SHA-2
BLAKE2	Variable	Secure	Fast, modern applications
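
For the integrity use case, a typical pattern is to hash a file in chunks and compare the result with a published digest. This is a minimal sketch: the helper name `sha256_file` and the temporary demo file are assumptions, and the reference digest is the SHA-256 of "Hello" shown earlier.

```python
import hashlib
import os
import tempfile

def sha256_file(path: str, chunk_size: int = 65536) -> str:
    """Hash a file in chunks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

# Demo with a throwaway file containing b"Hello".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"Hello")
file_digest = sha256_file(tmp.name)
os.remove(tmp.name)

published = "185f8db32271fe25f561a6fc938b2e264306ec304eda518007d1764826381969"
print(file_digest == published)  # True
```

A plain hash only proves the file matches the reference digest. If an attacker can replace both the file and the digest, a keyed construction or a signature is needed instead.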

MD5 and SHA-1 Are Broken

Collision attacks against MD5 are trivial - you can generate colliding PDFs on a laptop. A SHA-1 collision was demonstrated in 2017 (the SHAttered attack). Never use MD5 or SHA-1 for security purposes. They're acceptable only for non-security checksums (file deduplication, cache keys) where no attacker controls the input.

Password Hashing: A Special Case

General-purpose hash functions (SHA-256, etc.) are too fast for passwords. An attacker with a GPU can compute billions of SHA-256 hashes per second. Password hashing algorithms are intentionally slow:

Password Hashing with bcrypt
import bcrypt

# Hash a password (slow by design)
password = b"user_password_123"
salt = bcrypt.gensalt(rounds=12)  # 2^12 iterations
hashed = bcrypt.hashpw(password, salt)
# b'$2b$12$LQv3c1yqBWVHxkd0LHAkCOYz6TtxMQJqhN8/X4.VTtYWBw6Ck1Wqa'

# Verify a password
if bcrypt.checkpw(password, hashed):
    print("Password matches")

# The salt is embedded in the hash - no separate storage needed
# Work factor (rounds=12) makes each guess expensive; raise it as hardware gets faster

Password hashing algorithms:

Argon2id: Winner of the Password Hashing Competition (2015). Memory-hard, resistant to GPU/ASIC attacks. The current recommendation.
bcrypt: Battle-tested since 1999. CPU-hard with configurable work factor. Still an excellent choice.
scrypt: Memory-hard. Designed to resist hardware attacks. Good, but Argon2 is preferred.
PBKDF2: NIST approved. Uses HMAC with many iterations. Acceptable but weaker than memory-hard alternatives.
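
Of the algorithms listed, PBKDF2 is the one available in Python's standard library, which makes it a reasonable fallback when third-party packages are not an option. A minimal sketch follows; the iteration count is an assumption to be tuned against your hardware and current guidance, and the helper names are illustrative.

```python
import hashlib
import hmac
import os

ITERATIONS = 600_000  # assumption: tune to your hardware and current guidance

def hash_password(password: str) -> tuple[bytes, bytes]:
    salt = os.urandom(16)  # unique per password, stored next to the hash
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return salt, derived

def verify_password(password: str, salt: bytes, expected: bytes) -> bool:
    derived = hashlib.pbkdf2_hmac("sha256", password.encode(), salt, ITERATIONS)
    return hmac.compare_digest(derived, expected)  # constant-time comparison

salt, stored = hash_password("user_password_123")
print(verify_password("user_password_123", salt, stored))  # True
print(verify_password("wrong_password", salt, stored))     # False
```

Unlike bcrypt and Argon2, this approach leaves the salt and iteration count for you to store alongside the hash.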

Modern Password Hashing with Argon2
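from argon2 import PasswordHasher
from argon2.exceptions import VerifyMismatchError

ph = PasswordHasher(
    time_cost=3,        # Number of iterations
    memory_cost=65536,  # Memory in KiB (65536 KiB = 64 MiB)
    parallelism=4       # Parallel threads
)

# Hash (a random salt is generated and embedded in the output)
password_hash = ph.hash("user_password_123")
# $argon2id$v=19$m=65536,t=3,p=4$c29tZXNhbHQ$...

# Verify (raises VerifyMismatchError on a wrong password)
try:
    ph.verify(password_hash, "user_password_123")
    print("Password valid")
except VerifyMismatchError:
    print("Invalid password")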

HMAC: Keyed Hashing

HMAC (Hash-based Message Authentication Code) adds a secret key to the hash process. It provides integrity AND authenticity - only someone with the key can generate a valid HMAC.

HMAC Example
import hmac
import hashlib

key = b"shared_secret_key"
message = b"Data to authenticate"

# Generate HMAC
mac = hmac.new(key, message, hashlib.sha256).hexdigest()
# 'a4b9c8d7e6f5...'  (64 hex chars for SHA-256)

# Verify HMAC (constant-time comparison to prevent timing attacks)
received_mac = "a4b9c8d7e6f5..."  # value that arrived with the message
if hmac.compare_digest(mac, received_mac):
    print("Message is authentic and unmodified")

HMAC is used for API authentication (AWS Signature), JWT signatures, cookie integrity, and secure webhooks.
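
As an example of the webhook case, a receiver recomputes the HMAC over the raw request body and compares it with the signature the sender supplied. The function names, payload, and hex-encoded signature format below are assumptions for illustration; real providers each define their own header and encoding.

```python
import hashlib
import hmac

def sign(secret: bytes, body: bytes) -> str:
    return hmac.new(secret, body, hashlib.sha256).hexdigest()

def verify_webhook(secret: bytes, body: bytes, received_signature: str) -> bool:
    expected = sign(secret, body)
    return hmac.compare_digest(expected, received_signature)

secret = b"shared_secret_key"
body = b'{"event": "payment.succeeded"}'
signature = sign(secret, body)  # what the sender would attach to the request

print(verify_webhook(secret, body, signature))          # True
print(verify_webhook(secret, body + b" ", signature))   # False (body modified)
print(verify_webhook(b"wrong_key", body, signature))    # False (wrong secret)
```

Verify against the exact bytes received, before any JSON parsing or re-serialization, since even whitespace changes alter the MAC.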

Common Mistakes and How to Fix Them

Mistake 1: Using Encoding for Security

Wrong
# "Hiding" an API key with Base64
import base64
api_key = base64.b64encode(b"sk_live_abc123xyz").decode()
# Config file contains: 'c2tfbGl2ZV9hYmMxMjN4eXo='
# This is NOT protected - anyone can decode it

Right
# Use environment variables or secrets management
import os
api_key = os.environ['API_KEY']

# Or encrypt with a key stored separately
from cryptography.fernet import Fernet
key = os.environ['ENCRYPTION_KEY']  # Stored in secrets manager
cipher = Fernet(key)
encrypted_api_key = cipher.encrypt(b"sk_live_abc123xyz")

Mistake 2: Using Fast Hashes for Passwords

Wrong
# SHA-256 is too fast for passwords
password_hash = hashlib.sha256(password.encode()).hexdigest()
# Attacker can try billions of guesses per second

Right
# Use a password-specific algorithm
import bcrypt
password_hash = bcrypt.hashpw(password.encode(), bcrypt.gensalt(rounds=12))
# Each guess takes ~250ms (hardware-dependent) - a billion guesses is roughly 8 CPU-years

Mistake 3: Encrypting Without Authentication

Wrong
# AES-CBC without MAC - vulnerable to bit-flipping attacks
from Crypto.Cipher import AES
from Crypto.Util.Padding import pad
cipher = AES.new(key, AES.MODE_CBC, iv)
ciphertext = cipher.encrypt(pad(plaintext, 16))
# Attacker can modify ciphertext and produce valid (corrupted) plaintext

Right
# Use authenticated encryption (AES-GCM)
from cryptography.hazmat.primitives.ciphers.aead import AESGCM
aesgcm = AESGCM(key)
ciphertext = aesgcm.encrypt(nonce, plaintext, associated_data)
# Any tampering is detected during decryption

Mistake 4: Hardcoding Keys

Wrong
# Key in source code = key in git history forever
ENCRYPTION_KEY = "super_secret_key_12345"

Right
# Key from environment or secrets manager
import os
ENCRYPTION_KEY = os.environ['ENCRYPTION_KEY']  # raises KeyError if unset instead of continuing with None

# Or from AWS Secrets Manager, HashiCorp Vault, etc.
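
A slightly more defensive loader might validate the key at startup so a missing or malformed value fails loudly. In this sketch the variable name, the Base64 transport format, and the `load_key` helper are all assumptions; the Base64 here is purely a format choice because environment variables hold text.

```python
import base64
import os

def load_key(env_var: str = "ENCRYPTION_KEY") -> bytes:
    """Load a Base64-encoded 256-bit key from the environment, failing loudly."""
    value = os.environ.get(env_var)
    if not value:
        raise RuntimeError(f"{env_var} is not set")
    key = base64.b64decode(value, validate=True)
    if len(key) != 32:
        raise ValueError(f"{env_var} must decode to 32 bytes, got {len(key)}")
    return key

# Demo only: a real deployment injects the variable from a secrets manager.
os.environ["ENCRYPTION_KEY"] = base64.b64encode(os.urandom(32)).decode()
key = load_key()
print(len(key))  # 32
```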

Decision Flowchart

Use this to choose the right approach:

Which One Should I Use?

Need to transmit binary as text? → Encoding (Base64)
Need to hide data from unauthorized parties? → Encryption
Need to verify data hasn't changed? → Hashing (SHA-256)
Need to store passwords? → Password hashing (Argon2, bcrypt)
Need to verify data AND prove who sent it? → HMAC or digital signatures
Need to transmit data securely between parties? → TLS (which uses all of the above)

Conclusion

The confusion between encoding, encryption, and hashing leads to real vulnerabilities. Encoding provides no security. Encryption requires proper key management and authenticated modes. Hashing is one-way and requires special algorithms for passwords.

When in doubt:

Never use encoding for security
Use AES-256-GCM or ChaCha20-Poly1305 for encryption
Use SHA-256 or BLAKE2 for integrity checking
Use Argon2id or bcrypt for passwords
Never roll your own crypto

For security assessments that identify cryptographic weaknesses in your applications, contact Brickell Technologies.
