Medium · CWE-328

SHA-1 Broken Hash Function: Detection & Auto-Fix

SHA-1 is a cryptographic hash function that has been considered broken since 2017, when researchers at CWI Amsterdam and Google demonstrated a practical collision attack (SHAttered). Follow-up research has since brought chosen-prefix collision attacks to an estimated cost of under $100,000, making SHA-1 unsuitable for digital signatures, certificate validation, or integrity verification against a deliberate attacker. Despite being deprecated by NIST, major browsers, and certificate authorities, SHA-1 continues to appear in codebases, particularly in AI-generated code. SHA-1 remains acceptable only where collision resistance is not required, such as checksums that guard against accidental corruption or legacy content-addressable storage (Git).

Why AI tools generate this vulnerability

AI Risk Factor

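AI code assistants frequently generate SHA-1 hashing code, likely because SHA-1 examples are common in their training data. When asked to "hash a string" or "generate a checksum," AI tools often default to SHA-1 or MD5 instead of SHA-256 or SHA-3. The models do not distinguish between security-critical and non-security hashing contexts, so they suggest SHA-1 for password hashing, HMAC operations, and integrity verification. These contexts fail for different reasons. Integrity verification and signatures depend on collision resistance, which SHA-1 no longer provides. Password hashing fails because any fast, unsalted hash (SHA-256 included) is cheap to brute-force, so passwords need a dedicated algorithm such as bcrypt. HMAC-SHA1 is not directly broken by collision attacks, but new code should still prefer HMAC-SHA256. In Node.js, `crypto.createHash('sha1')` is among the crypto patterns AI assistants generate most often.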

Vulnerable code example

VULNERABLE
// Node.js — AI-generated vulnerable code
import { createHash } from "crypto";

// VULNERABLE: SHA-1 is fast and unsalted, so password hashes are cheap to brute-force
function hashPassword(password: string): string {
  return createHash("sha1")
    .update(password)
    .digest("hex");
}

// VULNERABLE: SHA-1 for integrity verification
function verifyFileIntegrity(
  file: Buffer, expectedHash: string
): boolean {
  const hash = createHash("sha1")
    .update(file)
    .digest("hex");
  return hash === expectedHash;
}

# Python — AI-generated vulnerable code
import hashlib

# VULNERABLE: SHA-1 for password hashing
def hash_password(password: str) -> str:
    return hashlib.sha1(
        password.encode()
    ).hexdigest()

Secure code example

SECURE
// Node.js — SHA-256 for integrity, bcrypt for passwords
import { createHash } from "crypto";
import bcrypt from "bcrypt";

// SECURE: bcrypt for password hashing (with salt)
async function hashPassword(
  password: string
): Promise<string> {
  return bcrypt.hash(password, 12);
}

// SECURE: SHA-256 for integrity verification
function verifyFileIntegrity(
  file: Buffer, expectedHash: string
): boolean {
  const hash = createHash("sha256")
    .update(file)
    .digest("hex");
  return hash === expectedHash;
}

# Python — SHA-256 and bcrypt (secure)
import hashlib
import bcrypt

# SECURE: bcrypt for password hashing
def hash_password(password: str) -> str:
    return bcrypt.hashpw(
        password.encode(), bcrypt.gensalt(12)
    ).decode()

# SECURE: SHA-256 for integrity checks
def verify_integrity(data: bytes, expected_hash: str) -> bool:
    return hashlib.sha256(data).hexdigest() == expected_hash
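
The integrity examples above hash data that is already in memory. For large files, a minimal sketch using only the Python standard library might read the file in chunks and compare digests with `hmac.compare_digest`. The function names `sha256_file` and `verify_file_integrity` and the 64 KiB chunk size are illustrative assumptions, not part of any existing API.

```python
import hashlib
import hmac
from pathlib import Path


def sha256_file(path: Path, chunk_size: int = 64 * 1024) -> str:
    """Return the hex SHA-256 digest of a file, read in fixed-size chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        # iter() with a sentinel keeps calling read() until it returns b"" (EOF)
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_file_integrity(path: Path, expected_hex: str) -> bool:
    """Check a file against an expected hex SHA-256 digest (hypothetical helper)."""
    actual = sha256_file(path)
    # Normalize case and whitespace, then compare without short-circuiting
    # on the first differing character.
    return hmac.compare_digest(actual, expected_hex.strip().lower())
```

Chunked reading keeps memory use flat regardless of file size. The check is only as trustworthy as the source of the expected digest: it should come from a channel the attacker cannot modify.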

How CodeShield detects this

CodeShield uses multi-layer static analysis to detect SHA-1 broken hash function vulnerabilities across your entire codebase:

createHash('sha1') and createHash('md5') in Node.js code
hashlib.sha1() and hashlib.md5() usage in Python
MessageDigest.getInstance("SHA-1") and MessageDigest.getInstance("MD5") in Java
Digest::SHA1 and Digest::MD5 in Ruby
sha1() and md5() function calls in PHP
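
To illustrate the kind of pattern matching the list above describes, here is a rough sketch of a line-based scanner. It is not CodeShield's actual implementation: the rule names, the `Finding` type, and the `scan_source` function are all hypothetical, and the regular expressions are simplified approximations of the listed patterns.

```python
import re
from typing import Iterator, NamedTuple


class Finding(NamedTuple):
    line_number: int
    rule: str
    text: str


# Hypothetical rule set modeled on the patterns listed above.
WEAK_HASH_RULES = {
    "node-createHash": re.compile(
        r"""createHash\(\s*['"](?:sha-?1|md5)['"]\s*\)""", re.IGNORECASE
    ),
    "python-hashlib": re.compile(r"\bhashlib\.(?:sha1|md5)\s*\("),
    "java-MessageDigest": re.compile(
        r"""MessageDigest\.getInstance\(\s*"(?:SHA-?1|MD5)"\s*\)""", re.IGNORECASE
    ),
    "ruby-Digest": re.compile(r"\bDigest::(?:SHA1|MD5)\b"),
    # Bare sha1(...) / md5(...) calls; the lookbehind skips method and
    # attribute calls such as hashlib.sha1( or $obj->md5(.
    "php-function": re.compile(r"(?<![\w.>:])(?:sha1|md5)\s*\("),
}


def scan_source(source: str) -> Iterator[Finding]:
    """Yield one Finding per (line, rule) pair that matches a weak-hash pattern."""
    for line_number, line in enumerate(source.splitlines(), start=1):
        for rule, pattern in WEAK_HASH_RULES.items():
            if pattern.search(line):
                yield Finding(line_number, rule, line.strip())
```

A regex pass like this also flags matches inside comments and string literals, and it cannot tell a password hash from a cache key. A production analyzer would more likely work on a parsed syntax tree and consider how the digest is used.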

Affected languages

Scan for SHA-1 broken hash function in your repos

CodeShield detects SHA-1 broken hash function usage and 5+ other vulnerability types across your entire codebase. Auto-fix with AI in one click.
