Peer-Reviewed Phishing Detection Infrastructure
Born from NSF-funded research at Alabama A&M University's Cybersecurity Lab. Published by Springer. Presented at SAM'25 in Las Vegas. Now available as a production API.
This infrastructure was developed under a National Science Foundation grant awarded to Alabama A&M University's Cybersecurity Laboratory, supporting rigorous academic research into email-based threat detection at scale.
The detection methodology and experimental results were peer-reviewed and published through Springer, one of the world's leading academic publishers, validating our multi-signal classification approach.
Presented at the International Conference on Security & Management (SAM'25) in Las Vegas, where the system was reviewed by security researchers and practitioners from academia and industry worldwide.
Real-time lookup against Google's phishing and malware URL database, updated continuously with newly discovered threats.
Community-verified phishing URL intelligence, cross-referenced against each scan request for immediate threat identification.
Parallel lookups against Spamhaus, SURBL, and URIBL blocklists to catch known spam infrastructure and malicious domains instantly.
TF-IDF ensemble classifier trained on 80,000 emails, achieving a 95% F1 score on held-out test sets across multiple phishing categories.
Deep inspection of SPF, DKIM, and DMARC authentication results, plus Reply-To spoofing and display name impersonation detection.
Shannon entropy scoring, homoglyph character substitution detection, redirect chain traversal, and typosquatting pattern matching.
Identifies Base64-encoded payloads, CSS-hidden content, misleading HTML comments, and other obfuscation techniques used to bypass filters.
WHOIS-based domain registration age lookup. Domains registered fewer than 30 days ago are flagged as high risk, since a very recent registration is a consistent indicator of phishing infrastructure.
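As an illustration of the Shannon entropy scoring mentioned above, here is a minimal sketch in Python. It is not the PhishNet implementation; the function name `shannon_entropy` and the sample strings are assumptions chosen for the example. The idea is that randomly generated domain labels tend to use more distinct characters, and so score higher, than readable brand names.

```python
import math
from collections import Counter


def shannon_entropy(text: str) -> float:
    """Return the Shannon entropy of `text` in bits per character."""
    if not text:
        return 0.0
    counts = Counter(text)
    total = len(text)
    # H = -sum(p * log2(p)) over the observed character frequencies
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical sample labels: a readable name and a random-looking one.
readable = shannon_entropy("paypal")
random_looking = shannon_entropy("x7f9q2kd8w")
print(f"readable={readable:.3f} random_looking={random_looking:.3f}")
```

A real scorer would likely combine this value with label length and other signals, since short strings have low entropy regardless of how they were generated.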
Submit email details to analyze them with all detection layers.
Paste the Authentication-Results and Received headers from your email client.
Powered by PhishNet API · Hosted on Render free tier, so the first request may take around 30 seconds
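If you have the raw message source rather than an email client view, the headers the form asks for can be pulled out with Python's standard `email` module. This is a minimal sketch: the helper name `extract_headers`, the `WANTED` tuple, and the sample message are assumptions made for illustration, and the API itself may accept other header sets.

```python
from email import message_from_string

# Hypothetical raw message, used only to demonstrate the helper.
RAW = (
    "Authentication-Results: mx.example.net; spf=fail; dkim=fail; dmarc=fail\n"
    "Received: from mail.example.org by mx.example.net; Mon, 1 Jan 2024 00:00:00 +0000\n"
    "From: PayPal Support <support@paypa1.com>\n"
    "Reply-To: attacker@gmail.com\n"
    "Subject: Verify your account\n"
    "\n"
    "Verify your PayPal account immediately.\n"
)

# Headers to keep; From and Reply-To mirror the sample request body.
WANTED = ("Authentication-Results", "Received", "From", "Reply-To")


def extract_headers(raw_message: str, wanted=WANTED) -> str:
    """Return the selected headers as one CRLF-joined string."""
    msg = message_from_string(raw_message)
    lines = []
    for name in wanted:
        # get_all returns every occurrence (Received usually repeats)
        for value in msg.get_all(name, []):
            lines.append(f"{name}: {value}")
    return "\r\n".join(lines)


print(extract_headers(RAW))
```

The returned string can then be sent as the `email_headers` field of a scan request.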
Request Body
{ "email_address": "support@paypa1.com", "email_text": "Verify your PayPal account immediately or it will be suspended", "email_headers": "Authentication-Results: spf=fail; dkim=fail; dmarc=fail (p=REJECT)\r\nFrom: PayPal Support <support@paypa1.com>\r\nReply-To: attacker@gmail.com" }
Response
{ "scan_id": "1a49b796-d242-448d-929a-dbec00106dcf", "scam_score": 83.38, "risk_level": "CRITICAL", "labels": ["Homoglyph Domain", "DMARC Fail", "Phishing Content"], "recommendations": [ "Do not click any links in this email", "This email failed DMARC authentication" ], "content_analysis": { "prediction": "Phishing Email", "confidence": 0.986, "is_phishing": true }, "email_verification": { "homoglyph_detected": true, "risk_score": 100.0 } }
Code Examples
curl -X POST https://scanner-api-st8w.onrender.com/api/scan \
-H "Content-Type: application/json" \
-H "X-API-Key: your_api_key_here" \
-d '{"email_address": "test@example.com", "email_text": "Your message here"}'
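import requests
# Use a generous timeout: the free-tier host may be slow on the first request.
response = requests.post(
    "https://scanner-api-st8w.onrender.com/api/scan",
    headers={"X-API-Key": "your_api_key_here"},
    json={"email_address": "test@example.com", "email_text": "Your message here"},
    timeout=60,
)
response.raise_for_status()  # fail loudly on 4xx/5xx instead of parsing an error body
result = response.json()
print(f"Risk Level: {result['risk_level']}, Score: {result['scam_score']}")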
const response = await fetch('https://scanner-api-st8w.onrender.com/api/scan', {
method: 'POST',
headers: {'Content-Type': 'application/json', 'X-API-Key': 'your_api_key_here'},
body: JSON.stringify({email_address: 'test@example.com', email_text: 'Your message here'})
});
const result = await response.json();
console.log(`Risk Level: ${result.risk_level}, Score: ${result.scam_score}`);
Request Body
{ "email_text": "Click here: https://paypa1.com/verify" }
Request Body
{ "email_text": "Urgent: Your account has been compromised. Click here to verify now." }
Response
{ "status": "healthy", "ml_model_loaded": true }
Every field in the response has a purpose. Nothing is a black box.
{
"scan_id": "1a49b796-...", // unique per request
"scam_score": 83.38, // 0-100 composite
"risk_level": "CRITICAL", // LOW/MEDIUM/HIGH/CRITICAL
"labels": [...], // human-readable signals
"recommendations": [...], // actionable guidance
"content_analysis": { // ML model output
"prediction": "Phishing Email",
"confidence": 0.986, // 0.0-1.0
"is_phishing": true
},
"header_analysis": { // SPF/DKIM/DMARC
"dmarc": "fail",
"reply_to_mismatch": true,
"spoofed_brand": "PayPal",
"flags": [...]
},
"email_verification": { // domain checks
"homoglyph_detected": true, // paypa1 != paypal
"risk_score": 100.0
}
}
A weighted average across all active detection layers, normalized to 0-100. Weights are calibrated on the 80k training corpus to minimize false positives on legitimate transactional emails.
LOW (<25), MEDIUM (25-49), HIGH (50-74), CRITICAL (75+). Tier boundaries match industry-standard SOC triage thresholds for email security workflows.
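The scoring and tiering described above can be sketched as follows. This is an illustrative reading, not the production code: the layer names and weights are hypothetical (the calibrated weights are not published here), and the tier boundaries are assumed to be half-open intervals, so a score of 49.5 is treated as MEDIUM.

```python
def composite_score(layer_scores: dict, weights: dict) -> float:
    """Weighted average of the active layers' 0-100 scores."""
    # Only layers that produced a score and have a weight take part.
    active = {name: s for name, s in layer_scores.items() if name in weights}
    total_weight = sum(weights[name] for name in active)
    if total_weight == 0:
        return 0.0
    score = sum(weights[name] * s for name, s in active.items()) / total_weight
    return max(0.0, min(100.0, score))  # clamp to the documented 0-100 range


def risk_level(score: float) -> str:
    """Map a composite score to the documented tiers."""
    if score >= 75:
        return "CRITICAL"
    if score >= 50:
        return "HIGH"
    if score >= 25:
        return "MEDIUM"
    return "LOW"


# Hypothetical layers and weights, for demonstration only.
weights = {"content": 2.0, "headers": 1.0, "domain": 1.0}
scores = {"content": 90.0, "headers": 60.0, "domain": 100.0}
total = composite_score(scores, weights)
print(total, risk_level(total))
```

Dividing by the sum of the active weights keeps the result on the 0-100 scale even when a layer is skipped, for example when no headers are supplied.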
The flags array in each sub-analysis contains uppercase string constants ideal for programmatic routing and SIEM integration.
The email_verification layer compares the sender domain character-by-character against a curated list of major brands, detecting substitutions like 1->l, 0->o, and Unicode lookalikes.
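A simplified sketch of this kind of homoglyph check is shown below. The brand list, the substitution tables, and the function names are assumptions for illustration; the curated list and matching rules used by the API are not reproduced here. The sketch also assumes a bare domain such as `paypa1.com` and inspects only its first label.

```python
import unicodedata

# Hypothetical tables: the digit swaps named in the text plus a few
# Cyrillic letters that look like Latin ones.
DIGIT_SWAPS = str.maketrans({"1": "l", "0": "o"})
CONFUSABLES = {"\u0430": "a", "\u0435": "e", "\u043e": "o", "\u0440": "p"}
BRANDS = {"paypal", "google", "amazon"}


def normalize(label: str) -> str:
    """Fold lookalike characters to their plain Latin counterparts."""
    label = unicodedata.normalize("NFKC", label).lower()
    label = "".join(CONFUSABLES.get(ch, ch) for ch in label)
    return label.translate(DIGIT_SWAPS)


def spoofed_brand(domain: str):
    """Return the imitated brand, or None if the domain is not a lookalike."""
    label = domain.lower().split(".")[0]  # first label only (simplification)
    normalized = normalize(label)
    # A lookalike matches a brand after folding but not before.
    if normalized in BRANDS and label not in BRANDS:
        return normalized
    return None


print(spoofed_brand("paypa1.com"))
print(spoofed_brand("paypal.com"))
```

Folding both sides to a canonical form and comparing is one common way to implement a character-by-character check; an edit-distance comparison would additionally catch typosquats that add or drop letters.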
The top-level labels array condenses the most significant findings into short strings suitable for display in email clients, browser extensions, or end-user dashboards.
Designed to support researchers, defenders, and builders fighting phishing at every level.
Free tier with generous rate limits for academic use, coursework, and thesis research. No credit card required. Cite us in your paper.
Free · Academic
Custom rate limits, production SLA guarantees, and dedicated support for security operations centers and enterprise email security integrations.
Custom · Production
Free access for qualifying open source security tools, email clients, and community projects that help protect users from phishing.
Free · OSS
Developed at AAMU Cybersecurity Lab · NSF Funded