Fraud detection is the application of statistical analysis, machine learning, and rules-based logic to identify illegitimate activity inside payment systems, customer accounts, insurance claims, advertising networks, telecoms, and other commercial channels. Banks, card networks, e-commerce platforms, and insurers run real-time scoring systems that read each event, assign a risk score, and approve, decline, or refer it for human review within tens of milliseconds. The economic stakes are large: card fraud alone produced more than 33 billion dollars of losses globally in 2023, and financial crime including money laundering accounts for several trillion dollars of illicit flows each year per the UN Office on Drugs and Crime.
Fraud detection is one of the oldest applied uses of data science, with HNC Software deploying neural network credit card scoring in the early 1990s, but the field has shifted dramatically since 2015. Gradient boosted trees such as XGBoost, LightGBM, and CatBoost replaced shallow neural networks as the production workhorse. Graph neural networks introduced relational reasoning, and autoencoders and isolation forests gave teams unsupervised options for novel attacks. Since 2023 generative AI has reshaped both sides: defenders use large language models for case triage, while attackers weaponize voice cloning, synthetic identity creation, and AI-generated phishing.
Fraud detection is not a single problem. Each category has its own data sources, attacker behaviors, regulatory regime, and acceptable false-positive tolerance. A model that works for credit card swipes will not work for first-party application fraud, so most large institutions run separate model stacks for each fraud category.
| Category | Typical channel | Hallmark signal | Common ML approach |
|---|---|---|---|
| Credit and debit card fraud | Card-present POS, card-not-present e-commerce, ATM | Velocity, geography, merchant category, BIN ranges, device fingerprint | Gradient boosting, sequence transformers, neural networks |
| Account takeover (ATO) | Online banking, exchanges, retail logins | Login geo, device change, session behavior, password reuse | Behavioral biometrics, sequence models, anomaly detection |
| Money laundering (AML) | Wire transfers, correspondent banking, crypto exchanges | Layering patterns, structuring, beneficial ownership opacity | Rules engines plus graph neural networks and unsupervised clustering |
| Application fraud | Loan, credit card, account opening | Synthetic identity attributes, mismatched personally identifiable information, velocity across institutions | Logistic regression, tree ensembles, identity graph features |
| Insurance fraud | First notice of loss, medical billing, staged accidents | Provider ring patterns, claim text anomalies, repeat claimants | NLP plus tree ensembles, link analysis, image forensics |
| Identity fraud and synthetic identity | KYC onboarding, document verification | Document tampering, biometric mismatch, AI-generated face | Computer vision, liveness detection, biometric matching |
| Ad fraud and click fraud | Programmatic display, search, mobile attribution | Bot signatures, click farms, install hijacking | Behavioral models, IP intelligence, sequence anomalies |
| E-commerce fraud and chargebacks | Checkout, refund, friendly fraud | Address mismatch, BIN to country mismatch, prior chargeback history | Tree ensembles, SMOTE-augmented training, network features |
| Telecom fraud | International revenue share, SIM swap, Wangiri | Call detail record patterns, IMSI changes, premium-rate destinations | Rule engines plus autoencoders, sequence models |
| Deepfake and GenAI fraud | Voice phone calls, video KYC, social engineering | Audio artifacts, lip-sync inconsistencies, identity asset reuse | Audio and video deepfake detectors, biometric ensembles |
The categories overlap in practice. A synthetic identity ring may begin with application fraud at a digital bank, age the accounts, then use them to launder proceeds from card fraud. Data governance and regulatory restrictions often keep these signals siloed across teams.
The defining technical challenge of fraud detection is the extreme imbalance between legitimate and fraudulent activity. In a typical card-not-present portfolio fewer than 0.2 percent of transactions are fraudulent, and in wire transfer monitoring the rate is often well under 0.01 percent. A naive classifier that predicts "not fraud" for every transaction would achieve 99.8 percent accuracy while delivering zero business value. Practitioners therefore rely on precision, recall, F1 score, area under the precision-recall curve, and cost-weighted measures that account for the unequal financial impact of false positives versus false negatives.
Bahnsen and colleagues formalized this as example-dependent cost-sensitive classification in 2014 and 2016. Each transaction has its own cost matrix because the loss from a false negative equals the transaction amount, while a false positive costs the operational expense of declining and reissuing the transaction. Optimizing for expected savings rather than raw accuracy can deliver double-digit improvements in net loss reduction.
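A minimal sketch of example-dependent cost and expected savings, assuming a fixed administrative cost per flagged transaction (the 5-unit default is illustrative, not Bahnsen's calibration):

```python
import numpy as np

def expected_cost(y_true, y_pred, amounts, admin_cost=5.0):
    """Example-dependent cost: a missed fraud (false negative) costs the
    transaction amount; every alert, right or wrong, costs a fixed
    administrative fee for the decline/review. Constants are illustrative."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    amounts = np.asarray(amounts, dtype=float)
    missed = (y_true == 1) & (y_pred == 0)   # missed fraud: lose the amount
    flagged = y_pred == 1                    # every alert costs admin_cost
    return amounts[missed].sum() + admin_cost * flagged.sum()

def savings(y_true, y_pred, amounts, admin_cost=5.0):
    """Savings relative to doing nothing (approving every transaction)."""
    cost_none = expected_cost(y_true, np.zeros_like(y_true), amounts, admin_cost)
    return (cost_none - expected_cost(y_true, y_pred, amounts, admin_cost)) / cost_none

# Catching a 1,000-unit fraud while missing a 200-unit one:
print(savings([1, 0, 0, 1], [1, 0, 0, 0], [1000, 50, 50, 200]))
```

Because every alert carries a cost, maximizing savings naturally balances the two error types rather than chasing recall alone.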
Several families of techniques are used to handle the imbalance:
| Technique | Description | Notes |
|---|---|---|
| Random undersampling | Drop legitimate examples until classes are balanced | Loses information, fast baseline |
| Random oversampling | Duplicate fraud examples | Risks overfitting to specific cases |
| SMOTE (Synthetic Minority Oversampling Technique) | Interpolate between fraud examples in feature space | Chawla 2002, very widely used |
| ADASYN | Focus synthesis on hard-to-learn fraud points | Variant of SMOTE with adaptive density |
| Cost-sensitive learning | Penalize false negatives more in the loss function | Native support in XGBoost and LightGBM via scale_pos_weight |
| Threshold tuning | Sweep classifier threshold to optimize cost | Cheap, often the most effective single change |
| One-class learning and anomaly detection | Train only on legitimate behavior, score deviations | Useful when fraud labels are scarce or biased |
| GAN-based oversampling, including CTGAN | Train a generative model to produce realistic synthetic fraud | Helps when minority class is structurally complex |
| Conditional Tabular GAN with focal loss | Combine synthetic data with focal loss reweighting | Reported state of the art on several benchmarks |
None of these techniques is universally best. IEEE-CIS competition results show that careful threshold tuning combined with strong features and XGBoost often outperforms heavy synthetic sampling. SMOTE in particular can hurt performance on highly imbalanced tabular data because synthetic points lie inside convex hulls of real fraud examples and do not generalize to novel attacks.
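Threshold tuning against an explicit cost function takes only a few lines. The sketch below uses a logistic regression on synthetic imbalanced data; the 100:5 ratio of false-negative to false-positive cost is an illustrative placeholder:

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Synthetic imbalanced data (about 1 percent positives) standing in for
# a real transaction feature matrix.
X, y = make_classification(n_samples=20000, n_features=10, weights=[0.99],
                           random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

clf = LogisticRegression(max_iter=1000, class_weight="balanced").fit(X_tr, y_tr)
scores = clf.predict_proba(X_te)[:, 1]

def total_cost(threshold, fn_cost=100.0, fp_cost=5.0):
    """Cost of operating at a given threshold; the cost ratio is illustrative."""
    pred = scores >= threshold
    fn = np.sum((y_te == 1) & ~pred)
    fp = np.sum((y_te == 0) & pred)
    return fn * fn_cost + fp * fp_cost

# Sweep the decision threshold and keep the cost-minimizing one.
thresholds = np.linspace(0.01, 0.99, 99)
best = min(thresholds, key=total_cost)
print(f"cost-minimizing threshold: {best:.2f}")
```

The same sweep works unchanged on top of a gradient boosting model; only the score array changes.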
Fraud detection predates machine learning. Early credit card fraud control in the 1970s and 1980s relied on hot-card lists, behavioral red flags collated by human analysts, and authorization rules such as floor limits. The first wave of analytical scoring arrived with logistic regression and discriminant analysis in the 1980s.
The inflection point came in 1992 when HNC Software, founded by Robert Hecht-Nielsen, deployed Falcon, a neural network credit card fraud scoring system, at First USA. By the late 1990s Falcon was running at most major US issuers and reportedly screened more than two thirds of card transactions worldwide. FICO acquired HNC in 2002 and rebranded the product as FICO Falcon Fraud Manager. Falcon was the canonical example of machine learning in production for nearly two decades.
The academic literature followed in the 2000s. Bolton and Hand published an influential 2002 review on statistical fraud detection, and Ngai and colleagues published a widely cited 2011 systematic review of data mining techniques that classified the field by methodology and application domain. The Ngai survey identified logistic regression, decision trees, neural networks, support vector machines, Bayesian belief networks, and k-nearest neighbors as the dominant approaches, with hybrid and ensemble methods emerging.
The 2010s brought four major changes. Gradient boosted trees, particularly XGBoost released in 2014, displaced both logistic regression and shallow neural networks as the workhorse of supervised fraud scoring. Deep learning arrived via autoencoders and recurrent networks for sequence modeling. The anti-money laundering field shifted from rule engines to network analytics and graph neural networks. Mobile and digital channels expanded the data available for behavioral modeling.
The 2020s have been defined by two further shifts. The Covid-19 pandemic accelerated digital payments and produced new fraud patterns including buy-now-pay-later abuse and unemployment insurance fraud. After 2022 the rapid maturation of generative AI created new fraud vectors, particularly voice cloning for authorized push payment scams and AI-generated synthetic identities for account opening. By 2025 the largest banks were running model ensembles that combine gradient boosting, sequence transformers, graph neural networks, and dedicated deepfake detectors.
Fraud detection systems combine many model families. The choice depends on data volume, label quality, latency budget, regulatory requirements, and the structure of the fraud pattern.
Deterministic rule engines are the oldest and still the most widespread fraud detection technology. A rule encodes domain knowledge such as "decline if the transaction amount exceeds 5,000 USD and the merchant country is on the high-risk list." Rules are easy to audit, easy to explain to regulators, and trivial to update. They also do not generalize and accumulate operational debt as the rule set grows. Modern systems use rules alongside machine learning models. Rules handle hard policy decisions such as sanctions screening, while machine learning handles probabilistic risk scoring.
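A rule layer can be as simple as an ordered list of named predicates. The field names, thresholds, and country codes below are illustrative, not taken from any production system:

```python
HIGH_RISK_COUNTRIES = {"XX", "YY"}  # placeholder codes, not real ISO entries

# Each rule: (name, predicate over the transaction dict, action).
RULES = [
    ("high_amount_high_risk_country",
     lambda t: t["amount"] > 5000 and t["merchant_country"] in HIGH_RISK_COUNTRIES,
     "decline"),
    ("sanctions_hit",
     lambda t: t.get("sanctions_match", False),
     "decline"),
    ("velocity_burst",
     lambda t: t["txn_count_last_hour"] > 20,
     "review"),
]

def apply_rules(txn):
    """Return (action, fired_rule_names): any decline wins, else review, else approve."""
    fired = [(name, action) for name, pred, action in RULES if pred(txn)]
    if any(action == "decline" for _, action in fired):
        return "decline", [name for name, _ in fired]
    if fired:
        return "review", [name for name, _ in fired]
    return "approve", []

print(apply_rules({"amount": 6000, "merchant_country": "XX",
                   "txn_count_last_hour": 1}))
```

Keeping each rule named and auditable is what makes this layer attractive to regulators, and also what makes the operational debt visible as the list grows.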
Supervised learning is the dominant paradigm where labeled fraud data is available. Logistic regression remains a baseline because of its interpretability and easy integration into model risk management frameworks. Random forests offer a low-tuning option. The current production workhorses are gradient boosted decision trees: XGBoost, LightGBM, and CatBoost. They handle missing values natively, capture nonlinear interactions, and train quickly on tabular data. Most public fraud benchmarks since 2018, including the IEEE-CIS Fraud Detection competition won in 2019, have been topped by gradient boosting solutions or ensembles that include them.
Support vector machines (SVMs) appeared frequently in the 2000s fraud literature and can be effective with well-engineered features, but they scale poorly to the millions of transactions per day handled by modern issuers and have largely been displaced by tree ensembles.
Sequence models are a growing area. Transactions for a single account form a temporal sequence, and recurrent networks, temporal convolutional networks, and transformer architectures can encode it directly. Mastercard, Stripe, and several research groups have published on transformer-based fraud scoring that ingests the past several thousand transactions of an account. The advantage is the model can learn long-range patterns such as a sleeper account that becomes active months after creation.
Labeled fraud data is scarce and biased toward attacks the issuer already knows how to detect. Unsupervised methods compensate by modeling normal behavior and flagging deviations. They are essential for detecting novel fraud patterns and for early warning before labels accumulate.
The isolation forest algorithm, introduced by Liu, Ting, and Zhou in 2008, isolates anomalies by building random trees that partition the feature space. Anomalous points have shorter average path lengths because they are easier to isolate. Isolation forests run in linear time and are embarrassingly parallel, making them attractive for high-volume monitoring; reference implementations ship in both scikit-learn and PyOD.
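A minimal isolation forest workflow with scikit-learn, on synthetic data standing in for transaction features:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

rng = np.random.default_rng(0)
# "Normal" behavior clusters near the origin; anomalies sit far outside it.
normal = rng.normal(0, 1, size=(1000, 4))
anomalies = rng.normal(8, 1, size=(10, 4))

iso = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
iso.fit(normal)  # train on (mostly) legitimate behavior only

# score_samples: higher means more normal, so anomalies score lower.
normal_scores = iso.score_samples(normal)
anomaly_scores = iso.score_samples(anomalies)
print(anomaly_scores.mean() < normal_scores.mean())  # True
```

In production the fitted model scores live events, with the threshold set from the contamination assumption or tuned against analyst capacity.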
Autoencoders compress legitimate transaction features into a low-dimensional latent space and reconstruct them. Transactions that reconstruct poorly are likely anomalies. Variational autoencoders extend this to probabilistic latent spaces. Several payment processors use deep autoencoders to flag transactions unlike anything seen during training.
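The reconstruction-error idea can be sketched with scikit-learn's `MLPRegressor` used as a small autoencoder; production systems typically use a deep learning framework, and the data here is synthetic with a deliberately planted correlation:

```python
import numpy as np
from sklearn.neural_network import MLPRegressor
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
# Synthetic "legitimate" data with internal structure: the last four
# features are noisy copies of the first four.
X_legit = rng.normal(0, 1, size=(2000, 8))
X_legit[:, 4:] = X_legit[:, :4] + rng.normal(0, 0.1, size=(2000, 4))

scaler = StandardScaler().fit(X_legit)

# An 8-4-8 bottleneck forces the network to learn the correlation
# rather than copy its input through.
ae = MLPRegressor(hidden_layer_sizes=(8, 4, 8), max_iter=1000, random_state=0)
ae.fit(scaler.transform(X_legit), scaler.transform(X_legit))

def reconstruction_error(X):
    Xs = scaler.transform(X)
    return ((ae.predict(Xs) - Xs) ** 2).mean(axis=1)

# "Fraud" breaks the learned correlation: all eight features independent.
X_fraud = rng.normal(0, 1, size=(200, 8))
print(reconstruction_error(X_legit).mean(), reconstruction_error(X_fraud).mean())
```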
Local outlier factor, DBSCAN clustering, one-class SVMs, and Gaussian mixture models also appear regularly. The PyOD library, started by Yue Zhao in 2017, aggregates more than 50 outlier detection algorithms in a single API and is the de facto Python toolbox for anomaly detection-based fraud work.
Fraud rarely occurs in isolation. Synthetic identity rings share addresses, devices, IP ranges, and beneficial owners. Money laundering schemes route funds through long chains of intermediate accounts. Click farms cluster around the same hardware fingerprints. Graph methods turn this relational structure into a model input.
Simple graph features such as the count of distinct devices an account has used or the shortest path to a known fraudulent entity can be added to gradient boosting models with substantial gains. More sophisticated approaches use graph neural networks, which propagate features along graph edges through learned aggregation functions. The graph convolutional network (GCN) of Kipf and Welling, the graph attention network (GAT) of Velickovic and colleagues, GraphSAGE for inductive learning, and heterogeneous attention networks such as HAN have all been applied to fraud problems.
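Simple relational features of this kind need no GNN machinery. A sketch over a toy account-to-device edge list (the identifiers are hypothetical):

```python
from collections import defaultdict

# Toy account -> device-fingerprint edge list.
edges = [
    ("acct_1", "dev_A"), ("acct_1", "dev_B"),
    ("acct_2", "dev_B"), ("acct_3", "dev_B"),
    ("acct_4", "dev_C"),
]

devices_by_account = defaultdict(set)
accounts_by_device = defaultdict(set)
for acct, dev in edges:
    devices_by_account[acct].add(dev)
    accounts_by_device[dev].add(acct)

def graph_features(acct):
    """Two simple relational features to feed a gradient boosting model."""
    devices = devices_by_account[acct]
    # Max number of *other* accounts seen on any of this account's devices:
    # a high value suggests a shared device, a hallmark of fraud rings.
    max_shared = max((len(accounts_by_device[d] - {acct}) for d in devices),
                     default=0)
    return {"n_devices": len(devices), "max_shared_accounts": max_shared}

print(graph_features("acct_1"))  # {'n_devices': 2, 'max_shared_accounts': 2}
```

In a real pipeline these counts would be computed incrementally in the feature store rather than from an in-memory edge list.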
In anti-money laundering, work by Mark Weber and colleagues at IBM Research with the Elliptic dataset showed that GCNs can detect illicit Bitcoin transactions with substantial gains over feature-only baselines. The Elliptic2 dataset and the AMLworld synthetic dataset released in 2024 have become public benchmarks. NVIDIA has published reference architectures combining GraphSAGE embeddings with downstream XGBoost classifiers, achieving ten to fifteen point AUC gains on the IEEE-CIS dataset.
Fraud labels are scarce, so several teams use generative models to augment training data. Conditional Tabular GAN (CTGAN), introduced by Lei Xu and colleagues in 2019, and the later CTAB-GAN generate realistic synthetic tabular data conditioned on class labels; diffusion models for tabular data followed from 2023. Both families can produce more diverse fraud examples than SMOTE interpolations. Synthetic data also matters for privacy-preserving model sharing across institutions, complementing federated learning frameworks that let banks train shared models without exposing customer-level data.
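For contrast with the generative approaches, SMOTE's interpolation idea fits in a few lines of NumPy. This is a didactic sketch, not the full Chawla et al. algorithm, which also handles nominal features and per-example synthesis counts:

```python
import numpy as np

def smote_sketch(X_minority, n_synthetic, k=5, seed=0):
    """SMOTE-style oversampling: for each synthetic point, pick a random
    minority example, one of its k nearest minority neighbors, and
    interpolate a random fraction of the way between them."""
    rng = np.random.default_rng(seed)
    X = np.asarray(X_minority, dtype=float)

    # Pairwise distances within the minority class only.
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    np.fill_diagonal(d, np.inf)
    neighbors = np.argsort(d, axis=1)[:, :k]  # k nearest per example

    base = rng.integers(0, len(X), n_synthetic)
    nb = neighbors[base, rng.integers(0, k, n_synthetic)]
    gap = rng.random((n_synthetic, 1))
    return X[base] + gap * (X[nb] - X[base])

X_fraud = np.random.default_rng(1).normal(0, 1, size=(20, 3))
X_new = smote_sketch(X_fraud, n_synthetic=100, k=3)
print(X_new.shape)  # (100, 3)
```

Every synthetic point lies on a segment between two real fraud examples, which is exactly the convex-hull limitation noted above for novel attacks.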
A mature fraud detection system contains far more than a single model. The pipeline includes data ingestion, feature computation, scoring, decisioning, case management, feedback collection, and monitoring.
A large commercial ecosystem provides fraud detection software to financial institutions, payment processors, and insurers.
| Vendor | Primary focus | Notable features |
|---|---|---|
| FICO Falcon Fraud Manager | Card and payment fraud | Industry-standard neural network platform since 1992, deployed at most large US and European issuers |
| Visa Advanced Authorization (VAA) | Card authorization risk | Integrated into VisaNet authorization message, scores 100 percent of Visa transactions in real time |
| Mastercard Decision Intelligence | Card authorization and ATO | AI scoring on every Mastercard transaction, expanded with Decision Intelligence Pro in 2024 |
| Feedzai | Banking and payments | RiskOps platform, deployed at major US and European banks |
| NICE Actimize | AML, fraud, and trade surveillance | Long-standing leader in financial crime compliance, owned by NICE |
| SAS Anti-Money Laundering and SAS Fraud Management | Banking, insurance, government | Combines rules and ML, on-premises and cloud deployments |
| ComplyAdvantage | AML screening and monitoring | Knowledge-graph-driven screening of sanctions, PEP, and adverse media |
| ThetaRay | Cross-border AML | Unsupervised AI for correspondent banking transaction monitoring |
| Stripe Radar | E-commerce payment fraud | Network-effect ML across the Stripe payment graph, integrated with the checkout flow |
| Adyen RevenueProtect | E-commerce payment fraud | Risk and revenue optimization for marketplaces and global merchants |
| Sift | Digital trust and safety | Fraud and abuse signals across login, signup, content, and payment events |
| Riskified, Forter, Signifyd | E-commerce chargeback guarantee | ML scoring with financial guarantee for approved transactions |
| Fraugster | Online retail fraud | Acquired by Smart Engine in 2024, focused on real-time decisioning |
| Shift Technology | Insurance fraud and claims | AI claims fraud detection used by hundreds of insurers, partnered with Microsoft Azure OpenAI |
| Quantexa | AML and entity resolution | Contextual decision intelligence with entity graph |
| SymphonyAI Sensa | AML transaction monitoring | NetReveal product line with explainable AI |
| BioCatch | Behavioral biometrics | Mouse, touch, and typing rhythm signals for ATO defense |
| Socure | Identity verification and synthetic identity | KYC and identity intelligence |
| Onfido and Veriff | Identity document verification | Biometric and document checks for digital onboarding |
Open source has lagged the commercial ecosystem because high-quality fraud data is sensitive. Scikit-learn provides core algorithms including IsolationForest and LocalOutlierFactor. PyOD aggregates outlier detection methods. PyCaret offers a low-code workflow with fraud-friendly preprocessing. Featuretools automates feature engineering for transactional data. The Deep Graph Library and PyTorch Geometric enable graph neural network experimentation, and Amazon Science maintains a public fraud-dataset benchmark.
Reproducible research has historically been limited by the sensitivity of payment data. A small number of public datasets have become de facto benchmarks, and several synthetic datasets have appeared to fill the gap.
| Dataset | Year | Records | Class balance | Notes |
|---|---|---|---|---|
| Kaggle Credit Card Fraud Detection (ULB) | 2015 | 284,807 | 0.172 percent fraud | PCA-anonymized European card transactions, the most cited fraud dataset |
| IEEE-CIS Fraud Detection | 2019 | 590,540 | 3.5 percent fraud | Vesta e-commerce dataset, hosted on Kaggle, top entries used XGBoost ensembles |
| PaySim | 2016 | Up to 6 million | Configurable | Synthetic mobile money data, open source |
| Elliptic Bitcoin (Elliptic1, Elliptic2) | 2019, 2024 | 200,000+ | About 2 percent illicit | Bitcoin transaction graph for AML research |
| AMLworld | 2024 | Multi-million | About 0.05 percent illicit | Synthetic AML benchmark from IBM Research |
| Banksim | 2014 | 600,000 | Configurable | Synthetic bank transactions |
| Czech bank dataset | 1999 | 1 million | Sparse fraud | One of the earliest public bank datasets |
| Lloyd Banking insurance fraud (UK) | Various | Subject to NDA | Sparse | Available to academic partners |
| FraudDataset Benchmark (Amazon) | 2022 | Multiple datasets | Mixed | Aggregated benchmarks with reference baselines |
The Kaggle Credit Card Fraud Detection dataset, often called the ULB dataset because it was released by researchers at the Universite Libre de Bruxelles, contains PCA-anonymized features and is the standard didactic example for SMOTE, autoencoder, and isolation forest tutorials. The IEEE-CIS dataset released by Vesta Corporation in 2019 is larger and richer, with 393 raw features. The 2019 winning solution combined XGBoost, LightGBM, and CatBoost with extensive feature aggregation. For anti-money laundering, the Elliptic Bitcoin dataset and the synthetic AMLworld benchmark released in 2024 give researchers access to rich transaction networks.
Fraud detection sits inside a thicket of regulation. Banks, processors, and insurers must balance fraud prevention against consumer protection, model risk management, anti-discrimination law, and data privacy law.
| Regime | Geography | Scope |
|---|---|---|
| FATF Recommendations | Global, 200+ jurisdictions | Anti-money laundering and counter-terrorist financing standards, including risk-based approach guidance updated in 2025 |
| Bank Secrecy Act, USA PATRIOT Act, FinCEN | United States | Suspicious activity reporting, currency transaction reports, beneficial ownership |
| OFAC sanctions screening | United States | Sanctions and blocked persons list checking |
| EU AML Directives 4-6 and AML Authority | European Union | Customer due diligence, beneficial ownership registries, EU-level supervisory authority active from 2025 |
| PSD2 Strong Customer Authentication | EU and UK | Two-factor authentication for remote payments above 30 EUR, exemptions for low-risk transactions |
| 3D Secure 2 | Global card schemes | Risk-based authentication protocol used to apply PSD2 SCA |
| GDPR and equivalents | EU and UK | Constraints on use of personal data, automated decision rights, right to explanation |
| Equal Credit Opportunity Act and Fair Credit Reporting Act | United States | Anti-discrimination and accuracy obligations on credit decisioning |
| Federal Reserve SR 11-7 model risk guidance | United States | Sound practices for model development, validation, and governance |
| EU AI Act | European Union | High-risk AI system requirements applying to creditworthiness decisions and biometric identification |
| MAS, HKMA, FCA AI guidance | Singapore, Hong Kong, UK | Principles-based AI governance for financial services |
The regulatory direction since 2023 has been toward more prescriptive AI governance. The EU AI Act, FATF's 2025 guidance, and the Federal Reserve's focus on model risk management push fraud teams to document model purpose, data lineage, validation procedures, and explainability. PSD2's Strong Customer Authentication regime mandates two-factor authentication for remote European card payments with transaction risk analysis exemptions for low-risk transactions, tying fraud detection more tightly into the consumer authentication flow.
Fraud detection metrics must reflect unequal error costs and heavy class imbalance. Standard accuracy is unhelpful. Common metrics include:
| Metric | Formula | Use |
|---|---|---|
| Precision | TP / (TP + FP) | Fraction of flagged events that were genuinely fraudulent |
| Recall (TPR, sensitivity) | TP / (TP + FN) | Fraction of fraud caught |
| F1 score | 2 PR / (P + R) | Harmonic mean of precision and recall |
| AUC-ROC | Area under the receiver operating characteristic curve | Threshold-independent ranking quality, can mislead under heavy imbalance |
| AUC-PR | Area under precision-recall curve | More informative than AUC-ROC for imbalanced data |
| Recall at K | Recall when only K alerts can be reviewed per day | Reflects analyst capacity constraints |
| Cost-weighted savings | Losses prevented by true positives minus false-positive and missed-fraud costs | Direct business measure, used by Bahnsen 2016 |
| False positive rate at fixed recall | FP / (FP + TN) at fixed TPR | Common operating point measure |
| Alert-to-fraud ratio | Alerts per confirmed fraud | Inverse of precision, used in AML |
| SAR efficiency | Fraction of filed Suspicious Activity Reports that lead to investigation or enforcement | AML-specific efficacy measure |
For unsupervised methods that produce only an anomaly score, evaluation proceeds by ranking and computing precision and recall at top-K. Realistic evaluation requires temporal splitting because attackers adapt and concept drift is rapid; random k-fold cross-validation almost always overstates production performance.
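Both points can be made concrete: precision and recall over the top-K ranked alerts, plus a temporal split helper (the `timestamps` argument is a hypothetical event-time column):

```python
import numpy as np

def precision_recall_at_k(scores, labels, k):
    """Rank events by anomaly score and evaluate only the top-k alerts,
    reflecting a fixed analyst review capacity."""
    labels = np.asarray(labels)
    order = np.argsort(scores)[::-1]          # highest score first
    top = labels[order[:k]]
    precision = top.sum() / k
    recall = top.sum() / max(labels.sum(), 1)
    return precision, recall

def temporal_split(X, y, timestamps, cutoff):
    """Train on the past, evaluate on the future; never shuffle."""
    past = np.asarray(timestamps) < cutoff
    return (X[past], y[past]), (X[~past], y[~past])

scores = np.array([0.9, 0.1, 0.8, 0.3, 0.7])
labels = np.array([1, 0, 0, 0, 1])
print(precision_recall_at_k(scores, labels, k=2))
```

Swapping this temporal split for random k-fold is the single most common way published fraud results overstate production performance.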
The fraud landscape since 2022 has moved faster than at any time since the original deployment of neural network scoring in the early 1990s. Three trends dominate.
Generative AI as an attack tool. Voice cloning has driven a wave of authorized push payment scams in which victims are tricked into transferring money to fraudsters posing as a CEO, family member, or trusted institution. The 2024 Arup deepfake video conference fraud in Hong Kong, in which a finance employee transferred 25 million USD after a video call with a deepfaked executive, became the canonical example. Synthetic identity fraud has accelerated as generative models produce convincing fake passports, selfies, and live KYC video. Industry estimates suggest deepfake-related fraud attempts in financial services rose by more than 2,000 percent between 2022 and 2025. AI-generated phishing pages and personalized spear phishing emails have lowered the cost of mass social engineering attacks.
Generative AI as a defense tool. Large language models help fraud analysts triage cases by summarizing transaction histories, drafting suspicious activity reports, and querying internal knowledge bases. Vendors including Shift Technology, Quantexa, NICE Actimize, and Feedzai have launched LLM-powered analyst assistants. Multimodal models help detect deepfake media, and embedding-based retrieval surfaces similar past cases for analyst comparison.
Graph and behavioral methods at scale. Graph neural network deployments have moved from research to production at large card networks and digital banks, often as feature generators feeding downstream gradient boosting. Behavioral biometrics, including mouse and touch dynamics, have become standard for account takeover defense. Continuous authentication, which scores user behavior throughout a session, has reached mainstream deployment in mobile banking.
The combined effect is that the fraud detection stack has become more layered and capable. Single-model systems that dominated the 2010s have been replaced by ensembles combining real-time gradient boosting, sequence transformers, graph neural networks, autoencoders, behavioral biometrics, deepfake detectors, and LLM assistants on top of a deterministic rule layer.
Despite three decades of investment, fraud detection systems share recurring limitations. Labeling latency and noise corrupt training data: chargebacks take weeks or months to materialize, first-party fraud is often misclassified, and investigator decisions reflect operational policy as much as ground truth. Concept drift is constant because attackers adapt to deployed models, sometimes within hours of policy changes.
False positives are expensive. A declined legitimate transaction damages the customer relationship and erodes lifetime value. The ratio of false positives to true frauds in many production systems is between 5:1 and 50:1, and analyst review is a major budget component.
Fairness and bias are growing concerns. Models can encode demographic bias if features correlate with protected attributes. Regulators are paying closer attention, and explainability tools such as SHAP, LIME, and counterfactual reasoning are now standard in model risk documentation. Data silos limit information sharing: privacy laws restrict cross-institution sharing of features and labels. Federated learning, multi-party computation, and consortium data sharing through Early Warning Services, FIS Sentinel, and the FICO Falcon Intelligence Network are partial answers.
Adversarial robustness is poor; tabular adversarial examples are easier to construct than image adversarials, and many production models can be circumvented by modifying a small number of features. Generative AI has shifted the cost curve for attackers, automating attacks that once required skilled human social engineering. Defenders are responding with multimodal deepfake detection, behavioral biometrics, and improved liveness checks, but the long-run equilibrium is unclear.