Benford's law applied to digital forensic analysis

Fernandes, Pedro; Antunes, Mário

dc.contributor.author	Fernandes, Pedro
dc.contributor.author	Antunes, Mário
dc.date.accessioned	2024-04-19T14:17:29Z
dc.date.available	2024-04-19T14:17:29Z
dc.date.copyright	2023
dc.date.issued	2023-01-30
dc.identifier.citation	Fernandes, P. and Antunes, M. (2023) ‘Benford’s law applied to digital forensic analysis’, Forensic Science International: Digital Investigation, 45, p. 301515. Available at: https://doi.org/10.1016/j.fsidi.2023.301515.	en_US
dc.identifier.uri	https://research.thea.ie/handle/20.500.12065/4802
dc.description.abstract	Tampered digital multimedia content is increasingly used in a wide range of cyberattacks, challenging criminal investigations and law enforcement authorities. The motivations are varied and range from attempts to manipulate public opinion by disseminating fake news to digital kidnapping and ransomware, to mention a few cybercrimes that use this medium as a means of propagation. Digital forensics has recently incorporated a set of computational learning-based tools to automatically detect manipulations in digital multimedia content. Despite the promising results attained by machine learning and deep learning methods, these techniques demand substantial computational resources and make digital forensic analysis and investigation expensive. Statistical techniques have also been used to automatically detect anomalies and manipulations in digital multimedia content by analysing its patterns and features. These techniques are computationally faster and have been applied either in isolation or as members of a classifier committee to boost overall artefact classification. This paper describes a statistical model based on Benford's law and the results obtained with a dataset of 18000 photos, 9000 authentic and 9000 manipulated. Benford's law dates from the 19th century and has been successfully adopted in digital forensics, namely in fraud detection. In the present investigation, Benford's law was applied to a set of features (colours, textures) extracted from digital images. After extracting the first digit of each feature value, the frequency with which each digit occurred in the set of extracted values was calculated. This allowed the investigation to focus on how the observed frequency of each digit compares with the frequency expected under Benford's law. 
The method proposed in this paper for applying Benford's law uses Pearson's and Spearman's correlations and the Cramér-von Mises (CVM) fitting model, applied to the first digit of each multi-digit value obtained by extracting features from digital photos with the Fast Fourier Transform (FFT). The overall results, although not exceeding those attained by machine learning approaches, namely Support Vector Machines (SVM) and Convolutional Neural Networks (CNN), are promising, reaching an average F1-score of 90.47% when using Pearson's correlation. With the non-parametric approaches, namely Spearman's correlation and the CVM fitting model, F1-scores of 56.55% and 76.61% were obtained, respectively. Furthermore, the Pearson model showed the highest homogeneity compared to the Spearman and CVM models, detecting 8526 manipulated images and 7662 authentic ones, due to the strong correlation between the observed frequency of each digit and the frequency expected under Benford's law. The results were obtained with feature sets of different lengths, ranging from 3000 features to the totality of the features available in the digital image. However, the investigation focused on extracting 1000 features, since it was concluded that increasing the number of features did not improve the results. The results obtained with the model based on Benford's law compete with those obtained from the CNN and SVM models, generating confidence in its application as decision support in criminal investigations for the identification of manipulated images.	en_US
dc.format	PDF	en_US
dc.language.iso	eng	en_US
dc.publisher	Elsevier	en_US
dc.relation.ispartof	Forensic Science International: Digital Investigation	en_US
dc.rights	Attribution 3.0 United States	*
dc.rights.uri	http://creativecommons.org/licenses/by/3.0/us/	*
dc.subject	Benford's law	en_US
dc.subject	Digital forensics	en_US
dc.subject	Digital images manipulation	en_US
dc.subject	First digits law	en_US
dc.subject	Statistical coefficient correlation	en_US
dc.subject	Pearson's correlation	en_US
dc.title	Benford's law applied to digital forensic analysis	en_US
dc.type	info:eu-repo/semantics/article	en_US
dc.contributor.affiliation	Technological University of the Shannon: Midlands Midwest	en_US
dc.contributor.sponsor	Convolutional Neural Networks	en_US
dc.description.peerreview	yes	en_US
dc.identifier.doi	10.1016/j.fsidi.2023.301515	en_US
dc.identifier.volume	45	en_US
dc.rights.accessrights	info:eu-repo/semantics/openAccess	en_US
dc.type.version	info:eu-repo/semantics/publishedVersion	en_US
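
The first-digit comparison described in the abstract can be illustrated with a minimal sketch. It assumes the feature values have already been extracted (in the paper this is done with the FFT; that step is not reproduced here) and covers only the Pearson variant. The function names and the synthetic input are hypothetical, chosen for illustration, and this is not the authors' implementation. The abstract does not state the decision threshold used to separate authentic from manipulated images, so none is assumed.

```python
import math


def benford_expected():
    """Expected first-digit frequencies under Benford's law, digits 1..9."""
    return [math.log10(1 + 1 / d) for d in range(1, 10)]


def first_digit(x):
    """Leading non-zero decimal digit of |x|; None for zero or non-finite input."""
    x = abs(x)
    if x == 0 or not math.isfinite(x):
        return None
    # Scientific notation places the leading digit first; 17 decimals
    # avoid rounding a value such as 9.9999999 up to 1.0e+01.
    return int(f"{x:.17e}"[0])


def first_digit_frequencies(values):
    """Relative frequency of each leading digit 1..9 in `values`."""
    counts = [0] * 9
    for v in values:
        d = first_digit(v)
        if d is not None:
            counts[d - 1] += 1
    total = sum(counts)
    if total == 0:
        raise ValueError("no usable (non-zero, finite) values")
    return [c / total for c in counts]


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical stand-in for extracted feature magnitudes: powers of two,
# a sequence whose leading digits are known to follow Benford's law closely.
features = [2.0 ** k for k in range(1000)]
observed = first_digit_frequencies(features)
score = pearson(observed, benford_expected())
print(f"Pearson correlation with Benford's law: {score:.4f}")
```

A score close to 1 means the observed first-digit frequencies track the Benford distribution closely. How such a score maps to an authentic or manipulated verdict depends on a threshold that would have to be taken from the paper itself.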

Files in this item

Name: Benford's law applied to digital ...
Size: 1.721Mb
Format: PDF

Name: license_rdf
Size: 914 bytes
Format: application/rdf+xml

This item appears in the following Collection(s)

Articles - Department of Applied Sciences [38]

Except where otherwise noted, this item's license is described as Attribution 3.0 United States