Disruptive deep learning-based approaches for human identification in forensic anthropology

  1. Gómez Trenado, Guillermo
Supervised by:
  1. Óscar Cordón García Co-director
  2. Pablo Mesejo Santiago Co-director

Defending university: Universidad de Granada

Defense date: 18 October 2024

Examination committee:
  1. Sergio Damas Arroyo Chair
  2. Natalia Díaz Rodríguez Secretary
  3. M. J. Carreira Nouche Member
  4. Javier Andreu Pérez Member
  5. Sancho Salcedo Sanz Member

Type: Thesis

Abstract

Forensic science, a multidisciplinary field crucial to modern justice, applies scientific methods to analyze physical evidence from crime scenes and other contexts. Among its vital techniques is facial imaging, essential for human identification. This technique aids in recognizing suspects, victims, and missing persons through advanced image processing and analysis, significantly impacting criminal investigations and legal procedures.

Forensic Anthropology, a subfield of biological anthropology, focuses on the analysis of skeletal remains to identify deceased individuals, especially when remains are unrecognizable due to decomposition or as a consequence of catastrophic events. Techniques like dental record comparison, DNA analysis, facial approximation, molecular photofitting, and craniofacial identification are meticulously used to determine the identity of the deceased. These methods are essential not only for providing closure to families but also for aiding legal investigations by confirming identities and causes of death. Recently, Forensic Anthropology has expanded to include the identification of living individuals, adapting to changes in criminal investigation methods and new challenges, such as interpreting surveillance footage and identifying missing persons. The integration of Artificial Intelligence and Machine Learning in facial recognition systems has further enhanced the precision and efficiency of identifying individuals, underscoring the importance of innovation in justice and public safety.

Traditionally, forensic identification methods relied on manual comparisons, which, despite being valuable, had a significant margin of error due to the subjective nature of the process, its dependence on the expert's skill, and even mental and physical fatigue. However, recent years have seen a shift towards more advanced approaches, integrating Computer Vision and Artificial Intelligence technologies to improve the accuracy and reliability of identifications.
These technologies enhance robustness and objectivity, reducing errors, speeding up processes, and optimizing forensic experts' workflow, allowing them to focus on the areas where their work is most crucial. Ensuring reliability in forensic science is essential for the evidentiary value of findings in legal contexts, which depends on producing consistent, accurate, and scientifically validated results. Traditional forensic facial imaging methods, such as facial identification and age progression, remain limited by labor-intensive, manual practices that rely on the visual analysis of professionals, introducing potential human errors and high subjectivity. The lack of standardized methodologies and the significant time required per case highlight the need for innovation within Forensic Anthropology. Artificial Intelligence offers a promising avenue for improving forensic practices by minimizing human error and subjectivity. Algorithms can learn from vast datasets, analyze facial features with precision, and automate and standardize the analysis process, supporting objective decision-making in forensic investigations. The global imperative for precise identification, driven by the need to identify missing persons and disaster victims and to solve crimes, underscores the importance of integrating Artificial Intelligence solutions in forensic facial imaging.

Deep Learning is a subfield of Machine Learning that uses deep neural networks to model and solve complex problems from large datasets. Deep Learning's current relevance and popularity stem from its success in driving innovations and solving previously intractable problems. This doctoral dissertation introduces three significant contributions based on Deep Learning that can be integrated into the field of forensic facial imaging.
The first contribution is the development of FSCNet, a Deep Learning model designed for the automatic localization of cephalometric landmarks (precise locations on the head that are highly relevant for many Forensic Anthropology tasks) in facial images, which can help enhance the efficiency and accuracy of forensic identification processes. FSCNet employs a cascade of convolutional networks, starting with a pre-trained 3D deformable mask model that provides initial landmark locations. This is followed by a convolutional neural network that refines these locations by predicting the displacement between the center of the cropped image and the landmark. Through rigorous testing and validation, FSCNet demonstrated superior performance compared to state-of-the-art methods, achieving higher precision in landmark localization and often outperforming human experts in real forensic scenarios. This development addresses the challenges posed by manual and labor-intensive landmark localization, offering a more reliable and efficient solution for forensic applications.

The second major contribution of this thesis is the introduction of a novel framework for facial age editing built around the Custom Structure Preservation (CUSP) module. This framework leverages a style-based encoder-decoder architecture, inspired by advancements in image-to-image translation and unconditional image generation, to allow realistic age transformations in facial images while preserving key identity features. The CUSP module lets users adjust the degree of structure preservation during the transformation process, enabling more profound changes in facial morphology, such as head shape and hair growth, which are typically challenging for conventional methods. Because the CUSP module can differentiate which parts of the image should be edited and which should remain untouched, this framework achieves a higher level of realism and accuracy in age progression and regression tasks.
Extensive evaluations demonstrated its superior performance in maintaining the balance between structural changes and identity preservation, making it a significant advancement in forensic facial approximation.

The third contribution is the development of SAGE (Self-Attention Guidance for Image Editing), an innovative technique for text-guided image editing that balances computational efficiency with high-fidelity reconstruction. SAGE utilizes a pre-trained diffusion model, specifically leveraging the intermediate Self-Attention and Cross-Attention maps computed during the reverse Denoising Diffusion Implicit Model (DDIM) process. This approach allows for precise image modifications based on textual descriptions without the need for explicit reconstruction of the input image. SAGE's Self-Attention guidance mechanism ensures faithful image editing and high-fidelity reconstruction in regions unaffected by the edits, balancing the preservation of original image details against the desired modifications. Comparative analyses against state-of-the-art methods showed that SAGE delivers comparable or superior editing quality with minimal computational expense, making it a versatile and powerful tool for various forensic and general image editing applications.

In addition to the technological advancements presented, it is crucial to ensure that these methods are rigorously validated to guarantee their effectiveness and reliability in real-world applications. Each method introduced in this dissertation has undergone comprehensive user studies. FSCNet for cephalometric landmark localization was validated with the involvement of forensic experts, who assessed its accuracy, usability, and practical implications in a forensic scenario.
For the CUSP module for facial age editing and the SAGE technique for text-guided image editing, diverse user groups took part in the validation process, which compared user preference for these methods, and their performance, against existing techniques. These studies help ensure that the developed methods not only perform well in controlled environments but also meet the demands of actual forensic investigations and general image editing tasks. This user-centric approach underlines the commitment to advancing forensic science through methods that are both scientifically robust and practically applicable, ultimately enhancing the overall credibility and impact of forensic analyses.

In conclusion, this dissertation presents groundbreaking advancements in methodologies that can be used in forensic facial imaging through the development of FSCNet, the CUSP module for facial age editing, and the SAGE technique for text-guided image editing. These contributions could significantly enhance the accuracy, efficiency, and reliability of forensic analyses, addressing critical challenges in the field.
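
The abstract describes FSCNet's refinement stage as predicting the displacement between the center of a cropped image and the landmark. The following is a minimal sketch of that coarse-to-fine idea only, not the thesis implementation: the function names, the patch size, and the toy `brightest_pixel_offset` predictor (a stand-in for the trained convolutional network) are all assumptions made for illustration.

```python
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, float]            # (x, y) in image coordinates
Image = Sequence[Sequence[float]]      # grayscale image as rows of pixel values
Patch = List[List[float]]


def crop_patch(image: Image, cx: int, cy: int, half_size: int) -> Patch:
    """Crop a square patch centered on (cx, cy), zero-padded outside the image."""
    height, width = len(image), len(image[0])
    patch = []
    for y in range(cy - half_size, cy + half_size + 1):
        row = []
        for x in range(cx - half_size, cx + half_size + 1):
            inside = 0 <= y < height and 0 <= x < width
            row.append(image[y][x] if inside else 0.0)
        patch.append(row)
    return patch


def refine_landmarks(
    image: Image,
    initial: Sequence[Point],
    predict_displacement: Callable[[Patch], Point],
    half_size: int = 2,
) -> List[Point]:
    """Refine coarse landmarks: crop around each one, predict an offset from
    the crop center, and add that offset back to the crop center."""
    refined = []
    for x0, y0 in initial:
        cx, cy = int(round(x0)), int(round(y0))   # crop center in pixels
        patch = crop_patch(image, cx, cy, half_size)
        dx, dy = predict_displacement(patch)      # offset relative to crop center
        refined.append((float(cx + dx), float(cy + dy)))
    return refined


def brightest_pixel_offset(patch: Patch) -> Point:
    """Hypothetical stand-in for the refinement network: returns the offset of
    the brightest pixel from the patch center."""
    half = len(patch) // 2
    best_value, best_row, best_col = patch[0][0], 0, 0
    for r, row in enumerate(patch):
        for c, value in enumerate(row):
            if value > best_value:
                best_value, best_row, best_col = value, r, c
    return (float(best_col - half), float(best_row - half))


# Toy usage: a 7x7 image with a single bright "landmark" at x=4, y=3,
# and a coarse initial estimate at (3, 2).
image = [[0.0] * 7 for _ in range(7)]
image[3][4] = 1.0
print(refine_landmarks(image, [(3.0, 2.0)], brightest_pixel_offset))
```

In a real cascade the coarse estimates would come from the first-stage model and the predictor would be a trained regression network; the sketch only shows how a center-relative displacement maps back to image coordinates.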