Compiler-assisted checkpointing of message-passing applications in heterogeneous environments

  1. Rodríguez, Gabriel
Supervised by:
  1. María J. Martín Supervisor
  2. Patricia González Supervisor

Defense university: Universidade da Coruña

Defense date: 16 December 2008

Committee:
  1. Emilio Luque Fadón Chair
  2. Francisco Fernández Rivera Secretary
  3. Ignacio Martín Llorente Member
  4. Dolores Isabel Rexachs del Rosario Member
  5. Andrés Gómez Tato Member

Type: Thesis

Teseo: 178223

Abstract

With the evolution of high performance computing towards heterogeneous, massively parallel systems, parallel applications have developed new checkpoint and restart needs. Whether due to a failure in the execution or to a migration of the processes to different machines, checkpointing tools must be able to operate in heterogeneous environments. However, some of the data manipulated by a parallel application are not truly portable. Examples include opaque state (e.g. data structures for communications support) and multiple interfaces for a single feature (e.g. communications, I/O). Directly manipulating the underlying ad-hoc representations renders checkpointing tools incapable of working across different environments. Portable checkpointers usually work around portability issues at the cost of transparency: the user must provide information such as what data needs to be stored, where to store it, or where to checkpoint. CPPC (ComPiler for Portable Checkpointing) is a checkpointing tool designed to feature both portability and transparency, while preserving the scalability of the executed applications. It is made up of a library and a compiler. The CPPC library contains routines for variable-level checkpointing, using portable code and protocols. The CPPC compiler achieves transparency by relieving the user from time-consuming tasks, such as performing code analyses and adding instrumentation code.
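
To make the idea of variable-level, portable checkpointing concrete, the following is a minimal sketch in Python. It is not CPPC's API or file format: the function names (save_checkpoint, load_checkpoint), the type tags, and the byte layout are all assumptions made for illustration. The point it shows is that each variable is stored by name and declared type with an explicit byte order, rather than as a raw memory image, so the file can be read back on a machine with a different architecture.

```python
import io
import struct

# Hypothetical type tags mapped to explicit little-endian layouts.
# These are illustrative assumptions, not CPPC's actual format.
_FORMATS = {"int64": "<q", "float64": "<d"}


def _write_text(stream, text):
    """Write a length-prefixed UTF-8 string."""
    raw = text.encode("utf-8")
    stream.write(struct.pack("<H", len(raw)))
    stream.write(raw)


def _read_text(stream):
    """Read a length-prefixed UTF-8 string."""
    (length,) = struct.unpack("<H", stream.read(2))
    return stream.read(length).decode("utf-8")


def save_checkpoint(stream, variables):
    """Store {name: (type_tag, value)} independently of host byte order."""
    stream.write(struct.pack("<I", len(variables)))
    for name, (tag, value) in variables.items():
        _write_text(stream, name)
        _write_text(stream, tag)
        stream.write(struct.pack(_FORMATS[tag], value))


def load_checkpoint(stream):
    """Rebuild the {name: (type_tag, value)} mapping from a checkpoint."""
    (count,) = struct.unpack("<I", stream.read(4))
    restored = {}
    for _ in range(count):
        name = _read_text(stream)
        tag = _read_text(stream)
        fmt = _FORMATS[tag]
        (value,) = struct.unpack(fmt, stream.read(struct.calcsize(fmt)))
        restored[name] = (tag, value)
    return restored
```

In this sketch the caller still lists the variables to save by hand. That is the transparency cost the abstract attributes to portable checkpointers, and the kind of task the CPPC compiler is described as taking over.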