Pattern recognition beyond classification: an abductive framework for time series interpretation

  1. Teijeiro Campo, Tomás
Supervised by:
  1. Paulo Félix Lamas (Supervisor)
  2. Jesús María Rodríguez Presedo (Co-supervisor)

Defended at: Universidade de Santiago de Compostela

Defense date: 28 April 2017

Examination committee:
  1. Leif Sörnmo (Chair)
  2. Richard J. Duro Fernández (Secretary)
  3. Yuval Shahar (Member)
Department:
  1. Departamento de Electrónica e Computación

Type: Thesis

Abstract

This work proposes a novel knowledge-based framework for time series interpretation, grounded in the initial hypothesis that abduction provides a proper reasoning paradigm to overcome the major limitations of traditional classification-based approaches, and inspired by how humans identify and characterize the patterns appearing in a time series. This framework relies on some basic assumptions: (i) interpretation of the behavior of a system from the set of available observations is a sort of conjecturing, and as such follows the logic of abduction; (ii) the interpretation task involves both bottom-up and top-down processing of information along a set of abstraction levels; (iii) at the lower levels of abstraction, the interpretation task is a form of precompiled knowledge-based pattern recognition; (iv) the interpretation task involves both the representation of time and reasoning about time and along time. The framework provides a knowledge representation formalism based on the notion of temporal abstraction pattern. A temporal abstraction pattern defines an abstraction relation between observables and provides the knowledge and methods to conjecture new observations from previous ones, thus assigning a dual role to the observables: as a hypothesis on the observation of an underlying process, and as the pieces of evidence supporting that hypothesis. This relation between observables may be established at multiple abstraction levels by using the hypothesis observable of an abstraction pattern as the evidence for a pattern at a higher level, thus building a hierarchical language for the description and characterization of the processes that can be observed in a specific domain.
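The dual role of observables described above can be illustrated with a minimal Python sketch. It is not taken from the published source code: the class names `Observation` and `AbstractionPattern`, the `max_gap` constraint, and the observable names ("P wave", "QRS complex", "T wave", "Heartbeat", "Rhythm episode") are all assumptions chosen for illustration, and the temporal constraint is deliberately simplistic.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Observation:
    """An observed (or conjectured) instance of an observable over a time interval."""
    observable: str
    start: float
    end: float
    # Findings that support this observation when it was conjectured as a hypothesis.
    evidence: List["Observation"] = field(default_factory=list)


@dataclass
class AbstractionPattern:
    """Hypothetical pattern: an ordered sequence of evidence observables
    abstracted into one hypothesis observable."""
    hypothesis: str
    evidence_types: List[str]
    max_gap: float  # assumed constraint: largest allowed gap between consecutive findings

    def abstract(self, findings: List[Observation]) -> Optional[Observation]:
        """Conjecture the hypothesis if the findings match the pattern, else return None."""
        if not findings:
            return None
        if [f.observable for f in findings] != self.evidence_types:
            return None
        for prev, nxt in zip(findings, findings[1:]):
            gap = nxt.start - prev.end
            if gap < 0 or gap > self.max_gap:
                return None
        return Observation(self.hypothesis, findings[0].start,
                           findings[-1].end, list(findings))


# Level 1: waves are the evidence, a heartbeat is the hypothesis.
beat_pattern = AbstractionPattern(
    "Heartbeat", ["P wave", "QRS complex", "T wave"], max_gap=0.2)
# Level 2: heartbeats, hypotheses at level 1, now play the role of evidence.
rhythm_pattern = AbstractionPattern(
    "Rhythm episode", ["Heartbeat", "Heartbeat"], max_gap=1.0)

p = Observation("P wave", 0.00, 0.10)
qrs = Observation("QRS complex", 0.16, 0.26)
t = Observation("T wave", 0.40, 0.60)
beat1 = beat_pattern.abstract([p, qrs, t])

p2 = Observation("P wave", 0.80, 0.90)
qrs2 = Observation("QRS complex", 0.96, 1.06)
t2 = Observation("T wave", 1.20, 1.40)
beat2 = beat_pattern.abstract([p2, qrs2, t2])

rhythm = rhythm_pattern.abstract([beat1, beat2])
```

In this sketch, `beat1` is a hypothesis with respect to the three waves and a piece of evidence with respect to `rhythm`, which is the sense in which patterns stack into a hierarchy of abstraction levels.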
In addition to the knowledge representation formalism, a set of algorithms for the resolution of interpretation problems is provided, implementing a hypothesize-and-test cycle guided by an attentional mechanism that dynamically builds an interpretation according to four heuristic principles: 1) a coverage principle, which maximizes the explained evidence; 2) a simplicity principle, which minimizes the number of hypotheses in the final interpretation; 3) an abstraction principle, which prefers the use of explanatory hypotheses at higher abstraction levels; and 4) a predictability principle, which prioritizes interpretations that properly predict future evidence. The algorithms work both in off-line mode, in which all the evidence is available before the beginning of the interpretation, and in on-line mode, supporting the continuous inclusion of new evidence during the interpretation. As a paradigmatic example of a time series interpretation application, the proposed framework has been applied to some well-known problems in the electrocardiogram (ECG) analysis domain. On the one hand, the non-monotonic nature of abductive reasoning makes it possible to correct both false negative and false positive QRS detections in a state-of-the-art algorithm, thus improving an essential stage in classical ECG processing algorithms. On the other hand, an interpretation at multiple abstraction levels has proven to be valuable for the construction of a set of high-level features describing the heart function in the same terms used by experts, enabling us to build a simple rule-based heartbeat classifier that outperforms state-of-the-art automatic classifiers, and even most classifiers requiring expert assistance to provide a result. The key factor behind these results is the non-monotonic nature of the hypothesize-and-test cycle, making it possible to exploit the complementarity between bottom-up and top-down processing, in order to find the best explanation consistent with the evidence.
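As a rough illustration of how the four heuristic principles could rank competing interpretations, the following Python sketch compares candidates lexicographically. The abstract does not state how the principles are combined, so the lexicographic ordering, the `Candidate` fields, and the example figures are assumptions made purely for illustration and do not describe the thesis algorithms.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Candidate:
    """Hypothetical summary of one candidate interpretation."""
    name: str
    explained: int     # pieces of evidence explained (coverage)
    hypotheses: int    # number of hypotheses used (simplicity)
    max_level: int     # highest abstraction level reached (abstraction)
    predicted_ok: int  # later findings correctly predicted (predictability)


def preference_key(c: Candidate) -> Tuple[int, int, int, int]:
    """Smaller key means a more preferred interpretation.

    Assumed priority: coverage, then simplicity, then abstraction,
    then predictability. Quantities to maximize are negated.
    """
    return (-c.explained, c.hypotheses, -c.max_level, -c.predicted_ok)


def best_interpretation(candidates: List[Candidate]) -> Candidate:
    """Return the most preferred candidate under the assumed ordering."""
    return min(candidates, key=preference_key)


candidates = [
    Candidate("A", explained=10, hypotheses=4, max_level=1, predicted_ok=2),
    Candidate("B", explained=10, hypotheses=3, max_level=2, predicted_ok=1),
    Candidate("C", explained=9, hypotheses=1, max_level=3, predicted_ok=5),
]
best = best_interpretation(candidates)
```

Here candidate C is discarded first because it explains less evidence, and B is preferred over A because it explains the same evidence with fewer hypotheses. In a hypothesize-and-test setting a key of this kind would be re-evaluated as new evidence arrives, so the preferred interpretation may change over time.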
As an additional result, a comprehensive knowledge base for the ECG domain has been formalized and, with the aim of supporting reproducible research, the full source code implementing both the abstraction model and the interpretation algorithms has been published under an open source license.