
Referent Tracking Literature

Introductory readings

[Ceusters & Smith 2005a] is the first paper that introduced the idea of Referent Tracking (RT) as a new paradigm for entry and retrieval of data in the Electronic Health Record (EHR). It contains an easy-to-read introduction to the sort of problems that arise when general terms taken from terminologies or ontologies (such as 'fracture' and 'left femur') are used in the EHR to refer to particular entities on the side of the patient, such as the particular fracture that John suffered from in 2005 in his left femur. It explains how all ambiguities can be avoided by referring to such entities by means of identifiers instead of codes. A more thorough and formal discussion of the types of statements that are required, as well as the technical infrastructure for real implementations, can be found in [Ceusters & Smith 2005b].
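The difference between referring by code and referring by identifier can be illustrated with a minimal sketch. It is not taken from the cited papers: the `Registry` class, the `assign` method, the `IUI-` prefix format, and the entry layout are all hypothetical names chosen for illustration.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Particular:
    iui: str          # instance unique identifier (format is an assumption)
    description: str  # human-readable gloss, for illustration only


class Registry:
    """Hypothetical store that hands out one identifier per particular."""

    def __init__(self):
        self._by_iui = {}

    def assign(self, description):
        particular = Particular(f"IUI-{uuid.uuid4()}", description)
        self._by_iui[particular.iui] = particular
        return particular


# Code-based entries: both visits carry the general term 'fracture'.
# Nothing says whether they concern one fracture or two.
coded_entries = [
    {"visit": 1, "code": "fracture"},
    {"visit": 2, "code": "fracture"},
]

# Identifier-based entries: both visits point at the same particular.
registry = Registry()
fracture = registry.assign("John's 2005 fracture in his left femur")
tracked_entries = [
    {"visit": 1, "about": fracture.iui, "instance_of": "fracture"},
    {"visit": 2, "about": fracture.iui, "instance_of": "fracture"},
]


def distinct_referents(entries):
    """Return the set of particulars that the entries are about."""
    return {entry["about"] for entry in entries}


print(len(distinct_referents(tracked_entries)))
```

With the coded entries there is no field from which the number of fractures could be derived; with the tracked entries the shared identifier settles the question.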

RT follows a set of rigorous principles based on philosophical realism, a branch of ontology whose foundations for application to biomedicine are discussed in [Smith et al. 2005]. Also important for understanding what is wrong with the prevailing concept-based approach, and how RT can contribute to a solution, are [Smith & Ceusters 2006a], which covers biomedical terminologies and ontologies in general, and [Smith & Ceusters 2005b], which focuses on the EHR. Key in all this is to remain constantly aware of the distinction between three levels: reality, our understanding of reality, and our representations of reality. Among these representations, ontologies are artifacts dealing with what is general in reality, and patient records with what is specific [Smith et al. 2006]. RT is primarily concerned with what is specific.

Challenges being solved

Of course, RT presents some challenges of its own. One specific problem is how to represent phenomena commonly expressed by statements such as: "no history of diabetes", "hypertension ruled out", "absence of metastases in the lung", and "abortion was prevented". Such statements seem at first sight to present a problem for RT, since here there are no entities on the side of the patient to which unique identifiers can be assigned. For concept-based systems this is not an issue, since entities such as "diabetes" or "no diabetes" are both happily classified as "concepts", no further questions being asked. We solved this problem by introducing the 'lacks' relation which, like the 'instance' relation, holds between particulars and universals, thereby remaining faithful to the principles of unqualified realism within an EHR regime based on the idea of faithfulness to clinical reality [Ceusters et al. 2006a].
We expanded this idea in [Ceusters et al. 2007a] by describing the lacks-relation at the level of universals as well.
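As a rough illustration of the idea, the sketch below records "no history of diabetes" as a statement relating an existing particular (the patient) to a universal, so that no identifier has to be minted for a non-existent diabetes. The `Assertion` type, the relation strings, and the identifiers are hypothetical simplifications; the formal definition of the relation should be taken from [Ceusters et al. 2006a].

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    particular: str  # identifier of a particular that exists (assumed format)
    relation: str    # "instance_of" or "lacks"
    universal: str   # name of a universal, e.g. "diabetes"


statements = [
    # "no history of diabetes": the patient exists, a diabetes instance does not.
    Assertion("IUI-patient-1", "lacks", "diabetes"),
    # A positive finding, for contrast: this fracture exists and is identified.
    Assertion("IUI-fracture-1", "instance_of", "fracture"),
]


def lacked_universals(assertions, particular):
    """List the universals that a given particular is asserted to lack."""
    return [
        a.universal
        for a in assertions
        if a.particular == particular and a.relation == "lacks"
    ]


print(lacked_universals(statements, "IUI-patient-1"))
```

Every identifier in the sketch denotes something that exists; the negative finding is carried entirely by the relation.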

Another challenge is keeping track of the different kinds of changes, reflecting for example: (1) changes in the underlying reality, either in a specific patient's condition or the world in general; (2) changes in our understanding; (3) reassessments of what is considered to be relevant for inclusion in a referent tracking database; or (4) encoding mistakes introduced during data entry. In [Ceusters & Smith 2006a] these issues are addressed from the perspective of versioning in ontologies and repositories. The method developed is then used in [Ceusters et al. 2007b] to assess the history mechanism of SNOMED CT. We found that this mechanism would benefit from (1) an explicit representation of the provenance of a class; (2) the separation of the time period during which a component is stated to be valid in SNOMED CT from the period during which it is (or has been) valid in reality; and (3) a redesign of the historical relationships table to give users better assistance in recovering from introduced mistakes.
An application of these principles to the Gene Ontology, including a technique that uses the principles for forecasting the quality of future versions, is described in [Ceusters 2009].
In [Ceusters 2006] the technique is used to assess the quality of ontologies involved in mapping efforts.

Since June 2007, we have been tackling the issue of mistakes in an RTS. [Ceusters 2007] is the first publication in this area. It gives an introduction to the sorts of mistakes that may arise, and proposes a solution for dealing with them.

[Ceusters 2023] offers a very detailed introduction to the principles upon which the methodology rests and how these principles can be applied to improve the quality of the problem list in medical records. The paper introduces the second version of Referent Tracking, which includes treatment of uncertainty.


RT was introduced in 2006 and since then many concrete applications have been developed. An overview of the early efforts to demonstrate the usefulness of referent tracking in different domains is given in [Ceusters & Smith 2007b]. The details can be found in papers published around that time. [Ceusters & Smith 2006b] discusses how the paradigm can be used to make flow-chart types of decision support applications ready for the Semantic Web. [Ceusters & Smith 2006c] suggests applying the RT principles to the Digital Object Identifier (DOI) system in the context of digital rights management. [Ceusters & Smith 2007a] describes how the principles of referent tracking are used to assess the quality of enterprise ontologies and how they can lead to building better corporate memories.
How to develop an adequate representation of adverse events, a topic that we address in the context of the RAPS project, is covered in [Ceusters et al. 2008a] and [Ceusters et al. 2011].
In [Hogan 2011], a view is presented according to which ICD-9-CM codes are taken as diagnostic statements, where these statements are about entities that exist in reality. These entities are represented according to a realist view of disease, disorder, and diagnosis as defined by the Ontology for General Medical Science and using Referent Tracking templates. The approach is illustrated using ICD-9-CM codes that refer to systemic arterial hypertension.
In [Ceusters & Manzoor 2010] we discuss how the RT paradigm and its implementation in networks of RT systems can function as an enabling technology to make the vision of a Globally Networked and Integrated Intelligence Enterprise come true. Referent tracking uses a system of singular and globally unique identifiers to track not only entities and events in first-order reality, but also the data and information elements that are created to describe such entities and events in information systems. By doing so, it meets the requirements of the Information Sharing Strategy plan of the US.
Related to this is [MCS2009], which proposes an approach that allows the contents of Joint Battle Management Language messages to be stored in a Referent Tracking System in a format that mimics the structure of reality, thereby providing an aid to message validation.
In 2014 we introduced the idea of maximally self-explanatory and explicit datasets. Using Referent Tracking as a basis, we describe in [Ceusters et al. 2014] a technical data wrangling strategy which consists in creating for each dataset a template that, when applied to each particular record in the dataset, leads to the generation of a collection of Referent Tracking Tuples (RTTs) built out of unique identifiers for the entities described by means of the data items in the record. The proposed strategy is based on (i) the distinction between data and what data are about, and (ii) the explicit descriptions of portions of reality which RTTs provide and which range not only over the particulars described by data items in a dataset, but also over these data items themselves.
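The template-per-dataset strategy can be sketched roughly as follows. This is an illustrative simplification, not the format used in the cited paper: the `RTT` triple layout, the `TEMPLATE` structure, the relation names, and the sample records are all assumptions, and the sketch mints fresh identifiers for every record rather than recognising that two records may concern the same patient.

```python
from itertools import count
from typing import NamedTuple


class RTT(NamedTuple):
    """Simplified stand-in for a Referent Tracking Tuple (assumed layout)."""
    subject: str
    relation: str
    target: str


# One template per dataset: which entities each record is taken to describe,
# and how they relate. The data item is itself listed as an entity.
TEMPLATE = {
    "entities": {
        "patient": "human being",
        "weight_datum": "data item",
    },
    "relations": [("weight_datum", "is_about", "patient")],
}


def apply_template(template, records):
    """Generate tuples for every record by instantiating the template."""
    ids = count(1)
    tuples = []
    for _record in records:
        # Mint one identifier per entity role for this record.
        iui = {role: f"IUI-{next(ids)}" for role in template["entities"]}
        for role, universal in template["entities"].items():
            tuples.append(RTT(iui[role], "instance_of", universal))
        for subject, relation, target in template["relations"]:
            tuples.append(RTT(iui[subject], relation, iui[target]))
    return tuples


records = [{"weight_kg": 70}, {"weight_kg": 82}]  # made-up sample data
rtts = apply_template(TEMPLATE, records)
for rtt in rtts:
    print(rtt)
```

Because the data item receives an identifier of its own, the generated tuples can state what the datum is about, which reflects the distinction between data and what data are about.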
In [Ceusters & Bona 2016], we propose Ontological Realism and Referent Tracking as a methodology to identify and describe (1) which components within the ontological structure exhibited by the configurations of entities observed and measured by devices on the Internet of Things (IoT) are essential and (2) the abstract syntax towards which the output of IoT devices (or the subsequent interpretation thereof) should be formatted, for such devices and their operation to minimize both the burden of data entry and the risks for assertion errors.
[Barton et al. 2017] shows that certain predicates expressed in temporal databases exhibit ambiguities when the values of some of the attributes of the entities described refer to classes or universals rather than to individuals, and how such ambiguities can be avoided by means of ontology-based representations that deal carefully with the particulars involved, such as referent tracking systems.


[Manzoor et al. 2007a] contains a description of how the Referent Tracking System (RTS) that is developed in the RTU implements the Referent Tracking paradigm. To put this system into practice, we propose an architecture based on middleware technology using web services. A preliminary analysis in the context of electronic health records has been carried out in collaboration with Medtuity Inc. In [Rudnicki et al. 2007a] we describe how data from an EHR application need to be decomposed in order to make them accord with the tenets of RT. We outline the ontological principles on which this decomposition is based. In [Manzoor et al. 2007b], we describe the functional and technical requirements of such an approach and document our experiences with MedtuityEMR, an EHR system that stores patient data in XML.
In mid-2008, a version of the RTS application using the peer-to-peer (P2P) paradigm became available [Manzoor et al. 2008]. This enables the data to be shared among distributed peers running at geographically different locations.
[Ceusters & Manzoor 2009a] covers the first implementation of the RTU website as an RT-enabled website.
[Hogan et al. 2011] contains a methodology for tracking the use of local identifiers in healthcare organizations. Such identifiers denote first-order entities such as particular persons, healthcare encounters, organizations, and so forth, and are used in electronic health records (EHRs) - in addition to other biomedical software applications - in the form of unique identifiers that follow local format conventions. An approach is developed that represents local identifiers in the same way as an RTS represents other entities: by assigning them an IUI in their own right.
An example of a "soft" implementation of Referent Tracking is described in [Blaisure & Ceusters 2018]. It shows that the modifier system of i2b2 can be used to represent the relationships between observations and their explicit or implied referents on the one hand, and between relevant referents themselves on the other hand, both in combination with the storage of explicit unique instance identifiers for these observations and referents in i2b2’s fact table. While this approach adheres to the base functionality and implementation specifications of i2b2, it makes explicit ambiguities and confusions that would otherwise remain undetected.
In [Stoeckert et. al. 2018] the use of RT in the Penn TURBO project is described. RT is demonstrated to be useful in integrating data from different sources that contain multiple references to the same entity as well as incomplete or conflicting information, which makes it necessary to track the provenance of information when deciding what the actual phenotype of a person is.

List of relevant papers

Background material

  • [Smith et al. 2005a] Smith B, Ceusters W, Klagges B, Koehler J, Kumar A, Lomax J, Mungall C, Neuhaus F, Rector A, Rosse C. Relations in biomedical ontologies, Genome Biology 2005, 6:R46.
  • [Smith & Ceusters 2005b] Smith B, Ceusters W. An Ontology-Based Methodology for the Migration of Biomedical Terminologies to Electronic Health Records. AMIA 2005, October 22-26, Washington DC;:669-673. (draft).
  • [Smith & Ceusters 2006a] Smith B, Ceusters W. Ontology as the Core Discipline of Biomedical Informatics: Legacies of the Past and Recommendations for the Future Direction of Research, forthcoming in Gordana Dodig Crnkovic and Susan Stuart (eds.) Computing, Philosophy, And Cognitive Science, Cambridge: Cambridge Scholars Press, 2006. (full paper).
  • [Smith et al. 2006] Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA (draft)

Theoretical aspects of referent tracking

Applying referent tracking in specific contexts


  • [Stoeckert et. al. 2018] Stoeckert, C.J., et al. Transforming and Unifying Research with Biomedical Ontologies: The Penn TURBO Project, in ICBO 2018. 2018, CEUR Workshop Proceedings: Corvallis, OR. (paper)
  • [Blaisure & Ceusters 2018] Blaisure J, Ceusters W. Enhancing the Representational Power of i2b2 through Referent Tracking. AMIA 2018 Annual Symposium, San Francisco, CA, Nov 03-07, 2018. (paper, response to reviewers)
  • [Hogan et al. 2011] Hogan WR, Garimalla S, Tariq SA, Ceusters W. Representing Local Identifiers in a Referent-Tracking System (extended abstract), International Conference on Biomedical Ontology, Buffalo NY, July 28-30, 2011 (in press). (accepted extended abstract, unpublished long version, response to reviewers)
  • [Ceusters & Manzoor 2009a] Ceusters W, Manzoor S. Applying Referent Tracking to the Use and Evolution of Websites. InterOntology 2009, Tokyo, Japan, February 28 - March 1, 2009;:63-76. (draft)
  • [Manzoor et al. 2008] Manzoor S, Ceusters W, Rudnicki R, Arp R. The Referent Tracking System as a Peer to Peer Application. In: Khoshgoftaar T (ed.) Proceedings of The Ninth IASTED International Conference on Software Engineering and Applications (SEA 2008), Orlando, Florida, USA, November 16-18, 2008. Acta Press, Anaheim, Calgary, Zurich, 2008;:112-117. (paper, reviews)
  • [Rudnicki et al. 2007a] Rudnicki R, Ceusters W, Manzoor S, Smith B. What Particulars are Referred to in EHR Data? A Case Study in Integrating Referent Tracking into an Electronic Health Record Application. In Teich JM, Suermondt J, Hripcsak C. (eds.), American Medical Informatics Association 2007 Annual Symposium Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago IL, 2007;:630-634. (abstract, draft)
  • [Manzoor et al. 2007a] Manzoor S, Ceusters W, Rudnicki R. Implementation of a Referent Tracking System. International Journal of Healthcare Information Systems and Informatics 2007;2(4):41-58. (summary, final draft, full paper).
  • [Manzoor et al. 2007b] Manzoor S, Ceusters W, Rudnicki R. A Middleware Approach to Integrate Referent Tracking in EHR Systems. In Teich JM, Suermondt J, Hripcsak C. (eds.), American Medical Informatics Association 2007 Annual Symposium Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago IL, 2007;:503-507. (abstract, draft)