Referent Tracking Literature
Introductory readings
[Ceusters & Smith 2005a] is the paper that first introduced the idea of Referent Tracking (RT) as a new paradigm for entry and retrieval of data in the Electronic Health Record (EHR). It contains an easy-to-read introduction to the sort of problems that arise when general terms taken from terminologies or ontologies (such as 'fracture' and 'left femur') are used in the EHR to refer to particular entities on the side of a patient, such as the particular fracture that John suffered in 2005 in his left femur. It explains how these ambiguities can be avoided by referring to such entities by means of identifiers instead of codes. A more thorough and formal discussion of the types of statements that are required, as well as of the technical infrastructure for real implementations, can be found in [Ceusters & Smith 2005b].
RT follows a set of rigorous principles based on philosophical realism, a branch of ontology whose foundations for application to biomedicine are discussed in [Smith et al. 2005a]. Also important for understanding what is wrong with the prevailing concept-based approach, and how RT can contribute to a solution, are [Smith & Ceusters 2006a], which covers biomedical terminologies and ontologies in general, and [Smith & Ceusters 2005b], which focuses on the EHR. Key in all this is to remain constantly aware of the distinction between three levels: reality, our understanding of reality, and our representations of reality. Among these representations, ontologies are artifacts dealing with what is general in reality, and patient records with what is specific [Smith et al. 2006]. RT is primarily concerned with the latter, that is, with what is specific.
Challenges being solved
Of course, RT presents some challenges of its own. One specific problem
is how to represent phenomena commonly expressed by statements such as:
"no history of diabetes", "hypertension ruled out",
"absence of metastases in the lung", and "abortion was
prevented". Such statements seem at first sight to present a problem
for RT, since there are here no entities on the side of the patient to which
unique identifiers can be assigned. For concept-based systems, this is not
an issue since entities such as "diabetes" or "no
diabetes" are both happily classified as "concepts", no
further questions being asked. We solved this problem by introducing
the 'lacks' relation which, like the 'instance' relation, holds
between particulars and universals, thereby remaining faithful to the
principles of unqualified realism within an EHR regime based on the idea of
faithfulness to clinical reality [Ceusters et al. 2006a].
We expanded this idea in [Ceusters et al. 2007a] by describing the lacks-relation at the level of universals as well.
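The idea of recording a negative finding as a relation between a particular and a universal can be illustrated with a minimal sketch. It is not taken from the cited papers: the `Assertion` class, the `new_iui` helper and the relation labels are hypothetical names chosen for illustration, and the sketch leaves out the temporal and other parameters that a full treatment of the 'lacks' relation would need.

```python
from dataclasses import dataclass
from uuid import uuid4


def new_iui() -> str:
    """Hypothetical helper: mint a unique identifier for a particular."""
    return f"IUI-{uuid4()}"


@dataclass(frozen=True)
class Assertion:
    particular: str  # identifier of a particular, e.g. a patient
    relation: str    # 'instance_of' or 'lacks' (illustrative labels)
    universal: str   # label of a universal taken from an ontology


# The patient is a particular and receives an identifier of his own.
john = new_iui()

assertions = [
    Assertion(john, "instance_of", "human being"),
    # "No history of diabetes": no diabetes particular exists to be identified,
    # so the statement relates the patient directly to the universal.
    Assertion(john, "lacks", "diabetes"),
]


def lacked_universals(assertions, particular):
    """Return the universals that the given particular is asserted to lack."""
    return [
        a.universal
        for a in assertions
        if a.particular == particular and a.relation == "lacks"
    ]


print(lacked_universals(assertions, john))
```

Note that no identifier is minted for a "no diabetes" entity; only the patient, who exists, is identified.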
Another challenge is keeping track of the different kinds of changes,
reflecting for example: (1) changes in the underlying reality, either in a
specific patient's condition or the world in general; (2) changes in our
understanding; (3) reassessments of what is considered to be relevant for
inclusion in a referent tracking database, or (4) encoding mistakes
introduced during data entry. In [Ceusters & Smith 2006a] these issues are addressed from the perspective of versioning in ontologies and repositories. The method developed is then used in [Ceusters et al. 2007b] to assess the history mechanism of SNOMED CT. We found that this mechanism would benefit from (1) an explicit representation of the provenance of a class; (2) the separation of the time period during which a component is stated to be valid in SNOMED CT from the period during which it is (or has been) valid in reality; and (3) a redesign of the historical relationships table to give users better assistance in recovering from introduced mistakes.
An application of these principles to the Gene Ontology, including a technique for using the principles to forecast the quality of future versions, is described in [Ceusters 2009].
In [Ceusters 2006] the technique is used to assess the quality of ontologies involved in mapping efforts.
Since June 2007, we have been tackling the issue of mistakes in a Referent Tracking System (RTS). [Ceusters 2007] is the first publication in this area. It gives an introduction to the sorts of mistakes that may arise, and proposes a solution for dealing with them.
[Ceusters 2023] offers a very detailed introduction to the principles upon which the methodology rests and to how these principles can be applied to improve the quality of the problem list in medical records. The paper introduces the second version of Referent Tracking, which includes a treatment of uncertainty.
Applications
RT was introduced in 2005 and many concrete applications have been developed since then. An overview of the early efforts to demonstrate the usefulness of referent tracking in different domains is given in [Ceusters & Smith 2007b]. The details can be found in papers published around that time. [Ceusters & Smith 2006b] discusses how the paradigm can be used to
make flow-chart types of decision support applications ready for the
Semantic Web. In [Ceusters & Smith 2006c] it is suggested to apply the
RT principles to the Digital Object Identifier (DOI) system in the context
of digital rights management. [Ceusters & Smith 2007a] describes how the principles of referent tracking are used to assess the quality of enterprise ontologies and how they can lead to building better corporate memories.
How to develop an adequate representation of adverse events, a topic that we address in the context of the RAPS project, is covered in [Ceusters et al. 2008a] and
[Ceusters et al. 2011].
In [Hogan 2010], a view is presented according to which ICD-9-CM codes are taken as diagnostic statements,
where these statements are about entities that exist in reality.
These entities are represented according to a realist view
of disease, disorder, and diagnosis as defined by the Ontology
for General Medical Science and using Referent Tracking
templates. The approach is illustrated using ICD-9-CM
codes that refer to systemic arterial hypertension.
In [Ceusters & Manzoor 2010] we discuss how the RT paradigm and its implementation in networks of RT systems can function as an enabling technology to make the vision of a Globally Networked and Integrated Intelligence Enterprise come true. Referent tracking uses a system of singular and globally unique identifiers to track not only entities and events in first-order reality, but also the data and information elements that are created to describe such entities and events in information systems. By doing so, it meets the requirements of the Information Sharing Strategy plan of the US.
Related to this is [Manzoor et al. 2009], in which an approach is proposed that allows
storage of the contents of Joint Battle Management Language messages in a Referent Tracking System in a format that mimics the structure of reality thereby providing an aid to message validation.
In 2014 we introduced the idea of maximally self-explanatory and explicit datasets. Using Referent Tracking as a basis, we describe in [Ceusters et al. 2014] a technical data wrangling strategy which consists in creating a template for each dataset. When applied to a particular record in the dataset, the template generates a collection of Referent Tracking Tuples (RTTs) built out of unique identifiers for the entities described by the data items in the record. The proposed strategy is based on (i) the distinction between data and what data are about, and (ii) the explicit descriptions of portions of reality which RTTs provide and which range not only over the particulars described by data items in a dataset, but also over these data items themselves.
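The template-based strategy can be pictured with a small sketch. Everything in it is an assumption made for illustration: the template layout, the column names `patient_id` and `dx_code`, the relation labels and the `new_iui` helper are hypothetical, and the triples produced are a simplified stand-in for real RTTs, which carry more information.

```python
from itertools import count

_ids = count(1)


def new_iui() -> str:
    """Hypothetical helper: mint the next unique identifier."""
    return f"IUI-{next(_ids)}"


# Hypothetical template for a dataset with columns 'patient_id' and 'dx_code'.
TEMPLATE = {
    # slot -> (universal, key column); a key column of None means
    # "one new entity per record".
    "slots": {
        "patient": ("human being", "patient_id"),
        "disorder": ("disorder", None),
        "dx_item": ("data item", None),
    },
    # Relations between slots: what the data are about, kept apart from the data.
    "relations": [
        ("disorder", "inheres_in", "patient"),
        ("dx_item", "is_about", "disorder"),
    ],
    # Data items keep the literal value found in the record.
    "values": [("dx_item", "dx_code")],
}


def apply_template(template, row_index, row, registry):
    """Generate simplified tuples for one record, reusing known identifiers."""
    iuis, tuples = {}, []
    for slot, (universal, key_col) in template["slots"].items():
        key = (slot, row[key_col]) if key_col else (slot, row_index)
        if key not in registry:  # first mention: mint and classify the entity
            registry[key] = new_iui()
            tuples.append((registry[key], "instance_of", universal))
        iuis[slot] = registry[key]
    for subj, rel, obj in template["relations"]:
        tuples.append((iuis[subj], rel, iuis[obj]))
    for slot, column in template["values"]:
        tuples.append((iuis[slot], "has_value", row[column]))
    return tuples


rows = [
    {"patient_id": "P1", "dx_code": "401.9"},
    {"patient_id": "P1", "dx_code": "250.00"},
]
registry = {}
rtts = [t for i, row in enumerate(rows) for t in apply_template(TEMPLATE, i, row, registry)]
for t in rtts:
    print(t)
```

In this sketch both records mention the same patient, so the patient receives a single identifier, while each record contributes its own disorder and its own data item, the latter being identified in its own right.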
In [Ceusters & Bona 2016], we propose Ontological Realism and Referent Tracking as a methodology to identify and
describe (1) which components within the ontological structure exhibited by the
configurations of entities observed and measured by devices on the Internet of Things (IoT) are essential and (2)
the abstract syntax towards which the output of IoT devices (or the subsequent
interpretation thereof) should be formatted, for such devices and their operation to
minimize both the burden of data entry and the risks for assertion errors.
In [Barton et al. 2017] it is shown that certain predicates
expressed in temporal databases exhibit ambiguities when the values of some of the attributes of the entities described refer to classes or
universals, rather than to individuals and how such ambiguities can be avoided by means of ontology-based representations that deal carefully
with the particulars involved, such as referent tracking systems.
Implementations
[Manzoor et al. 2007a] contains a description of how the Referent Tracking System (RTS) that is developed in the RTU implements the Referent Tracking paradigm. To put this system into practice, we propose an architecture based on middleware technology using web services. A preliminary analysis in the context of electronic health records has been carried out in collaboration with Medtuity Inc. In [Rudnicki et al. 2007a] we describe how data from an EHR application need to be decomposed in order to make them accord with the tenets of RT. We outline the ontological principles on which this decomposition is based. In [Manzoor et al. 2007b], we describe the functional and technical requirements of such an approach and document our experiences with MedtuityEMR, an EHR system that stores patient data in XML.
In mid-2008, a version of the RTS application using the peer-to-peer (P2P) paradigm became available [Manzoor et al. 2008]. This enables the data to be shared among distributed peers running at geographically different locations.
[Ceusters & Manzoor 2009a] covers the first implementation of the RTU website as an RT-enabled website.
[Hogan et al. 2011] contains a methodology for tracking the use of local identifiers in healthcare organizations. Such identifiers denote first-order entities such as particular persons, healthcare encounters, organizations, and so forth, and are used in electronic health records (EHRs) - in addition to other biomedical software applications - in the form of unique identifiers that follow local format conventions. An approach is developed that represents local identifiers in the same way as an RTS represents other entities: by assigning them an IUI in their own right.
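As a rough illustration of what it means for a local identifier to receive an IUI in its own right, the following sketch treats a medical record number as an entity that is distinct from the person it denotes. The relation labels, the sample value `MRN-00123` and the `new_iui` helper are hypothetical and are not taken from the cited paper.

```python
from itertools import count

_ids = count(1)


def new_iui() -> str:
    """Hypothetical helper: mint the next unique identifier."""
    return f"IUI-{next(_ids)}"


person = new_iui()    # the first-order entity: a particular person
local_id = new_iui()  # the local identifier itself, tracked as an entity

tuples = [
    (person, "instance_of", "human being"),
    (local_id, "instance_of", "local identifier"),
    (local_id, "denotes", person),
    (local_id, "has_text_value", "MRN-00123"),
]


def denoted_by(tuples, text_value):
    """Find the entity denoted by the local identifier with this text value."""
    ids = {s for s, rel, o in tuples if rel == "has_text_value" and o == text_value}
    for s, rel, o in tuples:
        if rel == "denotes" and s in ids:
            return o
    return None


print(denoted_by(tuples, "MRN-00123"))
```

Because the identifier and the person have separate IUIs, statements about the identifier (for example its format or the system that issued it) can be kept apart from statements about the person.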
An example of a "soft" implementation of Referent Tracking is described in [Blaisure & Ceusters 2018]. It shows that the modifier system of i2b2 can be used to represent the relationships between observations and their explicit or implied referents on the one hand, and between relevant referents themselves on the other hand, both in combination with the storage of explicit unique instance identifiers for these observations and referents in i2b2’s fact table. While this approach adheres to the base functionality and implementation specifications of i2b2, it makes explicit ambiguities and confusions that would otherwise remain undetected.
In [Stoeckert et al. 2018] the use of RT in the Penn TURBO project is described. RT is shown to be useful in integrating data from different sources which contain multiple references to the same entity as well as incomplete or conflicting information, so that the provenance of information must be tracked when deciding what the actual phenotype of a person is.
List of relevant papers
Background material
- [Smith et al. 2005a] Smith B, Ceusters W, Klagges B, Koehler J, Kumar A, Lomax J, Mungall C, Neuhaus F, Rector A, Rosse C. Relations in biomedical ontologies, Genome Biology 2005, 6:R46.
- [Smith & Ceusters 2005b] Smith B, Ceusters W. An Ontology-Based Methodology for the Migration of Biomedical Terminologies to Electronic Health Records. AMIA 2005, October 22-26, Washington DC;:669-673. (draft).
- [Smith & Ceusters 2006a] Smith B, Ceusters W. Ontology as the Core Discipline of Biomedical Informatics: Legacies of the Past and Recommendations for the Future Direction of Research, forthcoming in Gordana Dodig Crnkovic and Susan Stuart (eds.) Computing, Philosophy, And Cognitive Science, Cambridge: Cambridge Scholars Press, 2006. (full paper).
- [Smith et al. 2006] Smith B, Kusnierczyk W, Schober D, Ceusters W. Towards a Reference Terminology for Ontology Research and Development in the Biomedical Domain. Proceedings of KR-MED 2006, Biomedical Ontology in Action, November 8, 2006, Baltimore MD, USA (draft)
Theoretical aspects of referent tracking
- [Ceusters 2007] Ceusters W. Dealing with Mistakes in a Referent Tracking System. In: Hornsby KS (ed.) Proceedings of Ontology for the Intelligence Community 2007 (OIC-2007), Columbia MD, 28-29 November 2007;:5-8. (Extended abstract, publication)
- [Ceusters et al. 2007a] Ceusters W, Elkin P, Smith B. Negative Findings in Electronic Health Records and Biomedical Ontologies: A Realist Approach. International Journal of Medical Informatics 2007;76(Suppl 3):S326-S333. (draft, abstract, full paper).
- [Ceusters et al. 2006a] Ceusters W, Elkin P, Smith B. Referent Tracking: The Problem of Negative Findings, Stud Health Technol Inform. 2006;124:741-6. (Presented at MIE2006) (draft paper, slides)
- [Ceusters & Smith 2006a] Ceusters W, Smith B. A Realism-Based Approach to the Evolution of Biomedical Ontologies. Proceedings of AMIA 2006, Washington DC, November 11-15, 2006, pp 121-125. (draft)
- [Ceusters 2006] Ceusters W. Towards A Realism-Based Metric for Quality Assurance in Ontology Matching. In: Bennett B, Fellbaum C. (eds.) Formal Ontology in Information Systems, IOS Press, Amsterdam, 2006;:321-332. Proceedings of FOIS-2006, Baltimore, Maryland, November 9-11, 2006. (draft)
- [Ceusters & Smith 2005a] Ceusters W, Smith B. Tracking Referents in Electronic Health Records. In: Engelbrecht R. et al. (eds.) Medical Informatics Europe, IOS Press, Amsterdam, 2005;:71-76. (draft, slides)
- [Ceusters & Smith 2005b] Ceusters W, Smith B. Strategies for Referent Tracking in Electronic Health Records. J Biomed Inform. 2006 Jun;39(3):362-78. (ePub 2005 Sep 9, slides presented during the IMIA WG6 workshop Ontology and Biomedical Informatics, Rome, Italy, April 29 - May 1, 2005)
Applying referent tracking in specific contexts
- [Ceusters 2023] Ceusters W. The place of Referent Tracking in Biomedical Informatics. In Elkin, Peter (ed.) Terminology, Ontology and Their Implementations. Springer Nature. (print version, preprint).
- [Barton et al. 2017] Adrien Barton, Christina Khnaisser, Luc Lavoie, Jean-François Ethier. Ambiguities in Medical Bitemporalized Relational Databases: A Referent Tracking View. Special Topic Conference The Joint Ontology Workshops 2017 (JOWO-2017), Bozen-Bolzano, Italy, September 21–23, 2017. (paper).
- [Ceusters & Bona 2016] Ceusters W, Bona J. Ontological Foundations for Tracking Data Quality through the Internet of Things. Special Topic Conference Transforming Healthcare with the Internet of Things (EFMI-STC2016), Paris, France, April 17-19, 2016; Stud Health Technol Inform. 2016;221:74-8. (slides, paper, response to reviewers)
- [Ceusters et al. 2014] Ceusters W, Hsu CY, Smith B. Clinical Data Wrangling using Ontological Realism and Referent Tracking. International Conference on Biomedical Ontologies, ICBO 2014, Houston, Texas, Oct 6-9, 2014. (paper, response to reviewers, slides)
- [Ceusters et al. 2011] Ceusters W, Capolupo M, De Moor G, Devlies J, Smith B. An Evolutionary Approach to Realism-Based Adverse Event Representations. Methods of Information in Medicine, 2011;50(1):62-73. (Epub, uncorrected draft accepted for publication, response to reviewers).
- [Hogan 2010] Hogan WR. To what entities does an ICD-9-CM code refer? A realist approach. In: Shah N, Sansone S-A, Stephens S, Soldatova L, editors. Bio-ontologies; Boston, MA, 2010. (paper)
- [Ceusters & Manzoor 2010] Ceusters W, Manzoor S. How to track absolutely everything? In: Obrst L, Janssen T, Ceusters W (eds.) Ontologies and Semantic Technologies for the Intelligence Community. Frontiers in Artificial Intelligence and Applications. IOS Press Amsterdam, 2010;:13-36. (final draft)
- [Manzoor et al. 2009] Manzoor S, Ceusters W, Smith B. Referent Tracking for Command and Control Messaging Systems. Ontology for the Intelligence Community 2009 (OIC-2009), Fairfax Virginia, October 21-22, 2009. (paper, slides)
- [Ceusters 2009] Ceusters W. Applying Evolutionary Terminology Auditing to the Gene Ontology. Journal of Biomedical Informatics 2009;42:518–529. (Official version, accepted draft, reviewers comments and responses)
- [Ceusters et al. 2008a] Ceusters W, Capolupo M, De Moor G, Devlies J. Introducing Realist Ontology for the Representation of Adverse Events. In: Eschenbach C, Gruninger M. (eds.) Formal Ontology in Information Systems, IOS Press, Amsterdam, 2008;:237-250. (final draft)
- [Ceusters et al. 2007b] Ceusters W, Spackman KA, Smith B. Would SNOMED CT benefit from Realism-Based Ontology Evolution? In Teich JM, Suermondt J, Hripcsak C. (eds.), American Medical Informatics Association 2007 Annual Symposium Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago IL, 2007;:105-109. (abstract, draft)
- [Ceusters & Smith 2007b] Ceusters W, Smith B. Referent Tracking and its Applications. In: Proceedings of the WWW2007 Workshop i3: Identity, Identifiers, Identification. Banff, Canada, May 8, 2007, CEUR Workshop Proceedings, ISSN 1613-0073, online http://ceur-ws.org/Vol-249/submission_105.pdf.
- [Ceusters & Smith 2007a] Ceusters W, Smith B. Referent Tracking for Corporate Memories. In: Rittgen P. (ed.) Handbook of Ontologies for Business Interaction. Hershey, New York and London: Information Science Reference, 2007, 34-46. (internal document used as basis)
- [Ceusters & Smith 2006b] Ceusters W, Smith B. Referent Tracking for Treatment Optimisation in Schizophrenic Patients. Journal of Web Semantics 4(3) 2006:229-36; Special issue on semantic web for the life sciences. (Long draft, official preprint, published paper)
- [Ceusters & Smith 2006c] Ceusters W, Smith B. Referent Tracking for Digital Rights Management. International Journal of Metadata, Semantics and Ontologies 2007;2(1):45-53. (draft, published version)
Implementation
- [Stoeckert et al. 2018] Stoeckert, C.J., et al. Transforming and Unifying Research with Biomedical Ontologies: The Penn TURBO Project, in ICBO 2018. 2018, CEUR Workshop Proceedings: Corvallis, OR. (paper)
- [Blaisure & Ceusters 2018] Blaisure J, Ceusters W. Enhancing the Representational Power of i2b2 through Referent Tracking. AMIA 2018 Annual Symposium, San Francisco, CA, Nov 03-07, 2018. (paper, response to reviewers)
- [Hogan et al. 2011] Hogan WR, Garimalla S, Tariq SA, Ceusters W. Representing Local Identifiers in a Referent-Tracking System (extended abstract), International Conference on Biomedical Ontology, Buffalo NY, July 28-30, 2011 (in press). (accepted extended abstract, unpublished long version, response to reviewers)
- [Ceusters & Manzoor 2009a] Ceusters W, Manzoor S. Applying Referent Tracking to the Use and Evolution of Websites. InterOntology 2009, Tokyo, Japan, February 28 - March 1, 2009;:63-76. (draft)
- [Manzoor et al. 2008] Manzoor S, Ceusters W, Rudnicki R, Arp R. The Referent Tracking System as a Peer to Peer Application. In: Khoshgoftaar T (ed.) Proceedings of The Ninth IASTED International Conference on Software Engineering and Applications (SEA 2008), Orlando, Florida, USA, November 16-18, 2008. Acta Press, Anaheim, Calgary, Zurich, 2008;:112-117. (paper, reviews)
- [Rudnicki et al. 2007a] Rudnicki R, Ceusters W, Manzoor S, Smith B. What Particulars are Referred to in EHR Data? A Case Study in Integrating Referent Tracking into an Electronic Health Record Application. In Teich JM, Suermondt J, Hripcsak C. (eds.), American Medical Informatics Association 2007 Annual Symposium Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago IL, 2007;:630-634. (abstract, draft)
- [Manzoor et al. 2007a] Manzoor S, Ceusters W, Rudnicki R. Implementation of a Referent Tracking System. International Journal of Healthcare Information Systems and Informatics 2007;2(4):41-58. (summary, final draft, full paper).
- [Manzoor et al. 2007b] Manzoor S, Ceusters W, Rudnicki R. A Middleware Approach to Integrate Referent Tracking in EHR Systems. In Teich JM, Suermondt J, Hripcsak C. (eds.), American Medical Informatics Association 2007 Annual Symposium Proceedings, Biomedical and Health Informatics: From Foundations to Applications to Policy, Chicago IL, 2007;:503-507. (abstract, draft)