One of the most challenging problems facing biomedical informatics today is the integration of disparate information resources arising from the different branches of biological research and clinical medicine. Researchers are flooded with information in a variety of formats, ranging from lab instrument data, gene expression profiles, raw sequence traces, chemical screening data and proteomics data, to metabolic pathway models and full-fledged life-science ontologies developed according to a myriad of incompatible and typically only loosely formalized schemas. This information is presented in general terms, without referring to specific identifiable organisms, experiments, and so forth.
At the other end of the spectrum are electronic health records (EHRs), which consist primarily of descriptions of a patient’s medical condition, the treatments administered, and the outcomes obtained. These descriptions, too, are made using general terms that are either based on natural language or taken from semi-formalized classification systems, terminologies or ontologies.
Only very few of these descriptions contain explicit references to corresponding particular entities. This lack of explicit reference is usually a minor problem for human interpreters, who can disambiguate the reference of a general term such as ‘pain in the upper leg’ by taking account of contextual clues pertaining to times, places and persons. For machines, however, the use of such terms makes an accurate understanding of EHR data nearly impossible.
Even those EHR systems which incorporate data in more structured formats, for example by using controlled vocabularies, terminologies or ontologies, are in no better shape in this respect. This is because the terms or codes drawn from these systems are used simply as an alternative to what would otherwise have been registered by means of general terms in natural language. By picking a code from such a system and then registering that code in an EHR, one refers generically to some instance of the class represented by the code. Which particular instance in concrete reality is intended is still left, at best, only partially and indirectly specified.
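The contrast between referring generically through a code and referring explicitly to a particular entity can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a description of the RTU's software: the class names, the field names, the placeholder code value and the use of UUIDs as instance identifiers are all hypothetical and chosen only for this example.

```python
from dataclasses import dataclass
from typing import Optional, Union
import uuid


@dataclass(frozen=True)
class CodedEntry:
    """A conventional record entry: a patient plus a class-level code (hypothetical)."""
    patient_id: str
    code: str  # placeholder value, not a real terminology code


@dataclass(frozen=True)
class InstanceEntry:
    """An entry that also names the particular entity being described (hypothetical)."""
    patient_id: str
    code: str
    instance_id: str  # assumed unique identifier for one particular entity


def new_instance_id() -> str:
    """Create an identifier for a newly observed particular (a UUID, by assumption)."""
    return str(uuid.uuid4())


def same_referent(a: Union[CodedEntry, InstanceEntry],
                  b: Union[CodedEntry, InstanceEntry]) -> Optional[bool]:
    """Return True or False when both entries name an instance, otherwise None."""
    if isinstance(a, InstanceEntry) and isinstance(b, InstanceEntry):
        return a.instance_id == b.instance_id
    # With codes alone, a program cannot tell whether two entries
    # describe the same particular entity or two different ones.
    return None


# Two coded entries for the same patient: the referent is undetermined.
visit1 = CodedEntry("patient-1", "PAIN-UPPER-LEG")
visit2 = CodedEntry("patient-1", "PAIN-UPPER-LEG")
print(same_referent(visit1, visit2))  # None

# Entries that carry an explicit instance identifier can be compared directly.
pain = new_instance_id()
first = InstanceEntry("patient-1", "PAIN-UPPER-LEG", pain)
follow_up = InstanceEntry("patient-1", "PAIN-UPPER-LEG", pain)
other = InstanceEntry("patient-1", "PAIN-UPPER-LEG", new_instance_id())
print(same_referent(first, follow_up))  # True
print(same_referent(first, other))      # False
```

In this sketch, the two coded entries agree in every field, yet nothing in the data says whether they record one pain or two. Only the entries that carry an explicit identifier let a program decide.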
The mission of the Referent Tracking Unit (RTU) is to carry out fundamental and applied research and software development aimed at making better use of both (1) the data about particular patients that reside in EHRs and (2) the patient-independent data typically found in biomedical research databases. This is achieved through a new paradigm: Referent Tracking. The work of the RTU is designed to allow biomedical and bioinformatics researchers to exploit the wealth of information stored in patient data repositories. At the same time, it is designed to offer clinicians new and higher-quality types of evidence for the appropriateness of given diagnoses or therapeutic hypotheses through seamless access to the research data generated by biologists and bioinformaticians.
To this end, the RTU aims to:
- improve the diagnosis, treatment and post-treatment care of patients, either directly, during the process of individual patient care, or indirectly, through the development of new kinds of preventive medicine and of new kinds of informatics-based tools for doing translational research;
- achieve the linking of heterogeneous data in such a way that systems can produce demonstrably useful results;
- develop and apply hard measures of usefulness in the various health sciences disciplines such as medicine, nursing, pharmacy, public health and dentistry;
- install a dynamically growing data repository of pseudonymised patient data fed by various health facilities in the Buffalo area, accessible for both research and evidence-based patient management, satisfying the requirements of all national and local legislation and ethical conduct rules;
- develop middleware components able to use patient record data as input for web-based retrieval of either (1) similar cases or (2) general, textbook-type descriptions of the relevant topics;
- build a coherent set of ontologies to link together the various resources in a semantically coherent way.
Werner CEUSTERS, MD
Werner Ceusters studied medicine, neuro-psychiatry, informatics and knowledge engineering in Belgium. Since 1993, he has been involved in numerous national and European research projects in the area of Electronic Health Records, Natural Language Understanding and Ontology. Prior to coming to Buffalo, he was Executive Director of the European Centre for Ontological Research at Saarland University, Germany.
He is currently Professor in the Departments of Biomedical Informatics and Psychiatry of the School of Medicine and Biomedical Sciences, SUNY at Buffalo, NY; Director of the Ontology Research Group of the New York State Center of Excellence in Bioinformatics and Life Sciences; Director of Research of the UB Institute for Healthcare Informatics; and PhD Program Director of the UB Department of Biomedical Informatics.
His research is focused on the application of Referent Tracking for data management, on the requirements for ontologies and terminologies to be useful for annotation under this framework, and on the principles information systems must follow to use ontologies and terminologies optimally.