06-03, 11:45–12:10 (Asia/Jerusalem), Hall 2 (PyData)
The centralized database that holds clinical trial data needs standardization, and Python tools are used to help this effort.
This is joint work with Joshua Schertz.
ClinicalTrials.Gov is the database where clinical trial data from all over the world is registered. Some clinical trials are required by U.S. law to report their findings in this database. It currently holds over 300,000 clinical trials, more than 10% of which have numeric results. However, since many entities enter data into this fast-growing database, the data is not standardized. Specifically, numerical data cannot be interpreted reliably because the units are not standardized. Over 23,000 different units were detected in this database in 2019, and many of those units are the same unit written differently. This talk will discuss how we use Python tools to 1) process and index the data, 2) find similar units using NLP and machine learning, and 3) create a web site that supports user mapping of those units. We created ClinicalUnitMapping.com to support the standardization effort for those units. New elements of this presentation will discuss how units from existing medical standards such as UCUM, RTMMS, and CDISC are incorporated into the Python processing pipeline. The intention is to create a unit standard that can map all units reported by clinical trials. With such a standard, the data in this clinical trials database would become machine comprehensible.
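To illustrate what "similar only written differently" can mean in practice, here is a minimal sketch that normalizes unit strings and ranks candidate matches by plain string similarity. It is not the pipeline presented in the talk, which uses NLP and machine learning. It swaps in the standard library's `difflib.SequenceMatcher` as a simple stand-in, and the function names, the sample units, and the 0.8 threshold are all assumptions made for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize_unit(unit: str) -> str:
    """Lowercase, trim, and collapse spacing so trivial variants compare equal."""
    text = unit.strip().lower()
    text = re.sub(r"\s*/\s*", "/", text)  # "mg / dL" -> "mg/dl"
    text = re.sub(r"\s+", " ", text)      # collapse repeated whitespace
    return text


def similarity(a: str, b: str) -> float:
    """Return a 0..1 similarity score between two normalized unit strings."""
    return SequenceMatcher(None, normalize_unit(a), normalize_unit(b)).ratio()


def candidate_matches(unit: str, known_units: list[str], threshold: float = 0.8) -> list[str]:
    """Return known units scoring at or above the threshold, best match first.

    The threshold is an arbitrary illustrative value, not a tuned one.
    """
    scored = [(similarity(unit, known), known) for known in known_units]
    return [known for score, known in sorted(scored, reverse=True) if score >= threshold]


if __name__ == "__main__":
    # Hypothetical sample of standard units to match against.
    standard_units = ["mg/dL", "mmHg", "kg"]
    print(candidate_matches("MG / dl", standard_units))
```

A ranking like this would only propose candidates. Deciding whether two spellings truly denote the same unit still needs human review, which is the role a mapping web site can play.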
Jacob Barhak is an independent researcher and developer who specializes in chronic disease modeling, with an emphasis on computational and technological solutions. Dr. Barhak developed The Reference Model for disease progression on his own as a freelancer. He is also the developer of the Micro Simulation Tool (MIST). Dr. Barhak has a diverse international background in engineering and computing science. He is an advocate of non-blind public scientific review. He is active within the Python community and runs the Austin Evening of Python Coding meetup.
For a longer list of activities, visit his page.