
​Projects

Genomic data from large genomic knowledgebases (KBs) such as the Gene Ontology (GO) are being integrated into clinical diagnosis and disease prediction. This integration is particularly useful for predicting complex diseases whose highly heterogeneous genotypes make biomarker identification difficult. Several types of machine learning (ML) models have been applied to identify the relatively small number of disease-associated genetic sequences among the large number of common variants carried by an individual [1]. Biomedical knowledge organization systems (KOSs) that encode relationships between variants, genes, and diseases promise to improve the precision and performance of these ML models.

I use GO as a case study of a biomedical KOS: it provides rich annotation data in a directed acyclic graph (DAG) structure that can serve as training data for ML algorithms that identify potential disease-triggering gene products. Unlike approaches in bioinformatics, my research does not focus on designing models and packages. Rather, I examine data quality, ontology structure, and cross-links with external resources (e.g., the Disease Ontology) to evaluate the current design of KOSs for biological research, applying theories from knowledge organization and library science. Past findings in this area have come from either bioinformatics or computer science scholars. My role is to show the importance of knowledge work in bridging these two communities, and to discuss how ontology data can be used in large language models (LLMs) to improve trustworthiness and precision.
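As a rough illustration of how DAG-structured annotations might be turned into ML training features, the sketch below propagates each gene product's direct annotations up to all ancestor terms (GO's "true path rule") and then builds a binary gene-by-term matrix. The term IDs, gene names, and function names are all hypothetical placeholders, not real GO identifiers and not the pipeline used in this project.

```python
# Hypothetical toy ontology: child term -> parent terms.
# These IDs are made-up placeholders, not real GO identifiers.
PARENTS = {
    "T:root": [],
    "T:signaling": ["T:root"],
    "T:synaptic_signaling": ["T:signaling"],
    "T:development": ["T:root"],
    "T:neuron_development": ["T:development", "T:signaling"],
}

# Hypothetical direct annotations: gene product -> annotated terms.
DIRECT = {
    "geneA": {"T:synaptic_signaling"},
    "geneB": {"T:neuron_development"},
    "geneC": {"T:development"},
}


def ancestors(term, parents):
    """Return every ancestor of `term` in the DAG (the term itself excluded)."""
    seen = set()
    stack = list(parents.get(term, []))
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents.get(current, []))
    return seen


def propagate(direct, parents):
    """Expand each gene's direct annotations with all ancestor terms."""
    full = {}
    for gene, terms in direct.items():
        expanded = set(terms)
        for term in terms:
            expanded |= ancestors(term, parents)
        full[gene] = expanded
    return full


def feature_matrix(annotations, parents):
    """Build binary feature rows with one column per term, sorted by ID."""
    columns = sorted(parents)
    rows = {
        gene: [1 if term in terms else 0 for term in columns]
        for gene, terms in annotations.items()
    }
    return columns, rows


FULL = propagate(DIRECT, PARENTS)
COLUMNS, ROWS = feature_matrix(FULL, PARENTS)

for gene in sorted(ROWS):
    print(gene, ROWS[gene])
```

Because a term can have more than one parent, the traversal tracks visited terms so that shared ancestors are counted once. The resulting rows could be passed to any classifier that accepts binary features.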

Currently, I am testing the use of GO annotation data to identify gene products that may be associated with autism, using three ML algorithms: Random Forest, Support Vector Machine, and Gradient Boosting. A demo of this preliminary step will be presented at the DCMI 2024 NKOS workshop (https://www.dublincore.org/conferences/2024/sessions/nkos-workshop/).

In 2023 I began to learn and use social network analysis (SNA) for science of science (SoS) research in the Metadata Lab. I became interested in this method and, to complete my research practicum, conducted a project on climate change skeptics and believers on YouTube. I used SNA to plot the users who commented on videos, and on one another's comments, about whether to believe or deny anthropogenic climate change.
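One simple way to build a commenter network of this kind is to link two users whenever they comment on the same video. The sketch below does that with plain Python; the comment records, the stance labels, and the function names are hypothetical, and this is only one of several possible network constructions (a reply network would be another), not necessarily the one used in the project.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical comment records: (user, video_id, stance of the video).
COMMENTS = [
    ("u1", "v1", "believer"),
    ("u2", "v1", "believer"),
    ("u2", "v2", "skeptic"),
    ("u3", "v2", "skeptic"),
    ("u4", "v2", "skeptic"),
    ("u1", "v1", "believer"),  # repeat comment by the same user
]


def co_comment_edges(comments):
    """Weight each user pair by the number of videos both commented on."""
    users_by_video = defaultdict(set)
    for user, video, _stance in comments:
        users_by_video[video].add(user)
    weights = defaultdict(int)
    for users in users_by_video.values():
        for a, b in combinations(sorted(users), 2):
            weights[(a, b)] += 1
    return dict(weights)


def degree(edges):
    """Count the distinct neighbours of each user."""
    counts = defaultdict(int)
    for a, b in edges:
        counts[a] += 1
        counts[b] += 1
    return dict(counts)


def bridging_users(comments):
    """Return users who commented on videos of more than one stance."""
    stances = defaultdict(set)
    for user, _video, stance in comments:
        stances[user].add(stance)
    return sorted(user for user, seen in stances.items() if len(seen) > 1)


EDGES = co_comment_edges(COMMENTS)
DEGREE = degree(EDGES)
BRIDGES = bridging_users(COMMENTS)

print(EDGES)
print(DEGREE)
print(BRIDGES)
```

Users who appear on both sides of the debate are natural candidates for closer reading, since they connect otherwise separate clusters of commenters.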


Using a free YouTube API key, I collected comments on videos that either affirm or deny climate change. This polarized topic has drawn attention from scientists, non-scientists, influencers, politicians, and others, even though there is evidence of human-caused global warming. It is an interesting issue because people may be skeptical of climate change for one or more reasons:

(1) Knowledge

(2) Critical reasoning

(3) Political/social identity


And possibly (4) the value placed on good scientific inquiry, which leads some people to reject certain scientific evidence as invalid or untrustworthy because it does not meet the standards of an objective, rigorous research process. Currently, I am studying this fourth reason in a group of scientists who identify as climate change deniers, or at least skeptics.


Qiaoyi (Joy) Liu

School of Information Studies, Syracuse University
​Syracuse, NY 13244

 

©2024 by Qiaoyi Joy Liu.
