Informatics Graduate Theses and PhD Dissertations

Recent Submissions

  • Item
    Integrated Correlation Analysis of Proteomics and Transcriptomics Data in Alzheimer's Disease
    (2020-12) Modekurty, Suneeta; Liu, Xiaowen; Wan, Jun; Zheng, Jiaping
    We investigated whether significant correlations exist between two -omics layers in Alzheimer's disease (AD). The pipeline first performed differential expression analysis on the two datasets (proteomics and transcriptomics) individually: an in-depth analysis of the proteomics data, followed by differential expression analysis of the RNA-seq data, and then a correlation analysis of the differentially expressed proteins (from the proteomics data) and genes (from the RNA-seq data). The proteomics data comprised AD (N = 84), Control (N = 31), and PSP (N = 85) samples and yielded 114 differentially expressed proteins (DEPs). The RNA-seq data comprised AD (N = 82), Control (N = 31), and PSP (N = 84) samples and yielded 61 differentially expressed genes (DEGs). A correlation analysis using Spearman’s correlation coefficient between proteins involved in AD revealed 192 highly significant correlations (p-value <= 0.00000000000005), with a mean correlation coefficient of r = 0.52. The same analysis between genes involved in AD revealed 208 highly significant correlations (p-value <= 0.00000000000005), also with a mean correlation coefficient of r = 0.52. A Spearman correlation analysis between proteins and genes involved in AD revealed 395 significant correlations (p-value <= 0.0001) with a correlation coefficient of +0.53; these correlations might help in understanding the molecular pathways behind the disease and could uncover new prospects for understanding the disease and designing treatments. We observed that different genes interact with different proteins (correlation coefficient r >= 0.5, p-value < 0.05).
We also observed that a single protein interacts with multiple genes, and a single gene is interestingly associated with multiple proteins. The patterns of correlations are also different in that a protein/gene positively correlates with some proteins/genes and negatively with some other proteins/genes. We hope that this observation is quite useful. However, understanding how it works and how they interact with each other needs further assessment at the molecular level.
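The abstract above relies on Spearman's rank correlation between protein and gene measurements. As a minimal sketch of that statistic only (not the thesis pipeline), the snippet below ranks two vectors, averaging tied ranks, and computes the Pearson correlation of the ranks. The function names and the toy values in `protein` and `gene` are hypothetical and chosen purely for illustration; p-values are not computed here.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j over the run of values equal to the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical abundance values for one protein and one gene across 5 samples.
protein = [1.2, 2.4, 3.1, 4.8, 5.0]
gene = [10.0, 20.0, 15.0, 40.0, 50.0]
rho = spearman(protein, gene)
```

In practice a statistics library would typically be used to obtain both the coefficient and its p-value for every protein-gene pair; the sketch only shows what the coefficient measures.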
  • Item
    Bridging The Gap Between Healthcare Providers and Consumers: Extracting Features from Online Health Forum to Meet Social Needs of Patients using Network Analysis and Embedding
    (2020-08) Mokashi, Maitreyi; Chakraborty, Sunandan; Jones, Josette; Zheng, Jiaping
    Chronic disease patients have to face many issues during and after their treatment. A lot of these issues are either personal, professional, or social in nature. It may so happen that these issues are overlooked by the respective healthcare providers and become major obstacles in the patient’s day-to-day life and their disease management. We extract data from an online health platform that serves as a ‘safe haven’ to the patients and survivors to discuss help and coping issues. This thesis presents a novel approach that acts as the first step to include the social issues discussed by patients on online health forums which the healthcare providers need to consider in order to create holistic treatment plans. There are numerous online forums where patients share their experiences and post questions about their treatments and their subsequent side effects. We collected data from an “Online Breast Cancer Forum”. On this forum, users (patients) have created threads across many related topics and shared their experiences and questions. We connect the patients (users) with the topic in which they have posted by converting the data into a bipartite network and turn the network nodes into a high-dimensional feature space. From this feature space, we perform community detection on the node embeddings to unearth latent connections between patients and topics. We claim that these latent connections, along with the existing ones, will help to create a new knowledge base that will eventually help the healthcare providers to understand and acknowledge the non-medical related issues to a treatment, and create more adaptive and personalized plans. We performed both qualitative and quantitative analysis on the obtained embeddings to prove the superior quality of our approach and its potential to extract more information when compared to other models.
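The abstract above describes connecting forum users to the topics they posted in through a bipartite network. A minimal sketch of that first step is shown below, assuming posts are available as (user, topic) pairs; the user names, topic labels, and function names are hypothetical. The Jaccard overlap at the end is a simple stand-in for comparing users and is not the node embedding or community detection method used in the thesis.

```python
from collections import defaultdict

# Hypothetical (user, topic) pairs, one per forum post.
posts = [
    ("user_a", "returning to work"),
    ("user_a", "side effects"),
    ("user_b", "side effects"),
    ("user_b", "family support"),
    ("user_c", "family support"),
]


def build_bipartite(pairs):
    """Build both sides of a user-topic bipartite network as adjacency sets."""
    user_to_topics = defaultdict(set)
    topic_to_users = defaultdict(set)
    for user, topic in pairs:
        user_to_topics[user].add(topic)
        topic_to_users[topic].add(user)
    return dict(user_to_topics), dict(topic_to_users)


def jaccard(user_to_topics, u, v):
    """Share of topics two users have in common (0.0 if neither has posted)."""
    a = user_to_topics.get(u, set())
    b = user_to_topics.get(v, set())
    union = a | b
    return len(a & b) / len(union) if union else 0.0


user_to_topics, topic_to_users = build_bipartite(posts)
sim_ab = jaccard(user_to_topics, "user_a", "user_b")
sim_ac = jaccard(user_to_topics, "user_a", "user_c")
```

Here `user_a` and `user_c` share no topic directly, yet both are linked to `user_b`; indirect links of this kind are what an embedding of the network could surface as latent connections.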
  • Item
    Data-Driven Accountability: Examining and Reorienting the Mythologies of Data
    (2020-05) Verma, Nitya; Dombrowski, Lynn; Bolchini, Davide; Young, Alyson; Seybold, Peter; Voida, Amy; Muller, Michael
    In this work, I examine and design sociotechnical interventions for addressing limitations around data-driven accountability, particularly focusing on politically contentious and systemic social issues (i.e., police accountability). While organizations across sectors of society are scrambling to adopt data-driven technologies and practices, there are epistemological and ethical concerns around how data use influences decisionmaking and actionability. My work explores how stakeholders adopt and handle the challenges around being data-driven, advocating for ways HCI can mitigate such challenges. In this dissertation, I highlight three case studies that focus on data-driven, human-services organizations, which work with at-risk and marginalized populations. First, I examine the tools and practices of nonprofit workers and how they experience the mythologies associated with data use in their work. Second, I investigate how police officers are adopting data-driven technologies and practices, which highlights the challenges police contend with in addressing social criticisms around police accountability and marginalization. Finally, I conducted a case study with multiple stakeholders around police accountability to understand how systemic biases and politically charged spaces perceive and utilize data, as well as to develop the design space around how alternative futures of being data-driven could support more robust and inclusive accountability. I examine how participants situate the concepts of power, bias, and truth in the data-driven practices and technologies used by and around the police. With this empirical work, I present insights that inform the HCI community at the intersection of data design, practice, and policies in addressing systemic social issues.
  • Item
    Exploring The Effect Of Visual And Verbal Feedback On Ballet Dance Performance In Mirrored And Non-Mirrored Environments
    (2016-05) Trajkova, Milka; Cafaro, Francesco; Bolchini, Davide; Mannheimer, Steve
    Since the 1800s, the ballet studio has been largely unchanged, a core feature of which is the mirror. The influence of mirrors on ballet education has been documented, and prior literature has shown negative effects on dancers’ body image, satisfaction, level of attention and performance quality. While the mirror provides immediate real-time feedback, it does not inform dancers of their errors. Tools have been developed to do so, but the design of the feedback from a bottom-up perspective has not been extensively studied. The following study aimed to assess the value of different types of feedback to inform the design of tech-augmented mirrors. University students’ ballet technique scores were evaluated on eight ballet combinations (tendue, adagio, pirouette, petit allegro, plié, degage, frappe and battement tendue), and feedback was provided to them. We assessed learning with a remote domain expert to determine whether or not the system had an impact on dancers. Results revealed that the effect of the feedback treatment was statistically significant, with higher performance than without feedback. Mirror versus non-mirror performance did not present any score disparity, indicating that users performed similarly in both conditions. The best fit was seen when visual and verbal feedback were combined. We created MuscAt, a set of interconnected feedback design principles, which led us to conclude that remote teaching in ballet is feasible.
  • Item
    End-User Needs of Fragmented Databases in Higher Education Data Analysis and Decision Making
    (2019-05) Briggs, Amanda; Cafaro, Francesco; Dombrowski, Lynn; Reda, Khairi
    In higher education, a wealth of data is available to advisors, recruiters, marketers, and program directors. However, data sources can be accessed in a variety of ways and often do not seem to represent the same data set, presenting users with the confounding notion that data sources are in conflict with one another. As users are identifying new ways of accessing and analyzing this data, they are modifying existing work practices and sometimes creating their own databases. To understand how users are navigating these databases, the researchers employed a mixed methods research design, including a survey and interviews, to understand the needs of end users who are accessing these seemingly fragmented databases. The study resulted in three overarching categories – access, understandability, and use – that affect work practices for end users. The researchers used these themes to develop a set of broadly applicable design recommendations as well as six sets of sketches for implementation – development of a data gateway, training, collaboration, tracking, definitions and roadblocks, and time management.
  • Item
    Explore the relations between personality and gamification
    (2018-01-22) Jia, Yuan; Bolchini, Davide; Voida, Stephen; MacDorman, Karl; Defazio, Joseph
    Successful gamification motivates users to engage in systems using game-like experiences. However, a one-size-fits-all approach to gamification is often unsuccessful; prior studies suggest that personality serves as a key differentiator in the effectiveness of the approach. To advance the understanding of personality differences and their influence on users’ behavior and motivation in gamification, this dissertation is comprised of three studies that: 1) explore the relationships among individuals’ personality traits and preferences for different gamification features through an online survey; 2) investigate how people with different personality traits respond to the motivational affordances in a gamified application over a period of time through a diary study; and 3) reveal how individuals respond differentially to different kinds of leaderboard experiences based on their leaderboard rankings, the application domain, and the individuals’ personality traits through their responses to 9 dynamic leaderboards. The results from the first study show that extraversion and emotional stability are the two primary personality traits that differentiate users’ preferences for gamification. Among the 10 types of motivational affordances, extraverts are more likely to be motivated by Points, Levels, and Leaderboards. However, the results from the second (diary) study indicate that, after the first week, extraverts’ preferences for Points decreased. The motivation effects of Points and Leaderboards changed over the course of using the gamified application. The results from the third study confirm the findings from the first two studies about extraversion and revealed that ranking and domain differences are also effective factors in users’ experiences of Leaderboards in gamification. Design guidelines for gamification are presented based on the results of each of the three studies. 
Based on a synthesis of the results from these three studies, this dissertation proposes a conceptual model for gamification design. The model describes not only the impact of personality traits, domain differences, and users’ experience over time, but also illustrates the importance of considering individual differences, application context, and the potential significance of user persistence in gamification design. This research contributes to the HCI and gamification communities by uncovering factors that will affect the way that people respond to gamification systems, considered holistically.
  • Item
    Weighted gene co-expression network analysis of colorectal patients to identify right drug-right target for potent efficacy of targeted therapy
    (2017-12-10) Tripathi, Anamika; Pradhan, Meeta; Wu, Huanmei
    Colorectal cancer (CRC) is one of the most common cancers worldwide. It is characterized by the successive accumulation of mutations in genes controlling epithelial cell growth and differentiation, leading to genomic instability. This results in the activation of proto-oncogenes (K-ras), loss of tumor suppressor gene activity and abnormality in DNA repair genes. Targeted therapy is a new generation of cancer treatment in which drugs attack targets which are specific for the cancer cell and are critical for its survival or for its malignant behavior. Survival of metastatic CRC patients has approximately doubled due to the development of new combinations of standard chemotherapy and innovative targeted therapies, such as monoclonal antibodies against epidermal growth factor receptor (EGFR) or monoclonal antibodies against vascular endothelial growth factor (VEGF). This study aims to show the need for right drug-right target and provides a proof of principle for potent efficacy of molecular targeted therapy for CRC. We performed weighted gene co-expression network analysis for three different patient cohorts treated with different targeted therapy drugs. The results demonstrate the variation across different treatment regimes in the context of transcription factor networks. New significant transcription factors have been identified as potential biomarkers for CRC, including EP300, STAT6, ATF3, ELK1, HNF4A, JUN, TAF1, IRF1, TP53, ELF1 and YY1. The results provide guidance for future omics studies on CRC and additional validation work on potent biomarkers for CRC.
  • Item
    Translational high-dimensional drug interaction discovery and validation using health record databases and pharmacokinetics models
    (2017-10-31) Chiang, Chien-Wei; Li, Lang; Wu, Huanmei; Liu, Yunlong; Liu, Xiaowen
    Polypharmacy leads to increased risk of drug-drug interactions (DDI’s). In this dissertation, we create a database for quantifying fraction of metabolism (fm) of CYP450 isozymes for FDA approved drugs. A reproducible data collection protocol was developed to extract key information from publicly available in vitro selective CYP enzyme inhibition studies. The fm was then estimated from the curated data. We then proposed a random control selection approach for nested case-control design for electronic health record (EHR) and electronic medical record (EMR) databases. By relaxing the restriction of matching by the case’s index time, random control dramatically reduces the computational burden compared with traditional control selection approaches. Using the Observational Medical Outcomes Partnership gold standard and an EMR database, random control is demonstrated to perform better as well. Finally, combining epidemiological studies and pharmacokinetic modeling with the fm database, we detected and evaluated high-dimensional drug-drug interactions among thirty high frequency drugs. Multi-drug combinations that increased risk of myopathy were identified in the FAERS and EMR databases by a mixture drug-count response model (MDCM). Twenty-eight 3-way and 43 4-way DDI’s increased the ratio of area under the plasma concentration–time curve (AUCR) >2-fold and had significant myopathy risk in both databases. The predicted AUCR of omeprazole in the presence of fluconazole and clonidine was 9.35; and increased risk of myopathy was 6.41 (LFDR = 0.002) in FAERS and 18.46 (LFDR = 0.005) in EMR. We demonstrate that combining health record informatics and pharmacokinetic modeling is a powerful translational approach to detect high-dimensional DDI’s.
  • Item
    Help: defining the usability requirements of a breast cancer long-term survivorship (LTS) navigator
    (2017-08) Al-Abdulmunem, Monirah; Jones, Josette; Kulanthaivel, Anand
    Long-term survivors (LTSs) of breast cancer are defined as patients who have been in remission for a year or longer. Even after being declared breast-cancer-free, many LTSs have questions that were not answered by clinicians. Although online resources provide some content for LTSs, few, if any, provide immediate answers to specific questions. Thus, the aim is to propose specifications for a system, the Health Electronic Learning Platform (HELP), that can assist survivors by becoming an all-inclusive resource for LTSs of breast cancer. To achieve this, relevant information from the literature was used to assess the needs of LTSs. Data from a study involving the breast cancer survivor’s forum project were also used, filtered to include posts that mention features to be added to the website and usability issues encountered. To complete the actual design of the system, a synthesis of the results obtained from these two sources was performed. HELP is simple in terms of its layout and consists of a main search-bar, where LTSs are able to ask questions using their own terms and language. This navigator should not be taken as a definitive solution, but instead should be used as a starting point toward better patient-centered care.
  • Item
    Translational drug interaction study using text mining technology
    (2017-08-15) Wu, Heng-Yi; Jones, Josette; Li, Lang; Palakal, Mathew; Wu, Huanmei
    Drug-Drug Interaction (DDI) is one of the major causes of adverse drug reaction (ADR) and has been demonstrated to threaten public health. It causes an estimated 195,000 hospitalizations and 74,000 emergency room visits each year in the USA alone. Current DDI research aims to investigate different scopes of drug interactions: molecular level of pharmacogenetics interaction (PG), pharmacokinetics interaction (PK), and clinical pharmacodynamics consequences (PD). All three types of experiments are important, but they play different roles in DDI research. As diverse disciplines and varied studies are involved, interaction evidence is often not available across all three types of evidence, which creates knowledge gaps that hinder both DDI and pharmacogenetics research. In this dissertation, we proposed to distinguish the three types of DDI evidence (in vitro PK, in vivo PK, and clinical PD studies) and identify all knowledge gaps in experimental evidence for them. This is a collective intelligence effort, whereby a text mining tool will be developed for the large-scale mining and analysis of drug-interaction information such that it can be applied to retrieve, categorize, and extract the information of DDI from published literature available on PubMed. To this end, three tasks will be done in this research work: First, the needed lexica, ontology, and corpora for distinguishing three different types of studies were prepared. Despite the lexica prepared in this work, a comprehensive dictionary for drug metabolites or reactions, which is critical to in vitro PK study, is still lacking in public databases. Thus, second, a named entity recognition tool will be proposed to identify drug metabolites and reactions in free text. Third, text mining tools for retrieving DDI articles and extracting DDI evidence are developed.
In this work, the knowledge gaps across all three types of DDI evidence can be identified, and the gaps between knowledge of molecular mechanisms underlying DDI and their clinical consequences can be closed with the result of DDI prediction using the retrieved drug gene interaction information, such that we can exemplify how the tools and methods can advance DDI pharmacogenetics research.