Crowdsourcing-based Hybrid Knowledge Discovery: Application to Biomedical Data

Interdisciplinary Areas: Internet of Things and Cyber Physical Systems, Data/Information/Computation, Human-Machine/Computer Interaction, Human Factors, Human-Centered Design

Project Description

Human agents have a remarkable capability to learn a large variety of concepts, often from few examples, whereas current data-mining algorithms require thousands of data points and struggle with problems such as ambiguity, validity, and overfitting. However, well-trained human agents can be expensive. Initially less competent human agents, on the other hand, may be motivated to improve their ability to acquire sufficient knowledge and draw subsequent inferences. We will investigate the consensus problem of multi-agent hybrid systems with diverse knowledge discovery capacities under a crowdsourcing framework. Two challenges stand out. First, combining numerical and symbolic data mining methods remains difficult. Second, even though a crowdsourcing framework is easily achievable from an information technology infrastructure point of view, scaling up the networked control system for hybrid knowledge discovery remains difficult. To address the first challenge, we will explore the combination of numerical and symbolic data mining and the use of domain knowledge to improve performance. To address the second, we will study the operations management aspect of scaling up, with the aim of balancing the efficiency of knowledge discovery against spending on crowdsourcing the necessary tasks. Moreover, we will use organizational psychology to study effective ways of improving workforce commitment. We will conduct proof-of-concept studies in healthcare and biology.

Start Date

August 1, 2019

Postdoc Qualifications

We are seeking a highly qualified individual with expertise in data science applied to healthcare and, in particular, to health-related big data integration. Areas of emphasis include deep learning, explainable artificial intelligence, causal inference, predictive analytics, transportability of causal and statistical relationships, and multi-agent control. Research experience in the area of psychometrics is highly appreciated.


Nan Kong, Weldon School of Biomedical Engineering

Daisuke Kihara, Department of Biological Sciences; Department of Computer Science


1. D. R. Karger, S. Oh, D. Shah (2014). Budget-Optimal Task Allocation for Reliable Crowdsourcing Systems. Operations Research. 62(1): 1 - 24.

2. Jae Kwon Kim, Jong Sik Lee, Kang Sun Lee. A Hybrid M&S Methodology for Knowledge Discovery. Monterey Workshop 2016: Challenges and Opportunity with Big Data, pp. 3-10.

3. Udoinyang Godwin Inyang, Oluwole Charles Akinyokun. A hybrid knowledge discovery system for oil spillage risks pattern classification. Artificial Intelligence Research. Vol 3, No 4 (2014).

4. Yuchun Tang, Yuanchen He, Yan-Qing Zhang, Zhen Huang, Xiaohua Hu, R. Sunderraman. A Hybrid CI-Based Knowledge Discovery System on Microarray Gene Expression Data. 2005 IEEE Symposium on Computational Intelligence in Bioinformatics and Computational Biology.