Nitesh V. Chawla - IEEE Xplore Author Profile

Showing 1-25 of 96 results

Results

Real-world data is often imbalanced, such that the number of training instances varies by class. Data augmentation (DA) of under-represented classes is commonly used to improve model generalization in the face of class imbalance. Despite its ubiquity, the impact of data augmentation on machine learning (ML) models is not clearly understood. Here, we undertake a holistic examination of the effect o...
In many cases, the search for synthetic functional networks of a gene regulatory network model can be a difficult and time-consuming task. In this paper, we present a method that uses the popular SMOTE algorithm to sample synthetic functional networks in order to boost the results obtained by an evolutionary computation framework in a previous stage. We consider threshold Boolean networks for gene...
The widespread application of machine learning techniques to biomedical data has produced many new insights into disease progression and improving clinical care. Inspired by the flexibility and interpretability of graphs (networks), as well as the potency of sequence models like transformers and higher-order networks (HONs), we propose a method that identifies combinations of risk factors for a gi...
Deep learning models may not effectively generalize across under-represented or minority classes. We empirically study a convolutional neural network’s (CNN) internal representation of imbalanced image data and measure the generalization gap between a model’s feature embeddings in the training and test sets, showing that the gap is wider for minority classes. This insight enables us to design an e...
With the increasing deployment of small unmanned aerial systems (sUASs) on various tasks, it becomes crucial to analyze and detect anomalies from their flight logs. To support research in this area, we curate Drone Log Anomaly (DLA), the first real-world time series anomaly detection dataset in the domain of sUASs, which contains 41 sUAS flight logs annotated with various types of anomalies. As an...
Complementarity plays a significant role in the synergistic effect created by different components of a complex data object. Complementarity learning on multimodal data has fundamental challenges of representation learning because the complementarity exists along with multiple modalities and one or multiple items of each modality. Also, an appropriate metric is needed for measuring the complementa...
Comprehensively evaluating and comparing researchers’ academic performance is complicated due to the intrinsic complexity of scholarly data. Different scholarly evaluation tasks often require the publication and citation data to be investigated in various manners. In this article, we present an interactive visualization framework, SD$^{2}$, to enable flexible data partition and composition to sup...
Despite over two decades of progress, imbalanced data is still considered a significant challenge for contemporary machine learning models. Modern advances in deep learning have further magnified the importance of the imbalanced data problem, especially when learning from images. Therefore, there is a need for an oversampling method that is specifically tailored to deep learning models, can work o...
Most graph neural network models learn embeddings of nodes in static attributed graphs for predictive analysis. Recent attempts have been made to learn temporal proximity of the nodes. We find that real dynamic attributed graphs exhibit a complex phenomenon of co-evolution between node attributes and graph structure. Learning node embeddings for forecasting change of node attributes and evolution of...
Cyber-physical systems (CPS) must be closely monitored to identify and potentially mitigate emergent problems that arise during their routine operations. However, the multivariate time-series data which they typically produce can be complex to understand and analyze. While formal product documentation often provides example data plots with diagnostic suggestions, the sheer diversity of attributes, ...
Graph representation learning aims at preserving structural and attributed information in latent representations. It has been studied mostly in the setting of static graphs. In this work, we propose a novel approach for representation learning over dynamic attributed graphs using the tool of normalizing flows for exact density estimation. Our approach has three components: (1) a time-aware graph neu...
We hypothesize that behavioral patterns of people are reflected in how they interact with their mobile devices and that continuous sensor data passively collected from their phones and wearables can infer their job performance. Specifically, we study day-to-day job performance (improvement, no change, decline) of N=298 information workers using mobile sensing data and offer data-driven insights int...
The proliferation of wearable sensors allows for the continuous collection of temporal characterization of an individual's physical activity and physiological data. This is enabling an unprecedented opportunity to delve into a deeper analysis of the underlying patterns of such temporal data and to infer attributes associated with health, behaviors, and well-being. However, there remain several cha...
The prevalence of mobile sensors makes it possible for researchers to collect and analyze pervasively sensed human activity data with machine learning tools. These analyses and applications heavily rely on the self-report survey data that reflect human physical and psychological behaviors. However, many factors can affect the reliability of such self-report surveys, such as participants' trusty ...
Assessment of individuals' job performance, personalized health, and psychometric measures are domains where data-driven ubiquitous computing will have a profound impact in the near future. Existing work in these domains focuses on techniques that use data extracted from questionnaires, sensors (wearable, computer, etc.), or other traits to assess well-being and cognitive attributes of individuals. H...
Conditions play an essential role in biomedical statements. However, existing biomedical knowledge graphs (BioKGs) only focus on factual knowledge, organized as a flat relational network of biomedical concepts. These BioKGs ignore the conditions of the facts being valid, which loses essential contexts for knowledge exploration and inference. We consider both facts and their conditions in biomedica...
Understanding user characteristics such as demographic information is useful for the personalization of online content promoted to users. However, it is difficult to obtain such data for each user visiting the website. Since demographic data for some users can be collected, their behavior can be used to predict the attributes of unknown users. Through online news consumption, we can infer the attr...
Abnormal human physical and psychological behaviors, such as a high level of stress, may result in negative impacts on work and life, if not handled efficiently. However, the continuous collection of behavioral data from questionnaires is not feasible, as is often the case for the natural downside of survey data gathering. Thanks to the proliferation of mobile sensors, it brings compelling opportuni...
Networks are powerful and flexible structures for modeling relationships in medical and biological systems, but in a traditional first-order network representation, an edge typically expresses a relationship between a single pair of nodes. In order to analyze complex relationships between groups of nodes, researchers rely on combined sets of these pairwise connections, which can misrepresent the t...
In this paper, we discuss a tensor decomposition method for imputing similarity scores between individual clinical pictures at predefined patient age intervals in order to construct a dynamic similarity network of patients with respect to early childhood anthropomorphic development. The method leverages Canonical Polyadic Decomposition (or PARAFAC) to compute missing Euclidean similarity scores be...
A high percentage of the information that propagates through a social network originates from different exogenous sources. For example, individuals may form their opinions about products based on their own experience or reading a product review, and then share that with their social network. This sharing then diffuses through the network, evolving as a combination of both network and external effects. Beside...
Social sensing is a new big data application paradigm for Cyber-Physical Systems (CPS), where a group of individuals volunteer (or are recruited) to report measurements or observations about the physical world at scale. A fundamental challenge in social sensing applications lies in discovering the correctness of reported observations and reliability of data sources without prior knowledge on eithe...
Scientists are relying heavily on biomedical literature search (BLS) engines (e.g., PubMed) to acquire knowledge. Existing BLS systems adopt a “C-A” paradigm that is to design query-document similarity measurement based on words/phrases in the unstructured Content and to develop search Algorithms. In this work, we argue that structures should be extracted and utilized to bridge the gap between tex...
The ubiquitous use of social media enables researchers to obtain self-recorded longitudinal data of individuals in real-time. Because this data can be collected in an inexpensive and unobtrusive way at scale, social media has been adopted as a “passive sensor” to study human behavior. However, such research is impacted by the lack of homogeneity in the use of social media, and the engineering chal...
As the healthcare industry shifts from traditional fee-for-service payment to value-based care models, the need to accurately quantify and compare the performance of institutions has become an integral component of both policy and research. To date, several notable metrics have been introduced, including the Centers for Medicare and Medicaid Services' Hospital Value-Based Purchasing (HVBP) program. However...