
A new algorithm for learning parameters of a Bayesian network from distributed data

Abstract:

We present a novel approach for learning the parameters of a Bayesian network from a distributed heterogeneous dataset. In this setting, the whole dataset is distributed across several sites, and each site contains observations for a different subset of features. The new method uses the collective learning approach proposed in our earlier work and substantially reduces the computational and transmission overhead. Theoretical analysis is given and experimental results are provided to illustrate the accuracy and efficiency of our method.

Published in: 2002 IEEE International Conference on Data Mining, 2002. Proceedings.

Date of Conference: 09-12 December 2002

Date Added to IEEE Xplore: 10 March 2003

Print ISBN: 0-7695-1754-4

DOI: 10.1109/ICDM.2002.1184005

Conference Location: Maebashi City, Japan

Contents

1 Introduction

A Bayesian network (BN) is a probabilistic model based on a directed acyclic graph. Before a Bayesian network can be used for inference or decision making, it must first be constructed using prior knowledge from experts and/or observed data. Most of the work reported in the literature assumes that all the observed data are available at a single site. However, there are many scientific and non-scientific applications where the observed data are distributed among different sites. The cost of data communication between the distributed databases is a significant factor in an increasingly mobile and connected world with a large number of distributed data sources. In this paper, we consider a distributed heterogeneous data scenario, where each site has observations corresponding to a subset of the attributes. We assume that there exists a "key" that can link the observations across sites. A naive approach to learning a BN from distributed heterogeneous data is to transmit all local datasets to a central site and then learn a BN from the resulting merged dataset (centralized learning). However, limited network bandwidth and/or data security might render this approach infeasible.
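To make the naive centralized baseline concrete, the following is a minimal Python sketch, not the method proposed in the paper. It assumes two hypothetical sites that share a record key, discrete variables, a known network structure with a single edge, and plain maximum-likelihood (relative frequency) estimation. The names `merge_on_key`, `estimate_cpt`, `smoker`, and `cough` are illustrative only and do not come from the paper.

```python
from collections import Counter, defaultdict


def merge_on_key(site_a, site_b):
    """Join two sites' records on their shared key (inner join)."""
    shared_keys = site_a.keys() & site_b.keys()
    return {k: {**site_a[k], **site_b[k]} for k in shared_keys}


def estimate_cpt(records, child, parents):
    """Maximum-likelihood estimate of P(child | parents) from complete records."""
    counts = defaultdict(Counter)
    for row in records:
        parent_values = tuple(row[p] for p in parents)
        counts[parent_values][row[child]] += 1
    return {
        pv: {value: n / sum(c.values()) for value, n in c.items()}
        for pv, c in counts.items()
    }


# Hypothetical toy data: site A observes "smoker", site B observes "cough".
# The dictionary keys (1..4) play the role of the linking "key".
site_a = {1: {"smoker": 1}, 2: {"smoker": 1}, 3: {"smoker": 0}, 4: {"smoker": 0}}
site_b = {1: {"cough": 1}, 2: {"cough": 0}, 3: {"cough": 0}, 4: {"cough": 0}}

# Centralized learning: ship everything to one site, merge, then count.
merged = merge_on_key(site_a, site_b)
cpt = estimate_cpt(merged.values(), child="cough", parents=["smoker"])
print(cpt)
```

In this sketch every record from every site has to be transmitted before any counting can start, which is the communication cost the introduction identifies as the drawback of centralized learning.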
