I. Introduction
With the proliferation of digital measurement devices, such as smart meters on distribution systems and phasor measurement units (PMUs) on transmission systems, power companies find themselves inundated with ever-growing volumes of data. They need efficient tools and analytical techniques to identify, digest, and utilize critical information in order to improve the efficiency and reliability of grid operations. For instance, the number of PMUs installed in the US has doubled in the past couple of years and is expected to reach 1100 units by the end of 2014, which will eventually generate about 310 GB of raw data every day, in addition to the traditional SCADA data collected at 2 to 4 second intervals. A power company with 100 PMUs will receive at least 67500 data points daily, all of which must be stored, transferred, captured, curated, searched, analyzed, shared, and visualized. Traditional energy management systems and analytics for grid operation were not designed to handle information at this volume or complexity. Many power system researchers believe that PMU-related power system analytics falls under the category of Big Data science and are keen to apply its typical technologies, including machine learning, data mining, and cloud-based computation.