I. Introduction
Big data has recently become one of the most active research subjects. It involves large-scale, complex data sets that traditional tools and algorithms cannot handle [1]. Its increasing volume, velocity, and variety pose grand challenges for general researchers who want to understand and create big data services. In addition, most big data services are intellectual property protected by patents, copyrights, and other means [1]. The big data services available to general researchers are limited to open source data repositories such as digital libraries, Apache big data, and public genome data. Compared with industry-level big data services, open source ones are limited in volume, velocity, variety, and value. The big data research community needs new open source big data services that include growing data sets as well as the software tools and data analytics algorithms to process them.
We have studied cell assay and classification for many years and are building a big data service called Cell Morphology Assay (CMA) for modeling and analyzing 3D cell morphology and for mining morphology patterns extracted from diffraction images of biological cells. The study of 3D morphology can provide rich information about cells that is essential for cell analysis and classification. Diffraction images of single cells are acquired with a diffraction imaging flow cytometer to quantify and profile the 3D morphology of cells [15]. CMA tools can rapidly analyze large numbers of diffraction images and obtain texture parameters in real time. With these parameters, one can use machine learning algorithms such as the Support Vector Machine (SVM) [35] to optimize and identify a set of parameters for performing cell assay according to 3D morphology, without the need to reconstruct the 3D structures from the diffraction images. This approach thus enables rapid assay of single cells without the need to stain them with fluorescent reagents.
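As an illustration of how texture parameters might be derived from a diffraction image, the following minimal Python sketch computes three texture measures from a small quantized gray-level image. It assumes gray-level co-occurrence matrix (GLCM) features, a common texture descriptor; this choice is an assumption made for illustration and not a statement of how the CMA tools are implemented. The function names glcm and texture_parameters are hypothetical.

```python
from collections import Counter


def glcm(image, dx=1, dy=0, levels=4):
    """Return a normalized gray-level co-occurrence matrix.

    image  : list of rows, each a list of integer gray levels in [0, levels)
    dx, dy : pixel offset defining which neighbor pairs are counted
    """
    rows, cols = len(image), len(image[0])
    counts = Counter()
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[(image[r][c], image[r2][c2])] += 1
    total = sum(counts.values())
    if total == 0:
        raise ValueError("offset leaves no pixel pairs inside the image")
    # Convert pair counts to joint probabilities p[i][j].
    return [[counts[(i, j)] / total for j in range(levels)]
            for i in range(levels)]


def texture_parameters(p):
    """Compute contrast, energy, and homogeneity from a normalized GLCM."""
    n = len(p)
    pairs = [(i, j) for i in range(n) for j in range(n)]
    return {
        "contrast": sum(p[i][j] * (i - j) ** 2 for i, j in pairs),
        "energy": sum(p[i][j] ** 2 for i, j in pairs),
        "homogeneity": sum(p[i][j] / (1 + abs(i - j)) for i, j in pairs),
    }


# Hypothetical 4x4 checkerboard standing in for a quantized diffraction image.
checker = [[(r + c) % 2 for c in range(4)] for r in range(4)]
features = texture_parameters(glcm(checker, dx=1, dy=0, levels=2))
print(features)
```

For the checkerboard, every horizontal neighbor pair differs by one gray level, so the contrast is 1.0 and both energy and homogeneity are 0.5. In a classification setting, one such feature vector per image could be passed to a classifier such as an SVM.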
Another goal of the CMA project is to provide researchers with a significant source of big data and tools for conducting big data research. CMA applies big data techniques for data management, analysis, discovery, application, and sharing to the development of morphology-based cell analysis tools. It includes a group of scientific software tools for modeling, analyzing, and producing image data; machine learning algorithms for feature selection and cell classification; and a database for managing the big data. CMA is classified as a big data service because of its large volume, fast growth, variety of data, and high potential value.