I. Introduction
Textual data in business is growing rapidly. According to an IDC survey, unstructured data accounts for 80% of the total textual data, while structured data accounts for only 20%. Structured data is information organized in rows and columns that is easily identified, labelled, and accessed. Unstructured data is data that is not organized in a well-defined manner or that has no pre-defined data model. It can be textual or non-textual. Unstructured textual data is constantly generated through email, web documents, blogs, tweets, customer reviews, comments, and so on. Non-textual data includes images, audio files, and video files. Unstructured data is available in abundance, but software tools and products that can analyze it and provide accurate insight are rare.
In this paper we deal with customer error logs resulting from server failures, which are textual in nature. These logs consist of the problem faced by the customer and the steps taken to resolve the outage (server failure). Given the many ways in which customers use natural language to express their views or comments, it is difficult for the organization to derive meaningful information and improve productivity. Many challenges arise because natural language lets customers convey the same message in different ways, while in some cases the same statement may carry a completely different meaning in a different context. Dialects, misspellings, short forms, acronyms, colloquialisms, grammatical complexities, and the mixing of two or more languages in the same text are just some of the most basic problems that unstructured data poses. These problems make it extremely difficult to analyze unstructured data as precisely as we process structured data.
To address this problem, the organization is moving towards tools that make the analysis of such text-heavy data easier, and hence the ‘Outage Automation Tool’ is proposed. This tool searches the unstructured data and extracts the desired information, which in turn makes it easier to understand the cause of a server failure and to categorize the failure for further analysis.
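The search-and-categorize idea can be illustrated with a minimal sketch. It is not the tool's actual implementation: the category names, the keyword patterns, the function categorize_log, and the sample log line are all hypothetical and chosen only for illustration. The sketch matches each log entry against a small set of case-insensitive keyword patterns and reports every category that matches.

```python
import re

# Hypothetical failure categories and keyword patterns (illustrative only;
# the real categories used for outage logs are not specified in this paper).
CATEGORY_PATTERNS = {
    "hardware": re.compile(r"\b(disk|memory|power supply|fan)\b", re.IGNORECASE),
    "network": re.compile(r"\b(network|dns|timeout|packet loss)\b", re.IGNORECASE),
    "software": re.compile(r"\b(crash|exception|patch|upgrade)\b", re.IGNORECASE),
}


def categorize_log(text):
    """Return the sorted names of all categories whose pattern occurs in text.

    Falls back to ["unknown"] when no pattern matches.
    """
    matches = [
        name for name, pattern in CATEGORY_PATTERNS.items() if pattern.search(text)
    ]
    return sorted(matches) or ["unknown"]


# Made-up log entry, used only to demonstrate the call.
example_log = "Customer reported server down; DNS timeout seen, replaced faulty disk."
print(categorize_log(example_log))  # ['hardware', 'network']
```

Plain keyword matching of this kind does not handle the misspellings, short forms, and mixed-language text discussed above, which is why it should be read only as a starting point.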