A review of unstructured data analysis and parsing methods
Abstract
Computer applications generate an enormous amount of data every day through their logs, system-generated files and other reports. This data reflects the state of the running system and contains abundant information that can be used for system diagnostics and monitoring. Network monitoring systems produce a wide variety of unstructured information, and extracting the relevant data currently requires a multitude of custom parsers, which are time-consuming to develop and test. Instead, the data can be automatically processed and parsed into a machine-readable format, building a generic model for standard or vendor-specific data and generating insights for analytics, anomaly detection, intrusion detection, node-failure diagnosis and various other applications. This paper reviews some existing approaches to unstructured data mining and parsing, discusses the challenges in information extraction and the creation of knowledge bases, and presents a generic framework for automatic parsing.