Log file analysis is not new to software development and maintenance. It has been used heavily for years, and its advantages have grown rapidly with recent development methodologies. Conventional uses include verifying that software conforms to its specification, supporting quality audits, and troubleshooting. Information derived from logs can be vital during an investigation, as almost every system generates logs during critical or erroneous events. At present, log analysis is performed either entirely manually or with partial automation. Regardless of the method, the analysis process demands considerable expertise, and fully manual analysis requires extensive human effort and time. One identified impediment is the lack of a generalized technique for capturing the knowledge needed to extract machine data from logs. Correlating the information hidden in a log file is a further challenge that adds to the complexity. This paper describes a framework that provides a unique, generalized platform for maintaining expert knowledge and for performing tasks such as extracting information from log files, correlating events, and creating reports. The framework includes an improved scripting language optimized for log data extraction, algorithms for identifying frequently occurring patterns, and a set of tools that make the process easier even for a novice user. A proof-of-concept application is presented to illustrate the framework.
Keywords—log analysis; log management; event correlation; structured logs; deep log semantics;