Sunday, February 28, 2016

Logstash Multiline codec issue



Be careful with the multiline codec in Logstash. In my tests, it only works correctly as of Logstash 2.2.2.
The problem is that, before that version (I tested 2.0.0), the multiline codec mixes events from multiple files when one file input reads several files. The illustration below shows the problem.

There are two log files test1.log and test2.log:

The logstash configuration file test.conf is:

input {
  file {
    path => "/devops/tmp/test*.log"
    start_position => "beginning"
    sincedb_path => "/devops/tmp/.sincedb"

    codec => multiline {
      pattern => "^(TRACE|DEBUG|NOTICE|INFO|WARN?(?:ING)?|ERROR|FATAL|STATUS)\s+"
      negate => true
      what => previous
    }
  }
}

output {
  stdout {
    codec => rubydebug
  }
}

The multiline codec configuration means: if a line does not match the pattern (negate => true), i.e. it does not start with a status level, the line belongs to the previous line(s) (what => previous). The negate option controls whether the rule applies to lines that match the pattern or to lines that do not. The what option controls whether such a line is joined to the previous event or to the next one.
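To make the pattern concrete, here is a minimal Python sketch that applies the same regular expression to a few lines and reports which ones would start a new event and which would be joined to the previous one. The helper name starts_new_event and the sample lines are made up for illustration; they are not from test1.log or test2.log, and this is not how Logstash itself is implemented.

```python
import re

# Same regular expression as the codec's `pattern` setting.
LEVEL = re.compile(r"^(TRACE|DEBUG|NOTICE|INFO|WARN?(?:ING)?|ERROR|FATAL|STATUS)\s+")

def starts_new_event(line):
    """True when the line begins with a level keyword followed by whitespace."""
    return LEVEL.search(line) is not None

# Hypothetical sample lines, not taken from the post's log files.
samples = [
    "ERROR something failed",
    "    at com.example.Foo.bar(Foo.java:42)",
    "WARNING disk almost full",
    "INFO",  # no trailing whitespace, so the \s+ part does not match
]

for line in samples:
    if starts_new_event(line):
        role = "starts a new event"
    else:
        role = "joins the previous event"
    print(f"{role}: {line!r}")
```

Note that the pattern requires whitespace after the level keyword, so a line consisting only of a level name is treated as a continuation line.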

Running logstash 2.0.0 produces:


Notice that Logstash mixes up the line STATUS CAT1 in file test2.log. Also, the last line in test2.log is not processed.

Now, if we append a line to test1.log, Logstash is prompted to process the last line of test2.log, but it attributes that line to test1.log.

Needless to say, this is very confusing. Thankfully, this issue is solved in 2.2.2. Here is the result in 2.2.2:


You will notice that the last line in each log file is missing. This is understandable, because the multiline codec needs to read the next line to determine whether the current multiline event is complete.
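The reason the last line is held back can be shown with a toy model in Python. This is only a sketch of the negate => true, what => previous behavior for a single file, under the assumption that an event is emitted only when the next line starting with a level arrives. The class name MultilineBuffer and the sample lines are hypothetical, and the real codec is not written this way.

```python
import re

LEVEL = re.compile(r"^(TRACE|DEBUG|NOTICE|INFO|WARN?(?:ING)?|ERROR|FATAL|STATUS)\s+")

class MultilineBuffer:
    """Toy model of negate => true, what => previous for one file."""

    def __init__(self):
        self.pending = []  # lines of the event still being assembled

    def feed(self, line):
        """Add one line; return a finished event, or None if none is complete yet."""
        if LEVEL.search(line) and self.pending:
            # A new event starts, so the buffered one is now known to be complete.
            finished = "\n".join(self.pending)
            self.pending = [line]
            return finished
        # Either the very first line or a continuation line: keep buffering.
        self.pending.append(line)
        return None

buf = MultilineBuffer()
lines = ["INFO first", "  detail of first", "ERROR second"]  # hypothetical input
events = [e for e in (buf.feed(l) for l in lines) if e is not None]

print(events)       # only the first event has been emitted
print(buf.pending)  # the last line is still waiting for a successor
```

In this model, "ERROR second" stays in the buffer until another line starting with a level is read, which matches the missing last line seen above.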

