admin Posted on 4:45 am

Nagios log monitoring – monitor log files on Unix effectively

Nagios log file monitoring: Monitoring log files using Nagios can be as difficult as it is with any other monitoring application. However, with Nagios, once you have a log monitoring script or tool that can monitor a specific log file the way you want, you can trust Nagios to take care of the rest. This versatility is what makes Nagios one of the most popular and easy-to-use monitoring applications out there. It can monitor almost anything you can write a check for. Personally, I love it. It has no equal!

My name is Jacob Bowman and I work as a Nagios Monitoring Specialist. Given the number of requests I get at my job to monitor log files, I have come to realize that monitoring log files is a big problem. IT departments constantly need to monitor their UNIX log files so that application or system problems are detected in time. When problems are caught early, unplanned outages can often be avoided.

But the question that many people ask is: what monitoring application can effectively monitor a log file? The simple answer to this question is NONE! The log monitoring applications that exist require too much configuration, which, in my view, makes them not worth considering.

Log monitoring should allow pluggable arguments on the command line (rather than in separate configuration files) and should be very easy to understand and use for the average UNIX user. Most log monitoring tools are not like this. They are often complex and take time to get familiar with (reading endless pages of installation settings). In my opinion, this is an unnecessary problem that can and should be avoided.

Again, I strongly believe that to be efficient one must be able to run a program directly from the command line without going elsewhere to edit the configuration files.

So the best solution, in most cases, is to write a log monitoring tool for your particular needs or download a log monitoring program that has already been written for your type of UNIX environment.

Once you have that log monitoring tool, you can hand it to Nagios, and Nagios will schedule it to run at regular intervals. If a run finds the problems, patterns, or strings you told it to look out for, Nagios will raise an alert and send notifications to whoever you choose.
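To make the hand-off to Nagios concrete: Nagios decides whether to alert from the exit code of the check (0 for OK, 1 for WARNING, 2 for CRITICAL, 3 for UNKNOWN) and shows the first line the check prints. Here is a minimal sketch of a check built that way. The script name `check_log.py`, the functions `check_lines` and `main`, and the `warn` / `crit` thresholds are my own illustrative choices, not part of any existing tool, and the matching is a plain case-sensitive substring search.

```python
import re
from typing import Iterable, Sequence, Tuple

# Standard Nagios plugin return codes.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
STATE_NAMES = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL", UNKNOWN: "UNKNOWN"}


def check_lines(lines: Iterable[str], patterns: Sequence[str],
                warn: int, crit: int) -> Tuple[int, str]:
    """Count lines containing any pattern and map the count to a Nagios state."""
    regex = re.compile("|".join(re.escape(p) for p in patterns))
    hits = sum(1 for line in lines if regex.search(line))
    if hits >= crit:
        state = CRITICAL
    elif hits >= warn:
        state = WARNING
    else:
        state = OK
    return state, f"{STATE_NAMES[state]} - {hits} matching line(s)"


def main(argv: Sequence[str]) -> int:
    """Hypothetical CLI: check_log.py LOGFILE WARN CRIT PATTERN [PATTERN ...]"""
    if len(argv) < 5:
        print("UNKNOWN - usage: check_log.py LOGFILE WARN CRIT PATTERN [PATTERN ...]")
        return UNKNOWN
    path, patterns = argv[1], argv[4:]
    try:
        warn, crit = int(argv[2]), int(argv[3])
        with open(path, errors="replace") as handle:
            state, message = check_lines(handle, patterns, warn, crit)
    except (OSError, ValueError) as exc:
        # Anything that prevents the check from running is reported as UNKNOWN.
        print(f"UNKNOWN - {exc}")
        return UNKNOWN
    print(message)  # Nagios displays this line next to the service state
    return state
```

To use it as a real check, you would end the script with `sys.exit(main(sys.argv))` (after `import sys`) so the returned state becomes the process exit code.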

But then you wonder, what kind of log monitoring tool should you write or download for your environment?

The log monitoring program you should get to monitor your production log files should be as simple as the following, but it should still be highly versatile:

Example: achievementbot /var/log/messages 60 'error' 'panic' 5 10 -foundn

Output: 2 — 1380 — 352 — ATWF — (Mar / 1) – (16:15) — (Mar / 1) – (17:15:00)

Explanation:

The “-foundn” option searches /var/log/messages for the strings “error” and “panic”. After searching, the tool exits with 0 (for OK), 1 (for WARNING), or 2 (for CRITICAL). Every time you run that command, it prints a one-line statistical report similar to the one in the Output above. The fields are delimited by “—”.

The first field is 2 = the status is CRITICAL.

The second field is 1380 = the number of seconds since the strings you specified last appeared in the log.

The third field is 352 = the strings “error” and “panic” occurred 352 times in the log in the last 60 minutes.

The fourth field is ATWF = Don’t worry about this for now. Irrelevant.

The fifth and sixth fields = The log file was searched from (March 1) – (16:15) to (March 1) – (17:15:00). In the data collected from that time period, 352 occurrences of “error” and “panic” were found.
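If you want to feed that one-line report into another script, the field layout described above can be split apart with a few lines of Python. This is only a sketch: `Report` and `parse_report` are names I made up for illustration, and the parser accepts either the long dash shown in this post or three hyphens, on the assumption that the long dash may be a typographic rendering of "---".

```python
import re
from typing import NamedTuple

STATES = {0: "OK", 1: "WARNING", 2: "CRITICAL"}


class Report(NamedTuple):
    state: str                      # field 1, translated to a state name
    seconds_since_last_match: int   # field 2
    match_count: int                # field 3
    flags: str                      # field 4, kept as-is (not explained in the post)
    window_start: str               # field 5
    window_end: str                 # field 6


# Delimiter: a long dash (U+2014) or, as an assumption, three hyphens.
_DELIMITER = re.compile(r"\s*(?:\u2014|---)\s*")


def parse_report(line: str) -> Report:
    """Split one report line into its six fields."""
    fields = _DELIMITER.split(line.strip())
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    code, age, count, flags, start, end = fields
    return Report(STATES.get(int(code), "UNKNOWN"), int(age), int(count),
                  flags, start, end)
```

The start and end of the search window are left as plain strings, because the post does not pin down their exact date format.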

If you really want to see the 352 occurrences, you can run the following command and pass the “-show” option to the achievementbot tool. This will display all the matching lines in the log that contain the strings you specified and that were written to the log in the last 60 minutes.

Example: achievementbot /var/log/messages 60 'error' 'panic' 5 10 -show

The “-show” option displays every line in the log file that contains the strings “error” or “panic” and was written within the 60-minute period you specified. Of course, you can always change the parameters to suit your particular needs.
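If you are writing your own tool, a "show me the matching lines from the last N minutes" feature could be sketched like this. The function `recent_matches` is a hypothetical name, and the code assumes each log line starts with a traditional syslog timestamp such as "Mar  1 16:20:01" with English month names. That timestamp carries no year, so the sketch borrows the year from `now`, which will misjudge lines around New Year.

```python
import re
from datetime import datetime, timedelta
from typing import Iterable, List, Sequence


def recent_matches(lines: Iterable[str], patterns: Sequence[str],
                   minutes: int, now: datetime) -> List[str]:
    """Return lines from the last `minutes` minutes that contain any pattern."""
    regex = re.compile("|".join(re.escape(p) for p in patterns))
    cutoff = now - timedelta(minutes=minutes)
    found = []
    for line in lines:
        try:
            # The first 15 characters hold the syslog timestamp (assumption).
            stamp = datetime.strptime(f"{now.year} {line[:15]}",
                                      "%Y %b %d %H:%M:%S")
        except ValueError:
            continue  # skip lines without a parseable timestamp
        if cutoff <= stamp <= now and regex.search(line):
            found.append(line.rstrip("\n"))
    return found
```

Passing `now` in explicitly, instead of calling `datetime.now()` inside the function, keeps the time window easy to test against a saved copy of a log.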

With this Nagios log monitoring tool (achievementbot), you can perform the magic that famous big-name monitoring apps can’t even come close to doing.

Once you write or download a log monitoring script or tool like the one above, you can have Nagios or cron run it regularly, which in turn will give you a bird’s-eye view of all the logged activity on your important servers.

Do you have to use Nagios to run it on a regular basis? Absolutely not. You can use whatever you want.
