Author: Peter Stone | Post Date: 4/21/2009
The over-obsession with log collection PER SECOND is just mind-boggling to me.
Is this really as important as other Log Management Vendors make it out to be?
I have to admit that I am blown away by the direction that most logging vendors are going today. What I mean is that they appear to be obsessed with the argument about /how many logs they can collect per second/. In fact, many log solution vendors have even gone as far as showing real-time log volume calculations inside their dashboards, as if this were crucial information. I must say that they are doing a pretty good job of selling the sizzle. What I mean by the sizzle is the constant marketing barrage about how many logs they can collect and how quickly they can search through them. They spin this hype to customers about how they can jam terabytes of log data into their appliance. Whether they can or can't accomplish this is not my objection. What I am asking is, "Why would any customer want to use this as a barometer for choosing a logging solution?"
Does the sizzle sell?
Now, I know at first glance the real-time updated graphs are eye-catching and give the impression that the tool may be doing a lot of work but, when it comes down to it, the volume stats are not useful for compliance, security, or operations. The logging vendors' hope is to sell customers on the idea of what a great "workhorse" their logging tool is, and how much it can archive. Now, before you get the wrong impression, let me just go on record right now and say that all good logging solutions do need to be able to aggregate logs from all sources, search through them quickly, and keep up with the volume demand. At the same time, I will say that centering the entire value proposition on these capabilities is just ridiculous. The reason I feel so strongly about this is that, from my experience dealing with customers searching for solid logging solutions today, they want much more than a log (i.e. garbage) collection product.
How did it all start?
The reason this logs-per-second competition started is that, up until late 2007, most organizations were focused solely on network perimeter security. Network logs from firewalls, IDS/IPS, and other network devices can generate large volumes of log data. The main problem back then was that native system log retention was maxing out within hours. This was the initial stumbling block for companies that needed to aggregate logs for compliance, and it is what started the "logs per second competition". The crazy thing is that logging tool vendors are still making this argument. Unfortunately, for customers who don't have experience with logging, the sizzle sells. Then the buyer's remorse kicks in. Why?
Lesson #1. Don't focus on the logs per second argument
The Needs and Challenges of Today are Different. Let's flash forward.
These days most organizations have set up central log servers, so log retention is not so much of a problem. In fact, according to SANS, over 73% of all organizations now have a central log server in use. The major increase in log servers is directly related to compliance, which typically requires 90 days of logs online and at least 1 year of log retention. Some compliance mandates require up to 7 years of retention. Today, the needs and the use cases for log data have exceeded the initial log aggregation requirements of the early 2000s. IT professionals want to be able to understand the log data so they can identify violations of security policy, spot data theft activities, and detect privileged user abuse. They also want to improve network and system operations using log data.
These facts are clear as day. Anyone who has experience dealing with log data directly, or has read any of the annual SANS reports on log management, knows this. IT professionals are still feeling overwhelmed by log data. Even though they have central servers, and may even have a commercial logging tool, they still feel like they aren't able to gain real insight into what the logs really mean.
The IT professionals that made their decisions based on the "logs per second hype" without looking deeper at their needs feel like they got the short end of the stick. They are seeing significantly less value in how many logs are collected versus understanding what is happening within the enterprise. The real problem is that many commercial logging solutions are high-end garbage collectors with a nice interface to cover it up. I really am serious about this! The proof is in the pudding as they say.
The serious underlying flaw that most commercial log management solutions have is that they lack the intelligence to recognize that much of the log data that is generated is just meaningless.
I am not talking about a small fraction of log data. In fact, the garbage logs and the redundant logs are a huge problem. These logs are actually an overwhelming majority. I am not going to get into the details about which logs are useless and which ones aren't. I will leave that to my technical colleagues to explain in later blog entries.
What I will say is, if you have ever looked inside Event Viewer and tried to decipher what is being logged, it usually turns out to be a pointless venture. Try this same exercise by executing a basic search query for user logins using any commercial log management solution. Then try to drill down into a specific user activity. You will quickly determine that you have a giant Event Viewer with a nice front end to show for your efforts. You must test drive these tools like you would a car.
Lesson #2. Don't settle for mindless logging solutions
The logging solutions that were developed even just a few years ago are still not equipped to proactively address the garbage log issue. They leave it up to the customer to deal with after the fact. So, instead of helping customers with the real issues, they brag about log volume and normalization. This is a strategy to keep customers from discovering their real limitations. I mentioned normalization and I want to explain.
Another big marketing pitch they put out there is normalization.
Normalization is simply the formatting of log data to make it easier to read. Normalization is not the same as intelligent log analysis. Now, normalization is a valuable piece of the pie; however, formatting garbage logs along with real activity logs is still not the answer, because normalization alone doesn't provide a clear, uncompromised audit trail.
For normalization to really be useful, the logging solution needs to provide the automated filtration of the garbage logs.
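To make the ordering concrete, here is a minimal Python sketch of filtering before normalizing. It is only an illustration under stated assumptions, not any vendor's actual implementation: the record fields (source, event_id, message), the function names, and the caller-supplied is_noise rule are all hypothetical. Which logs count as garbage is deliberately left to that rule, since it depends on the environment.

```python
from typing import Callable, Iterable, Iterator

Record = dict  # one already-parsed log record; field names are hypothetical


def drop_redundant(
    records: Iterable[Record],
    key_fields: tuple = ("source", "event_id", "message"),
) -> Iterator[Record]:
    """Yield records, skipping any that repeat the previous record's key fields."""
    previous = None
    for record in records:
        key = tuple(record.get(field) for field in key_fields)
        if key != previous:
            yield record
        previous = key


def normalize(record: Record) -> Record:
    """Put a record into one consistent shape (formatting only, no analysis)."""
    return {
        "source": str(record.get("source", "")).lower(),
        "event_id": record.get("event_id"),
        # Collapse runs of whitespace so messages are easier to read.
        "message": " ".join(str(record.get("message", "")).split()),
    }


def pipeline(records: Iterable[Record], is_noise: Callable[[Record], bool]) -> list:
    """Filter garbage first, then drop consecutive repeats, then normalize."""
    kept = (record for record in records if not is_noise(record))
    return [normalize(record) for record in drop_redundant(kept)]
```

The point of the ordering is that normalization only ever touches records that survived the filter, so the formatted output is already the reduced audit trail rather than a tidier copy of everything.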
Does Manual Log Filtering work?
Many of these logging solutions combat this argument by claiming to provide manual log filtering. This would be useful if customers understood what they should filter out and what they should keep. It is not that clear-cut. In fact, it takes expertise that is not part of just anyone's job description. So what happens is that customers don't use this false functionality and end up with the same problem: a giant garbage dump of misleading information.
Do you know how hard it is to filter logs out after they are stored in a database or archived?
Have you tried these filtering options and been successful with them?
Do you know how challenging it is to data mine through all the garbage log data to get to the real audit trail of information you're looking for?
Lesson #3. You need to kick the tires at the very least!
To test my theory, try any of these tools out there and just try to find "unwarranted data access activities" in a reasonable amount of time without having to call a forensic expert to muddle through the logs.
Try to decipher things like permission changes, group policy edit details, or external drives being connected to the network. Try to determine accounts being added to privileged groups. You might also be shocked to find that many logging solutions can't actually tell you accurately "who logged in to an Active Directory domain" without reporting conflicting, spurious log information.
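As one example of what a direct answer to such a question could look like, here is a rough Python sketch that picks out accounts added to privileged groups from already-parsed Windows Security events. The event IDs 4728, 4732, and 4756 are the "member was added to a security-enabled group" events (global, local, and universal) on Windows Vista/Server 2008 and later; older Windows versions use different IDs. Everything else is an assumption made for illustration: the field names (event_id, group, actor, member, time), the function name, and the list of privileged groups, which you would adjust for your own domain.

```python
# Assumed list of privileged groups; adjust for your own environment.
PRIVILEGED_GROUPS = {"domain admins", "enterprise admins", "administrators"}

# Windows Security event IDs for "a member was added to a security-enabled
# global (4728), local (4732), or universal (4756) group".
GROUP_ADD_EVENT_IDS = {4728, 4732, 4756}


def privileged_group_additions(events):
    """Return (time, actor, member, group) for each addition to a privileged group.

    `events` is an iterable of dicts with hypothetical keys:
    event_id, time, actor, member, group.
    """
    findings = []
    for event in events:
        if event.get("event_id") not in GROUP_ADD_EVENT_IDS:
            continue
        group = str(event.get("group", "")).strip().lower()
        if group in PRIVILEGED_GROUPS:
            findings.append(
                (event.get("time"), event.get("actor"),
                 event.get("member"), event.get("group"))
            )
    return findings
```

A tool that can produce this kind of short, unambiguous list (who added whom to which group, and when) is answering the question. A tool that returns every raw event and leaves the sorting to you is not.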
Will you Take me up on my Challenge?
If you take me up on the challenge and my rhetoric turns out to be true, ask yourself, "Are these things important to me?" My guess is that most of you want to know. Don't settle for less.
Let me just ask you right now. If you are tasked with using log data for security, compliance and solving uptime problems, what would be your criteria?
Shouldn't the real test be things like:
How well does the solution help you identify unwarranted behavior?
Can it accurately detect system problems?
Can the solution determine violations of security policy?
Does it provide recommended fixes for reported issues?
Does the solution help determine unwarranted access to customer data?
Now of course, the log retention requirements are an issue, but any solution can be a garbage collector and archive data for you. Not every solution provides valuable information from log data.
A true log management solution needs to provide much more intelligence than simple normalization. It needs to help unlock the power of log data to give IT professionals and executives the information they need to make logical, calculated decisions.
I would love to hear your comments and questions. Feel free to post them or email me and I would be happy to engage in a conversation or debate.
Labels: Log Management

