What's more efficient - storing logs in sql database or files?

I have a few scripts that cron runs quite often. Right now I don't store any logs, so if a script fails, I won't know until I see the results - and even when I notice that the results are incorrect, I can't do anything because I don't know which script failed.

I've decided to store logs, but I am still not sure how to do it. So, my question is - what's more efficient - storing logs in an SQL database or in files?

I can create a 'logs' table in my MySQL database and store each log entry in a separate row, or I can just use PHP's file_put_contents or fopen/fwrite to store logs in separate files.

My scripts would add approximately 5 log entries (in total) per minute while working. I've done a few tests to determine what's faster - fopen/fwrite or MySQL's insert. I looped an "insert" statement 3000 times to make 3000 rows and looped fopen/fwrite 3000 times to make 3000 files with sample text. Fwrite executed 4-5 times faster than the SQL insert. I made a second loop - I looped a 'select' statement and assigned the result to a string 3000 times - I also opened 3000 files using 'fopen' and assigned the results to the string. The result was the same - fopen/fwrite finished the task 4-5 times faster.

So, to all experienced programmers - what's your experience with storing logs? Any advice?

// 04.09.2011 EDIT - Thank you all for your answers, they helped me a lot. Every post was valuable, so it was quite hard to accept only one answer ;-)

Answers


You can use a component such as Zend_Log, which natively supports attaching several writers to the same log instance. That way you can log the same message to one or more places without changing your logging code, and you can later replace the log system or add a new one easily.
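The multiple-writers idea can be sketched in plain PHP. This is only an illustration of the concept, not the Zend_Log API: the names LogWriter, FileWriter, MemoryWriter and Logger are hypothetical, and MemoryWriter merely stands in for a database or e-mail writer.

```php
<?php
// Hypothetical names throughout; this is not the Zend_Log API.
interface LogWriter
{
    public function write(string $line): void;
}

// Appends each line to a file.
final class FileWriter implements LogWriter
{
    private $path;

    public function __construct(string $path)
    {
        $this->path = $path;
    }

    public function write(string $line): void
    {
        file_put_contents($this->path, $line . "\n", FILE_APPEND | LOCK_EX);
    }
}

// Keeps lines in memory; stands in for a database or e-mail writer.
final class MemoryWriter implements LogWriter
{
    public $lines = [];

    public function write(string $line): void
    {
        $this->lines[] = $line;
    }
}

// Sends every message to all attached writers.
final class Logger
{
    private $writers = [];

    public function addWriter(LogWriter $writer): void
    {
        $this->writers[] = $writer;
    }

    public function log(string $message): void
    {
        $line = date('Y-m-d H:i:s') . ' ' . $message;
        foreach ($this->writers as $writer) {
            $writer->write($line);
        }
    }
}

$logFile = tempnam(sys_get_temp_dir(), 'log');
$memory  = new MemoryWriter();

$logger = new Logger();
$logger->addWriter(new FileWriter($logFile));
$logger->addWriter($memory);
$logger->log('cron job started');
```

The calling code only ever calls log(); adding or removing a destination is a one-line change where the logger is set up.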

As for your question, I think logging to files is simpler and more appropriate if you (the developer) are the only one who needs to read the log messages.

Log to a database instead if other people need to read the logs in a web interface or if you need the ability to search through the logs. As someone else has pointed out, concurrency also matters: if you have a lot of users, logging to a database could scale better.

Finally, a log frequency of 5 messages per minute requires almost no CPU, so you don't need to worry about performance. In your case I'd start with log files and then change (or add more writers) if your requirements change.


Logs using files are more efficient; however, logs stored in the database are easier to read, even remotely (you can write a web frontend if required, for example).

Note, however, that connecting to the database and inserting rows is error-prone (database server down, wrong password, out of resources), so where would you log those errors if you decided to use the database?


Commenting on your findings.

Regarding the writing to the file you are probably right. Regarding the reading you are dead wrong.

Writing to a database:

  1. MyISAM locks the whole table on inserts, causing lock contention. Use InnoDB, which has row-level locking.
  2. Contrary to 1: if you want to do full-text searches on the log, use MyISAM, which supports FULLTEXT indexes (InnoDB has supported them as well since MySQL 5.6).
  3. If you want to be really fast, you can use the MEMORY engine, which keeps the table in RAM. Transfer the data to a disk-based table when CPU load is low.

Reading from the database:

This is where the database truly shines. You can combine all sorts of information from different entries much faster and more easily than you ever could from a flat file.

SELECT logdate, username, action FROM log WHERE userid = '1' /*root*/ AND error = 10;

If you have indexes on the fields used in the WHERE clause, the result will return almost instantly. Try doing that on a flat file.

SELECT username, COUNT(*) AS error_count
FROM log
WHERE error <> 0
GROUP BY username WITH ROLLUP;

Never mind the fact that the table is not normalized; this will be much slower and harder to do with a flat file. It's a no-brainer, really.
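For comparison, here is a rough sketch of what the per-user error count looks like against a flat file. The tab-separated line format (logdate, username, action, error) and the function name countErrorsByUser are assumptions made for this example; the point is only that every line has to be read and parsed because there is no index to consult.

```php
<?php
// Assumed (hypothetical) line format: logdate<TAB>username<TAB>action<TAB>error
function countErrorsByUser(string $path): array
{
    $counts = [];
    $handle = fopen($path, 'r');
    if ($handle === false) {
        return $counts;
    }
    // Full scan: every line is read and split, whether it matches or not.
    while (($line = fgets($handle)) !== false) {
        $fields = explode("\t", rtrim($line, "\r\n"));
        if (count($fields) !== 4) {
            continue; // skip malformed lines
        }
        $username = $fields[1];
        $error    = (int) $fields[3];
        if ($error !== 0) {
            $counts[$username] = ($counts[$username] ?? 0) + 1;
        }
    }
    fclose($handle);
    return $counts;
}

// Sample data written to a temporary file for the demonstration.
$path = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($path, implode("\n", [
    "2011-09-04 10:00:00\troot\tlogin\t0",
    "2011-09-04 10:01:00\troot\tbackup\t10",
    "2011-09-04 10:02:00\talice\timport\t3",
    "2011-09-04 10:03:00\troot\tbackup\t10",
]) . "\n");

$errorCounts = countErrorsByUser($path);
print_r($errorCounts);
```

With the sample data this counts two errors for root and one for alice. Any new question (a date range, a different grouping) means writing another loop like this one.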


Speed isn't everything. Yes, it's faster to write to files, but it's far faster for you to find what you need in the logs if they are in a database. Several years ago I converted our CMS from a file-based log to a MySQL table. The table is better.


Writing to the filesystem should always be faster.

That, however, shouldn't be your concern. Both a simple insert and a write to the file system are quick operations. What you need to worry about is what happens when your database goes down. I personally like to write to both, so there is always a log if anything goes wrong, but you also have the ease of searching in a database.
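Writing to both could look roughly like the sketch below. The function name logToBoth is hypothetical, and $dbInsert is assumed to be any callable that stores one message in the database (for example a prepared PDO INSERT) and throws on failure; here it is replaced by a stub that always fails, to show the fallback.

```php
<?php
// Hypothetical helper: $dbInsert is any callable that stores one message
// in the database and throws an exception when that fails.
function logToBoth(string $message, string $file, callable $dbInsert): bool
{
    $stamp = date('Y-m-d H:i:s');
    // The file write comes first, so a record exists even if the database is down.
    file_put_contents($file, "$stamp $message\n", FILE_APPEND | LOCK_EX);

    try {
        $dbInsert($message);
        return true;
    } catch (Throwable $e) {
        // Note the database failure in the file, the one place still reachable.
        $note = "$stamp DB logging failed: " . $e->getMessage() . "\n";
        file_put_contents($file, $note, FILE_APPEND | LOCK_EX);
        return false;
    }
}

$file = tempnam(sys_get_temp_dir(), 'log');

// Stub that simulates an unreachable database server.
$failingInsert = function (string $message): void {
    throw new RuntimeException('database server down');
};

$stored = logToBoth('import finished', $file, $failingInsert);
```

The return value only says whether the database copy was stored; the file copy is attempted in either case.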


It depends on the size of the logs and on the concurrency level. Because of the latter, your test is completely invalid: if there are 100 users on the site and you have, let's say, 10 threads writing to the same file, fwrite won't be that much faster. One of the things an RDBMS provides is concurrency control.

It also depends on the requirements and on what kind of analysis you want to perform. Just reading records is easy, but what about aggregating some data over a defined period?

Large scale web sites use systems like Scribe for writing their logs.

If you are talking about 5 records per minute, however, this is a really low load, so the main question is how you are going to read them. If a file is suitable for your needs, go with the file. Generally, append-only writes (usual for logs) are really fast.
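If several scripts do append to the same file, an advisory lock keeps their lines from interleaving. A minimal sketch, assuming a hypothetical helper named appendLogLine and that every writer goes through it (flock is advisory, so it only protects against writers that also take the lock):

```php
<?php
// Hypothetical helper: append one timestamped line under an exclusive lock.
function appendLogLine(string $path, string $message): bool
{
    $handle = fopen($path, 'a'); // 'a' opens for append-only writing
    if ($handle === false) {
        return false;
    }

    $written = false;
    // Block until no other cooperating writer holds the lock.
    if (flock($handle, LOCK_EX)) {
        $line    = date('Y-m-d H:i:s') . ' ' . $message . "\n";
        $written = fwrite($handle, $line) !== false;
        fflush($handle);           // push the line out before releasing the lock
        flock($handle, LOCK_UN);
    }

    fclose($handle);
    return $written;
}

$path = tempnam(sys_get_temp_dir(), 'log');
for ($i = 1; $i <= 5; $i++) {
    appendLogLine($path, "entry $i");
}
```

At a handful of entries per minute the lock is almost never contended, so its cost is negligible.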


I think storing logs in a database is not a good idea. The advantage of storing logs in a database rather than in files is that you can analyse them much more easily with the power of SQL; the disadvantage is that you spend much more time on database maintenance. You'd better set up a separate database server for your logs, or the volume of log INSERTs may degrade the performance of your production database. Also, logs in a database are not as easy to migrate or archive as files (logrotate, etc.).

Nowadays you should use a dedicated, feature-rich logging system to handle your logs. For example, logstash (http://logstash.net/) has a log collector and filters, and it can store logs in external systems such as Elasticsearch, combined with a beautiful frontend for visualizing and analyzing your logs.


Error logging is best limited to files in my opinion, because if there is a problem with the database, you can still log that. Obviously that's not an option if your error logging requires a connection to the database!

What I will also say, though, is that general logging is something I leave within the database, but this only applies if you are doing lots of logging for audit trails etc.


Personally, I prefer log files so I've created two functions:

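<?php
// Append a timestamped message to the given log file.
// Does nothing when no message or no file name is supplied.
function logMessage($message=null, $filename=null)
{
    if (!is_null($message) && !is_null($filename))
    {
        $logMsg=date('Y/m/d H:i:s').": $message\n";
        // Message type 3 makes error_log() append to the file named in $filename.
        error_log($logMsg, 3, $filename);
    }
}

// Log a message flagged as an error so it stands out in the shared log file.
function logError($message=null, $filename=null)
{
    if (!is_null($message))
    {
        logMessage("***ERROR*** {$message}", $filename);
    }
}
?>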

I define a constant or two (I use ACTIVITY_LOG and ERROR_LOG, both set to the same file, so you don't need to refer to two files side by side to get an overall view of a run) and call the functions as appropriate. I've also created a dedicated folder (/var/log/phplogs), and each application that I write has its own log file. Finally, I rotate logs so that I have some history to refer back to for customers.

Liberal use of the above functions means that I can trace the execution of apps fairly easily.
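The answer does not say how the logs are rotated (a system tool such as logrotate is a common choice). Purely as an illustration, a size-based rotation could be sketched in PHP as follows; the function name rotateLog, the numbered-suffix naming scheme and the thresholds are all assumptions for this example.

```php
<?php
// Hypothetical helper; not the rotation method used in the answer above.
// Renames $path to "$path.1" once it reaches $maxBytes, keeping $keep old files.
function rotateLog(string $path, int $maxBytes, int $keep = 3): bool
{
    clearstatcache(true, $path); // make sure filesize() is not a cached value
    if (!is_file($path) || filesize($path) < $maxBytes) {
        return false; // nothing to rotate yet
    }

    // Shift older files up by one: .2 becomes .3, .1 becomes .2, and so on.
    // The oldest file is overwritten, which caps the history at $keep files.
    for ($i = $keep - 1; $i >= 1; $i--) {
        if (is_file("$path.$i")) {
            rename("$path.$i", "$path." . ($i + 1));
        }
    }

    return rename($path, "$path.1");
}

// Demonstration on a temporary file that is larger than the threshold.
$path = tempnam(sys_get_temp_dir(), 'log');
file_put_contents($path, str_repeat("x", 2048));

$rotated = rotateLog($path, 1024);
```

After rotation the next write simply recreates the log file, since appending to a missing file creates it.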

