Count lines in files in folders recursively

In C#, what is the best way to get a count of the total number of lines in all the files in a directory and all of its subdirectories?

The obvious answer is to make a recursive function to go through all of the directories and use the strategy from this question to count the lines in each file. Is there a better/easier way?

Answers


Here's a LINQy way of doing so:

string path = @"C:\TonsOfTextFiles";
int totalLines = (from file in Directory.GetFiles(path, "*.*", SearchOption.AllDirectories)
                    let fileText = File.ReadAllLines(file)
                    select fileText.Length).Sum();

Is there a better/easier way?

No, there is (in general) no better way to get the number of lines in a file than by counting them.

In order to find the total number of lines in all files, you will have to get the total number of lines in each file at some point. There's really no way around that.


The strategy you described works well. An alternative to a recursive function (which is essentially a depth-first search) is an iterative breadth-first search driven by a queue. Something like:

int CountLines(string path)
{
    var queue = new Queue<string>();
    queue.Enqueue(path);
    int count = 0;
    while (queue.Count > 0) {
        string dir = queue.Dequeue();
        foreach (var subdir in Directory.GetDirectories(dir))
            queue.Enqueue(subdir);
        foreach (var file in Directory.GetFiles(dir))
            count += GetLineCount(file); // GetLineCount is your per-file line counter (not shown here)
    }
    return count;
}

There is not really a better way. Walking a directory tree through all of its subdirectories lends itself naturally to recursion. As for counting the lines in a file, you have no choice but to open the file and count them. Be aware that a very deep directory tree could overflow the call stack, so you might have to simulate the recursion manually with an explicit Stack (or a Queue, if breadth-first order is acceptable).

Since it's relatively easy to code that method correctly, clearly, and concisely, I think that is what you should do, and then move on to adding value elsewhere.
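
For illustration, the recursive approach described above might look roughly like the sketch below. The names LineCounter, CountLinesInFile and CountLinesRecursive are made up for this example, and it assumes every file under the directory is a readable text file.

```csharp
using System;
using System.IO;
using System.Linq;

static class LineCounter
{
    // Hypothetical helper: counts the lines in one file.
    // File.ReadLines streams the file instead of loading it all at once.
    static int CountLinesInFile(string file)
    {
        return File.ReadLines(file).Count();
    }

    // Sums the line counts of the files directly in 'dir',
    // then recurses into each subdirectory.
    public static int CountLinesRecursive(string dir)
    {
        int count = 0;
        foreach (string file in Directory.GetFiles(dir))
            count += CountLinesInFile(file);
        foreach (string subdir in Directory.GetDirectories(dir))
            count += CountLinesRecursive(subdir);
        return count;
    }
}
```

Calling LineCounter.CountLinesRecursive(@"C:\some\path") would then return the total for that tree. The recursion depth equals the depth of the directory tree, which is where the stack concern comes from.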


For finding the files, why not just use something like:

Directory.GetFiles("C:/some/path", "*.txt", SearchOption.AllDirectories);

This will give you the results of a recursive search.
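
As a rough sketch of how that search could feed a line count, the example below uses Directory.EnumerateFiles (the lazy counterpart of GetFiles, available from .NET 4) together with File.ReadLines. The class and method names are hypothetical, and it assumes all matching files are readable.

```csharp
using System.IO;
using System.Linq;

static class LazyLineCounter
{
    // Walks 'root' recursively, yielding matching files one at a time,
    // and adds up the number of lines in each.
    public static long CountLines(string root, string pattern)
    {
        return Directory.EnumerateFiles(root, pattern, SearchOption.AllDirectories)
                        .Sum(file => File.ReadLines(file).LongCount());
    }
}
```

For example, LazyLineCounter.CountLines("C:/some/path", "*.txt") would total the lines of every .txt file under that path without first building the full array of file names.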


I think that post sufficiently explains the latter part of your question. As for traversing the directories, check out http://dotnetperls.com/recursively-find-files

UPDATE: there is an abstraction over this. I was really hoping you would read the link, but here it is: http://dotnetperls.com/recursive-file-list-1


Please God, forgive me:

@echo off
set sum=0
for /r %%f in (*.cs) do find /v /c "$$some nonsense string$$" "%%f" >> test.dat
for /f "tokens=3 delims=:" %%i in (test.dat) do set /a sum += %%i
echo total lines = %sum%
del test.dat

Isn't C#, but it's fun.

EDIT: This version can be more memory efficient, since it reads each file one line at a time instead of loading it whole with ReadAllLines:

string basePath = @"C:\some\path";
Console.WriteLine(
    Directory.GetFiles(basePath, "*.cs", SearchOption.AllDirectories)
        .Sum(file => 
        {
            int lines = 0;
            using (StreamReader reader = new StreamReader(file))
                while(reader.ReadLine() != null) lines++;
            return lines;
        }));
