What is the most efficient way to read a big text file backwards?

What is the most efficient way to read a big text file backwards, line by line, using Windows API functions? For example, if a file is:

line 1
...
line 108777
line 108778

the output should be:

line 108778
line 108777
...
line 1

I want to write a C program for this. You don't need to write any code (but if you want to, that's great); I am mainly interested in how to do this, bearing in mind that the files are big and that I want the program to run as fast as it can.

Also, I am interested in which Windows API functions to use.

Answers


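A more clever solution is to open the file, set the file offset to (end of file - buffersize) and read buffersize bytes. You can then parse the data in the buffer from back to front to find newlines and do whatever you want with each line. Move the offset back by another buffersize and repeat until you reach the start of the file, keeping any partial line left at the front of the buffer so it can be joined with the block read next.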
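
A minimal sketch of this block-by-block approach, written with portable stdio so it is easy to try out. The function name `reverse_lines_chunked` and the 64 KB `CHUNK_SIZE` are assumptions made for illustration. With the raw Windows API the same loop would presumably use `SetFilePointerEx` and `ReadFile` in place of `fseek` and `fread`. Note that `long` is 32 bits on Windows, so for files over 2 GB the MSVC variants `_fseeki64` / `_ftelli64` (or the API calls) would be needed.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define CHUNK_SIZE 65536  /* arbitrary block size, an assumption; tune as needed */

/* Writes the lines of `in` to `out`, last line first.
   `in` must be seekable and opened in binary mode. Returns 0 on success. */
int reverse_lines_chunked(FILE *in, FILE *out)
{
    char *buf = malloc(CHUNK_SIZE);
    char *carry = NULL;    /* tail of a line whose start lies in an earlier block */
    size_t carry_len = 0;
    int at_end = 1;        /* true until the first newline has been seen */
    int rc = -1;
    long pos;

    if (buf == NULL) return -1;
    if (fseek(in, 0, SEEK_END) != 0 || (pos = ftell(in)) < 0) goto done;
    if (pos == 0) { rc = 0; goto done; }          /* empty file: nothing to do */

    while (pos > 0) {
        size_t n = pos < CHUNK_SIZE ? (size_t)pos : CHUNK_SIZE;
        size_t end = n;    /* one past the last unconsumed byte in buf */
        size_t i;

        pos -= (long)n;
        if (fseek(in, pos, SEEK_SET) != 0 || fread(buf, 1, n, in) != n) goto done;

        for (i = n; i-- > 0; ) {                  /* scan the block back to front */
            if (buf[i] != '\n') continue;
            /* Skip the empty piece that follows a newline at the very end. */
            if (!(at_end && end - (i + 1) + carry_len == 0)) {
                fwrite(buf + i + 1, 1, end - (i + 1), out);
                if (carry_len > 0) fwrite(carry, 1, carry_len, out);
                fputc('\n', out);
            }
            at_end = 0;
            carry_len = 0;
            end = i;
        }

        if (end > 0) {     /* front of the block belongs to an unfinished line */
            char *grown = realloc(carry, carry_len + end);
            if (grown == NULL) goto done;
            carry = grown;
            memmove(carry + end, carry, carry_len);
            memcpy(carry, buf, end);
            carry_len += end;
        }
    }

    /* Whatever is left is the first line of the file. */
    if (carry_len > 0) fwrite(carry, 1, carry_len, out);
    fputc('\n', out);
    rc = 0;

done:
    free(buf);
    free(carry);
    return rc;
}
```

Only the newline byte is treated as a separator, so a CR before it stays attached to its line and CRLF files come out as CRLF. A final line without a terminating newline gets one added on output.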


If performance is more important than memory utilization, I'd just do a buffered read of the entire text file into memory and then parse it in whatever order you like.

Take a look at memory mapped files, some advantages of which are discussed here.


Memory-map the file. It will be automatically buffered for you - just read it as if it was memory, starting from the tail and looking for CRs / LFs / CRLFs.
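
A rough sketch of the memory-mapped variant. The scanning routine `emit_lines_backwards` works on any contiguous buffer, so it applies equally to a file that was simply read into memory. The Windows-only wrapper `reverse_file_mapped` shows one plausible way to obtain that buffer with `CreateFileMapping` and `MapViewOfFile`. Both function names are made up for this example, and the wrapper maps the whole file in a single view, which assumes the file fits in the process's address space.

```c
#include <stdio.h>
#include <stddef.h>
#ifdef _WIN32
#include <windows.h>
#endif

/* Walks a memory region from its tail and writes each line to `out`,
   last line first. A CR before the LF stays attached to its line. */
void emit_lines_backwards(const char *data, size_t len, FILE *out)
{
    size_t end = len;      /* one past the last byte of the current line */
    size_t i;

    if (len == 0) return;
    if (data[len - 1] == '\n') end = len - 1;   /* ignore the final newline */

    for (i = end; i-- > 0; ) {
        if (data[i] == '\n') {
            fwrite(data + i + 1, 1, end - (i + 1), out);
            fputc('\n', out);
            end = i;
        }
    }
    fwrite(data, 1, end, out);                  /* first line of the file */
    fputc('\n', out);
}

#ifdef _WIN32
/* Hypothetical wrapper: maps `path` read-only and reverses it into `out`.
   Returns 0 on success, -1 on any failure. */
int reverse_file_mapped(const char *path, FILE *out)
{
    HANDLE file, mapping = NULL;
    LARGE_INTEGER size;
    const char *view = NULL;
    int rc = -1;

    file = CreateFileA(path, GENERIC_READ, FILE_SHARE_READ, NULL,
                       OPEN_EXISTING, FILE_ATTRIBUTE_NORMAL, NULL);
    if (file == INVALID_HANDLE_VALUE) return -1;

    if (!GetFileSizeEx(file, &size)) goto done;
    if (size.QuadPart == 0) { rc = 0; goto done; }  /* empty files cannot be mapped */

    mapping = CreateFileMappingA(file, NULL, PAGE_READONLY, 0, 0, NULL);
    if (mapping == NULL) goto done;

    view = (const char *)MapViewOfFile(mapping, FILE_MAP_READ, 0, 0, 0);
    if (view == NULL) goto done;

    emit_lines_backwards(view, (size_t)size.QuadPart, out);
    rc = 0;

done:
    if (view != NULL) UnmapViewOfFile(view);
    if (mapping != NULL) CloseHandle(mapping);
    CloseHandle(file);
    return rc;
}
#endif
```

On Windows this could be called as `reverse_file_mapped("big.txt", stdout)`, where `big.txt` is a placeholder file name.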


Memory mapped files will fail (or at least become very tricky) if the file's bigger than the available address space. Instead, try this:

input = input file
block_prefix = unique temporary file
block_index = 0

while (!eof (input))
{
   line = input.readline ();
   push line onto a stack

   if (stack > 100 entries) // doesn't have to be 100
   {
      output = block_prefix + block_index++

      while (stack has entries)
      {
        pop line off stack
        write to output
      }
   }
}

if (stack has entries)
{
  output = block_prefix + block_index++

  while (stack has entries)
  {
    pop line off stack
    write to output
  }
}

output = output file

while (block_index)
{
   read entire contents of block file (block_prefix + --block_index)
   write contents to output
   delete block file
}

One method is to use a container of file offsets to the beginning of each line. After parsing the file, process the container in reverse order. See fgetc, fgets and fseek.
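
A small sketch of the offset-container idea, under a few assumptions: the name `reverse_lines_indexed` is invented here, the container is a plain growable array of `long` offsets, and the stream is opened in binary mode so that counting bytes gives valid `fseek` positions. It reads with `fgetc` rather than `fgets` so that line length is not limited by a fixed buffer.

```c
#include <stdio.h>
#include <stdlib.h>

/* Pass 1 records the offset of every line start; pass 2 visits those
   offsets from last to first and copies each line to `out`.
   Returns 0 on success, -1 on failure. */
int reverse_lines_indexed(FILE *in, FILE *out)
{
    long *starts = NULL;       /* offset of the first byte of each line */
    size_t count = 0, cap = 0;
    int at_line_start = 1;
    long pos = 0;
    int rc = -1;
    int c;

    rewind(in);

    /* Pass 1: build the container of line-start offsets. */
    while ((c = fgetc(in)) != EOF) {
        if (at_line_start) {
            if (count == cap) {
                size_t new_cap = cap ? cap * 2 : 1024;
                long *grown = realloc(starts, new_cap * sizeof *grown);
                if (grown == NULL) goto done;
                starts = grown;
                cap = new_cap;
            }
            starts[count++] = pos;
            at_line_start = 0;
        }
        if (c == '\n') at_line_start = 1;
        pos++;
    }

    /* Pass 2: process the container in reverse order. */
    while (count > 0) {
        if (fseek(in, starts[--count], SEEK_SET) != 0) goto done;
        while ((c = fgetc(in)) != EOF && c != '\n')
            fputc(c, out);
        fputc('\n', out);
    }
    rc = 0;

done:
    free(starts);
    return rc;
}
```

This reads the file twice and seeks once per line, so it is likely to be slower than scanning blocks from the end, but the index can be reused if the lines need to be visited in other orders as well.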

