Storing 'struct' data to binary file
I need to store a binary file with a 12-byte header composed of 4 fields: sSamples (4-byte integer), sSampPeriod (4-byte integer), sSampSize (2-byte integer), and sParmKind (2-byte integer). I'm using 'struct' to convert my variables to the desired fields. Now that I have them packed separately, how can I merge them all to store the 12-byte header?
sSamples = struct.pack('i', nSamples)        # 4-byte integer
sSampPeriod = struct.pack('i', nSampPeriod)  # 4-byte integer
sSampSize = struct.pack('H', nSampSize)      # 2-byte integer / unsigned short
sParmKind = struct.pack('H', 9)              # 2-byte integer / unsigned short
In addition, I have a float array npVect of dimensionality D (numpy.ndarray, float32). How can I store this vector in the same binary file, after the header?
As Cody Brocious wrote, you can pack your entire header at once:
header = struct.pack('<iiHH', nSamples, nSampPeriod, nSampSize, nParmKind)
He also mentioned endianness, which is important if you want to pack your data so as to reliably unpack it on machines with different architectures. The < at the beginning of my format string specifies "pack this data using a little-endian convention".
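To make the effect of the byte-order prefix concrete, here is a small sketch that packs the same integer both ways. The value of `n_samples` is made up for illustration; only the `<` and `>` prefixes and `struct.calcsize` matter here.

```python
import struct

n_samples = 1  # hypothetical value, chosen so the byte order is easy to see

little = struct.pack('<i', n_samples)  # least significant byte first
big = struct.pack('>i', n_samples)     # most significant byte first

print(little)  # b'\x01\x00\x00\x00'
print(big)     # b'\x00\x00\x00\x01'

# With an explicit byte order, struct uses standard sizes and no padding,
# so the four header fields always occupy exactly 12 bytes.
header_size = struct.calcsize('<iiHH')
print(header_size)  # 12
```

Whichever prefix you choose, the reader has to use the same one when unpacking.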
As for the array, you'll have to pack its length in order to determine how many values to unpack when you read it again. Doing it all in one call:
flattened = npVect.ravel()  # get a 1-D array of numbers
arrSize = len(flattened)
# pack header, count of numbers, and numbers, all in one call
packed = struct.pack('<iiHHi%df' % arrSize, nSamples, nSampPeriod, nSampSize, nParmKind, arrSize, *flattened)
Depending on how big your array is likely to be, you could end up with a huge string representing the entire contents of your binary file, and you might want to look into alternatives to struct which don't require you to have the entire file in memory.
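One way to avoid building the whole file in memory, while still using struct, is to write the header and count first and then pack the values in chunks. This is only a sketch under assumptions: `write_record`, its parameters, and `chunk_size` are hypothetical names, and `values` can be any sliceable 1-D sequence such as a list or a flattened array.

```python
import io
import struct

def write_record(f, header_fields, values, chunk_size=1024):
    """Write the 12-byte header, a value count, then float32 values in chunks.

    header_fields is (nSamples, nSampPeriod, nSampSize, nParmKind).
    """
    count = len(values)
    # Header plus count: 4 + 4 + 2 + 2 + 4 = 16 bytes, little-endian.
    f.write(struct.pack('<iiHHi', *header_fields, count))
    # Pack and write a slice at a time so only one chunk is held as bytes.
    for start in range(0, count, chunk_size):
        chunk = values[start:start + chunk_size]
        f.write(struct.pack('<%df' % len(chunk), *chunk))

# Usage with an in-memory buffer; a file opened with open(path, 'wb') works the same way.
buf = io.BytesIO()
write_record(buf, (3, 100000, 4, 9), [1.0, 2.0, 3.0], chunk_size=2)
data = buf.getvalue()
print(len(data))  # 28
```

If the data already lives in a NumPy array, its `tofile` method is another option worth considering, since it writes the raw array contents directly to an open file after you have written the header yourself.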
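To unpack the data again:
import numpy as np
fmt = '<iiHHi'
# unpack_from reads only the header and count; struct.unpack would fail
# because the buffer is longer than the format
nSamples, nSampPeriod, nSampSize, nParmKind, arrSize = struct.unpack_from(fmt, packed)
# Use unpack_from to start reading after the packed header and count
flattened = struct.unpack_from('<%df' % arrSize, packed, struct.calcsize(fmt))
# np.array builds an array from the values (np.ndarray would treat them as a shape)
npVect = np.array(flattened, dtype='float32').reshape(
    # your dimensions go here
)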
EDIT: Oops, the array format isn't quite as simple as that :) The general idea holds, though: flatten your array into a list of numbers using any method you like, pack the number of values, then pack each value. On the other side, read the array as a flat list, then impose whatever structure you need on it.
EDIT: Changed format strings to use repeat specifiers, rather than string multiplication. Thanks to John Machin for pointing it out.
EDIT: Added numpy code to flatten the array before packing and reconstruct it after unpacking.
struct.pack returns a string (a bytes object in Python 3), so you can combine the fields simply by concatenation:
header = sSamples + sSampPeriod + sSampSize + sParmKind
assert len(header) == 12
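As a quick check that concatenation and a single combined format give the same bytes, here is a small sketch. The field values are made up, and an explicit `<` prefix is used in both cases so the comparison does not depend on the machine's native byte order.

```python
import struct

# Hypothetical values, for illustration only.
n_samples, samp_period, samp_size, parm_kind = 10, 100000, 4, 9

# Pack each field separately, then join the pieces.
parts = [
    struct.pack('<i', n_samples),
    struct.pack('<i', samp_period),
    struct.pack('<H', samp_size),
    struct.pack('<H', parm_kind),
]
header = b''.join(parts)

# Pack all four fields in one call with a combined format string.
combined = struct.pack('<iiHH', n_samples, samp_period, samp_size, parm_kind)

print(header == combined)  # True
print(len(header))         # 12
```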