# AForge FFT gives a different result than Accelerate FFT

Not sure what I am doing wrong. The results I get from the Accelerate framework seem incorrect to me. Any help would be much appreciated!

Here are some graphs comparing AForge with vDPS This is the vDSP Code I run

fftSetup = vDSP_create_fftsetup( 16, 2); // Convert the data into a DSPSplitComplex int samples = spectrumDataSize; int samplesOver2 = samples/2; DSPSplitComplex * complexData = new DSPSplitComplex; float *realpart = (float *)calloc(samplesOver2, sizeof(float)); float *imagpart = (float *)calloc(samplesOver2, sizeof(float)); complexData->realp = realpart; complexData->imagp = imagpart; vDSP_ctoz((DSPComplex *)realData, 2, complexData, 1,samplesOver2); // Calculate the FFT // ( I'm assuming here you've already called vDSP_create_fftsetup() ) vDSP_fft_zrip(fftSetup, complexData, 1, log2f(samples), FFT_FORWARD); // Scale the data //float scale = (float) FFT_SCALE; //scale is 32 vDSP_vsmul(complexData->realp, 1, &scale, complexData->realp, 1,samplesOver2); vDSP_vsmul(complexData->imagp, 1, &scale, complexData->imagp, 1, samplesOver2); vDSP_zvabs(complexData, 1, spectrumData, 1, samples); free(complexData->realp); free(complexData->imagp); delete complexData; // All done! return spectrumData;

This is what I do in AForge

foreach (float f in floatData) { if (i >= this.fft.Length) break; this.fft[i++] = new Complex(f * fftSize, 0); } AForge.Math.FourierTransform.FFT(this.fft, FourierTransform.Direction.Forward);

## Answers

After the following subroutine

vDSP_ctoz((DSPComplex *)realData, 2, complexData, 1,samplesOver2);

is executed, complexData has samplesOver2 elements. But soon after that, you call

vDSP_zvabs(complexData, 1, spectrumData, 1, samples);

which expects complexData to have samples elements, i.e. twice as many. This cannot be.

Also, how is realData laid out? I ask because vDSP_ctoz expects its first argument to be laid out in the form

real0, imag0, real1, imag1, ... real(n-1), imag(n-1).

If your data is indeed real, then imag0, imag1, ... imag(n-1) should all be 0. If it is not, then vDSP_ctoz may not be expecting that. (Unless you are packing the real data in some clever way, which would be two [sic] clever by half!)

Finally, vDSP_create_fftsetup( 16, 2); should probably be changed to

vDSP_create_fftsetup(16, 0);

===================================================================

My sample code appended in postscript:

FFTSetup fftSetup = vDSP_create_fftsetup( 16, // vDSP_Length __vDSP_log2n, kFFTRadix2 // FFTRadix __vDSP_radix // CAUTION: kFFTRadix2 is an enum that is equal to 0 // kFFTRadix5 is an enum that is equal to 2 // DO NOT USE 2 IF YOU MEAN kFFTRadix2 ); NSAssert(fftSetup != NULL, @"vDSP_create_fftsetup() failed to allocate storage"); int numSamples = 65536; // numSamples must be an integer power of 2; in this case 65536 = 2 ^ 16 float realData[numSamples]; // Prepare the real data with (ahem) fake data, in this case // the sum of 3 sinusoidal waves representing a C major chord. // The fake data is rigged to have a sampling frequency of 44100 Hz (as for a CD). // As always, the Nyquist frequency is just half the sampling frequency, i.e., 22050 Hz. for (int i = 0; i < numSamples; i++) { realData[i] = sin(2 * M_PI * 261.76300048828125 * i / 44100.0) // C4 = 261.626 Hz + sin(2 * M_PI * 329.72717285156250 * i / 44100.0) // E4 = 329.628 Hz + sin(2 * M_PI * 392.30804443359375 * i / 44100.0); // G4 = 391.995 Hz } float splitReal[numSamples / 2]; float splitImag[numSamples / 2]; DSPSplitComplex splitComplex; splitComplex.realp = splitReal; splitComplex.imagp = splitImag; vDSP_ctoz( (const DSPComplex *)realData, // const DSPComplex __vDSP_C[], 2, // vDSP_Stride __vDSP_strideC, MUST BE A MULTIPLE OF 2 &splitComplex, // DSPSplitComplex *__vDSP_Z, 1, // vDSP_Stride __vDSP_strideZ, (numSamples / 2) // vDSP_Length __vDSP_size ); vDSP_fft_zrip( fftSetup, // FFTSetup __vDSP_setup, &splitComplex, // DSPSplitComplex *__vDSP_ioData, 1, // vDSP_Stride __vDSP_stride, (vDSP_Length)lround(log2(numSamples)), // vDSP_Length __vDSP_log2n, // IMPORTANT: THE PRECEDING ARGUMENT MUST BE LOG_BASE_2 OF THE NUMBER OF floats IN splitComplex // FOR OUR EXAMPLE, THIS WOULD BE (numSamples / 2) + (numSamples / 2) = numSamples kFFTDirection_Forward // FFTDirection __vDSP_direction ); printf("DC component = %f\n", splitComplex.realp[0]); printf("Nyquist component = %f\n\n", splitComplex.imagp[0]); // Next, we compute the Power Spectral Density (PSD) from the FFT. // (The PSD is just the magnitude-squared of the FFT.) // (We don't bother with scaling as we are only interested in relative values of the PSD.) float powerSpectralDensity[(numSamples / 2) + 1]; // the "+ 1" is to make room for the Nyquist component // We move the Nyquist component out of splitComplex.imagp[0] and place it // at the end of the array powerSpectralDensity, squaring it as we go: powerSpectralDensity[numSamples / 2] = splitComplex.imagp[0] * splitComplex.imagp[0]; // We can now zero out splitComplex.imagp[0] since the imaginary part of the DC component is, in fact, zero: splitComplex.imagp[0] = 0.0; // Finally, we compute the squares of the magnitudes of the elements of the FFT: vDSP_zvmags( &splitComplex, // DSPSplitComplex *__vDSP_A, 1, // vDSP_Stride __vDSP_I, powerSpectralDensity, // float *__vDSP_C, 1, // vDSP_Stride __vDSP_K, (numSamples / 2) // vDSP_Length __vDSP_N ); // We print out a table of the PSD as a function of frequency // Replace the "< 600" in the for-loop below with "<= (numSamples / 2)" if you want // the entire spectrum up to and including the Nyquist frequency: printf("Frequency_in_Hz Power_Spectral_Density\n"); for (int i = 0; i < 600; i++) { printf("%f, %f\n", (i / (float)(numSamples / 2)) * 22050.0, powerSpectralDensity[i]); // Recall that the array index i = 0 corresponds to zero frequency // and that i = (numSamples / 2) corresponds to the Nyquist frequency of 22050 Hz. // Frequency values intermediate between these two limits are scaled proportionally (linearly). } // The output PSD should be zero everywhere except at the three frequencies // corresponding to the C major triad. It should be something like this: /*************************************************************************** DC component = -0.000000 Nyquist component = -0.000000 Frequency_in_Hz Power_Spectral_Density 0.000000, 0.000000 0.672913, 0.000000 1.345825, 0.000000 2.018738, 0.000000 2.691650, 0.000000 . . . 260.417175, 0.000000 261.090088, 0.000000 261.763000, 4294967296.000000 262.435913, 0.000000 263.108826, 0.000000 . . . 328.381348, 0.000000 329.054260, 0.000000 329.727173, 4294967296.000000 330.400085, 0.000000 331.072998, 0.000000 . . . 390.962219, 0.000000 391.635132, 0.000000 392.308044, 4294966784.000000 392.980957, 0.000000 393.653870, 0.000000 . . . ***************************************************************************/ vDSP_destroy_fftsetup(fftSetup);

This line:

vDSP_zvabs(complexData, 1, spectrumData, 1, samples);

should be:

float cr = complexData->realp[0], ci = complexData->imagp[0]; vDSP_zvabs(complexData, 1, spectrumData, 1, samplesOver2); spectrumData[0] = cr*cr; spectrumData[samplesOver2] = ci*ci; // See remarks below.

This because the real-to-complex FFT of N samples returns N/2+1 results. Two of the results are real numbers, which are packed into complexData->realp[0] and complexData->imagp[0]. The remaining N/2-1 results are complex numbers, stored normally with the real components in complexData->realp[i] and the imaginary components in complexData->imagp[i], for 0 < i < N/2.

vDSP_zvabs computes the magnitudes of the complex numbers, except that the first output (in spectrumData[0]) is incorrect due to the packing of two numbers into the [0] elements. Overwriting spectrumData[0] with cr*cr corrects that. You can also write the magnitude of the other packed element (the Nyquist frequency) into spectrumData[samplesOver2], if space has been provided for that.

Some other notes:

spectrumDataSize must be a power of two.

It is not ideal practice to calculate the base-two logarithm as log2f(samples). I think we (Apple) have made log2f return exactly correct results for integer powers of two, but depending on floating-point exactness should be avoided unless care has been taken to be very certain of it.

There is no need to dynamically allocate a DSPSplitComplex with “new”. It is POD (plain old data) containing just two pointers, so you can simply declare “DSPSplitComplex complexData” and use it as a struct, rather than a pointer to a struct.

Some thoughts..

int numberOfInputSamples = ..; int numberOfInputSamplesOver2 = numberOfInputSamples/2; fftSetup = vDSP_create_fftsetup( log2(numberOfInputSamples), FFT_RADIX2 ); ... Float32 scale = (Float32) 1.0 / (2 * numberOfInputSamples); ... float *spectrumData = (float *)calloc( numberOfInputSamplesOver2, sizeof(float)); vDSP_zvabs( complexData, 1, spectrumData, 1, numberOfInputSamplesOver2 );

so at the end you will have numberOfInputSamplesOver2 float magnitudes, right?

*(technically it is numberOfInputSamplesOver2+1 but the whole packing thing is another question)*

You appear to be computing one FFT for length N/2, and one for length N. Thus the different results for different length FFTs.

I assume since you are calling vDSP_ctoz that your data is not in even-odd. If that IS the case, you also need to unpack it after the fft.

From vDSP Programming Guide:

Applications that call the real FFT may have to use two transformation functions, one before the FFT call and one after. This is required if the input array is not in the even-odd split configuration.

Hope that helps.

I am not at all familiar with either AForge or Accelerate, but I did encounter some problems when upgrading FFT libraries in another project dealing with 2D images, which look to me as similar to yours.

It turns out that output data representation from FFT libraries isn't unique, and for some applications the output data is much more convenient if "swapped", so as to put low frequencies in the center rather than in corners.

If you check this page on a FFT algorithm, http://www.eso.org/sci/software/eclipse/eug/eug/man/fft.html , you'll notice that both formats are supported, and the swap structure is described (at bottom).

It seems to me that the data you graphed at the right would look much more like the ones on the left, were you to swap (mirror) the right half of the data array around the center.