FFT by samshi, 2020-01-23

    about the algorithm:
    1. double-precision,<1e-13
    2. 1d complex DFTs, power-of-two sizes
    3. breadth-first
    4. no bit-reversal
    5. no data re-ordering
    6. twiddle factors array, N/4 + 1 length, only cosine value, [0, pi/2]
    7. cache array for temporary data exchange, N/2 * 2 length, one for real, one for imaginary
    8. continue loop outputs
    9. performance: 2**20, 190ms, 1000MFLOPS, iMac 3.5Ghz chrome MacOS
    10. javascript size <8k, compressed <2.4k, hdcafe.com/code/js/fft14.js

    samshi
    40882080@qq.com


    ps:
    continue loop outputs algorithm:

    loop k= range(0 : dimensine):
        bfsize = N >> (k+1)
        loop output_index = range(0 : N/2):
            m = output_index % bfsize
            t = output_index - m

            input_a = output_index + m
            input_b = input_a + bfsize

            omg_re = omg[m]
            omg_im = omg[N/8 - m]
            mt_re = omg_re * x[input_b].re - omg_im * x[input_b].im
            mt_im = omg_re * x[input_b].im + omg_im * x[input_b].re

            x[output_index].re       = x[input_a].re + mt_re
            x[output_index].im       = x[input_a].im + mt_im
            x[output_index + N/2].re = x[input_a].re - mt_re
            x[output_index + N/2].im = x[input_a].im - mt_im