On Sat, Aug 14, 2010 at 8:18 PM, Marcus D. Leech [email protected]
wrote:
to define my passband of interest (I bring in 1MHz, but only need/want
filter that’s
computationally the same order as the two in series, give or take. Sigh.
Glad I could be that somebody.
a dual-core CPU running at about 1.7GHz. The current app is taking up
about 65% of the combined CPU, and I just want to get a little more
headroom.
If you take that approach, you’d get better performance from the longer
SIMD instructions, since you wouldn’t be fetching as many instructions
when using the convolved filters, right? The number of operations ends up
being the same, but how quickly and concisely you can tell the
processor to do them is what you’re fighting now, it seems.
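To make the "same number of operations" point concrete, here is a rough sketch in plain Python. The tap values and test samples are made up purely for illustration (they are not your filters), and the helper names convolve and fir are my own. It only shows that two FIR filters in series match a single filter whose taps are the convolution of the two, and that the tap count comes out about the same:

```python
def convolve(a, b):
    """Full linear convolution of two tap lists."""
    out = [0.0] * (len(a) + len(b) - 1)
    for i, x in enumerate(a):
        for j, y in enumerate(b):
            out[i + j] += x * y
    return out


def fir(taps, samples):
    """Direct-form FIR filter with zero initial state (output length == input length)."""
    out = []
    for n in range(len(samples)):
        acc = 0.0
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * samples[n - k]
        out.append(acc)
    return out


# Hypothetical taps and input, for illustration only.
h1 = [0.25, 0.5, 0.25]
h2 = [1.0, -0.5, 0.25, -0.125]
x = [1.0, 2.0, -1.0, 0.5, 3.0, -2.0, 0.0, 1.5]

combined = convolve(h1, h2)        # one filter equivalent to h1 followed by h2
cascade = fir(h2, fir(h1, x))      # two filters in series
single = fir(combined, x)          # one pass with the convolved taps

# Multiply-accumulates per output sample: series vs. combined.
macs_series = len(h1) + len(h2)    # N1 + N2
macs_combined = len(combined)      # N1 + N2 - 1
```

So the arithmetic is a wash (N1 + N2 versus N1 + N2 - 1 multiplies per output); any gain would have to come from running one longer inner loop instead of two shorter ones.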
It’s a shame it’s a D510 and doesn’t support the DPPS or DPPD dot-product
instructions, as described here:
http://en.wikipedia.org/wiki/SSE4#SSE4.1
Either way, you may be able to pack twice as many samples into each SIMD
register if you use 16-bit samples instead of 32-bit floats:
http://en.wikipedia.org/wiki/Streaming_SIMD_Extensions
But that comes at the cost of reduced dynamic range, which I’m not sure
you can live with.
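Some back-of-the-envelope numbers for that trade-off, as a small Python sketch. It assumes 128-bit SSE registers and a fixed-point format that maps [-1.0, 1.0) onto signed 16-bit integers; the function names are mine, and how your samples are actually scaled is something I’m guessing at:

```python
import math

REGISTER_BITS = 128  # width of one SSE XMM register


def lanes(sample_bits):
    """How many samples of the given width fit in one register."""
    return REGISTER_BITS // sample_bits


def int_dynamic_range_db(bits):
    """Ratio of full scale to one LSB for a fixed-point sample, in dB."""
    return 20.0 * math.log10(2 ** bits)


def quantize_int16(value):
    """Map a float in [-1.0, 1.0) to a signed 16-bit integer, clipping at the rails."""
    return max(-32768, min(32767, int(round(value * 32768.0))))


float_lanes = lanes(32)                  # 32-bit floats per register
int16_lanes = lanes(16)                  # 16-bit samples per register
range_16_db = int_dynamic_range_db(16)   # roughly 96 dB

# Round-trip error for one (hypothetical) sample value.
sample = 0.3
round_trip_error = abs(quantize_int16(sample) / 32768.0 - sample)
```

So you go from 4 to 8 samples per register, but you are left with roughly 96 dB between full scale and one LSB, and you have to manage scaling and overflow in the accumulator yourself.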
I’d be interested to hear what solution you come up with.
Me too.
Actually, something I’d thought of would be to treat the “edges” as multiple
contiguous notches, and run that in the FFT filter only, and eliminate the
FIR bandpass filter. But I’m not sure I’ll get really good stop-band
attenuation that way.
I don’t know your bandpass FIR filter, but guessing it’s equiripple: you
could complex-mix your center frequency down to baseband and apply a
low-pass filter with real taps, which should be of lower order than an
equivalent equiripple bandpass. But I am making a lot of assumptions about
your filters, your data/dynamic range requirements, and all sorts of other
crucial bits.
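Here is roughly what I mean by mixing down and low-pass filtering, as a dependency-free Python sketch. The 1 MHz sample rate is from your description; the center frequency, cutoff, tap count, and test tones are all invented for the example. I also use a Hamming-windowed sinc design rather than an equiripple one, just to keep the code short, so the tap count says nothing about what an equiripple design would need:

```python
import cmath
import math

F_SAMPLE = 1.0e6    # input rate mentioned in the thread
F_CENTER = 250.0e3  # hypothetical center of the band of interest
CUTOFF = 50.0e3     # hypothetical half-bandwidth
NUM_TAPS = 63       # hypothetical filter length


def tone(freq, count, f_sample=F_SAMPLE):
    """Unit-amplitude complex test tone."""
    return [cmath.exp(2j * math.pi * freq * n / f_sample) for n in range(count)]


def mix_to_baseband(samples, f_center, f_sample):
    """Multiply by a complex exponential so f_center lands at 0 Hz."""
    return [s * cmath.exp(-2j * math.pi * f_center * n / f_sample)
            for n, s in enumerate(samples)]


def lowpass_taps(num_taps, cutoff, f_sample):
    """Hamming-windowed sinc low-pass, real taps, normalised to unity DC gain."""
    fc = cutoff / f_sample
    mid = (num_taps - 1) / 2.0
    taps = []
    for n in range(num_taps):
        x = n - mid
        ideal = 2.0 * fc if x == 0 else math.sin(2.0 * math.pi * fc * x) / (math.pi * x)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]


def fir(taps, samples):
    """Direct-form FIR with real taps on complex samples, zero initial state."""
    out = []
    for n in range(len(samples)):
        acc = 0j
        for k, t in enumerate(taps):
            if n - k >= 0:
                acc += t * samples[n - k]
        out.append(acc)
    return out


taps = lowpass_taps(NUM_TAPS, CUTOFF, F_SAMPLE)

# A tone 10 kHz above center should pass; one 200 kHz above should be rejected.
in_band = fir(taps, mix_to_baseband(tone(F_CENTER + 10e3, 200), F_CENTER, F_SAMPLE))
out_of_band = fir(taps, mix_to_baseband(tone(F_CENTER + 200e3, 200), F_CENTER, F_SAMPLE))
```

Because the taps are real, each complex sample costs two real multiplies per tap rather than four, and once the band sits at baseband you could also decimate after the filter if the downstream blocks allow it.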
–
Marcus L.
Principal Investigator
Shirleys Bay Radio Astronomy Consortium
http://www.sbrac.org
In the end, I’m no microarchitecture expert - especially when it comes
to x86.
Good luck.
Brian