Assertion error beyond 4096 output items

Hi,
we have written a C++ block in an out-of-tree module which uses a sync block that outputs the same number of items as it is given at the input. We have written a Python test file where the inputs and outputs are complex floats. The test code runs fine up to 4096 items, but when the output size is greater than or equal to 8192, ctest shows an assertion error which says

-10+5j != 10+5j beyond 7 places.

We are unsure what the error is. Does it have something to do with a GNU Radio architectural limitation beyond 4096*2 items, or is it something else we are doing wrong? I cannot see an error in the logic of the program. I can attach the program if necessary.

Hi Karan,

What does your test look like?
Usually, to test a block's functionality, you construct a minimal flow graph in the Python test script:

vector_source -> block under test -> vector_sink

fill it with sample data, and let GNU Radio run that flowgraph. You then have no control over how large noutput_items is, because that is up to GNU Radio to decide at runtime.
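The "no control over noutput_items" point can be illustrated with a toy, GNU-Radio-free sketch (the scheduler model below is a deliberate simplification, not GNU Radio internals): a correct sync block produces the same result no matter how the scheduler chops the stream into work() calls.

```python
# Toy illustration (not GNU Radio code): the scheduler is free to split
# a stream into work() calls of arbitrary size, so a correct sync block
# must process exactly the noutput_items it is handed each call.

def work(input_items, output_items):
    # A well-behaved sync block: process every item it was given.
    noutput_items = len(output_items)
    for i in range(noutput_items):
        output_items[i] = input_items[i] * 2  # placeholder DSP
    return noutput_items

def run_scheduler(samples, chunk_sizes):
    # Feed the same sample stream to work() in differently sized calls.
    out = []
    pos = 0
    for n in chunk_sizes:
        chunk_in = samples[pos:pos + n]
        chunk_out = [0] * len(chunk_in)
        produced = work(chunk_in, chunk_out)
        out.extend(chunk_out[:produced])
        pos += produced
    return out

samples = list(range(8192))
# Two different chunkings of the same stream:
a = run_scheduler(samples, [4096, 4096])
b = run_scheduler(samples, [1024, 2048, 4096, 1024])
# Both yield identical output, because work() respects noutput_items.
```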
Generally, it'd be nice to know which version of GNU Radio you are running, etc.
If you want to share your code, the best way to do this is usually either pasting just the C++ block and the Python test to pastebin or gist.github.com, or, even better, making a git repo out of your project and pushing it to GitHub or similar.

Greetings,
Marcus

On 28.05.2014 09:45, Karan T. wrote:

Hi,
we have written a C++ block in an out-of-tree module which uses a sync block that outputs the same number of items as it is given at the input. We have written a Python test file where the inputs and outputs are complex floats. The test code runs fine up to 4096 items, but when the output size is greater than or equal to 8192, ctest shows an assertion error which says

-10+5j != 10+5j beyond 7 places.

This looks like floating point quantization errors. Show us your QA, and
make sure you’re using the assertFloatTuplesAlmostEqual (not sure if
that’s the right name) call.

M

On 28.05.2014 11:13, Martin B. wrote:

This looks like floating point quantization errors. Show us your QA, and
make sure you’re using the assertFloatTuplesAlmostEqual (not sure if
that’s the right name) call.

Ah, as Marcus M points out, this is a signature error rather than quantization :-)
Still, this all points to your code being incorrect, or your QA making invalid assumptions.

Maybe you should be tracking state in your block (these numbers indicate that it takes more than one work-function call to unveil your bug).

M

@Marcus: Instead of using a GRC GUI flowgraph to test our block, we have written a QA test in Python which connects a vector source, the block we are testing, and a vector sink, just like the example in the out-of-tree-modules tutorial that squares the input items.

We are writing an Alamouti coding block which takes in an input stream of N complex numbers and gives out 2 output streams of N items each. I will attach the C++ code of the block and the Python QA test code below (pastebin links):
http://pastebin.com/da21Ww4B

Your problem is actually here (C++ work function):

    while ( i < d_fft_length ) {

You never check that your in[] and out[] indices stay within noutput_items! You only get a limited number of items per work call; for sync blocks, this is noutput_items. What you do is simply ignore that number, take d_fft_length items (whether there are more or fewer available) and process them. Then you "return noutput_items;", which is also wrong, because you actually processed d_fft_length items.

What you most probably want to do is either directly process vectors, or set an output multiple.
A vector is actually just an item of size vector_length*single_itemsize, so you need to change your in- and output signatures by multiplying sizeof(whatever) by fft_length; then, in your work, you only process one vector, and thus you should return 1;.
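A quick sanity check of that vector-signature arithmetic (plain Python; the 8-byte figure is the usual size of a complex<float>, i.e. gr_complex): with fft_length = 8192 the per-item size becomes 65536 bytes, which is exactly the number that appears in the itemsize-mismatch error quoted later in this thread.

```python
# Item-size arithmetic for the vector approach (no GNU Radio needed).
# A gr_complex is two 32-bit floats, i.e. 8 bytes.
sizeof_gr_complex = 8
fft_length = 8192

# With a vector signature, one "item" is a whole vector:
vector_itemsize = sizeof_gr_complex * fft_length

# A plain stream item is still just one complex number:
stream_itemsize = sizeof_gr_complex
```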

If you use set_output_multiple, your item size stays the same (sizeof(gr_complex)), and you don't have to change your code, aside from replacing "return noutput_items;" by what you've actually consumed (i.e. d_fft_length).
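As a rough sketch (a toy model, not the real scheduler), set_output_multiple(fft_length) can be pictured like this: the scheduler rounds whatever is available down to a multiple of the requested value and carries the remainder over to a later call.

```python
# Toy model of what set_output_multiple guarantees: work() is only
# handed item counts that are multiples of the requested value.

def schedule_chunks(available_counts, output_multiple):
    # For each scheduler round, round the available items down to a
    # multiple; leftovers carry over to the next round.
    calls = []
    leftover = 0
    for avail in available_counts:
        total = leftover + avail
        n = (total // output_multiple) * output_multiple
        if n > 0:
            calls.append(n)
        leftover = total - n
    return calls

fft_length = 8192
# Even if data arrives in oddly sized bursts, every work() call sees
# a multiple of fft_length:
calls = schedule_chunks([5000, 5000, 10000, 4000], fft_length)
```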

Greetings,
Marcus

Hi Marcus,
Thank you for evaluating our code and helping to debug it.
What we understand from your reply is that the work function processes noutput_items at a time, and this can be less than or more than fft_length. As an example, in our code when we give the FFT length as 8192 but noutput_items is still 4096, does that mean it has to execute the work function twice to process 4096*2 = 8192 items?

Regarding the first approach you suggested, we changed the input and output signatures to (sizeof(gr_complex)*fft_length) so that a single vector is being processed, and then return 1 as suggested. But it is throwing an itemsize mismatch error. I have attached the C++ file here (pastebin link). The error says

ERROR: test_001_t (main.qa_al_enc)
2: ------------------------------
2: Traceback (most recent call last):
2:   File "/home/sagar/gr-alamouti/python/qa_al_enc.py", line 66, in test_001_t
2:     self.tb.connect(src,opr)
2:   File "/usr/local/lib/python2.7/dist-packages/gnuradio/gr/top_block.py", line 130, in connect
2:     self._connect(points[i-1], points[i])
2:   File "/usr/local/lib/python2.7/dist-packages/gnuradio/gr/top_block.py", line 142, in _connect
2:     dst_block.to_basic_block(), dst_port)
2:   File "/usr/local/lib/python2.7/dist-packages/gnuradio/gr/runtime_swig.py", line 4350, in primitive_connect
2:     return _runtime_swig.top_block_sptr_primitive_connect(self, *args)
2: ValueError: itemsize mismatch: vector_source_c1:0 using 8, al_enc1:0 using 65536

The Python QA file link is: http://pastebin.com/da21Ww4B

For the second method you suggested, should we write a general_work function and a forecast function, which would mean doing away with the sync block and work function that we are using right now?

Hi!

On 28.05.2014 12:00, Karan T. wrote:

@Marcus: Instead of using a GRC GUI flowgraph to test our block, we have written a QA test in Python which connects a vector source, the block we are testing, and a vector sink, just like the example in the out-of-tree-modules tutorial that squares the input items.

Excellent, that's what I meant to suggest :-)

We are writing an Alamouti coding block which takes in an input stream of N complex numbers and gives out 2 output streams of N items each. I will attach the C++ code of the block and the Python QA test code below (pastebin links):
http://pastebin.com/da21Ww4B

Thanks, will have a look at it.

Greetings,
Marcus

Hi All,
@Marcus @activecat @Martin: Thank you all for your help. We were able to solve the problem we had for item counts greater than or equal to 8192. We used set_output_multiple(fft_length) in the constructor and it worked. It now works for any input FFT length.

So at the input of the C++ block we are giving vector_source_c(), and the input and output signatures are sizeof(gr_complex). The work function takes noutput_items and returns noutput_items. We understand now that because we used set_output_multiple(fft_length) in the constructor, noutput_items is fixed at fft_length when the work function is called.

However, one small doubt: when we print the value of noutput_items inside the work function, the print statement is executed twice for any fft_length we use. Is this correct? If I am right, it should only print the value of noutput_items once.

Thank you once again.

On Thu, May 29, 2014 at 2:53 AM, Marcus Müller [email protected] wrote:

Hi Karan,
You keep Marcus very busy, let me help to offload him.

On Wed, May 28, 2014 at 8:36 PM, Karan T. [email protected]
wrote:

Hi Marcus,
Thank you for evaluating our code and helping to debug it. What we understand from your reply is that the work function processes noutput_items at a time, and this can be less than or more than fft_length. As an example, in our code when we give the FFT length as 8192 but noutput_items is still 4096, does that mean it has to execute the work function twice to process 4096*2 = 8192 items?

When the desired fft_length is 8192, would it be acceptable to perform the FFT on only 4096 elements?
If yes, perform the FFT on those 4096 elements, and return 4096.
If not, make sure you use set_output_multiple(fft_length), or at least set_min_noutput_items(fft_length); in that case the scheduler will not call the work() function with an insufficient number of elements.

Regarding the first approach you suggested, we changed the input and output signatures to (sizeof(gr_complex)*fft_length) so that a single vector is being processed, and then return 1 as suggested. But it is throwing an itemsize mismatch error. I have attached the C++ file here (pastebin link). The error says

The correct way is: in your work() function you should have 2 loops, because now in[0] is a vector and no longer a single complex number.
Example:

for (int i = 0; i < noutput_items; i++)   // loop over vectors (items)
{
    for (int j = 0; j < fft_length; j++)  // loop over elements of one vector
    {
        // process element j of vector i, e.g. in[i*fft_length + j]
    }
}
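The two-loop idea can be sketched end-to-end in plain Python (the conjugate operation and the buffer names are placeholders for illustration, not the Alamouti code): with a vector signature the buffer still arrives flat in memory, so element j of vector i lives at index i*fft_length + j.

```python
# Plain-Python sketch of two-loop vector processing: with a vector
# signature, "items" are whole vectors, but the underlying buffer is
# still a flat array, indexed as i*fft_length + j.

def work_vectors(in_buf, out_buf, noutput_items, fft_length):
    for i in range(noutput_items):        # loop over vectors (items)
        for j in range(fft_length):       # loop over elements in one vector
            idx = i * fft_length + j
            out_buf[idx] = in_buf[idx].conjugate()  # placeholder operation
    return noutput_items  # with a vector signature, items == vectors

fft_length = 4
in_buf = [complex(k, -k) for k in range(2 * fft_length)]  # 2 vectors
out_buf = [0j] * len(in_buf)
work_vectors(in_buf, out_buf, 2, fft_length)
```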

For the second method you suggested, should we write a general_work function and a forecast function, which would mean doing away with the sync block and work function that we are using right now?

No need, because set_output_multiple() works with a sync block as well.

Note:
Apologies in advance if my answer is incorrect.

On Thu, May 29, 2014 at 2:41 PM, Karan T. [email protected]
wrote:

So at the input of the C++ we are giving vector_source_c() and then input
output signatures are sizeof(gr_complex). The work function is taking
noutput_items and returning noutput_items. We understand now that because
we have used set_output_multiple(fft_length) in the constructor, the
noutput_items is fixed at the fft_length and then work function is being
called.

noutput_items can be fft_length or a multiple of fft_length.
Say fft_length is 8192; then noutput_items could be 8192, 16384, 24576, etc., as determined by the scheduler.
If you only ever get noutput_items equal to 8192, that could be because fewer than 16384 input elements are available.

However, one small doubt: when we print the value of noutput_items inside the work function, the print statement is executed twice for any fft_length we use. Is this correct? If I am right, it should only print the value of noutput_items once.

If the print statement is executed twice, that means the work() function is invoked twice.
It could be something wrong with your code; please show your code here.

Hi Activecat, hi Karan!

You keep Marcus very busy, let me help to offload him.

Activecat: Thanks :-) I'm not in charge of the mailing list; everyone should feel free to answer if they are able to help! Helping Karan was, by the way, not the biggest of all workloads, because I always enjoy reading a few lines of code. But as you will see, there's not much to add to your explanations, so keep that good work up ;-)

So, back to topic:

Hi Marcus,
Thank you for evaluating our code and helping to debug it. What we understand from your reply is that the work function processes noutput_items at a time

Sorry, you got that wrong:
The work function processes as many items as you define by returning that number. The upper boundary is the number of items available, which is noutput_items. All items that were available but that you haven't consumed (by returning less than noutput_items) will be saved for the next work call.
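A small GNU-Radio-free sketch of this bookkeeping (the work() body below is a placeholder that only consumes whole FFT frames, not real DSP): if a call consumes less than it was offered, the unconsumed items simply reappear in the next call.

```python
# Toy model: work() may consume fewer items than it is offered;
# anything unconsumed is offered again on the next call.

def work(in_items, fft_length):
    # Only consume whole FFT frames; leave the rest for next time.
    usable = (len(in_items) // fft_length) * fft_length
    return usable  # number of items actually consumed

def run(stream, chunk_sizes, fft_length):
    consumed_per_call = []
    buf = []
    pos = 0
    for n in chunk_sizes:
        buf.extend(stream[pos:pos + n])
        pos += n
        used = work(buf, fft_length)
        consumed_per_call.append(used)
        buf = buf[used:]  # unconsumed items carry over to the next call
    return consumed_per_call

# 4096 items arrive twice, but frames are 8192 long: the first call
# consumes nothing, the second sees all 8192 items at once.
calls = run(list(range(8192)), [4096, 4096], 8192)
```

In a real sync block, repeatedly returning 0 can stall the flowgraph, which is exactly why set_output_multiple is the cleaner tool here.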

and this can be less than or more than fft_length. As an example, in our code when we give the FFT length as 8192 but noutput_items is still 4096, does that mean it has to execute the work function twice to process 4096*2 = 8192 items?

Usually, there's no guarantee that your work will always be called with the same number of items. The only guarantee you get is that noutput_items >= 1!
So if you want to make sure you always get a certain number of items, or vectors of the right length, you will have to use set_output_multiple or a vector input signature.

[…]

If not, make sure you use set_output_multiple(fft_length), or at least set_min_noutput_items(fft_length); in that case the scheduler will not call the work() function with an insufficient number of elements.

Exactly!

Regarding the first approach you suggested, we changed the input and output signatures to (sizeof(gr_complex)*fft_length) so that a single vector is being processed, and then return 1 as suggested. But it is throwing an itemsize mismatch error. I have attached the C++ file here (pastebin link). The error says

This is because you put in items of sizeof(gr_complex), but your block expects sizeof(gr_complex)*fft_length. Solve this by setting the vector length of your vector source OR by adding a "stream to vector" block; do the same for your sink OR add a "vector to stream".
The correct way is: in your work() function you should have 2 loops, because now in[0] is a vector and no longer a single complex number.

As GNU Radio is now, this is not really true. A vector is just its elements laid out directly sequentially in memory, and these vectors, like smaller items, are themselves directly sequential in memory, so there is no difference between 16384 items of sizeof(gr_complex) and two items of size 8192*sizeof(gr_complex).
It's always good to remember: as a rule of thumb, the GNU Radio scheduler does not care about your signal at all :-D it only cares about memory sizes. That's why it complains about item size mismatches: it has no way to look inside the data!
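This memory argument can be checked directly with NumPy (assuming NumPy is available; this is an analogy for GNU Radio's buffers, not its actual code): reshaping 16384 scalar items into 2 vectors of 8192 touches no bytes at all.

```python
import numpy as np

# A "vector of 8192 complexes" and "8192 scalar complexes" are
# byte-for-byte the same buffer; only the item-size bookkeeping differs.
fft_length = 8192
stream = np.arange(2 * fft_length).astype(np.complex64)  # 16384 scalar items

vectors = stream.reshape(2, fft_length)  # 2 items of fft_length each

# reshape() returns a view: same memory, no copy made.
```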

For the second method you suggested, should we write a general_work function and a forecast function, which would mean doing away with the sync block and work function that we are using right now?

No need, because set_output_multiple() works with a sync block as well.

Exactly. Whenever you can say that for a certain amount of input there will be a proportional amount of output items, there's no need for general_work, and you can use the sync, interpolator or decimator block types.

Greetings,
Marcus

Note:
Apologies in advance if my answer is incorrect.

Activecat, you're already a rising star ;-) so as a GNU Radio enthusiast I really like it when you bring yourself in! I've made dozens of mistakes, and I never felt bad when someone corrected me afterwards, and you shouldn't either.
If GNU Radio were for people who are perfect, then this project would be as dead as a doornail, and as beginner-friendly as parachute-free skydiving...
Software frameworks (heck, we call ourselves an "ecosystem") like GNU Radio only blossom when people discuss concepts, and that won't happen when everyone has the same understanding and the same problems. So I encourage everyone to experiment a lot, and to ask the questions that they can't solve themselves! Answering questions definitely belongs to the community part of experimenting, and should be fruitful to both the one asking and the one answering.

@Activecat Here is the c++ block code

On Thu, May 29, 2014 at 3:12 PM, Karan T. [email protected]
wrote:

@Activecat Here is the c++ block code

Let's change the print statement to the following, recompile, run, and paste the console output here.

    std::cout << "al_enc::work: noutput_items=" << noutput_items << std::endl;

Also, what did you configure on the vector_source_c?

On Thu, May 29, 2014 at 4:54 PM, Karan T. [email protected]
wrote:

Done constructing a list of tests

This is the unittest, not the application flowgraph.

The unittest is used to verify your code. The application flowgraph is used to run RF simulations.

In the application flowgraph, if you only have 8192 inputs while fft_length is set to 8192, then work() should be called only once.

We changed the print statement to what you said, and the console output is:

sagar@Horus:~/gr-alamouti/build$ ctest -V -R al_enc
UpdateCTestConfiguration from :/home/sagar/gr-alamouti/build/DartConfiguration.tcl
UpdateCTestConfiguration from :/home/sagar/gr-alamouti/build/DartConfiguration.tcl
Test project /home/sagar/gr-alamouti/build
Constructing a list of tests
Done constructing a list of tests
Checking test dependency graph...
Checking test dependency graph end
test 2
    Start 2: qa_al_enc

2: Test command: /bin/sh "/home/sagar/gr-alamouti/build/python/qa_al_enc_test.sh"
2: Test timeout computed to be: 9.99988e+06
2: al_enc::work:noutput_items1024
2: al_enc::work:noutput_items1024
2: .
2:
2: Ran 1 test in 0.014s
2:
2: OK
1/1 Test #2: qa_al_enc ... Passed 0.21 sec

The following tests passed:
    qa_al_enc

100% tests passed, 0 tests failed out of 1

Total Test time (real) = 0.21 sec

vector_source_c takes an argument numb, which is a list that contains fft_size values. I will paste the QA code here:
http://pastebin.com/da21Ww4B

On Thu, May 29, 2014 at 4:54 PM, Karan T. [email protected]
wrote:

Done constructing a list of tests

Are you sure this is from the unmodified code at the pastebin link above?
You configured n=2048, but work() is called with noutput_items=1024.
Earlier you said it worked well.

On Thu, May 29, 2014 at 4:54 PM, Karan T. [email protected]
wrote:

vector_source_c takes an argument numb, which is a list that contains fft_size values. I will paste the QA code here:
http://pastebin.com/da21Ww4B

Something a little off-topic:
Part of your code at the pastebin link above:

    numb=list()
    for i in range (n):
            numb.append(complex(lst1[i],lst2[i]))

The above code is not "vectorized". The vectorized version may look like:

    numb = numpy.complex64(lst1) + 1j * numpy.complex64(lst2)

Vectorization probably matters less in a unittest, but it should not be overlooked for blocks coded in Python. For more information, see the NumPy documentation on vectorization.
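For comparison, here are both constructions side by side (assuming NumPy is installed; lst1/lst2 are small stand-ins for the QA script's lists), producing the same complex data:

```python
import numpy as np

# Loop version (as in the QA script): build complex samples one by one.
lst1 = [1.0, 2.0, 3.0]
lst2 = [4.0, 5.0, 6.0]
numb = []
for i in range(len(lst1)):
    numb.append(complex(lst1[i], lst2[i]))

# Vectorized version: one array expression, explicitly complex64 as
# expected by vector_source_c.
numb_vec = np.array(lst1, dtype=np.float32) + 1j * np.array(lst2, dtype=np.float32)
numb_vec = numb_vec.astype(np.complex64)
```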

Activecat,
I am sorry, I pasted the wrong QA code. I changed the value of n to n=1024, and when I printed it, I got the output pasted above.