Hello everyone,
I’ve been having the same issue with the scheduler for the past 2 months
or
so and I’m starting to unravel what I believe is occurring. Let me
describe
my flowgraph first, then describe what’s happening when I run the flow
graph, and then describe some REALLY interesting things I’ve noticed and
my
thoughts. I apologize for the length of this in advance! . This isn’t
as
much of a “help me!” post as it is “what is happening here and is there
a
problem with the scheduler or my thinking?”
My flowgraph consists of:
File source → Custom Encode → Custom Server Query → Custom Server
Query
→ Custom Decode → Bit Error Rate Calculator
The encode scheme I am using is rather simple: it splits a byte into 4
“flags”, each representing the four possible values of two bits
(00,01,10,11). The first “Custom Server Query” sends these flags to our
server, which stores them and returns a reference number, which is the
output of the block, with the second query block doing this process in
reverse. Overall, the actual implementations of these blocks are all
extremely simple, but unimportant, except that they all have if
statement
chains to perform different processes based on the incoming flag. (This
is
important later)
Now I expect the entire file to transfer over with a 0 BER. What I find
is
that only a relatively short length transfers successfully before the
data
on the other end becomes completely jumbled. I cannot ascertain what
sort
of jumbling occurs, ie dropping, shifting, delays, because it appears to
change each run and happen at different file locations.
I don’t have a throttle after the file source because it does not work.
Extreme example: I can set the throttle speed to 1 and it will still
process at the fastest rate possible. If I put any other block in
between
the file source and the throttle, then the throttle works as it should.
It
doesn’t fix the jumble issue, but the throttle acts as it was designed
to
by bursting with an average of the set value. That’s weird thing #1: Why
does throttle work if there is a block in between it and the file source
block, but not if directly connected?
Next, I determined that the jumbling doesn’t happen for small file sizes
(21kB) for random content files, but does for larger files. Why is
random
import? Because if I put sequential data through (0x0 0x1 0x2 0x3 0x0
0x1
0x2 0x3 0x4…), it works perfectly with any file length at any flow
graph speed. This is a little unexpected, but not unheard of, as it
appears
to be an artifact of branch prediction failure (see the following for a
good discussion if you are unfamiliar:
java - Why is processing a sorted array faster than processing an unsorted array? - Stack Overflow).
This would be a factor because my blocks have if statements for what
type
of input flag is being processed.
After that, I put a sleep() in my encode block to more uniformly control
execution speed for testing. Even then, the lowest sleep value I can use
for random file data before jumbling is around 300microseconds. For
sequential data I can omit the sleep altogether, meaning it can process
at
the fastest possible rate.
This is the opposite of what I would think would occur. If this was a
buffer overrun issue, then why does the jumbling only happen when the
flowgraph runs slower?
After some thinking, I’m starting to become suspicious of the threading
model. I’ll admit that I haven’t fully reviewed the threading code
(thus,
am I making any obvious blunders in thinking below?) My idea of the
issue
boils down to threads becoming unsynchronized. Executing with sequential
data would perform all encoding and decoding tasks with optimal speed
and
ideally no branch prediction failures, such that the time slice of each
thread for general work would ideally take the same time to execute. If
running random data, branch prediction would surely fail almost every
time.
But the time it didn’t fail would speed up that particular iteration
greatly. Is is possible these threads are getting jumbled when this
occurs
a few times in a row, making a “bubble” of fast execution speed and the
data NOT being guaranteed FIFO?
Unless there is another feature of the scheduler I am unaware of and I
am
over thinking the importance of branch prediction here, I cannot reason
why
sequential data produces no problems. Even if I am completely wrong with
my
reasoning here, it doesn’t explain away the fact that my jumbling still
occurs. I’ve tried this on multiple machines with different versions of
linux, but with GNR version 3.6.4.1. I’ve used built from source
installs
and the Ettus binary.
Does anyone have any thoughts on this? Where are the guts of the
scheduler
located in the repo? I can find files for the thread for a block, but
not
the file that creates the threads and controls them. I’m just using
sleep()
in my encoding block as a temporary fix.
Best,
Ron S.