More on latency

okkezSS · October 21, 2010, 6:42am

I had a flow-graph that earlier today had a latency of roughly 1 second
or so.

When I tested it this evening, after it had been running for several
hours, the latency was back up to several tens of seconds! That means
external events at the source take several tens of seconds to show up
at the sinks – two graphical, and one filesink. WTF?!

The CPU load at the time was modest – about 38%

–
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · October 21, 2010, 8:11am

On Thu, Oct 21, 2010 at 12:41:16AM -0400, Marcus D. Leech wrote:

The CPU load at the time was modest – about 38%

38% of what? How many cores? What kind of machine?

It’s possible that there’s a computation in a single block that
requires > 1 core to compute in realtime.

Have you tried oprofile to see where the graph is spending its time?
Are you i/o bound? What’s the rate that you’re writing to the file
sink?

I believe htop will show you all the threads of the process. Are any
of them consuming on the order of 100% of a single core?

Eric

Marcus_DSLeech · October 21, 2010, 8:24am

On 10/21/2010 02:10 AM, Eric B. wrote:

The CPU load at the time was modest – about 38%

38% of what? How many cores? What kind of machine?

A dual-core machine, an Atom D-510

It’s possible that there’s a computation in a single block that
requires > 1 core to compute in realtime.

Unlikely. The most “computey” block is a 1024-bin FFT, and my sample
rate is only 400Ksps.
There’s also an FFT filter, but it typically has only about 40-45
taps.
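
A back-of-envelope check of that FFT load, as a rough sketch only. It assumes a radix-2 FFT and non-overlapping 1024-sample frames, neither of which is stated in the thread, and the function name is hypothetical:

```python
import math

def fft_complex_mults_per_second(sample_rate_hz, fft_size):
    # Assumption: frames do not overlap, so one FFT per fft_size samples.
    ffts_per_second = sample_rate_hz / fft_size
    # Assumption: radix-2 FFT, roughly (N/2) * log2(N) complex multiplies.
    mults_per_fft = (fft_size / 2) * math.log2(fft_size)
    return ffts_per_second * mults_per_fft

print(fft_complex_mults_per_second(400_000, 1024))
```

Under those assumptions this comes to about 2 million complex multiplies per second, which is a small load for either core.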

Have you tried oprofile to see where the graph is spending its time?
Are you i/o bound? What’s the rate that you’re writing to the file sink?

I’m writing to the file sink at 1Ksps.

There’s also an audio sink. I’m using the “plughw:0,0” device, and it’s
being pumped at 20Ksps, which generally divides my source rate exactly.
I tried turning off that sub-tree the other night, but I didn’t let it
run very long. Perhaps a residual clock-rate mismatch is causing
‘buffer creep’, and after a few hours, that ‘buffer creep’ has grown to
several tens of seconds?
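
To get a feel for how fast such a mismatch could add up, here is a minimal sketch. It assumes the sound card clock runs slow by a fixed number of parts per million and that nothing in between drops samples. The ppm values are illustrative, not measurements from this machine, and the function name is hypothetical:

```python
def buffer_creep_seconds(nominal_rate_hz, clock_error_ppm, elapsed_seconds):
    """Backlog, in seconds of audio, queued ahead of a sink that runs slow."""
    # The flow-graph produces samples at the nominal rate.
    produced = nominal_rate_hz * elapsed_seconds
    # Assumption: the card really consumes at a slightly lower rate.
    actual_rate_hz = nominal_rate_hz * (1.0 - clock_error_ppm * 1e-6)
    consumed = actual_rate_hz * elapsed_seconds
    # Whatever was produced but not consumed sits in buffers as latency.
    backlog_samples = produced - consumed
    return backlog_samples / actual_rate_hz

four_hours = 4 * 3600
for ppm in (100, 1000):
    print(ppm, buffer_creep_seconds(20_000, ppm, four_hours))
```

With these made-up numbers, a 100 ppm mismatch gives about 1.4 seconds of backlog after four hours, and 1000 ppm gives about 14 seconds.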

I believe htop will show you all the threads of the process. Are any
of them consuming on the order of 100% of a single core?

Eric

Hmm, have to check that. OK, just installed ‘htop’ and there’s no
single thread that’s chewing up anywhere near 100% of a CPU. The top
two threads peak at around 70% and 30%, but average somewhat less than
that.

–
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · October 21, 2010, 5:41pm

On Thu, Oct 21, 2010 at 02:23:20AM -0400, Marcus D. Leech wrote:

There’s also an audio sink, I’m using the “plughw:0,0” device, and it’s
being pumped at 20Ksps, which generally divides my source rate exactly.
I tried turning off that sub-tree the other night, but I didn’t let it
run very long. Perhaps a residual clock-rate mis-match is causing
‘buffer creep’, and after a few hours, that ‘buffer creep’ has grown to
several-10s of seconds?

Yes, that would cause it. I’ve seen it with the FM receiver apps.

BTW, it would have been useful to tell us that there was an audio sink
in the graph when you first posted the observation.

Thanks,
Eric

Marcus_DSLeech · October 21, 2010, 7:31pm

On 10/21/2010 11:41 AM, Eric B. wrote:

Yes, that would cause it. I’ve seen it with the FM receiver apps.

Any hint about how to “cure” this problem? I’m perfectly willing to
have the audio sink drop samples from time to time in order to prevent
or dramatically reduce buffer creep.
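
The drop-samples idea could be sketched as a bounded buffer that discards the oldest samples once it is full, so latency can never exceed the buffer length. This is only an illustration in plain Python. `DroppingBuffer` is a hypothetical name, not a GNU Radio or ALSA API:

```python
from collections import deque

class DroppingBuffer:
    """Bounded FIFO that drops the oldest samples instead of growing."""

    def __init__(self, max_samples):
        # deque with maxlen silently discards from the left when full.
        self._queue = deque(maxlen=max_samples)
        self.dropped = 0

    def write(self, samples):
        for sample in samples:
            if len(self._queue) == self._queue.maxlen:
                self.dropped += 1  # count what the append is about to evict
            self._queue.append(sample)

    def read(self, count):
        # Hand the sink up to `count` samples, oldest first.
        n = min(count, len(self._queue))
        return [self._queue.popleft() for _ in range(n)]

buf = DroppingBuffer(max_samples=4)
buf.write(range(6))
print(buf.read(10), buf.dropped)
```

The cost is an audible glitch on each drop, which is the trade-off being offered here.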

How do Linux audio apps deal with this in “digital recording studio”
cases, where they may have audio inputs/outputs from/to different
cards, with unsynchronized clocks, etc.?

I have another GNURadio app, which uses an audio input and an audio
output, on different cards. It has been running for several days, and
the latency is roughly 1 second. The machine it is running on is a
Pentium D dual-core, at 2.4/3.2GHz. Probably 30% more “ooomph” than the
D-510 that is running the other app.

Btw, I started the app on the D-510 and let it run overnight. The
latency this morning is roughly the same as it was last night when I
started it: about 1 to 1.5 seconds. So, I wonder what the condition is
that causes buffer creep to become really large?

BTW, it would have been useful to tell us that there was an audio sink
in the graph when you first posted the observation.

Actually, in the first instance, a few days ago, I did. It was an
oversight in this most recent post series. Sorry.

Thanks,
Eric

–
Marcus L.
Principal Investigator
Shirleys Bay Radio Astronomy Consortium

Marcus_DSLeech · October 21, 2010, 8:00pm

On Thu, 2010-10-21 at 13:11 -0400, Marcus D. Leech wrote:

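How do Linux audio apps deal with this in “digital recording studio”
cases, where they may have audio inputs/outputs from/to different
cards, with unsynchronized clocks, etc.?

The cure is to provide a resampling block inside the sink, and to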
dynamically set its fractional rate based on the buffer consumption of
the sink. This is on my todo list for the JACK sink. I’m starting with
the JACK sink because the JACK API is the only one that provides
detailed info on buffer state. It’s possible that ALSA also provides
useful information; it’s hard to say because of the incomprehensibility
of the ALSA API. I haven’t looked that hard.
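
As a rough illustration of that idea, here is a minimal sketch of a proportional controller that nudges a resampling ratio according to how full the sink's buffer is. All names, the gain, and the clamp are assumptions for illustration. This is not the JACK sink, not alsa_out, and not any existing GNU Radio block:

```python
class RateController:
    """Maps buffer fill level to a resampling ratio (output per input sample)."""

    def __init__(self, target_fill, gain=1e-6, max_deviation=0.01):
        self.target_fill = target_fill      # desired backlog, in samples
        self.gain = gain                    # assumed loop gain, needs tuning
        self.max_deviation = max_deviation  # clamp to limit audible pitch shift

    def update(self, fill):
        # Positive error: the card is consuming too slowly, backlog is growing.
        error = fill - self.target_fill
        # Emit slightly fewer output samples per input sample when too full,
        # slightly more when the buffer is running dry.
        ratio = 1.0 - self.gain * error
        low = 1.0 - self.max_deviation
        high = 1.0 + self.max_deviation
        return max(low, min(high, ratio))

ctrl = RateController(target_fill=4096)
for fill in (4096, 8192, 0):
    print(fill, ctrl.update(fill))
```

A real sink would likely smooth the fill measurement and add an integral term so the ratio settles on the actual clock offset. The point here is only the direction of the correction.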

I’m not sure that most digital recording studio applications have the
capability of reading from one audio card and writing directly to
another. If they do, they must have some way of resampling data to match
sample rates.

I have another GNURadio app, which uses an audio input and an audio
output, on different cards. It has been running for several days, and
the latency is roughly 1 second. The machine it is running on is a
Pentium D dual-core, at 2.4/3.2GHz. Probably 30% more “ooomph” than the
D-510 that is running the other app.

Btw, I started the app on the D-510 and let it run overnight. The
latency this morning is roughly the same as it was last night when I
started it: about 1 to 1.5 seconds. So, I wonder what the condition is
that causes buffer creep to become really large?

Hard to say. It could be that on that machine the audio card’s clock is
very close to what it’s “supposed” to be; e.g., its 44100Hz is really
44100Hz.

Marcus_DSLeech · October 21, 2010, 10:14pm

On Thu, 2010-10-21 at 15:48 -0400, Davek wrote:

Jack, what a nightmare. I use it to send audio from my ubuntu server
to macbook. Used ssh to access gui. According to the docs this is how
they deal with sync. 2 soundcards…

http://trac.jackaudio.org/wiki/WalkThrough/User/NetJack2

"Netjack2 includes a system allowing to resample the network stream to
send it on a piece of audio hardware. This is done using an in server
client called audioadapter. After the slave has been started using the
net backend, load the audioadapter client using jack_load:"

Yes, it may be possible to use the alsa_out plugin for JACK as a quick
fix to handle dynamic resampling. This plugin is now built into JACK by
default.

It would be most useful to replicate this functionality in our own JACK
sink and do the resampling using GR’s own resampling blocks. We could
potentially replicate that solution for the native ALSA driver, which
most users seem to use. Since alsa_out uses native ALSA calls to
determine buffer fill state, this should be possible, although far from
straightforward.

–n

Marcus_DSLeech · October 21, 2010, 9:49pm

Jack, what a nightmare. I use it to send audio from my ubuntu server to
macbook. Used ssh to access gui. According to the docs this is how they
deal with sync. 2 soundcards…

http://trac.jackaudio.org/wiki/WalkThrough/User/NetJack2

"You probably know that if you take two computers, and make them run an
audio software, or a Jack server at a given sample rate, they are
obviously not running exactly at the same sample rate. That is because
they don’t have exactly the same master clock. This is the greatest
inconvenience that all the digital audio world has ever fought.
In the Netjack system, no problem of synchronization because the slave
isn’t running an audio hardware. It’s simply led by a network stream.
the incoming stream delivers 64 frames, for example, so the slave have
to deal with those 64 frames, then send back 64 frames. There is no
other time consideration than the master’s cycle last.

But if you want to make these 64 frames goes to an audio hardware, you
will have to resample because the master’s cycle duration will not fit
to the new arbitrary slave’s. Master and slave have no other way of
syncing each other (except for hardware which include wordclock or some
other kind of wired sync).

Netjack2 includes a system allowing to resample the network stream to
send it on a piece of audio hardware. This is done using an in server
client called audioadapter. After the slave has been started using the
net backend, load the audioadapter client using jack_load:"
