Forum: GNU Radio Packet Radio

dlapsley (Guest)
on 2006-04-01 06:43
(Received via mailing list)
BBN is working on a project, funded by the US Government, to build
teams of cognitively-controlled software-defined radios.  As part of
this project we will be building a subnetwork-layer routing protocol
and a MAC that is designed for the software radio environment.  We
will be making all of our code available under Free licenses (GPL for
GNU Radio changes, and 3-clause BSD or equivalent for many other
things).  Our system will run on GNU/Linux and NetBSD.

We will be using GNU Radio as the software radio base in our system.
Recently there have been great strides in sending packetized data with
GNU Radio.  Several BBNers, with guidance from Eric, have thought
about a number of changes to GNU Radio which would make data radio
usage more flexible, and have written up these changes.

We would very much appreciate peer review of these proposals.  We are
interested in finding and fixing problems anyone can see in the
approach, or ways in which the changes can be more broadly useful if
done differently.

Our project will be working to improve GNU Radio, and we plan to
follow our proposed roadmap, after revising it based on feedback.  We
would like to work closely with others who would like to join us, and
will strive to make sure anything we do is useful to a broad set of
people and does not cause harm.  (Of course, Eric and the consensus of
gnuradio-discuss will, as always, determine what's in the official
tree.)

We will also be making the USRP work well on NetBSD, fixing the
current USB speed issue.

We will be making our work available as we do it, and plan to interact
as several individuals working on GNU Radio (with a common purpose)
rather than as an isolated project.

The document is available at

     http://acert.ir.bbn.com/downloads/adroit/gr-arch-c...

We would appreciate feedback, sent to gnuradio-discuss, or feel free
to email us privately if there's some reason gnuradio-discuss isn't
appropriate.


Greg Troxel <gdt@ir.bbn.com>     (Principal Investigator)
David Lapsley <dlapsley@bbn.com>    (gnuradio-discuss Liaison)
Eric Blossom (Guest)
on 2006-04-01 09:45
(Received via mailing list)
On Fri, Mar 31, 2006 at 11:42:37PM -0500, dlapsley wrote:
>
> The document is available at
>
>     http://acert.ir.bbn.com/downloads/adroit/gr-arch-c...
>
> We would appreciate feedback, sent to gnuradio-discuss, or feel free
> to email us privately if there's some reason gnuradio-discuss isn't
> appropriate.


I think the basic m-block idea looks reasonable, and achieves the goal
of extending GNU Radio without disturbing the existing framework.


In section 4.5, "Two stage, quasi-real time, hybrid scheduler":

FYI, a given flow graph currently may be evaluated with more than one
thread if it can be partitioned into disjoint subgraphs.  I don't
think that fundamentally changes anything with regard to embedding a
flow graph in an m-block.


Section 4.5.4, second bullet: "profile header portion".  Limiting the
kind and profile lengths to 8-bits each seems like you're asking for
trouble.   For example, when combining many m-blocks from many
different sub-projects, the universe of kinds could easily exceed 256.

Are you assuming that this gets passed across the air, or just within
a given node?  If within a node, for the kind I'd suggest something
reminiscent of interned symbols.  16-bits would probably be big
enough, if each block mapped their arbitrary kind name (string) into
an interned 16-bit value at block init time.
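
A minimal sketch of the interning idea (the function and type names
here are invented for illustration; a real version would also need a
lock around the table):

    // Map an arbitrary kind name to a process-wide 16-bit ID at block
    // init time.  Not thread safe as written.
    #include <stdint.h>
    #include <map>
    #include <stdexcept>
    #include <string>

    typedef uint16_t kind_id_t;

    kind_id_t intern_kind(const std::string &name)
    {
        static std::map<std::string, kind_id_t> s_table;
        std::map<std::string, kind_id_t>::iterator it = s_table.find(name);
        if (it != s_table.end())
            return it->second;                  // already interned
        if (s_table.size() >= 0xffff)
            throw std::runtime_error("kind space exhausted");
        kind_id_t id = (kind_id_t) s_table.size();
        s_table[name] = id;                     // first use assigns the next free ID
        return id;
    }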

I'd also make sure you've got some way to ensure that the data portion
is aligned on the most restrictive architectural boundary (16-bytes on
x86 / x86-64).
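
One way to get that guarantee, sketched under the assumption that
posix_memalign is available (it is on GNU/Linux; older NetBSD may need
a substitute such as valloc):

    #include <stdlib.h>

    static const size_t MB_DATA_ALIGNMENT = 16;  // most restrictive boundary on x86/x86-64

    // Allocate the data portion on a 16-byte boundary; caller frees with free().
    void *mb_alloc_data(size_t nbytes)
    {
        void *p = 0;
        if (posix_memalign(&p, MB_DATA_ALIGNMENT, nbytes) != 0)
            return 0;   // allocation failed
        return p;
    }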


Section 4.5.5 Standardized Time:

In reading what's there, I don't see how you're going to solve the
problems that I think we've got.  Perhaps an end-to-end example would
help illustrate your proposal?

For example, Table 4.2 says that "Timestamp" carries the value of the
transmit-side "sampling-clock" at the time this message was
transmitted.  If I'm a "source m-block" generating, say a test
pattern, what do I put in the Timestamp field?  Where do I get the
value?  Consider the case where the "real" sampling-clock is across
USB or ethernet.

If I want to tell the ultimate downstream end of the pipeline not to
transmit the first sample of the modulated packet until time t, how do
I do that?  That's essential for any kind of TDMA mechanism.

In general, I'm not following this section.  I'm not sure if you're
trying to figure out the real time required through each m-block
and/or if you're trying to figure out the algorithmic delay through
each block, and/or if you're trying to figure out the NET to NET
delay between multiple nodes, ...

Also, an example of how we'd map whatever you're thinking about on to
something that looked like a USRP or other h/w would be useful.

I guess I'm missing the overall statement of intention.  I.e., what do
the higher layers care about, and how does your proposal help them
realize their goals?


Meta data:

General questions about meta-data: Does an m-block just "copy-through"
meta-data that it doesn't understand?

Or in the general case, why not just make it *all* key/value pairs?
Why restrict yourself to a single distinguished "data portion"?


Section 4.5.8: Scheduler.

I'm not sure I follow Figure 4.8.  Perhaps once I understand the
timing stuff it'll make more sense.


Section 4.5.9: Memory Mgmt

With regard to reference counting, we've had good luck with the
boost::shared_ptr stuff.  It's transparent, interacts well with
Python, and just works.
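
For example, in the _sptr style GNU Radio already uses for blocks
(mb_message here is a hypothetical message class, not an existing
type):

    #include <boost/shared_ptr.hpp>

    // Hypothetical message class: payload, metadata, timestamps, ...
    class mb_message { /* ... */ };

    typedef boost::shared_ptr<mb_message> mb_message_sptr;

    void example()
    {
        mb_message_sptr msg(new mb_message());
        mb_message_sptr alias = msg;  // refcount becomes 2; payload is not copied
    }   // both sptrs go out of scope; the message is deleted exactly once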


Section 4.5.10: Implementation Considerations

* Reentrancy:  I think we need to distinguish between multiple
instances of a block each running in a separate thread, vs a given
single instance running in multiple threads.  I don't see an
overwhelming need to have a given instance be reentrant, with the
possible exception of communicating commands to it at runtime.  But in
that case, a thread safe queue of commands might suffice.
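
A sketch of such a queue, assuming pthreads (all names are
illustrative):

    #include <pthread.h>
    #include <deque>
    #include <string>

    struct mb_command {
        std::string name;   // e.g. "set-threshold"
        double      arg;
    };

    class mb_cmd_queue {
        pthread_mutex_t         d_mutex;
        std::deque<mb_command>  d_q;
    public:
        mb_cmd_queue()  { pthread_mutex_init(&d_mutex, 0); }
        ~mb_cmd_queue() { pthread_mutex_destroy(&d_mutex); }

        void post(const mb_command &cmd)    // called from any thread
        {
            pthread_mutex_lock(&d_mutex);
            d_q.push_back(cmd);
            pthread_mutex_unlock(&d_mutex);
        }

        bool poll(mb_command &cmd)          // called by the block between work items
        {
            pthread_mutex_lock(&d_mutex);
            bool ok = !d_q.empty();
            if (ok) {
                cmd = d_q.front();
                d_q.pop_front();
            }
            pthread_mutex_unlock(&d_mutex);
            return ok;
        }
    };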


That's it for now!
Eric
dlapsley (Guest)
on 2006-04-01 18:12
(Received via mailing list)
Eric,

Thank you for your comments. Greatly appreciated. See my comments
inline.

Cheers,

David.

On Apr 1, 2006, at 2:42 AM, Eric Blossom wrote:

>
> I think the basic m-block idea looks reasonable, and achieves the goal
> of extending GNU Radio without disturbing the existing framework.

Great!

> In section 4.5, "Two stage, quasi-real time, hybrid scheduler":
>
> FYI, a given flow graph currently may be evaluated with more than one
> thread if it can be partitioned into disjoint subgraphs.  I don't
> think that fundamentally changes anything with regard to embedding a
> flow graph in an m-block.

Thanks for the pointer. We had missed the partition_graph step in the
scheduler class. As you say, it won't affect the flow graph embedding,
apart from the possibility of having more than one thread spawned
from inside an m-block.

> Section 4.5.4, second bullet: "profile header portion".  Limiting the
> kind and profile lengths to 8-bits each seems like you're asking for
> trouble.   For example, when combining many m-blocks from many
> different sub-projects, the universe of kinds could easily exceed 256.

Good point.

> Are you assuming that this gets passed across the air, or just within
> a given node?  If within a node, for the kind I'd suggest something
> reminiscent of interned symbols.  16-bits would probably be big
> enough, if each block mapped their arbitrary kind name (string) into
> an interned 16-bit value at block init time.

We are assuming that this gets passed between elements of the software
radio, so just within a node. 16-bits sounds good for this.

> I'd also make sure you've got some way to ensure that the data portion
> is aligned on the most restrictive architectural boundary (16-bytes on
> x86 / x86-64)

Good idea.

> Section 4.5.5 Standardized Time:
>
> In reading what's there, I don't see how you're going to solve the
> problems that I think we've got.  Perhaps an end-to-end example would
> help illustrate your proposal?

We'll work on getting a good example into the document.

> For example, Table 4.2 says that "Timestamp" carries the value of the
> transmit-side "sampling-clock" at the time this message was
> transmitted.  If I'm a "source m-block" generating, say a test
> pattern, what do I put in the Timestamp field?  Where do I get the
> value?  Consider the case where the "real" sampling-clock is across
> USB or ethernet.

One option would be to have a sample counter in the source m-block
that is incremented for every data sample that is transmitted.
The value of that sample counter when you transmit a message is what
would be written into the timestamp field.

The timing message ties wall clock time to this sample count. Every
block in the flow graph would learn the relationship between wall
clock time and sample count from the periodic timing messages, which
contain an NTP timestamp and the equivalent RTP timestamp (i.e. sample
count). The sampling frequency can also be used to work out the time
corresponding to a given sample, given a single timing/synchronization
message.
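
A worked sketch of that mapping (struct and field names are ours, for
illustration only):

    #include <stdint.h>

    struct timing_msg {
        double   ntp_time;      // wall clock seconds at the reference sample
        uint64_t sample_count;  // RTP-style timestamp: running sample counter
        double   fs;            // sampling frequency in Hz
    };

    // Wall clock time of an arbitrary sample, given the most recent
    // timing/synchronization message.
    double sample_to_wallclock(const timing_msg &t, uint64_t sample)
    {
        return t.ntp_time + (double)(int64_t)(sample - t.sample_count) / t.fs;
    }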

> If I want to tell the ultimate downstream end of the pipeline not to
> transmit the first sample of the modulated packet until time t, how do
> I do that?  That's essential for any kind of TDMA mechanism.

The most direct way is through the signaling interface. The MAC layer
(or other client) can send a signal enabling/disabling transmission at
the end of the pipeline at the appropriate point in time.

Another way to do it would be to have some form of "playout buffer" at
the end of the pipeline that buffers packets until it is time for them
to be sent. In this case the timing transfer mechanism would be used
to enable each block to measure the latency from the time the packet
entered the top of the pipeline until it arrived at (or left) the
block. These latencies would be exposed to the top-level m-block
scheduler, which could then allocate processing time to blocks based
on these latencies in order to ensure that some threshold was not
exceeded. Typically, you could imagine the processor just looking at
the end-to-end delay and scheduling processing to keep that below a
certain threshold. In a sense, the scheduler does coarse-grained
scheduling to ensure the end-to-end delay does not go beyond some
tolerance, while the playout buffer does fine-grained scheduling.
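
A rough sketch of such a playout buffer (types are hypothetical; the
real interface would depend on the hardware at the tail of the
pipeline):

    #include <stdint.h>
    #include <queue>
    #include <vector>

    struct tx_packet {
        uint64_t playout_sample;   // do not transmit before this sample time
        // ... samples, metadata ...
    };

    struct later_first {
        bool operator()(const tx_packet *a, const tx_packet *b) const {
            return a->playout_sample > b->playout_sample;  // earliest deadline on top
        }
    };

    class playout_buffer {
        std::priority_queue<tx_packet*, std::vector<tx_packet*>, later_first> d_q;
    public:
        void insert(tx_packet *p) { d_q.push(p); }

        // Called as the hardware sample clock advances; returns the next
        // packet that is due, or 0 if nothing is ready yet.
        tx_packet *pop_due(uint64_t now_sample)
        {
            if (d_q.empty() || d_q.top()->playout_sample > now_sample)
                return 0;
            tx_packet *p = d_q.top();
            d_q.pop();
            return p;
        }
    };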

> In general, I'm not following this section.  I'm not sure if you're
> trying to figure out the real time required through each m-block
> and/or if you're trying to figure out the algorithmic delay through
> each block, and/or if you're trying to figure out the NET to NET
> delay between multiple nodes, ...

It's the first two. The initial thought is just to re-use the
semantics and format of RTP/RTCP for transferring timing information
between elements of a radio. There are a couple of options. One option
would be to figure out the wall clock delay between two blocks within
a flow graph (you could imagine that typically they would be the
endpoints of a pipeline). This way we can make sure the delay through
a flow graph stays within limits by scheduling blocks appropriately.
Another option would be to measure the end-to-end delay between some
process in the MAC (or other controlling entity) and the bottom of a
pipeline in the PHY. There could also be a control loop here to ensure
that the end-to-end delay requirements are not exceeded.

We'll work on making this section clearer and get a new revision out
next week.

> Also, an example of how we'd map whatever you're thinking about on to
> something that looked like a USRP or other h/w would be useful.

Will do.

> I guess I'm missing the overall statement of intention.  I.e., what do
> the higher layers care about, and how does your proposal help them
> realize their goals?

The main goal of section 4.5.5 is to provide mechanisms that will
bound the time it takes for a request to make it all the way to the
bottom of the PHY, and to enable real-time scheduling/playout of data
at the bottom of the PHY.


> Meta data:
>
> General questions about meta-data: Does an m-block just "copy-through"
> meta-data that it doesn't understand?

Yes. Ideally, we would just be passing references/pointers so there
wouldn't need to be any copying. You could also imagine blocks
"popping" off profiles/sections of metadata specific to them.

> Or in the general case, why not just make it *all* key/value pairs?
> Why restrict yourself to a single distinguished "data portion"?

Sure. That's a nice way to think about it. It would also be nice to
maintain a hierarchy of metadata so that there was some structure
to it (e.g. grouping by profiles or block type).
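
For instance, a two-level key/value structure along these lines
(illustrative only; a real design would want a variant value type
rather than plain strings):

    #include <map>
    #include <string>

    typedef std::map<std::string, std::string> md_profile;  // key -> value
    typedef std::map<std::string, md_profile>  md_tree;     // profile name -> profile

    void example()
    {
        md_tree meta;
        meta["phy"]["modulation"] = "gmsk";
        meta["mac"]["priority"]   = "7";
        // A block that only understands "phy" copies "mac" through untouched.
    }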

>
> Section 4.5.8: Scheduler.
>
> I'm not sure I follow Figure 4.8.  Perhaps once I understand the
> timing stuff it'll make more sense.

We'll work on making this clearer.

> Section 4.5.9: Memory Mgmt
>
> With regard to reference counting, we've had good luck with the
> boost::shared_ptr stuff.  It's transparent, interacts well with
> Python, and just works.

Thanks for the pointer.

> Section 4.5.10: Implementation Considerations
>
> * Reentrancy:  I think we need to distinguish between multiple
> instances of a block each running in a separate thread, vs a given
> single
> instance running in multiple threads.  I don't see an overwhelming
> need to have a given instance be reentrant, with the possible
> exception of communicating commands to it at runtime.  But in that
> case, a thread safe queue of commands might suffice.

Yes. I agree.

> That's it for now!
> Eric

Thanks so much for the feedback. That is great!
Lee Patton (Guest)
on 2006-04-01 18:37
(Received via mailing list)
I would like to control the USRP with a smaller embedded Linux computer
instead of a laptop. I don't have any experience with this sort of
thing, and searching "embedded" in the mailing list archive didn't
return much.  So, I was hoping some of you pros out there might be able
to point me in the direction of some products you like.  I'm not sure
what our requirements are exactly. However, I can say, the smaller the
form factor, and the less power consumed, the better.  I don't think we
need too much horsepower. Our application just doesn't call for it.  Any
suggestions would be greatly appreciated.

Thanks,
 - Lee
Eric Blossom (Guest)
on 2006-04-01 21:38
(Received via mailing list)
On Sat, Apr 01, 2006 at 11:09:18AM -0500, dlapsley wrote:

Timing:

> One option would be to have a sample counter in the source m-block
> that is incremented for every data sample that is transmitted.
> The value of that sample counter when you transmit a message is what
> would be written into the timestamp field.
I believe this assumes a model where m-blocks have a more-or-less
uniform flow through them.  I believe this assumption is invalid.  In
fact, I thought the whole point of m-blocks was to better deal with
discontinuous flow.  E.g., packets arrive at random times.  The m-block
isn't running if there isn't something for it to do.  Yet, the real
world sample clock marches on...

> The timing message ties wall clock time to this sample count. Every
> block in the flow graph would learn the relationship between wall
> clock time and sample count from the periodic timing messages, which
> contain an NTP timestamp and the equivalent RTP timestamp (i.e. sample
> count). The sampling frequency can also be used to work out the time
> corresponding to a given sample, given a single timing/synchronization
> message.

I don't think this works in a world of discontinuous transmission or
reception.  We aren't streaming video ;)

> >If I want to tell the ultimate downstream end of the pipeline not to
> >transmit the first sample of the modulated packet until time t, how do
> >I do that?  That's essential for any kind of TDMA mechanism.
>
> The most direct way is through the signaling interface. The MAC layer
> (or other client) can send a signal enabling/disabling transmission at
> the end of the pipeline at the appropriate point in time.

For this particular example, I think that the time, or a proxy for the
time (e.g., Transmit this packet in TDMA Slot N of M), should be in
the metadata attached to the high-level packet.

Perhaps we're talking past each other here.  Or we're conflating the
playout time with a desire to figure out how to schedule the m-blocks
so that the right thing occurs at the right time.

The important part is getting the timing semantics right at the
MAC/PHY boundary and at the "soft PHY / hard PHY" boundary.
Everything else is an implementation detail.

I'm not sure we've got this pinned down yet.

> Another way to do it would be to have some form of "playout buffer" at
> the end of the pipeline that buffers packets until it is time for
> them to be sent.

Yes, and there's a (perhaps unacknowledged) requirement to minimize
latency.  The latency in this part of the system directly impacts any
MAC control loop.

> In this case the timing transfer mechanism would be used to enable
> each block to measure the latency from the time the packet entered
> the top of the pipeline until it arrived at (or left) the block.

My thought is that the latency through the blocks is going to vary all
over the place, and from packet to packet.  E.g., one packet may come
in with metadata indicating "needs to get there with
high-probability" or at a different level of abstraction "use XYZ FEC
and ABC modulation".  The next packet has different metadata.  The
downstream flow for these two cases may be completely different.

Yes, I understand that you could model this based on worst case and/or
some probability distribution.  Is that the goal?

> These latencies would be exposed to the top-level m-block scheduler,
> which could then allocate processing time to blocks based on these
> latencies in order to ensure that some threshold was not
> exceeded. Typically, you could imagine the processor just looking at
> the end-to-end delay and scheduling processing to keep that below a
> certain threshold. In a sense, the scheduler does coarse-grained
> scheduling to ensure the end-to-end delay does not go beyond some
> tolerance, while the playout buffer does fine-grained scheduling.

[The final playout buffer is in the attached hardware]

OK, I get the basic goal.
Not sure I agree with all the RTP/RTCP/NTP...  Can't the m-scheduler
just *measure* the execution time of each m-block?
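
E.g., something as simple as this around each dispatch (names here are
illustrative, not existing GNU Radio calls):

    #include <sys/time.h>

    static double now_seconds()
    {
        struct timeval tv;
        gettimeofday(&tv, 0);
        return tv.tv_sec + tv.tv_usec * 1e-6;
    }

    // In the scheduler's dispatch loop (handle_message and
    // update_latency_estimate are hypothetical):
    //
    //   double t0 = now_seconds();
    //   block->handle_message(msg);
    //   update_latency_estimate(block, now_seconds() - t0);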

> >In general, I'm not following this section.  I'm not sure if you're
> >trying to figure out the real time required through each m-block
> >and/or if you're trying to figure out the algorithmic delay through
> >each block, and/or if you're trying to figure out the NET to NET
> >delay between multiple nodes, ...

> It's the first two.

Clear.

> The initial thought is just to re-use the semantics and format of
> RTP/RTCP for transferring timing information between elements of a
> radio.

Seems like a solution looking for a problem.

> There are a couple of options. One option would be to figure out
> the wall clock delay between two blocks within a flow graph (could
> imagine that typically they would be endpoints of a pipeline). This
> way we can make sure the delay through a flow graph stays within
> limits by scheduling blocks appropriately.

Seems reasonable.

> Another option would be to measure the end to end delay between some
> process in the MAC (or other controlling entity) and the bottom of a
> pipeline in the PHY.  There could also be a control loop here to
> ensure that the end to end delay requirements are not exceeded.

OK.

> We'll work on making this section clearer and get a new revision out
> next week.

Sounds good.  I suggest starting with some use cases for the MAC/PHY
interface and the soft-PHY/hard-PHY interface.

I'm particularly interested in sorting out the idea of time.

I think that "priority" originates in the MAC.  It may tell us that a
particular packet has priority P.  The MAC is going to be handing us
multiple outstanding packets, right?  We're going to flow control it
somehow, but I'm assuming that it may send us a packet at time T+1
that has higher priority than one sent at time T.  When/where/how do
we handle this?


I think the soft-PHY/hard-PHY interface is pretty straightforward.
You've got to assume that the low level hardware is pretty dumb.  In
the receive direction, assume that the hardware gives you fixed length
packets of N samples with a header containing its sample counter
value corresponding to the first sample of the packet.

In the transmit direction, it might be slightly more complicated.
Assume fixed length packets with a header that has a "do not transmit
before sample counter time T", as well as a few other bits including
things like "this packet is the beginning of a frame", "this packet
is the middle of a frame", "this packet is the end of a frame", "this
packet has N valid samples".  The packet header probably also contains
attributes such as transmit power, but probably only in the packet
that corresponds to the first fragment of a frame.

The soft-PHY/hard-PHY packets also carry some indication of "channel"
which is a label for a particular path in the h/w from the interface
to/from an antenna. [Probably not relevant to the timing problem.]
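
To make the shape of this concrete, one possible packet layout (all
field names and widths are guesses for illustration, not a proposal
from the document):

    #include <stdint.h>

    enum {
        PKT_START_OF_FRAME = 1 << 0,  // this packet begins a frame
        PKT_END_OF_FRAME   = 1 << 1,  // this packet ends a frame
        PKT_HAS_DEADLINE   = 1 << 2,  // honor timestamp as "do not tx before"
    };

    struct phy_packet {
        uint32_t timestamp;    // RX: sample count of the first sample
                               // TX: do-not-transmit-before sample time
        uint16_t flags;        // PKT_* bits above
        uint16_t n_valid;      // number of valid samples in the payload
        uint8_t  channel;      // which h/w path to/from an antenna
        uint8_t  pad[3];       // keep the payload aligned
        uint32_t samples[126]; // fixed-length payload of 16-bit I/Q pairs
    };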


> >Also, an example of how we'd map whatever you're thinking about on to
> >something that looked like a USRP or other h/w would be useful.
>
> Will do.
>
> >I guess I'm missing the overall statement of intention.  I.e., what do
> >the higher layers care about, and how does your proposal help them
> >realize their goals?
>

> The main goal of section 4.5.5 is to provide mechanisms that will
> bound the time it takes for a request to make it all the way to the
> bottom of the PHY and to enable real-time scheduling/playout of data
> at the bottom of the PHY.

OK.

> >Why restrict yourself to a single distinguished "data portion"?
>
> Sure. That's a nice way to think about it. It would also be nice to
> maintain a hierarchy of metadata so that there was some structure
> to it (e.g. grouping by profiles or block type).

Sounds good.

> >
> >Section 4.5.8: Scheduler.
> >
> >I'm not sure I follow Figure 4.8.  Perhaps once I understand the
> >timing stuff it'll make more sense.
>
> We'll work on making this clearer.

Good.


With regard to signals and control processing, it seems like it would
make sense to use the same mechanism to communicate with existing
gr_blocks.  E.g., a carrier sense block implemented in the flow graph
needs to be able to send a transition signal and receive a
threshold setting.  [I'm not attached to the existing msg / msg queue
stuff, it just seemed like the simplest thing that could possibly
work.]

Also, in the definitions in Table 4.1, if a "message" is the smallest
block of information that can be processed by an m-block, what do you
call the data that is associated with a "signal" or "control"?

I'm also not clear about how or when you process the scheduling of
control/signals versus data+metadata.  When does an m-block handle the
signals/control versus the data + metadata?

Do you envision a different mechanism?  If so, why?

(Can we pick different nomenclature for "signals"?  It's seriously
overloaded in the POSIX environment.)

At a higher level of abstraction, why would an m-block have different
classes of ports?  Why don't they *all* accept and/or send
"messages" containing "data + metadata"?  The current design seems
like it's being prematurely narrowed and special cased, when one
simple abstraction seems to cover all the cases.  They could of course
have ports labeled "control" and "signal", but do we really want a
different implementation?

Also, for your use cases and pictures, I'd start with m-blocks
that contain multiple input and output ports for "data + metadata",
and then handle single-in / single-out as a specialization.


It may be that this latency / scheduling problem is blown way out of
proportion.  We've got to assume we've got sufficient cycles to
execute the transmit and receive paths in the worst case, or else
we're hosed.  Buy faster hardware / fix the signal processing until
the problem goes away.

Then, you can get priority scheduling within the m-block universe by
attaching a priority to each message.  I'm assuming the "single model
of the universe where everything is a message containing
data+metadata, including the so-called signal and control ports".
Then just implement a priority queue for each port.  The priority
assigned each m-block at time t is the max(over-all-of-its-port-queues).
Then just pick the highest priority m-block and run it.  No problem.
The latency is going to be what it's going to be.  We're doing what
the MAC asked us to do in the order it asked us to do it.

Then, we're left with the high level problem of hitting the given TDMA
slot, but that gets sorted out by the tail end of the m-block pipeline
that feeds the physical h/w with the appropriate frame + timestamp.

In summary, isn't an m-block just an actor that wakes up when told and
processes events/messages in its queue(s) and generates
events/messages that get sent someplace else?  If so, then this is a
well understood problem, with a well understood solution, and no
reentrancy problems ;)  Composing them falls out nicely too.  We also
get transparent scaling in the SMP/multi-core environment.

Eric
Clark Pope (Guest)
on 2006-04-02 15:36
(Received via mailing list)
Lee,
I am very much interested in the same. Specifically, I'd like to get it
on embedded Linux running off a Virtex-4 FPGA. This module from
hydraxc (http://www.hydraxc.com/) has a USB 2.0 port and is the size of
a stick of gum.

Seems like one should be able to connect this to a USRP with a 100 MHz
PPC running Linux to process at least a couple hundred kilohertz of
bandwidth. The EHCI driver should already exist, so I guess it's just a
matter of getting Python to run on it and setting up a cross-compiler
environment for the C++ files.

I have a USRP and I've ordered a hydraxc. Let me know if you want to
pursue this further. I don't have much time to throw at it, but I can
at least try some things out to gauge how much work is involved.

Thanks,
Clark
Lee Patton (Guest)
on 2006-04-03 19:03
(Received via mailing list)
Thanks for the reply, Clark.  Basically, the project we're working on
involves putting the USRP on a small UAV ("A" as in "aerial" not
"autonomous").  So, weight and power consumption are key.  The hydraxc
looks very cool.  However, I think we can go larger. And, for the first
cut at a solution, we want to go for the simplest thing that works.  It
seems like the hydraxc might not be that. However, for a final design
choice, it might work very nicely.  In fact, it looks like it could be
the solution, but it will take more time to get up and running than we
have.  I hope you will update the list as you make progress with the
hydraxc.

- Lee
Thomas Schmid (Guest)
on 2006-04-03 19:24
(Received via mailing list)
Hi Lee,

We are also looking into similar things, though we don't have the
power constraints which will be difficult to meet. We are currently
looking at a micro-ATX solution with an AMD Geode NX processor, since
this one is compatible with the AMD Athlon Mobile. This should make it
easy to have a running Linux system.

We were first also thinking about platforms which feature an Intel
XScale processor, though we quickly found out that it doesn't support
floating point. GNU Radio makes heavy use of floating point
operations. Thus it is key, especially if the platform is not very
powerful, that it supports hardware floating point.

Also, did you consider the power consumption of the USRP? I don't know
if someone has measured it, but this might also be a problem for you,
i.e., finding a battery which supports the USRP over a longer time.

cheers,

Thomas
Eric Blossom (Guest)
on 2006-04-03 20:23
(Received via mailing list)
On Mon, Apr 03, 2006 at 01:00:07PM -0400, Lee Patton wrote:
> Basically, the project we're working on involves putting the USRP on
> a small UAV ("A" as in "aerial" not "autonomous").  So, weight and
> power consumption are key.  The hydraxc looks very cool.

The hydraxc looks cool, but I think it's going to be a bear to get GNU
Radio running on it.  A couple of observations: no floating point (not
positive, but pretty sure about that), no MMU (otherwise they wouldn't
be running ucLinux).

Though I haven't played with a Virtex 4 with embedded PPC, I have
played with a Virtex II (V2P50) with 2x embedded PPCs and its
performance was underwhelming.  Really small cache didn't help.  After
spending quite a bit of time on it, I was unable to get the LWIP
TCP/IP stack to run at anything like wire speed using a 100Mbit PHY.
Couldn't even compute UDP checksums and keep up.

If you can afford the size/weight, I'd probably try one of the Pentium M
single board computers.  Everything should just work on that platform.
Yes, they're a lot bigger than the hydraxc or gumstix, but they're
very likely to work right out of the box.

Also, unless you're a glutton for punishment, don't get the Celeron
version, spend the extra bucks and get the Pentium M.

Eric
Eric Blossom (Guest)
on 2006-04-03 20:38
(Received via mailing list)
On Mon, Apr 03, 2006 at 10:22:01AM -0700, Thomas Schmid wrote:
> Hi Lee,
>
> We are also looking into similar things, though we don't have the
> power constraints which will be difficult to meet. We are currently
> looking at a micro-ATX solution with an AMD Geode NX processor, since
> this one is compatible with the AMD Athlon Mobile. This should make it
> easy to have a running Linux system.

Have you tried the geode nx?

At least on earlier geodes the performance was pretty dismal.  Perhaps
the "nx" is a different processor than I remember.  Caveat emptor.
Compatible means "has the same instruction set".  A 486 DX66 comes
pretty close to meeting that requirement ;)

> We were first also thinking about platforms which feature an Intel
> XScale processor, though we quickly found out that it doesn't support
> floating point. GNU Radio makes heavy use of floating point
> operations. Thus it is key, especially if the platform is not very
> powerful, that it supports hardware floating point.
>
> Also, did you consider the power consumption of the USRP? I don't know
> if someone has measured it, but this might also be a problem for you,
> i.e., finding a battery which supports the USRP over a longer time.

The USRP on my bench draws about 11W.  This depends of course on the
daughterboards you're using, and whether you've got them all running.

Eric
Lee Patton (Guest)
on 2006-04-03 21:06
(Received via mailing list)
On Mon, 2006-04-03 at 10:22 -0700, Thomas Schmid wrote:
> Hi Lee,
>
> We are also looking into similar things, though we don't have the
> power constraints which will be difficult to meet. We are currently
> looking at a micro-ATX solution with an AMD Geode NX processor, since
> this one is compatible with the AMD Athlon Mobile. This should make it
> easy to have a running Linux system.

Good point. Are you custom building this, or did you find a COTS
solution?  Have you looked at mini-itx (which I now know of thanks to
Pete)? If so, what was your impression?


> We were first also thinking about platforms which feature an Intel
> XScale processor, though we quickly found out that it doesn't support
> floating point. GNU Radio makes heavy use of floating point
> operations. Thus it is key, especially if the platform is not very
> powerful, that it supports hardware floating point.

Wow. Thanks. I didn't even consider that point. I will definitely make
sure I check this.

>
> Also, did you consider the power consumption of the USRP? I don't know
> if someone has measured it, but this might also be a problem for you,
> i.e., finding a battery which supports the USRP over a longer time.

This is a problem for us. However, other members of the group, namely,
the UAV guys, seem to think we're okay on this.

> cheers,
>
> Thomas

Thanks for the reply, Thomas.  I really appreciate the advice.
Clark Pope (Guest)
on 2006-04-03 21:15
(Received via mailing list)
The ppclinux does use the MMU, but a lot of people still run ucLinux,
mostly, it appears, because it is easier to set up and get going. I
think one can add the FPU at the expense of using up the FPGA. But,
yes, such a thing would be an uphill climb.

Then there's the issue that if it's an embedded, headless system and
you run kernel 2.6, why not just write the blocks in C and use the
system FIFOs to handle the data scheduling?

-Clark


> > "autonomous").  So, weight and power consumption are key.
>
>The hydraxc looks cool, but I think it's going to be a bear to get GNU
>Radio running on it.
>
>If you can afford the size/weight, I'd probably try one of the Pentium M
>single board computers.  Everything should just work on that platform.
>Yes, they're a lot bigger than the hydraxc or gumstix, but they're
>very likely to work right out of the box.
>
>Also, unless you're a glutton for punishment, don't get the Celeron
>version, spend the extra bucks and get the Pentium M.
>
>Eric

Thomas Schmid (Guest)
on 2006-04-03 21:36
(Received via mailing list)
Hi Eric

On 4/3/06, Eric Blossom <eb@comsec.com> wrote:
> On Mon, Apr 03, 2006 at 10:22:01AM -0700, Thomas Schmid wrote:
> > Hi Lee,
> >
> > We are also looking into similar things, though we don't have the
> > power constraints which will be difficult to meet. We are currently
> > looking at a micro-ATX solution with an AMD Geode NX processor, since
> > this one is compatible with the AMD Athlon Mobile. This should make it
> > easy to have a running Linux system.
>
> Have you tried the geode nx?

No, we didn't actually try the Geode NX yet, but on paper they look
pretty good. They are completely different from the older Geodes, which
are less powerful. For some more information:

http://www.amd.com/us-en/ConnectivitySolutions/Pro...
Charles Swiger (Guest)
on 2006-04-03 21:36
(Received via mailing list)
On Mon, 2006-04-03 at 10:22 -0700, Thomas Schmid wrote:

> Also, did you consider the power consumption of the USRP? I don't know
> if someone has measured it, but this might also be a problem for you,
> i.e., finding a battery which supports the USRP over a longer time.
>

I use a USRP with a battery a lot. With two Basic RX boards and one
Basic TX board mounted, doing input from just one RX board, a 6V 7Ah
sealed lead-acid battery lasts just about 3 hours.

--Chuck
Eric Blossom (Guest)
on 2006-04-03 21:55
(Received via mailing list)
On Mon, Apr 03, 2006 at 12:33:26PM -0700, Thomas Schmid wrote:
> > > easy to have a running Linux system.
> >
> > Have you tried the geode nx?
>
> No, we didn't actually try the Geode NX yet, but on paper they look
> pretty good. They are completely different from the older Geodes, which
> are less powerful. For some more information:
>
> 
http://www.amd.com/us-en/ConnectivitySolutions/Pro...

Thanks for the link.

Let us know how it turns out.

Eric
Lee Patton (Guest)
on 2006-04-03 22:35
(Received via mailing list)
Thanks, Chuck. Very useful to know.
John Gilmore (Guest)
on 2006-04-04 01:18
(Received via mailing list)
> A couple of observations: no floating point
>> Have you tried the geode nx?

The MIT "$100 Laptop" uses the AMD Geode GX500@1.0W.  See:
  http://wiki.laptop.org/wiki/Hardware_specification

According to Jim Gettys, the reason is that it's the only low-power
processor they could find that has hardware floating point.  OLPC
needs extremely low power (~5W for the whole system including LCD).

The GX runs much slower than the NX, but consumes about 1W rather
than 6-14W.  The GX is the old National Semiconductor Geode line; the
NX is a low-power Athlon core.

The OLPC demonstrator used an AMD "Rumba" development board, which
you might be able to find and try.

The simplest thing that works is almost certainly a modern laptop.
Beware of the Sony small/light ones: they use lots of custom,
undocumented chips, and won't run standard Windows or Linux
distributions (they're only warranted when running their own Windows
distro).

	John
dlapsley (Guest)
on 2006-04-04 16:12
(Received via mailing list)
Hi Eric,

Thank you for the feedback. We will incorporate your comments into
a new revision of the document that we should have out soon.

Timing and scheduling seem to be the hardest issues, so we'd like
to think about them a bit more and then discuss on the list.

Cheers,

David.
Lee Patton (Guest)
on 2006-04-19 00:20
(Received via mailing list)
On Mon, 2006-04-03 at 11:20 -0700, Eric Blossom wrote:
> ... unless you're a glutton for punishment, don't get the Celeron
> version, spend the extra bucks and get the Pentium M.

Besides the dearth of on-board cache, what are the other drawbacks of a
Celeron?

To fit the dimensional requirement I was given (5"x5"x~1"), the SBC must
be passively cooled.  However, I'm not finding a Pentium-M solution that
can be passively cooled and meets our availability requirements.  I have
found a 600 MHz Celeron solution, but it has half the L2 cache.

In our application, we'll be pulling full throttle from the USRP, maybe
FIR filtering, and then pushing back out to the USRP.  Not too heavy on
the
signal processing.

All advice appreciated.

- Lee

P.S.

Some potential solutions:
http://www.gms4sbc.com/P60x_BO.html  (can't meet availability)
http://www.kontron-emea.com/index.php?id=82&cat=58 (JRex-PM, can only
air cool Celeron M 600 MHz)
Eric Blossom (Guest)
on 2006-04-19 04:29
(Received via mailing list)
On Tue, Apr 18, 2006 at 06:18:55PM -0400, Lee Patton wrote:
> found a 600 MHz Celeron solution, but it has half the L2 cache.
>
> Some potential solutions:
> http://www.gms4sbc.com/P60x_BO.html  (can't meet availability)
> http://www.kontron-emea.com/index.php?id=82&cat=58 (JRex-PM, can only
> air cool Celeron M 600 MHz)

You should be able to benchmark this, including cache performance,
using oprofile (http://oprofile.sf.net).  To track cache misses you'll
need to enable a non-default set of counters in oprofile, but it's
possible.  You should be able to determine the cache hit/miss ratio
for your existing configuration that way.

Benchmark the app you want to run on whatever you've currently got.
The closer in architecture/microarchitecture, the better.  Then scale
by CPU freq, and a big wild-ass guess on cache size differences.

Eric
Jim Hanlon (Guest)
on 2006-04-20 20:41
(Received via mailing list)
I tried to get the USRP talking to an 800 MHz VIA fanless industrial
control computer, with an expansion card to get USB 2.0
performance. I found that GNU Radio installed and ran OK, but USRP
performance consistently exhibited USB overruns and underruns, to
the point that our evaluation of USRP capabilities was limited by our
compute power.

I think it's nice to be fanless, if you can. And maybe some clever
engineering of USB interrupt service routines could improve I/O
performance. But it is much simpler these days just to outrun the
problem with a faster CPU clock.

If you are trapped by form factor and "no fan" rules, you may be out of
luck. Does your architecture allow for (say, USB) expansion
cards? If so, you could build one that did block transfers to/from the
USB bus and your slowish CPU.
HTH
Jim Hanlon
Lee Patton (Guest)
on 2006-04-20 21:55
(Received via mailing list)
Thanks for the advice, Jim. I appreciate it.  I think I found a 1.1 GHz
Pentium M (thanks to Matt) that can be cooled fanless.  We're going to
give either that board or a 1 GHz Celeron board from Kontron a try.  I
think "outrunning" is the way to go right now.
Eric Blossom (Guest)
on 2006-04-20 22:08
(Received via mailing list)
On Thu, Apr 20, 2006 at 01:38:52PM -0500, Jim Hanlon wrote:

> I tried to get the USRP talking to an 800 MHz VIA fanless industrial
> control computer, with an expansion card to get USB 2.0 performance. I
> found that GNU Radio installed and ran OK, but USRP performance
> consistently exhibited USB overruns and underruns, to the point that
> our evaluation of USRP capabilities was limited by our compute
> power.

> I think it's nice to be fanless, if you can. And maybe some clever
> engineering of USB interrupt service routines could improve I/O
> performance. But it is much simpler these days just to outrun the
> problem with a faster CPU clock.

> If you are trapped by form factor and "no fan" rules, you may be out
> of luck. Does your architecture allow for (say, USB) expansion cards?
> If so, you could build one that did block transfers to/from the USB
> bus and your slowish CPU.

> HTH
> Jim Hanlon

As always, there are lots of differences between processors.  Some of
the VIAs are dogs, others aren't so bad.  We've run the USRP and
gmsk2 code on one particular compact VIA board, and it wasn't horrible.
We were able to run GMSK at a few hundred kb/sec with it.

The USB expansion card could have been the problem.  How do you know
it was lack of CPU?

Eric
Jim Hanlon (Guest)
on 2006-04-20 23:09
(Received via mailing list)
> How do you know it was lack of CPU?
>
To be honest, we did not try a controlled experiment of varying CPU
speed, nor did we engage in application profiling, or a lot of other
things we might have done. We were in cut-and-try mode. And our trial
of the "fanless industrial control computer" indicated that we were on
the margins of acceptable behavior; not promising for a potential
product. We saw to it that the USB card was not sharing interrupts
with anything demanding (though, as I recall, we did not have the USB
card isolated on its own IRQ). And all the other reports of success
from the group involved multi-GHz machines. There was even sage advice
to move to 64-bit CPUs. The prudent call seemed to be to move away from
the "slow and cool" mode, as attractive as it is for a lot of good
reasons.

That said, CPU clock speed and USB signal timing are two different
things. Probably the card's interconnection bus protocol (careful,
some single board computers have oddball expansion connectors) and the
IRQ service logic are more important factors. And as Eric points out,
the answer to these "is the computer fast enough" questions all
depends on the intended application's sample rates. So if your app
allows it, and you have the time and the hardware, by all means dig
deeper.
Jim Hanlon