Forum: GNU Radio Packet Radio

dlapsley (Guest)
on 2006-04-01 06:43
(Received via mailing list)
BBN is working on a project, funded by the US Government, to build
teams of cognitively-controlled software-defined radios.  As part of
this project we will be building a subnetwork-layer routing protocol
and a MAC that is designed for the software radio environment.  We
will be making all of our code available under Free licenses (GPL for
GNU Radio changes, and 3-clause BSD or equivalent for many other
things).  Our system will run on GNU/Linux and NetBSD.

We will be using GNU Radio as the software radio base in our system.
Recently there have been great strides in sending packetized data with
GNU Radio.  Several BBNers, with guidance from Eric, have thought
about a number of changes to GNU Radio which would make data radio
usage more flexible, and have written up these changes.

We would very much appreciate peer review of these proposals.  We are
interested in finding and fixing problems anyone can see in the
approach, or ways in which the changes can be more broadly useful if
done differently.

Our project will be working to improve GNU Radio, and we plan to
follow our proposed roadmap, after revising it based on feedback.  We
would like to work closely with others who would like to join us, and
will strive to make sure anything we do is useful to a broad set of
people and does not cause harm.  (Of course, Eric and the consensus of
gnuradio-discuss will, as always, determine what's in the official
tree.)

We will also be making the USRP work well on NetBSD, fixing the
current USB speed issue.

We will be making our work available as we do it, and plan to interact
as several individuals working on GNU Radio (with a common purpose)
rather than as an isolated project.

The document is available at

     http://acert.ir.bbn.com/downloads/adroit/gr-arch-c...

We would appreciate feedback, sent to gnuradio-discuss, or feel free
to email us privately if there's some reason gnuradio-discuss isn't
appropriate.


Greg Troxel <gdt@ir.bbn.com>     (Principal Investigator)
David Lapsley <dlapsley@bbn.com>    (gnuradio-discuss Liaison)
Eric Blossom (Guest)
on 2006-04-01 09:45
(Received via mailing list)
On Fri, Mar 31, 2006 at 11:42:37PM -0500, dlapsley wrote:
>
> The document is available at
>
>     http://acert.ir.bbn.com/downloads/adroit/gr-arch-c...
>
> We would appreciate feedback, sent to gnuradio-discuss, or feel free
> to email us privately if there's some reason gnuradio-discuss isn't
> appropriate.


I think the basic m-block idea looks reasonable, and achieves the goal
of extending GNU Radio without disturbing the existing framework.


In section 4.5, "Two stage, quasi-real time, hybrid scheduler":

FYI, a given flow graph currently may be evaluated with more than one
thread if it can be partitioned into disjoint subgraphs.  I don't
think that fundamentally changes anything with regard to embedding a
flow graph in an m-block.


Section 4.5.4, second bullet: "profile header portion".  Limiting the
kind and profile lengths to 8-bits each seems like you're asking for
trouble.   For example, when combining many m-blocks from many
different sub-projects, the universe of kinds could easily exceed 256.

Are you assuming that this gets passed across the air, or just within
a given node?  If within a node, for the kind I'd suggest something
reminiscent of interned symbols.  16-bits would probably be big
enough, if each block mapped their arbitrary kind name (string) into
an interned 16-bit value at block init time.
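
A minimal sketch of the interning idea (the function and type names
here are invented for illustration; a real version would also need a
lock around the table):

    // Map an arbitrary kind name to a process-wide 16-bit ID at block
    // init time.  Not thread safe as written.
    #include <stdint.h>
    #include <map>
    #include <stdexcept>
    #include <string>

    typedef uint16_t kind_id_t;

    kind_id_t intern_kind(const std::string &name)
    {
        static std::map<std::string, kind_id_t> s_table;
        std::map<std::string, kind_id_t>::iterator it = s_table.find(name);
        if (it != s_table.end())
            return it->second;                  // already interned
        if (s_table.size() >= 0xffff)
            throw std::runtime_error("kind space exhausted");
        kind_id_t id = (kind_id_t) s_table.size();
        s_table[name] = id;                     // first use assigns the next free ID
        return id;
    }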

I'd also make sure you've got some way to ensure that the data portion
is aligned on the most restrictive architectural boundary (16-bytes on
x86 / x86-64).
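
One way to get that guarantee, sketched under the assumption that
posix_memalign is available (it is on GNU/Linux; older NetBSD may need
a substitute such as valloc):

    #include <stdlib.h>

    static const size_t MB_DATA_ALIGNMENT = 16;  // most restrictive boundary on x86/x86-64

    // Allocate the data portion on a 16-byte boundary; caller frees with free().
    void *mb_alloc_data(size_t nbytes)
    {
        void *p = 0;
        if (posix_memalign(&p, MB_DATA_ALIGNMENT, nbytes) != 0)
            return 0;   // allocation failed
        return p;
    }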


Section 4.5.5 Standardized Time:

In reading what's there, I don't see how you're going to solve the
problems that I think we've got.  Perhaps an end-to-end example would
help illustrate your proposal?

For example, Table 4.2 says that "Timestamp" carries the value of the
transmit-side "sampling-clock" at the time this message was
transmitted.  If I'm a "source m-block" generating, say a test
pattern, what do I put in the Timestamp field?  Where do I get the
value?  Consider the case where the "real" sampling-clock is across
USB or ethernet.

If I want to tell the ultimate downstream end of the pipeline not to
transmit the first sample of the modulated packet until time t, how do
I do that?  That's essential for any kind of TDMA mechanism.

In general, I'm not following this section.  I'm not sure if you're
trying to figure out the real time required through each m-block
and/or if you're trying to figure out the algorithmic delay through
each block, and/or if you're trying to figure out the NET to NET
delay between multiple nodes, ...

Also, an example of how we'd map whatever you're thinking about on to
something that looked like a USRP or other h/w would be useful.

I guess I'm missing the overall statement of intention.  I.e., what do
the higher layers care about, and how does your proposal help them
realize their goals?


Meta data:

General questions about meta-data: Does an m-block just "copy-through"
meta-data that it doesn't understand?

Or in the general case, why not just make it *all* key/value pairs?
Why restrict yourself to a single distinguished "data portion"?


Section 4.5.8: Scheduler.

I'm not sure I follow Figure 4.8.  Perhaps once I understand the
timing stuff it'll make more sense.


Section 4.5.9: Memory Mgmt

With regard to reference counting, we've had good luck with the
boost::shared_ptr stuff.  It's transparent, interacts well with
Python, and just works.
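
For example, in the _sptr style GNU Radio already uses for blocks
(mb_message here is a hypothetical message class, not an existing
type):

    #include <boost/shared_ptr.hpp>

    // Hypothetical message class: payload, metadata, timestamps, ...
    class mb_message { /* ... */ };

    typedef boost::shared_ptr<mb_message> mb_message_sptr;

    void example()
    {
        mb_message_sptr msg(new mb_message());
        mb_message_sptr alias = msg;  // refcount becomes 2; payload is not copied
    }   // both sptrs go out of scope; the message is deleted exactly once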


Section 4.5.10: Implementation Considerations

* Reentrancy:  I think we need to distinguish between multiple
instances of a block each running in a separate thread, vs a given
single instance running in multiple threads.  I don't see an
overwhelming need to have a given instance be reentrant, with the
possible exception of communicating commands to it at runtime.  But in
that case, a thread safe queue of commands might suffice.
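
A sketch of such a queue, assuming pthreads (all names are
illustrative):

    #include <pthread.h>
    #include <deque>
    #include <string>

    struct mb_command {
        std::string name;   // e.g. "set-threshold"
        double      arg;
    };

    class mb_cmd_queue {
        pthread_mutex_t         d_mutex;
        std::deque<mb_command>  d_q;
    public:
        mb_cmd_queue()  { pthread_mutex_init(&d_mutex, 0); }
        ~mb_cmd_queue() { pthread_mutex_destroy(&d_mutex); }

        void post(const mb_command &cmd)    // called from any thread
        {
            pthread_mutex_lock(&d_mutex);
            d_q.push_back(cmd);
            pthread_mutex_unlock(&d_mutex);
        }

        bool poll(mb_command &cmd)          // called by the block between work items
        {
            pthread_mutex_lock(&d_mutex);
            bool ok = !d_q.empty();
            if (ok) {
                cmd = d_q.front();
                d_q.pop_front();
            }
            pthread_mutex_unlock(&d_mutex);
            return ok;
        }
    };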


That's it for now!
Eric
dlapsley (Guest)
on 2006-04-01 18:12
(Received via mailing list)
Eric,

Thank you for your comments. Greatly appreciated. See my comments
inline.

Cheers,

David.

On Apr 1, 2006, at 2:42 AM, Eric Blossom wrote:

>
> I think the basic m-block idea looks reasonable, and achieves the goal
> of extending GNU Radio without disturbing the existing framework.

Great!

> In section 4.5, "Two stage, quasi-real time, hybrid scheduler":
>
> FYI, a given flow graph currently may be evaluated with more than one
> thread if it can be partitioned into disjoint subgraphs.  I don't
> think that fundamentally changes anything with regard to embedding a
> flow graph in an m-block.

Thanks for the pointer. We had missed the partition_graph step in the
scheduler class. As you say, it won't affect the flow graph embedding,
apart from the possibility of having more than one thread spawned
from inside an m-block.

> Section 4.5.4, second bullet: "profile header portion".  Limiting the
> kind and profile lengths to 8-bits each seems like you're asking for
> trouble.   For example, when combining many m-blocks from many
> different sub-projects, the universe of kinds could easily exceed 256.

Good point.

> Are you assuming that this gets passed across the air, or just within
> a given node?  If within a node, for the kind I'd suggest something
> reminiscent of interned symbols.  16-bits would probably be big
> enough, if each block mapped their arbitrary kind name (string) into
> an interned 16-bit value at block init time.

We are assuming that this gets passed between elements of the software
radio, so just within a node. 16-bits sounds good for this.

> I'd also make sure you've got some way to ensure that the data portion
> is aligned on the most restrictive architectural boundary (16-bytes on
> x86 / x86-64)

Good idea.

> Section 4.5.5 Standardized Time:
>
> In reading what's there, I don't see how you're going to solve the
> problems that I think we've got.  Perhaps an end-to-end example would
> help illustrate your proposal?

We'll work on getting a good example into the document.

> For example, Table 4.2 says that "Timestamp" carries the value of the
> transmit-side "sampling-clock" at the time this message was
> transmitted.  If I'm a "source m-block" generating, say a test
> pattern, what do I put in the Timestamp field?  Where do I get the
> value?  Consider the case where the "real" sampling-clock is across
> USB or ethernet.

One option would be to have a sample counter in the source m-block
that is incremented for every data sample that is transmitted.
The value of that sample counter when you transmit a message is what
would be written into the timestamp field.

The timing message ties wall clock time to this sample count. Every
block in the flow graph would learn the relationship between wall
clock time and sample count from the periodic timing messages, which
contain an NTP timestamp and the equivalent RTP timestamp (i.e. sample
count). The sampling frequency can also be used to work out the time
corresponding to a given sample, given a single timing/synchronization
message.
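
A worked sketch of that mapping (struct and field names are ours, for
illustration only):

    #include <stdint.h>

    struct timing_msg {
        double   ntp_time;      // wall clock seconds at the reference sample
        uint64_t sample_count;  // RTP-style timestamp: running sample counter
        double   fs;            // sampling frequency in Hz
    };

    // Wall clock time of an arbitrary sample, given the most recent
    // timing/synchronization message.
    double sample_to_wallclock(const timing_msg &t, uint64_t sample)
    {
        return t.ntp_time + (double)(int64_t)(sample - t.sample_count) / t.fs;
    }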

> If I want to tell the ultimate downstream end of the pipeline not to
> transmit the first sample of the modulated packet until time t, how do
> I do that?  That's essential for any kind of TDMA mechanism.

The most direct way is through the signaling interface. The MAC layer
(or other client) can send a signal enabling/disabling transmission at
the end of the pipeline at the appropriate point in time.

Another way to do it would be to have some form of "playout buffer" at
the end of the pipeline that buffers packets until it is time for them
to be sent. In this case the timing transfer mechanism would be used
to enable each block to measure the latency from the time the packet
entered the top of the pipeline until it arrived at (or left) the
block. These latencies would be exposed to the top-level m-block
scheduler, which could then allocate processing time to blocks based
on these latencies in order to ensure that some threshold was not
exceeded. Typically, you could imagine the processor just looking at
the end-to-end delay and scheduling processing to keep that below a
certain threshold. In a sense, the scheduler does coarse-grained
scheduling to ensure the end-to-end delay does not go beyond some
tolerance, while the playout buffer does fine-grained scheduling.
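
A rough sketch of such a playout buffer (types are hypothetical; the
real interface would depend on the hardware at the tail of the
pipeline):

    #include <stdint.h>
    #include <queue>
    #include <vector>

    struct tx_packet {
        uint64_t playout_sample;   // do not transmit before this sample time
        // ... samples, metadata ...
    };

    struct later_first {
        bool operator()(const tx_packet *a, const tx_packet *b) const {
            return a->playout_sample > b->playout_sample;  // earliest deadline on top
        }
    };

    class playout_buffer {
        std::priority_queue<tx_packet*, std::vector<tx_packet*>, later_first> d_q;
    public:
        void insert(tx_packet *p) { d_q.push(p); }

        // Called as the hardware sample clock advances; returns the next
        // packet that is due, or 0 if nothing is ready yet.
        tx_packet *pop_due(uint64_t now_sample)
        {
            if (d_q.empty() || d_q.top()->playout_sample > now_sample)
                return 0;
            tx_packet *p = d_q.top();
            d_q.pop();
            return p;
        }
    };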

> In general, I'm not following this section.  I'm not sure if you're
> trying to figure out the real time required through each m-block
> and/or if you're trying to figure out the algorithmic delay through
> each block, and/or if you're trying to figure out the NET to NET
> delay between multiple nodes, ...

It's the first two. The initial thought is just to re-use the
semantics and format of RTP/RTCP for transferring timing information
between elements of a radio. There are a couple of options. One option
would be to figure out the wall clock delay between two blocks within
a flow graph (you could imagine that typically they would be the
endpoints of a pipeline). This way we can make sure the delay through
a flow graph stays within limits by scheduling blocks appropriately.
Another option would be to measure the end-to-end delay between some
process in the MAC (or other controlling entity) and the bottom of a
pipeline in the PHY. There could also be a control loop here to ensure
that the end-to-end delay requirements are not exceeded.

We'll work on making this section clearer and get a new revision out
next week.

> Also, an example of how we'd map whatever you're thinking about on to
> something that looked like a USRP or other h/w would be useful.

Will do.

> I guess I'm missing the overall statement of intention.  I.e., what do
> the higher layers care about, and how does your proposal help them
> realize their goals?

The main goal of section 4.5.5 is to provide mechanisms that will
bound the time it takes for a request to make it all the way to the
bottom of the PHY, and to enable real-time scheduling/playout of data
at the bottom of the PHY.


> Meta data:
>
> General questions about meta-data: Does an m-block just "copy-through"
> meta-data that it doesn't understand?

Yes. Ideally, we would just be passing references/pointers so there
wouldn't need to be any copying. You could also imagine blocks
"popping" off profiles/sections of metadata specific to them.

> Or in the general case, why not just make it *all* key/value pairs?
> Why restrict yourself to a single distinguished "data portion"?

Sure. That's a nice way to think about it. It would also be nice to
maintain a hierarchy of metadata so that there was some structure
to it (e.g. grouping by profiles or block type).
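
For instance, a two-level key/value structure along these lines
(illustrative only; a real design would want a variant value type
rather than plain strings):

    #include <map>
    #include <string>

    typedef std::map<std::string, std::string> md_profile;  // key -> value
    typedef std::map<std::string, md_profile>  md_tree;     // profile name -> profile

    void example()
    {
        md_tree meta;
        meta["phy"]["modulation"] = "gmsk";
        meta["mac"]["priority"]   = "7";
        // A block that only understands "phy" copies "mac" through untouched.
    }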

>
> Section 4.5.8: Scheduler.
>
> I'm not sure I follow Figure 4.8.  Perhaps once I understand the
> timing stuff it'll make more sense.

We'll work on making this clearer.

> Section 4.5.9: Memory Mgmt
>
> With regard to reference counting, we've had good luck with the
> boost::shared_ptr stuff.  It's transparent, interacts well with
> Python, and just works.

Thanks for the pointer.

> Section 4.5.10: Implementation Considerations
>
> * Reentrancy:  I think we need to distinguish between multiple
> instances of a block each running in a separate thread, vs a given
> single
> instance running in multiple threads.  I don't see an overwhelming
> need to have a given instance be reentrant, with the possible
> exception of communicating commands to it at runtime.  But in that
> case, a thread safe queue of commands might suffice.

Yes. I agree.

> That's it for now!
> Eric

Thanks so much for the feedback. That is great!
Lee Patton (Guest)
on 2006-04-01 18:37
(Received via mailing list)
I would like to control the USRP with a smaller embedded Linux computer
instead of a laptop. I don't have any experience with this sort of
thing, and searching "embedded" in the mailing list archive didn't
return much.  So, I was hoping some of you pros out there might be able
to point me in the direction of some products you like.  I'm not sure
what our requirements are exactly. However, I can say, the smaller the
form factor, and the less power consumed, the better.  I don't think we
need too much horsepower. Our application just doesn't call for it.  Any
suggestions would be greatly appreciated.

Thanks,
 - Lee
Eric Blossom (Guest)
on 2006-04-01 21:38
(Received via mailing list)
On Sat, Apr 01, 2006 at 11:09:18AM -0500, dlapsley wrote:

Timing:

> One option would be to have a sample counter in the source m-block
> that is incremented for every data sample that is transmitted.
> The value of that sample counter when you transmit a message is what
> would be written into the timestamp field.
I believe this assumes a model where m-blocks have a more-or-less
uniform flow through them.  I believe this assumption is invalid.  In
fact, I thought the whole point of m-blocks was to better deal with
discontinuous flow.  E.g., packets arrive at random times.  The m-block
isn't running if there isn't something for it to do.  Yet, the real
world sample clock marches on...

> The timing message ties wall clock time to this sample count. Every
> block in the flow graph would learn the relationship between wall
> clock time and sample count from the periodic timing messages, which
> contain an NTP timestamp and the equivalent RTP timestamp (i.e. sample
> count). The sampling frequency can also be used to work out the time
> corresponding to a given sample, given a single timing/synchronization
> message.

I don't think this works in a world of discontinuous transmission or
reception.  We aren't streaming video ;)

> >If I want to tell the ultimate downstream end of the pipeline not to
> >transmit the first sample of the modulated packet until time t, how do
> >I do that?  That's essential for any kind of TDMA mechanism.
>
> The most direct way is through the signaling interface. The MAC layer
> (or other client) can send a signal enabling/disabling transmission at
> the end of the pipeline at the appropriate point in time.

For this particular example, I think that the time, or a proxy for the
time (e.g., Transmit this packet in TDMA Slot N of M), should be in
the metadata attached to the high-level packet.

Perhaps we're talking past each other here.  Or we're conflating the
playout time with a desire to figure out how to schedule the m-blocks
so that the right thing occurs at the right time.

The important part is getting the timing semantics right at the
MAC/PHY boundary and at the "soft PHY / hard PHY" boundary.
Everything else is an implementation detail.

I'm not sure we've got this pinned down yet.

> Another way to do it would be to have some form of "playout buffer" at
> the end of the pipeline that buffers packets until it is time for
> them to be sent.

Yes, and there's a (perhaps unacknowledged) requirement to minimize
latency.  The latency in this part of the system directly impacts any
MAC control loop.

> In this case the timing transfer mechanism would be used to enable
> each block to measure the latency from the time the packet entered
> the top of the pipeline until it arrived at (or left) the block.

My thought is that the latency through the blocks is going to vary all
over the place, and from packet to packet.  E.g., one packet may come
in with metadata indicating "needs to get there with
high-probability" or at a different level of abstraction "use XYZ FEC
and ABC modulation".  The next packet has different metadata.  The
downstream flow for these two cases may be completely different.

Yes, I understand that you could model this based on worst case and/or
some probability distribution.  Is that the goal?

> These latencies would be exposed to the top-level m-block scheduler,
> which could then allocate processing time to blocks based on these
> latencies in order to ensure that some threshold was not
> exceeded. Typically, you could imagine the processor just looking at
> the end-to-end delay and scheduling processing to keep that below a
> certain threshold. In a sense, the scheduler does coarse-grained
> scheduling to ensure the end-to-end delay does not go beyond some
> tolerance, while the playout buffer does fine-grained scheduling.

[The final playout buffer is in the attached hardware]

OK, I get the basic goal.
Not sure I agree with all the RTP/RTCP/NTP...  Can't the m-scheduler
just *measure* the execution time of each m-block?
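
E.g., something as simple as this around each dispatch (names here are
illustrative, not existing GNU Radio calls):

    #include <sys/time.h>

    static double now_seconds()
    {
        struct timeval tv;
        gettimeofday(&tv, 0);
        return tv.tv_sec + tv.tv_usec * 1e-6;
    }

    // In the scheduler's dispatch loop (handle_message and
    // update_latency_estimate are hypothetical):
    //
    //   double t0 = now_seconds();
    //   block->handle_message(msg);
    //   update_latency_estimate(block, now_seconds() - t0);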

> >In general, I'm not following this section.  I'm not sure if you're
> >trying to figure out the real time required through each m-block
> >and/or if you're trying to figure out the algorithmic delay through
> >each block, and/or if you're trying to figure out the NET to NET
> >delay between multiple nodes, ...

> It's the first two.

Clear.

> The initial thought is just to re-use the semantics and format of
> RTP/RTCP for transferring timing information between elements of a
> radio.

Seems like a solution looking for a problem.

> There are a couple of options. One option would be to figure out
> the wall clock delay between two blocks within a flow graph (could
> imagine that typically they would be endpoints of a pipeline). This
> way we can make sure the delay through a flow graph stays within
> limits by scheduling blocks appropriately.

Seems reasonable.

> Another option would be to measure the end to end delay between some
> process in the MAC (or other controlling entity) and the bottom of a
> pipeline in the PHY.  There could also be a control loop here to
> ensure that the end to end delay requirements are not exceeded.

OK.

> We'll work on making this section clearer and get a new revision out
> next week.

Sounds good.  I suggest starting with some use cases for the MAC/PHY
interface and the soft-PHY/hard-PHY interface.

I'm particularly interested in sorting out the idea of time.

I think that "priority" originates in the MAC.  It may tell us that a
particular packet has priority P.  The MAC is going to be handing us
multiple outstanding packets, right?  We're going to flow control it
somehow, but I'm assuming that it may send us a packet at time T+1
that has higher priority than one sent at time T.  When/where/how do
we handle this?


I think the soft-PHY/hard-PHY interface is pretty straightforward.
You've got to assume that the low level hardware is pretty dumb.  In
the receive direction, assume that the hardware gives you fixed length
packets of N samples with a header containing its sample counter
value corresponding to the first sample of the packet.

In the transmit direction, it might be slightly more complicated.
Assume fixed length packets with a header that has a "do not transmit
before sample counter time T", as well as a few other bits including
things like "this packet is the beginning of a frame", "this packet
is the middle of a frame", "this packet is the end of a frame", "this
packet has N valid samples".  The packet header probably also contains
attributes such as transmit power, but probably only in the packet
that corresponds to the first fragment of a frame.

The soft-PHY/hard-PHY packets also carry some indication of "channel"
which is a label for a particular path in the h/w from the interface
to/from an antenna. [Probably not relevant to the timing problem.]
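
To make the shape of this concrete, one possible packet layout (all
field names and widths are guesses for illustration, not a proposal
from the document):

    #include <stdint.h>

    enum {
        PKT_START_OF_FRAME = 1 << 0,  // this packet begins a frame
        PKT_END_OF_FRAME   = 1 << 1,  // this packet ends a frame
        PKT_HAS_DEADLINE   = 1 << 2,  // honor timestamp as "do not tx before"
    };

    struct phy_packet {
        uint32_t timestamp;    // RX: sample count of the first sample
                               // TX: do-not-transmit-before sample time
        uint16_t flags;        // PKT_* bits above
        uint16_t n_valid;      // number of valid samples in the payload
        uint8_t  channel;      // which h/w path to/from an antenna
        uint8_t  pad[3];       // keep the payload aligned
        uint32_t samples[126]; // fixed-length payload of 16-bit I/Q pairs
    };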


> >Also, an example of how we'd map whatever you're thinking about on to
> >something that looked like a USRP or other h/w would be useful.
>
> Will do.
>
> >I guess I'm missing the overall statement of intention.  I.e., what do
> >the higher layers care about, and how does your proposal help them
> >realize their goals?
>

> The main goal of section 4.5.5 is to provide mechanisms that will
> bound the time it takes for a request to make it all the way to the
> bottom of the PHY and to enable real-time scheduling/playout of data
> at the bottom of the PHY.

OK.

> >Why restrict yourself to a single distinguished "data portion"?
>
> Sure. That's a nice way to think about it. It would also be nice to
> maintain a hierarchy of metadata so that there was some structure
> to it (e.g. grouping by profiles or block type).

Sounds good.

> >
> >Section 4.5.8: Scheduler.
> >
> >I'm not sure I follow Figure 4.8.  Perhaps once I understand the
> >timing stuff it'll make more sense.
>
> We'll work on making this clearer.

Good.


With regard to signals and control processing, it seems like it would
make sense to use the same mechanism to communicate with existing
gr_blocks.  E.g., a carrier sense block implemented in the flow graph
needs to be able to send a transition signal and receive a
threshold setting.  [I'm not attached to the existing msg / msg queue
stuff, it just seemed like the simplest thing that could possibly
work.]

Also, in the definitions in Table 4.1, if a "message" is the smallest
block of information that can be processed by an m-block, what do you
call the data that is associated with a "signal" or "control"?

I'm also not clear about how or when you process the scheduling of
control/signals versus data+metadata.  When does an m-block handle the
signals/control versus the data + metadata?

Do you envision a different mechanism?  If so, why?

(Can we pick different nomenclature for "signals"?  It's seriously
overloaded in the POSIX environment.)

At a higher level of abstraction, why would an m-block have different
classes of ports?  Why don't they *all* accept and/or send
"messages" containing "data + metadata"?  The current design seems
like it's being prematurely narrowed and special cased, when one
simple abstraction seems to cover all the cases.  They could of course
have ports labeled "control" and "signal", but do we really want a
different implementation?

Also, for your use cases and pictures, I'd start with m-blocks
that contain multiple input and output ports for "data + metadata",
and then handle single-in / single-out as a specialization.


It may be that this latency / scheduling problem is blown way out of
proportion.  We've got to assume we've got sufficient cycles to
execute the transmit and receive paths in the worst case, or else
we're hosed.  Buy faster hardware / fix the signal processing until
the problem goes away.

Then, you can get priority scheduling within the m-block universe by
attaching a priority to each message.  I'm assuming the "single model
of the universe where everything is a message containing
data+metadata, including the so-called signal and control ports".
Then just implement a priority queue for each port.  The priority
assigned each m-block at time t is the max(over-all-of-its-port-queues).
Then just pick the highest priority m-block and run it.  No problem.
The latency is going to be what it's going to be.  We're doing what
the MAC asked us to do in the order it asked us to do it.

Then, we're left with the high level problem of hitting the given TDMA
slot, but that gets sorted out by the tail end of the m-block pipeline
that feeds the physical h/w with the appropriate frame + timestamp.

In summary, isn't an m-block just an actor that wakes up when told and
processes events/messages in its queue(s) and generates
events/messages that get sent someplace else?  If so, then this is a
well understood problem, with a well understood solution, and no
reentrancy problems ;)  Composing them falls out nicely too.  We also
get transparent scaling in the SMP/multi-core environment.

Eric
Clark Pope (Guest)
on 2006-04-02 15:36
(Received via mailing list)
Lee,
I am very much interested in the same. Specifically, I'd like to get it
on embedded Linux running off a Virtex-4 FPGA. This module from
hydraxc (http://www.hydraxc.com/) has a USB 2.0 port and is the size of
a stick of gum.

Seems like one should be able to connect this to a USRP with a 100 MHz
PPC running Linux to process at least a couple hundred kilohertz of
bandwidth. The EHCI driver should already exist, so I guess it's just a
matter of getting Python to run on it and setting up a cross-compiler
environment for the C++ files.

I have a USRP and I've ordered a hydraxc. Let me know if you want to
pursue this further. I don't have much time to throw at it, but I can
at least try some things out to gauge how much work is involved.

Thanks,
Clark
Lee Patton (Guest)
on 2006-04-03 19:03
(Received via mailing list)
Thanks for the reply, Clark.  Basically, the project we're working on
involves putting the USRP on a small UAV ("A" as in "aerial" not
"autonomous").  So, weight and power consumption are key.  The hydraxc
looks very cool.  However, I think we can go larger. And, for the first
cut at a solution, we want to go for the simplest thing that works.  It
seems like the hydraxc might not be that. However, for a final design
choice, it might work very nicely.  In fact, it looks like it could be
the solution, but it will take more time to get up and running than we
have.  I hope you will update the list as you make progress with the
hydraxc.

- Lee
Thomas Schmid (Guest)
on 2006-04-03 19:24
(Received via mailing list)
Hi Lee,

We are also looking into similar things, though we don't have the
power constraints which will be difficult to meet. We are currently
looking at a micro-ATX solution with an AMD Geode NX processor, since
this one is compatible with the AMD Athlon Mobile. This should make it
easy to have a running Linux system.

We were first also thinking about platforms which feature an Intel
XScale processor, though we quickly found out that it doesn't support
floating point. GNU Radio makes heavy use of floating point
operations. Thus it is key, especially if the platform is not very
powerful, that it supports hardware floating point.

Also, did you consider the power consumption of the USRP? I don't know
if someone has measured it, but this might also be a problem for you,
i.e., finding a battery which supports the USRP over a longer time.

cheers,

Thomas
Eric Blossom (Guest)
on 2006-04-03 20:23
(Received via mailing list)
On Mon, Apr 03, 2006 at 01:00:07PM -0400, Lee Patton wrote:
> Basically, the project we're working on involves putting the USRP on
> a small UAV ("A" as in "aerial" not "autonomous").  So, weight and
> power consumption are key.  The hydraxc looks very cool.

The hydraxc looks cool, but I think it's going to be a bear to get GNU
Radio running on it.  A couple of observations: no floating point (not
positive, but pretty sure about that), no MMU (otherwise they wouldn't
be running ucLinux).

Though I haven't played with a Virtex 4 with embedded PPC, I have
played with a Virtex II (V2P50) with 2x embedded PPCs and its
performance was underwhelming.  Really small cache didn't help.  After
spending quite a bit of time on it, I was unable to get the LWIP
TCP/IP stack to run at anything like wire speed using a 100Mbit PHY.
Couldn't even compute UDP checksums and keep up.

If you can afford the size/weight, I'd probably try one of the Pentium M
single board computers.  Everything should just work on that platform.
Yes, they're a lot bigger than the hydraxc or gumstix, but they're
very likely to work right out of the box.

Also, unless you're a glutton for punishment, don't get the Celeron
version, spend the extra bucks and get the Pentium M.

Eric
Eric Blossom (Guest)
on 2006-04-03 20:38
(Received via mailing list)
On Mon, Apr 03, 2006 at 10:22:01AM -0700, Thomas Schmid wrote:
> Hi Lee,
>
> We are also looking into similar things, though we don't have the
> power constraints which will be difficult to meet. We are currently
> looking at a micro-ATX solution with an AMD Geode NX processor, since
> this one is compatible with the AMD Athlon Mobile. This should make it
> easy to have a running Linux system.

Have you tried the geode nx?

At least on earlier geodes the performance was pretty dismal.  Perhaps
the "nx" is a different processor than I remember.  Caveat emptor.
Compatible means "has the same instruction set".  A 486 DX66 comes
pretty close to meeting that requirement ;)

> We were first also thinking about platforms which feature an Intel
> XScale processor, though we quickly found out that it doesn't support
> floating point. GNU Radio makes heavy use of floating point
> operations. Thus it is key, especially if the platform is not very
> powerful, that it supports hardware floating point.
>
> Also, did you consider the power consumption of the USRP? I don't know
> if someone has measured it, but this might also be a problem for you,
> i.e., finding a battery which supports the USRP over a longer time.

The USRP on my bench draws about 11W.  This depends of course on the
daughterboards you're using, and whether you've got them all running.

Eric
Lee Patton (Guest)
on 2006-04-03 21:06
(Received via mailing list)
On Mon, 2006-04-03 at 10:22 -0700, Thomas Schmid wrote:
> Hi Lee,
>
> We are also looking into similar things, though we don't have the
> power constraints which will be difficult to meet. We are currently
> looking at a micro-ATX solution with an AMD Geode NX processor, since
> this one is compatible with the AMD Athlon Mobile. This should make it
> easy to have a running Linux system.

Good point. Are you custom building this, or did you find a COTS
solution?  Have you looked at mini-itx (which I now know of thanks to
Pete)? If so, what was your impression?


> We were first also thinking about platforms which feature an Intel
> XScale processor, though we quickly found out that it doesn't support
> floating point. GNU Radio makes heavy use of floating point
> operations. Thus it is key, especially if the platform is not very
> powerful, that it supports hardware floating point.

Wow. Thanks. I didn't even consider that point. I will definitely make
sure I check this.

>
> Also, did you consider the power consumption of the USRP? I don't know
> if someone has measured it, but this might also be a problem for you,
> i.e., finding a battery which supports the USRP over a longer time.

This is a problem for us. However, other members of the group, namely,
the UAV guys, seem to think we're okay on this.

> cheers,
>
> Thomas

Thanks for the reply, Thomas.  I really appreciate the advice.
Clark Pope (Guest)
on 2006-04-03 21:15
(Received via mailing list)
The ppclinux does use the MMU, but a lot of people still run ucLinux,
mostly, it appears, because it is easier to set up and get going. I
think one can add the FPU at the expense of using up the FPGA. But,
yes, such a thing would be an uphill climb.

Then there's the issue that if it's an embedded, headless system and
you run kernel 2.6, why not just write the blocks in C and use the
system FIFOs to handle the data scheduling?

-Clark


> > "autonomous").  So, weight and power consumption are key.
>
>The hydraxc looks cool, but I think it's going to be a bear to get GNU
>Radio running on it.
>
>If you can afford the size/weight, I'd probably try one of the Pentium M
>single board computers.  Everything should just work on that platform.
>Yes, they're a lot bigger than the hydraxc or gumstix, but they're
>very likely to work right out of the box.
>
>Also, unless you're a glutton for punishment, don't get the Celeron
>version, spend the extra bucks and get the Pentium M.
>
>Eric

Thomas Schmid (Guest)
on 2006-04-03 21:36
(Received via mailing list)
Hi Eric

On 4/3/06, Eric Blossom <eb@comsec.com> wrote:
> On Mon, Apr 03, 2006 at 10:22:01AM -0700, Thomas Schmid wrote:
> > Hi Lee,
> >
> > We are also looking into similar things, though we don't have the
> > power constraints which will be difficult to meet. We are currently
> > looking at a micro-ATX solution with an AMD Geode NX processor, since
> > this one is compatible with the AMD Athlon Mobile. This should make it
> > easy to have a running Linux system.
>
> Have you tried the geode nx?

No, we didn't actually try the Geode NX yet, but on paper they look
pretty good. They are completely different from the older Geodes, which
are less powerful. For some more information:

http://www.amd.com/us-en/ConnectivitySolutions/Pro...
Charles Swiger (Guest)
on 2006-04-03 21:36
(Received via mailing list)
On Mon, 2006-04-03 at 10:22 -0700, Thomas Schmid wrote:

> Also, did you consider the power consumption of the USRP? I don't know
> if someone has measured it, but this might also be a problem for you,
> i.e., finding a battery which supports the USRP over a longer time.
>

I use a USRP with a battery a lot. With two Basic RX boards and one
Basic TX board mounted, doing input from just one RX board, a 6V 7Ah
sealed lead-acid battery lasts just about 3 hours.

--Chuck
Eric Blossom (Guest)
on 2006-04-03 21:55
(Received via mailing list)
On Mon, Apr 03, 2006 at 12:33:26PM -0700, Thomas Schmid wrote:
> > > easy to have a running Linux system.
> >
> > Have you tried the geode nx?
>
> No, we didn't actually try the Geode NX yet, but on paper they look
> pretty good. They are completely different from the older Geodes, which
> are less powerful. For some more information:
>
> 
http://www.amd.com/us-en/ConnectivitySolutions/Pro...

Thanks for the link.

Let us know how it turns out.

Eric
Lee Patton (Guest)
on 2006-04-03 22:35
(Received via mailing list)
Thanks, Chuck. Very useful to know.
John Gilmore (Guest)
on 2006-04-04 01:18
(Received via mailing list)
> A couple of observations: no floating point
>> Have you tried the geode nx?

The MIT "$100 Laptop" uses the AMD Geode GX500@1.0W.  See:
  http://wiki.laptop.org/wiki/Hardware_specification

According to Jim Gettys, the reason is that it's the only low-power
processor they could find that has hardware floating point.  OLPC
needs extremely low power (~5W for the whole system including LCD).

The GX runs much slower than the NX, but consumes about 1W rather
than 6-14W.  The GX is the old National Semiconductor Geode line; the
NX is a low-power Athlon core.

The OLPC demonstrator used an AMD "Rumba" development board, which
you might be able to find and try.

The simplest thing that works is almost certainly a modern laptop.
Beware of the Sony small/light ones: they use lots of custom,
undocumented chips, and won't run standard Windows or Linux
distributions (they're only warranted when running their own Windows
distro).

	John
dlapsley (Guest)
on 2006-04-04 16:12
(Received via mailing list)
Hi Eric,

Thank you for the feedback. We will incorporate your comments into
a new revision of the document that we should have out soon.

Timing and scheduling seem to be the hardest issues, so we'd like
to think about them a bit more and then discuss on the list.

Cheers,

David.
Lee Patton (Guest)
on 2006-04-19 00:20
(Received via mailing list)
On Mon, 2006-04-03 at 11:20 -0700, Eric Blossom wrote:
> ... unless you're a glutton for punishment, don't get the Celeron
> version, spend the extra bucks and get the Pentium M.

Besides the dearth of on-board cache, what are the other drawbacks of a
Celeron?

To fit the dimensional requirement I was given (5"x5"x~1"), the SBC must
be passively cooled.  However, I'm not finding a Pentium-M solution that
can be passively cooled and meets our availability requirements.  I have
found a 600 MHz Celeron solution, but it has half the L2 cache.

In our application, we'll be pulling full throttle from the USRP, maybe
FIR filtering, and then pushing back out to the USRP.  Not too heavy on
the
signal processing.

All advice appreciated.

- Lee

P.S.

Some potential solutions:
http://www.gms4sbc.com/P60x_BO.html  (can't meet availability)
http://www.kontron-emea.com/index.php?id=82&cat=58 (JRex-PM, can only
air cool Celeron M 600 MHz)
Eric Blossom (Guest)
on 2006-04-19 04:29
(Received via mailing list)
On Tue, Apr 18, 2006 at 06:18:55PM -0400, Lee Patton wrote:
> found a 600 MHz Celeron solution, but it has half the L2 cache.
>
> Some potential solutions:
> http://www.gms4sbc.com/P60x_BO.html  (can't meet availability)
> http://www.kontron-emea.com/index.php?id=82&cat=58 (JRex-PM, can only
> air cool Celeron M 600 MHz)

You should be able to benchmark this, including cache performance,
using oprofile (http://oprofile.sf.net).  To track cache misses you'll
need to enable a non-default set of counters in oprofile, but it's
possible.  You should be able to determine the cache hit/miss ratio
for your existing configuration that way.

Benchmark the app you want to run on whatever you've currently got.
The closer in architecture/microarchitecture, the better.  Then scale
by CPU freq, and a big wild-ass guess on cache size differences.

Eric
Jim Hanlon (Guest)
on 2006-04-20 20:41
(Received via mailing list)
I tried to get the USRP talking to an 800 MHz VIA fanless industrial
control computer, with an expansion card to get USB 2.0
performance. I found that GNU Radio installed and ran OK, but USRP
performance consistently exhibited USB overruns and underruns, to
the point that our evaluation of USRP capabilities was limited by our
compute power.

I think it's nice to be fanless, if you can. And maybe some clever
engineering of USB interrupt service routines could improve I/O
performance. But it is much simpler these days just to outrun the
problem with a faster CPU clock.

If you are trapped by form factor and "no fan" rules, you may be out of
luck. Does your architecture allow for (say, USB) expansion
cards? If so, you could build one that did block transfers to/from the
USB bus and your slowish CPU.
HTH
Jim Hanlon
Lee Patton (Guest)
on 2006-04-20 21:55
(Received via mailing list)
Thanks for the advice, Jim. I appreciate it.  I think I found a 1.1 GHz
Pentium M (thanks to Matt) that can be cooled fanless.  We're going to
give either that board or a 1 GHz Celeron board from Kontron a try.  I
think "outrunning" is the way to go right now.
Eric Blossom (Guest)
on 2006-04-20 22:08
(Received via mailing list)
On Thu, Apr 20, 2006 at 01:38:52PM -0500, Jim Hanlon wrote:

> I tried to get the USRP talking to an 800 MHz VIA fanless industrial
> control computer, with an expansion card to get USB 2.0 performance. I
> found that GNU Radio installed and ran OK, but USRP performance
> consistently exhibited USB overruns and underruns, to the point that
> our evaluation of USRP capabilities was limited by our compute
> power.

> I think it's nice to be fanless, if you can. And maybe some clever
> engineering of USB interrupt service routines could improve I/O
> performance. But it is much simpler these days just to outrun the
> problem with a faster CPU clock.

> If you are trapped by form factor and "no fan" rules, you may be out
> of luck. Does your architecture allow for (say, USB) expansion cards?
> If so, you could build one that did block transfers to/from the USB
> bus and your slowish CPU.

> HTH
> Jim Hanlon

As always, there are lots of differences between processors.  Some of
the VIAs are dogs, others aren't so bad.  We've run the USRP and
gmsk2 code on one particular compact VIA board, and it wasn't horrible.
We were able to run GMSK at a few hundred kb/sec with it.

The USB expansion card could have been the problem.  How do you know
it was lack of CPU?

Eric
Jim Hanlon (Guest)
on 2006-04-20 23:09
(Received via mailing list)
> How do you know it was lack of CPU?
>
To be honest, we did not try a controlled experiment of varying CPU
speed, nor did we engage in application profiling, or a lot of other
things we might have done. We were in cut-and-try mode. And our trial
of the "fanless industrial control computer" indicated that we were on
the margins of acceptable behavior; not promising for a potential
product. We saw to it that the USB card was not sharing interrupts
with anything demanding (though, as I recall, we did not have the USB
card isolated on its own IRQ). And all the other reports of success
from the group involved multi-GHz machines. There was even sage advice
to move to 64-bit CPUs. The prudent call seemed to be to move away from
the "slow and cool" mode, as attractive as it is for a lot of good
reasons.

That said, CPU clock speed and USB signal timing are two different
things. Probably the card's interconnection bus protocol (careful,
some single board computers have oddball expansion connectors) and the
IRQ service logic are more important factors. And as Eric points out,
the answer to these "is the computer fast enough" questions all
depends on the intended application's sample rates. So if your app
allows it, and you have the time and the hardware, by all means dig
deeper.
Jim Hanlon