[GSoC] Co-Processors Update #10

Hello all,

Logistical:

Progress:

  • I have pushed the tcp3d test: the ARM passes the LLRs to the DSP, the
    DSP processes them on the TCP3D, and the ARM then checks whether they
    match the expected result. https://github.com/muniza/tcp3d_dsp_test

  • I have some code that modifies the GNU Radio runtime to isolate blocks
    based on flags we set in the block’s constructor. This will enable us
    to treat certain blocks differently, such as putting the buffer in a
    different location in memory. There is also code that exposes the
    buffer object to the work function so that we can get the start
    address and size of the buffer in bytes.
    https://github.com/muniza/gnuradio

  • I have an OOT module that passes a struct with the GNU Radio buffer
    start address and size to a kernel module using ioctl.
    https://github.com/muniza/gr-buffertest

  • Lastly, there is the kernel module that runs get_user_pages on the
    GNU Radio buffer struct. https://github.com/muniza/gsoc_2014
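
As a rough sketch of the userspace side of that ioctl hand-off: the struct
layout, the magic number 'g' and the names gr_buf_desc, GRBUF_IOC_PIN,
gr_buf_describe and gr_buf_send are all made up for illustration and are
not taken from gr-buffertest or the kernel module above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/ioctl.h>

/* Hypothetical descriptor shared between userspace and the kernel module.
 * Fixed-width fields keep the layout identical on both sides. */
struct gr_buf_desc {
    uint64_t start; /* userspace virtual address of the buffer */
    uint64_t size;  /* buffer length in bytes */
};

/* Hypothetical ioctl request: userspace writes one descriptor to the driver. */
#define GRBUF_IOC_MAGIC 'g'
#define GRBUF_IOC_PIN _IOW(GRBUF_IOC_MAGIC, 1, struct gr_buf_desc)

/* Fill a descriptor from a buffer's base pointer and length. */
struct gr_buf_desc gr_buf_describe(const void *base, size_t nbytes)
{
    struct gr_buf_desc d;

    assert(base != NULL);
    d.start = (uint64_t)(uintptr_t)base;
    d.size = (uint64_t)nbytes;
    return d;
}

/* Pass the descriptor to an already opened device node.
 * Returns whatever ioctl() returns: -1 with errno set on failure. */
int gr_buf_send(int fd, const void *base, size_t nbytes)
{
    struct gr_buf_desc d = gr_buf_describe(base, nbytes);

    return ioctl(fd, GRBUF_IOC_PIN, &d);
}
```

The kernel side would copy the descriptor in with copy_from_user before
doing anything with the address. Only the address and size cross the
boundary, so the buffer contents are never copied.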

Plan:
I did more research on the contiguous memory allocation method and I now
see that it is not a good zero-copy solution for ALL the devices we want
to support. A good discussion is available on LWN covering the reasons
for NOT integrating ION, another contiguous memory allocator, into the
Linux kernel: http://lwn.net/Articles/565469/. I’m still going to get a
minimal CMEM GPLv3 integrated into GNU Radio as a stepping stone for
modifying runtime and using zero copy with the Keystone. This shouldn’t
take long at all since I just need ioctl. For part of my talk at the
conference, I am going to discuss this method of contiguous memory along
with its positives and negatives as it relates to GNU Radio and a couple
of devices. I’m also hoping to make more progress so I can show the
integration of the get_user_pages DMA method, but we’ll have to see what
happens when I’m back at Penn. I think a discussion of the various
methods will bring up good conversation for the coproc working group.

Expect one more update to mark the finishing of documentation on Friday.
I’ll also cry a little and give thanks to those that helped, but that’s
in a couple of days.

On Mon, Aug 18, 2014 at 1:42 PM, Alfredo M. [email protected] wrote:

> Expect one more update to mark the finishing of documentation on Friday.

Philip requested that I do a summary of my findings before then, so here
goes.

If we want GNU Radio to support a multitude of coprocessors, we need to
be able to offer both contiguous memory allocation support and
scatter-gather list support (the get_user_pages method). I think we all
agree that whatever goes into runtime must be general and not device
specific, since we don’t want to keep changing things when a new device
comes out.

Contiguous memory allocation support is needed for devices such as the
ARM<==>DSP in the Keystone2 because the DSP doesn’t have an MMU or
IOMMU. There is support for scatter-gather lists, but it is so highly
abstracted by the Multicore Navigator that it seems like a different
thing altogether. TI uses CMEM, Nvidia uses NVMAP, Qualcomm uses PMEM,
and Android uses ION, which essentially do the same thing but for
different devices. I went down the CMEM road because that’s what people
were recommending on the E2E forums for zero-copy work on the Keystone2.
I think we should be able to support things like this, and I think we
can with minor additions to runtime. Integrating a memory allocator into
GNU Radio will require that most of the device-specific and
allocator-specific work be done in the OOT module, which I think is
doable. All we need to do to GNU Radio is change the buffer location in
memory.
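
To make the contiguity requirement concrete, here is a minimal sketch of
the property a device without an MMU or IOMMU depends on: every page of
the buffer must physically follow the previous one. The 4096-byte page
size and the name is_physically_contiguous are assumptions for
illustration only. In practice an allocator such as CMEM provides this
guarantee rather than checking it after the fact.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, not queried from the OS */

/* page_phys[i] is the physical address of the i-th page of a buffer.
 * Returns true only if each page starts exactly one page after the
 * previous one, i.e. the buffer is a single physical range. */
bool is_physically_contiguous(const uint64_t *page_phys, size_t npages)
{
    assert(page_phys != NULL || npages == 0);

    for (size_t i = 1; i < npages; i++) {
        if (page_phys[i] != page_phys[i - 1] + SKETCH_PAGE_SIZE)
            return false;
    }
    return true;
}
```

An ordinary userspace buffer will usually fail this check, which is why
the buffer has to be relocated into allocator-managed memory for such
devices.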

Scatter-gather list support is needed for devices such as the Zynq, where
the AXI DMA supports scatter-gather lists. First we pass our buffer from
userspace to the kernel module using ioctl. The module runs
get_user_pages on it, which essentially gives us the translation from
userspace virtual to kernelspace physical without a need to copy to the
kernel (kernelspace virtual). We can then use the scatter-gather API in
the kernel to send to the bus address. The reason for scatter-gather is
that the buffer is large, so it spans many pages that are not contiguous
in memory; devices that support scatter-gather make it appear contiguous.
This is an overall better solution, as it’s pretty much the standard to
include scatter-gather DMA support (it’s just very abstracted in the
Keystone2). Integrating this into GNU Radio is possible with the changes
I have already made to runtime on my GNU Radio branch. I can probably
write a module and test this myself before the conference, since I have a
ZedBoard, experience developing on the Zynq, and friends with experience
developing on the Zynq (aka you guys). Plus, I’ve probably read LDD3 too
many times.
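
The page arithmetic behind that can be sketched in plain userspace C.
Given a buffer start address and size, the code below works out how many
pages the buffer touches and splits it into one (page, offset, length)
chunk per page. That is roughly the information a kernel module would
feed into scatterlist entries after get_user_pages. The 4096-byte page
size and the names sg_chunk, pages_spanned and build_chunks are
assumptions for illustration, not code from my kernel module.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096u /* assumed page size, not queried from the OS */

/* One per-page piece of the buffer (hypothetical layout). */
struct sg_chunk {
    uint64_t page_index; /* virtual page number the chunk lives in */
    uint32_t offset;     /* byte offset of the chunk inside that page */
    uint32_t length;     /* bytes of the buffer inside that page */
};

/* Number of pages touched by [start, start + size). */
size_t pages_spanned(uint64_t start, uint64_t size)
{
    if (size == 0)
        return 0;
    uint64_t first = start / SKETCH_PAGE_SIZE;
    uint64_t last = (start + size - 1) / SKETCH_PAGE_SIZE;
    return (size_t)(last - first + 1);
}

/* Split the buffer into per-page chunks. Writes at most max_chunks
 * entries to out and returns how many were written. */
size_t build_chunks(uint64_t start, uint64_t size,
                    struct sg_chunk *out, size_t max_chunks)
{
    size_t n = 0;
    uint64_t addr = start;
    uint64_t remaining = size;

    assert(out != NULL);
    while (remaining > 0 && n < max_chunks) {
        uint32_t off = (uint32_t)(addr % SKETCH_PAGE_SIZE);
        uint64_t room = SKETCH_PAGE_SIZE - off; /* bytes left in this page */
        uint32_t len = (uint32_t)(remaining < room ? remaining : room);

        out[n].page_index = addr / SKETCH_PAGE_SIZE;
        out[n].offset = off;
        out[n].length = len;

        addr += len;
        remaining -= len;
        n++;
    }
    return n;
}
```

Only the first chunk can have a non-zero offset and only the last can be
shorter than a page, so an unaligned buffer costs at most one extra
entry in the list.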

So that’s my recommendation for GNU Radio after three full months of
working on this. I guess after my last documentation update on Friday,
and of course the conference, I’ll be updating during the coproc calls.