Nginx - Google Summer of Code ideas

Hi everyone,

I’m organizing some Google Summer of Code ideas for our organization and
nginx could play a very interesting and key role in one of them. So
two quick questions.

  1. Would anyone be willing to mentor a student?
  2. What are some other ideas that we could possibly add to the list to
    make it exciting?

We have about a day or so to collect ideas and add stuff to the
application if it makes sense. I’m on irc if anyone wants to say hi and
talk about ideas.

Cheers,

./Christopher

codestr0m - irc.freenode.net #osunix / #gsoc / nginx

oh, i’ve got a nice little wishlist going :wink:

features:

  • mod_svn
  • mogilefs integration:
    ** engine yard would pay a bounty for this (as they’ve said)
    ** basically just needs to interface with a tracker to say “where is
    this file based on this URI” and then dynamically proxy to the
    mogstored server. the dynamic proxy piece is either complete now or
    would be a very small tweak I am thinking
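
A rough sketch of the "ask the tracker where the file is" step, in Python for brevity. It assumes the tracker answers a `get_paths` request with a single line of the form `OK paths=N&path1=...&path2=...` and reports failures with a line starting with `ERR`; that wire format and the helper name `parse_get_paths` are assumptions made for illustration and should be checked against the MogileFS documentation before relying on them.

```python
from urllib.parse import parse_qs


def parse_get_paths(line):
    """Extract storage URLs from one tracker reply line (assumed format)."""
    status, _, rest = line.strip().partition(" ")
    if status != "OK":
        # Anything other than OK (for example an ERR line) yields no paths.
        return []
    fields = parse_qs(rest)
    count = int(fields.get("paths", ["0"])[0])
    # Collect path1..pathN in order, skipping any index the reply omitted.
    return [
        fields[f"path{i}"][0]
        for i in range(1, count + 1)
        if f"path{i}" in fields
    ]
```

The first URL returned would then be the target of the dynamic proxy step; falling back to the next URL on failure is left out here.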

using nginx as a layer 7 / gzip / ssl capable proxy:
I’ve used it this way in the past but when upstreams went down it was
not very graceful. The only idea I had was to trigger a command to
remove the upstream from the pool until it came back again.

  • dynamic upstream management, basic healthchecking should fix that, I
    think.
  • external alert when upstream goes down (or could also trigger a
    command) - might not be required with above
  • fair scheduler (might not need it) but what about weighted least
    connections? (emulate lvs functionality)
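
To make the last two wishes concrete, here is a minimal sketch of weighted least connections combined with a health flag, written in Python rather than as an nginx module. The `Upstream` class and `pick` function are hypothetical names; the selection rule (lowest active-connections-to-weight ratio among healthy servers) is only in the spirit of what LVS does, not a copy of it.

```python
from dataclasses import dataclass


@dataclass
class Upstream:
    name: str
    weight: int = 1       # relative capacity, must be >= 1
    active: int = 0       # requests currently in flight
    healthy: bool = True  # would be flipped by a health checker


def pick(upstreams):
    """Return the healthy upstream with the lowest active/weight ratio."""
    candidates = [u for u in upstreams if u.healthy]
    if not candidates:
        return None  # nothing to proxy to; caller decides how to fail
    # min() keeps the first candidate on ties, so list order breaks ties.
    return min(candidates, key=lambda u: u.active / u.weight)
```

A health checker only has to set `healthy = False` to take a server out of rotation and `True` to bring it back, which is the "remove the upstream from the pool until it came back" behaviour described above.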

authentication:

  • SPNEGO/Kerberos/etc. integration - (I’m paying someone $400 w/
    RentACoder already)
  • someone on IRC mentioned adding in digest support

logging:

  • ability to error_log off; or related in any area (server, location,
    etc) - i don’t think it supports this right now
  • allowing for error_log to be overridden anywhere would allow for
    debugging based not on debug_connection or a global error_log foo
    debug; but would allow for specific debugging based JUST on the host
    you want
  • perhaps a different debug output, sometimes it’s a bit too much.
    like a more simplified debug log (look how lighttpd does it - it
    allows you to figure out how it’s handling the file and such) - right
    now nginx’s can be overkill
  • have a way to log statistics (simple byte and request counter) per
    Host: header, right now flatfile parsing and stuff for that basic info
    is not the funnest
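
For comparison, the flat-file approach being complained about looks roughly like this. The sketch assumes a custom access log whose lines are `$host $status $body_bytes_sent` separated by spaces; that layout and the function name `tally` are assumptions for illustration, not a format nginx uses by default.

```python
from collections import defaultdict


def tally(lines):
    """Sum requests and body bytes per Host from 'host status bytes' lines."""
    stats = defaultdict(lambda: {"requests": 0, "bytes": 0})
    for line in lines:
        parts = line.split()
        # Skip anything that does not match the assumed three-field layout.
        if len(parts) != 3 or not parts[2].isdigit():
            continue
        host, _status, sent = parts
        stats[host]["requests"] += 1
        stats[host]["bytes"] += int(sent)
    return dict(stats)
```

This is easy for one server, but it has to be run over every log from every webserver and the results merged, which is the overhead a built-in per-host counter would avoid.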

management:

  • cluster manager (manage multiple nginx instances, like zeus’ web
    interface)

should also note

On Tue, Mar 10, 2009 at 11:01 AM, mike [email protected] wrote:

oh, i’ve got a nice little wishlist going :wink:

  1. Would anyone be willing to mentor a student?

igor (of course), maxim, valery all seem to be pretty good hackers.
but probably don’t have time.

this would be neat because we’d have more nginx module/code hackers
trained for future bug fixing/features/modules…

  1. What are some other ideas that we could possibly add to the list to
    make it exciting?

well, if we do look into doing adaptive process spawning and need an
nginx module to communicate how many backends are in use (to
fcgiwrap/spawn-fcgi/whatever) there needs to be a module done on the
nginx side to handle that.

mike wrote:

this would be neat because we’d have more nginx module/code hackers
trained for future bug fixing/features/modules…

Hi Mike

yes exactly my point!

I have a rather lengthy write-up on some ideas about dtrace,
communication layer and near real time algorithm adjustments for
balancing a large cluster.

If you or any of the possible mentors are on irc I think this will 1)
make it easier to balance mentoring and 2) speed up communication for
planning. Also feel free to email me off list so we can speed up
planning and cause less list noise.

./Christopher

C. wrote:

Hi everyone,

I’m organizing some google summer of code ideas for our organization and
nginx could play a very interesting and key role in one of them. So
two quick questions.

  1. Would anyone be willing to mentor a student?

Depends on how many hours per week you need for this. I could handle 5
to 10 hours I think.

Valery K. wrote:

Depends on how many hours per week you need for this. I could handle 5
to 10 hours I think.

I think that’s more than enough. The key is a couple things like any
project…

  1. Having a very detailed plan to avoid questions in the first place
  2. Selecting students who are qualified
  3. Being on irc, but letting them know not to expect instant replies…
    If more than one mentor is around it can balance out well like any
    normal dev channel.

#osunix is what I’m using for all gsoc stuff now and currently very
quiet, but some students may pop in and ask questions. It’s also a good
chance to get to know them and bring them into the community a bit more.
This way the gsoc not only goes well, but hopefully they stay as
contributors.

./C

C. wrote:

Depends on how many hours per week you need for this. I could handle 5
to 10 hours I think.

I think that’s more than enough. The key is a couple things like any
project…

  1. Having a very detailed plan to avoid questions in the first place
  2. Selecting students who are qualified

Be more specific: plan for what and qualified for what?

  1. Being on irc, but letting them know not to expect instant replies…
    If more than one mentor is around it can balance out well like any
    normal dev channel.

Well, I have been on irc for a while, but then I forgot to restore irc
account in my client because of general laziness. After your mail I’m
back there.

But generally I’m available by email. I try to respond to every
question and fortunately there are not many of them. If someone feels I
don’t respond, alert me on the mailing list or by phone.

Sorry for the messy email I’m in a lab with a crappy keyboard that is
too hard to use properly and the mouse is basically useless :stuck_out_tongue:

On Wed, Mar 11, 2009 at 10:50 AM, Manlio P.
[email protected] wrote:

mike wrote:

oh, i’ve got a nice little wishlist going :wink:

features:

  • mod_svn

The very idea of implementing mod_svn from scratch in Nginx is crazy :).

why is that? isn’t it just DAV? it just needs to support some more
OPTIONS commands or something? (I could be off my rocker, I thought I
read that somewhere)

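This does not need to be done with Nginx.
Just use a pre existing healthchecking software with each of the
upstream servers.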
and what, have an /etc/nginx/upstreams.conf file that i manually
update and kill -HUP nginx or whatever appropriate signal to reload
every time i notice an upstream going up or down?

another issue is nginx is not ‘smart’ so the healthchecking would need
to be more than just tcp port 80 is open… i guess that’s where
external things come in to play. but simplifying the software stack
would be amazing, and it could be an -optional- module in nginx :slight_smile:

If someone is interested in sponsoring digest auth support, I should be
able to implement it.
I have already implemented a rather complete HTTP Digest Authentication
support in my Python WSGI framework.

I don’t care as much about digest myself but it was something someone
brought up. This is open source software too. Doesn’t anyone do
anything anymore just for the ego boost? :slight_smile: I know I would be if I
knew some C!

[…]

  • have a way to log statistics (simple byte and request counter) per
    Host: header, right now flatfile parsing and stuff for that basic info
    is not the funnest

Why not use a separate log parser?

i am right now. but for basic statistics… seems like major overkill,
especially when you start looking at gigs of data from multiple
webservers and having to parse/merge the data

management:

  • cluster manager (manage multiple nginx instances, like zeus’ web
    interface)

Isn’t it possible to use some existing tool?

[…]

like cpanel? etc? all of those suck. have you ever used zeus’s? it has
all the capabilities of the webserver available. it needs to be a tool
designed around nginx’s capabilities and such. not just “setup foo.com
under /home/foo/bar/web”

Hi,

On Mar 10, 2009, at 12:50 , C. wrote:

make it exciting?
Here’s two ideas/suggestions from my wish list:

1: Nginx should keepalive connections against proxied backends
This could/should apply to all kinds of backends ranging from http to
FastCGI. While digging into backend unification, why not enable
memcached as a “standard” backend instead of a third party module (of
course compile-time option)? I recall seeing proof of concepts on this
at this mailing list.

2: Nginx proxy store should be purgeable
The most fundamental action for a proper proxy cache. One could dig
deeper into this and enable more advanced rules for invalidation by
looking at ncache or other proxy servers such as varnish. I’m not
suggesting Nginx to act as a “full fledged” proxy - only cover the
basic needs.

Here’s my wish:

Implement “init master” hook for module development

mike wrote:

oh, i’ve got a nice little wishlist going :wink:

features:

  • mod_svn

The very idea of implementing mod_svn from scratch in Nginx is crazy :).

  • mogilefs integration:
    […]
  • dynamic upstream management, basic healthchecking should fix that, I think.
  • external alert when upstream goes down (or could also trigger a
    command) - might not be required with above

This does not need to be done with Nginx.
Just use a pre existing healthchecking software with each of the
upstream servers.

[…]
authentication:

  • SPNEGO/Kerberos/etc. integration - (I’m paying someone $400 w/
    RentACoder already)
  • someone on IRC mentioned adding in digest support

If someone is interested in sponsoring digest auth support, I should be
able to implement it.
I have already implemented a rather complete HTTP Digest Authentication
support in my Python WSGI framework.
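
For readers unfamiliar with what digest support involves, the core of it is a hash computation over the credentials and request. Below is a minimal sketch of the `response` value for `qop=auth` as described in RFC 2617; the function names are made up for this example, and a real server would also have to generate and expire nonces and track the nonce count.

```python
import hashlib


def _md5_hex(text):
    """Hex MD5 of a string, as used throughout RFC 2617."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def digest_response(username, realm, password, method, uri,
                    nonce, nc, cnonce, qop="auth"):
    """Compute the Digest 'response' field for qop=auth."""
    ha1 = _md5_hex(f"{username}:{realm}:{password}")  # identifies the user
    ha2 = _md5_hex(f"{method}:{uri}")                 # identifies the request
    return _md5_hex(f"{ha1}:{nonce}:{nc}:{cnonce}:{qop}:{ha2}")
```

The server recomputes this value from its stored credentials and compares it with what the client sent, so the password itself never crosses the wire.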

[…]

  • have a way to log statistics (simple byte and request counter) per
    Host: header, right now flatfile parsing and stuff for that basic info
    is not the funnest

Why not use a separate log parser?

management:

  • cluster manager (manage multiple nginx instances, like zeus’ web interface)

Isn’t it possible to use some existing tool?

[…]

Manlio P.

Hello!

On Wed, Mar 11, 2009 at 10:36:41PM +0100, Johan Bergström wrote:

  1. Would anyone be willing to mentor a student?
  2. What are some other ideas that we could possibly add to the list to
    make it exciting?

Here’s two ideas/suggestions from my wish list:

1: Nginx should keepalive connections against proxied backends
This could/should apply to all kinds of backends ranging from http to
FastCGI. While digging into backend unification, why not enable

Keepalive to memcached backends works perfectly with
ngx_http_upstream_keepalive module.

Making it work with http & fastcgi backends requires some
non-trivial modifications in proxy/fastcgi/upstream modules and
nginx core. I have some work-in-progress for this, but it’s not
yet complete.

memcached as a “standard” backend instead of a third party module (of
course compile-time option)? I recall seeing proof of concepts on this
at this mailing list.

Memcached is a standard backend, not a “third party module”. And
it shares most of its code with http backends (proxy_pass) and
fastcgi backends (fastcgi_pass).

Maxim D.

Digest Authentication?

I have implemented a simple token module
(http://www.libing.name/2009/03/11/nginx-token-module.html),
used for HTTP authentication with a memcached backend. Maybe that is
helpful for you.

I would like to see Digest Authentication in Nginx.

C. wrote:

张立冰 <zhang.libing@…> writes:

Digest Authentication?

I have Implemented a simple token module, used for http authentication with
backend memcached. Maybe that is helpful for you.

Hi zhang,
really interesting to see your module; the idea is exactly what I’ve been
trying to do these past 2 weeks.
What I see after checking your code is that:

  1. If the token is invalid, it returns 403, otherwise 404.
  2. When the request is valid (the token exists), it returns 404 and
    then does an internal redirect to the ‘real’ place => if someone knows
    the URL of the real place, he can access it without any authentication.

I have developed a similar module (in fact I see that we have 80% of the
same code), but the process is a little different. Can you take a look at
these threads so we can work on it together :slight_smile:

http://thread.gmane.org/gmane.comp.web.nginx.english/10008
http://thread.gmane.org/gmane.comp.web.nginx.english/10379

Hi, Huy Phan.

Wow, we did the same job.

  1. When the request is valid (the token exists), it returns 404 and
    then does an internal redirect to the ‘real’ place => if someone knows
    the URL of the real place, he can access it without any authentication.

Yes, I hook the HTTP status 404 and redirect to the ‘real’ place at the
backend. BUT this place must be on the intranet or on the same server,
such as 127.0.0.1, which is a trusted environment. :slight_smile:

I have placed a simple conf on this page
(http://www.libing.name/2009/03/11/nginx-token-module.html).

http://thread.gmane.org/gmane.comp.web.nginx.english/10008

http://thread.gmane.org/gmane.comp.web.nginx.english/10379

I have checked those two posts. It seems you want a module to do the
access check for media files (/v/empty.flv?token=1234).
And I think there is no need for memcached to store the check token. Maybe
you can work with the http access key module
(http://www.nginx-community.org/NginxHttpAccessKeyModule)
and mod_parsed_vars (http://hg.mperillo.ath.cx/nginx/mod_parsed_vars) to
generate a dynamic token from COOKIE/GET/POST vars, just like a
session id. And that is better than working with memcached.

张立冰 <zhang.libing@…> writes:

I have checked those two posts. It seems you want a module to do the access
check for media files (/v/empty.flv?token=1234). And I think there is no need
for memcached to store the check token. Maybe you can work with the http
access key module and mod_parsed_vars to generate a dynamic token from
COOKIE/GET/POST vars, just like a session id. And that is better than
working with memcached.

Hi 张立冰,

Actually, this is the second phase of our flow. The first phase is:
i. the user requests to play a media file
ii. we insert a value under a random key (generated by us) into memcached
iii. we give the key to the user

And you already know the second phase :slight_smile:
What I mean is that memcached is a “must-have” here.
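
The two phases can be sketched as follows. This is only an illustration: an in-memory dictionary stands in for memcached, the `TokenStore` class and its method names are hypothetical, and binding each token to one media path with a time limit is an assumption added here, not something stated in the thread.

```python
import secrets
import time


class TokenStore:
    """In-memory stand-in for memcached holding short-lived access tokens."""

    def __init__(self, ttl=60.0, clock=time.monotonic):
        self._ttl = ttl
        self._clock = clock
        self._entries = {}  # token -> (media path, expiry time)

    def issue(self, media):
        """Phase 1: store a value under a random key and hand the key out."""
        token = secrets.token_hex(16)
        self._entries[token] = (media, self._clock() + self._ttl)
        return token

    def check(self, token, media):
        """Phase 2: allow the request only if the token is known and fresh."""
        entry = self._entries.get(token)
        if entry is None:
            return False
        stored_media, deadline = entry
        if self._clock() >= deadline:
            del self._entries[token]  # expired tokens are dropped on sight
            return False
        return stored_media == media
```

In the module under discussion, phase 2 would run inside nginx for requests such as `/v/empty.flv?token=1234`, answering 403 when `check` fails.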

As you can see in my previous posts, I’ve almost finished the code. It just
needs some small tweaks to be perfect :). So if you have time, I can share
the code and we can work on it.

And here is a post about this topic:
http://www.libing.name/2008/11/23/nginx-application-access.html
I’m sorry, it is in Chinese. But I think the script on that page will be
helpful for you.

2009/3/12 张立冰 [email protected]

As mentioned on IRC.

  1. sticky sessions/session affinity
  2. ajp13 connector for java web servers
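
One simple way to get the session affinity in item 1 is to hash a session identifier onto the backend list, sketched below. The function name and the choice of SHA-1 are assumptions for illustration; note that plain modulo hashing reshuffles most sessions whenever the list of backends changes, which a real implementation would probably want to avoid.

```python
import hashlib


def sticky_backend(session_id, backends):
    """Map a session id to the same backend on every request."""
    if not backends:
        raise ValueError("no backends configured")
    digest = hashlib.sha1(session_id.encode("utf-8")).digest()
    # Use the first 8 bytes of the hash as an integer index into the list.
    index = int.from_bytes(digest[:8], "big") % len(backends)
    return backends[index]
```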

./C

that is not exactly the same, is it? (although it is very useful)