Forum: NGINX nginx-lua and nginx upload module

strawman (Guest)
on 2011-11-26 08:37
(Received via mailing list)
Hi all,

I'm hoping to write an nginx-lua script which handles file uploads from
the nginx upload module.  I can do this fine using an external backend
and the upload_pass directive, but the lua module has no functionality
for parsing the multipart form data that this generates.  I've also
tried accessing the $upload_file_name, $upload_tmp_path etc. variables
generated by the upload module directly, but I don't seem to be able to
access these outside of the upload_set_form_field directive.

Is there any way to get the contents of those variables into a lua
handler in a form that can be easily parsed?

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,219085,219085#msg-219085
agentzh (Guest)
on 2011-11-26 09:08
(Received via mailing list)
On Sat, Nov 26, 2011 at 3:36 PM, strawman <nginx-forum@nginx.us> wrote:
> Hi all,
>
> I'm hoping to write an nginx-lua script which handles file uploads from
> the nginx upload module. I can do this fine using an external backend
> and the upload_pass directive, but the lua module has no functionality
> for parsing the multipart form data that this generates. I've also
> tried accessing the $upload_file_name, $upload_tmp_path etc. variables
> generated by the upload module directly, but I don't seem to be able to
> access these outside of the upload_set_form_field directive.
>

Well, there will be a streaming request body reading API for ngx_lua
soon. For instance:

    local data, err = ngx.req.recv_body_data()

Atop this, we can easily build a multipart streaming parser in pure
Lua and no longer rely on the ngx_upload module to read and process
the request body. (The UI of ngx_upload is not very friendly for Lua
scripting, and it would be technically difficult to make the two work
together without a lot of line noise.)
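
As a rough illustration of the first step such a pure-Lua multipart parser would need, here is a minimal sketch that pulls the boundary out of a Content-Type header value. The helper names get_boundary and delimiter_line are made up for this example and are not part of ngx_lua; the matching is case-sensitive and only covers the common quoted and unquoted forms.

```lua
-- Hypothetical helper (not an ngx_lua API): extract the multipart boundary
-- from a Content-Type header value. Returns nil when no boundary is present.
local function get_boundary(content_type)
    if type(content_type) ~= "string" then
        return nil
    end
    -- Quoted form first: boundary="..."
    local quoted = content_type:match(';%s*boundary="([^"]+)"')
    if quoted then
        return quoted
    end
    -- Unquoted form: runs until whitespace, ';' or ','
    return content_type:match(";%s*boundary=([^%s;,]+)")
end

-- Each part in the body is introduced by "--" followed by the boundary.
local function delimiter_line(boundary)
    return "--" .. boundary
end
```

A streaming parser would then scan each chunk of body data for delimiter_line(get_boundary(header)), carrying over any partial match between chunks.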

Someone is already willing to sponsor this work but I'm too busy to
implement this lately :P

Regards,
-agentzh
Nginx User (Guest)
on 2011-11-26 10:17
(Received via mailing list)
Attachment: form_parser_module.lua (20 KB)
On 26 November 2011 10:36, strawman <nginx-forum@nginx.us> wrote:
> Is there any way to get the contents of those variables into a lua
> handler in a form that can be easily parsed?

Hi There,

I hit the multipart form limitation of the ngx_lua module recently and
wrote a lua module to do the job.

Basically I combined a number of functions from various cgilua modules
and added a few of my own to create a single module.

I have attached a copy and the instructions for calling it are added
as comments at the top of the module.

It will handle both URL-encoded and multipart-encoded forms and return
the POST data as Lua key/value pairs.
agentzh (Guest)
on 2011-11-26 10:47
(Received via mailing list)
On Sat, Nov 26, 2011 at 5:16 PM, Nginx User <nginx@nginxuser.net> wrote:
>
> It will handle both url encoded and multipart encoded forms and return
> the post data as a lua key/value pairs
>

This approach certainly works, but it requires the request body to be
held completely in memory, which may be unacceptable for large file
uploads.
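
To make the trade-off concrete, here is a minimal sketch of the in-memory style of parsing. It is not the attached form_parser_module.lua (which is not reproduced here); split_multipart is a made-up name, and the function assumes the whole body and its boundary are already available as Lua strings, which is exactly the memory cost described above.

```lua
-- Hypothetical sketch: split a complete multipart/form-data body held in
-- memory into parts. Returns an array of { headers = string, body = string }.
-- Parts without a header block are skipped.
local function split_multipart(body, boundary)
    local delim = "--" .. boundary
    local parts = {}
    -- Plain-text find (4th argument true) so the boundary is not a pattern.
    local _, pos = body:find(delim, 1, true)
    while pos do
        -- A delimiter immediately followed by "--" closes the message.
        if body:sub(pos + 1, pos + 2) == "--" then
            break
        end
        -- The CRLF before the next delimiter belongs to the delimiter.
        local next_start, next_end = body:find("\r\n" .. delim, pos + 1, true)
        if not next_start then
            break -- truncated body: no further delimiter
        end
        -- Skip the CRLF that ends the current delimiter line.
        local chunk = body:sub(pos + 3, next_start - 1)
        local hs, he = chunk:find("\r\n\r\n", 1, true)
        if hs then
            parts[#parts + 1] = {
                headers = chunk:sub(1, hs - 1),
                body = chunk:sub(he + 1),
            }
        end
        pos = next_end
    end
    return parts
end
```

Every part's content is copied into a new Lua string here, so peak memory is roughly twice the body size; a streaming parser would instead write file parts to disk as they arrive.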

Regards,
-agentzh
Nginx User (Guest)
on 2011-11-26 11:07
(Received via mailing list)
On 26 November 2011 12:46, agentzh <agentzh@gmail.com> wrote:
> This approach certainly works but it requires the request body to be
> hold completely in memory, which may be unacceptable for big file
> uploading.

Correct. This is why there is a default 2 MB "maxinput" flag.

Hopefully a directive to handle multipart encoded forms will be added
to ngx_lua by the developer soon :)
Nginx User (Guest)
on 2011-11-26 11:17
(Received via mailing list)
Attachment: form_parser_module.lua (20 KB)
On 26 November 2011 12:16, Nginx User <nginx@nginxuser.net> wrote:
> I have attached a copy and the instructions for calling it are added
> as comments at the top of the module.
>
> It will handle both url encoded and multipart encoded forms and return
> the post data as a lua key/value pairs

Use this version instead.
strawman (Guest)
on 2011-11-26 14:11
(Received via mailing list)
Thanks for the swift replies. Unfortunately, I'm still having trouble
getting it to play nicely with the upload module.

Here's the config I'm using: http://pastebin.com/GsrYHwj0

It seems as though the request body isn't being read correctly;
ngx.req.get_body_data() works as expected if I run it from inside
/upload and make a multipart post to it, but it only returns the first
line of the body data when run from inside @handler.  echo_request_body
works fine either way.

Here's what I get when posting to it with the above config:

$ curl -F file=@TDXBN.gif http://localhost/upload
* post defs:
content_type: multipart/form-data;
boundary=----------------------------ff78acc43147
maxinput: 2097152
content_length: 358
maxfilesize: 1048576
args:

* post args:

* request body:
------------------------------ff78acc43147

On a slightly different tack, I noticed one of the release announcements
for ngx_echo showed this directive being used:

echo_subrequest PUT /some_upstream/$upload_file_md5 -f $upload_tmp_path

Is there something in particular you need to do to get access to
$upload_tmp_path etc. outside of the upload_set_form_field directive?
I've tried hacking this into working with echo_subrequest, but if I try
to use any of the $upload_* variables anywhere outside of that directive,
it doesn't work at all.

Posted at Nginx Forum:
http://forum.nginx.org/read.php?2,219085,219107#msg-219107
agentzh (Guest)
on 2011-11-26 14:16
(Received via mailing list)
On Sat, Nov 26, 2011 at 9:10 PM, strawman <nginx-forum@nginx.us> wrote:
>
Sigh. I really wish ngx_upload were implemented as a rewrite
or access phase handler rather than a content handler. Then we could
trivially combine it with content_by_lua in the same location, for
example, and would no longer need any kind of internal redirection
that clears everything, including nginx variables and module contexts.

Maybe Valery can work on an alternative UI for his ngx_upload module? ;)

Regards,
-agentzh
Nginx User (Guest)
on 2011-11-26 14:25
(Received via mailing list)
On 26 November 2011 16:16, agentzh <agentzh@gmail.com> wrote:
> Sigh. I really hope that ngx_upload could be implemented as a rewrite
> or access phase handler rather than a content handler, then we can
> trivially combine with content_by_lua in the same location, for
> example, and no longer need to do any kind of internal redirections
> that clear everything including nginx variables and module contexts.

For your info, I believe agentzh may be referring, obliquely, to the
restriction in the last line here:
http://wiki.nginx.org/HttpLuaModule#content_by_lua (which also applies
to content_by_lua_file).
Valery Kholodkov (Guest)
on 2011-11-26 21:14
(Received via mailing list)
The limitation on the phase that is used to process uploads at the
moment comes from nginx itself and not from the upload module.

If you describe what sort of API you need from the upload module, I will
be able to implement it.

--
Best regards,
Valery Kholodkov
Nginx User (Guest)
on 2011-11-26 22:18
(Received via mailing list)
On 26 November 2011 16:10, strawman <nginx-forum@nginx.us> wrote:
> echo_request_body works fine either way.

If it were possible to access echo_request_body from ngx_lua, that
could possibly solve your issue, as the form parser should be
able to use it with some modifications.
agentzh (Guest)
on 2011-11-26 23:09
(Received via mailing list)
Hi, Valery!

On Sun, Nov 27, 2011 at 4:14 AM, Valery Kholodkov
<valery+nginxen@grid.net.ru> wrote:
> The limitation on the phase that is used to process uploads at the moment comes
> from nginx itself and not from upload module.
>

Happily we no longer have such limitations for nginx 0.8.54+ :)

> If you describe what sort of API you need from upload module, I will be able to
> implement it.
>

Great! Just a quick example off the top of my head:

    location /upload {
        upload_in_access_phase;
        proxy_pass http://backend;
    }

That is, no longer introducing an internal location here for internal
redirections ;)

Thanks!
-agentzh
Nginx User (Guest)
on 2011-11-27 15:02
(Received via mailing list)
On 26 November 2011 10:36, strawman <nginx-forum@nginx.us> wrote:
> Is there any way to get the contents of those variables into a lua
> handler in a form that can be easily parsed?

Hi,

I have been intrigued by your question and have been looking at various
possible solutions, but then it struck me as odd that you would
want to do this in the first place.

To step back from a potential XY Problem situation: what exactly
are you trying to achieve?
Valery Kholodkov (Guest)
on 2011-11-28 10:46
(Received via mailing list)
----- agentzh <agentzh@gmail.com> wrote:
> Hi, Valery!
>
> On Sun, Nov 27, 2011 at 4:14 AM, Valery Kholodkov
> <valery+nginxen@grid.net.ru> wrote:
> > The limitation on the phase that is used to process uploads at the moment
> > comes from nginx itself and not from upload module.
> >
>
> Happily we no longer have such limitations for nginx 0.8.54+ :)

Explain?

>
> That is, no longer introducing an internal location here for internal
> redirections ;)

That doesn't look like an API. We are talking about being able to use
the accelerated uploads feature from Lua, aren't we?

--
Regards,
Valery Kholodkov
agentzh (Guest)
on 2011-11-28 10:52
(Received via mailing list)
On Mon, Nov 28, 2011 at 5:45 PM, Valery Kholodkov
<valery+nginxen@grid.net.ru> wrote:
>>
>> Happily we no longer have such limitations for nginx 0.8.54+ :)
>
> Explain?
>

Now we can happily read and process the request body in both rewrite
and access phase handlers in nginx 0.8.54+. And ngx_lua is already
doing that for both the lua_need_request_body config directive and the
"ngx.req.read_body()" Lua API.
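
For readers following along, here is a minimal sketch of what a handler built on that API might look like. The function name body_source is made up, and `ngx` is passed in as a parameter purely so the sketch can be exercised outside nginx with a stub; inside a real rewrite_by_lua or access_by_lua handler you would use the global `ngx` table. It relies on ngx.req.get_body_data() returning nil when the body was buffered to a temporary file, with ngx.req.get_body_file() giving the path, as I understand the ngx_lua documentation.

```lua
-- Hypothetical sketch: read the request body and report where it ended up.
-- `ngx` is injected only to keep the example testable outside nginx.
local function body_source(ngx)
    -- Reads the full request body without blocking the nginx event loop.
    ngx.req.read_body()

    -- Small bodies are kept in memory.
    local data = ngx.req.get_body_data()
    if data then
        return "memory", data
    end

    -- Larger bodies may have been spooled to a temporary file instead.
    local path = ngx.req.get_body_file()
    if path then
        return "file", path
    end

    return "empty", nil
end
```

A handler that only checks get_body_data() will see nothing for large uploads, so checking the file case as well is worth the extra lines.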

> Doesn't look like API. We are talking about being able to use accelerated
> uploads feature from lua, aren't we?
>

No, I think we're talking about making ngx_upload easier to use with
other nginx modules, especially those that register a content
handler, like ngx_lua, ngx_echo, ngx_proxy, and ngx_fastcgi.

The whole point here is to make ngx_upload read and process the
request body in an earlier phase like "rewrite" and "access" phases,
such that we can preserve the content handler for other modules and
can eliminate internal redirects altogether to reduce runtime cost.

Sorry for the confusion in my previous emails :)

Thanks!
-agentzh
Valery Kholodkov (Guest)
on 2011-11-28 11:22
(Received via mailing list)
----- agentzh <agentzh@gmail.com> wrote:
> doing that for both the lua_need_request_body config directive and the
> "ngx.req.read_body()" Lua API.

I think we've been able to do that even before. But that's not the point.
The problem is not which particular phase the upload module runs in, but
that it must run whenever some module tries to read the request body.
This kind of customisation is not supported by nginx yet.

> >
> can eliminate internal redirects altogether to reduce runtime cost.
>
> Sorry for the confusions in my previous emails :)

Again, the point is not in what particular phase it will run... And by
the way, the costs of internal redirects are negligible.

--
Regards,
Valery Kholodkov
agentzh (Guest)
on 2011-11-29 07:01
(Received via mailing list)
On Mon, Nov 28, 2011 at 6:22 PM, Valery Kholodkov
<valery+nginxen@grid.net.ru> wrote:
>
> I think we've been able to do that even before.

Sadly for nginx 0.8.41 ~ 0.8.53, "IO interruptions" are explicitly
prohibited in rewrite phase handlers and for exactly the same reason
your ngx_eval module does not work at all for these versions of Nginx
:)

> But that's not point. The problem is not in what particular phase upload module
> will run, but that it must run whenever some module tries to read the request
> body. This kind of customisation is not supported by nginx yet.
>

True. We definitely need an input filter mechanism for Nginx. I've
cc'd Andrew Alexeev. Maybe he can help make this happen in the Nginx
core ;)

>
> Again, the point is not in what particular phase it will run... And by the way,
> the costs of internal redirects are negligible.
>

Not really if the user has a lot of regex style locations defined in
his nginx.conf. For now, the nginx core does not (yet) utilize the
PCRE JIT feature for its pattern matching, so it does consume quite a
few CPU cycles for complicated location patterns ;) And...if we *can*
eliminate the cost of internal redirects altogether, why shouldn't we
make that happen? ;)

Also, I believe it would simplify the user config file greatly and
avoid variable scope problems (like the scope of $uri and $args). I know
there is an upload_pass_args directive to forward the original URI
query args, but if the $args scope issue did not exist in the first
place, we would not need this config directive at all.

BTW, I noticed that if ngx_upload's special variables like
$upload_file_md5 are used in the wrong context, it will simply crash the
nginx worker (segmentation fault) due to a null module context pointer
access (at least in the latest 2.2.0 release). Maybe it would be more
reasonable to check the ctx pointer and print a helpful error
message in functions like ngx_http_upload_md5_variable?

Best,
-agentzh
Valery Kholodkov (Guest)
on 2011-11-29 11:30
(Received via mailing list)
----- agentzh <agentzh@gmail.com> wrote:
> > But that's not point. The problem is not in what particular phase upload
> > module will run, but that it must run whenever some module tries to read the
> > request body. This kind of customisation is not supported by nginx yet.
> Not really if the user has a lot of regex style locations defined in
> his nginx.conf. For now, the nginx core does not (yet) utilize the
> PCRE JIT feature for its pattern matching, so it does consume quite a
> few CPU cycles for complicated location patterns ;) And...if we *can*
> eliminate the cost of internal redirects altogether, why shouldn't we
> make that happen? ;)

Did you mean "quite a lot"?

> access (at least for the latest 2.2.0 release).
I'll add a check for that.

> Maybe it's more
> reasonable to check the ctx pointer and print out a helpful error
> message in functions like ngx_http_upload_md5_variable?

--
Regards,
Valery Kholodkov
agentzh (Guest)
on 2011-11-29 11:50
(Received via mailing list)
On Tue, Nov 29, 2011 at 6:29 PM, Valery Kholodkov wrote:
>>
>> Not really if the user has a lot of regex style locations defined in
>> his nginx.conf. For now, the nginx core does not (yet) utilize the
>> PCRE JIT feature for its pattern matching, so it does consume quite a
>> few CPU cycles for complicated location patterns ;) And...if we *can*
>> eliminate the cost of internal redirects altogether, why shouldn't we
>> make that happen? ;)
>
> Did you mean "quite a lot"?
>

I didn't say it costs quite a lot in real-world settings. I just
said we should avoid wasting CPU cycles wherever possible (just like
what Marcus Clyne originally told me two years ago) :)

Regards,
-agentzh
Andrew Alexeev (Guest)
on 2011-11-29 15:59
(Received via mailing list)
Hi,

>> But that's not point. The problem is not in what particular phase upload module
>> will run, but that it must run whenever some module tries to read the request
>> body. This kind of customisation is not supported by nginx yet.
>>
>
> True. We definitely need an input filter mechanism for Nginx. I've
> cc'd Andrew Alexeev. Maybe he can help make this happen in the Nginx
> core ;)

I'm not a programmer (at least not anymore), so I personally can't
implement this :) Input filters are planned in the next major release of
nginx, where the core architecture would allow that.

> his nginx.conf. For now, the nginx core does not (yet) utilize the
> PCRE JIT feature for its pattern matching, so it does consume quite a

Btw, we have an almost ready implementation for PCRE JIT and it'll soon
be released in the dev train.
Valery Kholodkov (Guest)
on 2011-11-29 16:51
(Received via mailing list)
----- Andrew Alexeev <andrew@nginx.com> wrote:
> > True. We definitely need an input filter mechanism for Nginx. I've
> > cc'd Andrew Alexeev. Maybe he can help make this happen in the Nginx
> > core ;)
>
> I'm not a programmer (at least not anymore), so I personally can't implement
> this :) Input filters are planned in the next major release of nginx, where the
> core architecture would allow that.

I've been waiting for that for 5 years already. A couple more years is
quite okay for me.

--
Regards,
Valery Kholodkov