Nginx-lua and nginx upload module

Hi all,

I’m hoping to write an nginx-lua script which handles file uploads from
the nginx upload module. I can do this fine using an external backend
and the upload_pass directive, but the lua module has no functionality
for parsing the multipart form data that this generates. I’ve also
tried accessing the $upload_file_name, $upload_tmp_path etc. variables
generated by the upload module directly, but I don’t seem to be able to
access these outside of the upload_set_form_field directive.

Is there any way to get the contents of those variables into a lua
handler in a form that can be easily parsed?

On Sat, Nov 26, 2011 at 3:36 PM, strawman [email protected] wrote:

Hi all,

I’m hoping to write an nginx-lua script which handles file uploads from
the nginx upload module. I can do this fine using an external backend
and the upload_pass directive, but the lua module has no functionality
for parsing the multipart form data that this generates. I’ve also
tried accessing the $upload_file_name, $upload_tmp_path etc. variables
generated by the upload module directly, but I don’t seem to be able to
access these outside of the upload_set_form_field directive.

Well, there will be a streaming request body reading API for ngx_lua
soon. For instance:

local data, err = ngx.req.recv_body_data()

Atop this, we can easily build a multipart streaming parser in pure
Lua and no longer rely on the ngx_upload module to read and process
the request body. (The UI of ngx_upload is not very friendly for Lua
scripting, and it would be technically difficult to make them work
together without a lot of line noise.)
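
For reference, here is a minimal sketch of the kind of pure-Lua multipart parser described above. It is not streaming: it assumes the whole body is already available as one Lua string, since the recv_body_data() call above is only a planned API at this point. The names get_boundary and parse_multipart are made up for illustration and are not part of ngx_lua or ngx_upload.

```lua
-- Sketch only: get_boundary and parse_multipart are hypothetical helper
-- names, not functions provided by ngx_lua or ngx_upload.

-- Pull the boundary parameter out of a Content-Type header value.
local function get_boundary(content_type)
    return content_type:match('boundary="([^"]+)"')
        or content_type:match("boundary=([^;%s]+)")
end

-- Split a complete multipart/form-data body (one Lua string) into parts.
-- Returns an array of { name, filename, headers, body } tables.
local function parse_multipart(body, boundary)
    local parts = {}
    local delim = "--" .. boundary
    local pos = body:find(delim, 1, true)        -- plain find, no patterns
    while pos do
        local after = pos + #delim
        if body:sub(after, after + 1) == "--" then
            break                                -- closing delimiter reached
        end
        local next_pos = body:find("\r\n" .. delim, after, true)
        if not next_pos then
            break                                -- truncated body
        end
        -- Skip the CRLF after the delimiter; stop before the CRLF that
        -- precedes the next delimiter.
        local section = body:sub(after + 2, next_pos - 1)
        local head_end = section:find("\r\n\r\n", 1, true)
        if head_end then
            local headers = {}
            for line in section:sub(1, head_end - 1):gmatch("[^\r\n]+") do
                local k, v = line:match("^([^:]+):%s*(.*)$")
                if k then headers[k:lower()] = v end
            end
            local disp = headers["content-disposition"] or ""
            parts[#parts + 1] = {
                name = disp:match(';%s*name="([^"]*)"'),
                filename = disp:match(';%s*filename="([^"]*)"'),
                headers = headers,
                body = section:sub(head_end + 4),
            }
        end
        pos = next_pos + 2                       -- start of the next delimiter
    end
    return parts
end

-- Usage with a hand-built body (CRLF line endings are required):
local ctype = "multipart/form-data; boundary=XyZ"
local body = table.concat({
    "--XyZ",
    'Content-Disposition: form-data; name="file"; filename="a.txt"',
    "Content-Type: text/plain",
    "",
    "hello",
    "--XyZ--",
    "",
}, "\r\n")
for _, part in ipairs(parse_multipart(body, get_boundary(ctype))) do
    print(part.name, part.filename, #part.body)
end
```

A streaming version would have to feed chunks into the same delimiter search and carry over any partial boundary match between chunks; that is left out here to keep the sketch short.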

Someone is already willing to sponsor this work, but I’ve been too
busy to implement it lately :stuck_out_tongue:

Regards,
-agentzh

On 26 November 2011 10:36, strawman [email protected] wrote:

Is there any way to get the contents of those variables into a lua
handler in a form that can be easily parsed?

Hi There,

I hit the multipart form limitation of the ngx_lua module recently and
wrote a lua module to do the job.

Basically I combined a number of functions from various cgilua modules
and added a few of my own to create a single module.

I have attached a copy and the instructions for calling it are added
as comments at the top of the module.

It will handle both URL-encoded and multipart-encoded forms and return
the post data as Lua key/value pairs.
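
As a rough illustration of what "post data as Lua key/value pairs" means for the URL-encoded case, here is a small pure-Lua sketch. It is not the attached module; url_decode and parse_urlencoded are names invented here, and it assumes the whole body is already in memory as a string.

```lua
-- Sketch only: url_decode and parse_urlencoded are hypothetical names.

-- Decode one application/x-www-form-urlencoded component.
local function url_decode(s)
    s = s:gsub("%+", " ")
    s = s:gsub("%%(%x%x)", function(hex)
        return string.char(tonumber(hex, 16))
    end)
    return s
end

-- Turn "a=1&b=two+words" into { a = "1", b = "two words" }.
-- Repeated keys keep the last value; a fuller parser might collect them.
local function parse_urlencoded(body)
    local args = {}
    for pair in body:gmatch("[^&]+") do
        local k, v = pair:match("^([^=]*)=?(.*)$")
        args[url_decode(k)] = url_decode(v)
    end
    return args
end

-- Usage:
local args = parse_urlencoded("name=strawman&note=hello+world%21")
print(args.name, args.note)
```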

On Sat, Nov 26, 2011 at 5:16 PM, Nginx U. [email protected] wrote:

It will handle both url encoded and multipart encoded forms and return
the post data as a lua key/value pairs

This approach certainly works, but it requires the request body to be
held completely in memory, which may be unacceptable for large file
uploads.

Regards,
-agentzh

On 26 November 2011 12:16, Nginx U. [email protected] wrote:

I have attached a copy and the instructions for calling it are added
as comments at the top of the module.

It will handle both url encoded and multipart encoded forms and return
the post data as a lua key/value pairs

Use this version instead

On 26 November 2011 12:46, agentzh [email protected] wrote:

This approach certainly works but it requires the request body to be
hold completely in memory, which may be unacceptable for big file
uploading.

Correct. This is why there is a default 2 MB “maxinput” flag.

Hopefully a directive to handle multipart encoded forms will be added
to ngx_lua by the developer soon :slight_smile:

Thanks for the swift replies. Unfortunately, I’m still having trouble
getting it to play nicely with the upload module.

The config I’m using is a truncated vhost config posted on
Pastebin.com; it begins “server { location /upload { upload_pass”.

It seems as though the request body isn’t being read correctly;
ngx.req.get_body_data() works as expected if I run it from inside
/upload and make a multipart post to it, but it only returns the first
line of the body data when run from inside @handler. echo_request_body
works fine either way.

Here’s what I get when posting to it with the above config:

$ curl -F [email protected] http://localhost/upload

  • post defs:
    content_type: multipart/form-data; boundary=----------------------------ff78acc43147
    maxinput: 2097152
    content_length: 358
    maxfilesize: 1048576
    args:

  • post args:

  • request body:
    ------------------------------ff78acc43147

On a slightly different tack, I noticed one of the release announcements
for ngx_echo showed this directive being used:

echo_subrequest PUT /some_upstream/$upload_file_md5 -f $upload_tmp_path

Is there something in particular you need to do to get access to
$upload_tmp_path etc. outside of the upload_set_form_field directive?
I’ve tried hacking this into working with echo_subrequest, but if I try
to use any of the $upload_* variables anywhere outside of that
directive, it doesn’t work at all.

On 26 November 2011 16:16, agentzh [email protected] wrote:

Sigh. I really hope that ngx_upload could be implemented as a rewrite
or access phase handler rather than a content handler, then we can
trivially combine with content_by_lua in the same location, for
example, and no longer need to do any kind of internal redirections
that clear everything including nginx variables and module contexts.

For your info, I believe agentzh may be referring, obliquely, to the
restriction in the last line here:
Lua | NGINX (which also applies to content_by_lua_file).

On Sat, Nov 26, 2011 at 9:10 PM, strawman [email protected] wrote:

Sigh. I really hope that ngx_upload could be implemented as a rewrite
or access phase handler rather than a content handler, then we can
trivially combine with content_by_lua in the same location, for
example, and no longer need to do any kind of internal redirections
that clear everything including nginx variables and module contexts.

Maybe Valery can work on an alternative UI for his ngx_upload module? :wink:

Regards,
-agentzh

On 26 November 2011 16:10, strawman [email protected] wrote:

echo_request_body works fine either way.

If it were possible to access echo_request_body from ngx_lua, that
could possibly solve your issue, as the form parser should be
able to use it with some modifications.

The limitation on the phase that is used to process uploads at the
moment comes from nginx itself and not from the upload module.

If you describe what sort of API you need from the upload module, I will
be able to implement it.


Best regards,
Valery K.

Hi, Valery!

On Sun, Nov 27, 2011 at 4:14 AM, Valery K.
[email protected] wrote:

The limitation on the phase that is used to process uploads at the moment comes
from nginx itself and not from upload module.

Happily we no longer have such limitations for nginx 0.8.54+ :slight_smile:

If you describe what sort of API you need from upload module, I will be able to
implement it.

Great! Just a quick example off the top of my head:

location /upload {
    upload_in_access_phase;
    proxy_pass http://backend;
}

That is, we would no longer need to introduce an internal location here
for internal redirects :wink:

Thanks!
-agentzh

On 26 November 2011 10:36, strawman [email protected] wrote:

Is there any way to get the contents of those variables into a lua
handler in a form that can be easily parsed?

Hi,

I have been intrigued by your question and have been looking at various
possible solutions, but then it struck me as odd that you would want
to do this in the first place.

To step back from a potential XY Problem situation: what exactly are
you trying to achieve?

On Mon, Nov 28, 2011 at 5:45 PM, Valery K.
[email protected] wrote:

Happily we no longer have such limitations for nginx 0.8.54+ :slight_smile:

Explain?

Now we can happily read and process the request body in both rewrite
and access phase handlers in nginx 0.8.54+. And ngx_lua is already
doing that for both the lua_need_request_body config directive and the
“ngx.req.read_body()” Lua API.
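
A minimal sketch of that Lua API in use, for example from a script run in the access phase. ngx.req.read_body(), ngx.req.get_body_data() and ngx.req.get_body_file() are ngx_lua calls; the wrapper name read_request_body is made up here, and the ngx table is passed in as a parameter only so the function can be exercised outside nginx.

```lua
-- Sketch only: read_request_body is a hypothetical wrapper name.
-- Inside nginx you would call it as read_request_body(ngx).
local function read_request_body(ngx)
    ngx.req.read_body()                    -- read the body in this phase
    local data = ngx.req.get_body_data()   -- nil if the body is not in memory
    if data then
        return data
    end
    -- Larger bodies may have been spooled to a temporary file instead.
    local path = ngx.req.get_body_file()
    if not path then
        return nil, "no request body"
    end
    local f, err = io.open(path, "rb")     -- note: blocking file I/O
    if not f then
        return nil, err
    end
    data = f:read("*a")
    f:close()
    return data
end

-- Exercising it with a stand-in for the ngx table:
local fake_ngx = { req = {
    read_body = function() end,
    get_body_data = function() return "a=1&b=2" end,
    get_body_file = function() return nil end,
} }
print(read_request_body(fake_ngx))
```

Reading a spooled body back with io.open blocks the worker, so for big uploads this only moves the memory problem discussed earlier in the thread; it does not remove it.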

Doesn’t look like API. We are talking about being able to use accelerated
uploads feature from lua, aren’t we?

No, I think we’re talking about making ngx_upload easier to work with
other nginx modules, especially with those registering a content
handler like ngx_lua, ngx_echo, ngx_proxy, and ngx_fastcgi.

The whole point here is to make ngx_upload read and process the
request body in an earlier phase like “rewrite” and “access” phases,
such that we can preserve the content handler for other modules and
can eliminate internal redirects altogether to reduce runtime cost.

Sorry for the confusion in my previous emails :slight_smile:

Thanks!
-agentzh

----- agentzh [email protected] wrote:

doing that for both the lua_need_request_body config directive and the
“ngx.req.read_body()” Lua API.

I think we’ve been able to do that even before. But that’s not the point.
The problem is not which particular phase the upload module runs in, but
that it must run whenever some module tries to read the request body.
This kind of customisation is not supported by nginx yet.

can eliminate internal redirects altogether to reduce runtime cost.

Sorry for the confusions in my previous emails :slight_smile:

Again, the point is not in what particular phase it will run… And by
the way, the costs of internal redirects are negligible.


Regards,
Valery K.

----- agentzh [email protected] wrote:

Hi, Valery!

On Sun, Nov 27, 2011 at 4:14 AM, Valery K.
[email protected] wrote:

The limitation on the phase that is used to process uploads at the moment
comes from nginx itself and not from upload module.

Happily we no longer have such limitations for nginx 0.8.54+ :slight_smile:

Explain?

That is, no longer introducing an internal location here for internal
redirections :wink:

Doesn’t look like API. We are talking about being able to use
accelerated uploads feature from lua, aren’t we?


Regards,
Valery K.

On Mon, Nov 28, 2011 at 6:22 PM, Valery K.
[email protected] wrote:

I think we’ve been able to do that even before.

Sadly for nginx 0.8.41 ~ 0.8.53, “IO interruptions” are explicitly
prohibited in rewrite phase handlers and for exactly the same reason
your ngx_eval module does not work at all for these versions of Nginx
:slight_smile:

But that’s not point. The problem is not in what particular phase upload module
will run, but that it must run whenever some module tries to read the request
body. This kind of customisation is not supported by nginx yet.

True. We definitely need an input filter mechanism for Nginx. I’ve
cc’d Andrew A… Maybe he can help make this happen in the Nginx
core :wink:

Again, the point is not in what particular phase it will run… And by the way,
the costs of internal redirects are negligible.

Not really if the user has a lot of regex style locations defined in
his nginx.conf. For now, the nginx core does not (yet) utilize the
PCRE JIT feature for its pattern matching, so it does consume quite a
few CPU cycles for complicated location patterns :wink: And…if we can
eliminate the cost of internal redirects altogether, why shouldn’t we
make that happen? :wink:

Also, I believe it can simplify the user config file greatly and
avoids variable scope problems (like the scope of $uri and $args). I
know there is an upload_pass_args directive to forward the original URI
query args, but if the $args scope issue did not exist in the first
place, we would not need this config directive at all.

BTW, I notice that if ngx_upload’s special variables like
$upload_file_md5 are used in the wrong context, they will simply crash
the nginx worker (segmentation fault) due to a null module context
pointer access (at least for the latest 2.2.0 release). Maybe it’s more
reasonable to check the ctx pointer and print out a helpful error
message in functions like ngx_http_upload_md5_variable?

Best,
-agentzh

----- agentzh [email protected] wrote:

But that’s not point. The problem is not in what particular phase upload
module will run, but that it must run whenever some module tries to read the
request body. This kind of customisation is not supported by nginx yet.

Not really if the user has a lot of regex style locations defined in
his nginx.conf. For now, the nginx core does not (yet) utilize the
PCRE JIT feature for its pattern matching, so it does consume quite a
few CPU cycles for complicated location patterns :wink: And…if we can
eliminate the cost of internal redirects altogether, why shouldn’t we
make that happen? :wink:

Did you mean “quite a lot”?

access (at least for the latest 2.2.0 release). Maybe it’s more
reasonable to check the ctx pointer and print out a helpful error
message in functions like ngx_http_upload_md5_variable?

I’ll add a check for that.


Regards,
Valery K.

On Tue, Nov 29, 2011 at 6:29 PM, Valery K. wrote:

Not really if the user has a lot of regex style locations defined in
his nginx.conf. For now, the nginx core does not (yet) utilize the
PCRE JIT feature for its pattern matching, so it does consume quite a
few CPU cycles for complicated location patterns :wink: And…if we can
eliminate the cost of internal redirects altogether, why shouldn’t we
make that happen? :wink:

Did you mean “quite a lot”?

I didn’t say it costs quite a lot in real-world settings. I just
said we should avoid wasting CPU cycles wherever possible (just like
what Marcus C. originally told me two years ago) :slight_smile:

Regards,
-agentzh

Hi,

But that’s not point. The problem is not in what particular phase upload module
will run, but that it must run whenever some module tries to read the request
body. This kind of customisation is not supported by nginx yet.

True. We definitely need an input filter mechanism for Nginx. I’ve
cc’d Andrew A… Maybe he can help make this happen in the Nginx
core :wink:

I’m not a programmer (at least not anymore), so I personally can’t
implement this :slight_smile: Input filters are planned for the next major release of
nginx, where the core architecture would allow that.

his nginx.conf. For now, the nginx core does not (yet) utilize the
PCRE JIT feature for its pattern matching, so it does consume quite a

Btw, we have an almost ready implementation for PCRE JIT and it’ll soon
be released in the dev train.