Forum: NGINX nginx mogilefs module 1.0.1

Valery Kholodkov (Guest)
on 2009-04-16 12:49
(Received via mailing list)
Greetings!

I've managed to implement a module for nginx that fetches files from
MogileFS.

Please see the module's manual page in English or my nginx modules page:

http://www.grid.net.ru/nginx/mogilefs.en.html
http://www.grid.net.ru/nginx/

I expect minor issues to pop up in the near future, so I advise you
to test the module before putting it into production.

I would like to thank Michael Shadle for the idea.

I hope you'll enjoy it!

Your feedback is always welcome.
Maxim Dounin (Guest)
on 2009-04-16 14:31
(Received via mailing list)
Hello!

On Thu, Apr 16, 2009 at 11:39:31AM +0100, Valery Kholodkov wrote:

> I would like to thank Michael Shadle for the idea.
>
> I hope you'll enjoy it!
>
> Your feedback is always welcome.

I haven't really used it (and am not likely to in the near future), but
here are some questions:

1. Is there any reason why you create a hidden location from the module
instead of accepting the name of an existing one?  It looks unnatural to me.

2. As far as I can see, it uses only the first path returned by MogileFS.
Is failover support planned?  As I understand it, this should be simple,
something like:

    location /mogilefs {
        mogilefs_tracker ...
        mogilefs_pass  /mogilefs_fetch;
    }
    location /mogilefs_fetch {
        error_page 502 503 504 = @failover;
        proxy_pass $mogilefs_path_0;
    }
    location @failover {
        proxy_pass $mogilefs_path_1;
    }
    ...

Maxim Dounin
Valery Kholodkov (Guest)
on 2009-04-16 14:59
(Received via mailing list)
----- "Maxim Dounin" <mdounin@mdounin.ru> wrote:

> I haven't really used it (and am not likely to in the near future), but
> here are some questions:
>
> 1. Is there any reason why you create a hidden location from the module
> instead of accepting the name of an existing one?  It looks unnatural to me.

There are several reasons:

1) The other way around looks unnatural to me :)
2) People will tend to forget the internal; directive, leaving fetch
locations open to the public, which is kind of a security hole. You did
it as well, by the way :) I don't feel comfortable with it.

In the alpha version of this module, however, mogilefs_pass took a
location name as an argument, as you describe. This also allowed the
module to be compiled with nginx 0.6.x.

I don't know where the balance is at the moment. I don't have enough
feedback.

>         proxy_pass $mogilefs_path_0;
>     }
>     location @failover {
>         proxy_pass $mogilefs_path_1;
>     }

That's actually a good idea; I'll implement it. The one thing I'd love to
have for this is access to parametric variables from modules.
Maxim Dounin (Guest)
on 2009-04-16 15:27
(Received via mailing list)
Hello!

On Thu, Apr 16, 2009 at 01:49:46PM +0100, Valery Kholodkov wrote:

>
> 1) The other way around looks unnatural to me :)
> 2) People will tend to forget the internal; directive, leaving fetch
> locations open to the public, which is kind of a security hole. You did
> it as well, by the way :) I don't feel comfortable with it.
>
> In the alpha version of this module, however, mogilefs_pass took a
> location name as an argument, as you describe. This also allowed the
> module to be compiled with nginx 0.6.x.
>
> I don't know where the balance is at the moment. I don't have enough feedback.

They will be unable to fetch anything unless the appropriate variable
is set correctly.

And personally I think that security is quite a different thing
and should be handled in the way the admin prefers.  It may be
internal, may be allow/deny, may be something else.  That's why I
usually omit internal from the examples.  And any magic is really
bad here, since people may think that the software will handle security
for them while it actually can't.

On the other hand, it may be handy to actually have this location
non-internal (e.g. for direct requests from internal services or
just admin checks).
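
To make the options above concrete, here is a minimal sketch of two ways an admin might restrict a fetch location. It assumes the location-name form of mogilefs_pass discussed in this thread, leaves the tracker address elided as in the earlier example, and uses 10.0.0.0/8 purely as a placeholder for an internal network.

```nginx
# Sketch only: assumes mogilefs_pass accepts a location name, as in the
# alpha version mentioned in this thread.
location /mogilefs {
    mogilefs_tracker ...;              # tracker address elided
    mogilefs_pass /mogilefs_fetch;
}

# Option A: reachable only through internal redirects.
location /mogilefs_fetch {
    internal;
    proxy_pass $mogilefs_path;
}

# Option B (instead of A): also reachable directly, but only from an
# internal network. 10.0.0.0/8 is a placeholder range.
# location /mogilefs_fetch {
#     allow 10.0.0.0/8;
#     deny  all;
#     proxy_pass $mogilefs_path;
# }
```

With option B, a direct request would still need $mogilefs_path to be set somehow, which is the same point made above about nothing being fetchable without that variable.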

> >         proxy_pass $mogilefs_path_0;
> >     }
> >     location @failover {
> >         proxy_pass $mogilefs_path_1;
> >     }
>
> That's actually a good idea; I'll implement it. The one thing I'd love to
> have for this is access to parametric variables from modules.

For this particular case I suppose it will be simple enough to
register just 10 variables with appropriate names.

Alternatively, this may be handled by something like

    set $mogilefs_failover 1;
    proxy_pass $mogilefs_path;

with appropriate lookup of $mogilefs_failover in code before
returning value for $mogilefs_path.
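
As a rough illustration of the first option (a fixed set of numbered variables), a failover chain over three copies might look like the sketch below. The variables $mogilefs_path_0 to $mogilefs_path_2 are only the names proposed in this thread, not something the module is known to provide, and the location names are arbitrary.

```nginx
# Sketch only: $mogilefs_path_0 .. $mogilefs_path_2 are hypothetical
# variables as proposed in this thread.
location /mogilefs_fetch {
    recursive_error_pages on;               # let error_page fire again later
    error_page 502 503 504 = @failover1;
    proxy_pass $mogilefs_path_0;
}
location @failover1 {
    recursive_error_pages on;
    error_page 502 503 504 = @failover2;
    proxy_pass $mogilefs_path_1;
}
location @failover2 {
    proxy_pass $mogilefs_path_2;            # last copy: errors reach the client
}
```

By default nginx follows only one error_page redirect per request, which is why recursive_error_pages is switched on here; it could equally be set once at the server level.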

Maxim Dounin
Valery Kholodkov (Guest)
on 2009-04-16 15:49
(Received via mailing list)
----- "Maxim Dounin" <mdounin@mdounin.ru> wrote:

> And personally I think that security is quite a different thing
> and should be handled in the way the admin prefers.  It may be
> internal, may be allow/deny, may be something else.  That's why I
> usually omit internal from the examples.  And any magic is really
> bad here, since people may think that the software will handle security
> for them while it actually can't.

> On the other hand, it may be handy to actually have this location
> non-internal (e.g. for direct requests from internal services or
> just admin checks).

You might be right. After all, I disclaimed any responsibility for
security damages in the license.

I'll see whether I get any other comments on this part.

> Alternatively, this may be handled by something like
>
>     set $mogilefs_failover 1;
>     proxy_pass $mogilefs_path;
>
> with appropriate lookup of $mogilefs_failover in code before
> returning value for $mogilefs_path.

This is impossible, because I do not evaluate the $mogilefs_path variable
dynamically. nginx discards all modules' contexts when it does an
internal redirect. Instead, I simply assign a value to the variable, and
the value survives the internal redirect.
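
The behaviour described here can be seen with plain nginx directives, independent of this module. The following is a minimal sketch, assuming a reasonably recent nginx (return with a response body is not available in older releases); the location names, the variable name and the use of status 418 as a trigger are all arbitrary choices for illustration.

```nginx
# Sketch only: a variable assigned before an internal redirect is still
# visible in the target location.
location /a {
    set $origin "a";
    error_page 418 = /b;     # 418 is used here only as a trigger code
    return 418;
}
location /b {
    internal;
    return 200 "origin=$origin\n";
}
```

A request to /a should be answered by /b with the value assigned in /a, since request variables are kept across the redirect while per-module contexts are not.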
Maxim Dounin (Guest)
on 2009-04-16 16:39
(Received via mailing list)
Hello!

On Thu, Apr 16, 2009 at 02:38:34PM +0100, Valery Kholodkov wrote:

[...]

> > Alternatively, this may be handled by something like
> >
> >     set $mogilefs_failover 1;
> >     proxy_pass $mogilefs_path;
> >
> > with appropriate lookup of $mogilefs_failover in code before
> > returning value for $mogilefs_path.
>
> This is impossible, because I do not evaluate the $mogilefs_path variable
> dynamically. nginx discards all modules' contexts when it does an
> internal redirect. Instead, I simply assign a value to the variable, and
> the value survives the internal redirect.

You may preserve the old context by (surprise!) assigning it to a
variable.  But it seems like overkill to me, too. :)

Maxim Dounin
Valery Kholodkov (Guest)
on 2009-04-16 16:55
(Received via mailing list)
----- "Maxim Dounin" <mdounin@mdounin.ru> wrote:

> > >
> > > with appropriate lookup of $mogilefs_failover in code before
> > > returning value for $mogilefs_path.
> >
> > This is impossible, because I do not evaluate the $mogilefs_path
> > variable dynamically. nginx discards all modules' contexts when it
> > does an internal redirect. Instead, I simply assign a value to the
> > variable, and the value survives the internal redirect.
>
> You may preserve the old context by (surprise!) assigning it to a
> variable.  But it seems like overkill to me, too. :)

Yes. This looks like a dirty hack. Personally, I was surprised when I
discovered that redirects clear contexts.
Michael Shadle (Guest)
on 2009-04-16 18:18
(Received via mailing list)
On Thu, Apr 16, 2009 at 5:49 AM, Valery Kholodkov
<valery+nginxen@grid.net.ru> wrote:

>>         proxy_pass $mogilefs_path_0;
>>     }
>>     location @failover {
>>         proxy_pass $mogilefs_path_1;
>>     }
>
> That's actually a good idea; I'll implement it. The one thing I'd love to
> have for this is access to parametric variables from modules.

There should be no need for multiple $mogilefs_path's as the tracker
supplies the locations that nginx should be proxying to...

However, failover to a non-MogileFS source does make sense. In this
case it would be something like this, I think:

    error_page 502 503 504 = /maintenance.html;

Or something of that nature?

Remember, MogileFS already has its own intelligence built in for
redundancy. All nginx has to do is take the list of mogstoreds
(storage nodes that listen over HTTP for basic WebDAV commands; nginx
can be used for that too :)) and try them in either a) the order
given, b) the opposite order or c) random order. The choice should come
from feedback from dormando or someone knowledgeable about the
MogileFS code. I am not sure whether the tracker gives the list of
URLs in an arbitrary order or whether there is some logic behind it.
Michael Shadle (Guest)
on 2009-04-16 18:22
(Received via mailing list)
On Thu, Apr 16, 2009 at 9:05 AM, Michael Shadle <mike503@gmail.com>
wrote:

> There should be no need for multiple $mogilefs_path's as the tracker
> supplies the locations that nginx should be proxying to...

which probably means something like:

    location /mogilefs {
        mogilefs_tracker ...
        mogilefs_pass $mogilefs_url;  # <- this would be an array/list of URLs (like an nginx upstream construct)
    }

    location /mogilefs_fetch {
        error_page 502 503 504 = @failover;
        proxy_pass $mogilefs_path_0;  # <- this makes no sense (in my opinion)
    }

You can't arbitrarily assume that people have only 2 or 10 copies of
the files available, unless the tracker limits how many paths it
returns, in which case you would have at most $mogilefs_path_X. But I
think it is much better to take what is given and create something like
an upstream{} internally for it, so that mogilefs_pass is essentially
proxy_pass to that upstream.

You might not even need mogilefs_pass then (unless it does additional
work), as it should technically be an upstream{} created on the fly in
memory for that request, and it would be just like proxy_pass
@mogilefs_reply; or something?
Valery Kholodkov (Guest)
on 2009-04-17 11:46
(Received via mailing list)
----- "Michael Shadle" <mike503@gmail.com> wrote:

> Remember, MogileFS already has its own intelligence built in for
> redundancy. All nginx has to do is take the list of mogstoreds
> (storage nodes that listen over HTTP for basic WebDAV commands; nginx
> can be used for that too :)) and try them in either a) the order
> given, b) the opposite order or c) random order. The choice should come
> from feedback from dormando or someone knowledgeable about the
> MogileFS code. I am not sure whether the tracker gives the list of
> URLs in an arbitrary order or whether there is some logic behind it.

Even though the tracker returns the path to the host with the least
load, it would be worth trying secondary locations, since a network
configuration or routing issue could suddenly appear.

However, sysadmins must be punished for network configurations in which
the tracker can contact the storage nodes but the frontends cannot.

Ideally, every frontend node should have a clone of the distributed FS
repo that it can contact locally. I don't know whether MogileFS can do this.
Michael Shadle (Guest)
on 2009-04-17 12:11
(Received via mailing list)
On Thu, Apr 16, 2009 at 9:34 AM, Valery Kholodkov
<valery+nginxen@grid.net.ru> wrote:

> Even though the tracker returns the path to the host with the least
> load, it would be worth trying secondary locations, since a network
> configuration or routing issue could suddenly appear.

Exactly. That's why it should probably be an nginx upstream {}
construct, and your mogilefs_timeout or whatever settings would
essentially be the same parameters normally given for an upstream
(timeouts and such). That way nginx uses its "smart"
retry-multiple-upstreams functionality. Basically, MogileFS is just
giving us a dynamic list of available upstreams for a specific URI.
In a nutshell, I think 90% of what this module does is translate a
URI into a MogileFS key and domain, ask a tracker where it is, and
then offload the rest to standard nginx proxying... if that is possible.
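
For comparison, this is roughly what the stock retry behaviour looks like with a static upstream. It is only a sketch of the nginx side: the upstream name, addresses, port and timeouts are placeholders, and the per-request, tracker-supplied list described above is not something plain configuration can express. As far as I know, real MogileFS paths also differ per storage device, so this shows the retry mechanism rather than a working setup.

```nginx
# Sketch only: a static upstream standing in for the per-request list a
# tracker would return. Name, addresses and port are placeholders.
# Both blocks belong inside the http {} context.
upstream mogstored_example {
    server 192.0.2.11:7500;
    server 192.0.2.12:7500;
    server 192.0.2.13:7500;
}

server {
    listen 8080;

    location /files/ {
        proxy_connect_timeout 2s;
        proxy_read_timeout    10s;
        # Try the next server on connection errors, timeouts and
        # 502/503/504 replies.
        proxy_next_upstream error timeout http_502 http_503 http_504;
        proxy_pass http://mogstored_example;
    }
}
```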

> However, sysadmins must be punished for network configurations in which
> the tracker can contact the storage nodes but the frontends cannot.
>
> Ideally, every frontend node should have a clone of the distributed FS
> repo that it can contact locally. I don't know whether MogileFS can do this.

To me, this is up to how they want to configure their MogileFS
installation. You can create domains with 2 or 3 or N replicas of the
file in question.

For fallbacks I would use standard nginx fallback functionality, using
error_page or whatever.
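
A minimal sketch of such a fallback, assuming a fetch location that proxies to $mogilefs_path as discussed earlier in the thread; /maintenance.html and the root path are placeholders.

```nginx
# Sketch only: /maintenance.html and the root path are placeholders.
location /mogilefs_fetch {
    # Without "=", the client still sees the original 502/503/504 status.
    # "error_page 502 503 504 = /maintenance.html;" would instead report
    # the status produced while serving the page (200 for a static file).
    error_page 502 503 504 /maintenance.html;
    proxy_pass $mogilefs_path;
}
location = /maintenance.html {
    root /var/www/static;
}
```

Which form is preferable depends on whether clients and monitoring should see the failure status or a normal page.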