Serving static files dynamically


#1

Hello,

I’m interested in NGINX for serving static files because of its
performance :p. But I will have to serve big static files (HD video
content), and to protect access I would like to use a dynamic URL
with a key inside, like:

  • http://static/qwertyuiopasdfghjkl which will return the content of
    file.mov, where the key (qwertyuiopasdfghjkl) is associated with the
    file in the DB. The user receives the content, not the file.

So a user can share a URL with his contacts and stop sharing it
whenever he wants (by removing the record in the DB). I will have to
serve thousands of files at the same time and I want to minimize my
server farm. Do you have any ideas?
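The key-to-file mapping described above could be sketched roughly as follows. This is only an illustration: the table name `shares`, the column names, and the helper functions are hypothetical, and SQLite stands in for whatever DB is actually used.

```python
import secrets
import sqlite3


def open_db(path=":memory:"):
    """Create (if needed) the table that maps share keys to file paths."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS shares ("
        "share_key TEXT PRIMARY KEY, file_path TEXT NOT NULL)"
    )
    return db


def share(db, file_path):
    """Register a new hard-to-guess key for file_path and return it."""
    key = secrets.token_urlsafe(16)
    db.execute(
        "INSERT INTO shares (share_key, file_path) VALUES (?, ?)",
        (key, file_path),
    )
    db.commit()
    return key


def lookup(db, key):
    """Return the file path for key, or None if the key is unknown or revoked."""
    row = db.execute(
        "SELECT file_path FROM shares WHERE share_key = ?", (key,)
    ).fetchone()
    return row[0] if row else None


def revoke(db, key):
    """Stop sharing: delete the record so the URL no longer resolves."""
    db.execute("DELETE FROM shares WHERE share_key = ?", (key,))
    db.commit()
```

Removing the record is all it takes to stop sharing: after `revoke`, `lookup` returns `None` and the URL can be answered with a 404.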

Thank you :)


#2

Brice L. wrote:

So a user can share a URL with his contacts and stop sharing it whenever
he wants (by removing the record in the DB). I will have to serve
thousands of files at the same time and I want to minimize my server
farm. Do you have any ideas?

Thank you :)

You can use X-Accel-Redirect: http://wiki.nginx.org/NginxXSendfile


#3

Thank you Jean-Philippe, this looks like the perfect solution… :)


#4

Marcus,
Thank you for your advice, but I think this solution will not work
for me. As I wrote in my previous email, I’m going to serve HD video
content, which is going to be more than 1GB per file.

All the best,

- Brice

#5

Brice,

If you haven’t done so already, have a look at Primebase Media Streaming
(www.blobstreaming.org).

It’s a MySQL plugin that has a lightweight HTTP server on the front of
it to serve blobs out of a database.

I’ve done some tests on it (up to 2M objects), and the speed was
comparable in many cases to serving content statically.

As part of the system, it allows you to provide an alias for your blobs,
so as to hide any database information.

Cheers,

Marcus.


#6

On Tue, 2009-03-24 at 12:06 -0700, Brice L. wrote:

Marcus,
Thank you for your advice, but I think this solution will not work for
me. As I wrote in my previous email, I’m going to serve HD video
content, which is going to be more than 1GB per file.

A quick perusal of the docs turned this up:

12.3. Limitations

The current implementation is based on JDK 1.5.x, which only allows for
lengths in setBinaryStream(), setAsciiStream() and setBlob() of type
‘int’. This means that the maximum BLOB size that can currently
permitted is 2GB.

Regards,
Cliff


#7

not to mention that this is only useful if the OP is storing files in
the database to begin with, I believe.

if it’s filesystem, then X-Accel-Redirect is the way to go.


#8

Brice,

PBMS was written exactly for the purpose you describe - serving blobs.
The blobs can be anything, text, images, video. If you are streaming
content, then that’s different. If you are serving them statically,
then it may well be suitable, though inserting files that big might
have its own difficulties.

Note: the HTTP server does not use MySQL for serving them (i.e. it
doesn’t go through MySQL - this would be very slow). The HTTP server
accesses the data directly (Primebase have written their own storage
engine for MySQL, but it can use other storage engines too). It’s a
plugin for MySQL to make it easy to add/delete the blobs, but this is
only at the insertion/updating/deletion stage. Using the Alias settings
for it, you wouldn’t even need another web server.

Bon courage,

Marcus.


#9

Cliff,

The limit mentioned there is just for the JDBC driver, and is due to the
limits of JDK 1.5.x. You don’t have to use the JDBC driver for PBMS.
There’s a direct C API, you can use any MySQL APIs to input data, and
you can ‘upload’ the blobs via HTTP.

The limits on blob size are based on MySQL’s limits, which glancing at
http://dev.mysql.com/doc/refman/5.1/en/storage-requirements.html appears
to be 4GB for LONGBLOB (which all data in PBMS is stored as).
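As a rough illustration of the two limits discussed here, the arithmetic can be checked directly. The constant and function names below are made up for this sketch; the values follow from a signed 32-bit `int` length (JDBC on JDK 1.5.x) and the 4-byte length prefix of a MySQL LONGBLOB.

```python
GIB = 1024 ** 3

# Largest length a signed 32-bit Java int can express: just under 2 GiB.
JAVA_INT_MAX = 2 ** 31 - 1

# Largest LONGBLOB value: the length is stored in 4 bytes, so just under 4 GiB.
LONGBLOB_MAX = 2 ** 32 - 1


def fits(size_bytes, limit):
    """Return True if a blob of size_bytes stays within the given limit."""
    return size_bytes <= limit


# A 1 GiB video fits under both limits; a full 4 GiB file exceeds both.
one_gib_ok = fits(1 * GIB, JAVA_INT_MAX) and fits(1 * GIB, LONGBLOB_MAX)
four_gib_ok = fits(4 * GIB, LONGBLOB_MAX)
```

So files of around 1GB are within both limits, while a file of a full 4 GiB would not fit in a single LONGBLOB.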

The ‘problems’ I was thinking about, though, were with the actual
time/overhead it takes to load a file that big into a db. The main
advantage I see for PBMS over static files is when you have a very large
number of them (hundreds of thousands or more), because then you avoid
having large extra storage capacity for metadata and you don’t ask the
filesystem to deal with a huge number of files.

Serving just a few thousand files from the filesystem, though, isn’t
going to put it under stress.

Cheers,

Marcus.


#10

Michael S. wrote:

not to mention that this is only useful if the OP is storing files in
the database to begin with, I believe.

if it’s filesystem, then X-Accel-Redirect is the way to go.

Yes, if the filesystem is the storage mechanism, then I’d agree. If you
have a very large number of files, though, storing them in the DB can be
more efficient than using the filesystem (depending on platform, file
number, directory hierarchy etc).

Marcus.


#11

On Mar 24, 2009, at 1:50 PM, Marcus C. wrote:

Cliff,

The limit mentioned there is just for the JDBC driver, and is due to
the limits of JDK 1.5.x. You don’t have to use the JDBC driver for
PBMS. There’s a direct C API, you can use any MySQL APIs to input
data, and you can ‘upload’ the blobs via HTTP.

If I need an instance of Java running per client, it’s going to be
impossible to serve unless I have more than 50 servers (which is very
expensive just in electricity).

Serving just a few thousand files from the filesystem, though, isn’t
going to put it under stress.

Because some files are going to be viewed multiple times by different
people, I’m expecting the ISPs to proxy the data for the same URL. I’m
more worried about the on-demand content, which will not be proxied.

- Brice

#12

On Mar 24, 2009, at 1:55 PM, Marcus C. wrote:

Michael S. wrote:

not to mention that this is only useful if the OP is storing files in
the database to begin with, I believe.

if it’s filesystem, then X-Accel-Redirect is the way to go.

Yes, if the filesystem is the storage mechanism, then I’d agree. If
you have a very large number of files, though, storing them in the
DB can be more efficient than using the filesystem (depending on
platform, file number, directory hierarchy etc).

How can the filesystem be slower than a DB at serving huge video
files? That’s completely the opposite of what I’m used to! Can you
explain to me how this is possible?
My situation will be serving thousands (maybe more later) of different
big files (between 200MB and 4GB) to different users.

Brice


#13

On Tue, Mar 24, 2009 at 2:35 PM, Brice L. wrote:

How can the filesystem be slower than a DB at serving huge video files?
That’s completely the opposite of what I’m used to! Can you explain to
me how this is possible?
My situation will be serving thousands (maybe more later) of different
big files (between 200MB and 4GB) to different users.

i’m not saying faster or slower. i’m just saying that if the files are
on the filesystem, then blob streaming doesn’t make sense, unless he
wants to load his files into the blob streaming db.

if the files are in a database, then it does. unless i’m missing the
fact that blob streaming does not require the database-backing like a
normal mysql installation type blob column thing.

he has a few options

  1. do nothing, serve from filesystem (i don’t think this will work for
    him)
  2. use a script wrapper which means it will keep the script open
    spooling the file out
  3. use a script wrapper with x-accel-redirect (my choice for file-based)
  4. load them into a db and look at blob streaming
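Option 3 might look roughly like the sketch below: a small WSGI wrapper that checks the key and then hands the transfer to nginx via the `X-Accel-Redirect` header, so the script is not kept open while the file is sent. The `/protected/` prefix, the `make_app` helper, and the `lookup_path` callable are assumptions for illustration; the prefix has to match a location marked `internal` in the nginx configuration (for example one using `alias` to point at the media directory).

```python
def make_app(lookup_path, internal_prefix="/protected/"):
    """Build a WSGI app that maps /<key> to an internal nginx location.

    lookup_path is any callable taking a key and returning a relative
    file path, or None when the key is unknown or has been revoked.
    """

    def app(environ, start_response):
        key = environ.get("PATH_INFO", "").lstrip("/")
        file_path = lookup_path(key) if key else None
        if file_path is None:
            start_response("404 Not Found", [("Content-Type", "text/plain")])
            return [b"not found\n"]
        # Empty body: nginx sees X-Accel-Redirect and serves the file itself
        # from the matching internal location.
        start_response(
            "200 OK", [("X-Accel-Redirect", internal_prefix + file_path)]
        )
        return [b""]

    return app
```

The lookup can be backed by any DB query; the wrapper only decides whether the key is valid and which file nginx should send.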

#14

On Mar 24, 2009, at 2:56 PM, Michael S. wrote:

files (between 200MB and 4GB) to different users.

  1. do nothing, serve from filesystem (i don’t think this will work
    for him)
  2. use a script wrapper which means it will keep the script open
    spooling the file out
  3. use a script wrapper with x-accel-redirect (my choice for
    file-based)

I was thinking this way too. I will choose this option ;).
Thanks a lot, this x-accel-redirect is just amazing !

- Brice

#15

If I need an instance of Java running per client, it’s going to be
impossible to serve unless I have more than 50 servers (which is very
expensive just in electricity).

You don’t need Java running. There’s a C API to input data if you
wanted, or you can use the MySQL interface, or you can just upload to
the HTTP server. However, I’d say it’s more suited to relatively small
files, and I don’t think you’d benefit from it if you’re only serving a
few thousand.

Marcus.


#16

Hi Brice,

Firstly, I’m not talking about a full-blown database that’s serving
files, but a lightweight front to the files (as PBMS is). PBMS is just
an HTTP front to a storage engine for MySQL, and doesn’t deal with SQL
or anything like that. It’s really like serving the content directly
out of a large file, in a pretty similar way to what many caches do.

Ordinarily, the filesystem will be much faster than even the lightest of
fronts to a db, but if you have millions of files, then each file will
have metadata associated with it (which takes up space - usually at
least 4KB) and the filesystem has to cope with all the files, and many
filesystems struggle when you start getting to large numbers of files,
and it slows things down.

If you have billions of files, then you couldn’t even serve them off a
normal 32-bit fs, because you’d run out of inodes (I believe).

For thousands of files, or tens of thousands, you’d be fine, though, and
the filesystem will definitely be quicker than PBMS.

When I did my tests with PBMS, I created 2M objects, and put them in the
database with MySQL, then served them using PBMS. My benchmarks showed
that the number of req/s I could serve was similar to Apache 2 at best,
and about 20% slower at worst (it depended on the index of the file).
That might not seem great, but that was serving from 1 file vs 2M. I
tried creating files in a hierarchical structure, and after I got to
around 400k (? I can’t quite remember), my system almost completely
stalled, so I stopped trying to add files.

In most scenarios, the filesystem will be quicker, but not always.

Cheers,

Marcus.


#17

Thank you Marcus. I had never thought about those kinds of FS issues.
The number of files should not exceed a hundred thousand in the
worst-case scenario, so I think the file system should be the better
solution for me.


#18

With regard to serving large files through nginx, the following thread
may be interesting:

http://www.litespeedtech.com/support/forum/showthread.php?t=2446&page=2&highlight=sendfile

"nginx is very well SMP implemented webserver with awesome bw throttle
features, you can set max concurrent connection limits per each IP
globally or at VHost or for any path/file type individual using PCRE
conditions
nginx is suitable for smaller files about 10~20 MB but for larger files
more than 100MB it gets in IO bottlenecks with my huge connections at
peak 100% iowait for all cores which leads to high load and forces
server to very lower throughput

lighttpd, super webserver for static contents, specially when compiled
with LOCAL BUFFERING enabled in src\network_writev.c and lib async io

it [lighttpd] rocks! for me it can handle 2x throughput against
litespeed/nginx without any iowait
iowait is almost 0 with that huge file sizes and awesome connections"


#19

On Mar 25, 2009, at 12:34 PM, Marcus C. wrote:

Brice L. wrote:

Thank you Marcus. I had never thought about those kinds of FS issues.
The number of files should not exceed a hundred thousand in the
worst-case scenario, so I think the file system should be the better
solution for me.

You’re welcome. If you’re not already doing so, if you’ve got many
thousands of files, then putting them in multiple directories is
probably a good idea. I tend to have no more than around 1000 files
in each directory, but it may not impact performance to have quite a
few more - different systems will cope better or worse with large
numbers of files.

That’s the case, customer_id as folder name :)


#20

Brice L. wrote:

Thank you Marcus. I had never thought about those kinds of FS issues.
The number of files should not exceed a hundred thousand in the
worst-case scenario, so I think the file system should be the better
solution for me.

You’re welcome. If you’re not already doing so, if you’ve got many
thousands of files, then putting them in multiple directories is
probably a good idea. I tend to have no more than around 1000 files in
each directory, but it may not impact performance to have quite a few
more - different systems will cope better or worse with large numbers
of files.

Marcus.
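When there is no natural grouping such as a customer_id, one common way to spread files over multiple directories is to derive the subdirectory from a hash of the file name. The sketch below is only an illustration of that idea; the `sharded_path` helper and the choice of 256 buckets are assumptions, not something prescribed in this thread.

```python
import hashlib
import os


def sharded_path(root, filename):
    """Return root/<2 hex chars>/<filename>, spreading files over 256 dirs."""
    digest = hashlib.sha1(filename.encode("utf-8")).hexdigest()
    # The first two hex characters give one of 256 bucket directories.
    return os.path.join(root, digest[:2], filename)
```

With 256 buckets, a hundred thousand files average out to roughly 390 per directory, which stays under the figure of around 1000 mentioned above.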