I’ve been a bit busy earning my keep for the last week, hence the lack of progress on my Rails performance tuning project. I will, though, share something really cool that I’ve been doing.
A website I’ve been working on allows users to upload files of any size. Let me just repeat that: any size. No healthy four megabyte limit, no limits at all. They then do things to the uploaded files. Apache with mod_proxy_balancer is being used and, as a result, their users experience heaps of proxy timeout errors when someone uploads something large. No good at all.
So how would I fix this? Well, first off, get rid of mod_proxy_balancer in favour of HAProxy so that requests are intelligently sent to Mongrels that aren’t already busy with a request.
Secondly, I had thought that Apache just piped requests straight through to Mongrel in eight kilobyte chunks as they were received, meaning the Mongrel would be busy receiving the request from the moment Apache started receiving it. So a ten minute file upload would tie up a Mongrel instance for the whole ten minutes. Ugh! I’m sure I’d read somewhere that that’s how Apache worked, but my testing has shown that Apache receives the whole file itself, buffering to disk so as to not bloat its memory usage, and then passes it to Mongrel in one big fast hit. So not so bad. This is with the Worker MPM, so maybe I was reading about the Prefork MPM? Don’t use Apache Prefork MPM people, it’s bad news.
The problem here is that the uploaded file is passed through in the request, probably over the internal network, to Mongrel, which then saves it, probably to an NFS share, again back over the internal network. Hmm, not the best. And it’s likely that the Rails code will then read the file back in to process it, once more over the network. That’s not particularly ideal.
So in steps Nginx. First off, it’s faster than Apache and less memory hungry. Additionally, it does the same thing as the Worker MPM with uploaded files: it buffers them to disk before bothering to tell Mongrel about the request. On its own that’s not going to solve the above problem, but there is an Upload Module available for it which does. Instead of sending the file to Mongrel as part of the request, it takes the buffered version, puts it anywhere on disk that you want and alters the request’s params to contain that location. The request gets processed quicker as there’s much less network traffic, and this will be most noticeable with large file uploads.
It all looked simple enough to implement, except I did have to overcome one or two snags. My nginx.conf changes for the Upload Module initially caused ActionController::InvalidAuthenticityToken errors because the module essentially rebuilds your request and wasn’t passing through the authenticity_token param. So you have to explicitly tell it which parameters to pass through.
Other problems were that the upload_pass directive is ignored in version 2.04 of the module, but you still have to define it. Ugh. And finally, the location that you post to has to be directly off the root, not a /controller/action location.
An example of my changes looks like this:
    # Upload form should be submitted to this location
    location /file_transfer_upload_completed {
        # Pass altered request body to this location
        # NTW: This seems to be ignored and the above location is used instead
        upload_pass /dummy;

        # Store files to this directory
        # The directory is hashed, subdirectories 0 1 2 3 4 5 6 7 8 9 should exist
        upload_store /some/nice/location/for/mongrel/to/access 1;

        # Allow uploaded files to be read and written by everyone
        upload_store_access user:rw group:rw all:rw;

        # Set specified fields in request body
        # this puts the original filename, new path+filename and content type in the request's params
        upload_set_form_field $upload_field_name.name "$upload_file_name";
        upload_set_form_field $upload_field_name.content_type "$upload_content_type";
        upload_set_form_field $upload_field_name.path "$upload_tmp_path";

        # pass through any other fields from the original request.
        # allow forgery protection to be used
        upload_pass_form_field "^authenticity_token$";
    }

    # dummy location that needs to be defined. :-(
    location /dummy {
    }
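The config comments say the hashed subdirectories 0 to 9 have to exist under the store directory already. A minimal Ruby sketch for creating them could look like the following. The helper name `create_upload_store_dirs` is my own invention for illustration, and the path you pass in is whatever you used for upload_store.

```ruby
require "fileutils"

# Hypothetical helper: create the one-level hashed subdirectories (0-9)
# that the config above says must exist under the upload_store directory.
# Returns the list of directories it made sure are present.
def create_upload_store_dirs(store_root)
  ("0".."9").map do |bucket|
    dir = File.join(store_root, bucket)
    FileUtils.mkdir_p(dir) # does nothing if the directory already exists
    dir
  end
end

# Example call (the path is a placeholder, use your own upload_store value):
#   create_upload_store_dirs("/some/nice/location/for/mongrel/to/access")
```

It is safe to run more than once, so it could sit in a deploy task. Whether the directories also need particular ownership or permissions depends on which users nginx and Mongrel run as.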
I’ve then created a route for that location, sending it through to the controller action I want. Another thing to note is that this location will capture GET requests too, so don’t use this URL for rendering a page.
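On the Rails side, the action no longer receives an uploaded file object, just the three plain string params set by the upload_set_form_field lines. As a rough sketch of how an action might pick those up, assuming the file input on the form is named `upload` (so the params arrive as `upload.name`, `upload.content_type` and `upload.path`); the struct and helper names here are hypothetical and not part of Rails or the module:

```ruby
require "fileutils"

# The three fields the nginx config sets for one file input.
UploadedFile = Struct.new(:name, :content_type, :path)

# Build an UploadedFile from the request params for the given form field.
# Returns nil when no path was supplied, e.g. a GET request hit the location.
def uploaded_file_from(params, field)
  path = params["#{field}.path"]
  return nil if path.nil? || path.empty?

  UploadedFile.new(
    File.basename(params["#{field}.name"].to_s), # drop any directory parts the client sent
    params["#{field}.content_type"],
    path
  )
end

# Move the file nginx stored into its final directory, keeping the
# original filename. Returns the destination path.
def claim_upload(upload, dest_dir)
  FileUtils.mkdir_p(dest_dir)
  dest = File.join(dest_dir, upload.name)
  FileUtils.mv(upload.path, dest)
  dest
end
```

In a controller this would be something like `upload = uploaded_file_from(params, "upload")`, followed by `claim_upload` if it isn’t nil. Because the path arrives as an ordinary request parameter, it’s worth checking that it really sits under your upload_store directory before touching the file.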
Having gotten over all that, it works swimmingly.
Some of you might be asking why I didn’t just implement a dedicated file uploading solution. Well, the site in question doesn’t get heaps of uploads and, in my opinion, it would have been too much of a change for the client. This solution solves the problem and is nice and simple.

2 Comments
Hi Nahum,
Does the nginx upload progress module still lock the mongrel instance while the uploaded data is being buffered to disk?
Ideally the upstream mongrel instance should be free to serve other requests until nginx has saved the entire file to disk. That would make it much more efficient.
thanks
I presume you mean the ‘upload module’ rather than ‘upload progress module’? I’ve not used the latter.
Nginx doesn’t lock the mongrel instance. I’ve tested this by having a single mongrel running, uploading a 200 MB file and still being able to surf around the site with no problems. This is exactly why I implemented this for a client.
If the upload module isn’t used, nginx will still buffer the file to a temp location on disk. Only once it has received the whole file will it send the request through to mongrel, with the file in the request itself.
So large files being sent in requests can cause blocking problems while the request is sent and your app then deals with it, compared to the app just being told by Nginx where the file is on disk, which is much quicker.
Cheers,
Nahum.