How to serve files with access control in Django?

Viewed 521 times

7

When I studied Django, the typical way to handle file uploads was to create a media folder on the server, with MEDIA_ROOT and MEDIA_URL defined in settings.py, where every uploaded file would go. In the models, a FileField or ImageField is declared whose upload_to is relative to MEDIA_ROOT. In production, the web server itself (e.g., Apache) is expected to serve the content under the /media URL, leaving only dynamic content to Django.

So far so good. The problem is that I would like to restrict access to uploaded files to logged-in users, according to some access control criteria. What is the right way to do this? Is this access control Django’s responsibility or Apache’s? (And if it’s Apache’s, how do I make it use Django’s permissions system?)

For reference, here’s how my virtual host is configured (using Django 1.4.14):

Alias /media/ /var/www/vhosts/example.com/httpdocs/media/
Alias /static/ /var/www/vhosts/example.com/httpdocs/static/

WSGIDaemonProcess exemplo threads=15 processes=5
WSGIProcessGroup exemplo
WSGIScriptAlias / /var/www/vhosts/example.com/exemplo.wsgi

P.S. For performance reasons, I would prefer that not all of /media be subject to access control: user uploads that are universally accessible are more frequent than restricted ones. I could assign a subfolder to the restricted files (e.g., /media/restrito) and let Django take care of that folder, but I don’t know how to do that with just Alias and WSGIScriptAlias. Maybe I need mod_rewrite as well, I don’t know... Anyway, I am quite lost, and any reference on the subject would be very welcome.

  • 1

    We did this here at the company, I hope it helps. http://blog.wearefarm.com/2015/02/09/contact-form-uploads/

  • @Vanderson That seems like a pretty smart solution! Too bad I’m using Apache, not Nginx... :( With a little luck, maybe Apache has some functionality equivalent to X-Accel-Redirect. Does anyone know of one?

  • 1

    I found this link. http://francoisgaudin.com/2011/03/13/serving-static-files-with-apache-while-controlling-access-with-django/

  • @Vanderson Yes, it seems to be exactly the same functionality. Thanks! I would still have to solve the other half of the problem (making the model store the uploaded files in a folder that is not directly reachable), but this already helped a lot.

2 answers

3


One solution I found for serving (non-static) files is X-Sendfile. Basically, the application view (Django, in this case) checks that the user is logged in and returns a response with a header telling the web server (Apache or Nginx) which file to send, which means the user is authorized to download it.

Nginx’s equivalent of this feature, X-Accel-Redirect, is documented at this link: http://wiki.nginx.org/X-accel

In Django, settings:

import os
BASE_DIR = os.path.dirname(os.path.dirname(__file__))
# Inside settings.py, refer to BASE_DIR directly (there is no `settings` name here)
DIR_PROTEGIDO = os.path.join(BASE_DIR, 'protegido')

views:

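import os
from django.conf import settings
from django.contrib.auth.decorators import login_required
from django.http import HttpResponse

@login_required  # check that the user is logged in
def arquivo(request):
    arquivo = '.../arquivo.jpg'
    response = HttpResponse()
    # Left empty so the web server, not Django, chooses the type
    response['Content-Type'] = ''
    # The web server replaces the empty body with this file
    response['X-Sendfile'] = os.path.join(settings.DIR_PROTEGIDO, arquivo)
    return response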

And in Apache:

XSendFile on
XSendFilePath "//arquivos/protegidos"
<Directory "//arquivos/protegidos">
    Order Deny,Allow
    Allow from all
</Directory>

I haven’t actually used it and I don’t know how efficient it is, but from what I’ve read it seems to be a good way to serve (NON-static) files that require prior authorization. Anyone who tries to access the file URL without the header set by Django will fail; the view acts as the intermediary in this process.

References I found:

Solution in Django:

  • The fact that these protected files are inside MEDIA_ROOT is not a problem, is it? Does the XSendFile on / XSendFilePath ... configuration ensure that Apache will not serve this subfolder without the corresponding header? If the answer is "yes", then I guess that solves my question! (In other words, if /caminho/pro/media is publicly accessible but XSendFilePath is set to /caminho/pro/media/protegido, will Apache serve /caminho/pro/media/protegido/arquivo when the header is not present, or not?)

  • Unfortunately, my question remains: in my tests, if /media is accessible and I configure /media/restrito to use XSendFile, Apache keeps serving the files inside /media/restrito even without the header. I’ll try a little more using mod_rewrite (to prevent /media/restrito/... from being accessed directly) and see whether XSendFile can still send the correct file. If you have any better suggestions I’d appreciate them, because I’m still pretty much lost...

  • @mgibsonbr I haven’t worked with XSendFile, but from what I understand you should create a directory outside the /media folder used by Django and point upload_to at that external directory, so that the user cannot access the file directly. When serving a given user’s file, the view would check permissions and send the response with the header.

  • I changed the answer and added a link to a ready-made solution on github. My explanation was initially mistaken, because the behavior is different from the static files.

  • 1

    With this new example of yours on github, I think it finally clicked: in normal models (public uploads), I keep upload_to relative to MEDIA_ROOT, and in the models with access control I make upload_to relative to another, protected folder, via FileSystemStorage. To serve the files in this protected folder, I use XSendFile. It seems like a good strategy. I still need to test it, but I am satisfied with the answer. Thanks!

1

I have two suggestions. The first uses more server resources and ensures more privacy; the second, which Facebook uses to store images, performs better but relies on a random URL pattern.

CASE 1: Use the programming language (or the server) to restrict static file access based on the logged-in user

  1. Place the files in a location that is not normally accessible, for example a folder above your equivalent of www or public_html.
  2. Use your programming language, where you have full control and know which user is authenticated: when a URL is accessed, the code checks the user and, if access is allowed, reads the private image and returns it.
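
The two steps of CASE 1 can be sketched in plain Python, independent of any framework. This is only an illustration under stated assumptions: `read_private_file`, `private_root`, and the `user_is_allowed` flag are hypothetical names, and a real Django view would derive the permission from the authenticated user rather than receive a boolean.

```python
from pathlib import Path


def read_private_file(private_root, relative_name, user_is_allowed):
    """Return the bytes of a file stored outside the web root.

    Hypothetical helper: the caller decides `user_is_allowed` from its own
    authentication and permission checks.
    """
    if not user_is_allowed:
        raise PermissionError("user may not read this file")

    root = Path(private_root).resolve()
    target = (root / relative_name).resolve()

    # Refuse names such as "../settings.py" that escape the private folder.
    if root not in target.parents:
        raise PermissionError("path escapes the private folder")

    return target.read_bytes()
```

Reading the file in the application is what makes this case slower than letting the web server send it; the path check is there because the file name usually comes from the URL.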

CASE 2: Store the image in a location accessible to anyone, but with a URL that is difficult to predict

  1. Place the image in a location accessible to anyone, but give it a very complex URL.
    1. Do not use sequential numbers!
    2. A plain MD5 is not random enough, so do not use that either.
  2. Store this random URL and display it only to the users who should have access to it.
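
As a rough sketch of what "difficult to predict" can mean in practice, the file name can be built from cryptographically random bytes using Python's standard `secrets` module. The function name `make_unguessable_name` and the default of 32 random bytes are assumptions made for illustration, not something prescribed by this answer.

```python
import secrets


def make_unguessable_name(extension, nbytes=32):
    """Build a file name from random bytes, not from an ID or the file contents."""
    token = secrets.token_urlsafe(nbytes)  # URL-safe text derived from nbytes random bytes
    return f"{token}.{extension}"
```

The generated name would then be stored alongside the record (step 2) and shown only to the users who are allowed to see the file.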

My recommendation: when in doubt, use CASE 1. CASE 2 is interesting only in more particular situations, such as Facebook’s. Another use of CASE 2 is to allow access to files without authentication, using only a URL sent by email, which is common in emails that deliver boletos (bank payment slips).

  • 1

    Why is "simple md5 not random enough"? Attacks on MD5 assume that the attacker can choose the two files to be hashed, but in this case the attacker already has access to the file; the weaknesses of MD5 are only relevant if they allow access to files that the attacker does not already know.

  • 1

    I agree with @ctgPi: the MD5 collision vulnerability rules out a certain class of applications, but not all of them, so the fear of this algorithm is somewhat unfounded (though SHA-256/512 would be better anyway). In any case, I would simply use a UUID as part of the URL and would not use anything that depends on the contents of the file. Otherwise, an attacker who has a file and wants to know whether the same file exists on the site could test for it. I do not know in what practical case this would matter, but if you can avoid giving the attacker this power, so much the better.

  • By the way, thank you for the answer, but my question is of a more practical nature: how to do these things using Apache/Django. I’m fairly familiar with CASE 2 (I first saw it on Google Docs), and that’s what I’ll do if I don’t have a better solution, and CASE 1 as you pointed out is relatively inefficient (but sometimes unavoidable).

  • 1

    To understand why you should not rely on an MD5 hash alone to obscure sensitive data, consider the case of BBOM: https://tecnoblog.net/136476/bbom-falha-que-expoe-dados-dos-usuarios/ . It became public right after Telexfree had the same flaw, except that Telexfree used sequential, more obvious IDs.

  • @Emersonrochaluiz Good point! That is another reason to use UUIDs instead of hashes: we assume the attacker knows nothing that could help guess the URL, but this example shows that is not always the case (BBOM hashed the boleto identification sequence, so if you guess the sequence, which is easy because it is sequential, you can compute the hash yourself, whatever the algorithm is). If it is not possible to use a UUID, then the ideal is to use a MAC, not a hash.
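
The difference between a hash and a MAC mentioned in the last comment can be sketched with Python's standard `hmac` module. The names `boleto_token` and `token_is_valid`, and the choice of HMAC-SHA256, are assumptions made for illustration; the point is only that, without the server-side secret key, guessing the sequential ID is not enough to compute the token.

```python
import hashlib
import hmac


def boleto_token(secret_key, boleto_id):
    """Derive a URL token from a sequential ID using HMAC-SHA256."""
    message = str(boleto_id).encode("utf-8")
    return hmac.new(secret_key, message, hashlib.sha256).hexdigest()


def token_is_valid(secret_key, boleto_id, token):
    """Compare in constant time so timing does not leak partial matches."""
    expected = boleto_token(secret_key, boleto_id)
    return hmac.compare_digest(expected, token)
```

Here `secret_key` is a bytes value known only to the server; a plain `hashlib.sha256` of the same ID would be reproducible by anyone who guesses the ID.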
