How to get the MIME type of a file in Python?

4

How can I get the MIME type of a file in Python?

I don't just mean guessing it from the extension, but also detecting it by reading the file's metadata.

How can I do that?

Note: I tried using python-magic on Windows, but it didn't work.

3 answers

1

Solution using python-magic:

import magic
m = magic.Magic(mime=True)
mime_type = m.from_file("arquivo.pdf")
print(mime_type)

Edit:

According to the official python-magic website, there are 2 factors that can keep the code above from working.

The first is a library name conflict: there are two Python libraries called magic! The correct one is python-magic-0.4.13.

The second (which may be why it fails on Windows) is that the library has dependencies that need to be installed.

On Windows, the file utility must exist on the machine, and its path must be passed as an argument to the constructor:

import magic
# Raw string, so that "\f" in the path is not read as a form-feed escape.
# python-magic documents magic_file as the path of the magic database to load.
m = magic.Magic(mime=True, magic_file=r"C:\windows\system32\file.exe")
mime_type = m.from_file("arquivo.pdf")
print(mime_type)

The utility file compiled for Windows can be obtained here.

More details can be obtained here.
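
Since either factor can break the import or the call, one option is to guard it and fall back to an extension-based guess. This is only a minimal sketch: the helper name detect_mime is hypothetical, and it assumes python-magic's module-level magic.from_file(path, mime=True) is available when the right package is installed.

```python
import mimetypes


def detect_mime(path):
    """Return a MIME type string, or None if nothing could be determined.

    Hypothetical helper: tries python-magic first, then falls back to
    the standard-library mimetypes module (extension only).
    """
    try:
        import magic  # python-magic; may be missing or shadowed by another "magic"
        return magic.from_file(path, mime=True)
    except (ImportError, AttributeError, OSError):
        # ImportError: the package or the underlying libmagic was not found.
        # AttributeError: a different "magic" package without from_file is installed.
        # OSError: the file could not be opened.
        return mimetypes.guess_type(path)[0]


if __name__ == "__main__":
    print(detect_mime("arquivo.pdf"))
```

Note that the fallback only looks at the file name, so it does not answer the "read the content" part of the question; it just keeps the script from crashing.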

  • Did you test it on Windows?

  • @Andersoncarloswoss What exactly didn't work on Windows?

  • I don't know. The question has the note: "I tried to use python-magic on Windows, but it didn't work." You posted the solution using python-magic, so I assumed you had tested it on Windows. Did you?

  • @Andersoncarloswoss There are 2 Python libraries that use the name magic, which causes a name conflict (see the link in the answer). If you are using the wrong library, the code above will probably fail. Maybe your problem has nothing to do with Windows.

  • @Andersoncarloswoss The correct library package name is python-magic-0.4.13.

  • Yeah, I just wanted to know if you tested it on Windows, because apparently Wallace tested it with this library and it didn't work.

  • 1

    @Andersoncarloswoss I edited the question with an explanation.

0

Using subprocess and shlex to call the file utility:

import subprocess
import shlex

file_name = 'pdf.fake'
cmd = shlex.split('file --mime-type {0}'.format(file_name))
result = subprocess.check_output(cmd)
mime_type = result.split()[-1]
print(mime_type)

Output:

b'application/pdf'
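
A variation on the same idea, as a sketch: passing the arguments as a list avoids problems with file names that contain spaces, and decoding the output gives a plain string instead of bytes. The function names below are hypothetical, and it assumes the file utility is on the PATH (it returns None otherwise).

```python
import shutil
import subprocess


def parse_file_output(raw):
    """Decode the bytes printed by `file --brief --mime-type`."""
    return raw.decode("utf-8", errors="replace").strip()


def mime_from_file_command(path):
    """Ask the external `file` utility for the MIME type of `path`.

    Returns None when the utility is not installed.
    """
    if shutil.which("file") is None:
        return None
    # A list of arguments needs no shell quoting, even if `path` has spaces.
    raw = subprocess.check_output(["file", "--brief", "--mime-type", path])
    return parse_file_output(raw)


if __name__ == "__main__":
    print(mime_from_file_command("pdf.fake"))
```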

0

You can use the mimetypes library:

>>> import urllib, mimetypes
>>> 
>>> url = urllib.pathname2url('meu_arquivo.xml')
>>> print mimetypes.guess_type(url)
('application/xml', None)

Reference: https://docs.python.org/2/library/mimetypes.html
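
For Python 3, a rough equivalent could look like the sketch below. The helper name guess_by_extension is hypothetical. Keep in mind that mimetypes only looks at the file name, so an unknown extension gives None, and the exact type reported for .xml can vary between systems.

```python
import mimetypes


def guess_by_extension(name):
    """Return the MIME type guessed from the file name alone, or None."""
    mime_type, _encoding = mimetypes.guess_type(name)
    return mime_type


if __name__ == "__main__":
    for name in ("meu_arquivo.xml", "meu_arquivo.fake"):
        print(name, guess_by_extension(name))
```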

  • 1

    Note that in Python 3 the function urllib.pathname2url was moved to the urllib.request module, so it becomes urllib.request.pathname2url.

  • If you change the extension from meu_arquivo.xml to meu_arquivo.fake, does it still work?
