Good morning, I have this code:
#!/usr/bin/env python3
# -*- coding: utf-8 -*-
import os
try:
    from textract import process
except ModuleNotFoundError:
    # Install the system packages and Python modules that textract needs, then retry the import
    os.system('sudo apt-get install -y python3 python-dev python-pip build-essential swig git libpulse-dev && pip3 install pocketsphinx && pip3 install textract')
    os.system('pip3 install textract')
    from textract import process
# Ask for the PDF file
ficheiro = input('Enter the PDF file: ')
# Process the file
data = process(ficheiro)
# Decode the text and print it to the screen
print(data.decode('utf8'))
The purpose of this code is to open a PDF file and extract both the text and the images from it, but it only extracts the text.
Does anyone have any idea how to solve this problem?
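For context, textract's `process` only returns text, so the images would need a separate library. Below is a minimal sketch of one possible approach, assuming a recent PyMuPDF is installed (`pip3 install pymupdf`); the function names `extract_images` and `image_filename`, the `images` output folder, and the file naming scheme are hypothetical and not part of the original code.

```python
import os


def image_filename(page_number, image_number, ext):
    """Build an output name such as 'page1_img2.png' (hypothetical naming scheme)."""
    return f"page{page_number}_img{image_number}.{ext}"


def extract_images(pdf_path, out_dir="images"):
    """Save every embedded image found in pdf_path to out_dir; return the saved paths."""
    # Assumption: PyMuPDF is available. It is imported here so that the rest
    # of the script still loads when the library is missing.
    import fitz

    os.makedirs(out_dir, exist_ok=True)
    saved = []
    with fitz.open(pdf_path) as doc:
        for page_number, page in enumerate(doc, start=1):
            images = page.get_images(full=True)
            for image_number, img in enumerate(images, start=1):
                xref = img[0]  # first item of each entry is the image's xref
                info = doc.extract_image(xref)  # dict with "image" bytes and "ext"
                name = image_filename(page_number, image_number, info["ext"])
                path = os.path.join(out_dir, name)
                with open(path, "wb") as fh:
                    fh.write(info["image"])
                saved.append(path)
    return saved
```

It could be called right after the text is printed, for example `extract_images(ficheiro)`. Note that this only finds images embedded as image objects; vector drawings would not be picked up.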
Please add your answer as a comment instead of an answer if it does not solve the problem.
– Pedro Costa