Convert PDF document pages to JPGs

Viewed 152 times

11

The following script works as intended. It takes three parameters and converts every page of a PDF file into a JPG file:

  • ID (natural number)
  • Absolute path (must exist and point to PDF file)
  • System user (must exist)

#!/bin/bash

# Collect parameters values into human readable variables
id="$1"
filenamepath="$2"
owner="$3"

# check the ID
if [ -z "$id" ]; then
    echo "Parâmetro #1 deverá conter o ID da base de dados! Nada foi recebido."
    exit 1
else
    if ! (expr "$id" + 0  > /dev/null 2>&1 && [ "$id" -gt 0 ]); then
        echo "Parâmetro #1 deverá ser um inteiro!"
        exit 1
    fi
fi

# check the file
if [ -z "$filenamepath" ]; then
    echo "Parâmetro #2 deverá conter o caminho completo e nome do PDF a processar! Nada foi recebido."
    exit 1
else
    if [ ! -f "$filenamepath" ]; then
        echo "O ficheiro indicado não existe no servidor, confira o caminho e o nome do ficheiro!"
        exit 1
    fi
fi

# check the owner
if [ -z "$owner" ]; then
    echo "Parâmetro #3 deverá conter o nome do proprietário dos ficheiros a gerar."
    exit 1
else
    if ! id -u "$owner" >/dev/null 2>&1; then
        echo "O nome de utilizador indicado não existe no sistema, confira os dados!"
        exit 1
    fi
fi

# All good, let's work

# Set the filename and the filepath 
filename=$(basename "$filenamepath")
filepath=${filenamepath%/*}

# Give some feedback to the user
echo "A iniciar trabalhos com o ficheiro $filename"

# create directory if it does not exist
if [ ! -d "$filepath/$id" ]; then
    mkdir -p "$filepath/$id"
    chown "$owner:$owner" "$filepath/$id"
else
    echo "A diretoria de destino já existe, vou terminar assumindo que o documento já está convertido!";
    exit 1
fi

# copy the file into the target directory
cp "$filenamepath" "$filepath/$id/$filename"
chown "$owner:$owner" "$filepath/$id/$filename"

# go to the target directory
cd "$filepath/$id/" || exit 1

# convert the PDF pages into .ppm files
pdftoppm "$filepath/$id/$filename" tmp

# convert each .ppm file into a .jpg file
# The .jpg files will have 800px of height with a proportional width
# The .jpg files will have a quality of 80%
for ppm in *.ppm; do
    convert "$ppm" -resize x800 -quality 80% "${ppm%.*}.jpg"
done

chown "$owner:$owner" *

# remove .ppm files
rm -f -- *.ppm

# remove the .pdf file
rm -f -- "$filename"

# Inform the user that the job is completed
echo "Concluído!"

exit 0

It can be used as follows:

#sh ./meuScript 15 /caminho/para/documento/nome.pdf utilizador
      └───────┘ └┘ └──────────────────────────────┘ └────────┘
      ↓         ↓                ↓                      ↓
     script     ID    absolute path to the PDF       user who will own
     name                                            the folder and
                                                     files

This produces the following output (the messages are in Portuguese: "Starting work on file nome.pdf" and "Done!"):

A iniciar trabalhos com o ficheiro nome.pdf
Concluído!

Question

Given the above, is this process efficient, or can it be simplified?

1 answer

1

The script is fine, and if it does its job, it is more than fine.

Still, if you are going to change it, one option is to let pdftoppm do the conversion and the scaling directly, something like:

 mkdir "$id"
 pdftoppm -jpeg -scale-to 800 "$filename" "$id/tmp"

(Note that I am not preserving all the details of the original; they still need to be adjusted.)
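
As a rough sketch of that direction, the fragment below wraps the one-step conversion in a function. The function name pdf_to_jpgs and its argument order are made up for illustration. It assumes a Poppler build whose pdftoppm accepts -jpeg, -jpegopt and -scale-to-y/-scale-to-x; older builds may lack some of these. It uses -scale-to-y 800 with -scale-to-x -1 rather than -scale-to, since that seems closer to the original "800px of height with a proportional width".

```bash
#!/bin/bash

# Hypothetical helper: convert every page of a PDF to JPG in one step.
#   $1 = path to the PDF, $2 = target directory (created if missing)
pdf_to_jpgs() {
    local pdf="$1" outdir="$2"

    [ -f "$pdf" ] || { echo "PDF not found: $pdf" >&2; return 1; }
    mkdir -p "$outdir" || return 1

    # -jpeg                 write JPG files directly, no .ppm intermediates
    # -jpegopt quality=80   JPG quality (assumption: a recent Poppler)
    # -scale-to-y 800       800 px high; -scale-to-x -1 keeps the aspect ratio
    pdftoppm -jpeg -jpegopt quality=80 \
             -scale-to-y 800 -scale-to-x -1 \
             "$pdf" "$outdir/tmp"
}
```

It could then be called as pdf_to_jpgs /caminho/para/documento/nome.pdf /caminho/para/documento/15, followed by the same chown as in the question.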

  • That was the initial idea, but on CentOS pdftoppm has neither the -jpg nor the -jpeg option (?!?), and I still don't understand why. So I went on to use convert to turn the .ppm files produced by pdftoppm into .jpg files. It is extra work, but it works and stays compatible with several systems. Anyway, I had not explored -scale-to; I will try it! :)

  • Of course; it may well be a matter of versions (CentOS is usually more conservative than Fedora). My version, from pdftoppm -v:
pdftoppm version 0.18.1
Copyright 2005-2011 The Poppler Developers - http://poppler.freedesktop.org
Copyright 1996-2004 Glyph & Cog, LLC
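
Since the available options seem to depend on the installed Poppler version, one way to stay portable is to probe for -jpeg at run time and fall back to the pdftoppm plus convert route from the question. A minimal sketch follows; the function names are hypothetical, and it assumes that pdftoppm -h lists the supported options in its usage text.

```bash
#!/bin/bash

# Hypothetical check: succeeds if this pdftoppm advertises the -jpeg option.
# The usage text may go to stderr, so both streams are searched.
pdftoppm_has_jpeg() {
    pdftoppm -h 2>&1 | grep -q -- '-jpeg'
}

# Hypothetical wrapper: $1 = PDF file, $2 = output prefix (e.g. "15/tmp").
convert_pages() {
    local pdf="$1" prefix="$2" ppm

    if pdftoppm_has_jpeg; then
        # Newer Poppler: write the JPG files directly, 800 px high.
        pdftoppm -jpeg -scale-to-y 800 -scale-to-x -1 "$pdf" "$prefix"
    else
        # Older Poppler: go through .ppm files, as in the question.
        pdftoppm "$pdf" "$prefix" || return 1
        for ppm in "$prefix"*.ppm; do
            [ -e "$ppm" ] || continue   # glob matched nothing
            convert "$ppm" -resize x800 -quality 80 "${ppm%.*}.jpg" &&
                rm -f -- "$ppm"
        done
    fi
}
```

With this, the same script should run unchanged on both the older and the newer systems, at the cost of one extra pdftoppm -h call.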
