7
I have a system where all the files clients send me are in DOC or DOCX format, but the clients also want to be able to download those documents as TXT.
Is there a simple way to convert DOC or DOCX to TXT with PHP?
3
I managed to solve the problem. Here is how I did it:
I open the Word document with the IOFactory class from the PHPWord library.
$reader = \PhpOffice\PhpWord\IOFactory::createReader('Word2007');
$phpword = $reader->load('arquivo.docx');
Then I save the document as HTML in a temporary file:
$tempfile = tempnam(sys_get_temp_dir(), 'phpword'); // tempnam() requires a filename prefix as its second argument
$phpword->save($tempfile, 'HTML');
I use the DOMDocument class to pick out only the body tag:
$dom = new DOMDocument('1.0', 'UTF-8');
@$dom->load($tempfile); // The @ is intentional: it suppresses parser warnings ;)
$body = $dom->getElementsByTagName('body')->item(0)->nodeValue;
Then I strip whatever HTML markup is left. I also make the text display correctly in Windows Notepad by replacing "\n" with "\r\n".
$txt = str_replace("\n", "\r\n", strip_tags($body));
file_put_contents('arquivo.txt', $txt);
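One caveat with the line-ending step: if the extracted text already contains "\r\n", a plain replacement of "\n" turns it into "\r\r\n". A minimal sketch of a safer variant follows; the function name toWindowsLineEndings is hypothetical and not part of PHPWord.

```php
<?php
// Hypothetical helper (not part of PHPWord): converts any mix of line
// endings to Windows-style "\r\n" without doubling existing "\r".
function toWindowsLineEndings(string $text): string
{
    // First collapse "\r\n" and lone "\r" to "\n"...
    $normalized = str_replace(["\r\n", "\r"], "\n", $text);
    // ...then expand every "\n" to "\r\n".
    return str_replace("\n", "\r\n", $normalized);
}
```

In the snippet above it would slot in as $txt = toWindowsLineEndings(strip_tags($body));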
Have you tried PHPWord?
– rray
Exactly, @rray, I forgot to mention that in the question. Do you know how to do this with PHPWord?
– Wallace Maxters
I believe these two links can help you: http://stackoverflow.com/questions/19503653/how-to-extract-text-from-word-file-doc-docx-xlsx-pptx-php http://stackoverflow.com/questions/5540886/extract-text-from-doc-and-docx
– GuiDupas
Here is another example: http://stackoverflow.com/questions/188452/reading-writing-a-ms-word-file-in-php
– Ivan Ferrer
Thanks, guys. I'm starting to get it. A DOCX file is a ZIP archive in disguise. If you change its extension to ZIP, you will see that it contains several XML files that describe the content and formatting.
– Wallace Maxters