7
I have a system where all the files the client sends me are in DOC or DOCX format. However, the client also wants to be able to download these documents as TXT.
Is there a simple way to convert DOC or DOCX to TXT with PHP?
3
I managed to solve the problem. Here is how I did it:
First, I open the Word document using the IOFactory class from the PHPWord library:
$reader = \PhpOffice\PhpWord\IOFactory::createReader('Word2007');
$phpword = $reader->load('arquivo.docx');
Then I save the document as HTML to a temporary file:
$tempfile = tempnam(sys_get_temp_dir(), 'docx_'); // tempnam() requires a directory and a filename prefix
$phpword->save($tempfile, 'HTML');
I use the DOMDocument class to extract only the contents of the body tag:
$dom = new DOMDocument('1.0', 'UTF-8');
@$dom->load($tempfile); // The @ is intentional: it suppresses parser warnings ;)
$body = $dom->getElementsByTagName('body')->item(0)->nodeValue;
Finally, I strip the remaining HTML formatting. I also make the text display correctly in Windows Notepad by replacing "\n" with "\r\n":
$txt = str_replace("\n", "\r\n", strip_tags($body));
file_put_contents('arquivo.txt', $txt);
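One caveat about the line-ending step: if the extracted text already contains "\r\n", a plain str_replace turns it into "\r\r\n". A minimal sketch of a more defensive normalization follows. The helper name toWindowsLineEndings is hypothetical and not part of PHPWord.

```php
<?php
// Hypothetical helper (not part of PHPWord): convert any mix of line
// endings to Windows-style CRLF without doubling an existing "\r".
function toWindowsLineEndings(string $text): string
{
    // CRLF is listed first so it is matched as a single line break;
    // lone CR and lone LF are handled afterwards.
    $result = preg_replace('/\r\n|\r|\n/', "\r\n", $text);

    // preg_replace() returns null on a regex error; fall back to the input.
    return $result ?? $text;
}
```

With this helper, the last step could be written as file_put_contents('arquivo.txt', toWindowsLineEndings(strip_tags($body)));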
Have you tried PHPWord?
– rray
Exactly, @rray. I forgot to mention that in the question. Do you know how to do this with PHPWord?
– Wallace Maxters
I believe these two links can help you: http://stackoverflow.com/questions/19503653/how-to-extract-text-from-word-file-doc-docx-xlsx-pptx-php http://stackoverflow.com/questions/5540886/extract-text-from-doc-and-docx
– GuiDupas
Here is another example: http://stackoverflow.com/questions/188452/reading-writing-a-ms-word-file-in-php
– Ivan Ferrer
Thank you, guys. I'm starting to get it. A DOCX file is a ZIP archive in disguise. If you change its extension to ZIP, you will see that it contains several XML files that make up the document.
– Wallace Maxters
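Building on the observation that a DOCX file is a ZIP archive, the text can also be read straight from the archive without PHPWord. This is only a rough sketch under a few assumptions: the zip extension is enabled, the main body is stored in word/document.xml (the usual location in a DOCX), and the function names docxXmlToText and docxToText are hypothetical. It does not handle legacy DOC files, and it may pick up text from elements other than the visible body, such as field codes.

```php
<?php
// Hypothetical helper: reduce the XML of word/document.xml to plain text.
function docxXmlToText(string $xml): string
{
    // Each paragraph ends with </w:p>; turn that into a newline
    // before the tags are removed.
    $xml = str_replace('</w:p>', "\n", $xml);

    // Explicit line breaks inside a paragraph (<w:br/>, with or without attributes).
    $xml = preg_replace('/<w:br\b[^>]*\/>/', "\n", $xml) ?? $xml;

    // Drop all remaining markup, then decode XML entities such as &amp;.
    $text = strip_tags($xml);

    return html_entity_decode($text, ENT_QUOTES | ENT_XML1, 'UTF-8');
}

// Hypothetical helper: open a DOCX as a ZIP archive and extract its text.
function docxToText(string $path): string
{
    $zip = new ZipArchive();
    if ($zip->open($path) !== true) {
        throw new RuntimeException("Cannot open $path as a ZIP archive");
    }

    // Assumption: the document body lives at this path inside the archive.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();

    if ($xml === false) {
        throw new RuntimeException("word/document.xml not found in $path");
    }

    return docxXmlToText($xml);
}
```

It could be used as file_put_contents('arquivo.txt', docxToText('arquivo.docx')); where the file names are the ones from the answer above.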