How to read and view docx Files using PHP
Processing Word documents with PHP is becoming more common. You can even create a new Word document and work with it; my previous article describes how to create a Word document using PHP.
Today we are going to discuss reading a docx file, converting it to text, and viewing it online. Let's begin with the steps and the code.
<?php
// Extract the text of a .docx file as HTML-ready text; returns false on failure.
function kv_read_word($input_file){
    $kv_texts = '';
    if(!$input_file || !file_exists($input_file)) return false;

    // A .docx file is a zip archive; zip_open() returns an error number on failure.
    $zip = zip_open($input_file);
    if (!$zip || is_numeric($zip)) return false;

    while ($zip_entry = zip_read($zip)) {
        // The main body text is stored in word/document.xml.
        if (zip_entry_name($zip_entry) != "word/document.xml") continue;
        if (zip_entry_open($zip, $zip_entry) == FALSE) continue;
        $kv_texts .= zip_entry_read($zip_entry, zip_entry_filesize($zip_entry));
        zip_entry_close($zip_entry);
    }
    zip_close($zip);

    // Separate table cells with a space and paragraphs with a line break.
    $kv_texts = str_replace('</w:r></w:p></w:tc><w:tc>', " ", $kv_texts);
    $kv_texts = str_replace('</w:r></w:p>', "\r\n", $kv_texts);

    // Remove the remaining XML tags and turn line breaks into <br /> for HTML output.
    return nl2br(strip_tags($kv_texts));
}
?>
The above function parses the text in a Word document and returns it.
Now pass the path of the input file to the function and print the result.
<?php
$kv_texts = kv_read_word('path/to/the/file/kvcodes.docx');
if($kv_texts !== false) {
    // kv_read_word() already applies nl2br(), so print the result directly.
    echo $kv_texts;
} else {
    echo 'Can\'t read that file.';
}
?>
That's all it takes to read a docx file and print it as text.
I have another article for WordPress users, who can try this to process docx files using PHP and WordPress.
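A note for newer PHP versions: the procedural zip_open() family of functions is deprecated as of PHP 8.0. Below is a minimal sketch of the same idea using the ZipArchive class instead. The names kv_docx_xml_to_text() and kv_read_word_ziparchive() are made up for this illustration, and the sketch assumes the zip extension is enabled. Unlike the function above, it returns plain text without nl2br().

```php
<?php
// Hypothetical helper: turn the raw XML of word/document.xml into plain text.
function kv_docx_xml_to_text($xml) {
    // Separate table cells with a space and paragraphs with a line break.
    $xml = str_replace('</w:r></w:p></w:tc><w:tc>', ' ', $xml);
    $xml = str_replace('</w:r></w:p>', "\r\n", $xml);
    // Drop every remaining XML tag.
    return strip_tags($xml);
}

// Hypothetical ZipArchive-based reader; returns plain text or false on failure.
function kv_read_word_ziparchive($input_file) {
    if (!$input_file || !file_exists($input_file)) return false;

    $zip = new ZipArchive();
    // open() returns true on success and an error code otherwise.
    if ($zip->open($input_file) !== true) return false;

    // Read the main document part directly by name.
    $xml = $zip->getFromName('word/document.xml');
    $zip->close();

    if ($xml === false) return false;
    return kv_docx_xml_to_text($xml);
}
```

To show the result in a page, pass the returned text through nl2br() before echoing it.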
Hi, I checked your code but can't make it work. I am getting an error on line 17: Undefined variable $kv_texts
I checked it before publishing. The code works now; I changed the variable declaration. Check it now.
The header part is not read by this code.
It will get you the content. I am not sure about the header and footer.
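On the header and footer question: in a docx archive the body lives in word/document.xml, while headers and footers are normally stored as separate parts, commonly named like word/header1.xml and word/footer1.xml. A hedged sketch that collects the text of those parts is below. The function name kv_read_word_headers_footers() is hypothetical, the file-name pattern is an assumption about typical Word output (the authoritative part names come from the document's relationship files), and the sketch uses the ZipArchive class rather than the zip_open() calls in the article.

```php
<?php
// Hypothetical helper: return header/footer text keyed by part name, or false on failure.
function kv_read_word_headers_footers($input_file) {
    if (!$input_file || !file_exists($input_file)) return false;

    $zip = new ZipArchive();
    if ($zip->open($input_file) !== true) return false;

    $parts = array();
    for ($i = 0; $i < $zip->numFiles; $i++) {
        $name = $zip->getNameIndex($i);
        // Assumed naming: word/header1.xml, word/footer2.xml, and so on.
        if ($name === false || !preg_match('#^word/(header|footer)\d*\.xml$#', $name)) continue;

        $xml = $zip->getFromIndex($i);
        if ($xml === false) continue;

        // End each paragraph with a line break, then drop the XML tags.
        $xml = str_replace('</w:p>', "\r\n", $xml);
        $parts[$name] = trim(strip_tags($xml));
    }
    $zip->close();
    return $parts;
}
```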
It really helps me, but I can't parse images from the docx file.
I just parsed the text, as with plain-text documents. For more functionality, it is better to try PHPWord.
Change this one line of code to escape the apostrophe in the “Can’t” word:
echo 'Can\'t Read that file.';
Also, you might want to convert the "\r\n" to line breaks if you need to show the text as HTML:
$kv_strip_texts = nl2br(strip_tags($kv_texts, ''));
Otherwise, great little function that appears to work well with simple Word docs.
…Rick…
I just applied your changes. Thanks for your suggestions.
Hello brother, may I know how to read multiple docx files in PHP? I can already read a single docx, but I need to read 2 docx files. Thanks, I'll visit again soon.
You can wrap all of that in a function. Just write a loop that passes each file path to the function as its parameter.
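As a rough sketch of that loop: the helper below takes a list of paths and a reader callback and returns the results keyed by path. kv_read_many() is a hypothetical name, and the file names in the commented usage are placeholders; in practice you would pass 'kv_read_word' from the article as the reader.

```php
<?php
// Hypothetical helper: run a reader function over several files.
// $reader is any callable that takes a path and returns text or false.
function kv_read_many(array $paths, $reader) {
    $results = array();
    foreach ($paths as $path) {
        // Keep failures (false) in the result so the caller can report them.
        $results[$path] = call_user_func($reader, $path);
    }
    return $results;
}

// Usage with the article's function (placeholder file names):
// $texts = kv_read_many(array('first.docx', 'second.docx'), 'kv_read_word');
// foreach ($texts as $path => $text) {
//     echo ($text === false) ? "Can't read $path" : $text;
// }
```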
Hi, I am using the same code. It works on some docx files, but not on all of them.
When the docx file contains 2 tab spaces, the text read by zip_entry_read() comes out with the words joined together. I have a lot of docx files and have tested a lot, but have not found an exact solution. If you have any idea, please let me know.
Sorry, no idea about it.
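One possible cause, offered as a guess rather than a confirmed diagnosis: Word stores a tab inside a run as an empty <w:tab/> element, and strip_tags() removes it without leaving any whitespace, so the words on either side end up joined. A small sketch that swaps tabs and line breaks for real whitespace before stripping the tags is below; kv_docx_xml_to_text_with_tabs() is a hypothetical name.

```php
<?php
// Hypothetical variant of the article's clean-up step that preserves tabs.
function kv_docx_xml_to_text_with_tabs($xml) {
    // Only the attribute-less forms are replaced: tab-stop definitions in
    // paragraph properties carry attributes and are left for strip_tags().
    $xml = str_replace(array('<w:tab/>', '<w:br/>'), array("\t", "\r\n"), $xml);

    // Same replacements as the article's function.
    $xml = str_replace('</w:r></w:p></w:tc><w:tc>', ' ', $xml);
    $xml = str_replace('</w:r></w:p>', "\r\n", $xml);

    return strip_tags($xml);
}
```

Browsers collapse tab characters in normal HTML, so when printing the result you may want to wrap it in a <pre> element or replace "\t" with spaces.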
Hi, I've run the code, but it didn't handle superscript/subscript and lowercase Greek symbols (α,β,γ,δ,ε,…).
Please could you tell me how to do it?
Thanks
Sorry, I haven't worked with other languages and symbols, and I don't have an answer for superscript and subscript either.
Hi,
I am not able to read the header content from the DOC file. Please help with that.
I can't be sure without seeing your code.
If the docx file contains an image, the image is not shown after reading the docx file.
Please help me.
For simple text you can use this simple method. For images and tables, you need to go to PHPWord. I hope it helps.