## Batch file for running a Java command

batch-file,docx
I have to run the following command for hundreds of .docx files in a directory on Windows in order to convert them to .txt: java -jar tika-app-1.3.jar -t somedocfile.doc > converted.txt I was wondering if there is an automatic way to do this, such as writing a ".bat" file....
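
The question asks for a ".bat" file; as an alternative, here is a minimal Python sketch of the same loop. It only assumes what the question states: the jar is named tika-app-1.3.jar and `-t` writes plain text to standard output. The function names and the choice to write `<name>.txt` next to each input are illustrative, not part of the original question.

```python
from pathlib import Path
import subprocess

TIKA_JAR = "tika-app-1.3.jar"  # jar name taken from the question


def tika_command(doc_path):
    """Build the argument list for converting one file to plain text."""
    return ["java", "-jar", TIKA_JAR, "-t", str(doc_path)]


def convert_all(directory):
    """Run Tika on every .docx in `directory` and return the output paths.

    Each result is written next to its input as <name>.txt (an assumption;
    the question redirects everything to a single converted.txt).
    """
    written = []
    for doc_path in sorted(Path(directory).glob("*.docx")):
        out_path = doc_path.with_suffix(".txt")
        # Redirect the child's stdout into the file, like `> file` in a shell.
        with open(out_path, "wb") as out:
            subprocess.run(tika_command(doc_path), stdout=out, check=True)
        written.append(out_path)
    return written
```

Calling `convert_all(".")` from the directory that holds both the jar and the documents would mirror the command in the question, one file at a time.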

## Opening Word (.docx) files on a Windows Form C#

c#,winforms,webbrowser-control,docx
I'm trying to give my program the ability to display a Microsoft Word file on a form, but I'm not having any luck. I want to be able to open the file and display it on the form as read-only. So basically just display its contents....

## How to insert line break into Word (docx) document using OpenXMLPowerTools?

c#,.net,openxml,docx,openxml-sdk
I'm writing a library which generates Word documents based on a template. Some text needs to be replaced with other text. Everything seems to be working; there is a TextReplacer class which performs the replacements. Things get worse when I need to replace a single-line part of text with...

## PHPWord corrupted file?

php,ms-word,openxml,docx,phpword
My basic PHPWord setup is working. This is my code: <?php require_once 'PhpWord/Autoloader.php'; \PhpOffice\PhpWord\Autoloader::register(); function getEndingNotes($writers) {$result = ''; // Do not show execution time for index if (!IS_INDEX) { $result .= date('H:i:s') . " Done writing file(s)" . EOL;$result .= date('H:i:s') . " Peak memory usage: "...

## Converting a docx containing a chart to PDF

java,pdf,charts,docx,docx4j
I've got a docx4j generated file which contains several tables, titles and, finally, an excel-generated curve chart. I have tried many approaches in order to convert this file to PDF, but did not get to any successful result. Docx4j with xsl-fo did not work, most of the things included in...

## Reading docx files, recognizing and storing italicized text

python,string,docx
How should I go about reading a .docx file with Python, recognizing the italicized text, and storing it as a string? I looked at the docx Python package, but all I see are features for writing to a .docx file. I appreciate the help in advance...
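
One possible approach, sketched with only the standard library: a .docx is a ZIP archive whose main text lives in word/document.xml, and a run (`w:r`) is italic when its run properties (`w:rPr`) contain a `w:i` element. The function names below are hypothetical, and the sketch deliberately ignores italics inherited from paragraph or character styles.

```python
import zipfile
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}


def italic_text(document_xml):
    """Return the concatenated text of runs that are directly marked italic."""
    root = ET.fromstring(document_xml)
    parts = []
    for run in root.iter(f"{{{W}}}r"):
        italic = run.find("w:rPr/w:i", NS)
        if italic is None:
            continue
        # <w:i w:val="0"/> or <w:i w:val="false"/> switches italics off.
        if italic.get(f"{{{W}}}val") in ("0", "false"):
            continue
        parts.append("".join(t.text or "" for t in run.findall("w:t", NS)))
    return "".join(parts)


def italic_text_from_docx(path):
    """Read word/document.xml out of the .docx archive and scan it."""
    with zipfile.ZipFile(path) as archive:
        return italic_text(archive.read("word/document.xml"))


# A tiny hand-written document body, used only for illustration.
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body><w:p>'
    '<w:r><w:t xml:space="preserve">plain </w:t></w:r>'
    '<w:r><w:rPr><w:i/></w:rPr><w:t>slanted</w:t></w:r>'
    '<w:r><w:rPr><w:i w:val="0"/></w:rPr><w:t> upright</w:t></w:r>'
    '</w:p></w:body></w:document>'
)
```

With the sample above, `italic_text(SAMPLE)` picks out only the middle run.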

## Apply a TableStyle to a Word Table

c#,ms-word,openxml,docx
Trying to style a table using a predefined style but nothing is working. I've tried with a newly created document and one created from a saved template. Using the SDK Productivity tool I can see the style is there in the template but it's not being applied. I've tried...

## Changing XML of docx to load images from web

xml,image,ms-word,docx
Basically, my docx is very large because it contains many images, and I wanted to reduce its size. I tried everything, including compressing the images, and got it from 25MB down to 13MB. But I wanted to shrink it further, so I was playing around and...

## Convert docx to mediawiki and preserve [[Image:]]

converter,mediawiki,docx,libreoffice,soffice
Currently, I'm trying to move a docx to a mediawiki file and preserve the proper filenames in the [[Image:]] tags. For some reason, the proper image file gets swallowed (ie, normally it'd be media/image4.jpg, but instead it's just empty). I've tried extracting the docx and looking at docx/word/_rels/document.xml.rels but I...

## DocX C# library changing page format on InsertSectionPageBreak

c#,ms-word,docx
I am working with DocX, a library for creating Microsoft .docx files in C#: https://docx.codeplex.com/ I am loading a preexisting file into the program and then adding content. It was the easiest way to get a predefined header. I noticed that if I use InsertSectionPageBreak the page format of...

## POI ignoring some snippets of docx

java,xml,apache-poi,docx
I'm trying to use this code (POI 3.11) to extract text from a docx file: XWPFDocument doc = new XWPFDocument(OPCPackage.open("sample.docx")); for (XWPFParagraph p : doc.getParagraphs()) { List<XWPFRun> runs = p.getRuns(); if (runs != null) { for (XWPFRun r : runs) { String text = r.getText(0); System.out.println(text); } } } Here...

## Unable to connect to LibreOffice on port 2002?

docx,libreoffice,doc,libreoffice-base,docverter
I am using docvert 5.1 to convert .doc to HTML. When I run "Tests (run all)", I get the following error message: " ✘Unable to run tests due to exception. Failed to connect to LibreOffice on port 2002. Connector : couldn't connect to socket...

## How to find a list in docx using python?

python,docx,python-docx
I'm trying to pull apart a word document that looks like this: 1.0 List item 1.1 List item 1.2 List item 2.0 List item It is stored in docx, and I'm using python-docx to try to parse it. Unfortunately, it loses all the numbering at the start. I'm trying to...
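
A hedged sketch of one way to spot list paragraphs without python-docx, using only the standard library. In WordprocessingML a directly numbered paragraph carries a `w:numPr` element with a level (`w:ilvl`) and a numbering id (`w:numId`); the visible labels such as "1.1" are computed by Word from the numbering definitions and are not stored in the paragraph text, which would explain why they seem to get lost. The function name and sample are illustrative, and numbering applied purely through a paragraph style is not detected here.

```python
import xml.etree.ElementTree as ET

W = "http://schemas.openxmlformats.org/wordprocessingml/2006/main"
NS = {"w": W}
VAL = f"{{{W}}}val"


def list_items(document_xml):
    """Return (level, numbering id, text) for each directly numbered paragraph."""
    root = ET.fromstring(document_xml)
    items = []
    for para in root.iter(f"{{{W}}}p"):
        num_pr = para.find("w:pPr/w:numPr", NS)
        if num_pr is None:
            continue  # not a list paragraph (or numbered via a style)
        ilvl = num_pr.find("w:ilvl", NS)
        num_id = num_pr.find("w:numId", NS)
        level = int(ilvl.get(VAL, "0")) if ilvl is not None else 0
        list_id = num_id.get(VAL) if num_id is not None else None
        text = "".join(t.text or "" for t in para.iter(f"{{{W}}}t"))
        items.append((level, list_id, text))
    return items


# Hand-written sample body: one plain paragraph and two list paragraphs.
SAMPLE = (
    f'<w:document xmlns:w="{W}"><w:body>'
    '<w:p><w:r><w:t>Intro</w:t></w:r></w:p>'
    '<w:p><w:pPr><w:numPr><w:ilvl w:val="0"/><w:numId w:val="1"/></w:numPr></w:pPr>'
    '<w:r><w:t>List item</w:t></w:r></w:p>'
    '<w:p><w:pPr><w:numPr><w:ilvl w:val="1"/><w:numId w:val="1"/></w:numPr></w:pPr>'
    '<w:r><w:t>Nested item</w:t></w:r></w:p>'
    '</w:body></w:document>'
)
```

The level and numbering id are enough to rebuild counters like 1.0, 1.1, 2.0 in your own code, as long as you are willing to assume a simple decimal scheme.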

## Get docx file contents using JavaScript/jQuery

javascript,jquery,docx
I wish to open/read a docx file using client-side technologies (HTML/JS). Kindly assist if this is possible. I have found a JavaScript library named docx.js but cannot seem to locate any documentation for it (http://blog.innovatejs.com/?p=184). The goal is to make a browser-based search tool for docx files...

## Apache POI characters run for .docx

java,api,apache-poi,document,docx
In .doc files, there is a function to get each character in a paragraph by using CharacterRun charrun = paragraph.getCharacterRun(k++); and then I can use those character runs to inspect their attributes, like if ( charrun.isBold() == true) System.out.print(charrun.text()); or something like that. But .docx files seem to have no...

xslt,docx,toc
I am trying to retrieve the TOC from a docx's document.xml file using XSLT. Here is my XSLT: <xsl:transform xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:sap="http://www.sap.com/sapxsl" xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main" exclude-result-prefixes="w" version="2.0"> <xsl:output indent="yes" method="xml"/> <xsl:strip-space elements="*"/> <xsl:template match="w:sdt"> <xsl:element name="root"> <xsl:attribute name="label"> <xsl:value-of...

## Output Web Page as DOCX?

docx
In the past, if I wanted a web page to display as a .DOC word document, I could do so by doing this in the page load: Response.AddHeader("content-disposition", "attachment;filename=FullDetail.doc") Response.ContentType = "application/vnd.word" I was hoping to output the web page as a .DOCX by doing: Response.AddHeader("content-disposition", "attachment;filename=FullDetail.docx") Response.ContentType = "application/vnd.openxmlformats-officedocument.wordprocessingml.document"...

## Docx generation - reuse

pdf-generation,docx,xdocreport
I'm looking to generate docx and pdf documents in my Java application. The best, most cost-effective solution seems to be xdocreport. I've started using it and it's good. However, xdocreport doesn't seem to allow reuse of common sections across documents. E.g., I want to create two documents -...

## Asciidoc and math equation does not render on .docx

docx,doc,pandoc,asciidoc,asciidoctor
I'm trying to convert .adoc files to .docx. Currently I'm using: asciidoctor file.adoc -o file.html pandoc -s -S file.html -o output.docx My math equations or symbols inside the .adoc look like latexmath:[$\phi$], with more text, as in: Inline test latexmath:[$\sin(x)$] After conversion, the .docx contains strange lines:...

## Using WordprocessingDocument error: Unable to create mutex

c#,.net,openxml,docx,openxml-sdk
I'm using this simple pattern to create a docx file in an ASP.NET app: var outputFileName = "creating some file name here..."; var outputFile = string.Format("~/App_Data/files/{0}.docx", outputFileName); // creating a file stream to write to var outputStream = new FileStream(HttpContext.Current.Server.MapPath(outputFile), FileMode.OpenOrCreate); // creating the default template using (var sr =...

## Cannot count number of characters in a docx file generated from a non-empty XHTML

java,jaxb,xhtml,docx,docx4j
I implemented an XHTML-to-DocX converter using DocX4J. It creates the DocX file without problems. To finish my task I decided to implement a simple test. The test consists of counting the number of chars in the DocX created and then comparing it with the already known number of...

## How to get rmarkdown and knitr to use em-dash with .docx files?

knitr,docx,rmarkdown
I am new to using rmarkdown and knitr to produce .docx word documents. The rmarkdown reference guide states that using -- gives an en-dash, and --- gives an em-dash. If I knit my .Rmd file to HTML then the en-dashes and em-dashes are working correctly, however when knitting to a...

## Jinja2 for word templating

python,jinja2,template-engine,docx
I would like to use jinja2 for Word templating, as mentioned in this short article. The problem I'm facing is as follows: if I put {{title}} in my Word file, the resulting xml can look like this: <w:r><w:t>{{</w:t></w:r><w:proofErr w:type="gramStart"/><w:r><w:t>title</w:t></w:r><w:proofErr w:type="gramEnd"/><w:r><w:t>}}</w:t></w:r></w:p> so it is impossible for jinja to replace this accordingly. Is...

## Regex to match xml tag with multiple attributes

regex,docx
I'm trying to find a regular expression that can match the tag <w:proofErr .... />. The regex101 link: regex101 The original string is: <w:pPr xmlns:w="http://schemas.openxmlformats.org/wordprocessingml/2006/main"><w:autoSpaceDE w:val="0"/><w:autoSpaceDN w:val="0"/><w:adjustRightInd w:val="0"/><w:spacing w:after="0" w:line="240" w:lineRule="auto"/><w:rPr><w:rFonts w:cs="SerifGothicStd-Bold"/><w:b/><w:bCs/><w:sz w:val="24"/><w:szCs...
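
A minimal sketch of one pattern that might work, written with Python's `re` module. It assumes the tag is always self-closing and that no attribute value contains a literal `>` (which XML permits, so a real XML parser is the safer tool). The sample string is made up for illustration, not taken from the question.

```python
import re

# "<w:proofErr", a word boundary so longer tag names are not matched,
# then any run of characters other than ">", ending in "/>".
PROOF_ERR = re.compile(r"<w:proofErr\b[^>]*/>")


def strip_proof_errors(xml_text):
    """Remove every self-closing <w:proofErr .../> element from the text."""
    return PROOF_ERR.sub("", xml_text)


# Illustrative fragment: two proofErr markers between ordinary runs.
sample = (
    '<w:r><w:t>one</w:t></w:r><w:proofErr w:type="gramStart"/>'
    '<w:r><w:t>two</w:t></w:r><w:proofErr w:type="gramEnd"/>'
    '<w:r><w:t>three</w:t></w:r>'
)
matches = PROOF_ERR.findall(sample)
```

Because `[^>]*` cannot cross a `>`, the match never runs past the end of the tag it started in, which is what keeps neighbouring elements such as `<w:pPr ...>` out of the result.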

## PHP DOM Document ignore whitespaces appending nodeValue

php,dom,docx
I'm working with an MS Office Word document through PHP and DOM. I am adding paragraphs to my document. And now I have to make part of the string bold (it comes from the database and I'm unable to change it). Like this: The part of string is bold really. What I...

## Set case of content bound via content controls in docx

ms-word,docx,docx4j
I have a docx file that contains a custom part and a web page that collects input from the user to populate that custom part. One of my "variables" is used multiple times in the document. In some cases, I need it to appear in ALL CAPS. In most cases,...