Reading image data from a table cell via Apache POI

I am stuck and would appreciate some help. Here is my problem.

I am using Apache POI (XWPF) to read a Word (.docx) document. I can read the table data successfully, except for an image that is also inside a table cell. I am new to this API, but as I understand it, it should be possible to read the image's byte data from the cell as well.

POIXMLDocumentPart pictureData = (POIXMLDocumentPart) imageCell.getPart();
PackageRelationship packageRelationship = pictureData.getPackageRelationship();
System.out.println("Source URI:" + packageRelationship.getSourceURI());
System.out.println("Target URI:" + packageRelationship.getTargetURI());

With the code above I can get the image URI as the target, but I don't know how to get the image's binary data.

Any ideas?

Thanks, -Javed

Answers


First, from a Table Cell, get the list of paragraphs. Next, from the paragraph, get the list of Runs. Finally, from the run, get the list of pictures embedded in the run, and you're largely there.

The .docx text extractor in Apache Tika shows how to do all of this; see its source code for details. In general, though, it looks something like this:

for (XWPFParagraph p : cell.getParagraphs()) {
  for (XWPFRun run : p.getRuns()) {
    for (XWPFPicture pic : run.getEmbeddedPictures()) {
       byte[] pictureData = pic.getPictureData().getData();
    }
  }
}
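For a fuller picture, here is a minimal sketch of the same traversal wrapped in helper methods. The class name `TableCellPictures` and its method names are made up for illustration and are not part of POI; the sketch assumes a POI version with the XWPF classes on the classpath, and it only visits top-level tables, not tables nested inside cells.

```java
import java.util.ArrayList;
import java.util.List;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFParagraph;
import org.apache.poi.xwpf.usermodel.XWPFPicture;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;
import org.apache.poi.xwpf.usermodel.XWPFRun;
import org.apache.poi.xwpf.usermodel.XWPFTable;
import org.apache.poi.xwpf.usermodel.XWPFTableCell;
import org.apache.poi.xwpf.usermodel.XWPFTableRow;

// Hypothetical helper class, not part of Apache POI.
public class TableCellPictures {

    /** Returns the raw bytes of every picture embedded in the runs of one cell. */
    public static List<byte[]> picturesIn(XWPFTableCell cell) {
        List<byte[]> result = new ArrayList<>();
        for (XWPFParagraph p : cell.getParagraphs()) {
            for (XWPFRun run : p.getRuns()) {
                for (XWPFPicture pic : run.getEmbeddedPictures()) {
                    XWPFPictureData data = pic.getPictureData();
                    // Defensive check in case the picture relationship cannot be resolved.
                    if (data != null) {
                        result.add(data.getData());
                    }
                }
            }
        }
        return result;
    }

    /** Collects picture bytes from every cell of every top-level table in the document. */
    public static List<byte[]> picturesInTables(XWPFDocument doc) {
        List<byte[]> result = new ArrayList<>();
        for (XWPFTable table : doc.getTables()) {
            for (XWPFTableRow row : table.getRows()) {
                for (XWPFTableCell cell : row.getTableCells()) {
                    result.addAll(picturesIn(cell));
                }
            }
        }
        return result;
    }
}
```

You would typically open the file with something like `new XWPFDocument(new FileInputStream("input.docx"))` (the file name is a placeholder) and pass the result to `picturesInTables`. If you also need the file name or type, `XWPFPictureData` exposes `getFileName()` and `getPictureType()`.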


There is also a much less common way of embedding pictures in a .docx file, which is much more fiddly to work with. On an XWPFDocument, you can use getAllPictures() and getAllPackagePictures() to track down those other pictures, but that won't tell you where in the document they belong.
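As a rough sketch of the document-level approach, the following lists every picture that `getAllPictures()` reports, keyed by its part name inside the package. The class `DocumentPictures` and the method `allPictures` are hypothetical names chosen for this example. Note that the result carries no information about which cell or paragraph a picture appears in.

```java
import java.util.LinkedHashMap;
import java.util.Map;

import org.apache.poi.xwpf.usermodel.XWPFDocument;
import org.apache.poi.xwpf.usermodel.XWPFPictureData;

// Hypothetical helper class, not part of Apache POI.
public class DocumentPictures {

    /** Maps each picture's package part name to its raw bytes. */
    public static Map<String, byte[]> allPictures(XWPFDocument doc) {
        Map<String, byte[]> byPartName = new LinkedHashMap<>();
        for (XWPFPictureData data : doc.getAllPictures()) {
            // The part name is the picture's location inside the .docx package.
            String partName = data.getPackagePart().getPartName().getName();
            byPartName.put(partName, data.getData());
        }
        return byPartName;
    }
}
```

The part names returned here should correspond to the Target URI that the relationship code in the question prints, so they can serve as a way to match a relationship to its bytes.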

