How to diff .odt files with difftool? kdiff3 diff outputs unreadable characters
In git I'm trying to use .gitattributes to compare .odt files (LibreOffice Writer files) with difftool. Following this guide: http://www-verimag.imag.fr/~moy/opendocument/ I created a .gitattributes file with this:
*.ods diff=odf
*.odt diff=odf
*.odp diff=odf
*.ods difftool=odf
*.odt difftool=odf
*.odp difftool=odf
This made git diff compare the text in .odt files. However, when git difftool launches kdiff3 to compare the .odt files, I get this pop-up error:
Some input characters could not be converted to valid unicode. You might be using the wrong codec. (e.g. UTF-8 for non UTF-8 files). Don't save the result if unsure. Continue at your own risk. Affected input files are in A, B.
...and all of the characters in the files are mumbo jumbo.
What went wrong? How do I fix this?
I don't know if this is important, but I guess I haven't configured 'diff.tool', because every time I run:
$ git difftool
I get this output:
This message is displayed because 'diff.tool' is not configured.
See 'git difftool --tool-help' or 'git help config' for more details.
'git difftool' will now attempt to use one of the following tools:
opendiff kdiff3 tkdiff xxdiff meld kompare gvimdiff diffuse diffmerge ecmerge p4merge araxis bc codecompare emerge vimdiff

Viewing (1/1): 'diffexperiment.odt'
Launch 'kdiff3' [Y/n]:
Could that be why kdiff3 doesn't seem to work with odt2txt?
EDIT: I retried to do this with Microsoft Word documents and got a little further here.
I played around with the .kdiff3rc configuration, but none of the options I added made the unreadable characters readable. I then changed the comparison tool to vimdiff; when I ran git difftool on Microsoft Word documents, vimdiff displayed a list of files ending in .xml instead of unreadable characters.
When I pressed Enter on one of the files, this was displayed:
<?xml version="1.0" encoding="UTF-8"?>
" Browsing zipfile /tmp/4LMJbj_HI I am writing something here..docx |<Types xmlns="http://schemas.openxmlformats.org/package/2006/content-types"><Override PartName
" Select a file with cursor and press ENTER |="/_rels/.rels" ContentType="application/vnd.openxmlformats-package.relationships+xml"/><Overr
|ide PartName="/word/settings.xml" ContentType="application/vnd.openxmlformats-officedocument.w
_rels/.rels |ordprocessingml.settings+xml"/><Override PartName="/word/_rels/document.xml.rels" ContentType=
word/settings.xml |"application/vnd.openxmlformats-package.relationships+xml"/><Override PartName="/word/fontTabl
word/_rels/document.xml.rels |e.xml" ContentType="application/vnd.openxmlformats-officedocument.wordprocessingml.fontTable+x
word/fontTable.xml |ml"/><Override PartName="/word/styles.xml" ContentType="application/vnd.openxmlformats-officed
word/numbering.xml |ocument.wordprocessingml.styles+xml"/><Override PartName="/word/document.xml" ContentType="app
word/styles.xml |lication/vnd.openxmlformats-officedocument.wordprocessingml.document.main+xml"/><Override Part
word/document.xml |Name="/docProps/app.xml" ContentType="application/vnd.openxmlformats-officedocument.extended-p
docProps/app.xml |roperties+xml"/><Override PartName="/docProps/core.xml" ContentType="application/vnd.openxmlfo
docProps/core.xml |rmats-package.core-properties+xml"/>
[Content_Types].xml |</Types>
I posted a new question on this issue here.
In addition to the .gitattributes file, you would need to configure what the odf diff driver means:
git config diff.odf.textconv odt2txt
You also need odt2txt (a simple converter from OpenDocument Text to plain text) in your $PATH (Linux/Mac) or %PATH% (Windows).
There is no need to configure difftool, since kdiff3 is picked up by default. But kdiff3 needs to open a text file, hence the need for odt2txt (to convert the document into a text file first).
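Put together, the setup above could look like the following minimal sketch. It assumes git is installed and works in a throwaway repository created with mktemp; the file name report.odt is hypothetical and only used to query the attributes, and the diff.tool line is optional (it only silences the "'diff.tool' is not configured" notice).

```shell
# Throwaway repository so the sketch does not touch a real project.
repo=$(mktemp -d)
git init -q "$repo"

# Map the OpenDocument extensions to a diff driver named "odf".
cat > "$repo/.gitattributes" <<'EOF'
*.ods diff=odf
*.odt diff=odf
*.odp diff=odf
EOF

# Define what the "odf" driver does: convert the file with odt2txt.
git -C "$repo" config diff.odf.textconv odt2txt

# Optional: name the tool explicitly so 'git difftool' stops guessing.
git -C "$repo" config diff.tool kdiff3

# Read the settings back (report.odt is a hypothetical path; it need not exist).
attr_line=$(git -C "$repo" check-attr diff -- report.odt)
textconv_cmd=$(git -C "$repo" config --get diff.odf.textconv)
echo "$attr_line"
echo "textconv: $textconv_cmd"
```

In a real project the same two steps (the .gitattributes entries and the git config call) would be run inside that project's repository rather than a temporary one.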
For more on textconv, see "Performing text diffs of binary files":
Sometimes it is desirable to see the diff of a text-converted version of some binary files. For example, a word processor document can be converted to an ASCII text representation, and the diff of the text shown. Even though this conversion loses some information, the resulting diff is useful for human viewing (but cannot be applied directly).
The textconv config option is used to define a program for performing such a conversion. The program should take a single argument, the name of a file to convert, and produce the resulting text on stdout.
The text conversion is generally a one-way conversion; This means that diffs generated by textconv are not suitable for applying.
For this reason, only git diff and the git log family of commands (i.e., log, whatchanged, show) will perform text conversion. git format-patch will never generate this output.
If you want to send somebody a text-converted diff of a binary file (e.g., because it quickly conveys the changes you have made), you should generate it separately and send it as a comment in addition to the usual binary diff that you might send.
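The contract quoted above (one argument naming the file, converted text on stdout) can be illustrated with a small wrapper. This is only a sketch: textconv_odf and the ODF_CONVERTER environment variable are hypothetical names introduced here, and it is written as a shell function for readability. Since git runs textconv as a command, the function body would have to be saved as an executable script on your PATH before it could be used as diff.odf.textconv.

```shell
# Hypothetical wrapper following the textconv contract:
# exactly one argument (the file to convert), plain text on stdout.
textconv_odf() {
    if [ "$#" -ne 1 ]; then
        echo "usage: textconv_odf FILE" >&2
        return 2
    fi
    # ODF_CONVERTER is an assumption of this sketch, not a git setting;
    # it defaults to odt2txt, the converter used in the answer.
    "${ODF_CONVERTER:-odt2txt}" "$1"
}
```

With odt2txt installed, textconv_odf some.odt should print the same text that odt2txt some.odt prints; overriding ODF_CONVERTER makes it possible to try the wrapper with another converter.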
On Linux I ran in my home directory:
$ git config diff.odf.textconv odt2txt
I had odt2txt installed, and I assume odt2txt is in $PATH, because when I run $ odt2txt, I get usage information for odt2txt. However, none of this seems to make git diff work on .odt files. When I run $ git diff fileone.odt filetwo.odt, I still get the output Binary files fileone.odt and filetwo.odt differ instead of how the text actually differs. Not sure why it's not working.
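One way to narrow down a "Binary files ... differ" result is to check each piece of the setup separately. The helper below is a hypothetical sketch (check_odf_setup is not a git command). It relies on the fact that .gitattributes and a git config value set without --global belong to a specific repository, so it first checks whether the current directory is inside one.

```shell
# Hypothetical helper: prints what git knows about one file's diff setup.
check_odf_setup() {
    file=$1
    # Repository-level config and .gitattributes only apply inside a repository.
    if ! git rev-parse --is-inside-work-tree >/dev/null 2>&1; then
        echo "repo: no"
        return 1
    fi
    echo "repo: yes"
    # Which diff driver, if any, the attributes assign to the file.
    git check-attr diff -- "$file"
    # The converter configured for the odf driver (empty if unset).
    echo "textconv: $(git config --get diff.odf.textconv)"
    # Whether the converter can actually be found on PATH.
    if command -v odt2txt >/dev/null 2>&1; then
        echo "odt2txt: found"
    else
        echo "odt2txt: missing"
    fi
}
```

Run it from the directory where git diff is being tried, for example check_odf_setup fileone.odt. If it reports "repo: no", or the diff attribute comes back as unspecified, or the textconv line is empty, that piece of the configuration is not being seen by git.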
My guess is that kdiff3 in your case
Some input characters could not be converted to valid unicode. You might be using the wrong codec. (e.g. UTF-8 for non UTF-8 files)....
complains because it cannot find a glyph for certain characters in the selected font, i.e. it cannot draw them.
kdiff3 has lots of configuration options that can be set in the ~/.kdiff3rc configuration file (here is an example). I would play with the ones related to encoding and font, starting with the fonts.
BTW, when you open these odt files with your editor, which font displays them readably for you?
PS Options can be also passed to kdiff3 in command line: kdiff3 --cs "Option1=Val1" --cs "Option2=Val2" --cs ...