Invalid byte 1 of 1-byte UTF-8 sequence

I have a MyFaces Facelets application, where the page coding is a bit rugged. Anyway, it's developed with Eclipse and built with Ant, and kindof runs ok in Tomcat 2.0.26. So far so good.

Now, I'd rather build with Maven, so I made a couple of pom-files, opened them in Netbeans and built, and now I have a war file that deploys ok. However, on any facelet page it barfs out with

com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.invalidByte(UTF8Reader.java:684)
        at com.sun.org.apache.xerces.internal.impl.io.UTF8Reader.read(UTF8Reader.java:554)
        at com.sun.org.apache.xerces.internal.impl.XMLEntityScanner.load(XMLEntityScanner.java:1742)

So, I've tried a lot of different things, and the application actually run simple pages without facelet stuff. But, everything runs if I just build with Ant instead ... So my question is: What's the most likely difference between an ant build and a maven build that may cause this?

It also seems that even though I've configured for UTF-8 in Netbeans and pom-files, Netbeans eventually ends up reporting the facelet files as ISO-8859-1 after some editing.

I've made sure that most central libs are of same version (especially xerces 2.3.0), I've added an encoding servlet filter that had no effect.

And, I'd rather fix the maven build and keep the buggy pages, than the other way around ... it's my intention to introduce Naven, not fix buggy pages.

Here is what the pom.xml says about encoding:

Basically the pom.xml has the following set ...

 <plugins>

            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-compiler-plugin</artifactId>
                <version>2.0.2</version>
                <configuration>
                    <source>1.6</source>
                    <target>1.6</target>
                    <encoding>${project.build.sourceEncoding}</encoding>>
                </configuration>
            </plugin>
            <plugin>
                <groupId>org.apache.maven.plugins</groupId>
                <artifactId>maven-resources-plugin</artifactId>
                <version>2.2</version>
                <configuration>
                    <encoding>${project.build.sourceEncoding}</encoding>
                </configuration>
            </plugin>

....

    <properties>
        <netbeans.hint.deploy.server>Tomcat60</netbeans.hint.deploy.server>
        <project.build.sourceEncoding>utf-8</project.build.sourceEncoding>
    </properties>

Answers


com.sun.org.apache.xerces.internal.impl.io.MalformedByteSequenceException: Invalid byte 1 of 1-byte UTF-8 sequence.

The cause of this is a file that is not UTF-8 is being parsed as UTF-8. It is likely that the parser is encountering a byte value in the range FE-FF. These values are invalid in the UTF-8 encoding.

The problem could probably be solved by changing the XML declaration of the file to be the correct encoding or re-encoding the file to UTF-8.


On Windows it's very easy. Get Notepad++ if you don't have it, and change the encoding using the "encoding" menu.


I had the same problem!

I had solved it using the following piece of code:

String str = new String(oldstring.getBytes("UTF-8"));

Need Your Help

ViewPager Not Updating Correct Detail

android android-recyclerview fragmentpageradapter

I have a ViewPager full of different people with work orders assigned to them. I'm trying to follow the master/detail flow with the people as the view pagers, master is the list of work orders, and...

write Python string object to file

python string object

I have this block of code that reliably creates a string object. I need to write that object to a file. I can print the contents of 'data' but I can't figure out how to write it to a file as output...