Monday, March 6, 2017

Weird issue: java.nio.charset.IllegalCharsetNameException: UTF-8



I recently ran into a weird issue and thought I'd note it down for future reference.

When I tried to run an ant build, this error occurred:

Error occurred during initialization of VM
java.nio.charset.IllegalCharsetNameException: UTF-8
        at java.nio.charset.Charset.checkName(Charset.java:315)
        at java.nio.charset.Charset.lookup2(Charset.java:484)
        at java.nio.charset.Charset.lookup(Charset.java:464)
        at java.nio.charset.Charset.defaultCharset(Charset.java:609)
        at sun.nio.cs.StreamEncoder.forOutputStreamWriter(StreamEncoder.java:56)
        at java.io.OutputStreamWriter.<init>(OutputStreamWriter.java:111)
        at java.io.PrintStream.<init>(PrintStream.java:104)
        at java.io.PrintStream.<init>(PrintStream.java:151)
        at java.lang.System.newPrintStream(System.java:1148)
        at java.lang.System.initializeSystemClass(System.java:1192)

This looked easy: a few Google searches showed that this error has something to do with how the encoding is passed to the Java runtime as an argument. ant --help shows that ant has an --execdebug argument, which prints the command it is about to execute, so:

vagrant@exp:/PPM941/SourceCode/java$ ant --execdebug

 -classpath "/usr/share/ant/lib/ant-launcher.jar" -Dant.home="/usr/share/ant" -Dant.library.dir="/usr/share/ant/lib" org.apache.tools.ant.launch.Launcher

Now here comes the weird part: the line printed above is supposed to be the java execution command, but it can't be! A normal java execution command starts with "java" followed by a bunch of arguments, so what is this line?

The insight came slowly. After a long time tinkering (with lots of debugging of the /usr/bin/ant file), I began to suspect: could Linux be messing up the string concatenation? Googling "strange string concatenation linux" does show a lot of results.

To prove this, I piped the output through sed -n l. The l command shows unprintable characters, and -n suppresses automatic printing (so the raw line shown earlier doesn't appear a second time):

$ ant --execdebug | sed -n l

exec "/usr/lib/jvm/java-8-openjdk-amd64/jre/bin/java" -Xmx520m -Dfil\
e.encoding=UTF-8\r -classpath "/usr/share/ant/lib/ant-launcher.jar" -\
Dant.home="/usr/share/ant" -Dant.library.dir="/usr/share/ant/lib" org\
.apache.tools.ant.launch.Launcher

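In the output, you can clearly see that UTF-8 is followed by a \r carriage return. That one character explains both symptoms. The JVM receives the charset name "UTF-8\r", which is not a legal charset name, hence the IllegalCharsetNameException. And when the command is printed to a terminal, the \r moves the cursor back to the start of the line, so everything after it overwrites everything before it, which is why the first half of the command seemed to be missing.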
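The same check can be reproduced in isolation. The following is a minimal sketch, not taken from the original session: the helper name show_hidden and the variable broken are made up for illustration, and the value only imitates the tail of the broken ANT_OPTS.

```shell
#!/bin/sh
# Hypothetical helper (name is illustrative): print a string with its
# control characters made visible, using the same sed -n l trick.
show_hidden() {
    printf '%s\n' "$1" | sed -n l
}

# Command substitution strips trailing newlines only, so the \r survives.
cr=$(printf '\r')

# Imitates the end of the broken ANT_OPTS value (an assumption for the demo).
broken="-Dfile.encoding=UTF-8${cr}"

# A plain echo would look perfectly normal; sed shows the trailing \r,
# followed by $ which marks the end of the line.
show_hidden "$broken"
```

Printing the value with a plain echo gives no hint of the problem, which is why the carriage return is so easy to miss.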

Weirder! How did it get in? Eventually I noticed that I had accidentally executed a file with Windows line endings (CRLF) on Linux. This file exports some environment variables, one of them being ANT_OPTS="-Xmx520m -Dfile.encoding=UTF-8". The Linux shell executed the file even though it had Windows line endings (it did print some warnings, but they slipped my attention), and the trailing \r on that line became part of the value of ANT_OPTS.
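One way to catch this earlier is to check a script for carriage returns before running it, and strip them if any are found. This is only a sketch under assumptions: the helper names has_cr and strip_cr are hypothetical, and the temporary file stands in for the real environment script, whose name is not given here.

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Succeeds (exit status 0) if the file contains at least one carriage return.
has_cr() {
    cr=$(printf '\r')
    grep -q "$cr" "$1"
}

# Writes a copy of the file ($1) to $2 with every carriage return removed.
strip_cr() {
    tr -d '\r' < "$1" > "$2"
}

# Demo with a stand-in file that has a Windows line ending.
tmp=$(mktemp)
printf 'export ANT_OPTS="-Xmx520m -Dfile.encoding=UTF-8"\r\n' > "$tmp"

if has_cr "$tmp"; then
    echo "carriage returns found, writing a cleaned copy"
    strip_cr "$tmp" "${tmp}.unix"
    # Source the cleaned copy and confirm the value no longer ends in \r.
    . "${tmp}.unix"
    printf '%s\n' "$ANT_OPTS" | sed -n l
fi

rm -f "$tmp" "${tmp}.unix"
```

Tools such as dos2unix do the same job where they are installed; tr is used here only because it is available nearly everywhere.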


The issue is strange because its symptom, a charset error from the JVM, looks nothing like its true cause, a stray carriage return in a shell script.
 
