Java FileWriter, XML and UTF-8

Oddly enough, FileWriter doesn’t use UTF-8 by default. I’m not exactly sure what the default encoding is (possibly ISO-8859-1 or US-ASCII?), but it doesn’t seem to be UTF-8, which is odd given that Java strings are supposed to be Unicode. This causes a problem if you want to write non-ASCII characters and you don’t realise what’s happening. This was a bug in SQLEditor: somebody typed an umlaut into one of the fields and the file wouldn’t reload (which was annoying).

The correct thing to do seems to be to use the following:

OutputStreamWriter out = new OutputStreamWriter(new FileOutputStream(path),"UTF-8");

Which ensures that you are using UTF-8.
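As a slightly fuller illustration of the line above, here is a minimal sketch that writes a small XML snippet as UTF-8 and reads it back to check that the umlaut survives. The class name, the `writeUtf8` helper and the temp file are made up for the example; it also assumes Java 7 or later, so that `StandardCharsets.UTF_8` and try-with-resources are available.

```java
import java.io.FileOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical example class, not part of SQLEditor.
public class Utf8WriteExample {

    // Writes text to the given file as UTF-8, whatever the platform default is.
    static void writeUtf8(Path path, String text) throws IOException {
        try (Writer out = new OutputStreamWriter(
                new FileOutputStream(path.toFile()), StandardCharsets.UTF_8)) {
            out.write(text);
        } // the writer (and the underlying stream) is closed here
    }

    public static void main(String[] args) throws IOException {
        Path path = Files.createTempFile("utf8-example", ".xml");

        // \u00FC is a u-umlaut; the escape keeps this source file pure ASCII.
        String xml = "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
                + "<name>M\u00FCller</name>\n";
        writeUtf8(path, xml);

        // Decode the bytes explicitly as UTF-8 and compare with what was written.
        String readBack = new String(Files.readAllBytes(path), StandardCharsets.UTF_8);
        System.out.println("round trip ok: " + readBack.equals(xml));

        Files.delete(path);
    }
}
```

Using the `StandardCharsets.UTF_8` constant instead of the string "UTF-8" avoids having to handle UnsupportedEncodingException; either form should select the same encoding.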

I suppose the motivation is that simple use of FileWriter stays compatible with applications that are not Unicode-aware and don’t support UTF-8. It probably makes sense at some level, but it just goes to show that you can’t assume anything. 🙂

Update: Bela’s comment (below) explains more about which character set you’ll actually get.
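If you want to see which encoding FileWriter would pick on your own machine, something like the following sketch should print it. The class and method names are made up for illustration. As far as I know the result depends on the operating system, the locale and the JDK version (JDK 18 and later default to UTF-8), so don’t expect the same answer everywhere.

```java
import java.nio.charset.Charset;

// Hypothetical example class for inspecting the platform default encoding.
public class DefaultCharsetExample {

    // Name of the charset the JVM uses when no encoding is given explicitly.
    static String defaultCharsetName() {
        return Charset.defaultCharset().name();
    }

    public static void main(String[] args) {
        // The system property the default is traditionally derived from.
        System.out.println("file.encoding   = " + System.getProperty("file.encoding"));
        System.out.println("default charset = " + defaultCharsetName());
    }
}
```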

43 Responses to Java FileWriter, XML and UTF-8

  11. Bela says:

    Köszönöm (thx in Hungarian)

    After this article I read the Java API documentation carefully, and there is the answer:

    “Convenience class for writing character files. The constructors of this class assume that the default character encoding and the default byte-buffer size are acceptable. To specify these values yourself, construct an OutputStreamWriter on a FileOutputStream. ”

    You get the default character encoding on your system:
    System.getProperty("file.encoding") => on my machine I get Cp1252

    So, never use FileWriter! It is anything but convenient.

    Your answer gave me the definitive answer to my question. I used the same concept for UTF-16 encoding in my encryption/decryption project, and it worked. But I still have a problem: during decryption the file is saved with [] blocks every time, although it is read correctly. I checked that it is reading the UTF-16 code. I would like to chat with you about the problem whenever you like.

    I was using FileWriter and was having problems with the copyright symbol, which left invalid characters in my XML. Your line of code gave me exactly what I was looking for…

    first result from Google:
    “java xml output utf-8”

    just to add:
    looking at:
    made me add like this:
    Writer out = new BufferedWriter(new OutputStreamWriter(new FileOutputStream(this.outputFilename), "UTF-8"));

    (I guess that’s what they call the “decorator pattern” in for example:

    I found similar problem in March 2008 with reading UTF-8 encoded files in. I wrote it up here:

