What is the use of character streams in Java?

What is the use of character streams in Java? Why do we need them when there are byte streams? I have gone through many sites and didn't get it clearly. Please make it clear to me, and please don't paste Google results.


I assume you mean the Reader class? If you're working with text, you don't want to have to worry about binary/text conversions at every layer. Indeed, you may not even have binary data to convert, conceptually: look at StringReader, for example.
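As a minimal sketch of the StringReader point: the helper below (the class and method names are made up for illustration) consumes any Reader, and the StringReader passed to it has no bytes or encoding behind it at all.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;

class StringReaderDemo {
    // Hypothetical helper: drains any Reader into a String.
    // It only sees characters and never learns where they came from.
    static String readAll(Reader reader) {
        StringBuilder sb = new StringBuilder();
        char[] buffer = new char[1024];
        try {
            int n;
            while ((n = reader.read(buffer)) != -1) {
                sb.append(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // No byte stream and no charset involved: the source is already text.
        Reader reader = new StringReader("h\u00e9llo");
        System.out.println(readAll(reader).length()); // 5
    }
}
```

The same readAll would work unchanged with a FileReader or an InputStreamReader, which is the encapsulation being described.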

Having the Reader abstraction makes it clear that you really want text, and that the binary data underlying that text is irrelevant to you, so long as it can be provided as a sequence of Unicode characters somehow.

It's worth being very clear in your mind that binary and text are different. If you try to treat arbitrary binary data as text (e.g. trying to read an image file into a string) you will lose information. Why would you not want different types to deal with the different forms of data?
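To illustrate the information loss, here is a small sketch (the class name and the choice of UTF-8 are assumptions for the example) that decodes a few image-like bytes into a String and encodes them back. The byte 0x89 is not valid UTF-8 on its own, so the decoder replaces it and the original value cannot be recovered.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

class BinaryAsTextDemo {
    // Decodes arbitrary bytes as UTF-8 text, then encodes that text back to bytes.
    static byte[] roundTrip(byte[] data) {
        String asText = new String(data, StandardCharsets.UTF_8);
        return asText.getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // First four bytes of a PNG file; 0x89 is malformed as UTF-8 input.
        byte[] original = {(byte) 0x89, 'P', 'N', 'G'};
        byte[] result = roundTrip(original);
        System.out.println(Arrays.equals(original, result)); // false
        System.out.println(result.length);                   // 6, not 4
    }
}
```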

It's a bit like saying, "We could store the data for every object in a byte array, and convert pieces of that as and when we need to." Well yes, we could (if you could convert opaque binary data to/from a reference) but it would be hideous in terms of encapsulation. The Reader abstraction is a layer of encapsulation, allowing various data sources to expose character data (whether they need to decode binary data or not), and other code to consume that character data, without having to be aware of exactly where it comes from.

I'm sorry, but Google results actually answer your question. From Sun's website (first Google hit for "character stream java"):

Why use character streams? The primary advantage of character streams is that they make it easy to write programs that are not dependent upon a specific character encoding, and are therefore easy to internationalize.

A second advantage of character streams is that they are potentially much more efficient than byte streams. The implementations of many of Java's original byte streams are oriented around byte-at-a-time read and write operations. The character-stream classes, in contrast, are oriented around buffer-at-a-time read and write operations. This difference, in combination with a more efficient locking scheme, allows the character stream classes to make up for the added overhead of encoding conversion in many cases.

Character streams are a handy way to deal with character data such as text files. You can take a byte stream, supply a character encoding, and effectively convert it into a character stream; similarly, you can take a character stream, supply a character encoding, and effectively convert it into a byte stream.

The conversion is usually done by decorating streams. As you may guess, a character encoding is an algorithm for converting chars to bytes and vice versa.
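A rough sketch of that decoration, using in-memory streams so it is self-contained (the class and method names are illustrative only): OutputStreamWriter wraps a byte stream to turn chars into bytes, and InputStreamReader wraps a byte stream to turn bytes back into chars.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class DecoratorDemo {
    // Character stream -> byte stream: the Writer decorates an OutputStream.
    static byte[] encode(String text, Charset charset) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(bytes, charset)) {
            writer.write(text);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    // Byte stream -> character stream: the Reader decorates an InputStream.
    static String decode(byte[] data, Charset charset) {
        StringBuilder sb = new StringBuilder();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(data), charset)) {
            int c;
            while ((c = reader.read()) != -1) {
                sb.append((char) c);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "caf\u00e9";
        // Same characters, different byte counts depending on the encoding.
        System.out.println(encode(text, StandardCharsets.ISO_8859_1).length); // 4
        System.out.println(encode(text, StandardCharsets.UTF_8).length);      // 5
        System.out.println(decode(encode(text, StandardCharsets.UTF_8), StandardCharsets.UTF_8).equals(text)); // true
    }
}
```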

At the risk of sounding sarcastic, its purpose is to read a stream of characters.

The point is that characters are not the same as bytes.

A byte is just a collection of 8 bits; the only variation possible is byte order (big-endian or little-endian), and that only matters once several bytes are combined into a larger value.

A character is a more complex little beast. It belongs to a character set and is affected by national language support (NLS) settings.

The simplest is ASCII with NLS set to "C", which is pretty much identical to bytes, except that the values have specific meanings: e.g. x'30' is the ASCII character '0', which will return true if an 'isNumber()' method is applied. Next up the scale are the various ISO eight-bit code pages, which shuffle the character assignments above x'7F' around, normally to handle European accented characters.

In addition, there are other encodings such as EBCDIC which are still widely used; here a '0' is encoded as x'F0'.

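Then there is UTF-16, which encodes the roughly 65,000 characters of the most widely used alphabets in a single 16-bit unit (and everything else as a pair of such units), and UTF-32, which uses a fixed 32 bits per character and covers the whole Unicode range of just over a million code points, including not so widely used scripts (e.g. Mycenaean Linear B).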

There is the intermediate UTF-8 encoding, which leaves plain old ASCII characters as they are, coded in 7 bits within a single byte, but also has a fiendishly clever scheme for storing the rest of Unicode in two, three or four bytes.
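The variable-length behaviour of UTF-8 can be checked directly. This is a small sketch with sample characters chosen for illustration (the class and method names are not from any library):

```java
import java.nio.charset.StandardCharsets;

class Utf8LengthDemo {
    // Number of bytes UTF-8 needs for the given text.
    static int utf8Length(String text) {
        return text.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        System.out.println(utf8Length("A"));            // 1: plain ASCII
        System.out.println(utf8Length("\u00e9"));       // 2: e with acute accent
        System.out.println(utf8Length("\u20ac"));       // 3: euro sign
        System.out.println(utf8Length("\uD83D\uDE00")); // 4: an emoji outside the 16-bit range
    }
}
```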

In addition, there are "legacy" Far Eastern coding schemes for Japanese and Chinese characters, which have complex schemes for indicating whether a character is held in one, two or three bytes.

The point is that the character stream classes know about all these code pages and can do intelligent things with character input, such as converting cp874 to UTF-16, which the byte stream classes cannot.
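Here is a sketch of that kind of conversion. It assumes ISO-8859-1 as the source encoding rather than cp874, because ISO-8859-1 is guaranteed to be present on every Java platform while cp874 may not be; the class and method names are made up for the example.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.OutputStreamWriter;
import java.io.Reader;
import java.io.UncheckedIOException;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class TranscodeDemo {
    // Reads bytes in one encoding and writes the same characters in another.
    static byte[] transcode(byte[] input, Charset from, Charset to) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Reader reader = new InputStreamReader(new ByteArrayInputStream(input), from);
             Writer writer = new OutputStreamWriter(out, to)) {
            char[] buffer = new char[1024];
            int n;
            while ((n = reader.read(buffer)) != -1) {
                writer.write(buffer, 0, n);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) {
        byte[] latin1 = {(byte) 0xE9}; // e with acute accent in ISO-8859-1
        byte[] utf16 = transcode(latin1, StandardCharsets.ISO_8859_1, StandardCharsets.UTF_16BE);
        System.out.printf("%02X %02X%n", utf16[0], utf16[1]); // 00 E9
    }
}
```

A plain InputStream copied to an OutputStream would pass 0xE9 through untouched, which is exactly what the byte stream classes cannot improve on.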

I think it's just a matter of following a hierarchy. The same reason there are strings, even though you could simply use a character array instead of a string.

Yes, it is possible to handle text data using InputStream/OutputStream, but the Reader/Writer classes provide useful methods for working with text data.
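One example of such a method is BufferedReader.readLine(), which has no counterpart on InputStream. A minimal sketch (class and method names are illustrative):

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.ArrayList;
import java.util.List;

class LineReadingDemo {
    // Splits character input into lines; readLine() handles \n, \r and \r\n.
    static List<String> readLines(Reader source) {
        List<String> lines = new ArrayList<>();
        try (BufferedReader reader = new BufferedReader(source)) {
            String line;
            while ((line = reader.readLine()) != null) {
                lines.add(line);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return lines;
    }

    public static void main(String[] args) {
        System.out.println(readLines(new StringReader("first\nsecond\r\nthird")));
        // [first, second, third]
    }
}
```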

Essentially the issue is that "a byte" does not map well to "a character". Unicode specifies more characters than fit even in 16 bits, so 8 bits are certainly not enough.

There are more reasons but that is the primary one.
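The mismatch shows up even inside Java's own String, whose char is 16 bits wide. In this sketch (the sample character and class name are chosen for illustration), one character outside the 16-bit range occupies two chars and four UTF-8 bytes:

```java
import java.nio.charset.StandardCharsets;

class CodePointDemo {
    // Number of Unicode characters (code points), as opposed to 16-bit chars.
    static int codePoints(String text) {
        return text.codePointCount(0, text.length());
    }

    public static void main(String[] args) {
        String emoji = "\uD83D\uDE00"; // U+1F600, written as a surrogate pair
        System.out.println(codePoints(emoji));                              // 1 character
        System.out.println(emoji.length());                                 // 2 chars
        System.out.println(emoji.getBytes(StandardCharsets.UTF_8).length);  // 4 bytes
    }
}
```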
