Difference between &#32; and &nbsp;

Can anyone explain the difference between &#32; and &nbsp;?

I have HTML data stored in a database in binary form, and the spaces in it can be encoded as &#32;, &nbsp;, or sometimes &#160;.

The problem is that when I convert this HTML to plain text using the JSoup library, the conversion looks correct, but Java's String.contains(myString) does not behave as expected. The text that came from HTML containing &#32; seems to be different from the text that came from HTML containing &nbsp;: a string taken from one is not found in the other, and vice versa.


HTML1 : This&#32;is&#32;my&#32;test&#32;string

HTML2 : This&nbsp;is&nbsp;my&nbsp;test&nbsp;string

If I convert them to plain text using JSoup, it returns:

HTML 1 : This is my test string

HTML 2 : This is my test string

But the two strings are still not equal. Why is that?


&#32; is the classic space, the one you get when you hit your spacebar, represented by its HTML entity equivalent.

&#160; and &nbsp; both represent the non-breaking space, often used to prevent the browser from collapsing multiple consecutive spaces:

"    " => " " (collapsed into only one space)

"&nbsp;&nbsp;&nbsp;&nbsp;" => "    " (not collapsed)

If you are parsing a string that contains both classic and non-breaking spaces and only need to compare or search the text, you can safely replace one with the other first.
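A minimal Java sketch of that replacement, assuming the text has already been decoded from HTML (for example by JSoup), so each non-breaking space is the single character U+00A0. The class and method names are made up for illustration.

```java
class SpaceNormalizer {
    // U+00A0 is the character that both &#160; and &nbsp; decode to.
    private static final char NBSP = '\u00A0';

    // Hypothetical helper: replaces every non-breaking space with a regular space.
    static String normalize(String text) {
        return text.replace(NBSP, ' ');
    }

    public static void main(String[] args) {
        String fromHtml2 = "This\u00A0is\u00A0my\u00A0test\u00A0string";
        String needle = "my test string"; // typed with regular spaces

        System.out.println(fromHtml2.contains(needle));            // false
        System.out.println(normalize(fromHtml2).contains(needle)); // true
    }
}
```

Normalizing both sides before calling contains or equals keeps the comparison independent of how the source HTML encoded its spaces.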

&#32; is the character reference for the regular space, the one produced by the space key.

&#160; and &nbsp; are both character references for the non-breaking space.

If your data comes from different sources, the spaces may have been encoded differently.

In a direct comparison they will be reported as different, because the regular space (U+0020) and the non-breaking space (U+00A0) are distinct characters.
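To see this in Java, here is a small sketch that prints the code points of two strings that look identical on screen. The helper name codePoints is hypothetical, and the strings stand in for text already decoded from HTML.

```java
class SpaceComparison {
    // Hypothetical helper: lists the code points of a string as U+XXXX values.
    static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String plain = "a b";      // regular space, what &#32; decodes to
        String nbsp = "a\u00A0b";  // non-breaking space, what &#160; and &nbsp; decode to

        System.out.println(plain.equals(nbsp)); // false
        System.out.println(codePoints(plain));  // U+0061 U+0020 U+0062
        System.out.println(codePoints(nbsp));   // U+0061 U+00A0 U+0062
    }
}
```

Dumping code points like this is a quick way to check which kind of space a string from the database actually contains.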

&#32; is just a space character, nothing more. Consecutive occurrences of this character collapse into a single space when the page is rendered.

&#160; and &nbsp;, on the other hand, both represent the non-breaking space character. When several of them occur one after another, they are not collapsed into one space, and the line will not break at them.

The only difference between the two is that &#160; is the numeric character reference (the HTML number) and &nbsp; is the named one (the HTML name).

Basically, all of these are HTML entities. You can learn more about them from the following links.

  1. Link 1
  2. Link 2

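From Java 8 onwards, the following should work:

string = string.replaceAll("\\h", " ");

where \h matches a horizontal whitespace character, including the non-breaking space, as described here. Note that replaceAll is needed, because replace treats its first argument as literal text rather than as a regular expression, and that the result must be assigned since strings are immutable.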
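A self-contained sketch of the \h approach, assuming Java 8 or later and text that has already been decoded from HTML. It goes through String.replaceAll so that the pattern is interpreted as a regular expression; the class and method names are hypothetical.

```java
class HorizontalWhitespace {
    // Hypothetical helper: replaces each horizontal whitespace character
    // (space, tab, non-breaking space, ...) with one regular space.
    static String normalize(String text) {
        return text.replaceAll("\\h", " ");
    }

    public static void main(String[] args) {
        String text = "This\u00A0is\tmy test";

        System.out.println(normalize(text).equals("This is my test")); // true

        // By default \s does not match U+00A0, so the non-breaking space survives.
        System.out.println(text.replaceAll("\\s", " ").equals("This is my test")); // false
    }
}
```

This replaces each character one for one, so runs of spaces stay the same length; use "\\h+" instead if you also want to squeeze runs into a single space.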
