Difference between &nbsp; and &#32;
Can anyone explain the difference between &nbsp; and &#32;?
I have HTML data stored in a database in binary form, and a space in it can be any of &#32;, &nbsp;, or sometimes &#160;.
Another issue: when I convert this HTML to plain text using the JSoup library, the conversion works properly, but when I then use Java's String.contains(myString) method, the HTML data containing &#32; appears to differ from the data containing &nbsp;. A string taken from one is not found in the other, and vice versa.
HTML1 : This is my test string
HTML2 : This is my test string
If I convert it to plain text using JSoup, it returns:
HTML 1 : This is my test string
HTML 2 : This is my test string
But the two strings are still not equal. Why is that?
&#32; is the classic space, the one you get when you hit your spacebar, represented by its HTML entity equivalent.
&nbsp; and &#160; represent the non-breaking space, often used to prevent the browser from collapsing multiple spaces together:
"&#32;&#32;&#32;&#32;" => " " (collapsed into a single space)
"&nbsp;&nbsp;&nbsp;&nbsp;" => "    " (not collapsed)
If you are parsing a string containing both classic and non-breaking spaces, you can safely replace one with the other before comparing.
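As a minimal sketch of that replacement, assuming the text has already been extracted from the HTML (for example with JSoup) so that the entities are decoded into actual characters. The class and method names are only illustrative:

```java
public class NbspReplace {
    // U+00A0 is the character that &nbsp; and &#160; decode to.
    private static final char NBSP = '\u00A0';

    // Replace every non-breaking space with a regular space (U+0020).
    public static String toPlainSpaces(String text) {
        return text.replace(NBSP, ' ');
    }

    public static void main(String[] args) {
        String fromHtml = "This is\u00A0my test string"; // contains a non-breaking space
        String search = "is my test";                     // typed with the spacebar

        System.out.println(fromHtml.contains(search));                // false
        System.out.println(toPlainSpaces(fromHtml).contains(search)); // true
    }
}
```

Normalizing both the stored text and the search string the same way keeps the comparison symmetric.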
&#32; is the character produced by the space key.
&nbsp; and &#160; both represent the non-breaking space.
If your data has come from different sources it may be possible that the space symbols have been encoded differently.
In direct comparison they will likely be shown as being different.
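To see this for yourself, it can help to print the code points of each string. The following is a small sketch (the class and helper names are made up for illustration) showing that two strings that look identical on screen differ in one character:

```java
public class SpaceInspect {
    // Render each character of the string as a U+XXXX code point.
    public static String codePoints(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> sb.append(String.format("U+%04X ", cp)));
        return sb.toString().trim();
    }

    public static void main(String[] args) {
        String regular = "a b";          // regular space, U+0020
        String nonBreaking = "a\u00A0b"; // non-breaking space, U+00A0

        System.out.println(regular.equals(nonBreaking)); // false
        System.out.println(codePoints(regular));         // U+0061 U+0020 U+0062
        System.out.println(codePoints(nonBreaking));     // U+0061 U+00A0 U+0062
    }
}
```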
&#32; is just a space character, nothing more. Consecutive occurrences of this character collapse into a single space when rendered.
Whereas &#160; and &nbsp; both represent the non-breaking space character, and if they occur one after another, they will not be collapsed into a single space, nor will a line break between them.
The only difference between them is that &#160; is the HTML number and &nbsp; is the HTML name.
Basically, all of these are HTML entities.
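From Java 8 onwards, the following should work:
string = string.replaceAll("\\h", " ");
where \h matches a horizontal whitespace character (which includes the non-breaking space), as described in the java.util.regex.Pattern documentation. Note that String.replace treats its argument as a literal, so the regex-based replaceAll is needed here.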
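Since \h is a regular-expression class, it only has an effect in regex-based methods such as String.replaceAll. A hedged sketch of how this could be used to compare two strings that differ only in the kind of space (the class and method names are hypothetical):

```java
public class HorizontalWhitespace {
    // Replace every horizontal whitespace character (regular space, tab,
    // non-breaking space, and the others matched by \h) with a regular space.
    public static String normalize(String text) {
        return text.replaceAll("\\h", " ");
    }

    public static void main(String[] args) {
        String a = "This is my test string";      // regular spaces only
        String b = "This is\u00A0my test string"; // one non-breaking space

        System.out.println(a.equals(b));                       // false
        System.out.println(normalize(a).equals(normalize(b))); // true
    }
}
```

Note that this also turns tabs into spaces, which may or may not be what you want.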