Zero-width HTML character
The zero-width space character (U+200B, written &#8203; in HTML) is displayed in the database as "". Even if we search for it with IndexOf or remove it with Replace, we won't be able to see the character itself.
It becomes visible if you paste the text into Notepad++.
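If an editor like that is not at hand, the character can also be made visible from code. The following is only a minimal sketch: the class name InvisibleCharInspector and the method DumpCodeUnits are made up for this illustration and are not part of any library. It prints every UTF-16 code unit of a string in U+XXXX form so that an invisible U+200B stands out.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class InvisibleCharInspector
{
    // Returns each UTF-16 code unit of the input as "U+XXXX", separated by
    // spaces, so that invisible characters such as U+200B show up in the output.
    public static string DumpCodeUnits(string text)
    {
        var sb = new StringBuilder();
        foreach (char c in text)
        {
            if (sb.Length > 0)
            {
                sb.Append(' ');
            }
            sb.Append("U+").Append(((int)c).ToString("X4"));
        }
        return sb.ToString();
    }
}
```

For example, InvisibleCharInspector.DumpCodeUnits("a\u200Bb") returns "U+0061 U+200B U+0062", although the string looks like "ab" on screen.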
To handle this:
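// Requires: using System; using System.Text;
// Code points to strip. Note that the list also covers ordinary whitespace
// (tab, line feed, space, ...), so those are removed together with U+200B.
private static string[] wsChars = new string[] {
    char.ConvertFromUtf32(9),     // U+0009 tab
    char.ConvertFromUtf32(10),    // U+000A line feed
    char.ConvertFromUtf32(11),    // U+000B vertical tab
    char.ConvertFromUtf32(12),    // U+000C form feed
    char.ConvertFromUtf32(13),    // U+000D carriage return
    char.ConvertFromUtf32(32),    // U+0020 space
    char.ConvertFromUtf32(133),   // U+0085 next line
    char.ConvertFromUtf32(160),   // U+00A0 no-break space
    char.ConvertFromUtf32(5760),  // U+1680 Ogham space mark
    char.ConvertFromUtf32(8192),  // U+2000 en quad
    char.ConvertFromUtf32(8193),  // U+2001 em quad
    char.ConvertFromUtf32(8194),  // U+2002 en space
    char.ConvertFromUtf32(8195),  // U+2003 em space
    char.ConvertFromUtf32(8196),  // U+2004 three-per-em space
    char.ConvertFromUtf32(8197),  // U+2005 four-per-em space
    char.ConvertFromUtf32(8198),  // U+2006 six-per-em space
    char.ConvertFromUtf32(8199),  // U+2007 figure space
    char.ConvertFromUtf32(8200),  // U+2008 punctuation space
    char.ConvertFromUtf32(8201),  // U+2009 thin space
    char.ConvertFromUtf32(8202),  // U+200A hair space
    char.ConvertFromUtf32(8203),  // U+200B zero-width space
    char.ConvertFromUtf32(8232),  // U+2028 line separator
    char.ConvertFromUtf32(8233),  // U+2029 paragraph separator
    char.ConvertFromUtf32(12288), // U+3000 ideographic space
    char.ConvertFromUtf32(65279)  // U+FEFF zero-width no-break space (BOM)
};

/// <summary>
/// Removes the zero-width space and the other whitespace characters listed in wsChars.
/// </summary>
internal static string whiteSpaceCharHandling(string msg)
{
    string returnStr = msg;
    if (!String.IsNullOrEmpty(msg))
    {
        StringBuilder sb = new StringBuilder(msg);
        foreach (string wc in wsChars)
        {
            sb.Replace(wc, "");
        }
        // Build the result once, after all replacements are done.
        returnStr = sb.ToString();
    }
    return returnStr;
}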
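Because the wsChars list also contains ordinary whitespace such as the space (32) and line feed (10), the method above removes the spaces between words as well. If only the invisible characters should go, a narrower variant may be enough. The sketch below assumes that just U+200B and U+FEFF need to be removed; the class name ZeroWidthStripper and the method StripZeroWidth are hypothetical names chosen for this example.

```csharp
using System;
using System.Text;

// Hypothetical helper, for illustration only.
public static class ZeroWidthStripper
{
    // Removes only U+200B (zero-width space) and U+FEFF (zero-width no-break
    // space / BOM) and leaves all other characters, including spaces, untouched.
    public static string StripZeroWidth(string msg)
    {
        if (string.IsNullOrEmpty(msg))
        {
            return msg;
        }

        var sb = new StringBuilder(msg.Length);
        foreach (char c in msg)
        {
            if (c != '\u200B' && c != '\uFEFF')
            {
                sb.Append(c);
            }
        }
        return sb.ToString();
    }
}
```

With this variant, ZeroWidthStripper.StripZeroWidth("Hello\u200B world") returns "Hello world": the input has 12 characters, the result has 11, and the visible space is kept.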
This issue is often faced when we use an HTML editor. The HTML retrieved from the editor contains this character, which is not visible even in the database.
The Char.ConvertFromUtf32 method converts the specified Unicode code point to a UTF-16 encoded string.
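A short sketch of what that means in practice: the returned string has one UTF-16 code unit for code points up to U+FFFF and two (a surrogate pair) for code points above that. The class ConvertFromUtf32Demo and its method Utf16Length are hypothetical names used only for this illustration.

```csharp
using System;

// Hypothetical helper, for illustration only.
public static class ConvertFromUtf32Demo
{
    // Returns how many UTF-16 code units char.ConvertFromUtf32 produces
    // for the given Unicode code point.
    public static int Utf16Length(int codePoint)
    {
        return char.ConvertFromUtf32(codePoint).Length;
    }
}
```

Here char.ConvertFromUtf32(8203) is the one-character string "\u200B", so Utf16Length(8203) is 1, while Utf16Length(0x1F600) is 2. All code points in the list above are below U+10000 and therefore yield one-character strings.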