Monday, June 21, 2010

Removing HTML tags from a String using Regex

result = data
    .replaceAll("\\\\n", "<br/>")   // replace a literal "\n" (backslash + n) with <br/>
    .replaceAll("\\\\", "")         // remove stray backslashes
    .replaceAll("\\\"", "\\\\\"")   // escape double quotes as \"
    // remove opening HTML tags; the (?!br/>) lookahead keeps the <br/> inserted above
    .replaceAll("<(?!br/>)[\\p{Alnum}\\p{Space}\\.\\-=/:\\\"\\\\;]*>", " ")
    .replaceAll("</[\\p{Alnum}]*>", " "); // remove any remaining closing HTML tags
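
To try the chain on a concrete string, it can be wrapped in a small class. This is only a sketch: the class name TagStripper, the method name stripTags, and the sample input are made up for illustration. The tag pattern uses a (?!br/>) lookahead so that the <br/> markers produced by the first step are not stripped again by the tag-removal step.

```java
class TagStripper {

    // Hypothetical helper that wraps the replaceAll chain in a reusable method.
    static String stripTags(String data) {
        return data
            .replaceAll("\\\\n", "<br/>")   // literal backslash + 'n' becomes <br/>
            .replaceAll("\\\\", "")         // drop any remaining backslashes
            .replaceAll("\\\"", "\\\\\"")   // escape double quotes as \"
            // strip tags made of the listed characters, except <br/>
            .replaceAll("<(?!br/>)[\\p{Alnum}\\p{Space}\\.\\-=/:\\\"\\\\;]*>", " ")
            .replaceAll("</[\\p{Alnum}]*>", " "); // strip leftover closing tags
    }

    public static void main(String[] args) {
        // Made-up sample: the text holds a literal backslash + 'n', not a real newline.
        String data = "<p class=\"x\">Hello</p>\\nWorld";
        System.out.println(stripTags(data));
    }
}
```

Each removed tag is replaced by a single space, so the result can start with a space or contain runs of spaces. Tags containing characters outside the listed set (for example ' or ?) are left untouched, so this is not a general-purpose HTML sanitizer.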
