Monday, November 08, 2010 at 12:05 AM.
system.verbs.builtins.searchEngine.cleanText
on cleanText (s) {
	<<Prepare text to be indexed.
	<<Replace whitespace characters and punctuation with spaces.
	<<(Don't replace #, <, >, {, and } characters.)
	<<Then strip HTML tags and macros.
	s = string.replaceAll (s, "\r", " ");
	s = string.replaceAll (s, "\n", " ");
	s = string.replaceAll (s, "\t", " ");
	s = string.replaceAll (s, ".", " ");
	s = string.replaceAll (s, "-", " ");
	s = string.replaceAll (s, "\\", " ");
	s = string.replaceAll (s, "/", " ");
	s = string.replaceAll (s, ":", " ");
	s = string.replaceAll (s, "?", " ");
	s = string.replaceAll (s, "!", " ");
	s = string.replaceAll (s, ";", " ");
	s = string.replaceAll (s, "@", " ");
	s = string.replaceAll (s, "$", " ");
	s = string.replaceAll (s, "%", " ");
	s = string.replaceAll (s, "^", " ");
	s = string.replaceAll (s, "&", " ");
	s = string.replaceAll (s, "*", " ");
	s = string.replaceAll (s, ")", " ");
	s = string.replaceAll (s, "(", " ");
	s = string.replaceAll (s, "[", " ");
	s = string.replaceAll (s, "]", " ");
	return (searchEngine.stripMarkup (s)) //strip HTML tags and macros
}
This listing is for code that runs in the OPML Editor environment. I created these listings because I wanted the search engines to index them, so that when I want to look up something in my codebase I don't have to use the much slower search functionality in my object database. Dave Winer.
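
For readers without the OPML Editor at hand, the character-replacement step of cleanText can be sketched in Python. This is an illustrative port, not part of the original codebase: the names REPLACED and clean_text are made up for the example, and the final searchEngine.stripMarkup call is left out because its behavior is not shown in this listing.

```python
# Characters that the cleanText listing replaces with a space, copied from
# its 21 string.replaceAll calls. Characters not in that list (for example
# commas, quotes, #, <, >, { and }) pass through unchanged, as in the listing.
REPLACED = "\r\n\t.-\\/:?!;@$%^&*)(]["

# Translation table mapping each listed character to a single space.
_TABLE = str.maketrans({ch: " " for ch in REPLACED})


def clean_text(s: str) -> str:
    """Hypothetical port of the replacement step only.

    The original goes on to call searchEngine.stripMarkup, which is not
    reproduced here.
    """
    return s.translate(_TABLE)


if __name__ == "__main__":
    print(clean_text("search-engine/index.html"))  # search engine index html
```

Each listed character becomes exactly one space, so runs of punctuation produce runs of spaces, just as the chain of replaceAll calls would.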