Monday, November 08, 2010 at 12:05 AM.

system.verbs.builtins.searchEngine.cleanText

on cleanText (s) {
	<<Prepare text to be indexed.
		<<Replace whitespace characters and punctuation with spaces.
		<<(Don't replace #, <, >, {, and } characters.)
		<<Then strip HTML tags and macros.
	s = string.replaceAll (s, "\r", " ");
	s = string.replaceAll (s, "\n", " ");
	s = string.replaceAll (s, "\t", " ");
	s = string.replaceAll (s, ".", " ");
	s = string.replaceAll (s, "-", " ");
	s = string.replaceAll (s, "\\", " ");
	s = string.replaceAll (s, "/", " ");
	s = string.replaceAll (s, ":", " ");
	s = string.replaceAll (s, "?", " ");
	s = string.replaceAll (s, "!", " ");
	s = string.replaceAll (s, ";", " ");
	s = string.replaceAll (s, "@", " ");
	s = string.replaceAll (s, "$", " ");
	s = string.replaceAll (s, "%", " ");
	s = string.replaceAll (s, "^", " ");
	s = string.replaceAll (s, "&", " ");
	s = string.replaceAll (s, "*", " ");
	s = string.replaceAll (s, ")", " ");
	s = string.replaceAll (s, "(", " ");
	s = string.replaceAll (s, "[", " ");
	s = string.replaceAll (s, "]", " ");
	
	return (searchEngine.stripMarkup (s))} //strip HTML tags and macros
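
The first step of the script above (one replaceAll call per character) can also be written as a single pass over the string. The following is a minimal sketch in Python, not the OPML Editor's own code: the names SEPARATORS and replace_separators are hypothetical, and the markup-stripping step done by searchEngine.stripMarkup is deliberately not reproduced.

```python
# Characters the listing replaces with a space: whitespace first, then punctuation.
# '#', '<', '>', '{' and '}' are left out on purpose, matching the listing's comment,
# so that tags and macros are still intact for a later markup-stripping step.
SEPARATORS = "\r\n\t.-\\/:?!;@$%^&*)([]"

# Build the translation table once; str.translate then makes a single pass.
_TABLE = str.maketrans({ch: " " for ch in SEPARATORS})


def replace_separators(s: str) -> str:
    """Return s with every separator character replaced by one space."""
    return s.translate(_TABLE)
```

Each character maps to exactly one space, so the result has the same length as the input, as it does with the chain of replaceAll calls. The 21 characters in SEPARATORS correspond one-to-one to the 21 replaceAll lines in the listing.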



This listing is for code that runs in the OPML Editor environment. I created these listings because I wanted search engines to index them, so that when I want to look something up in my codebase I don't have to use the much slower search functionality in my object database. Dave Winer.