DLESE Tools v1.6.0
java.lang.Object
  org.dlese.dpc.index.writer.IndexingTools
public class IndexingTools
Tools to aid in indexing.
| Field Summary | |
|---|---|
static String |
adminDefaultFieldName
Admin default field 'admindefault' |
static String |
defaultFieldName
Default field 'default' |
static String |
PHRASE_SEPARATOR
String used to separate and preserve phrases indexed as text, includes leading and trailing white space. |
static String |
stemsFieldName
Stems field 'stems' |
| Constructor Summary | |
|---|---|
IndexingTools()
|
|
| Method Summary | |
|---|---|
static void |
addToAdminDefaultField(org.apache.lucene.document.Document myDoc,
String content)
Indexes the given text into the admin default field. |
static void |
addToDefaultAndStemsFields(org.apache.lucene.document.Document myDoc,
String content)
Indexes the given text into the default and stems fields. |
static String |
encodeToTerm(String text)
Same as org.dlese.dpc.index.SimpleLuceneIndex#encodeToTerm(String). |
static String |
encodeToTerm(String text,
boolean encodeWildCards)
Same as org.dlese.dpc.index.SimpleLuceneIndex#encodeToTerm(String,boolean). |
static String[] |
extractSeparatePhrasesFromString(String separatedPhrases)
Extracts the phrases from a String that was created using the method makeSeparatePhrasesFromNodes(List nodes) or makeSeparatePhrasesFromStrings(List strings). |
static String[] |
extractStringsFromString(String separatedWords)
Extracts the words from a String that was created using the method makeStringFromNodes(List nodes). |
static String[] |
getAnalyzedTerms(String textToParse,
String field,
org.apache.lucene.analysis.Analyzer analyzer)
Extracts all terms in any field from a Lucene query using the given Analyzer. |
static org.apache.lucene.analysis.Token[] |
getAnalyzedTokens(String textToParse,
String field,
org.apache.lucene.analysis.Analyzer analyzer)
Extracts all Tokens from a Lucene query using the given Analyzer. |
static StringBuffer |
getAnalyzerOutput(String textToParse,
String field,
org.apache.lucene.analysis.Analyzer analyzer)
Creates a StringBuffer to display the tokens created by a given analyzer. |
static String |
makeSeparatePhrasesFromNodes(List nodes)
Creates a String, separated by the phrase separator term, from the text of each of the dom4j Element or Attribute Nodes provided. |
static String |
makeSeparatePhrasesFromStrings(List strings)
Creates a String separated by the phrase separator term from each of the Strings provided. |
static String |
makeSeparatePhrasesFromStrings(String[] strings)
Creates a String separated by the phrase separator term from each of the Strings provided. |
static String |
makeStringFromNodes(List nodes)
Creates a String, separated by spaces, from the text of each of the dom4j Element or Attribute Nodes provided. |
static String |
tokenizeID(String ID)
Tokenizes a DLESE ID by replacing the char - with a blank space. |
static String |
tokenizeURI(String uri)
Tokenizes a URI by replacing the unindexable chars with a blank space. |
| Methods inherited from class java.lang.Object |
|---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
| Field Detail |
|---|
public static final String defaultFieldName
public static final String stemsFieldName
public static final String adminDefaultFieldName
public static final String PHRASE_SEPARATOR
| Constructor Detail |
|---|
public IndexingTools()
| Method Detail |
|---|
public static final void addToDefaultAndStemsFields(org.apache.lucene.document.Document myDoc,
String content)
myDoc - Document to add to
content - Content to add
public static final void addToAdminDefaultField(org.apache.lucene.document.Document myDoc,
String content)
myDoc - Document to add to
content - Content to add
public static final String makeSeparatePhrasesFromNodes(List nodes)
A call to this method might look like:
String value = makeSeparatePhrasesFromNodes(xmlDoc.selectNodes("/news-oppsRecord/topics/topic"));
nodes - List of Elements or Attributes
public static final String makeSeparatePhrasesFromStrings(List strings)
strings - List of Strings or null
public static final String makeSeparatePhrasesFromStrings(String[] strings)
strings - Array of Strings or null
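The joining step described above can be sketched with the standard library alone. This is a stand-in under stated assumptions, not the IndexingTools implementation: the separator value `" __phrase_sep__ "` is hypothetical (this page only documents that PHRASE_SEPARATOR has leading and trailing white space), and returning an empty String for null input is also an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class PhraseJoinSketch {
    // Hypothetical separator; the real value of IndexingTools.PHRASE_SEPARATOR
    // is not shown on this page.
    static final String SEPARATOR = " __phrase_sep__ ";

    // Joins the phrases with the separator between each pair.
    // Assumption: null or empty input yields an empty String.
    static String joinPhrases(List<String> phrases) {
        if (phrases == null || phrases.isEmpty()) {
            return "";
        }
        return String.join(SEPARATOR, phrases);
    }

    // Array variant, mirroring the String[] overload documented above.
    static String joinPhrases(String[] phrases) {
        if (phrases == null) {
            return "";
        }
        return joinPhrases(Arrays.asList(phrases));
    }

    public static void main(String[] args) {
        System.out.println(joinPhrases(Arrays.asList("ocean currents", "plate tectonics")));
    }
}
```

Keeping a distinctive token between phrases is what lets the phrase boundaries be recovered later from the joined text.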
public static final String[] extractSeparatePhrasesFromString(String separatedPhrases)
Extracts the phrases from a String that was created using the method makeSeparatePhrasesFromNodes(List nodes) or makeSeparatePhrasesFromStrings(List strings).
separatedPhrases - String that contains the phrase separator to separate phrases
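Recovering the phrases amounts to splitting on the separator. A minimal sketch, assuming a hypothetical separator value `" __phrase_sep__ "` (the actual PHRASE_SEPARATOR value is not given on this page) and assuming that null or blank input yields an empty array:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

final class PhraseExtractSketch {
    // Hypothetical separator; the real PHRASE_SEPARATOR value is not shown here.
    static final String SEPARATOR = " __phrase_sep__ ";

    // Splits the text on the separator and returns the non-empty phrases.
    // Assumption: null or blank input yields an empty array.
    static String[] extractPhrases(String separatedPhrases) {
        if (separatedPhrases == null || separatedPhrases.trim().isEmpty()) {
            return new String[0];
        }
        // Quote the separator so it is matched literally, not as a regex.
        String[] parts = separatedPhrases.split(Pattern.quote(SEPARATOR));
        List<String> phrases = new ArrayList<>();
        for (String part : parts) {
            String phrase = part.trim();
            if (!phrase.isEmpty()) {
                phrases.add(phrase);
            }
        }
        return phrases.toArray(new String[0]);
    }

    public static void main(String[] args) {
        String joined = "ocean currents" + SEPARATOR + "plate tectonics";
        for (String phrase : extractPhrases(joined)) {
            System.out.println(phrase);
        }
    }
}
```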
public static final String makeStringFromNodes(List nodes)
A call to this method might look like:
String value = makeStringFromNodes(xmlDoc.selectNodes("/news-oppsRecord/topics/topic"));
nodes - List of dom4j Nodes of Elements or Attributes
public static final String[] extractStringsFromString(String separatedWords)
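Extracts the words from a String that was created using the method makeStringFromNodes(List nodes).
separatedWords - String of space-separated words, as created by makeStringFromNodes(List nodes)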
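To illustrate how the space-separated form differs from the phrase-separated form, here is a rough sketch that uses plain Strings in place of dom4j Nodes. The class and method names are hypothetical, and the handling of null or blank input is an assumption.

```java
import java.util.Arrays;
import java.util.List;

final class WordStringSketch {
    // Joins the text values with single spaces. The plain Strings stand in
    // for the text of dom4j Nodes.
    static String joinWithSpaces(List<String> nodeTexts) {
        if (nodeTexts == null || nodeTexts.isEmpty()) {
            return "";
        }
        return String.join(" ", nodeTexts);
    }

    // Splits on runs of white space, so each word comes back on its own.
    // Assumption: null or blank input yields an empty array.
    static String[] extractWords(String separatedWords) {
        if (separatedWords == null || separatedWords.trim().isEmpty()) {
            return new String[0];
        }
        return separatedWords.trim().split("\\s+");
    }

    public static void main(String[] args) {
        String joined = joinWithSpaces(Arrays.asList("ocean currents", "tides"));
        System.out.println(Arrays.toString(extractWords(joined)));
    }
}
```

In this sketch a multi-word value such as "ocean currents" comes back as two separate words, which is the case the phrase separator is meant to avoid.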
public static final String tokenizeID(String ID)
ID - The ID String
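The replacement described for tokenizeID can be sketched in a few lines. The ID value below is made up for illustration, and passing null through unchanged is an assumption rather than documented behavior.

```java
final class TokenizeIdSketch {
    // Replaces every '-' with a blank space so each segment of the ID
    // can be indexed as a separate token.
    // Assumption: a null ID is returned as null.
    static String tokenizeId(String id) {
        if (id == null) {
            return null;
        }
        return id.replace('-', ' ');
    }

    public static void main(String[] args) {
        // Hypothetical ID, for illustration only.
        System.out.println(tokenizeId("DLESE-000-000-000-001"));
    }
}
```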
public static final String tokenizeURI(String uri)
uri - A URL or URI
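This page does not say which characters count as unindexable. The sketch below assumes they are all characters other than letters and digits, and replaces each one with a single blank space; the real method may use a different character set.

```java
final class TokenizeUriSketch {
    // Replaces each character that is not a letter or digit with a space.
    // Assumption: "unindexable" means any non-alphanumeric character.
    static String tokenizeUri(String uri) {
        if (uri == null) {
            return null;
        }
        return uri.replaceAll("[^\\p{L}\\p{N}]", " ");
    }

    public static void main(String[] args) {
        // Example URL chosen for illustration only.
        System.out.println(tokenizeUri("http://www.example.org/library/index.html"));
    }
}
```

Because each character is replaced one for one, the result has the same length as the input, and splitting it on white space gives the individual URI segments.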
public static final String encodeToTerm(String text)
text - Text
public static final String encodeToTerm(String text,
boolean encodeWildCards)
text - Text
encodeWildCards - True to encode the '*' wildcard char, false to leave unencoded.
public static final org.apache.lucene.analysis.Token[] getAnalyzedTokens(String textToParse,
String field,
org.apache.lucene.analysis.Analyzer analyzer)
Extracts all Tokens from a Lucene query using the given Analyzer.
textToParse - The text to analyze with the analyzer
analyzer - The analyzer to use
field - The field this Analyzer should interpret the text as, or null to use 'default'
public static final String[] getAnalyzedTerms(String textToParse,
String field,
org.apache.lucene.analysis.Analyzer analyzer)
Extracts all terms in any field from a Lucene query using the given Analyzer.
textToParse - The text to analyze with the analyzer
analyzer - The analyzer to use
field - The field this Analyzer should interpret the text as, or null to use 'default'
public static final StringBuffer getAnalyzerOutput(String textToParse,
String field,
org.apache.lucene.analysis.Analyzer analyzer)
textToParse - The text to analyze with the analyzer
analyzer - The analyzer to use
field - The Lucene field name, or null to use default