STAR's Indexing Rules/Basic Index

Input data are extracted into search terms automatically when records are created or edited. STAR constructs a continually updated file of these search terms, referred to as the "index." The version of STAR's internal index that can be accessed from search pages, as shown below, contains the search terms, the name of the search field to which each term belongs, and the current count of records that contain a given term.

The example below illustrates a partial index display from an Additional Search Options search line with both full terms and extracted words associated with several of the search fields included in the default, Basic Index.

The next screen capture shows search terms in an index display from a single field.

Indexing Rules

STAR's index is constructed according to rules that are specified for each input field in the definition of a given database. Any one input field can be made searchable in more than one way, so a single input field may be associated with more than one search field name. "Potpourri" search fields have also been defined to combine search terms from multiple, related input fields.

Text and Date Fields

Among the indexing rules available for different types of data, the three main ones are the Fields, Words, and Date rules.

Fields Rule:  In applying this rule, STAR extracts the entire value in a field or in each occurrence of a repeating field. For example, if a record has two Subject occurrences:

Western Territories

War

Each would be individually searchable in a FIELDS-searchable SUBLC search field as:

SUBLC=WESTERN TERRITORIES

SUBLC=WAR

FIELDS-searchable search terms may be a single word or a multi-word phrase. A search of SUBLC=WAR is an automatic "exact match" search. This means that the search will not retrieve records simply because one of their terms includes the word "war" (e.g., American Revolutionary War; American Civil War). Records with these longer terms are still retrieved if they also carry the single-word term "war."
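The exact-match behavior of the Fields rule can be illustrated with a minimal Python sketch. This is not STAR's implementation: the record layout, the function names, and the upper-casing of terms are assumptions made for illustration. Only the SUBLC search field and the subject values come from the example above.

```python
from collections import defaultdict

def build_fields_index(records, input_field, search_field):
    """Index each occurrence of a repeating field as one whole term (hypothetical helper)."""
    index = defaultdict(set)  # (search field, term) -> ids of records containing the term
    for record_id, record in records.items():
        for occurrence in record.get(input_field, []):
            # Assumption: terms are upper-cased and internal whitespace is normalized.
            term = " ".join(occurrence.upper().split())
            index[(search_field, term)].add(record_id)
    return index

def search(index, search_field, term):
    """Exact-match lookup: the whole term must equal the indexed value."""
    key = (search_field, " ".join(term.upper().split()))
    return sorted(index.get(key, set()))

records = {
    1: {"Subject": ["Western Territories", "War"]},
    2: {"Subject": ["American Civil War"]},
    3: {"Subject": ["American Revolutionary War", "War"]},
}
index = build_fields_index(records, "Subject", "SUBLC")

print(search(index, "SUBLC", "war"))       # [1, 3]
print(len(index[("SUBLC", "WAR")]))        # 2 (record count for the term)
```

Record 2 is not retrieved because its only subject is the longer term, while record 3 is retrieved because it also carries the single-word term.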

Words Rule:  In applying this rule, STAR extracts individual words from a field or from each occurrence of a repeating field. If proximity indexing rule options have been specified in the definition, STAR also maintains information about the position of each word in the field, occurrence, paragraph, subfield or sentence so that proximity operators can be used in combining words for precise retrieval.

Hyphenated terms are split into separate words and punctuation in a word is stripped. For example, in the title "No Man's Land Benton County," any combination of these search terms could be used to retrieve the item: no, man, s, land, benton, county.

For more convenient word searches of numeric strings with embedded punctuation (i.e., if there is a number on at least one side of a period or comma), the "full value" is retained as a word. For example, if searching for Article 27.1 of the European Patent Convention, you could enter a search of:  ARTICLE NEAR 27.1.
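The word extraction described above can be approximated with a short Python sketch. It is an illustration only, under two assumptions: punctuation acts as a word separator (which matches the "man, s" example), and a numeric string with an embedded period or comma is kept only in its full form. The function name and the regular expression are hypothetical, not part of STAR.

```python
import re

# An alphanumeric run, optionally extended across a period or comma
# when a digit sits on at least one side of that punctuation mark.
WORD_PATTERN = re.compile(
    r"[a-z0-9]+(?:(?:(?<=[0-9])[.,](?=[a-z0-9])|[.,](?=[0-9]))[a-z0-9]+)*"
)

def extract_words(value):
    """Split a field value into lower-cased search words (hypothetical helper)."""
    return WORD_PATTERN.findall(value.lower())

print(extract_words("No Man's Land Benton County"))
# ['no', 'man', 's', 'land', 'benton', 'county']
print(extract_words("Article 27.1 of the European Patent Convention"))
# ['article', '27.1', 'of', 'the', 'european', 'patent', 'convention']
```

The list position of each word could serve as the position information that proximity operators rely on.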

Date Rule:  STAR extracts the value in a date-defined input field, e.g., mm/dd/yyyy, and transforms the value into this index format:  yyyy mm dd.
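As a minimal sketch of this transformation, the Python function below converts an mm/dd/yyyy input value into the yyyy mm dd index format. The function name is hypothetical, and real date-defined fields may accept other input formats.

```python
from datetime import datetime

def date_index_term(value):
    """Convert an mm/dd/yyyy input value to the 'yyyy mm dd' index format."""
    parsed = datetime.strptime(value, "%m/%d/%Y")
    return parsed.strftime("%Y %m %d")

print(date_index_term("07/04/1976"))  # 1976 07 04
```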

Subfielded Fields

Each subfield can be made separately searchable. For example, in a Creator (AU) occurrence that looks like the ones below, the personal name in the first, unlabeled subfield can be made searchable in one search field, and different search fields can reference the role in subfield "r" (where | is the subfield marker and "r" is the label), the date in subfield "d," and the Record ID of the linked-to Names authority record in subfield "z."

These separately defined subfields allow for index displays in which name data are not mixed in with other data about the person and for searching each subfield separately. For example, you could enter this expert search:

Indexing Rules and Searching

Expert search examples are used below to illustrate the types of searches that can be formulated for fields defined with one or more of the basic set of STAR's indexing rules. When you are searching in Assisted Search mode, the search field qualifiers and combinations of words into phrases are applied automatically.

Fields-searchable Searches

Text fields, such as titles, creators/authors, subjects, media types, and status data, are indexed with the Fields rule for "exact match" searching.

For example, a Title input value of Contemporary Writings would be searchable in a Fields-searchable TI search field as:  contemporary writings.

If a title were simply Writings, a TI search of writings would retrieve the one record with that value, but not all other titles with that word.

If a term includes reserved characters or words, e.g., the AND in a title of Historical and Contemporary Writings, expert searching requires that you surround the search term with double quotes so that STAR will treat the reserved word as part of the search term. In assisted searches, this double-quoting is applied automatically.
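The automatic double-quoting can be pictured with the following Python sketch. The set of reserved words is an assumption for illustration (only AND is named above as reserved), reserved characters are not handled, and the function name is hypothetical.

```python
# Assumed reserved words; STAR's actual list may differ.
RESERVED_WORDS = {"AND", "OR", "NEAR"}

def quote_if_reserved(term):
    """Wrap a search term in double quotes if it contains a reserved word."""
    if any(word in RESERVED_WORDS for word in term.upper().split()):
        return f'"{term}"'
    return term

print(quote_if_reserved("Historical and Contemporary Writings"))
# "Historical and Contemporary Writings"
print(quote_if_reserved("Contemporary Writings"))
# Contemporary Writings
```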

Words-searchable Searches

Most of the text fields defined for Fields searching are typically also indexed with the Words rule to provide for keyword searching. Long-text fields, such as the Scope and Descriptive Notes fields, are indexed only with the Words rule.

For example, a TIW words search of contemporary near writing* might retrieve a number of titles that included the two words.

  • In Expert Search Mode, the operator you select to combine words into phrases determines the level of precision in your results: OR is the broadest, AND is the next broadest, and the proximity operators provide additional degrees of precision.
  • In Assisted Search Mode, you would enter contemporary writing*, and the system automatically applies an operator between the two words (e.g., W/O or NEAR). In the Additional Search Options section of each page, you can select the "combine words" operator to be applied, for the level of precision that you want.
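The difference in precision between these operators can be sketched in Python with a toy set of titles. The titles, the helper names, and the reading of NEAR as "adjacent words in either order" are assumptions for illustration and may not match how STAR defines its proximity operators.

```python
import re

def words(title):
    """Lower-cased words of a title, in order."""
    return re.findall(r"[a-z0-9]+", title.lower())

def positions(title, pattern):
    """Word positions matching a pattern; a trailing * means truncation."""
    if pattern.endswith("*"):
        return [i for i, w in enumerate(words(title)) if w.startswith(pattern[:-1])]
    return [i for i, w in enumerate(words(title)) if w == pattern]

def search(titles, a, b, operator):
    hits = []
    for record_id, title in titles.items():
        pa, pb = positions(title, a), positions(title, b)
        if operator == "OR":
            found = bool(pa or pb)
        elif operator == "AND":
            found = bool(pa and pb)
        else:  # "NEAR": assumed to mean adjacent words, in either order
            found = any(abs(i - j) == 1 for i in pa for j in pb)
        if found:
            hits.append(record_id)
    return hits

titles = {
    1: "Contemporary Writings",
    2: "Writings on Contemporary Art",
    3: "Contemporary Dance",
    4: "Collected Letters",
}
print(search(titles, "contemporary", "writing*", "OR"))    # [1, 2, 3]
print(search(titles, "contemporary", "writing*", "AND"))   # [1, 2]
print(search(titles, "contemporary", "writing*", "NEAR"))  # [1]
```

Each step from OR to AND to NEAR returns a subset of the previous result.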

Date Searches

All date search lines are defined for assisted searching so that you can enter any of several formats, e.g.:

yyyy
m/d/yyyy
m/yyyy
mm/dd/yyyy

In expert mode, use the index format of:  yyyy 0m 0d. Truncation can be applied after any component.
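The Python sketch below shows one way the assisted-search date formats could be mapped to the expert index format. It assumes that * is the truncation symbol (as in the writing* example) and that a partial date is searched by truncating after its last component. The function name is hypothetical.

```python
def expert_date_term(entry):
    """Map yyyy, m/yyyy, m/d/yyyy, or mm/dd/yyyy to the zero-padded index format."""
    parts = entry.strip().split("/")
    if len(parts) == 1:                       # yyyy
        return f"{int(parts[0]):04d}*"
    if len(parts) == 2:                       # m/yyyy
        month, year = parts
        return f"{int(year):04d} {int(month):02d}*"
    if len(parts) == 3:                       # m/d/yyyy or mm/dd/yyyy
        month, day, year = parts
        return f"{int(year):04d} {int(month):02d} {int(day):02d}"
    raise ValueError(f"Unrecognized date format: {entry!r}")

print(expert_date_term("1976"))        # 1976*
print(expert_date_term("7/1976"))      # 1976 07*
print(expert_date_term("7/4/1976"))    # 1976 07 04
```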

The Basic Index

The Basic Index (BI) search field is the default search field in all databases. It is defined to include multiple search fields — both Fields- and Words-defined fields — for the main content-related input fields, along with any other key identifier or descriptive fields.

When searching in the BI field, STAR automatically ORs together your search term(s) across all of the constituent search fields included in the BI definition. You can override the default BI search by specifying your own search field qualifiers in Expert mode or by selecting from the list of search fields in the Additional Search Options section of a Search page.
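The ORing across constituent search fields can be pictured as a union of per-field results. The Python sketch below is illustrative only: the index contents, the list of fields assumed to make up the BI definition, and the function name are hypothetical.

```python
def search_basic_index(index, bi_fields, term):
    """Union (OR) of the records found for one term in every BI constituent field."""
    hits = set()
    for field in bi_fields:
        hits |= index.get((field, term.upper()), set())
    return sorted(hits)

# Toy index: (search field, term) -> ids of records containing the term.
index = {
    ("TI", "WAR"): {4},
    ("TIW", "WAR"): {4, 5},
    ("SUBLC", "WAR"): {1, 3},
}
BI_FIELDS = ["TI", "TIW", "SUBLC"]  # assumed BI definition

print(search_basic_index(index, BI_FIELDS, "war"))   # [1, 3, 4, 5]
print(search_basic_index(index, ["SUBLC"], "war"))   # [1, 3]
```

The second call corresponds to overriding the default with a single search field qualifier.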

To specify your own search field in Expert mode, enter:

FIELDA=searchterm1 AND searchterm2

FIELDA,FIELDB=searchterm1 OR searchterm2

FIELDA=(searchterm1 OR searchterm2) AND FIELDB=searchterm3