Tips for Using the Machine Aided Indexing Application
Below are some general facts, guidelines, and recommendations
that should help in getting the best results from the various
features of the MAI application.
Adding and Selecting Input Text
- The MAI text processor will accept any character-based
text that you copy and paste (or type) into the workscreen, from
any source, including plain-text and word-processing documents,
text from web pages, and character-based text copied from PDF
documents.
- How much text can be input? The application is designed
to handle at least an average-sized full-text report or article.
Your web browser may impose its own limit (for example, a
30,000-character limit was observed in some older versions of
Netscape), so using the latest available version of your browser
is recommended.
- How little text can be input? Short summaries and abstracts
give good results. In most cases, entering only a single word,
phrase, or short sentence will not provide enough information
to get good results.
- Texts from different subject areas. The MAI knowledge
base uses the NASA Thesaurus as its subject domain, which broadly
covers the natural and applied sciences, engineering, and many
specialized technologies. MAI works best in these areas, but
good results can be obtained for some non-technical topics as
well.
- In some cases, better-quality output will result if you omit
those parts of a document that do not address its main theme
(such as a preface or acknowledgements section).
- Processing more than one abstract or short document at
a time. Processing multiple abstracts, brief documents, or web
pages at once can be helpful when you want a quick-look overview
of the concepts addressed by those items as a collection. The
output will also provide a frequency ranking for each concept
term.
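If you prepare input text outside the browser, the tips above on
combining several abstracts and on browser size limits can be
sketched as a small script. This is only an illustration: the
helper name combine_for_input is hypothetical and not part of the
MAI application, and the 30,000-character figure is simply the
limit noted above for some older Netscape versions, assumed here
as a default.

```python
# Hypothetical helper for preparing text before pasting it into the
# MAI workscreen. It is not part of the MAI application itself.

# Assumed default: the limit noted above for some older Netscape versions.
ASSUMED_CHAR_LIMIT = 30_000


def combine_for_input(documents, limit=ASSUMED_CHAR_LIMIT):
    """Join several abstracts into one block of text.

    Returns the combined text and a flag telling whether its length
    is within the assumed character limit.
    """
    # Drop empty items and trim surrounding whitespace from the rest.
    cleaned = [doc.strip() for doc in documents if doc.strip()]
    # Separate the items with a blank line so they stay readable.
    combined = "\n\n".join(cleaned)
    return combined, len(combined) <= limit


abstracts = [
    "First abstract text goes here.",
    "Second abstract text goes here.",
]
text, fits = combine_for_input(abstracts)
print(len(text), "characters; within assumed limit:", fits)
```

If the flag comes back false, processing the items in smaller
groups is one way to stay within a browser-imposed limit.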
Selecting Terms
The MAI application was designed as a machine-aided
rather than a fully automatic indexing tool. For this reason
it is recommended that you review the list of output terms
and select those you judge to be appropriate for a given
document. Additional terms can also be identified by browsing
the NASA Thesaurus using the thesaurus search function.
Output terms are ranked according to the frequency with
which each term was suggested from text expressions in a
document. Particularly when a full-text document is input,
this ranking can be helpful in reviewing the term list and
making term selections.
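The idea of ranking by suggestion frequency can be illustrated
with a minimal sketch. It assumes the suggested terms are available
as a simple list with one entry per suggestion. It is not the MAI
application's actual code, the function name rank_terms is
hypothetical, and the sample terms are illustrative only and have
not been checked against the NASA Thesaurus.

```python
from collections import Counter


def rank_terms(suggestions):
    """Rank terms by how often each was suggested, most frequent first.

    Ties are broken alphabetically so the order is repeatable.
    This tie rule is an assumption, not documented MAI behavior.
    """
    counts = Counter(suggestions)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


# Illustrative sample: one entry per time a term was suggested.
suggested = [
    "SOLAR WIND",
    "MAGNETOSPHERES",
    "SOLAR WIND",
    "PLASMAS",
    "SOLAR WIND",
    "PLASMAS",
]

for term, count in rank_terms(suggested):
    print(f"{count:>3}  {term}")
```

A term suggested many times across a full-text document rises to
the top of such a list, which is why the ranking is a useful
starting point for review rather than a final selection.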
Additional information: