magazineszuloo.blogg.se

Article extraction

Most applications of information extraction from the scientific medical bibliography use the Abstract of the publication (for review see for example ). In the context of information extraction in molecular biology, the information to be extracted from an article is usually understood to be the words describing biological concepts that can summarize the main points of the article (keywords). The Abstract of a paper is therefore a good target for information extraction because, by definition, an abstract synthesizes the content of the article. Moreover, abstracts are available in public databases.

However, most journals are now also available in electronic form, so full text articles can be used for information extraction as well. The full text of an article obviously contains more information than its Abstract, but full text analysis raises several problems. On the one hand, storing full text articles requires more disk space, and analyzing them requires more computational capacity. On the other hand, an Abstract, as a summary, contains a high frequency of relevant terms (keywords), which may not be the case for the rest of the article.

Other questions concern the quality of the information carried by the different sections of an article. First, is the information in the full text organized well enough for keywords to be extracted? Second, different biological concepts (for example, gene and protein names, tissue names, organisms, experimental conditions, etc.) may be located in different parts of the article.
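The contrast in keyword frequency between an Abstract and the rest of an article can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the function names `keyword_density` and `density_by_section`, the toy section texts, and the keyword list are all hypothetical, and the measure (keyword occurrences per 100 words) is one simple choice among many, not a method taken from the text.

```python
import re
from collections import Counter


def keyword_density(text, keywords):
    """Return keyword occurrences per 100 words of `text`.

    Assumption: keywords are single tokens and matching is
    case-insensitive; multi-word terms are not handled here.
    """
    words = re.findall(r"[a-z0-9]+", text.lower())
    if not words:
        return 0.0
    counts = Counter(words)
    hits = sum(counts[k.lower()] for k in keywords)
    return 100.0 * hits / len(words)


def density_by_section(sections, keywords):
    """Map each section name to its keyword density."""
    return {name: keyword_density(body, keywords)
            for name, body in sections.items()}


# Hypothetical toy article, invented purely for illustration.
sections = {
    "Abstract": "BRCA1 protein is expressed in breast tissue of mouse",
    "Methods": ("Samples were stored at low temperature and washed "
                "twice before analysis of BRCA1"),
}
keywords = ["BRCA1", "protein", "mouse"]

densities = density_by_section(sections, keywords)
for name, value in densities.items():
    print(f"{name}: {value:.1f} keywords per 100 words")
```

On real articles the section boundaries would have to be recovered from the journal's electronic format first, which is one of the organizational problems raised above.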










