
shantanu27/Text-Annotation

Text-Annotation

One basic unit of metadata for text processing is the annotation. Every annotation has a type (String), a begin offset (int) and an end offset (int). The begin and end offsets indicate the position of the annotation in the original text. Specific annotation types may also contain additional attributes. For example, the annotation <type=PERSON, begin=0, end=4, gender=MALE> may be associated with the string "John loves Mary", indicating that the first four characters in the string represent a PERSON, with the additional feature "gender=MALE".
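As a minimal sketch of the annotation described above, the Python snippet below models the type, the two offsets, and the optional attributes. The class name `Annotation`, the `attributes` dictionary, and the `covered_text` helper are illustrative assumptions, not part of this repository. The end offset is also assumed to be exclusive, which matches the example where `begin=0, end=4` covers the four characters of "John".

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Annotation:
    """A typed span over a text, with optional extra attributes (hypothetical model)."""
    type: str
    begin: int  # offset of the first covered character
    end: int    # offset one past the last covered character (assumed exclusive)
    attributes: Dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject spans that start before the text or end before they begin.
        if not 0 <= self.begin <= self.end:
            raise ValueError(f"invalid span: begin={self.begin}, end={self.end}")

    def covered_text(self, text: str) -> str:
        """Return the part of the original text that this annotation covers."""
        return text[self.begin:self.end]


text = "John loves Mary"
person = Annotation("PERSON", 0, 4, {"gender": "MALE"})
print(person.covered_text(text))
```

Storing the additional features in a dictionary keeps the three required fields fixed while letting each annotation type carry its own attributes.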

An annotated string contains a text string (the original raw data) and a possibly empty set of all annotations produced for that text string. In the UIMA framework, the set of annotations for a text is referred to as the annotation index. Annotators may read the annotation index to examine pre-existing annotations. The index may be accessed by type, which returns all existing annotations of the given type, or by span (begin, end), which returns all annotations of any type that are contained in the given span.
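The two access paths described above can be sketched as follows. This is an illustration under stated assumptions, not the UIMA API and not this repository's implementation: the names `AnnotatedString`, `add`, `by_type`, and `by_span` are hypothetical, end offsets are assumed to be exclusive, and "contained in the span" is read as `begin <= a.begin` and `a.end <= end`. The `VERB` type and the second `PERSON` annotation are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Annotation:
    type: str
    begin: int
    end: int  # assumed exclusive
    attributes: Dict[str, str] = field(default_factory=dict)


class AnnotatedString:
    """A text plus its annotation index (hypothetical sketch)."""

    def __init__(self, text: str) -> None:
        self.text = text
        # Annotations grouped by type, so that lookup by type is direct.
        self._by_type: Dict[str, List[Annotation]] = defaultdict(list)

    def add(self, annotation: Annotation) -> None:
        # An annotation must lie inside the text it describes.
        if not 0 <= annotation.begin <= annotation.end <= len(self.text):
            raise ValueError("annotation offsets fall outside the text")
        self._by_type[annotation.type].append(annotation)

    def by_type(self, type_name: str) -> List[Annotation]:
        """Return all annotations of the given type (empty list if none)."""
        return list(self._by_type.get(type_name, []))

    def by_span(self, begin: int, end: int) -> List[Annotation]:
        """Return annotations of any type fully contained in [begin, end)."""
        return [
            a
            for annotations in self._by_type.values()
            for a in annotations
            if begin <= a.begin and a.end <= end
        ]


doc = AnnotatedString("John loves Mary")
doc.add(Annotation("PERSON", 0, 4, {"gender": "MALE"}))
doc.add(Annotation("PERSON", 11, 15))
doc.add(Annotation("VERB", 5, 10))

persons = doc.by_type("PERSON")
in_first_ten = doc.by_span(0, 10)
```

The span query here scans every annotation linearly, which is adequate for short texts; a sorted structure or interval tree would be a reasonable replacement if the index grows large.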
