Here is an example of a fully-annotated sentence:
WORDS---->  NE--->  POS   PARTIAL_SYNT  FULL_SYNT------>  VS  TARGETS  PROPS------->
The         *       DT    (NP*   (S*    (S(NP*            -   -        (A0*   (A0*
$           *       $     *      *      (ADJP(QP*         -   -        *      *
1.4         *       CD    *      *      *                 -   -        *      *
billion     *       CD    *      *      *))               -   -        *      *
robot       *       NN    *      *      *                 -   -        *      *
spacecraft  *       NN    *)     *      *)                -   -        *)     *)
faces       *       VBZ   (VP*)  *      (VP*              01  face     (V*)   *
a           *       DT    (NP*   *      (NP*              -   -        (A1*   *
six-year    *       JJ    *      *      *                 -   -        *      *
journey     *       NN    *)     *      *                 -   -        *      *
to          *       TO    (VP*   (S*    (S(VP*            -   -        *      *
explore     *       VB    *)     *      (VP*              01  explore  *      (V*)
Jupiter     (ORG*)  NNP   (NP*)  *      (NP(NP*)          -   -        *      (A1*
and         *       CC    *      *      *                 -   -        *      *
its         *       PRP$  (NP*   *      (NP*              -   -        *      *
16          *       CD    *      *      *                 -   -        *      *
known       *       JJ    *      *      *                 -   -        *      *
moons       *       NNS   *)     *)     *)))))))          -   -        *)     *)
.           *       .     *      *)     *)                -   -        *      *
There is one
line for each token, and a blank line after the last token. The columns,
separated by spaces, represent different annotations of the sentence, each
expressed as a tagging over its words. For structured annotations (named
entities, chunks, clauses, parse trees, arguments), we use the Start-End format.
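As a rough illustration of the layout just described, the following Python sketch splits a file into sentences at blank lines and turns each sentence into per-annotation columns. The function names read_sentences and to_columns are hypothetical and are not part of the scripts distributed with the task; the sketch assumes only that columns are separated by whitespace.

```python
def read_sentences(lines):
    """Yield each sentence as a list of token rows (one list of fields per token)."""
    rows = []
    for line in lines:
        fields = line.split()      # columns are separated by spaces
        if fields:
            rows.append(fields)
        elif rows:                 # a blank line closes the current sentence
            yield rows
            rows = []
    if rows:                       # tolerate a missing final blank line
        yield rows


def to_columns(rows):
    """Transpose token rows into columns: one list of tags per annotation."""
    return [list(col) for col in zip(*rows)]


if __name__ == "__main__":
    sample = "The * DT (NP*\nrobot * NN *)\n\nfaces * VBZ (VP*)\n\n"
    for sentence in read_sentences(sample.splitlines()):
        print(to_columns(sentence))
```

Working column by column is convenient here because each structured annotation is decoded independently of the others.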
The Start-End format represents phrases (chunks, arguments, and syntactic
constituents) that constitute a well-formed bracketing in a sentence (that is,
phrases do not overlap, though they admit embedding). Each tag is of the form
STARTS*ENDS, and marks the phrases that start and end at the corresponding word.
A phrase of type k places a "(k" parenthesis in the STARTS part of its first
word, and a ")" parenthesis in the ENDS part of its last word. For example, the
tag (VP*) on "faces" in the chunk column marks a one-word VP chunk.
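One way to decode a Start-End column is with a stack, since a well-formed bracketing always closes the most recently opened phrase first. The Python sketch below is only an illustration under that assumption; the name start_end_to_spans is hypothetical and this is not one of the provided scripts. It returns (type, first word, last word) triples with 0-based word positions.

```python
import re


def start_end_to_spans(tags):
    """Decode one Start-End column into (label, first, last) word spans."""
    spans, stack = [], []
    for i, tag in enumerate(tags):
        starts, _, ends = tag.partition("*")
        # Every "(k" in the STARTS part opens a phrase of type k at word i.
        for label in re.findall(r"\(([^(*)]+)", starts):
            stack.append((label, i))
        # Every ")" in the ENDS part closes the most recently opened phrase.
        for _ in range(ends.count(")")):
            if not stack:
                raise ValueError("unmatched ')' at word %d" % i)
            label, first = stack.pop()
            spans.append((label, first, i))
    if stack:
        raise ValueError("unclosed phrases: %r" % stack)
    return spans


if __name__ == "__main__":
    # Arguments of "faces" in the example sentence (first PROPS column).
    props = ["(A0*", "*", "*", "*", "*", "*)", "(V*)", "(A1*"] + ["*"] * 9 + ["*)", "*"]
    print(start_end_to_spans(props))
```

Spans are reported in the order in which they close, so embedded phrases appear before the phrases that contain them.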
Scripts will be provided to transform a column in Start-End format into other
standard formats (IOB1, IOB2, WSJ trees). The Start-End format used last year
(that considered the phrase type in the start and end parts) will be compatible
with the current software and scripts.
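To give an idea of what such a transformation involves, here is a minimal Python sketch that converts a flat Start-End column, such as the chunk column, into IOB2 tags. It is not the official script: the name start_end_to_iob2 is hypothetical, and the sketch assumes that phrases in the column are not embedded, since IOB2 cannot represent nesting.

```python
def start_end_to_iob2(tags):
    """Convert a non-embedded Start-End column into a list of IOB2 tags."""
    iob, current = [], None
    for tag in tags:
        starts, _, ends = tag.partition("*")
        if starts:                        # "(k" opens a phrase of type k
            current = starts.lstrip("(")
            iob.append("B-" + current)
        elif current is not None:         # still inside the open phrase
            iob.append("I-" + current)
        else:                             # outside any phrase
            iob.append("O")
        if ")" in ends:                   # ")" closes the phrase at this word
            current = None
    return iob


if __name__ == "__main__":
    # Chunk tags for "a six-year journey to explore Jupiter and" in the example.
    chunks = ["(NP*", "*", "*)", "(VP*", "*)", "(NP*)", "*"]
    print(start_end_to_iob2(chunks))
```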
The different annotations in a sentence are grouped in the following blocks:
(S
  (NP (DT The)
    (ADJP
      (QP ($ $) (CD 1.4) (CD billion) ))
    (NN robot) (NN spacecraft) )
  (VP (VBZ faces)
    (NP (DT a) (JJ six-year) (NN journey)
      (S
        (VP (TO to)
          (VP (VB explore)
            (NP
              (NP (NNP Jupiter) )
              (CC and)
              (NP (PRP$ its) (CD 16) (JJ known) (NNS moons) )))))))
  (. .) )