Penn Treebank Online

This is a tgrep interface to several Penn Treebank parsed corpora. To use this interface, you need to know the tgrep query syntax and be familiar with the tgrep options. See the short introduction or complete documentation for more information.
tgrep search pattern:

(The default search pattern finds verb phrases headed by "believe" that have an infinitival complement with a non-null subject. It also returns some false matches, because the pattern attempts to cover both Treebank I and Treebank II styles as well as tagged and untagged corpora. More specific patterns are possible, although some errors caused by annotator mistakes will always remain.)

command-line options for tgrep:
(The man page provides a summary of options.)

Available corpora:
Corpora labelled "with POS tags" include part-of-speech tags as preterminals for each word. Searches involving lexical items are frequently easier to specify using corpora that do not include POS tags (labelled "w/o POS tags").
The terms "Treebank I" and "Treebank II" refer to different bracketing styles -- Treebank II is a richer, more complicated system designed to allow the extraction of simple predicate/argument structure.
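
To make the "with POS tags" versus "w/o POS tags" distinction concrete, the following Python sketch parses a bracketed tree and removes the part-of-speech preterminals, leaving each word directly under its phrase label. It is only an illustration: the example sentence is invented rather than taken from any of the corpora, the function names (parse, strip_pos, show) are hypothetical, and this is not how tgrep or the corpora themselves were prepared.

```python
import re

def parse(text):
    """Parse one bracketed tree into nested lists: [label, child, child, ...]."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def node():
        nonlocal pos
        if tokens[pos] == "(":
            pos += 1                      # consume "("
            label = tokens[pos]
            pos += 1
            children = []
            while tokens[pos] != ")":
                children.append(node())
            pos += 1                      # consume ")"
            return [label] + children
        word = tokens[pos]                # a bare word (leaf)
        pos += 1
        return word

    return node()

def strip_pos(tree):
    """Replace each preterminal (a tag over exactly one word) with the bare word."""
    if isinstance(tree, str):
        return tree
    label, children = tree[0], tree[1:]
    if len(children) == 1 and isinstance(children[0], str):
        return children[0]
    return [label] + [strip_pos(child) for child in children]

def show(tree):
    """Render nested lists back into bracketed notation."""
    if isinstance(tree, str):
        return tree
    return "(" + tree[0] + " " + " ".join(show(child) for child in tree[1:]) + ")"

# Invented example sentence, for illustration only.
tagged = "(S (NP (PRP I)) (VP (VBP believe) (NP (PRP it))))"
untagged = show(strip_pos(parse(tagged)))
print(untagged)  # (S (NP I) (VP believe (NP it)))
```

In the tagged tree, "believe" sits under a VBP node, so a pattern that wants the verb directly under VP must account for the intervening tag. In the untagged tree the word is an immediate child of VP, which is why lexical searches tend to be simpler to write against the corpora without POS tags.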


(Your search may take up to two minutes because of the I/O requirements of running tgrep.)