Document Date

Description

Provides several methods for setting the date of a document. Use either the standalone docdate annotator or the ner.docdate sub-annotator contained in the ner annotator. If you use the sub-annotator in ner, do not also use the standalone annotator.

| Property name | Annotator class name | Generated Annotation |
| --- | --- | --- |
| docdate | DocDateAnnotator | DocDateAnnotation |

Example Usage

Command Line

# as a standalone annotator
java edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,docdate -docdate.useFixedDate 2019-01-01 -file example.txt
# as a sub-annotator of ner
java edu.stanford.nlp.pipeline.StanfordCoreNLP -annotators tokenize,pos,lemma,ner -ner.docdate.useFixedDate 2019-01-01 -file example.txt
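
The same settings can also be supplied programmatically. The sketch below only builds the java.util.Properties objects that mirror the two command lines above. It assumes the usual CoreNLP convention that a command-line flag such as -docdate.useFixedDate corresponds to a property of the same name. The class and method names (DocDatePropsSketch, standaloneProps, nerProps) are hypothetical, and the pipeline construction is left as a comment so the example runs without CoreNLP on the classpath.

```java
import java.util.Properties;

public class DocDatePropsSketch {

    // Properties mirroring the standalone docdate command line.
    static Properties standaloneProps(String fixedDate) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,docdate");
        props.setProperty("docdate.useFixedDate", fixedDate);
        return props;
    }

    // Properties mirroring the ner sub-annotator command line.
    // Note that docdate is NOT listed as a separate annotator here.
    static Properties nerProps(String fixedDate) {
        Properties props = new Properties();
        props.setProperty("annotators", "tokenize,pos,lemma,ner");
        props.setProperty("ner.docdate.useFixedDate", fixedDate);
        return props;
    }

    public static void main(String[] args) {
        Properties props = standaloneProps("2019-01-01");
        // With CoreNLP on the classpath, these properties would typically be
        // passed to the pipeline constructor: new StanfordCoreNLP(props).
        // That call is omitted so this sketch has no external dependencies.
        props.forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```

Keeping the two configurations in separate methods makes it harder to enable both the standalone annotator and the ner sub-annotator at the same time.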

Options

| Option name | Type | Default | Description |
| --- | --- | --- | --- |
| docdate.useFixedDate | String | - | Set every document to have a fixed date (e.g. 2019-01-01) |
| docdate.useMappingFile | file, classpath, or URL | - | Use a tab-delimited file to specify doc dates. First column is document ID, second column is date. |
| docdate.usePresent | - | - | Set every document's date to the present date. |
| docdate.useRegex | String | - | Specify a regular expression matching file names. The first group will be extracted as the date. (e.g. NYT-([0-9]{4}-[0-9]{2}-[0-9]{2}).xml ) |
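
To make the useRegex and useMappingFile formats concrete, here is a small standalone sketch using only the Java standard library. It is not CoreNLP's implementation: the class and method names (DocDateOptionsSketch, dateFromFileName, parseMapping) are hypothetical. It also assumes that the regular expression must match the whole file name, and that mapping lines with fewer than two tab-separated columns are skipped.

```java
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class DocDateOptionsSketch {

    // Returns the first capture group if the regex matches the whole file name.
    // Assumption: a full match is required; the real annotator may differ.
    static Optional<String> dateFromFileName(String regex, String fileName) {
        Matcher m = Pattern.compile(regex).matcher(fileName);
        if (m.matches() && m.groupCount() >= 1) {
            return Optional.ofNullable(m.group(1));
        }
        return Optional.empty();
    }

    // Parses tab-delimited lines: column 1 is the document ID, column 2 the date.
    // Assumption: lines with fewer than two columns are ignored.
    static Map<String, String> parseMapping(List<String> lines) {
        Map<String, String> dates = new LinkedHashMap<>();
        for (String line : lines) {
            String[] cols = line.split("\t");
            if (cols.length >= 2) {
                dates.put(cols[0].trim(), cols[1].trim());
            }
        }
        return dates;
    }

    public static void main(String[] args) {
        String regex = "NYT-([0-9]{4}-[0-9]{2}-[0-9]{2}).xml";
        System.out.println(dateFromFileName(regex, "NYT-2019-01-01.xml").orElse("no match"));
        System.out.println(parseMapping(List.of("doc1\t2019-01-01", "doc2\t2018-12-31")));
    }
}
```

In the example pattern the dot before xml is unescaped, so it matches any single character. Writing it as \. would restrict the match to a literal dot.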