A dataset containing the HTIDs of the volumes classified as "poetry" in Ted Underwood, Boris Capitanu, Peter Organisciak, Sayan Bhattacharyya, Loretta Auvil, Colleen Fallaw, J. Stephen Downie (2015). Word Frequencies in English-Language Literature, 1700-1922 (0.2) Dataset. HathiTrust Research Center. doi:10.13012/J8JW8BSJ. Taken from the summary metadata file at http://data.analytics.hathitrust.org/genre/drama_metadata.csv

poetry

Format

An object of class tbl_df (inherits from tbl, data.frame) with 58724 rows and 3 columns.

Details

htid

The HathiTrust ID (HTID) of the volume.

poetry_prob

A confidence metric: the probability that more than 80% of the pages in the volume assigned to "poetry" have been correctly classified.

poetry_prop

The proportion of pages in the volume classified as "poetry". Calculated from genrepages/totalpages in the original metadata file.
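
Examples

The sketch below illustrates how poetry_prop relates to the genrepages and totalpages columns named above. The three rows are invented stand-ins for the original metadata file (the htid values are not real volume IDs), and the 0.5 cutoff is an arbitrary choice for illustration, not a threshold recommended by the dataset authors.

```r
# Toy rows standing in for the original metadata file.
# The htid values and page counts are made up for illustration.
meta <- data.frame(
  htid = c("example.001", "example.002", "example.003"),
  genrepages = c(120, 45, 10),    # pages classified as "poetry"
  totalpages = c(150, 300, 200),  # all pages in the volume
  stringsAsFactors = FALSE
)

# poetry_prop as described above: genre pages divided by total pages.
meta$poetry_prop <- meta$genrepages / meta$totalpages

# Keep volumes where more than half of the pages are poetry.
# The 0.5 cutoff is an assumption chosen only for this example.
mostly_poetry <- meta[meta$poetry_prop > 0.5, ]
print(mostly_poetry$htid)
```

The same subsetting pattern could be applied to the poetry object itself, for example by filtering on poetry_prob as well as poetry_prop before requesting volumes by htid.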