A dataset containing the HTIDs (HathiTrust volume identifiers) of the volumes classified as "fiction" in Ted Underwood, Boris Capitanu, Peter Organisciak, Sayan Bhattacharyya, Loretta Auvil, Colleen Fallaw, J. Stephen Downie (2015). Word Frequencies in English-Language Literature, 1700-1922 (0.2) Dataset. HathiTrust Research Center. doi:10.13012/J8JW8BSJ. The data are taken from the summary metadata file at http://data.analytics.hathitrust.org/genre/fiction_metadata.csv

fiction

Format

An object of class tbl_df (inherits from tbl, data.frame) with 101948 rows and 3 columns.

Details

htid

The HathiTrust ID (HTID) of the volume.

fiction_prob

A confidence metric: the probability that more than 80% of the pages in the volume assigned to "fiction" have been correctly classified.

fiction_prop

The proportion of pages in the volume classified as "fiction", calculated as genrepages / totalpages from the original metadata file.
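
Examples

As a rough illustration of how the two numeric columns relate and might be used together, the sketch below builds a small stand-in table, derives fiction_prop from genrepages / totalpages as described above, and keeps only volumes that pass both a confidence and a proportion cutoff. The IDs, page counts, probabilities, and the cutoffs (0.9 and 0.8) are invented for illustration and are not recommendations from the dataset's authors. Only base R is used, so the same subsetting should also work on the tbl_df itself.

```r
# Toy stand-in for a few rows of the source metadata; all values are invented.
toy <- data.frame(
  htid         = c("vol.a", "vol.b", "vol.c"),  # hypothetical IDs, not real HTIDs
  fiction_prob = c(0.95, 0.60, 0.99),
  genrepages   = c(180, 40, 300),
  totalpages   = c(200, 400, 300),
  stringsAsFactors = FALSE
)

# fiction_prop is the share of pages classified as fiction.
toy$fiction_prop <- toy$genrepages / toy$totalpages

# Assumed cutoffs, chosen only for this example.
min_prob <- 0.9
min_prop <- 0.8

# Keep volumes that are both confidently classified and mostly fiction.
keep <- toy$fiction_prob >= min_prob & toy$fiction_prop >= min_prop
mostly_fiction <- toy[keep, c("htid", "fiction_prob", "fiction_prop")]

print(mostly_fiction)
```

Filtering on both columns is one possible design choice: fiction_prob guards against unreliable page-level classification, while fiction_prop excludes volumes in which fiction is only a small part of the text.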