A dataset containing the HTIDs of the volumes classified as "fiction" in Ted Underwood, Boris Capitanu, Peter Organisciak, Sayan Bhattacharyya, Loretta Auvil, Colleen Fallaw, J. Stephen Downie (2015). Word Frequencies in English-Language Literature, 1700-1922 (0.2) Dataset. HathiTrust Research Center. doi:10.13012/J8JW8BSJ. Taken from the summary metadata file at http://data.analytics.hathitrust.org/genre/fiction_metadata.csv
fiction
An object of class tbl_df (inherits from tbl, data.frame) with 101948 rows and 3 columns.
The HathiTrust ID (HTID) of the volume.
A confidence metric: the probability that more than 80% of the pages in the volume assigned to "fiction" have been correctly classified.
The proportion of pages in the volume classified as "fiction". Calculated from genrepages/totalpages in the original metadata file.
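
The proportion described above can be illustrated with a minimal sketch. The rows below are invented for illustration and are not taken from the dataset; the column names htid and prop are assumptions, while genrepages and totalpages are the names given for the original metadata file.

```r
# Hypothetical rows standing in for the original metadata file (invented values).
meta <- data.frame(
  htid = c("vol.a", "vol.b", "vol.c"),  # assumed name for the volume ID column
  genrepages = c(190, 45, 300),         # pages classified as "fiction"
  totalpages = c(200, 90, 300),         # all pages in the volume
  stringsAsFactors = FALSE
)

# Proportion of pages in each volume classified as "fiction".
meta$prop <- meta$genrepages / meta$totalpages

# Keep only volumes where at least 90% of pages were classified as "fiction".
mostly_fiction <- meta[meta$prop >= 0.9, ]
print(mostly_fiction)
```

A threshold such as 0.9 is arbitrary here; a stricter or looser cutoff may suit a given analysis better.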