Adds an 'imputed' date of publication column based on the imprint column of the hathifile. This function searches the imprint column for a year, using a regex that matches four digits beginning with 1 or 2, and extracts the match as an imputed date after checking that it is not greater than the current year.
add_imputed_date(hathi_file)
The hathifile in memory, typically loaded by load_raw_hathifile (and possibly filtered afterwards). Must contain an "imprint" column.
The hathifile with an added imputed date column.
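
The extraction rule described above can be sketched in base R. This is only an illustration under stated assumptions, not the package's implementation: the helper name impute_year, the output column name imputed_year, and the choice to take the first match in each imprint string are all hypothetical, and the sample imprints are made up.

```r
# Hypothetical helper illustrating the rule; not the package's own code.
# Assumption: the first four-digit run starting with 1 or 2 is taken as the year.
impute_year <- function(imprint,
                        current_year = as.integer(format(Sys.Date(), "%Y"))) {
  year <- rep(NA_integer_, length(imprint))

  # Position of the first match in each string (-1 when nothing matches).
  hit <- regexpr("[12][0-9]{3}", imprint)
  found <- !is.na(hit) & hit > 0

  # regmatches() returns only the matched substrings, in order.
  year[found] <- as.integer(regmatches(imprint, hit))

  # Discard years that lie in the future.
  year[!is.na(year) & year > current_year] <- NA_integer_
  year
}

# Made-up imprint strings, for illustration only.
toy <- data.frame(
  imprint = c("London, Macmillan, 1899.", "New York : [s.n.], n.d.", "Paris, 2999"),
  stringsAsFactors = FALSE
)

# Assumed column name; current_year is fixed so the result is reproducible.
toy$imputed_year <- impute_year(toy$imprint, current_year = 2024L)
toy$imputed_year
# 1899 NA NA
```

Imprints with no plausible year, or with a year later than the current one, end up as NA in this sketch. Note that the pattern can also match inside a longer run of digits, so a stricter pattern may be preferable for messy imprint data.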