The Hoover Institution recently finished digitizing 586 Nazi biograms collected by Theodore Abel, professor of sociology at Columbia University from 1929 to 1950. Abel gathered these documents a year after Hitler’s appointment as chancellor by setting up a contest offering 400 German marks “for the best personal life history of an adherent of the Hitler movement.” The collection, which represents a sample of 2.5% of early NSDAP members, forms the basis for Abel’s insightful book Why Hitler Came Into Power, published in 1938.
With the increasing sophistication of quantitative text analysis coinciding with the public release of the collection, we now have the opportunity to reanalyze these data with computational methods. Earlier applications of such methods have already shown the value of these data for revealing the narrative structures of becoming and being a Nazi, but to realize the full potential of new computational methods, the complete collection needs to be converted into a machine-readable format. To create these machine-readable files, this project will use optical character recognition software from ABBYY, a leader in the field, accessed through the cloud-based ABBYY OCR SDK.
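To make the conversion step concrete, the following is a minimal sketch of how a single scanned page might be submitted to the cloud service. It is not the project's actual pipeline. It assumes the v1 REST interface of the ABBYY Cloud OCR SDK (a `processImage` endpoint with `language` and `exportFormat` parameters, HTTP basic authentication, and an XML response containing a `task` element); the base URL, the helper names `process_image_url`, `parse_task`, and `submit_page`, and the choice of `German` as recognition language are illustrative assumptions that should be checked against ABBYY's current documentation.

```python
import base64
import urllib.request
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# Assumed base URL of the Cloud OCR SDK v1 REST interface (verify before use).
BASE_URL = "https://cloud.ocrsdk.com"


def process_image_url(language="German", export_format="txt"):
    """Build the URL used to submit one scanned page for recognition."""
    query = urlencode({"language": language, "exportFormat": export_format})
    return f"{BASE_URL}/processImage?{query}"


def parse_task(xml_payload):
    """Extract task id, status, and result URL from an XML task response."""
    task = ET.fromstring(xml_payload).find("task")
    if task is None:
        raise ValueError("no <task> element in response")
    return {
        "id": task.get("id"),
        "status": task.get("status"),
        "result_url": task.get("resultUrl"),  # absent until the task completes
    }


def submit_page(path, app_id, password):
    """Upload one image file and return the parsed task description.

    Hypothetical helper: assumes the image is sent as the raw request body
    and that credentials are passed via HTTP basic authentication.
    """
    with open(path, "rb") as handle:
        image_bytes = handle.read()
    request = urllib.request.Request(
        process_image_url(), data=image_bytes, method="POST"
    )
    token = base64.b64encode(f"{app_id}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    with urllib.request.urlopen(request) as response:
        return parse_task(response.read())
```

Under these assumptions, a call such as `submit_page("biogram_001.jpg", app_id, password)` (the file name is a placeholder) would return a task record whose status can then be polled until a result URL becomes available. Keeping URL construction and response parsing in separate functions allows both to be checked without network access or credentials.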