Artificial colorization of digitized microfilms: a preliminary study

2023, Journal of Data Mining and Digital Humanities

Many of the digitized manuscripts available online are in fact digitized microfilms, a technology dating back to the 1930s. Given recent progress in artificial colorization, we hypothesize that microfilms could be colorized with these technologies, and we test this with InstColorization. We train a model on a new dataset of 18 788 color images that are artificially grayscaled for this purpose. The colorization results are promising, but they show clear limitations caused by the difference between artificially grayscaled images and "naturally" grayscaled microfilms. We then evaluate the impact of this artificial colorization on two downstream tasks using Kraken: layout analysis and text recognition. The results show little to no improvement, which limits the interest of artificial colorization of manuscripts in the computer vision domain.

"Many low resolution digital scans of microfilms exist. These are surrogates of surrogates. They can still be (and are) profitably used, for example to corroborate a particular reading. I am however skeptical of using them as a single source for making an edition. Perhaps, indeed, 99% of a manuscript can still be deciphered by using them, but it is about that 1% of cases in which the scribe fumbled a bit with his pen and it is unclear what the word reads. In those 1% cases, you do not wish to have a low-resolution, black and white reproduction of a reproduction as your sole witness."
L. W. C. van Lit [2019]
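The artificial grayscaling step mentioned in the abstract can be illustrated with a minimal sketch. The abstract does not say which conversion the authors used, so the ITU-R BT.601 luma weights below are an assumption, and the function name `to_grayscale` is hypothetical. The point is only to show that a synthetic gray image is a fixed linear function of the color channels, which is one plausible reason it differs from a "naturally" grayscaled microfilm.

```python
# Minimal sketch, NOT the authors' pipeline: the weighting scheme is an assumption.
# ITU-R BT.601 luma weights for the red, green, and blue channels.
LUMA_WEIGHTS = (0.299, 0.587, 0.114)


def to_grayscale(pixels):
    """Convert rows of (R, G, B) tuples (0-255) into rows of gray levels (0-255)."""
    wr, wg, wb = LUMA_WEIGHTS
    return [
        [round(wr * r + wg * g + wb * b) for (r, g, b) in row]
        for row in pixels
    ]


if __name__ == "__main__":
    # A tiny 1x3 "image": pure red, pure green, pure blue.
    sample = [[(255, 0, 0), (0, 255, 0), (0, 0, 255)]]
    print(to_grayscale(sample))
```

A real microfilm carries film grain, uneven exposure, and the spectral response of the film stock, none of which this deterministic mapping reproduces.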
