Recovering Teletext data from VHS recordings

Rob Beschizza

9:28 am Mon, Jan 18, 2016

Teletext was an early mainstream precursor to the web that became successful in the UK and France: hundreds of low-res pages a day streamed in the vertical blanking interval, the unseen lines at the top of the TV signal. It died with analog television; archivists are finding the original data can be recovered from VHS tapes.

Technology is changing that. The continuing boom in processor power means it’s now possible to feed 15 minutes of smudged VHS teletext data into a computer and have it relentlessly compare the pages as they flick by at the top of the picture, choosing to hold characters that are the same on multiple viewings (as they’re likely to be right) and keep trying for clearer information for characters that frequently change (as they’re likely to be wrong).
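To make the "hold what agrees, retry what changes" idea concrete, here is a minimal sketch in Python. It is not the recovery software's actual code: it assumes each capture of a page has already been decoded into equal-sized rows of characters, and the names `vote_page`, `min_agreement` and `unknown` are made up for illustration. Real tools may well work on the raw signal rather than on decoded characters.

```python
from collections import Counter


def vote_page(captures, min_agreement=0.5, unknown="?"):
    """Merge noisy captures of one page by per-character majority vote.

    captures: list of pages; each page is a list of equal-length strings.
    A character is kept only if more than `min_agreement` of the captures
    agree on it; otherwise the cell is marked with `unknown`.
    """
    if not captures:
        return []
    rows = len(captures[0])
    cols = len(captures[0][0])
    merged = []
    for r in range(rows):
        line = []
        for c in range(cols):
            # Count what every capture saw at this screen position.
            counts = Counter(page[r][c] for page in captures)
            char, hits = counts.most_common(1)[0]
            # Stable characters are probably right; unstable ones need more frames.
            line.append(char if hits / len(captures) > min_agreement else unknown)
        merged.append("".join(line))
    return merged


# Three hypothetical, slightly different reads of the same two-row page.
captures = [
    ["HELLO WORLD", "PAGE 100   "],
    ["HELLO WQRLD", "PAGE 100   "],
    ["HELL0 WORLD", "PAGE 1O0   "],
]
print("\n".join(vote_page(captures)))
```

With only three reads, one bad character per position is outvoted. The more frames you feed in, the more damage each position can absorb, which is why the job eats so many frames and so much processing time.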

It’s an interesting study in horsepower: it takes such “phenomenal processing power” to accurately and reliably scan VHS recordings of text that we’re only now on the cusp of being able to do so.

That hundreds, even thousands, of frames of each teletext page are required to OCR it is also a powerful tribute to just how astoundingly awful VHS is.