FairScan 2.0 released

Post Syndicated from jzb original https://lwn.net/Articles/1078242/

Version 2.0 of the FairScan document-scanning app for Android has been
released. The headline feature for this release is the addition of
optical-character-recognition (OCR) support using Tesseract to produce PDFs
with searchable text from scans. FairScan developer Pierre-Yves
Nicolas has written a detailed blog post about adding the feature and
why it had not been added previously:

That looks nice, so why didn’t FairScan have it before? That’s
because FairScan wasn’t ready for it: I wouldn’t be comfortable if
FairScan was giving you wrong text half of the time. To get good
results from an OCR engine, you need to provide it a readable
image. If it’s hard to read for a human, it’s certainly also hard to
read for an OCR engine.

Over the past year, I worked on different parts of FairScan’s
automatic processing to transform photos of documents into PDFs that
are easy for humans to read:

  • document detection
  • perspective correction
  • shadow reduction
  • brightness and contrast enhancement

All this work on image processing helped FairScan produce clean
PDFs and can now also contribute to making text recognition effective.

FairScan is available via Google Play or F-Droid.