I have scanned books against a black imitation-leather background. Unfortunately, the text recognition (OCR) detects spurious text in this background. I would like to color the border black so that the program does not find any text at the edges. Is this possible with tools like ImageMagick or GraphicsMagick?
2 Answers
Perhaps a combination of floodfill and fuzz?
convert input.png -fill white -fuzz 20% -draw 'color 1,1 floodfill' output.png
Also check out Fred's awesome textcleaner script.
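A single seed point at 1,1 only reaches background that is connected to the top-left corner. If a page touches the edge of the scan, the rest of the background may need seeds at the other corners as well. Here is a minimal sketch of that idea, assuming a POSIX shell and that `identify` is installed alongside `convert`; `corner_fills` and `fill_border` are hypothetical helper names, not part of ImageMagick, and the 20% fuzz is simply carried over from the command above:

```shell
#!/bin/sh
# Hypothetical helper: print a -draw string that floodfills from all four
# corners of an image that is $1 pixels wide and $2 pixels high.
corner_fills() {
  x_max=$(( $1 - 1 ))   # rightmost pixel column
  y_max=$(( $2 - 1 ))   # bottom pixel row
  printf '%s\n' "color 0,0 floodfill color $x_max,0 floodfill color 0,$y_max floodfill color $x_max,$y_max floodfill"
}

# Hypothetical wrapper: fill_border INPUT OUTPUT [COLOR]
# COLOR defaults to white, as in the command above.
fill_border() {
  w=$(identify -format '%w' "$1") || return 1   # image width in pixels
  h=$(identify -format '%h' "$1") || return 1   # image height in pixels
  convert "$1" -fill "${3:-white}" -fuzz 20% \
    -draw "$(corner_fills "$w" "$h")" "$2"
}
```

Something like `fill_border input.png output.png black` would then fill the border with black, as the question asks, rather than white. The fuzz value will likely need tuning per scan.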
Yes! Provided that ImageMagick was installed with libtiff support (usually included with default installers). Commented Aug 21, 2018 at 13:24
emcconville has an excellent solution. I might add a bit to it: a deskew and a trim/shave, since your margins are large enough to permit shaving off the excess black that remains after a trim. The deskew might help the OCR.
convert image.png -bordercolor black -border 1 -background black -deskew 40% -fuzz 50% -trim +repage -shave 10x10 result.png
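For batch use, the same options can be wrapped in a small function with each step annotated. This is only a sketch under assumptions: `clean_scan` is a hypothetical name, the shave width is made a parameter (default 10, as in the command above), and the fuzz and deskew values are carried over unchanged and may need tuning for other scans.

```shell
#!/bin/sh
# Hypothetical wrapper: clean_scan INPUT OUTPUT [SHAVE_PIXELS]
clean_scan() {
  src=$1
  dst=$2
  shave=${3:-10}
  [ -f "$src" ] || { echo "clean_scan: no such file: $src" >&2; return 1; }

  set -- "$src"
  # Add a 1-pixel black border so every edge starts with a uniform black
  # color for -trim to work from.
  set -- "$@" -bordercolor black -border 1
  # Straighten the page; areas exposed by the rotation are filled with black.
  set -- "$@" -background black -deskew 40%
  # Trim the near-black edges, then reset the canvas offset with +repage.
  set -- "$@" -fuzz 50% -trim +repage
  # Shave off the thin dark slivers that remain after the trim.
  set -- "$@" -shave "${shave}x${shave}"

  convert "$@" "$dst"
}
```

A loop such as `for f in *.png; do clean_scan "$f" "clean_$f"; done` would then process a whole directory of scans.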