How to OCR a low-resolution image
I have to extract the text from a video which has latitude-longitude data that looks like the following image:
The image is very low resolution, and both Tesseract and online OCR tools failed to extract the text from the unprocessed image. I tried to remove the grey background using this, and by subtracting a grey colour matrix, but neither produced meaningful output. I also converted the image to HSV to isolate the yellow text, but again got no meaningful results.
I was wondering if there is any way I can extract the text. The most promising lead seems to be that the background is greyscale and the text is translucent yellow.
The main difficulty with these images is segmenting the characters. If the characters sit at fixed positions in every frame, segmentation is already done and you can skip the next paragraph.
If not, start by locating the voids between the groups of characters by profile analysis (count the text pixels in each column and look for runs of empty columns), which makes the task easier. For every group, try to recognize the leftmost character, then skip past it to reach the next character, and so on.
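A minimal sketch of that profile analysis, assuming the frame has already been reduced to a binary NumPy array in which nonzero pixels are text. The function name `find_groups` and the `min_gap` parameter are made up for illustration; the gap width would need tuning to the actual font.

```python
import numpy as np

def find_groups(binary, min_gap=2):
    """Return (start, end) column ranges (end exclusive) of character groups.

    binary  : 2D array, nonzero where a pixel belongs to text.
    min_gap : a run of at least this many empty columns separates two
              groups; narrower gaps stay inside a group (assumed default).
    """
    profile = (binary != 0).sum(axis=0)  # vertical projection: text pixels per column
    has_ink = profile > 0

    groups = []
    start = None  # first column of the group being scanned, if any
    gap = 0       # length of the current run of empty columns
    for x, ink in enumerate(has_ink):
        if ink:
            if start is None:
                start = x
            gap = 0
        elif start is not None:
            gap += 1
            if gap >= min_gap:
                # The last column with ink was x - gap.
                groups.append((start, x - gap + 1))
                start = None
                gap = 0
    if start is not None:
        # The group runs to the right edge, minus any short trailing gap.
        groups.append((start, len(has_ink) - gap))
    return groups
```

Calling it with `min_gap=1` splits at every empty column, which roughly corresponds to isolating single characters when the glyphs do not touch.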
Recognition of the characters can be done by a straightforward SAD (sum of absolute differences) or SSD (sum of squared differences) comparison against reference characters in the same font.
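As a rough sketch of that comparison, assuming each candidate patch has been cropped to the same size as the reference glyphs: the `templates` dictionary (label to NumPy array) and the function names are hypothetical, and you would have to build the reference set from the video's own font.

```python
import numpy as np

def sad(a, b):
    """Sum of absolute differences between two equally sized patches."""
    return float(np.abs(a.astype(np.float64) - b.astype(np.float64)).sum())

def ssd(a, b):
    """Sum of squared differences between two equally sized patches."""
    d = a.astype(np.float64) - b.astype(np.float64)
    return float((d * d).sum())

def classify(patch, templates, score=sad):
    """Return (label, score) of the reference glyph closest to `patch`.

    templates : dict mapping a label such as "7" to a 2D array.
    score     : distance function, `sad` or `ssd`; lower is better.
    Templates whose shape differs from the patch are skipped.
    """
    best_label, best_score = None, float("inf")
    for label, tmpl in templates.items():
        if tmpl.shape != patch.shape:
            continue
        s = score(patch, tmpl)
        if s < best_score:
            best_label, best_score = label, s
    return best_label, best_score
```

The returned score can be compared against a cutoff of your choosing to reject patches that match no reference well.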
Do not expect very good results.
In general, DPI is just a number, so you can change it (I don't know how in Python, but there should be a way). Try setting it to 200 or 300 before passing the image to the OCR engine. If that doesn't help, try resizing the image to 200% in addition to setting the DPI to 300. Also, OCR tends to work best on black-and-white images, so try applying image processing to turn the image into black and white. I found the following link that might help you with that: Using python PIL to turn a RGB image into a pure black and white image
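For what it's worth, a sketch of those steps with Pillow might look like the following. The function name `prepare_for_ocr`, the scale factor, and the threshold of 128 are assumptions for illustration rather than tested settings for this video, and a plain brightness threshold may not separate translucent text well.

```python
from PIL import Image

def prepare_for_ocr(img, scale=2, threshold=128):
    """Upscale an image and reduce it to pure black and white.

    img       : a PIL.Image in any mode.
    scale     : integer enlargement factor (2 means 200%).
    threshold : grey levels at or above this become white, the rest black
                (assumed value, adjust per image).
    """
    gray = img.convert("L")  # 8-bit greyscale
    big = gray.resize((gray.width * scale, gray.height * scale), Image.BICUBIC)
    # Map every pixel to either 0 or 255.
    return big.point(lambda p: 255 if p >= threshold else 0)

# Example use (file names are placeholders):
#   out = prepare_for_ocr(Image.open("frame.png"))
#   out.save("frame_prepared.png", dpi=(300, 300))  # writes the DPI metadata
```

The `dpi` argument to `save` only sets metadata in the output file; the resizing step is what actually adds pixels.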