Tuesday, 2 September 2014

Print screen OCR.


When browsing the web, you sometimes find important text that exists only inside a picture.
How do you get the text out of it?
(You can read it and type it in yourself, but come on, that is so obsolete :)

This post is a "how to" for running OCR (Optical Character Recognition) on a print screen in Ubuntu.

Most of the information comes from here: askubuntu.com/

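You need to install a few programs (tesseract-ocr-hun is the Hungarian language pack, so pick the package for your own language; imagemagick provides the mogrify command used by the script):
sudo apt-get install tesseract-ocr
sudo apt-get install tesseract-ocr-hun
sudo apt-get install imagemagick
sudo apt-get install scrot
sudo apt-get install xsel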
sudo add-apt-repository ppa:webupd8team/y-ppa-manager
sudo apt-get update
sudo apt-get install yad
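
After installing, it can help to confirm that the commands the script calls are really on the PATH. The following is only a minimal sketch: the helper name missing_deps is made up for this example, and it assumes that your tesseract version supports the --list-langs option.

```shell
#!/bin/bash
# missing_deps is a hypothetical helper, not part of the original script.
# It prints every command from its argument list that is not found on PATH.
missing_deps() {
    local cmd
    for cmd in "$@"; do
        command -v "$cmd" >/dev/null 2>&1 || printf '%s\n' "$cmd"
    done
}

# Commands used by the OCR script (mogrify comes from the imagemagick package).
missing=$(missing_deps tesseract scrot xsel mogrify)
if [ -n "$missing" ]; then
    echo "Missing commands:" $missing
else
    echo "All commands found."
    # List the installed OCR languages; "hun" should appear for Hungarian.
    tesseract --list-langs
fi
```

If a command is reported as missing, install its package again before going on.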

Then copy these lines into a file.
It is a shell script.
Name the file, for example, "ocr.gui":
#!/bin/bash
# DEPENDENCIES: tesseract-ocr imagemagick scrot
# AUTHOR: Glutanimate 2013 (http://askubuntu.com/users/81372/)
# NAME: ScreenOCR
# LICENSE: GNU GPLv3
#
# BASED ON: OCR script by Salem (http://askubuntu.com/a/280713/81372)
#TITLE=ScreenOCR # set yad variables
#ICON=gnome-screenshot
# - tesseract won't work if LC_ALL is unset so we set it here
# - you might want to delete or modify this line if you
# have a different locale:
export LC_ALL=hu_HU.UTF-8

LANG="hun"
echo "Language set to $LANG"
SCR_IMG=$(mktemp) # create tempfile
trap 'rm -f "$SCR_IMG"*' EXIT # make sure tempfiles get deleted afterwards
scrot -s "$SCR_IMG.png" # take screenshot of area
mogrify -modulate 100,0 -resize 400% "$SCR_IMG.png" # postprocess to prepare for OCR
tesseract -l "$LANG" "$SCR_IMG.png" "$SCR_IMG" # OCR in given language
xsel -bi < "$SCR_IMG.txt" # pass to clipboard
exit
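
To try the OCR step without selecting a screen area, the same preprocessing can be run on an image file you already have. This is only a sketch: the function name ocr_image and the default language "hun" are choices made for this example, and it uses ImageMagick's convert instead of mogrify so that your original file is not overwritten.

```shell
#!/bin/bash
# ocr_image FILE [LANGUAGE]: print the text recognized in an image file.
# Hypothetical helper for testing; not part of the original ScreenOCR script.
ocr_image() {
    local src=$1 lang=${2:-hun} tmp
    if [ ! -f "$src" ]; then
        echo "ocr_image: no such file: $src" >&2
        return 1
    fi
    tmp=$(mktemp -d) || return 1
    # Same preprocessing as the script (grayscale, 4x enlargement),
    # written to a copy inside the temporary directory.
    convert "$src" -modulate 100,0 -resize 400% "$tmp/in.png" &&
    tesseract "$tmp/in.png" "$tmp/out" -l "$lang" >/dev/null 2>&1 &&
    cat "$tmp/out.txt"
    local status=$?
    rm -rf "$tmp"
    return $status
}
```

For example, "ocr_image picture.png hun | xsel -bi" would put the recognized text on the clipboard, where picture.png stands for any image file of yours.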
Copy the "ocr.gui" shell script to /usr/local/bin as root.
Then make it executable:

cd /usr/local/bin
chmod 755 ocr.gui
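
The copy and the permission change can also be done in one step with the install command from coreutils. A small sketch, where install_script is a helper name invented for this example:

```shell
#!/bin/bash
# install_script FILE DIR: copy FILE into DIR and set its mode to 755.
# Hypothetical helper; the install command itself is part of coreutils.
install_script() {
    install -m 755 "$1" "$2/$(basename "$1")"
}

# Writing to /usr/local/bin needs root, so the real call would be:
#   sudo install -m 755 ocr.gui /usr/local/bin/ocr.gui
```

Afterwards you can run ocr.gui from a terminal, select an area, and check the result with "xsel -bo" before assigning the hotkey.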

Assign it to a hotkey, for example Print Screen+Shift+Control:
Start menu -> Administration -> Keyboard -> Hotkeys