Monday, March 25, 2019

redCV and image denoising

redCV can be used for image denoising, and it includes many functions to help with image restoration. Basically, a 3x3 kernel is used to compute a value from a pixel's neighbors, and the pixel value in the image is replaced by the result. Of course, the kernel size can be changed. Depending on the noise contained in the image, you can use different parametric filters.
When the noise is simple, such as salt-and-pepper noise, a simple median filter is efficient: the central pixel value is replaced by the median value of its neighbors by the rcvMedianFilter function.

But when the image is really noisy, a median filter is not efficient enough.
In this case the *rcvMinFilter* function can be used: the central pixel value is replaced by the minimum value of its neighbors, and the result is pretty good.
You can also use *rcvMaxFilter* (maximum value of the neighbors) or *rcvMidPointFilter* (the central pixel value is replaced by the minimum and maximum values of the neighbors divided by 2), according to the noise contained in the image.
redCV also includes a *rcvMeanFilter* function, which can be used for image smoothing.
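The neighborhood logic behind all of these filters can be sketched in a few lines. The snippet below is plain Python for illustration only (filter3x3, median, and midpoint are hypothetical names, not redCV functions); it applies a statistic over each pixel's 3x3 neighborhood, exactly as described above:

```python
def filter3x3(img, stat):
    """Replace each inner pixel by `stat` of its 3x3 neighborhood.

    `img` is a grayscale image as a list of rows; borders are left unchanged.
    """
    h, w = len(img), len(img[0])
    out = [row[:] for row in img]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            neighbors = [img[y + dy][x + dx]
                         for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            out[y][x] = stat(neighbors)
    return out

def median(values):          # rcvMedianFilter's statistic
    s = sorted(values)
    return s[len(s) // 2]

def midpoint(values):        # rcvMidPointFilter's statistic: (min + max) / 2
    return (min(values) + max(values)) // 2

# A flat image with a single "salt" pixel:
noisy = [[10] * 5 for _ in range(5)]
noisy[2][2] = 255
print(filter3x3(noisy, median)[2][2])  # -> 10, the outlier is removed
```

Passing the built-in min or max as the statistic gives the analogues of rcvMinFilter and rcvMaxFilter.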

Code sample

As usual with Red and redCV, the code is very clear and simple.
Red [
    Title: "Smoothing filters for image "
    Author: "Francois Jouen"
    File:    %smoothing.red
    Needs:   'View
]
#include %../../libs/redcv.red ; for redCV functions
kSize: 3x3
src: make image! 512x512
dst: make image! 512x512
isFile: false

loadImage: does [
    isFile: false
    canvas2/image: none
    tmp: request-file
    if not none? tmp [
        src: load tmp   
        dst: make image! src/size
        canvas1/image: src
        isFile: true
    ]
]
view win: layout [
    title "Red Smoothing Filters"
    origin 10x10 space 10x10
    button "Load"               [loadImage]
    text "Filter Size"
    field "3x3"                 [if error? try [kSize: to-pair face/text] [kSize: 3x3]]
    button "Median"             [if isFile [rcvMedianFilter src dst kSize canvas2/image: dst]]
    button "Min"                [if isFile [rcvMinFilter src dst kSize canvas2/image: dst]]
    button "Max"                [if isFile [rcvMaxFilter src dst kSize canvas2/image: dst]]
    button "MidPoint"           [if isFile [rcvMidPointFilter src dst kSize canvas2/image: dst]]
    button "Arithmetic Mean"    [if isFile [rcvMeanFilter src dst kSize 0 canvas2/image: dst]]
    button "Harmonic Mean"      [if isFile [rcvMeanFilter src dst kSize 1 canvas2/image: dst]]
    button "Geometric Mean"     [if isFile [rcvMeanFilter src dst kSize 2 canvas2/image: dst]]
    button "Quit"               [quit]
    return
    canvas1: base 512x512 white
    canvas2: base 512x512 white
]



Thursday, March 21, 2019

Deep Learning Text Recognition (OCR) using Tesseract and Red

I frequently use Tesseract OCR when I have to recognize text in images.
Tesseract was initially developed by Hewlett-Packard Labs. In 2005 it was open-sourced by HP, and since 2006 it has been actively developed by Google and the open-source community.

In version 4, Tesseract introduced a long short-term memory (LSTM) recognition engine, a kind of recurrent neural network (RNN) that is very efficient for OCR.

The Tesseract library includes a command-line tool, tesseract, which can be used to perform OCR on images and write the result to a text file.

Install Tesseract

First of all, you need to install the Tesseract library according to your OS: sudo apt install tesseract-ocr on Linux, or brew install tesseract on macOS.

If you want multi-language support (about 170 languages), you also have to install the *tesseract-lang* package.


Using Tesseract with Red Language

This is really easy, since Red includes a call function that makes it possible to run the tesseract command-line tool. You have to use the call/wait refinement in order to wait for tesseract to finish. You can use different languages according to your documents.
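The command line that the script below builds has the shape tesseract &lt;image&gt; &lt;output base&gt; -l &lt;lang&gt; --oem &lt;mode&gt;, and tesseract appends .txt to the output base. As a minimal sketch of the same synchronous call in Python (the file names page.png and out are hypothetical):

```python
import os
import shutil
import subprocess

def tesseract_cmd(image, out_base, lang="eng", oem=1):
    """Build a tesseract command line; tesseract writes its result to <out_base>.txt."""
    return ["tesseract", image, out_base, "-l", lang, "--oem", str(oem)]

# Run synchronously, the equivalent of Red's call/wait, if the tool and image exist:
if shutil.which("tesseract") and os.path.exists("page.png"):
    subprocess.run(tesseract_cmd("page.png", "out"), check=True)
    text = open("out.txt", encoding="utf-8").read()
```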

Code Sample

Red [
    Title:   "OCR"
    Author:  "Francois Jouen"
    File:    %tesseract.red
    Needs:   View
    icon:    %red.ico
]
; Languages
tessdata: [
    "afr (Afrikaans)"
    "amh (Amharic)"
    "ara (Arabic)"
    "asm (Assamese)"
    "aze (Azerbaijani)"
    "aze_cyrl (Azerbaijani - Cyrillic)"
    "bel (Belarusian)"
    "ben (Bengali)"
    "bod (Tibetan)"
    "bos (Bosnian)"
    "bre (Breton)"
    "bul (Bulgarian)"
    "cat (Catalan; Valencian)"
    "ceb (Cebuano)"
    "ces (Czech)"
    "chi_sim (Chinese - Simplified)"
    "chi_tra (Chinese - Traditional)"
    "chr (Cherokee)"
    "cym (Welsh)"
    "dan (Danish)"
    "deu (German)"
    "dzo (Dzongkha)"
    "ell (Greek Modern (1453-))"
    "eng (English)"
    "enm (English Middle (1100-1500))"
    "epo (Esperanto)"
    "equ (Math / equation detection module)"
    "est (Estonian)"
    "eus (Basque)"
    "fas (Persian)"
    "fin (Finnish)"
    "fra (French)"
    "frk (Frankish)"
    "frm (French Middle (ca.1400-1600))"
    "gle (Irish)"
    "glg (Galician)"
    "grc (Greek Ancient (to 1453))"
    "guj (Gujarati)"
    "hat (Haitian; Haitian Creole)"
    "heb (Hebrew)"
    "hin (Hindi)"
    "hrv (Croatian)"
    "hun (Hungarian)"
    "iku (Inuktitut)"
    "ind (Indonesian)"
    "isl (Icelandic)"
    "ita (Italian)"
    "ita_old (Italian - Old)"
    "jav (Javanese)"
    "jpn (Japanese)"
    "kan (Kannada)"
    "kat (Georgian)"
    "kat_old (Georgian - Old)"
    "kaz (Kazakh)"
    "khm (Central Khmer)"
    "kir (Kirghiz; Kyrgyz)"
    "kor (Korean)"
    "kor_vert (Korean (vertical))"
    "kur (Kurdish)"
    "kur_ara (Kurdish (Arabic))"
    "lao (Lao)"
    "lat (Latin)"
    "lav (Latvian)"
    "lit (Lithuanian)"
    "ltz (Luxembourgish)"
    "mal (Malayalam)"
    "mar (Marathi)"
    "mkd (Macedonian)"
    "mlt (Maltese)"
    "mon (Mongolian)"
    "mri (Maori)"
    "msa (Malay)"
    "mya (Burmese)"
    "nep (Nepali)"
    "nld (Dutch; Flemish)"
    "nor (Norwegian)"
    "oci (Occitan (post 1500))"
    "ori (Oriya)"
    "osd (Orientation and script detection module)"
    "pan (Panjabi; Punjabi)"
    "pol (Polish)"
    "por (Portuguese)"
    "pus (Pushto; Pashto)"
    "que (Quechua)"
    "ron (Romanian; Moldavian; Moldovan)"
    "rus (Russian)"
    "san (Sanskrit)"
    "sin (Sinhala; Sinhalese)"
    "slk (Slovak)"
    "slv (Slovenian)"
    "snd (Sindhi)"
    "spa (Spanish; Castilian)"
    "spa_old (Spanish; Castilian - Old)"
    "sqi (Albanian)"
    "srp (Serbian)"
    "srp_latn (Serbian - Latin)"
    "sun (Sundanese)"
    "swa (Swahili)"
    "swe (Swedish)"
    "syr (Syriac)"
    "tam (Tamil)"
    "tat (Tatar)"
    "tel (Telugu)"
    "tgk (Tajik)"
    "tgl (Tagalog)"
    "tha (Thai)"
    "tir (Tigrinya)"
    "ton (Tonga)"
    "tur (Turkish)"
    "uig (Uighur; Uyghur)"
    "ukr (Ukrainian)"
    "urd (Urdu)"
    "uzb (Uzbek)"
    "uzb_cyrl (Uzbek - Cyrillic)"
    "vie (Vietnamese)"
    "yid (Yiddish)"
    "yor (Yoruba)"
]
; OCR Engine Mode
ocr: [
    "Original Tesseract only"
    "Neural nets LSTM only"
    "Tesseract + LSTM"
    "Default, based on what is available"
]



; adapt appDir to your configuration if needed, or simply use what-dir
appDir: what-dir
tFile: to-file rejoin [appDir "tempo"]
tFileExt: to-file rejoin [appDir "tempo.txt"]
change-dir to-file appDir

dSize: 512
gsize: as-pair dSize dSize
img: make image! reduce [gSize black]
lang: "eng"
ocrMode: 3
tmpf: none
isFile: false
tBuffer: copy []

loadImage: does [
    tmpf: request-file
    isFile: false
    if not none? tmpf [
        clear result/text
        img: load tmpf
        canvas/image: img
        isFile: true
    ]
]

processFile: does [
    if isFile [
        if exists? tFileExt [delete tFileExt]
        clear result/text
        prog: copy "/usr/local/bin/tesseract "
        append prog form tmpf
        append append prog " " form tFile
        case [
            ocrMode = 0 [append append prog " -l " lang]
            ocrMode = 1 [append append prog " -l " lang append append prog " --oem " ocrMode]
            ocrMode = 2 [append append prog " -l " lang]
            ocrMode = 3 [append append prog " -l " lang append append prog " --oem " ocrMode]
        ]
        call/wait prog
        either cb/data [
            clear tBuffer
            clear result/data
            tt: read tFileExt
            tBuffer: split tt "^/"
            nl: length? tBuffer
            i: 1
            while [i <= nl][
                ligne: tBuffer/:i
                ll: length? ligne
                if ll > 1 [append result/data rejoin [ligne lf]]
                i: i + 1
            ]
            result/text: copy form result/data
        ][
            result/text: read tFileExt
        ]
    ]
]

; ***************** Test Program Interface ************************
view win: layout [
    title "Tesseract OCR with Red"
    button "Load Image" [loadImage]
    text 60 "Language"
    dp1: drop-down 180 data tessdata
        select 24
        on-change [
            s: dp1/data/(face/selected)
            lang: first split s " "
        ]
    text 80 "OCR mode"
    dp2: drop-down 230 data ocr
        select 4
        on-change [ocrMode: face/selected - 1]
    cb: check "Lines" false
    button "Process" [processFile]
    button "Clear" [clear result/text]
    button "Quit" [if exists? tFileExt [delete tFileExt] quit]
    return
    canvas: base gsize img
    result: area gsize font [name: "Arial" size: 16 color: black]
        data []
    return
    f: field 512
    text "Font"
    drop-list 120
        data ["Arial" "Consolas" "Comic Sans MS" "Times" "Hannotate TC"]
        react [result/font/name: pick face/data any [face/selected 1]]
        select 1
    fs: field 50 "16"
        react [result/font/size: fs/data]
    button 30 "+" [fs/data: fs/data + 1]
    button 30 "-" [fs/data: max 1 fs/data - 1]
    drop-list 100
        data ["black" "blue" "green" "yellow" "red"]
        react [result/font/color: reduce to-word pick face/data any [face/selected 1]]
        select 1
    do [f/text: copy form appDir]
]


Result

Here is an example with a simplified Chinese document.




Saturday, November 17, 2018

Neural Network with Red language

Thanks to:  
Andrew Blais (onlymice@gnosis.cx), Gnosis Software, Inc. 
David Mertz (mertz@gnosis.cx), Gnosis Software, Inc.
Michael Shook, http://mshook.webfactional.com/talkabout/archive/

Just for fun, we'll test Red's capacities for building neural networks. Here we use a simple network with 2 input neurons, 3 hidden neurons, and 1 output neuron. The Red code is based on the Back-Propagation Neural Networks Python code by Neil Schemenauer (nas@arctrix.com) and on Karl Lewin's code for the Rebol language.

You'll easily find detailed explanations of neural networks on the Internet.

Neural Networks

Simply speaking, the human brain consists of billions of neurons, and each neuron is connected to many other neurons. Through these connections, neurons both send and receive varying quantities of signals. One very important feature of neurons is that they don't react immediately to incoming signals: they sum the signals they receive, and send their own signal to other neurons only when this sum has reached a threshold. The human brain learns by adjusting the number and strength of the connections between neurons.

Threshold logic units (TLUs)

The first step toward understanding neural networks is to abstract from the biological neuron and to consider artificial neurons as threshold logic units (TLUs). A TLU is an object that takes an array of weighted input values, sums them, and, if this sum is greater than or equal to some threshold, outputs a signal. This means that TLUs can classify data. Imagine an artificial neuron with two inputs, both of whose weights equal 1, and whose threshold equals 1.5. With the inputs [0 0], [0 1], [1 0], and [1 1], the neuron will output 0, 0, 0, and 1 respectively. The hidden neurons used in the Red code are TLUs.
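This behavior is easy to check with a few lines of code (plain Python for illustration; tlu is a hypothetical name, not something from the Red script):

```python
def tlu(inputs, weights, threshold):
    """Threshold logic unit: fire (1) when the weighted input sum reaches the threshold."""
    total = sum(i * w for i, w in zip(inputs, weights))
    return 1 if total >= threshold else 0

# Two inputs, both weights equal to 1, threshold 1.5: only [1 1] sums to >= 1.5,
# so this unit classifies its inputs like a Boolean AND.
outputs = [tlu(p, [1, 1], 1.5) for p in ([0, 0], [0, 1], [1, 0], [1, 1])]
print(outputs)  # -> [0, 0, 0, 1]
```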


Network training 

Since TLUs can classify, neural networks can be built to artificially learn simple rules such as Boolean operators. The learning mechanism is modeled on the brain's adjustment of its neural connections: a TLU learns by changing its weights and threshold. This is done by a process called training. The concept is not difficult to understand. Basically, we need a set of input values and the desired output for each set of inputs. This corresponds to the truth table of the Boolean operator we want the network to learn, such as XOR:
Input 1   Input 2   Output
   0         0         0
   0         1         1
   1         0         1
   1         1         0

We first set the weights to random values. Then each set of input values is evaluated and compared to the desired output for that set. We add up each of these differences to get a summed error value for this set of weights. Then we modify the weights and go through each of the input/output sets again to find the total error for the new weights. Finally, we use a backpropagation algorithm to make the network learn. The backpropagation algorithm looks for the minimum of the error function in weight space using a technique called the delta rule, or gradient descent. The weights that minimize the error function are then considered to be a solution to the learning problem.
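For a single logistic neuron, one delta-rule step can be sketched as follows (plain Python; the learning rate 0.5 matches the script's default, everything else is illustrative). Repeating this step over every input/output pair, plus a backward pass through the hidden layer, is what the backPropagation function in the code does:

```python
import math

def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))

def delta_rule_step(weights, inputs, target, lr=0.5):
    """One gradient-descent step on the squared error of a logistic neuron."""
    out = logistic(sum(w * x for w, x in zip(weights, inputs)))
    # the derivative of the logistic function at `out` is out * (1 - out)
    delta = (target - out) * out * (1.0 - out)
    new_weights = [w + lr * delta * x for w, x in zip(weights, inputs)]
    return new_weights, (target - out) ** 2

# Drive the neuron's output toward 1 for the input [1, 1]:
w, err = [0.1, -0.2], 1.0
for _ in range(2000):
    w, err = delta_rule_step(w, [1.0, 1.0], 1.0)
```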



Different Boolean operators ["XOR" "OR" "NOR" "AND" "NAND"] can be used to test the network. You can also play with the number of iterations used to train the network. Lastly, two activation functions are implemented for the neurons: the standard logistic (exponential) function 1/(1 + e^-x), or a tanh-based sigmoid.
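Before the Red listing, the whole procedure can be made concrete with a minimal sketch of the same 2-3-1 architecture in Python (tanh activations, a bias input, and online updates; the momentum term of the Red code is omitted for brevity, and all names are illustrative — convergence depends on the random initialization):

```python
import math
import random

random.seed(1)

def dtanh(y):
    return 1.0 - y * y  # derivative of tanh, expressed via its output

class Net:
    """2-3-1 backpropagation network with a bias input, as in the Red script."""
    def __init__(self, ni, nh, no):
        self.ni, self.nh, self.no = ni + 1, nh, no  # +1 bias input
        self.wi = [[random.uniform(-2, 2) for _ in range(nh)] for _ in range(self.ni)]
        self.wo = [[random.uniform(-2, 2) for _ in range(no)] for _ in range(nh)]

    def forward(self, inputs):
        self.ai = [float(v) for v in inputs] + [1.0]  # bias activation
        self.ah = [math.tanh(sum(self.ai[i] * self.wi[i][j] for i in range(self.ni)))
                   for j in range(self.nh)]
        self.ao = [math.tanh(sum(self.ah[j] * self.wo[j][k] for j in range(self.nh)))
                   for k in range(self.no)]
        return self.ao

    def backward(self, targets, lr=0.5):
        # output and hidden error terms (delta rule), then weight updates
        od = [dtanh(self.ao[k]) * (targets[k] - self.ao[k]) for k in range(self.no)]
        hd = [dtanh(self.ah[j]) * sum(od[k] * self.wo[j][k] for k in range(self.no))
              for j in range(self.nh)]
        for j in range(self.nh):
            for k in range(self.no):
                self.wo[j][k] += lr * od[k] * self.ah[j]
        for i in range(self.ni):
            for j in range(self.nh):
                self.wi[i][j] += lr * hd[j] * self.ai[i]
        return sum((t - o) ** 2 for t, o in zip(targets, self.ao))

xor = [([0, 0], [0]), ([1, 0], [1]), ([0, 1], [1]), ([1, 1], [0])]
net = Net(2, 3, 1)
errors = []
for _ in range(1000):
    epoch_error = 0.0
    for inp, tgt in xor:
        net.forward(inp)
        epoch_error += net.backward(tgt)
    errors.append(epoch_error)
```

The training loop mirrors trainNetwork in the Red code: one forward and one backward pass per pattern, with the summed squared error tracked per epoch.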

Code

Red [
    Title:   "Red Neural Network"
    Author:  "Francois Jouen"
    File:    %neuraln.red
    Needs:   View
]
{This code is based on Back-Propagation Neural Networks
by Neil Schemenauer <nas@arctrix.com>
Thanks to Karl Lewin for the Rebol version}

; default number of input, hidden, and output nodes
nInput: 2
nHidden: 3
nOutput: 1
; activations for nodes
aInput: []
aHidden: []
aOutput: []
; weights matrices
wInput: []
wOutput: []
; matrices for last change in weights for momentum
cInput: []
cOutput: []
learningRate: 0.5   ; learning rate
momentumFactor: 0.1 ; momentum factor

n: 1280       ; number of training samples
netR: copy [] ; learning result
step: 8

; XOR by default
pattern: [
    [[0 0] [0]]
    [[1 0] [1]]
    [[0 1] [1]]
    [[1 1] [0]]
]


;calculate a random number where: a <= rand < b
rand: function [a [number!] b [number!]] [(b - a) * ((random 10000.0) / 10000.0) + a]

; Make matrices
make1DMatrix: function [mSize [integer!] value [number!] return: [block!]][
    m: copy []
    repeat i mSize [append m value]
    m
]
make2DMatrix: function [line [integer!] col [integer!] value [number!] return: [block!]][
    m: copy []
    repeat i line [
        blk: copy []
        repeat j col [append blk value]
        append/only m blk
    ]
    m
]

tanh: function [x [number!] return: [number!]][((EXP x) - (EXP negate x)) / ((EXP x) + (EXP negate x))]

;sigmoid function, tanh seems better than the standard 1/(1+e^-x)

sigmoid: function [x [number!] return: [number!]][tanh x]

; derivative of the sigmoid (tanh) function: 1 - y^2
; note: parentheses are required, since Red evaluates infix operators left to right
dsigmoid: function [y [number!] return: [number!]][1.0 - (y * y)]

createMatrices: func [][
    aInput: make1DMatrix nInput 1.0
    aHidden: make1DMatrix nHidden 1.0
    aOutput: make1DMatrix nOutput 1.0
    wInput: make2DMatrix nInput nHidden 0.0
    wOutput: make2DMatrix nHidden nOutput 0.0
    cInput: make2DMatrix nInput nHidden 0.0
    cOutput: make2DMatrix nHidden nOutput 0.0
    randomizeMatrix wInput -2.0 2.0
    randomizeMatrix wOutput -2.0 2.0
]

randomizeMatrix: function [mat [block!] v1 [number!] v2 [number!]][
    foreach elt mat [loop length? elt [elt: change/part elt rand v1 v2 1]]
]

computeMatrices: func [inputs [block!] return: [block!]][
    ; input activations
    repeat i (nInput - 1) [poke aInput i to float! inputs/:i]
    ; hidden activations
    repeat j nHidden [
        sum: 0.0
        repeat i nInput [sum: sum + (aInput/:i * wInput/:i/:j)]
        either cb/data [poke aHidden j sigmoid sum][
            poke aHidden j 1 / (1 + EXP negate sum)
        ]
    ]
    ; output activations
    repeat j nOutput [
        sum: 0.0
        repeat i nHidden [sum: sum + (aHidden/:i * wOutput/:i/:j)]
        either cb/data [poke aOutput j sigmoid sum][
            poke aOutput j 1 / (1 + EXP negate sum)
        ]
    ]
    aOutput
]
backPropagation: func [targets [block!] N [number!] M [number!] return: [number!]][
    ; calculate error terms for output
    oDeltas: make1DMatrix nOutput 0.0
    sum: 0.0
    repeat k nOutput [
        either cb/data [
            sum: targets/:k - aOutput/:k
            poke oDeltas k (dsigmoid aOutput/:k) * sum
        ][
            ao: aOutput/:k
            poke oDeltas k ao * (1 - ao) * (targets/:k - ao)
        ]
    ]
    ; calculate error terms for hidden
    hDeltas: make1DMatrix nHidden 0.0
    repeat j nHidden [
        sum: 0.0
        repeat k nOutput [sum: sum + (oDeltas/:k * wOutput/:j/:k)]
        either cb/data [poke hDeltas j (dsigmoid aHidden/:j) * sum][
            poke hDeltas j (aHidden/:j * (1 - aHidden/:j) * sum)
        ]
    ]
    ; update output weights
    repeat j nHidden [
        repeat k nOutput [
            chnge: oDeltas/:k * aHidden/:j
            poke wOutput/:j k (wOutput/:j/:k + (N * chnge) + (M * cOutput/:j/:k))
            poke cOutput/:j k chnge
        ]
    ]
    ; update hidden weights
    repeat i nInput [
        repeat j nHidden [
            chnge: hDeltas/:j * aInput/:i
            poke wInput/:i j (wInput/:i/:j + (N * chnge) + (M * cInput/:i/:j))
            poke cInput/:i j chnge
        ]
    ]
    ; calculate error
    error: 0
    repeat k nOutput [error: error + (learningRate * ((targets/:k - aOutput/:k) ** 2))]
    error
]
trainNetwork: func [patterns [block!] iterations [number!] return: [block!]][
    blk: copy []
    count: 0
    x: 10
    plot: compose [line-width 1 pen red line 0x230 660x230 pen green]
    repeat i iterations [
        ;sbcount/text: form i
        error: 0
        foreach p patterns [
            r: computeMatrices p/1
            error: error + backPropagation p/2 learningRate momentumFactor
            sberr/text: form round/to error 0.001
            if system/platform = 'Windows [do-events/no-wait] ; extra event pump for Windows users
            do-events/no-wait
            append blk error
            count: count + 1
        ]
        ; visualization
        if (mod count step) = 0 [
            y: 230 - (error * 320)
            if x = 10 [append append plot 'line (as-pair x y)]
            append plot (as-pair x y)
            x: x + 1
        ]
        visu/draw: plot
        do-events/no-wait
    ]
    sb/text: copy "Neural Network rendered in: "
    blk
]
testLearning: func [patterns [block!]][
    result2/text: copy ""
    foreach p patterns [
        r: computeMatrices p/1
        append result2/text form to integer! round/half-ceiling first r
        append result2/text newline
    ]
]


changePattern: func [v1 v2 v3 v4][
    change second first pattern v1
    change second second pattern v2
    change second third pattern v3
    change second fourth pattern v4
    result2/text: copy ""
    result1/text: copy ""
    append append result1/text form second first pattern newline
    append append result1/text form second second pattern newline
    append append result1/text form second third pattern newline
    append append result1/text form second fourth pattern newline
]


makeNetwork: func [ni [integer!] nh [integer!] no [integer!] lr [float!] mf [float!]][
    random/seed now/time/precise
    nInput: ni + 1 ; one extra input neuron for the bias
    nHidden: nh
    nOutput: no
    learningRate: lr
    momentumFactor: mf
    createMatrices
    s: copy "Neural Network created: "
    append s form ni
    append s " input neurons "
    append s form nh
    append s " hidden neurons "
    append s form no
    append s " output neuron(s) "
    sb/text: s
    result2/text: copy ""
    sberr/text: copy ""
]

makeTraining: does [
    t1: now/time/precise
    netR: trainNetwork pattern n ; network training
    t2: now/time/precise
    testLearning pattern ; test output values after training
    append sb/text form t2 - t1
]

view win: layout [
    title "Back-Propagation Neural Network"
    text "Pattern"
    dpt: drop-down 70
        data ["XOR" "OR" "NOR" "AND" "NAND"]
        select 1
        on-change [
            switch face/text [
                "XOR" [changePattern 0 1 1 0]
                "AND" [changePattern 0 0 0 1]
                "OR"  [changePattern 0 1 1 1]
                "NOR" [changePattern 1 0 0 0]
                "NAND"[changePattern 1 1 1 0]
            ]
            isCreated: false
        ]
    text "Sample"
    dp2: drop-down 70
        data ["640" "1280" "1920" "2560"]
        select 2
        on-change [n: to integer! face/text step: (n / 640) * 4]
    cb: check "Sigmoid" []
    button "Run Network" [makeNetwork 2 3 1 0.5 0.1 makeTraining]
    text 40 "Error"
    sberr: field 60
    pad 10x0
    button "Quit" [quit]
    return
    visu: base 660x240 black
    result1: area 35x80
    result2: area 35x80
    return
    sb: field 660
    do [changePattern 0 1 1 0]
]