Eight months ago I posted this thread: Neural networks object: digital brain - a test program for my neural network object that recognizes characters drawn with the mouse.

I've been working on the next version for quite a while, and most of it is finished now. Some things are still missing, but here's a prerelease.

Quick start

Unzip the package somewhere and run digibrain.exe. Initially, the brain is filled with random crap. Push the 'load memory' button on the bottom right, find 'charbrain.brain' in the same dir as digibrain.exe, and open it. Now the brain will be in the state I trained it in. Start drawing characters onto the tablet and you will see the windows below show the results. Above the tablet (where the text 'drawing tablet' is) are two hidden hotspots: clicking the left part of that area deletes the last character in the result list, and the right part adds a space to the list. That way you can write an entire sentence with your mouse :).

Each character needs to be drawn as one continuous line, and in the same way as it was trained.

Training mode

When you click 'training mode' you will see a directory listing of the dir 'train_sets', which contains training sets for the digital brain. Currently you can't add new sets with the tool, only modify them, but you can copy an existing .tdat file so you'll have another file to store training data in. When you click a set and then click one of the characters, you can see the stored drawing of that character. Each set contains drawings for all characters (white are characters without data, blue ones have data, and the green one is the active one). When all characters have a stored image, you can train the brain with that training set using the 'train' button, optionally changing the number of times the whole set is shown to the brain (1, 10 or 100x) - see the sketch below. All training modifies the existing brain data, so you can use many training sets and train the brain with them in turn. If you want a clean brain, use the brainwash button :).
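
In essence, showing a set to the brain that many times boils down to the loop below. This is just a C sketch: the types and train_on_sample() are hypothetical stand-ins, not the tool's internals.

    struct brain;    /* opaque network state (the weights) */
    struct sample;   /* one stored character drawing plus its label */

    /* hypothetical: performs one backpropagation step on one drawing */
    void train_on_sample(struct brain *b, const struct sample *s);

    void train_set(struct brain *b, const struct sample *set, int count, int repeats)
    {
        for (int r = 0; r < repeats; r++)        /* 1, 10 or 100 passes */
            for (int i = 0; i < count; i++)
                train_on_sample(b, &set[i]);     /* adjusts the existing weights */
    }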

_preset 1.tdat and _preset 2.tdat are the sets I used to create the charbrain.brain file; the empty ones are for your own use. The '_numbers ;-).tdat' file doesn't contain characters but the numbers (1-9, 0), stored under the characters a-j; the others are just set to a dummy drawing. The numberbrain.brain file contains the brain trained with this set - just load it and draw numbers.

Note

In training mode you can see how the characters are drawn; in run mode, draw them exactly as in training mode. Even the direction is very important (the U is drawn from left to right, the V from right to left). Some characters look a little weird because they had to be drawn as one continuous line.

I hope it's a bit clear; I wrote the text above in a hurry... The final version will come with some documentation, and the full win32asm source code.

Have fun!

Thomas
Posted on 2002-09-02 14:59:37 by Thomas
Very impressive Thomas! The only letter that gave me trouble was V. Watching the data update in real time is very cool. Your work has such style to it!

I'm curious what the MS Language Bar is using - it requires no training and gets all my letters (print or cursive), and also supports other languages. As with speech recognition, I'm sure there are multiple levels of interpretation going on - letter, word, sentence/grammar, etc.
Posted on 2002-09-02 17:50:57 by bitRAKE
Thomas, very nice. I love the interface - how'd you get those anti-aliased lines? :)

I've done a lot of neural network work myself lately, though nothing as intuitively user-friendly as this example. Anything I've done would bore the pants off people; this type of visual demonstration is a nice idea. Congrats.

I'd be curious as to what net you're using: is it backpropagation, or one of the many others? Also, what sort of layout are you using, i.e. number of layers and number of neurons per layer - that's assuming backprop again, though. If you don't mind, that is. :)
Posted on 2002-09-02 17:54:10 by Eóin
Wow... verrrrrrry good ...

I can't wait for the release :) (sources)

I've wanted to do such a thing all my life but never had the time :)
Posted on 2002-09-02 20:04:59 by BogdanOntanu
:(

Won't run on my K6-2... causes an invalid instruction (looks like CMOVcc)

--Chorus
Posted on 2002-09-02 20:53:52 by chorus
> Very impressive Thomas! The only letter that gave me trouble was V. Watching the data update in real time is very cool. Your work has such style to it!

Did you draw the V from right to left?

> I'm curious what the MS Language Bar is using - it requires no training and gets all my letters (print or cursive), and also supports other languages. As with speech recognition, I'm sure there are multiple levels of interpretation going on - letter, word, sentence/grammar, etc.

Well, with the proper training you can allow multiple variants of the same character. You could try different training sets with different styles.

> Thomas, very nice. I love the interface - how'd you get those anti-aliased lines?

It's a kind of full-scene anti-aliasing: everything is drawn at double size and then scaled down by taking the average of every 4 pixels (using this code snippet). It also uses the bitmap block blend snippet and bitRAKE's NonUniformAverages function.
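
For those curious, the downscaling step works roughly like the C sketch below. It's just the idea (8-bit grayscale for brevity), not the actual snippet.

    /* Shrink a double-size image to half size by averaging each
       2x2 block of source pixels into one destination pixel. */
    void downsample2x(const unsigned char *src, int srcW, int srcH,
                      unsigned char *dst)
    {
        int dstW = srcW / 2, dstH = srcH / 2;
        for (int y = 0; y < dstH; y++) {
            for (int x = 0; x < dstW; x++) {
                const unsigned char *p = src + (2 * y) * srcW + 2 * x;
                int sum = p[0] + p[1] + p[srcW] + p[srcW + 1];
                dst[y * dstW + x] = (unsigned char)(sum / 4);  /* average of 4 pixels */
            }
        }
    }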

> I'd be curious as to what net you're using: is it backpropagation, or one of the many others? Also, what sort of layout are you using, i.e. number of layers and number of neurons per layer - that's assuming backprop again, though. If you don't mind, that is.

It's a multilayer feedforward network with backpropagation. The network has 24 inputs: when a drawing is finished, blended blocks are drawn over the sin/cos graph, dividing it into 12 average sine and 12 average cosine values of the drawing angle. These values are fed directly to the network inputs.
Furthermore it has 26 outputs (one per character) and one hidden layer of 20 neurons.
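
To make that concrete, here is the whole pipeline as a C sketch. The 24-20-26 layout matches what I described; the function names, the sigmoid activation and the exact bucketing are assumptions for illustration, not the actual asm code.

    #include <math.h>

    #define NBUCKETS 12
    #define NIN  (2 * NBUCKETS)  /* 12 average sines + 12 average cosines */
    #define NHID 20
    #define NOUT 26              /* one output per letter */

    /* Split the stroke's segment angles into 12 buckets and average
       sin/cos per bucket; these 24 values are the network inputs. */
    void make_inputs(const double *angles, int n, double in[NIN])
    {
        int per = n / NBUCKETS;  /* assumes n is a multiple of 12 for brevity */
        for (int b = 0; b < NBUCKETS; b++) {
            double s = 0.0, c = 0.0;
            for (int k = 0; k < per; k++) {
                s += sin(angles[b * per + k]);
                c += cos(angles[b * per + k]);
            }
            in[b] = s / per;
            in[NBUCKETS + b] = c / per;
        }
    }

    static double sigmoid(double x) { return 1.0 / (1.0 + exp(-x)); }

    /* One forward pass through the 24-20-26 net; the letter with
       the highest output value wins. */
    void feedforward(const double in[NIN],
                     const double w1[NHID][NIN], const double b1[NHID],
                     const double w2[NOUT][NHID], const double b2[NOUT],
                     double out[NOUT])
    {
        double hid[NHID];
        for (int h = 0; h < NHID; h++) {
            double sum = b1[h];
            for (int i = 0; i < NIN; i++) sum += w1[h][i] * in[i];
            hid[h] = sigmoid(sum);
        }
        for (int o = 0; o < NOUT; o++) {
            double sum = b2[o];
            for (int h = 0; h < NHID; h++) sum += w2[o][h] * hid[h];
            out[o] = sigmoid(sum);
        }
    }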

> Won't run on my K6-2... causes an invalid instruction (looks like CMOVcc)

It's because of the PNG library used for the images.. it has one CMOV instruction in the CRC procedure :). IIRC the non-safe build of the library doesn't use the CRC proc, so I'll use that one in the next build.

Thomas
Posted on 2002-09-03 09:23:36 by Thomas
Here's the exe without cmov (you still need the other zip)

Thomas
Posted on 2002-09-03 09:28:19 by Thomas
That's actually a tiny net :eek: . I'd have figured such a complicated thing as letter recognition would require a much larger net.

I know personally I'd have used nothing less than 100 neurons in the hidden layer, but then I do tend to go for overkill ;) . As it stands though, I still can't help but feel you could do with at least 24; I've always felt the hidden neurons should never be fewer than the inputs. But I'm sure you've tested many values, and if twenty works then using any more would be needlessly bloated and slow.

I'm very impressed though, I look forward to seeing the net source code.
Posted on 2002-09-03 12:16:51 by Eóin
> I know personally I'd have used nothing less than 100 neurons in the hidden layer, but then I do tend to go for overkill.

More neurons require more training, so I just tried out what value worked reasonably well.

> As it stands though, I still can't help but feel you could do with at least 24; I've always felt the hidden neurons should never be fewer than the inputs.

I don't know if the hidden layer should always have more neurons than the input/output layers; it would seem so. But remember that between each pair of layers, every possible connection is made. That means that between the input and the hidden layer, 24x20 = 480 connections are made, and between the hidden layer and the output, 20x26 = 520 connections. Each connection has its own weight, so there's more information stored than it seems.

Now that I think of it: theoretically, if the hidden layer's output values were of infinite precision, anything could be encoded with it. Floating-point values don't have infinite precision, but enough to define an encoding for only 26 characters (0.01 = A, 0.02 = B, 0.03 = C, for example).
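
Decoding such a scheme would be trivial - purely illustrative, of course:

    #include <math.h>

    /* Map a single value h in (0, 0.26] back to a letter under the
       0.01 = A, 0.02 = B, ... encoding from the example above. */
    char decode(double h)
    {
        int idx = (int)lround(h * 100.0);   /* 1..26 */
        if (idx < 1)  idx = 1;
        if (idx > 26) idx = 26;
        return (char)('a' + idx - 1);
    }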

I'll try using one neuron in the hidden layer and see what happens :grin:.

> I'm very impressed though, I look forward to seeing the net source code.

Well, the NN source itself was included in the previous post (version 1); the digibrain app itself will be available very soon.

Thomas
Posted on 2002-09-03 13:09:46 by Thomas
One neuron in the hidden layer doesn't work at all :).. However, I tried 10 and it seems to give better results than 20 neurons :eek: :confused:.. With 10 neurons, even characters drawn less than perfectly are recognized...

Here's the exe with 10 neurons.. the brain files aren't compatible with this version, so you'll have to train it yourself.

Thomas
Posted on 2002-09-03 13:18:08 by Thomas
I'd say 0.01, 0.02, etc. are just too precise; you'd need a lot of training to get precision to that level. It is interesting though that 10 works better than 20.
Posted on 2002-09-04 12:52:22 by Eóin
The complete package including the latest binary, full source code and documentation is now available at my website!

www.MadWizard.org

Thomas
Posted on 2002-09-06 14:45:33 by Thomas
Fantastic work Thomas!

I wonder how hard it would be to also use the time between each segment as another dimension in the network's analysis, to increase accuracy. When I drew letters in my own style rather than exactly as trained, the results were often confused, and justifiably so.

By giving the network 12 average rates to go with the 12 sine and cosine values, it may be able to give more accurate responses. The downside is that you would need to train for each person's mousing speed while drawing ;)
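
Something like the sketch below is what I have in mind - hypothetical C code, the names are made up. Each of the 12 buckets would get an average drawing speed next to its sine/cosine averages.

    #include <math.h>

    #define NBUCKETS 12

    /* sampled mouse positions with millisecond timestamps */
    struct point { double x, y, t_ms; };

    /* Average segment speed (pixels/ms) per bucket; assumes the
       n - 1 segments divide evenly over the 12 buckets. */
    void average_speeds(const struct point *pts, int n, double speed[NBUCKETS])
    {
        int per = (n - 1) / NBUCKETS;
        for (int b = 0; b < NBUCKETS; b++) {
            double sum = 0.0;
            for (int s = 0; s < per; s++) {
                const struct point *a = &pts[b * per + s], *c = a + 1;
                double dist = hypot(c->x - a->x, c->y - a->y);
                double dt = c->t_ms - a->t_ms;
                sum += dist / (dt > 0.0 ? dt : 1.0);
            }
            speed[b] = sum / per;
        }
    }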


But beyond my babbling: you've done very good work! Very clean code and quite readable. Keep up the great work!

:alright:
:NaN:
Posted on 2003-07-03 22:09:46 by NaN
Hi Thomas,

Can you explain in more detail which algorithms you have used (backpropagation networks, recurrent networks), how you process the given inputs, and how many layers, hidden neurons and input neurons there are?
I ask because I have written some recognition NNs like yours, based on recurrent neural networks. With RCCs you don't define the number of hidden layers or neurons per layer; the RCC trains and creates these layers itself. Additionally, my RCCs choose the best-working network and activation functions by genetic algorithms.

Most important for me is your preprocessing of the inputs, but your source is hard for me to read.

By the way: how can I define my own set to train with your software? I always get an exception when I try to define my own training set of chars.

Best regards, Hagen
Posted on 2003-07-06 08:20:23 by Hagen
Thomas, is there any way you can document the NNet.asm file a bit more? I'm most interested in how you use the memory arrays, i.e. a road map, so I can follow your source better. I think I'm getting it, but I'm not 100% sure. A bit more info on how your training works on these arrays would be nice as well. (Neither is imperative; I will figure both out eventually, but if you can help me along it would be nice ;))

Again, great work.
:NaN:
Posted on 2003-07-07 14:51:42 by NaN
Hi Nan,

If you can read Delphi I can send you my NN source.
For such complex algorithms, a high-level language is easier to understand.
It contains normal backpropagation networks, with QuickProp as the backpropagation algorithm.
The second network type I implemented is the recurrent neural network. These RCCs construct the layers and neurons they need themselves during the training phase.
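
For reference, the core of the QuickProp weight update looks roughly like the C sketch below (after Fahlman's description, not my actual Delphi code). Each weight remembers its previous gradient and step; the new step follows the secant through the two gradients, clamped by a growth factor.

    #include <math.h>

    #define MU 1.75   /* Fahlman's suggested maximum growth factor */

    double quickprop_step(double grad, double prev_grad,
                          double prev_step, double lr)
    {
        if (prev_step == 0.0)            /* first pass: plain gradient descent */
            return -lr * grad;

        double denom = prev_grad - grad;
        double step;
        if (fabs(denom) < 1e-12)         /* gradients (nearly) equal: max step */
            step = MU * prev_step;
        else
            step = prev_step * grad / denom;

        /* never grow the step by more than MU per iteration */
        if (fabs(step) > MU * fabs(prev_step))
            step = (step > 0.0 ? 1.0 : -1.0) * MU * fabs(prev_step);
        return step;
    }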

Hagen
Posted on 2003-07-07 19:08:05 by Hagen
Sure, zip it up and post it. It would be a good external reference (from MASM ;)).

I have *some* knowledge of NNs. I studied them four years ago in college, but I admit I don't remember much about the specifics of the hidden layers and how they are configured for different types of networks... Abstractly I can follow no problem, but the source is less abstract and more organized (which is slowing me down a bit ;))

:NaN:
Posted on 2003-07-07 23:50:28 by NaN
Sorry, I didn't see and couldn't reply to this thread earlier because I was on vacation..
The main NN code is pretty hard to read, I admit; there's not much documentation in it. I haven't updated it since this version of the digital brain program (it was just a fun project for me, and I've never really needed NNs in other projects yet). I'll see if I can get some time to document the program; I'll have to take a good look myself as well, as it's a while ago that I wrote it.

There is some info in the earlier thread about the program:
http://www.asmcommunity.net/board/index.php?topic=2932

And here as well:
http://www.asmcommunity.net/board/index.php?topic=9855

Thomas
Posted on 2003-07-20 03:25:33 by Thomas
> I'm curious what the MS Language Bar is using - it requires no training and gets all my letters (print or cursive), and also supports other languages. As with speech recognition, I'm sure there are multiple levels of interpretation going on - letter, word, sentence/grammar, etc.

Hey, if you have any links to speech recognition software, it would be much appreciated if you could PM them to me!
Posted on 2003-07-20 15:36:02 by Qages
Very impressive!
Posted on 2003-07-20 19:46:40 by comrade