Training an artificial neural network

A few days ago I handed in the second assignment on my Artificial Neural Networks (ANN) module.

The task was to train an ANN to distinguish attacks from normal access in the logs of a simulated military network. The data came from KDD Cup 99; for the assignment we used a reduced 10% subset, which was still almost half a million records.

We had to use Matlab, which we’ve been using throughout the term, but the assignment was the first time I’d had to do anything which resembled real programming with it. Matlab is a funny beast, particularly on Mac OS.

The Mac OS is the only target platform where Mathworks can’t ship their own JVM, and also the only platform on which the version of Matlab I have access to runs in 64-bit mode. 64-bit mode for a JVM environment is interesting: all it really means (so far as I can tell) is that Matlab has a larger address space.

This meant I could miss a parameter off a function call and accidentally tell Matlab to create a matrix of half a million columns by half a million rows, and it would swap till the cows came home, or until I ran out of disk space, trying to do as I asked. A similar issue was a ‘fast but memory intensive’ algorithm slowing the entire system to a crawl. Things became less painful once a fellow student alerted me to a parameter that tells Matlab to reduce memory consumption. Suffice to say that algorithm isn’t so fast when it’s using a slow piece of spinning rust as memory.
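To put a rough number on that accidental matrix, here is a back-of-the-envelope sketch in Python (not Matlab, purely for illustration). It assumes 8 bytes per element, which is the size of Matlab's default double type, and rounds "almost half a million" up to 500,000. The function name `dense_matrix_bytes` is my own, not anything from Matlab.

```python
def dense_matrix_bytes(rows: int, cols: int, bytes_per_element: int = 8) -> int:
    """Memory needed to hold a dense rows x cols matrix of fixed-size elements."""
    return rows * cols * bytes_per_element


# Assumption: ~500,000 records, so the accidental matrix is 500,000 x 500,000.
n = 500_000
total = dense_matrix_bytes(n, n)

# Report the size in decimal terabytes (1 TB = 10^12 bytes).
print(f"{total / 1e12:.1f} TB")  # prints "2.0 TB"
```

Against 4GB of RAM, a request on that scale has nowhere to go but swap.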

In spite of Matlab, I really, really enjoyed the assignment. I had a great time tweaking the network parameters and experimenting to see which things made my network perform better or worse. I enjoyed writing my report on the experiment itself and on what I would have liked to do had I had more time or a faster machine.

I found two things about the process frustrating:

  1. The assignment was at least partially bounded by the speed of my study computer, a poor white MacBook with 4GB RAM and a 1st gen Intel Core CPU. It ran hot for about 5 days solid generating my final set of results.
  2. Whilst performing the experiments I kept learning about new things I could try, but which would invalidate earlier experimentation. My takeaway here is that I need to do a better job in future of designing my experiments rather than just diving in.

Despite all that I’m quite happy with the results. I think the report is one of the better documents I’ve written, if a little hastily finished, and the network itself performs admirably: ~65% accuracy on my test data, rising to ~98% when I do some minor post-processing of the outputs.

So, what’s next? I have to give a 15min presentation about the assignment followed by a 15min Q&A session. After that it’s two weeks of lectures left before the end of term.