Wednesday, January 18, 2006

New Course: Machine Learning

Last semester we had a newly introduced optional course, Machine Learning, taught by our dear Abu Wasif Sir. It is a specialized branch of AI. Whatever the course name may suggest, we saw no mechanical devices or machines.

It was all about how you can make computer programs learn from experience. Now, learning is considered a divine aspect of human beings. Humans can learn more powerfully than any other creature. It is one of the virtues that has kept humankind ever growing through science, culture and dominance.

Now, if our computer programs can learn, they can be said to possess intelligence to some extent. Such an intelligent agent can be put to use for our purposes, like helping us recognize patterns (e.g., images, faces, handwritten characters), classify heaps of data, or drive our car. The reason we need computer programs to do these tasks for us is not (only) that we are lazy! In most cases, it would take many people many months to do the same thing by hand. Employing a computer program is like employing a dedicated brain (which has nothing else to do) at a cheaper cost.
All these tasks require some degree of intelligence, since we cannot supply a computer with every possible handwritten character to store and match against when recognizing a particular one. Computers should learn to recognize characters nearly the way we do. We do not (need to) see every possible letter written by all the people in the world to recognize, more or less reliably, a letter written by a person whose handwriting is not even familiar to us.

We cannot always do it with certainty, of course. What can you do if someone writes badly enough! But in most other cases, we show a real capability to recognize a letter written in a familiar language. What if it is written in Greek? We say: it is all Greek to me! Why? Because it is a language, or at least a set of letters, we (probably) never learned. For English and Bengali, yes, we can recognize the letters of these languages written by almost anyone.

So, this capability of ours is an outcome of prior learning.

Now, what is learning? It is usually difficult to define very simple concepts. Most of the time, we define complex ideas with the help of simpler ones. Then what concepts would we use to define the simplest ones?

There was a discussion on the definition of learning in our batch forum's Machine Learning sub-forum, run in cooperation with Abu Wasif Sir. Many came up with many ideas, but at last we converged on the view that the standard definition in the CS literature remains indomitable. It can be rephrased informally as: whatever you do, if I can show that you have improved with respect to some performance measure on some task, then I can say that you have learnt. Whatever you were doing then counts as experience, so you were learning from experience. Thus, you do not need to learn something intentionally; you do not even have to be aware that you are learning. Perhaps you are trying to forget something, and still it can be proved that you have learnt, provided a proper performance measure of a proper task can be found with respect to which you have improved. Very interesting! Though, in the case of machines, the task and the performance measure are fixed first, and then we need our machines to improve on them from experience.
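
(The standard definition alluded to above is the one from Tom Mitchell's textbook: a program is said to learn from experience E with respect to a task T and performance measure P if its performance at T, as measured by P, improves with E.) To make that framing concrete, here is a minimal sketch in Python, entirely my own toy example rather than anything from the course: the task is predicting flips of a biased coin, the performance measure is prediction accuracy, and the experience is the flips seen so far. The learner just predicts the majority outcome observed:

    import random

    random.seed(0)
    TRUE_BIAS = 0.7  # assumption: the coin lands heads 70% of the time

    def flip():
        return 1 if random.random() < TRUE_BIAS else 0

    # The learner's hypothesis: predict whichever outcome has been
    # the majority so far (predict heads on a tie).
    heads_seen, flips_seen, correct = 0, 0, 0
    for t in range(1, 1001):
        prediction = 1 if heads_seen * 2 >= flips_seen else 0
        outcome = flip()
        correct += (prediction == outcome)
        heads_seen += outcome
        flips_seen += 1
        if t in (10, 100, 1000):
            print("after %4d flips, accuracy = %.2f" % (t, correct / t))

The printed accuracy settles toward the best achievable 0.70, which, by the definition above, is exactly what it means for this little program to have learnt.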

How can a computer program/machine learn? There are many models that have the capability to learn (according to the definition), such as decision trees, artificial neural networks, Bayesian learners, genetic algorithms, etc. Each needs elaboration to be properly explained. In short, each of them is posed with example instances along with their classes; the algorithm governing the model tries to learn from the examples and gain a generalized view, so that it can classify (with some significant success) a newly posed instance whose class is not yet known. In terms of statistics, it does nothing but develop a general hypothesis by observing samples. Sounds familiar now?
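
Here is a minimal sketch of that pose-examples-then-classify loop, using a 1-nearest-neighbour rule, one of the simplest learners there is (the toy fruit data is invented purely for illustration):

    # Each training instance: (features, class).
    # Features here: (weight in grams, skin smoothness on a 0-10 scale).
    training = [
        ((150, 8), "apple"),
        ((170, 9), "apple"),
        ((130, 3), "orange"),
        ((140, 2), "orange"),
    ]

    def classify(features):
        # The "generalized view" of a nearest-neighbour learner is simply:
        # a new instance belongs to the class of its closest known example.
        def dist(a, b):
            return sum((x - y) ** 2 for x, y in zip(a, b))
        return min(training, key=lambda ex: dist(ex[0], features))[1]

    # A newly posed instance whose class is not yet known:
    print(classify((160, 7)))  # -> apple

The models named above differ in how they represent the hypothesis (a tree, a network of weights, probabilities, a population of candidate rules), but they all follow this same train-then-classify scheme.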

Generalizing to unseen things is not a trivial matter. If I observe that you look for an umbrella before going out on rainy days and do not need it in winter, I can guess that you will soon search for your umbrella before going out if the sky is heavily clouded and about to rain. This guess is a generalization, a hypothesis, built after observing some samples. It was fairly easy because looking for an umbrella depends roughly on one variable, the weather. Now, what if your decision depends on thousands of parameters, e.g., your mood today? It gets even harder if your decision itself is multivariate, e.g., the set of things you will do today. Reportedly, U.S. Army intelligence identified some of the 9/11 terrorists more than a year before the attack, using one of a machine learner's capabilities, automatically searching data for patterns, particularly known as data mining.
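
The umbrella guess can even be written down as a learner. A sketch with made-up observations follows; the majority-vote-per-weather hypothesis is my own simplification of what a one-variable decision rule learner would produce:

    from collections import Counter

    # Observed samples: (weather, whether you took an umbrella)
    observations = [
        ("rainy", True), ("rainy", True), ("rainy", True),
        ("sunny", False), ("sunny", False), ("cloudy", True),
    ]

    # Hypothesis: for each weather value, predict the majority behaviour seen.
    counts = {}
    for weather, umbrella in observations:
        counts.setdefault(weather, Counter())[umbrella] += 1
    hypothesis = {w: c.most_common(1)[0][0] for w, c in counts.items()}

    print(hypothesis["cloudy"])  # -> True: expect the umbrella search

With thousands of variables this little table explodes, which is exactly why the multivariate case is so much harder.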

A very interesting perspective on building hypotheses is that you must hold some prior belief, a bias, in order to generalize from observed examples. If you do not have a bias, you fail to generalize over unseen instances. So, you have to set a proper bias before learning. A bad bias classifies incorrectly; no bias simply cannot classify, correctly or incorrectly. Probably this is why agnostics frequently suffer from indecision (on non-secular matters)!
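
A tiny sketch of why no bias means no classification, in the spirit of the classic futility-of-bias-free-learning argument (the two-bit instance space is invented for illustration): among all hypotheses consistent with the training examples, both possible labels of an unseen instance are equally represented, so an unbiased learner has no ground to prefer either:

    from itertools import product

    inputs = list(product([0, 1], repeat=2))   # the four possible instances
    training = {(0, 0): 0, (1, 1): 1}          # two labelled examples

    # Enumerate every boolean function consistent with the training data.
    consistent = []
    for labels in product([0, 1], repeat=len(inputs)):
        h = dict(zip(inputs, labels))
        if all(h[x] == y for x, y in training.items()):
            consistent.append(h)

    unseen = (0, 1)
    print([h[unseen] for h in consistent])  # -> [0, 0, 1, 1]: a dead tie

A bias (say, "prefer the simplest consistent rule") is what breaks the tie.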

Machine learning is an emerging area of Computer Science, and much work remains to be done. The success rates of the learners can still be improved in most cases, which calls for further research. Computer programs still cannot recognize characters as successfully as we do. We have a long way to go before we make them learn like us, or before we may need to learn like them!
