- Ello is basically dead in its tracks (it was supposed to be a free Facebook without ads or tracking). Damn, and I just got my invite.
- Antuit (a Big Data firm) just got a bazillion dollars (yes, I'm sure that is a real quantity of money)
- Call for papers for Pattern Recognition in Neuroimaging
- How research projects are like startups
Showing posts with label Machine Learning. Show all posts
Friday, January 23, 2015
Links on Data Science, Research and ML for the day (January 22, 2015)
These are some interesting links for your day:
Thursday, May 23, 2013
Stationarity in ML Data (An EEG application)
A stationary process in statistics is one whose distribution does not change over time or space. In layman's terms, this means that if we measure the data today and test it tomorrow, the underlying distribution should remain more or less the same.
This is a fundamental concept in Machine Learning. Stationarity in data allows us to train models today and expect them to work on data we gather in the future. Sure, models get retrained constantly, but most of them assume that all the data comes from a stationary process. There is no point in modeling your data with some parameters today (mu and sigma if it is Gaussian) if you expect tomorrow's parameters to be wildly different. However, much of the data out there is non-stationary.
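A quick way to build intuition for this is to compare summary statistics across time windows; if the distribution is stationary, the first and second halves of a series should look alike. Here is a minimal sketch (the two toy series and the half-split check are my own illustration, not from any paper):

```python
import math

def half_means(series):
    """Compare the mean of the first and second half of a series."""
    mid = len(series) // 2
    first = sum(series[:mid]) / mid
    second = sum(series[mid:]) / (len(series) - mid)
    return first, second

# A (roughly) stationary series: bounded oscillation around 0.
stationary = [math.sin(0.1 * i) for i in range(1000)]

# A non-stationary series: the mean drifts upward over time.
drifting = [0.01 * i for i in range(1000)]

s1, s2 = half_means(stationary)
d1, d2 = half_means(drifting)

print(abs(s1 - s2))  # small: both halves look like the same distribution
print(abs(d1 - d2))  # large: the mean has shifted between halves
```

A model trained on the first half of the drifting series would be badly calibrated on the second half, which is exactly the problem described above.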
Think about training a robot to follow a path based on images taken in spring, and then trying to have the robot follow the same path in winter. We ran into this issue a lot when working on the Tsukuba Challenge Project: they allowed you to collect data in the summer, but the competition was in fall, when the trees had no leaves left.
Interesting problems like these arise in many other areas, like computer vision, where we would like to think that the objects we train on are not rotated or transformed, when in reality they are. For a more extensive review of CV approaches to this, you can check LeCun's Convolutional Neural Networks.
Some of the most interesting problems in Neuroscience are also non-stationary (we like to think they are stationary, but they aren't). The EEG readings we take today are affected by many environmental and subject conditions. Not to mention that reading EEG from the human scalp is not an exact science. The whole process tends to be messy, time-consuming and difficult to replicate.
One cool approach to dealing with this is to transform the data in such a way that you obtain invariant features. For example, in a CV system, instead of measuring the size and color of a circle you could measure its radius and circumference, and if those parameters satisfy the circle's circumference equation, you can be confident it is a circle. Convolutional NNs do something of the sort with input images. You could also do extensive training, which means training with every possible transformation of the data.
In EEG we try to use frequency analysis, since it tends to be more reliable than the raw time series (in theory). A recent paper in the MIT Press journal Neural Computation has a great introduction to this topic, and to how non-stationarity is attacked using things like stationary feature extraction and adaptive model algorithms.
Stationary feature extraction is the jargon for what I described before with CV: many people use techniques like Common Spatial Patterns that tend to remove much of the EEG's non-stationarity and leave us with nice stationary features to use with our favorite ML algorithm.
Adaptive model algorithms are those that update the algorithm's parameters across subsequent recording sessions. As users get accustomed to having their EEG readings plotted in front of them, they also tend to learn how to control them better, and adaptive algorithms are used to address this non-stationarity. Think of it as a videogame that learns your behavior as you get better at playing it, and can react better to your inputs.
The approach in the aforementioned paper is interesting in that it uses a really simple statistical concept, the Kullback-Leibler (KL) divergence, which is a fancy term for a measure of how different two probability distributions are.
They assume that if you have a training session done on day 1, and a small sample of data from day 2, you can use the KL divergence to measure how different the day 1 and day 2 distributions are, and create a linear transformation such that the information from day 1 is relevant to the data you will obtain on day 2.
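For univariate Gaussians the KL divergence has a well-known closed form, which makes the day-1 vs. day-2 comparison cheap to compute. The means and standard deviations below are invented numbers for illustration, not values from the paper:

```python
import math

def kl_gaussian(mu1, sigma1, mu2, sigma2):
    """Closed-form KL divergence KL(p || q) between two univariate Gaussians."""
    return (math.log(sigma2 / sigma1)
            + (sigma1 ** 2 + (mu1 - mu2) ** 2) / (2 * sigma2 ** 2)
            - 0.5)

# Hypothetical feature distributions: training session vs. a small day-2 sample.
day1 = (0.0, 1.0)   # (mean, std) estimated on day 1
day2 = (0.8, 1.3)   # (mean, std) estimated on day 2

print(kl_gaussian(*day1, *day1))  # -> 0.0, identical distributions
print(kl_gaussian(*day1, *day2))  # positive: the sessions have drifted
```

The larger this divergence, the more aggressive the correcting transformation between sessions needs to be.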
The rest of the paper goes on to show how to obtain these transformation matrices using different flavors of this approach: one where the labels for day 2 are available, and one where they aren't.
The method looks eerily similar to what you would do in a state-space model, where you approximate (predict) the values of the next state given the previous state's parameters, and then do an update once you can read the data for the next state.
The paper goes the safe route and assumes both distributions are Gaussian. I can think of a nice extension where you start from the Gaussian approximation but allow different probability distributions to model the shape of the data; given how simple the Gaussian case is, that should not be so hard.
The paper is nice, but still in preprint, so you will need a university account to access it, and maybe a couple of tricks from your university's library.
Monday, February 27, 2012
An introduction to linear regression - Cost Function (ML for the Layman)
I've tried to keep away from posting tutorials on ML topics, mainly because I did not feel well prepared to do it yet. I hardly think I'm prepared now, but I can definitely give it a better shot, and with your feedback I can, at least, get an idea of where I can improve.
Disclaimer: even though this introduction will be basic, and I mean as basic as it can get, you'll still need some knowledge of matrix operations, basic algebra and calculus to get through some of the explanations.
Imagine you want to sell your car; let's say it is a 2007 Prius with 20,000 miles. It is in very good condition, and you would like to survey how much your car costs in the market. As a seller, you definitely want to get the best price for it.
So, how do you price a car? If you know nothing about cars (like me), you can go to someone who has a better idea; in our case, that is the internet.
We can see that the price is set by things like the age, the maker, the mileage, the overall condition, etc. We will call these things "features". So a set of features is basically the characteristics of our car, or of any object.
So which features do we choose? For the sake of simplicity, let's choose year and mileage. It's important to note that there are entire research areas around how to choose features, but we are beginners here, so we will not worry about that. It is important to know, however, that having many features is not always better than having few features.
Now that you have chosen a set of features, how do you compare one car to another? We go and look for data. For car comparisons, we can go to different websites where you can see different combinations of these features for different cars: your local newspaper's classifieds or Craigslist.
Let's create a mock data set of 5 cars using only 2 features, age and mileage. All of them are Prius:
$year=(2007,2005,2006,2007,2010)$
$mileage=(50000, 60000,54000,40000,20000)$
$price=(12000,9700,10500,13000,20000)$
It is intuitive to think that the price of our car will be directly related to age and mileage: low mileage and a recent year increase the price, while an old car with high mileage has a lower price. So there should be an abstract way to write this relation.
To model this kind of data, we use linear regression, which states that a variable is the result of a linear combination of other variables. That is, our price actually obeys some combination of mileage and year.
For the first element (12,000 USD for a 2007 model with 50,000 miles):
$12000=a_1*(50000)+a_2*(2007)$
A second characteristic is that every element has to share those $a_1$ and $a_2$ variables, so we can use those values for our own Prius; there would be no point in finding individual values for each car when we want the best price for ours. So we can write a general equation:
$Price=a_1*mileage+a_2*year$
How do we find the $a_1$ and $a_2$ that solve this model? How do I know that "1" and "1" are not good choices? Well, for starters, summing up the year and the mileage does not seem like a good idea.
We need some function that'll tell us how bad or how good our prediction is. This is usually called the "cost" function. How do we build a cost function? Well, our intuition tells us that we have to compare (subtract) the truth with our guess. So for our guess of "1" and "1" for $a_1$ and $a_2$:
$Cost=(12000-(1*50000+1*2007))$
We can see there is something off here: since we are looking for the minimum cost, we could make the values in $a$ large enough to get incredibly low (negative) values, so we square the difference to have a nice function with a lower bound (it does not go lower than a certain value).
$Cost^2=(12000-(1*50000+1*2007))^2$
We now have an intuition that the best we can do is $cost=0$, since a squared number can never be negative, we cannot do any better than that.
This cost, however, is only the cost for one example; we need the cost for all our cars. So we sum all of them:
$\mathrm{Total\ cost}=\sum_{i=1}^{5}\left(Price_i-(1\times Mileage_i+1\times Year_i)\right)^2$
We now have the intuition that our choice of 1 and 1 may not be a very good choice at all. Just for kicks, let's see how much the cost would be for different values of $a_1$ and $a_2$:
a1,a2 | Cost
0,0   | 917,340,000
0,1   | 675,707,259
1,0   | 6,595,340,000
1,1   | 7,252,615,259
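You can reproduce these numbers yourself with a few lines of code; this is just the summed-squared-error formula above applied to our mock data set:

```python
mileage = [50000, 60000, 54000, 40000, 20000]
year = [2007, 2005, 2006, 2007, 2010]
price = [12000, 9700, 10500, 13000, 20000]

def total_cost(a1, a2):
    """Sum of squared errors for the guess: price ~ a1*mileage + a2*year."""
    return sum((p - (a1 * m + a2 * y)) ** 2
               for p, m, y in zip(price, mileage, year))

# The four candidate guesses from the table above.
for a1, a2 in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a1, a2, total_cost(a1, a2))
```

Trying candidates by hand like this is exactly what optimization will automate for us next time.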
The costs are terrible! There is, however, something happening at "0,1", since the cost decreased compared to "0,0". But there has to be something better than just random guessing, right?
Our guess of 1,1 is just terrible. But we gained an intuition: we see that our solution has to be near "0,1". Also, some of you may notice by now that no matter which values we put in, there is no way we are going to get a zero. It's just not possible, not without some extra help, which we will call an offset.
Next time, we'll talk about optimization (or how to stop you from trying every possible combination by hand), how to use the offset, and a bit about comparing pears and apples.
See you later
Remember to visit my webpage www.leonpalafox.com. And if you want to keep up with my most recent research, you can tweet me at @leonpalafox.
Also, check my Google+ account, be sure to contact me, and send me a message so I follow you as well.
Friday, September 16, 2011
Every time you write a bad paper an angel cries
"Writing is an art, every word has a specific meaning, and each expressions should hold an objective."
Scientific writing, unlike any other kind of writing, has a very specific goal: to report your findings in a clear and concise way. Your opinion, while important, has to leave most of the space for the cold, hard facts. I read once that in writing, academics are like civil engineers: their work is to make sturdy structures with a solid foundation, leaving the niceties and decorations to the architects.
But how do you know what to write? Usually a good paper's results can be replicated by an informed reader (in the NIPS reviewers' poll, that is one of the conditions). Thus, your paper needs enough information that a fellow researcher does not need to ask you any questions.
Sounds easy, right? Truth is, it isn't.
Sadly, most people think that because they can speak, they can write. A large number of people think that writing the way they speak is the way to write a research paper (or any online publication). However, clarity is a luxury we usually forget when we speak. For example, while saying there were "a lot" of people in the market is a fine expression when telling a story, saying in a research paper that you had "a lot" of data does not represent the amount, and is thus ambiguous.
A research paper, unlike speech, allows us to rewrite our sentences several times. Remember that it is impossible to ask for clarification when reading an article (unless you send an email), so you have to write as clearly as possible. All those times you've had to clarify what you meant by a sentence when speaking are dead sentences in writing. My rule of thumb is that a reader has to understand the paper without me sitting next to them.
But how do we learn to write well? If you are lucky, your advisor is still very prolific, and a lifetime of reading and writing papers has given them at least a good idea of how to structure a paper. With time, you too will get a handle on the most common rules of writing a research paper. Something I tend to do is check for verb consistency: is every noun's action described by a verb? If you have free-floating nouns in your paper, you will have confusion. If you're not sure, keep yourself from writing complex statements, and instead stick to the basics.
There are other tools you can use to improve your writing. "The Elements of Style" by Strunk and White and "On Writing Well" by Zinsser are two great books which offer solid, basic advice on how to structure a good piece. While the books are mostly oriented towards people writing real literature (for me, academic journals are more like technical reviews), they do help you structure your sentences so they become less ambiguous (clarity, clarity, clarity).
My advice is to practice: you can try writing a blog, writing papers for small conferences, or even mock papers (there is no rule against them), and then look for your mistakes and for things you could write in a better way. Try asking people to read your documents and see how much sense they make to them. Really, I think you have to keep practicing over and over, until you get a firm grasp on paper writing.
Another really important thing is to review. If your paper has typos, it speaks ill of you and your research; it shows you as a sloppy author and thus a sloppy researcher. Take your time to reread your papers. I usually spend around three days writing a blog post: first I write it in what we could barely call English, then I reread the entry and try to give it sense by taking out sentences and useless words and adding clarity. Then I use tools like Microsoft's spell checker and style checker to look for odd sentences and passive voice (excessive passive voice in a paper adds confusion, and it is usually better to write in an active form).
It is fair to say I spend two to three times longer reviewing than writing the piece, but I know it helps me write better the next time. So I don't look at it as a burden, but rather as practice.
If you have any other advice on writing, let the comment section hear it.
Remember to visit my webpage www.leonpalafox.com. And if you want to keep up with my most recent research, you can tweet me at @leonpalafox.
Also, check my Google+ account, be sure to contact me, and send me a message so I follow you as well.
Tuesday, September 6, 2011
Starting to write
What is the best way to start a book? Many people will tell you that the best way is to gather experience and then write your book. Others will tell you to write as you gather experience. Truth is, I do not know yet who is right. While it is true that a lot of great books are written after the experience has been gained, it is also true that a lot of great books are written on the go.

So how does this translate to papers in Machine Learning?

For most graduate programs, you'll need to write conference and journal papers. Some universities, though, are not really picky about which journals or conferences, as long as it's published. Others do care about which conferences you go to. Not to mention that your professor probably already has a set of conferences where he is a regular, and will ask you to write papers for those.

So, back to the basic question: do you write as you go, or do you write once you have finished every experiment, theory and survey (at the end of your PhD)? Some people will tell you the former is better, while others will tell you that it is good to have a ton of papers, since most committees will rarely look at the papers themselves, but just at the sheer number of publications.

I think a fair balance is the best policy: you do have to wait until you have good results to publish, but you also have to publish enough to get to different conferences and get feedback from experts in the area you are working in, or at least do some good networking. Be careful to remember that really good journals or conferences won't accept papers on half-done research or quickly baked results.

Remember, though, that most journals will take their sweet time to accept your paper, or even reject it. So, if you wait until the end to submit, you'll definitely have trouble getting your paper accepted by the end of your PhD. (Some IEEE journals take around 7 to 8 months to give you any feedback.)

Lastly and most important: when in Rome, do as the Romans do. Regardless of your opinion, the truth is you are a simple student. Thus, if you prefer to wait and your committee wants you to publish, you'll either publish or go. On the other hand, if the university wants you to go to specific conferences (acceptance rates around 20 or 30%), you will have to wait, even if you badly want to publish any incremental gain you've had.
Remember to visit my webpage www.leonpalafox.com. And if you want to keep up with my most recent research, you can tweet me at @leonpalafox.
Also, I've recently started using my Google+ account more and more; be sure to contact me, and send me a message so I follow you as well.
Tuesday, August 30, 2011
Delving in Theory!
Disclaimer: theory is not my forte, since I did not do a math undergrad. This post may be biased by what I think theory means in the specific area of machine learning.
Do you like that iPod you have? Do your smartphone apps keep you busy and entertained? How much do you use a computer for your daily work? Do you know whom we owe the computer and all these electronic devices to?
Quantum Mechanics (QM)! Materials engineers have to take several courses on QM to create transistors, which electrical engineers then use to create computers (roughly speaking), and computer scientists use those computers to create algorithms and spawn companies like Google, Facebook or Foursquare, creating worldwide connectivity and interaction. All of this because of QM.
As you can see in the example, a pure theory like QM spawned an enormous amount of wealth and applications. And we can say the same about machine learning. Let's take an example: Support Vector Machines (SVMs).
There are two ways to do research with SVMs: one is to do research on SVMs, and the other is to do research using SVMs. In the latter case, you will probably be an application man. But in the former, you will take care of the deepest construction of the SVM. You'll care not only about how it works, but also about how to improve it: which kind of kernel you have to use and, in the extreme, how to design an entirely new kernel to work with the data you have.
Kernel? What's a kernel? It's safe to assume that some people who use SVMs have no idea what a kernel is, or what it means. But they do know SVMs are pretty good at classification. SVMs are one of the (few) algorithms that you can take off the shelf and run over your data without pre-processing or knowledge of the algorithm... it is a magic box that classifies.
But if you knew what a kernel was, you'd have incredible tools at your disposal. Knowing that, you could choose the best one (I said choose, not guess) for the data you are using. You could also go as far as trying to increase the classification capabilities of the SVM, thus obtaining incredibly good results. The catch... you'll probably spend more time modifying the SVM than analyzing your data. But any good journal or conference on learning theory will care little about the test data and a lot about the algorithm. (Note to self: it is no use mentioning these qualities of your algorithm at an application conference; most of them will not care about the algorithm, only the results.)
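To make the idea concrete: a kernel is just a similarity function between data points, and swapping it changes what the SVM "sees" of your data. Here is a minimal pure-Python sketch (the toy points and the RBF bandwidth gamma are made up for illustration, not taken from any real application):

```python
import math

def linear_kernel(x, y):
    # Plain inner product: similarity grows with alignment and magnitude.
    return sum(a * b for a, b in zip(x, y))

def rbf_kernel(x, y, gamma=0.5):
    # Gaussian (RBF) similarity: 1 for identical points, decays with distance.
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)

points = [(0.0, 0.0), (1.0, 1.0), (3.0, 3.0)]

# The Gram (kernel) matrix is all the SVM ever looks at:
# change the kernel and you change the geometry of the problem.
gram = [[rbf_kernel(p, q) for q in points] for p in points]
for row in gram:
    print(["%.3f" % v for v in row])
```

With the RBF kernel, nearby points look similar and far-away points look almost unrelated; with the linear kernel, similarity keeps growing with magnitude. Choosing between them (rather than guessing) is exactly the kind of knowledge the kernel-aware researcher has.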
As you can see, when you do research ON an ML topic, you do not care deeply about the application at hand. Rather, you care about the algorithm's performance, or about the best way to represent specific data: images, characters, sounds, etc.
Doing research in ML theory, as I mentioned in the last post, is deep and hard work. You will rarely delve into a specific application, but you'll know tons of different theories. It's also likely that you'll spend most of your time reading books rather than writing actual programs or simulations. At best, it will be 50/50.
Note: In a previous post, someone asked me for references for the claim: "While they (theory men) go over 200-year-old proofs, you'll (application man) go over 10-year-old proofs, and while their proofs are 20 pages long, yours will be about 1 page (on average)."
I really do not have time to go over different applications, their theories and proofs. But bear in mind that while most of the proofs in machine learning papers fit in a 20-page paper, Wiles's proof of Fermat's Last Theorem, announced in 1993, ran over 100 pages. If you read Bishop's book on machine learning, you'll find most of the basic principles have "easy" proofs of about half a page.
Next time, we'll go over the dreadful fact of writing papers. When to do it, and why to do it.
Until next time
Monday, August 15, 2011
The Application Man!!!
Ask any company how much they would pay for software that would allow them to predict more reliably how their products are going to sell, or how far their competitors can push prices down before losing all profit. Let me tell you... a lot.
I can't emphasize enough how important it is for you to learn the math behind the applications. Even if you do not know exactly how an algorithm works, it is good to at least have an idea of what it is doing. This way, if you run into errors, you can look for solutions in the right places instead of changing variables and praying for something good to happen. You also get bragging rights that you know more math than your undergrad peers working as software engineers.
Applications are definitely the sweet spot of machine learning: it is where most of the money is made, and it is what most people will relate to when they hear you do something AI-related. Hardly anyone has heard of PageRank, yet everyone knows Google. The same happens with pretty much every machine learning application out there, or any other theory; for example, quantum mechanics is the basis for most of the modern electronics we have.
So, if you don't really care how the algorithms work, and just care about applying them for the fame and glory, maybe developing an application is right for you. That's not to say you must entirely disregard the math, just that the math you'll have to go over won't be as dense as what statisticians have to use. While they go over 200-year-old proofs, you'll go over 10-year-old proofs, and while their proofs are 20 pages long, yours will be about 1 page (on average).
There are different machine learning applications: ranking, natural language processing, image processing, activity recognition, etc. However, each of these problems has its own difficulties and challenges, and different people may be suited for different applications.
How do you find the application that best suits you? In my personal view, follow your passion. If you like dinosaurs, maybe you could apply recognition algorithms to detect structures and patterns in X-ray scans of bones. If, however, you like financial data, there is some work using game theory and probability to increase your profit.
It would be crazy to try to list every laboratory that has an application and uses machine learning to solve it. Instead, I'll list some of the most common machine learning algorithms and how you can use them.
Of course this list is not exhaustive, and there are thousands of different algorithms for the different applications; this is just to give you a head start on where to look and which algorithms you may find interesting to go over. I've chosen some of the most recent algorithms used to solve these problems, those published in ICML and NIPS over the last 10 years.
- Financial Data: Markov Chains, Learning, Regression, Gaussian Processes, SVM
- Robotics: Kalman Filter, Markov Chain Monte Carlo Methods, Markov Decision Process (Reinforcement Learning), SVM
- Biology: Network Structure, Clustering, Network Parameters, Dirichlet Processes, Indian Buffet Processes, SVM
- Vision: Markov Random Fields, Belief Networks, Neural Networks, SVM
- Natural Language Processing: Conditional Random Fields, Latent Dirichlet Allocation, Mixture Models, SVM
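To give a flavor of one entry in the list, here is a hedged sketch of a one-dimensional Kalman filter, the workhorse estimator from the robotics row. This is a toy version (the noise levels and readings are invented for illustration; real robots use multi-dimensional state and tuned covariances):

```python
def kalman_1d(measurements, process_var=1e-3, meas_var=0.25,
              x0=0.0, p0=1.0):
    """Estimate a (nearly) constant scalar state from noisy readings."""
    x, p = x0, p0  # current state estimate and its variance
    estimates = []
    for z in measurements:
        # Predict: the state is assumed constant, but uncertainty grows.
        p = p + process_var
        # Update: blend prediction and measurement by their confidence.
        k = p / (p + meas_var)   # Kalman gain: 0 = trust model, 1 = trust sensor
        x = x + k * (z - x)
        p = (1.0 - k) * p
        estimates.append(x)
    return estimates

# Noisy sensor readings of a true value around 1.0
readings = [0.9, 1.2, 0.8, 1.1, 1.05, 0.95]
print(kalman_1d(readings))
```

Notice the application-man flavor: the math is a few lines, and most of the real work is deciding what the state is and how noisy your sensors are.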
In future posts we will go over the algorithms and explain them. So if you have a particular interest, stay tuned, because this is just getting interesting.
If you wish to contact me, you can always do so at my Twitter account @leonpalafox, my Google+ account, or my personal webpage www.leonpalafox.com
Take care, and see you next time
Monday, July 25, 2011
So, are you an application man or an algorithmic man?
Have you tried Microsoft's Kinect?
Regardless of your opinion of Microsoft's products, it is hard to argue that the Kinect isn't a good piece of technology. The sensors it boasts are quite good, and the algorithms it uses are state-of-the-art human recognition algorithms. (Andrew Blake was working with the Kinect team.)
So, if you were part of the Kinect team, would you like to play with the cameras and sensors, or would you like to see your algorithms in action?
Most of the time, there are two sides to a research story: the application and the algorithms you used. Mathematicians' and statisticians' work is focused on theory and algorithms; grad students in math or statistics are sure to develop an extension of an existing algorithm or, in a good scenario, a new algorithm or theorem. Engineers, however, focus on the application part. You will rarely find a new algorithm in an engineering thesis, or a deep mathematical analysis of what is happening in a system (control theory people are an exception). You will find, though, a really good explanation of the hardware and of the best way to implement things on it.
But what about computer science, and of course Machine Learning?
While some researchers view machine learning as a tool, others view it as an end. This basic stance will shape your research in ML. If you view machine learning as a tool, you'll probably want your research to focus on a specific application, say computer vision, robotics or bioinformatics. In these applications, the algorithms you'll use have already been developed and tested by the theory people. You'll find that while your papers may not easily get accepted at conferences like NIPS, ICML or COLT (though they do have application tracks), they might be accepted at venues like IROS (for the robotics people) or SIGGRAPH (for the graphics people). And while your insight into the algorithms might be less than perfect, you'll know a lot about your specific application.
If, however, you see machine learning as an end, and want your graduate thesis to be an extension of an existing work, or an entirely new algorithm - then you'll have to study hard math. You'll have to read dense books on topics such as convex optimization, learning theory, computational complexity, game theory, etc. Reading these books will give you insight into how to create a new algorithm and will also allow you to understand how the algorithms really work. By the end, you'll have an algorithm that might be applied to all kinds of different problems, yet you'll probably focus on a very narrow problem to confirm your expectations. Remember, Nash wasn't even aware his equilibrium could be used in so many applications until others told him.
All of this is advice for new grad students; some tenured professors like Mike Jordan and Alex Smola are behemoths in both application and theory, and have accepted papers in both kinds of conferences and journals. And serious machine learning professors have a really good grasp of both applications and theory. But this is something you'll be able to do only after long years as an academic.
So, going back to our Kinect example: the algorithm people probably created the human detection algorithms - which can be used for a ton of different applications - while the application people were busy implementing those algorithms for the case of the Kinect, its sensors and the architecture of its processors.
Both approaches have their merits and advocates; you just have to be sure it is what you want to do.
In the next post we will discuss how to approach the application path, and after that, how to pursue the algorithm path.
Take Care
Remember to visit my website www.leonpalafox.com
And my twitter feed @leonpalafox
Wednesday, July 20, 2011
[Special Edition Post] Tablets and research
Disclaimer: This post is mostly about my personal opinions on current technologies for studying, not an actual pragmatic advice on how to do it.
Some time ago, a friend and I got into a heated discussion about the need for a laptop in an MD course. She insisted a laptop was necessary; I, of course, disagreed and pointed out that if that were true, physicians until now were either wizards or time travelers.
Today I asked myself the same question: are tablets necessary for a machine learning researcher? At ACML 2010 I saw a couple of researchers with iPads, and my previous professor bought one himself. At the MLSS in Singapore, more than 30% of the people had a tablet. I did see the practicality: my netbook seemed too bulky and bothersome to use while just reading papers and following slides.
After pondering a lot, I bought a Motorola Xoom. I did this because I needed a way to read journal papers and ebooks on the train without carrying 5000 pages in my bag. I did not choose a Kindle because, as far as I saw, the small version was less than useless for reading journal papers and math ebooks, and the DX was almost the same price as a normal tablet. (I got my Xoom for less than 200 USD.)
I can say that it has helped me a lot. I can read my papers wherever I go, and I always have them with me. I no longer have to worry about printing them, or about underlining a reference to look it up later, since I have a 3G-to-Wi-Fi converter (another advantage over the Kindle).
The reason I did not choose an iPad is that, as an iPhone user, I find iOS too restrictive for real work. I have yet to find a way to import PDFs to an iPad without using iTunes. Call me old-fashioned, but nothing beats good old plug-and-play and simply copying and pasting your files. The fact that I have access to the device's filesystem is another plus.
And here comes the question: is a tablet necessary to do good machine learning research?
It is a great help, and someone with a tablet does have a clear advantage over someone who doesn't. But then again, I stand by my earlier point: it is not necessary. Most of the greatest work in ML has been done so far without the help of a tablet, and I'm pretty sure it'll keep being that way for many years to come.
Tablets are still a long way from being the perfect medium for reading papers. Their lack of support for precise stylus-like devices is a bother (I love to make notes on my papers), and the slow response of most of them still dampens your productivity.
I'll probably keep buying my math books, but for a quick refresher while commuting, or to stay sharp on a particular topic by surveying some papers, I think a tablet is an unbeatable companion.
Thank You and see you later
Remember to visit my website www.leonpalafox.com
And my twitter feed @leonpalafox
Friday, July 1, 2011
Where should I start, what should I do?
So you are all set in a machine learning grad course. (I'll leave the admission niceties to you, since they vary wildly from country to country.)
If you're lucky and have a good adviser, you'll probably have a project right away. But what if not?
A lot of students have the feeling that they are alone, stranded and unwanted, and machine learning is no exception. Sometimes you won't really know where to start looking. Even if you have a project, actually starting to do things may take you some time.
In case you do not have a project, try looking around at what people are doing in your laboratory. It's always a good idea to work with someone, since you'll get feedback and a sense of commitment to another person. These simple things will help you progress in your research.
You can always go to your professor, see what he's working on (remember, I told you it was important to have an active researcher as a professor) and offer your help. Even coding simple things is a great help for him, and it gives you pretty good insight into advanced work and which problems need solutions.
You should also pick up basic books on the topics you are interested in. A very good introduction to the different areas of ML is Bishop's book (be aware that you'll need a good background in linear algebra, probability and calculus to grasp most of the contents). In a future post we will put up a detailed list of which books may help you in your research.
Also try to look at the most recent conferences on a topic you like, to see what the world is working on and what unsolved problems there are. If you're lucky, your professor may pay for you to go to some of these conferences, even if you have no accepted papers.
But do you want to solve fundamental problems, or do you want to solve technical problems? Different problems have different sources.
There is one more thing to consider when choosing your research, which I'll cover in the next post: do you want to apply machine learning, or do you want to develop ML algorithms?
See you next time.
Don't forget to pay a visit to my webpage and leave some comments here.
Tuesday, June 28, 2011
How to choose a Grad Program in Machine Learning
So, you have decided to continue your studies.
You have decided it'll be Machine Learning.
Where can you start?
Have you been pondering how to cook a pizza? If you are anything like me, you'll consider it a hassle just to get the ingredients, let alone start making the real stuff. How about fixing that old bicycle you have in the garage?
Most tasks in life are hard because we have a hard time figuring out how to start, and a graduate program is no different. Depending on which country you live in, you may find that several universities offer the same program; how do you know which one is best, or which one will fit you?
Machine learning, in its current form, is a rather recent area. Because of this, you'll find that few universities offer graduate courses specifically in machine learning. Often, to study ML, you'll have to enroll in a computer science graduate program and then work with a professor who specializes in ML.
I really recommend that you focus on the professor you want to work with, rather than the university's name. A lot of people will go to good universities without knowing anything about the researchers there.
A quick Google search for "Machine Learning" and "Research Lab" plus your country's name should return some results. It would be impossible to make a full list, since there are many labs to look into, but you can check my webpage for some pointers.
Also try looking into labs that pique your interest, in areas like computer vision, text processing or data mining; a lot of these areas are using machine learning, and while the lab might be using other tools as well, you can always try to improve their work using ML.
Now, do you want to do applications, or do you want to unravel the mysteries of the algorithms? It is safe to say that very few people are able to create something entirely new in a 3-year PhD; you might succeed at modifying an algorithm or applying some obscure test to some unseen data.
Most laboratories will look into applications and how to apply machine learning algorithms. I really recommend you look into labs that have at least a couple of mathematicians on their staff, since that is a guarantee that their work is well grounded on the theoretical side.
Another thing to check is whether the professor you are interested in is still active as a researcher. I cannot emphasize enough how important this is for a research lab: if the professor does not write papers anymore, it will be hard for him to keep up with you or whatever crazy algorithm you are thinking of.
These are nothing but a few pieces of advice; in our next post we will speak more deeply about applications and algorithms in machine learning and how to choose your path.
See you next time
Remember to visit www.leonpalafox.com for my latest research and a list of ML Conferences
Friday, June 10, 2011
Do I even like research?
Disclaimer: These posts are mostly focused on people oriented towards areas such as Math, Physics and of course Machine Learning. Some of the things may not apply to other areas.
"I'm sure I want to study a graduate program.......... Really?"
You would be amazed how many times I've heard people claim they like research when they don't know the first thing about it.
It usually starts with: I like to read, I like math and I want to travel. Then they weigh how difficult it is to land a job against how difficult it is to get into a grad program, and finally decide they want a PhD. Keep in mind that while job openings are limited, grad program openings keep growing.
Then reality kicks in. In order to land a good job once you're finished - and be a half-decent researcher - you'll need at least 5 or 6 journal papers, more than a dozen conference papers, and a shining PhD thesis, which you'll probably hate with all your heart.
To finish your PhD on time and do all of these things, you'll need to do 3 basic things:
Read, and I mean read. Forget your monthly book: to stay ahead of and informed on the comings and goings of your topic, you'll need to read at least one paper each day and one academic book chapter every month (sounds easy?). This will ramp up near conference dates, and whenever new issues of the journals you follow get published (yes, you have to follow journal publications).
You'll also need to write, and you'll need to balance your reading workload with writing. Most people fail to see this and end up pulling all-nighters to finish journal papers at the deadline, often unpolished and unfinished. I'll tackle how to handle your time in a later post.
And finally, you'll need to do real stuff. In most scientific areas, reading is not research; it is a part of it, but doing it alone won't take you anywhere. In CS you'll need to implement your ideas in code, and that'll take you more time than you would care to admit. I've spent entire coding sessions working out the bugs in my programs, let alone their real functionality.
To accomplish these things you'll need a lot of self-discipline, and in most cases a good advisor is also a plus. Yet good advisors are hard to find, a topic I'll cover in our next post: "How to choose a Grad Program".
See you next time.
Remember to visit www.leonpalafox.com for my latest research and a list of ML Conferences
Monday, May 23, 2011
Why Choose a Graduate Program in Machine Learning?
"I wish I were in that program", "I don't like my Graduate Program", "I don't see the meaning of this".
These are some common phrases you'll hear from a fresh graduate student. While valid, these complaints are evidence of a dreaded truth: most graduate students are lazy, ill-prepared, and immature. As a graduate student myself, I consider that statement true, and no different from anyone who has chosen the wrong job.
Most grad students have no idea what a Graduate Program is. They think it is like college, just with fewer subjects. Every time I speak with one of them, I hear the same reasons to continue: a lack of job offers, and liking school. And so these kinds of students clutter the research area. They often lack a vocation for research, and much of the time they even hate it.
Here I'll try to address that issue by giving advice on how to pursue a grad program, with a focus on Machine Learning. I'll help you find good programs and advisors, and give you some tips to pursue and finish your PhD.
One of the first things you should know is that these are mere suggestions. I'm not a professor, just an enthusiast who likes to help. I do, however, consider myself humbly capable of helping you decide and get started, since I've already done it with average results.
Before choosing a grad program, you have to answer these questions first:
- Do I know what research is?
- Do I really want to do research?
- Do I like math?
- Am I willing to study by myself at least 4 hours a day?
- Do I like to write?
- Am I willing to write at least 1000 words every 3 days?
- Am I willing to keep living a student life for this?
I'll comment on these questions as we go. I designed them to help you find your vocation as a researcher.
If the answer to ANY of those questions is NO, I'll ask you to reconsider pursuing a graduate program in Machine Learning.
And if the answer to more than 3 of them is no, I'll ask you to reconsider pursuing a graduate program at all.
Ask yourself these questions, sleep on it, and next time we'll see how to choose a program that suits your needs.
Remember to visit www.leonpalafox.com for my latest research and a list of ML Conferences