The Machine Learning Journey

Monday, July 25, 2011

So, are you an application man or an algorithm man?

Have you tried Microsoft's Kinect?
Regardless of your opinion of Microsoft's products, it is hard to argue that the Kinect isn't a good piece of technology. Its sensors are quite good, and its human recognition algorithms are state of the art. (Andrew Blake was working with the Kinect team.)

So, if you were part of the Kinect team, would you like to play with the cameras and sensors, or would you like to see your algorithms in action?

Most of the time, there are two sides to a research story: the application and the algorithms you used. The work of mathematicians and statisticians is focused on theory and algorithms. Grad students in math or statistics are sure to develop an extension of an existing algorithm or, in the best case, a new algorithm or theorem. Engineers, however, focus on the application side. Rarely will you read an engineering thesis and find a new algorithm, or a deep mathematical analysis of what's happening in a system (Control Theory people are an exception). You will find, though, a really good explanation of the hardware and the best way to implement something on it.

But what about computer science, and of course Machine Learning?

While some researchers view Machine Learning as a tool, others view it as an end. This basic choice will shape your research in ML. If you view Machine Learning as a tool, you'll probably want your research to focus on a specific application, let's say computer vision, robotics or bioinformatics. In these applications, the algorithms you'll use have already been developed and tested by the theory people. You'll find that while your papers may not easily get accepted at conferences like NIPS, ICML or COLT (though they do have application tracks), they might be accepted at venues like IROS (for the robotics people) or SIGGRAPH (for the graphics and vision people). And while your insight into the algorithms might be less than perfect, you'll know a lot about your specific application.

If, however, you see Machine Learning as an end, and want your graduate thesis to be an extension of existing work or an entirely new algorithm, then you'll have to study hard math. You'll have to read dense books on topics such as convex optimization, learning theory, computational complexity, game theory, etc. Reading these books will give you insight into how to create a new algorithm and will also allow you to understand how the algorithms really work. By the end, you'll have an algorithm that might be applied to all kinds of different problems, yet you'll probably focus on a very narrow problem to confirm your expectations. Remember, Nash wasn't even aware his equilibrium could be used in so many applications until he was told.

All of this is advice for new grad students. Some tenured professors, like Mike Jordan and Alex Smola, are behemoths in both application and theory, and have had papers accepted in both kinds of conferences and journals. Serious Machine Learning professors have a really good grasp of both the applications and the theory. But this is something you'll be able to do only after long years as an academic.

So, going back to our Kinect example, the algorithm people probably created the human detection algorithms, which can be used for a ton of different applications. The application people, meanwhile, were busy implementing those algorithms for the specific case of the Kinect, its sensors and the architecture of its processors.

Both approaches have their merits and advocates; you just have to be sure which one you want to pursue.

In the next post we will discuss how to approach an application path, and after that, how to pursue an algorithm path.

Take Care

Remember to visit my website www.leonpalafox.com

And my twitter feed @leonpalafox

Wednesday, July 20, 2011

[Special Edition Post] Tablets and research

Disclaimer: This post is mostly about my personal opinions on current technologies for studying, not actual practical advice on how to do it.

Some time ago, a friend and I got into a heated discussion about the need for a laptop computer in an MD course. She insisted a laptop was necessary. I, of course, disagreed and pointed out that if that were true, every physician trained until now was either a wizard or a time traveler.

Today I asked myself the same question: are tablets necessary for a Machine Learning researcher? At ACML 2010 I saw a couple of researchers with iPads, and my previous professor bought one himself. At the MLSS in Singapore, more than 30% of the people had a tablet. I did see the practicality: my netbook seemed too bulky and bothersome to use just for reading papers and following slides.

After pondering a lot, I bought a Motorola Xoom. I did this because I needed a way to read journal papers and ebooks on the train without carrying 5000 pages in my bag. I did not choose a Kindle because, as far as I could see, the small version was less than useless for reading journal papers and math ebooks, and the DX was almost the same price as a normal tablet. (I got my Xoom for less than 200 USD.)

I can say that it has helped me a lot. I can read my papers wherever I go, and I always have them with me. I do not have to worry about printing them anymore, or about underlining a reference to look it up later, since I can look it up on the spot with my 3G-to-Wi-Fi converter (another advantage over the Kindle).

The reason I did not choose an iPad was that, as an iPhone user, I find iOS too restrictive for real work. I have yet to find a way to import PDFs to an iPad without using iTunes. Call me old-fashioned, but nothing beats good old plug and play and just copying and pasting your files. The fact that I have access to the device's filesystem is another plus.

And here comes the question: is a tablet, then, necessary to do good Machine Learning research?

It is a great help, and someone with a tablet does have a clear advantage over someone without one. But then again, I stand by my earlier point: it is not necessary. Most of the greatest work in ML so far has been done without the help of a tablet, and I'm pretty sure it'll stay that way for many years to come.

Tablets are still a long way from being the perfect way to read papers. Their lack of support for precise stylus-like devices is a bother (I love to make notes on my papers). And the slow response of most of them still dampens your productivity.

I'll probably keep buying my math books, but for a quick refresher while commuting, or for staying sharp on a particular topic by surveying some papers, I think a tablet is an unbeatable companion.

Thank you, and see you later


Friday, July 1, 2011

Where should I start, what should I do?

So you are all set in a Machine Learning grad course. (I'll leave the admission niceties to you, since they vary enormously from country to country.)

If you're lucky and have a good adviser, you'll probably have a project right away. But what if you don't?

A lot of students have the feeling that they are alone, stranded and unwanted, and Machine Learning students are no exception. Sometimes you won't really know where to start looking. Even if you have a project, actually starting to do things may take you some time.

In case you do not have a project, try looking around at what people are doing in your laboratory. It's always a good idea to try to work with someone, since you'll get feedback and feel a sense of commitment to another person. These simple things will help you progress in your research.

You can always go to your professor, see what he's working on (remember I told you it was important to have an active researcher as a professor) and offer your help. Even coding simple things is a great help to him, and it gives you pretty good insight into advanced work and which problems need solving.

You should also pick up basic books on the topics that interest you. A very good introductory book on the different areas of ML is Bishop's book. (Be aware that you'll need a good background in Linear Algebra, Probability and Calculus to grasp most of the contents.) In a future post we will put together a detailed list of books that may help you in your research.

Also try to look at the most recent conferences on a topic you like, to see what the world is working on and what unsolved problems there are. If you're lucky, your professor may pay for you to go to some of these conferences, even if you have no accepted papers.

But do you want to solve fundamental problems, or do you want to solve technical problems? Different problems have different sources.

There is another question to cover next time about choosing your research: do you want to apply Machine Learning, or do you want to develop ML algorithms?

See you next time.

Don't forget to pay a visit to my webpage and leave some comments here.

Tuesday, June 28, 2011

How to choose a Grad Program in Machine Learning

So, you have decided to continue your studies.

You have decided it'll be Machine Learning.

Where can you start?

Have you ever pondered how to cook a pizza? If you are anything like me, you'll consider it a hassle just to get the ingredients, let alone start making the real thing. How about fixing that old bicycle you have in the garage?

Most tasks in life are hard because we have a hard time figuring out how to start, and a graduate program is no different. Depending on which country you live in, you may find that several universities offer the same program. How do you know which one is best, or which one will fit you?

Machine Learning, in its current form, is a rather recent area. Because of this, you'll find that few universities offer graduate courses specifically on Machine Learning. Often, to study ML, you'll have to enroll in a Computer Science graduate program and then work with a professor who specializes in ML.

I really recommend that you focus on the professor you want to work with, rather than the university's name. A lot of people go to good universities without knowing anything about the researchers there.

A quick Google search for "Machine Learning" and "Research Lab" plus the country name should turn up some results. It would be impossible to make a complete list, since there are so many labs to look into, but you can look at my webpage for some insight.

Also try looking into labs in areas that pique your interest, like computer vision, text processing or data mining. A lot of these areas use Machine Learning. And while the lab might be using other tools as well, you can always try to improve their work using ML.

Now, do you want to do applications, or do you want to unravel the mysteries of the algorithms? It is safe to say that very few people are able to create something entirely new in a 3-year PhD. You might succeed at modifying an algorithm, or at applying some obscure test to some unseen data.

Most laboratories will look into applications and how to apply Machine Learning algorithms. I really recommend that you look into labs that have at least a couple of mathematicians on their staff, since that helps ensure their work is well grounded on the theoretical side.

Another thing to check is whether the professor you are interested in is still active as a researcher. I cannot emphasize enough how important this is for a research lab. If the professor does not write papers anymore, it will be hard for him to keep up with you or whatever crazy algorithm you are thinking of.

These are nothing but a few pieces of advice. In our next post we will go deeper into applications and algorithms in Machine Learning and how to choose your path.

See you next time

Remember to visit www.leonpalafox.com for my latest research and a list of ML Conferences

Friday, June 10, 2011

Do I even like research?

Disclaimer: These posts are mostly focused on people oriented towards areas such as Math, Physics and of course Machine Learning. Some of the things may not apply to other areas.

"I'm sure I want to study a graduate program.......... Really?"

You would be amazed how many times I've heard people claim they like research, when they usually don't know the first thing about it.

It usually starts with: I like to read, I like math and I want to travel. Then they weigh how difficult it is to land a job against how difficult it is to get into a grad program, and finally decide they want a PhD. Keep in mind that while job openings are limited, grad program openings are always on the rise.

Then, reality kicks in. In order to land a good job once you're finished (and be a half-decent researcher) you'll need at least 5 or 6 journal papers, more than a dozen conference papers, and a shining PhD thesis, which you'll probably hate with all your heart.

To finish your PhD on time and do all of these things, you'll need to do 3 basic things:

Read, and I mean read. Forget your monthly book. To stay ahead and informed on the comings and goings of your topic, you'll need to read at least 1 paper each day and 1 academic book chapter every month (sounds easy?). This will go up near conference dates, and when new issues of the journals you follow get published (yes, you have to follow journal publications).

You'll also need to write, and you'll need to balance your reading workload with your writing. Most people fail to see this, and end up pulling all-nighters to finish journal papers by the deadline, often leaving them unpolished and unfinished. I'll tackle how to manage your time in a later post.

And finally, you'll need to do real stuff. In most scientific areas, reading is not research; it is a part of it, but reading alone won't take you anywhere. In CS you'll need to implement your ideas in code, and that'll take you more time than you would care to admit. I've spent entire coding sessions just working the bugs out of my programs, before even getting to their real functionality.

To accomplish these things, you'll need a lot of self-discipline, and in most cases a good advisor is also a plus. Good advisors are hard to find, though, and that is a topic I'll talk about in our next post: "How to choose a Grad Program"

See you next time


Monday, May 23, 2011

Why choose a Graduate Program in Machine Learning?

"I wish I were in that program", "I don't like my Graduate Program", "I don't see the meaning of this".

These are some common phrases you'll hear from a fresh graduate student. While valid, these complaints are evidence of a dreaded truth in life: most graduate students are lazy, ill-prepared and immature people. As a graduate student myself, I consider that statement true. They are not unlike anyone who has chosen the wrong job.

Most grad students have no idea what a graduate program is. They think it is like college (with fewer subjects). Every time I speak with them, I realize they have the same reasons to continue: a lack of job offers, and a liking for school. And so, these kinds of students clutter the research area. They often lack a vocation for research, and most of the time even hate it.

Here, I'll try to help address that issue. I'll try to give good advice on how to pursue a grad program, with a focus on Machine Learning. I'll help you find good programs and advisors, and I'll give you some tips to pursue and finish your PhD.

One of the first things you should know is that these are mere suggestions. I'm not a professor, just an enthusiast who likes to help. I do, however, humbly consider myself capable of helping you decide and get started, since I've already done it myself with average results.

Before choosing a grad program, you have to answer these questions first:

- Do I know what research is?

- Do I really want to do research?

- Do I like math?

- Am I willing to study by myself at least 4 hours a day?

- Do I like to write?

- Am I willing to write at least 1000 words every 3 days?

- Am I willing to keep living a student life for this?

These are questions I'll be commenting on as we go. I designed them to help you find your vocation as a researcher.

If the answer to ANY of those questions is NO, I'll ask you to reconsider pursuing a graduate program in Machine Learning.

And if the answer to more than 3 questions is no, I'll ask you to reconsider a graduate program at all.

Ask yourself these questions, sleep on them, and next time we'll see how to choose a program that suits your needs.


Wednesday, March 10, 2010

Let's Begin

Hello, my name is Leon Palafox.

Starting April 1st this year, I will begin the final three-year lap toward my PhD.

As a bit of self-motivation, I decided to start writing a blog on the ups and downs of this journey, so whoever might be interested can have a look.

The final goal of this project is to have at least one publication at the NIPS conference and at least 2 journal papers in Machine Learning related venues. And, of course, to earn my PhD degree within the aforementioned time.

Any help is welcome, as are suggestions and recommendations.

So here we start. To begin, I am watching Stanford's online course on Machine Learning and reading the Bishop book. I am also reading the Cover book on Information Theory, and afterward I intend to read the Lugosi book on prediction.

So, see you next time.