Friday, October 15, 2010

How to be on top of your field

Most computer science fields are fast moving. There’s a lot going on all the time at universities and research labs. Hundreds of publications appear every year through the few major conferences and journals of a field, not to mention lower-ranked venues. It is particularly important to keep a close eye on the advances in your field because research is usually incremental. You start where others left off. If you don’t know the recent findings of other scientists in the field, you will have a hard time figuring out what topics are important, what approaches are popular, what the state of the art in solving a particular problem is, etc.

The problem is that we spend a lot of time working on a particular project, taking care of our day-to-day job responsibilities, and trying to meet deadlines. It’s not uncommon for such activities to take up all the time we have, leaving no time for expanding our knowledge by learning about recent findings in the field.

One way to solve this problem is to attend related major conferences. In addition to learning about new advances presented at the conference, you will have the opportunity to mingle with other researchers working on similar or related problems, which may lead to collaborations. Unfortunately, this is not always a viable solution. Who has the bandwidth and resources to attend all the related conferences?

Another solution that’s more affordable for research groups in universities as well as research labs was described by Prof. Azer Bestavros as follows:
  • Each member in the group maintains a list of interesting papers s/he would like to read, covering last year’s proceedings of the major conferences and journals in the field.
  • Schedule a recurring 15-minute meeting (frequency depends on the group size) in which one of the group members gives an overview of a paper on his or her list.
For example, a group of size 10 may choose to hold this meeting every other day. With a five-day work week, that makes about 2.5 talks per week: every member gives one talk roughly every four weeks (0.25 talks per week) and listens to the rest. Good deal, huh?
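
The per-member load depends on the group size and on how often the meeting is held. Here is a minimal sketch for working it out, assuming a five-day working week and one talk per meeting (both are assumptions; the function name and parameters are made up for illustration):

```python
from fractions import Fraction


def talks_per_week(group_size, days_between_meetings, working_days_per_week=5):
    """Return (meetings, talks given, talks listened to) per member per week.

    Assumes exactly one talk per meeting and that members take turns evenly.
    """
    # Exact arithmetic first, converted to float only at the end.
    meetings = Fraction(working_days_per_week, days_between_meetings)
    given = meetings / group_size      # each member's share of the talks
    listened = meetings - given        # every other talk is one you attend
    return float(meetings), float(given), float(listened)


meetings, given, listened = talks_per_week(group_size=10, days_between_meetings=2)
print(f"{meetings} meetings/week, {given} talks given, {listened} talks heard")
```

Changing `days_between_meetings` or `working_days_per_week` shows how quickly the speaking load shrinks as the group grows, which is why the meeting frequency should follow the group size.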

Friday, October 1, 2010

Personal statement

This post is the fifth featuring Prof. Mor Harchol-Balter's talk advising people applying to PhD programs in computer science or related areas.
[1st episode - 2nd episode - 3rd episode - 4th episode - full article]

It’s misleading that the personal statement is called a “personal” statement, since what admission committees are really looking for is a research statement. What admission committees want is a statement about what research you have done, what research you hope to do, and why you like research.

Here’s a template if you need one:

i. First paragraph – Describe the general areas of research that interest you and why. (This is helpful for a committee to determine which professors should read your application.)
ii. Second paragraph and Third paragraph – Describe some research projects that you worked on. Tell us what you found, what you learned, what approaches you tried. It’s fine to say that you were unable to prove what you wanted or to solve your problem.
iii. Fourth paragraph – Tell us why you feel you need a Ph.D. Look back to section 2 and explain what in there appealed to you.
iv. Fifth paragraph – Tell us why you want to come to CMU. Whom might you like to work with? What papers have you looked at from CMU that you enjoyed reading? What will CMU teach you?

It’s important to realize that the research statement is not a commitment to do research in that area. A third of all applicants end up working in an area different from that which they described on their research statement.

Here are the common mistakes that half of our applicants make:

• The grade regurgitator – “In my high school, I was ranked Number 1. Then I got a perfect score on my college entrance exams. Then I competed in a statewide math competition and I was the best. Then I competed in a national programming competition and I was 5th. In college, my GPA was 3.95 out of 4.0. For these reasons, I believe I will do well in your graduate department.”
What’s wrong with this? This portion of the essay is a waste of space. Awards are certainly relevant; however, any award you won should be listed on a separate piece of paper which is titled “Awards and Honors” and which you can include with your application. There is no reason to tell us all this in your essay. It will only piss off the people reviewing your application because they already read all this information earlier in your application and they now want to hear about research only.

• The boy genius – “When I was born, my mother gave me a glass ball to play with. I would lay and look at the prisms of light shining through my ball. At age 3, my father brought home our first computer and I disassembled it and then put it back together. It was then that I knew I wanted to become a computer scientist. By age 5, I had taken apart every appliance in our house. At age 6, I became a chess whiz ....”
What’s wrong with this? We simply don’t care what you did as a child, and we don’t believe you either. You’d be surprised how many applications from Einstein-wanna-be’s we get. If you really think this is relevant, put the important facts on a separate sheet of paper, and include it in your application. It’s best if your essay can stick with stuff you did in college and later.

Sunday, August 22, 2010

Failure is an orphan

An eye opener! That's the least one can say about this inspiring article by Prof. Stan Szpakowicz in the Computational Linguistics journal.

According to the article, one project out of a hundred produces results that justify the investment. However, we tend to count ourselves among the 1%. That's because we need to show that we outperformed others in order to publish more, and we need to publish more in order to have a thriving career in research. That's a compelling reason for us NOT to invest more time in something that produced negative results.

But WAIT. "Suppose you have set up an experiment carefully and in good faith, but still it comes up short. That’s not a positive outcome. Maybe your intuition has let you down. Maybe this cannot work. Wait, maybe you can prove that it cannot work?" THAT would be a useful outcome. But you need to get it published.

The problem boils down to peer review, which aggressively rejects failures. "A forthright admission of the inferiority of one’s results, despite the integrity or novelty of the work, is a kiss of death: no publication. There must be improvement ... conformance to reviewers’ expectations is an asset. Indeed, we write so they are likely to accept". But we're talking about ourselves. WE are the reviewers. If we, the authors, insist on striving for success and run away from every negative result, then we, the reviewers, will check papers for signs of success.

An experiment carefully thought out, a systematic procedure, an honest evaluation—these are the ingredients of good science. It is not mandatory for the results to be positive, though it certainly lifts one up if they are. In areas where empirical methods dominate (e.g. computational linguistics) people try things which fail at the experimental stage. This may be due to lack of rigor, but often there are deeper, unexpected, and intriguing reasons. We can learn a lot if we analyze scientifically why an intuitive and plausible experiment did not work. Then again, to know what leads to dead ends in research surely can warn others off paths which take us nowhere. Simply put, a negative result can be a useful lesson.

Q: Philosophy aside, I want to publish a serious, worthwhile negative result I've obtained. Where to go today?

Here's a non-comprehensive list in different disciplines

Tuesday, August 17, 2010

The illustrated guide to a PhD

Here's how Prof. Matt Might (University of Utah) explains to fresh PhD students what a PhD is:

Imagine a circle that contains all of human knowledge:

By the time you finish elementary school, you know a little:

By the time you finish high school, you know a bit more:

With a bachelor's degree, you gain a specialty:

A master's degree deepens that specialty:

Reading research papers takes you to the edge of human knowledge:

Once you're at the boundary, you focus:

You push at the boundary for a few years:

Until one day, the boundary gives way:

And, that dent you've made is called a Ph.D.:

Of course, the world looks different to you now:

So, don't forget the bigger picture:

Keep pushing.

Taken from - licensed under the Creative Commons Attribution-NonCommercial 2.5 License.

Tuesday, August 3, 2010

Research competitions

Research competitions are designed to accelerate research on a particular topic. The entity which organizes a competition usually has a problem and wants to encourage researchers to find solutions to it.

  • For example, Netflix, a popular US company which provides flat-rate DVD rentals and video streaming services, offered a $1,000,000 prize to whoever came up with the best collaborative filtering algorithm to predict user ratings for films based on previous ratings.
  • Text Analysis Conference (TAC) is a series of evaluation workshops organized by NIST to encourage research in Natural Language Processing.
  • Text REtrieval Conference (TREC) is a series of evaluation workshops organized by NIST to encourage research in Information Retrieval.
  • OpenMT is yet another evaluation series organized by NIST to encourage research in machine translation technologies.
  • Speaker Recognition Evaluation (SRE) is NIST's workshop to encourage research in speaker recognition.
Why should you participate?
  • Data: Organizers of a research competition provide participants with scarce data resources for free so that they can compete. It is very expensive to collect the data yourself. Sometimes, you can subscribe to get the data for (huge) fees, but even then, data catalogs are not made available until many years after the competition was held.
  • Evaluation: Normally, you need to prove your novel technique performs better than state-of-the-art techniques that handle the same problem as yours. First, you need to decide which other techniques you should compare to, which is not always an easy task. Then, you try to obtain the same data set used in their publication so that your results are comparable. Soon you find out they were not using standard data for training or testing. So, you decide to run the other technique on your data, but you can't find a readily available implementation of it. So, you have to implement it yourself. Even after all that, the comparison may not be accurate because there are usually tons of details not mentioned in publications which make a big difference in results. When you participate in a research competition, you don't have to worry about all this painful overhead.
  • Exposure: Normally, when you do something great, there is no guarantee that people will listen to you. Most prestigious conferences, for example, reject high-quality papers because they have limits on the number of papers they may accept. When you participate in such competitions and produce great results compared to other participants, people will listen and learn from what you did.
  • Publications: This is related to the previous point. Most competitions provide a good publication venue for participants to explain their systems and results.

Sunday, July 25, 2010

You and your research

This is the title of a talk given by Richard Hamming in 1986. In this talk, Hamming was trying to address the question "Why do so few scientists make significant contributions and so many are forgotten in the long run?".

In the latest issue (Summer 2010) of ACM's XRDS magazine, Daniel Lemire reflected on that talk in an article titled "Marketing Your Ideas". If you don't have time to read the full transcript of Hamming's talk, you may want to have a look at Daniel's article. Hmm.. If you don't have time to read the transcript, you probably don't have time to read Daniel's article. Let me give an extractive summary of the latter:

1. Take your time

"Young scientists tend to rush their presentations. They work four months to a year on a project, yet they wait until the last minute before writing their paper and rehearsing their presentation—when they rehearse it at all."

"What about reports and research papers? Rushing their publication is trading quality for quantity. It is an unfortunate trade, as there is a glut of poor research papers, and too few high quality ones. Continuous writing, editing, and rehearsal should be an integral part of your activities."

2. Reach out to your audience

"Scientists and engineers are most successful when their work is most available."

"But posting your content and giving talks is hardly enough."

"If you want people to attend your talks, make sure your title tells them why they should attend. Think about your audience. They want to know whether they should continue reading your paper or come to your talk. Convince them that you have something remarkable to tell them. Avoid jargon, acronyms, and long sentences."

"Do not underestimate email. It is the most powerful medium at your disposal. Yet, you have to use it wisely. To get famous people to read your emails, study their work. Show appreciation for their results. Think of reasons why they might find your question or proposal interesting."

Saturday, July 17, 2010

Research interests change

One of the common mistakes scholars make is to assume that a researcher/professor is still interested in a topic he/she was working on 5 years ago! This usually happens when a student is looking for a supervisor/collaborator with a specific interest. The student finds a good paper on that topic and contacts the author.

As a matter of fact, researchers change their scientific interests over time. This might happen for several reasons. Governments and industry fund research on a topic when a need arises, then cut the funding when the need fades. Also, believe it or not, some research problems are eventually solved! Sometimes, researchers also change their focus when they move to a new position, to align with the research direction of the employer (be it a university or a research lab).

Friday, June 4, 2010

Should I get a Ph.D.?

This post is the fourth episode featuring Prof. Mor Harchol-Balter's talk advising people applying to PhD programs in computer science or related areas.
[first episode - second episode - third episode - full article]

Here are some things to keep in mind when making this decision:
  1. A Ph.D. is not for everyone!
  2. A Ph.D. requires 6 years on average. The opportunity cost is high.
  3. Do not even think of applying for a Ph.D. if you have not tried research and/or teaching and found that you like at least one of those. (Note: the Ph.D. program will require mostly research, not teaching, but a love of teaching may help motivate you to get through, so that you can go on to be a teacher. I have seen many examples of this.)
  4. A Ph.D. requires a particular type of personality. You need to be someone who is obsessed with figuring out a problem. You need to have tremendous perseverance and be capable of hard work. You need to be willing to do whatever it takes to solve your problem (e.g., take 5 math classes, learn a whole new area like databases, rewrite the whole kernel, etc.).
  5. You need to know why you want a Ph.D. You need to have vision and ideas and you need to be able to express yourself.
  6. Obviously, many people are still unsure straight after a B.A. I was one of them, so I understand. For such people, working for a few years in a research lab or in an industrial lab which involves doing research will help them decide. If you are unsure, I highly recommend working for a few years before starting a Ph.D. Do not apply to graduate school until you are sure you know what you want.
Prof. Mor Harchol-Balter's own story:
After I finished my B.A. in CS and Math, I went to work at the Advanced Machine Intelligence Lab at GTE in Massachusetts. At first I was very excited by my paycheck and the great feeling of being independent. I also really enjoyed my area of research at the time: pattern recognition and classification. I was working with frame-of-reference transformations involving eigenvectors of autocorrelation matrices. It was exciting! However I quickly realized that I wanted to know more. I wanted to know why some algorithms produced good results and others didn’t. I wanted to come up with my own algorithms. I worried that I didn’t have enough of a mathematics background to answer my own questions. In summary, I wanted to delve deeper. Everyone around me thought I was odd for wanting these things. I left after 2 years and went to graduate school. That first month of graduate school I looked around and realized that everyone there was just as weird and obsessed as I was, and I knew I had made the right decision.

Friday, April 30, 2010

Specialized mailing lists

If you have made up your mind about which field you want to work in, consider joining specialized scientific mailing lists related to your field. It's very useful to be on the same mailing list as the pioneers of the field. Subscribing to such mailing lists gives you the following benefits:
- Notifications about calls for papers/participation at conferences as well as deadline extensions.
- Notifications about special issues of journals.
- Announcements on PhD/MSc opportunities and scholarships.
- Announcements on Research-Assistantship, Post-Doc and relevant job vacancies.
- Resources and tools made available to the research community.
- Discussions on research directions by professionals in the field.

The following are examples of fields and their respective mailing lists:
- Natural Language Processing: Corpora-list
- Data Mining/Databases: KD-Nuggets, DB-World
- Social Network Analysis: Socnet
- Machine Learning: ML-news, UAI
- Neural Nets: Connectionists
- Information Retrieval: SIG-IRList

Thanks to Hossam Sharara for inspiring this post.

Tuesday, April 13, 2010

Previous research experience (if you plan to apply for a PhD program)

This post is the third episode featuring Prof. Mor Harchol-Balter's talk advising people applying to PhD programs in computer science or related areas.
[first episode - second episode - full article]

As I’ve said earlier, to get into a top graduate school you need prior research experience. This is not necessarily true for schools below the top 10, or maybe even the top 5. Note that prior research experience does not mean that you need to have published a paper. It does not even mean that your research needs to have yielded a result – results can sometimes take years. We just need to have confidence that you know what doing research is like. At CMU we receive hundreds of applications each year from 4.0 GPA students who have never done research. These are all put into the high risk pile and are subsequently rejected.

So the question is, where can you get this research experience?
There are five places where you might get research experience:
  1. As an undergraduate, you can do research with a professor. I did this. You can even get course credit for this, and sometimes if you’re really lucky you can get paid a little (e.g., during the summer).
  2. As an undergraduate, you can apply for a summer internship at a research lab, e.g., AT&T. I did this. They will pay you a little and you will learn a lot about doing research. This was a great experience for me! Here’s the web site for the AT&T summer program that I attended: When you go to this web site, click on “Special Programs and Fellowships.”
  3. After graduating, you can get a job, where sometimes you can do research on the job. I did this.
  4. As an MS student, you will work on an MS project.
  5. You can work alone or with a friend. Ask professors in your classes to tell you about interesting open problems and new research (most professors enjoy doing this). Ask them to tell you names of conference proceedings. For example in my area (performance modeling of computer systems) a relevant conference proceeding is Sigmetrics. Sit down and start reading these proceedings. You will come across all sorts of interesting problems. Think about how you can improve upon the solution proposed in the paper.

Warning for international applicants: The admissions committee needs to be able to evaluate your research. If your publications appear in conferences/journals which we are not familiar with and have no access to, then we cannot evaluate the quality of your work. In my experience, this usually leads us to discount such publications. If you don’t want this to happen, here are two things you should do:
  • Publish in internationally recognized conferences – ask your advisor.
  • Create a web site that has links to all of your papers in English in either postscript or pdf. Explain in your application that all of your papers can be found on your web site.

Tuesday, March 16, 2010

How to become a star grad student

In this article, Cal Newport tries to answer this question. He takes James McLurkin as a case study, and starts by explaining how widely one can become recognized in one's field, and why this is so important.

“Four years earlier, Time Magazine profiled James as part of their Innovators series. The next year, he was featured on an episode of Nova ScienceNOW. Earlier this year, TheGrio, a popular African American-focused news portal, named James one of their 100 History Makers in the Making.”

“In other words, James is famous in his field. So it’s not surprising that in 2009 he landed a professorship at Rice University, one of the country’s top engineering schools, in one of the worst academic job markets in decades.”

James' stardom started when he designed a swarm of microbots, 'Ants', for his senior thesis project (i.e. graduation project). The devices were designed to perform complex behavior using simple rules. The paper in which he documented this work spread to the popular media, making a star out of James.

“I went to the lab as an undergrad to interview for a position,” James recalls. “Anita Flynn told me they’re not hiring. So I came back with some robots I had built, and some I was halfway through building, and she said, ‘okay, you can work in the lab, and use our parts, but we can’t pay you.’”

Once in the lab, he worked really hard on one project after another, each stretching his abilities a little bit. He wasn't alone in the lab though. Anita Flynn was shrinking the size of electronic motors, enabling the micro-robot revolution, while Maja Mataric was a leading thinker on robotic swarms.

“By the time he conceived of the Ants project for his thesis, James was an accomplished robot engineer with a number of successful projects under his belt. He also had a cutting-edge knowledge of microrobotics, and was “marinating” in a lab environment obsessed with biologically-inspired systems. With this in mind, the idea of building a robot swarm that behaves like insects was not a big hairy audacious goal to him.”

It was James' knowledge and expertise in cutting-edge techniques in his field that enabled him to take an unprecedented step. To him (and others with an equivalent level of knowledge and expertise), Ants was an obvious incremental step. To the rest of the world, including less-aware people in the robotics field, it was a huge breakthrough. Cal concludes that, to become a star, you should focus on getting to the bleeding edge of your field as quickly as possible.

“Many graduate students, for example, never arrive at the bleeding edge of their field. Instead, they reach a comfortable level of knowledge — enough to understand relevant research, and make their own acceptably-complex contributions, but not enough to make bold advances. Thousands of chemists could understand Watson and Crick’s 1953 paper on the double helix, but only a handful had the knowledge needed to have discovered it for themselves.”

Now, the obvious question is how do you get to the bleeding edge?

“Every semester, my supervisor, Anita, had me write out goals,” James told Cal. “We would go back at the end of the semester and look at what I did and didn’t do. She would tell me, ‘it’s fine that you didn’t get this all done, but what’s not fine is your inability to estimate how long something will take.’”

James deliberately chose projects that were hard enough to stretch his ability, but reasonable enough to complete in the available timeframe.

“With this in mind, I argue that the secret to James McLurkin’s success is his ability to choose the right projects. By resisting work that reinforced what he’s comfortable with, yet also sidestepping overly-ambitious projects, he consistently advanced his skill until he arrived at the bleeding edge of research robotics. Once there, the “breakthrough” projects that cemented his reputation became obvious next steps. Stretch projects are an effective way to integrate deliberate practice into fields without clear competitive structures and coaching”

To emphasize, Cal gives two definitions:
Stretch Project: A project that requires a skill you don’t have.
Stretch Churn: Number of stretch projects you complete per unit time.

In order to make it to the bleeding edge, you need to maximize your stretch churn. You need to stay in continuous discomfort, learning new things, and resist the tendency to reinforce what you already know.

Uh.. I think this comment (in response to Cal's article) is also worth quoting:
Nianu: “What kind of stretch projects would you recommend I start to tackle? I have a hard time thinking of what would be a good way to start as I am still early in my college career.”
Cal: “College and graduate-level courses are stretch projects in themselves. They force you to acquire new skills, but everyone completes them within a relatively short time frame.
Attack your courses with this mindset. Savor the hard focus required to master the material (coupled, of course, with smart study tactics to eliminate wasted time and effort), knowing that you're building the skills needed to move toward the bleeding edge.”