Projects
The goal of the course project is to give students an opportunity to explore research directions in Natural Language Processing and to develop useful NLP applications and tools. The project therefore aims at a “deliverable” result: it should be self-contained, reproducible (scientifically correct), and related to the course content. A typical successful project consists of 1) a novel and sound solution to an interesting problem, 2) correct and meaningful comparisons to baselines or existing approaches, and 3) a comprehensive literature review and discussion (e.g., error analysis). The best outcome is something publishable, reaching the quality of short or workshop papers at major NLP conferences (see the criteria for short papers in the ACL CFP and the papers in the ACL Anthology). Unlike in paper reviewing, however, we will not penalize negative results, as long as the proposed approach is well explored.
We recommend, but do not require, forming a group whose members have diverse backgrounds.
Grading
Here is a summary of how your project will be graded: 5% proposal, 20% final report, and 5% presentation.
A detailed rubric will be announced later.
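As a quick illustration of the arithmetic, the three components above add up to 30% of the course grade. The sketch below assumes each component is scored on a 0 to 100 scale, which the rubric has not confirmed, and the function name `project_points` is hypothetical.

```python
# Weights taken from the grading summary above (percent of the course grade).
WEIGHTS = {"proposal": 5, "final_report": 20, "presentation": 5}


def project_points(scores):
    """Return the course-grade points earned from the project.

    `scores` maps each component name to a score between 0 and 100.
    This scale is an assumption; the detailed rubric may differ.
    """
    return sum(WEIGHTS[name] * scores[name] / 100 for name in WEIGHTS)


# Total share of the course grade covered by the project.
total_weight = sum(WEIGHTS.values())

# Invented example scores, for illustration only.
example = project_points(
    {"proposal": 100, "final_report": 80, "presentation": 90}
)
print(f"project weight: {total_weight}%")
print(f"points earned:  {example}")
```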
By default, students on the same team will receive the same score unless there are special circumstances. We encourage you to use a version control system (e.g., GitHub or GitLab). It keeps your hard work in a safe place and logs the contributions of each individual. If your team members complain about you and you cannot provide evidence of your contribution, we may lower your score.
Pick a topic
Because this is a research project, we recommend that you not reinvent the wheel. When picking a topic, it is therefore important to know which existing resources you can leverage. Ask yourself the following questions:
- What is the problem? Why is this problem interesting and important?
- Is there an existing approach? How is your idea different from others?
- How will you evaluate your idea? What data can you use to evaluate the proposed approach? Is the data set available?
- What software packages and resources can you use to implement your idea?
- What are the best and worst possible outcomes of the project? (That is, assess your risk.)
- Who will be your group members? Do they have special expertise? How will you split the workload?
Some possible ways to find a topic are:
- Take an existing problem mentioned in class and come up with new ideas.
- Read a published paper carefully and ask yourself whether it leaves any challenges open or whether you can improve the proposed approach.
- There are many NLP shared tasks at SemEval, CoNLL, and some workshops. A shared task often provides a well-defined problem and data set, allowing different teams to compare their approaches fairly. You can use shared tasks from previous years as a testbed for your approach, or participate in one of this year's shared tasks. It is okay if you cannot get results on the final test set before the semester ends; just evaluate your approach on the development split.
We will give students a chance to recruit group members in class.
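For a shared-task style evaluation, a minimal sketch of comparing a system against a majority-class baseline on the development split might look like the following. The labels and predictions are made-up toy data, and `accuracy` and `majority_baseline` are hypothetical helper names, not part of any shared-task toolkit.

```python
from collections import Counter


def accuracy(gold, predicted):
    """Fraction of examples whose predicted label matches the gold label."""
    if len(gold) != len(predicted):
        raise ValueError("gold and predicted must have the same length")
    correct = sum(g == p for g, p in zip(gold, predicted))
    return correct / len(gold)


def majority_baseline(train_labels, n):
    """Predict the most frequent training label for all n examples."""
    most_common_label, _ = Counter(train_labels).most_common(1)[0]
    return [most_common_label] * n


# Toy data, invented for illustration only.
train_labels = ["pos", "pos", "neg", "pos", "neg"]
dev_gold = ["pos", "neg", "pos", "neg"]
dev_system = ["pos", "neg", "neg", "neg"]  # output of your own approach

baseline_preds = majority_baseline(train_labels, len(dev_gold))
baseline_acc = accuracy(dev_gold, baseline_preds)
system_acc = accuracy(dev_gold, dev_system)
print(f"baseline accuracy: {baseline_acc:.2f}")
print(f"system accuracy:   {system_acc:.2f}")
```

Accuracy is used here only for simplicity. A shared task normally specifies its own official metric, and that metric should be reported instead.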
Project proposal
A template will be provided. Your proposal should address the questions in the “Pick a topic” section. You can use this opportunity to draft the introduction and related work sections of your final report.
Project report
Each team must submit a written project report. Write the report as you would a short conference paper. If you have a demo system, you can include screenshots of it. We also recommend including a discussion of how your work could be extended. Your final project report can be presented in one of the following formats:
- PDF format: you must use the provided ACL LaTeX style files. The report should be less than 4 pages, excluding references (there is no minimum length). A short, concise report is better than a lengthy one.
- A web page or Python notebook that provides code snippets and a demo.
Project presentation
Each project team is expected to present its project. We expect everyone to attend the final project presentations unless there are special circumstances. The date of the final presentations will be announced later. The length of each presentation depends on the number of groups (5 to 10 minutes) and will also be announced later.
Your presentation will be graded mainly on:
- The clarity of your slides and presentation
- How well you deliver your key messages to the audience
- Time control
- How well you handle the questions from the audience (note that the instructor may randomly pick team members to answer questions during the presentation).
The presentation will be graded by both the instructor and your peers.