LING 471: Computational Methods for Linguists
Information
Course Description
Computational Methods for Linguists focuses on how computational methods and tools can be applied in linguistics. One of the main goals is to familiarize students with the basics of programming and the general technical skills needed for programming tasks. Assignments are organized around linguistic or linguistically annotated data, and presentations are graded on how well they connect to linguistic and social concepts. This course assumes no background in computer science or linguistics.
Learning outcomes
By the end of the course, you will be able to:
- Write computer programs in Python
- Discuss what counts as data in computational linguistics
- Connect linguistic theory to computational method choice
- Reflect on the ethical and social implications of data use
- Use a command line interface effectively
- Apply version control (Git)
- Perform data cleaning, vectorization, modeling, training, interpretation, and visualization
Meeting Times & Format
This course is taught in person. Recordings will not be made. In-class activities require a computer, so you are strongly encouraged to bring your laptop to every class.
| Days | Time | Location |
|---|---|---|
| Tuesday & Thursday | 3:30-5:20 PM | SMI 309 |
Teaching Staff
| Role | Name | Contact | Office | Office Hours |
|---|---|---|---|---|
| Instructor | Siyu Liang | liangsy at uw.edu | GUG 407 and Zoom | TBD |
Texts
All readings for this class are available at no cost to you, either through open access material or through UW’s library licensing of academic content. We will read from the following books (among other resources):
- Downey (2015). Think Python (2nd ed.)
- Jurafsky & Martin (2025). Speech and Language Processing (3rd ed.)
Recommended text (for those who have not taken LING 200/400): Language Files 13 (or Essentials of Linguistics) for linguistics review.
Policies
On the Use of Large Language Models (LLMs)
Many of you probably use LLMs (e.g. ChatGPT, Gemini, or Claude) on a daily basis, and for good reason: they are remarkably good at many tasks, in particular writing code. If some of you end up working in tech, you will likely find that everyone around you uses LLMs to write code. This is the reality of the field today.
However, a recent study (Xiao et al., 2025) examined how students with different backgrounds used an LLM coding assistant for data analysis assignments. They found that technical expertise — not AI familiarity or communication skills — was the only significant predictor of homework performance, even though all students had equal access to the same AI tool. Students who already had strong programming foundations wrote clearer prompts, provided better context, and used AI strategically to explore and improve their work. Students without that foundation tended to use AI reactively — to fix errors or get unstuck — and often got trapped in loops of vague prompts that led nowhere. In short, LLMs amplify what you already know; they do not replace the need to learn.
If you are taking this class, that means you are still building your coding foundations. I therefore strongly discourage you from using LLMs to write your code for assignments, and strongly encourage you to learn how to code, test, and debug on your own. These are the skills that will make you effective — with or without AI assistance — in the long run.
Homework
To support this philosophy, all homework assignments are graded on completion, not correctness. You will receive full credit as long as you make a genuine attempt. You should not feel obligated to submit perfect work. Instead, treat each homework as a learning opportunity. Try things out, make mistakes, and learn from them. That is the whole point.
Presentation
You will form teams of 3-4 and present a paper during the presentation sessions in Weeks 9 and 10 (May 28 and June 2). Each presentation should be approximately 10 minutes long. You may choose any paper on computational linguistics that interests you. Presentations will be graded on clarity, how well you connect the paper to linguistic and social concepts discussed in class, and equal participation from all team members.
Final Exam
There will be a final written exam at the end of the course. It will assess your understanding of basic concepts covered in class and your ability to write pseudocode for simple programs. If you have been engaging with the homework assignments throughout the quarter, you will be well prepared.
Extra Credit
Significant in-class participation can earn an adjustment of up to 2%. At the end of the quarter, you may also nominate classmates who you think were active learners and helpful in study groups for this extra credit.
Grading
Here is a breakdown of the grading components:
| Component | Weight |
|---|---|
| Programming Assignments | 75% |
| Presentation | 15% |
| Final Exam | 10% |
| Participation (extra credit) | up to 2% |
| Total | 102% |
Grading scale: ≥95% = 4.0; 94% = 3.9; 93% = 3.8; and so on.
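As a rough illustration of how the weights in the table combine, here is a minimal Python sketch. It assumes each component is scored on a 0-100 scale and that participation is added on top as a bonus of at most 2 points; the function name `course_percentage` and the example scores are hypothetical, and this is not the official grade calculation.

```python
# Weights from the grading table, as percentages of the final grade.
WEIGHTS = {
    "Programming Assignments": 75,
    "Presentation": 15,
    "Final Exam": 10,
}
# Assumption: participation is an additive bonus capped at 2 points.
MAX_PARTICIPATION_BONUS = 2


def course_percentage(scores, participation_bonus=0):
    """Return the weighted course percentage for component scores on a 0-100 scale."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing scores for: {sorted(missing)}")
    if not 0 <= participation_bonus <= MAX_PARTICIPATION_BONUS:
        raise ValueError("participation bonus must be between 0 and 2")
    # Each component contributes (weight / 100) * score to the total.
    weighted = sum(WEIGHTS[name] * scores[name] / 100 for name in WEIGHTS)
    return weighted + participation_bonus


# Hypothetical scores, for illustration only.
example = {"Programming Assignments": 100, "Presentation": 90, "Final Exam": 80}
print(course_percentage(example, participation_bonus=1))  # 97.5
```

In this example the total is 75 + 13.5 + 8 + 1 = 97.5, which is at least 95% and therefore corresponds to a 4.0 on the scale above.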
Communication & Discussion Boards
- Each assignment has a dedicated Canvas discussion board. Use these for technical and logistics questions so that others can benefit from answers.
- Email should be reserved for private matters such as grades or personal circumstances.
- The instructor will respond to Canvas posts and emails within 48 hours, excluding weekends.
Accessibility and Disability Accommodations
Your experience in this class is important. It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law. If you have already established accommodations with Disability Resources for Students (DRS), please activate your accommodations via myDRS so we can discuss how they will be implemented in this course.
If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations (conditions include, but are not limited to, mental health, attention-related, learning, vision, hearing, physical, or health impacts), contact DRS directly to set up an Access Plan. DRS facilitates the interactive process that establishes reasonable accommodations. Contact DRS at www.disability.uw.edu.
Safety
Call SafeCampus at 206-685-7233 anytime – no matter where you work or study – to anonymously discuss safety and well-being concerns for yourself or others. SafeCampus’s team of caring professionals will provide individualized support, while discussing short- and long-term solutions and connecting you with additional resources when requested.
Religious Accommodations
Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at Religious Accommodations Policy (https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy/). Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form (https://registrar.washington.edu/students/religious-accommodations-request/).
Schedule
Class schedule
Items in the “Reading” column are to be read before the class meeting they are associated with. Items that are due on a particular day are due at 11:59 PM on that day.
| Week | Date | Topic | Reading | Due |
|---|---|---|---|---|
| 1 | Mar 31 | Introduction, course structure | | |
| | Apr 2 | Conceptual overview, data science | - What is data science? | - Online survey (on Canvas) - Request an account on the Patas cluster |
| 2 | Apr 7 | Basic system and programming concepts | - Think Python Ch. 1: The Way of the Program | |
| | Apr 9 | VSCode basics, version control | - The IMDB reviews dataset paper - Data statements for NLP - Version control (read conceptually; ignore RStudio stuff, etc.) | |
| 3 | Apr 14 | Variables, scope, control flow, FizzBuzz | This looks like a lot, but many of these are only about 4 pages long. - Think Python Ch. 2: Variables, Expressions, and Statements - Think Python Ch. 5: Conditionals and Recursion (5.2-5.7) - Think Python Ch. 8: Strings (8.1-8.2, 8.4-8.5) - Think Python Ch. 10: Lists (10.1-10.5) - De Morgan’s law | Assignment 1 |
| | Apr 16 | Loops, dictionaries, input/output | - Think Python Ch. 4: Case Study: Interface Design (4.1-4.2) - Think Python Ch. 7: Iteration (7.1-7.4) - Think Python Ch. 11: Dicts (11.1-11.3, 11.5) - Input/output (7.1-7.2.1) | |
| 4 | Apr 21 | Text processing | - Regular expressions - Tokenization (I would not recommend installing Keras or Gensim just to follow along, though we will probably use Keras later) - Unicode - Modules (6.1-6.4.1) | |
| | Apr 23 | Text processing, Unicode, evaluation, PyCharm settings | - Think Python Ch. 19: Goodies (19.2-19.3) | |
| 5 | Apr 28 | Metrics, precision, recall | - Precision and recall (you can stop when you get to the section headed “In binary classification settings”) - Precision and recall 2 | Assignment 2 |
| | Apr 30 | Data science, probability, maximum likelihood estimation | - Speech and Language Processing Ch. 3: N-Gram Language Models (read for conceptual, not technical, understanding) | |
| 6 | May 5 | Bayes’ theorem, data frames | - Stats tutorial (skip the section about R, but make sure to read about Bayes’ theorem; you can also skip “Entropy and Information Gain” and “Inferential statistics”. Read those if you like, since they are generally important, but we probably won’t have time for them) | |
| | May 7 | Machine learning and matrices, linear regression | - Regression and classification | |
| 7 | May 12 | Machine learning, logistic regression, Naive Bayes | - Logistic regression - Naive Bayes | Assignment 3 |
| | May 14 | Language models, nonlinearity, neural networks | - Deep learning for NLP - Nonlinear problems - Testing NLP models | |
| 8 | May 19 | Deep learning, linguistic knowledge in NLP | - Ettinger et al. (2017) | |
| | May 21 | Working with linguistic corpora | - Aijmer (2021) or Stange (2021) (both found on Canvas→Files→papers) | |
| 9 | May 26 | Visualization, communication | - To dissect an octopus - Keras word embedding tutorial (a working version of this is part of your HW5 skeleton) - Visualization with Pandas | Assignment 4 |
| | May 28 | Presentations | | |
| 10 | Jun 2 | Presentations | | |
| | Jun 4 | Final Exam | | Assignment 5 |