LING 471: Computational Methods for Linguists

Information

Course Description

Computational Methods for Linguists focuses on how computational methods and tools can be applied in linguistics. One of the main goals is to familiarize students with the basics of programming and the general technical skills that programming tasks require. Assignments are organized around linguistic or linguistically annotated data, and presentations are graded on how well they connect to linguistic and social concepts. This course assumes no background in computer science or linguistics.

Learning outcomes

By the end of the course, you will be able to:

  • Write computer programs in Python
  • Discuss what counts as data in computational linguistics
  • Connect linguistic theory to computational method choice
  • Reflect on the ethical and social implications of data use
  • Use a command line interface effectively
  • Apply version control (Git)
  • Perform data cleaning, vectorization, modeling, training, interpretation, and visualization

Meeting Times & Format

This course is taught in person. Recordings will not be made. In-class activities require a computer, so you are strongly encouraged to bring a laptop to every class.

Days | Time | Location
Tuesday & Thursday | 3:30-5:20 PM | SMI 309

Teaching Staff

Role | Name | Contact | Office | Office Hours
Instructor | Siyu Liang | liangsy at uw.edu | GUG 407 and Zoom | TBD

Texts

All readings for this class are available at no cost to you, either as open-access material or through UW’s library licensing of academic content. We will read from the following books, among other resources:

Recommended text (for those who have not taken LING 200/400): Language Files 13 (or Essentials of Linguistics) for linguistics review.


Policies

On the Use of Large Language Models (LLMs)

Many of you probably use LLMs (e.g. ChatGPT, Gemini, or Claude) on a daily basis, and for good reason: they are remarkably good at many tasks, in particular writing code. If some of you end up working in tech, you will likely find that everyone around you uses LLMs to write code. This is the reality of the field today.

However, a recent study (Xiao et al., 2025) examined how students with different backgrounds used an LLM coding assistant for data analysis assignments. They found that technical expertise — not AI familiarity or communication skills — was the only significant predictor of homework performance, even though all students had equal access to the same AI tool. Students who already had strong programming foundations wrote clearer prompts, provided better context, and used AI strategically to explore and improve their work. Students without that foundation tended to use AI reactively — to fix errors or get unstuck — and often got trapped in loops of vague prompts that led nowhere. In short, LLMs amplify what you already know; they do not replace the need to learn.

If you are taking this class, that means you are still building your coding foundations. I therefore strongly discourage you from using LLMs to write your code for assignments, and strongly encourage you to learn how to code, test, and debug on your own. These are the skills that will make you effective — with or without AI assistance — in the long run.

Homework

To support this philosophy, all homework assignments are graded on completion, not correctness. You will receive full credit as long as you make a genuine attempt. You should not feel obligated to submit perfect work. Instead, treat each homework as a learning opportunity. Try things out, make mistakes, and learn from them. That is the whole point.

Presentation

You will form teams of 3-4 and present a paper during the presentation sessions at the end of the quarter (May 28 and June 2 on the schedule). Each presentation should be approximately 10 minutes long. You may choose any paper on computational linguistics that interests you. Presentations will be graded on clarity, how well you connect the paper to linguistic and social concepts discussed in class, and equal participation from all team members.

Final Exam

There will be a final written exam at the end of the course. It will assess your understanding of basic concepts covered in class and your ability to write pseudocode for simple programs. If you have been engaging with the homework assignments throughout the quarter, you will be well prepared.

Extra Credit

Significant in-class participation can earn an adjustment of up to 2%. At the end of the quarter, you may also nominate classmates for this extra credit if you think they have been active learners and helpful in study groups.

Grading

Here is a breakdown of the grading components:

Component | Weight
Programming Assignments | 75%
Presentation | 15%
Final Exam | 10%
Participation (extra credit) | up to 2%
Total | up to 102%

Grading scale: ≥95% = 4.0; 94% = 3.9; 93% = 3.8; and so on.
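
As a rough illustration of how the weights and the scale above combine, here is a minimal Python sketch. It assumes that the scale keeps dropping by 0.1 grade points for each percentage point below 95% (one reading of "and so on"), that fractional percentages are truncated to a whole number, and that the result bottoms out at 0.0. None of these details are stated in this syllabus, and the function names are made up for this example.

```python
# Component weights in percent, copied from the grading table.
# Participation is extra credit, so the weights sum to 102.
WEIGHTS = {
    "Programming Assignments": 75,
    "Presentation": 15,
    "Final Exam": 10,
    "Participation": 2,
}


def course_percentage(scores):
    """Combine per-component scores (each 0-100) into a course percentage."""
    # Components missing from `scores` count as 0.
    return sum(weight * scores.get(name, 0) for name, weight in WEIGHTS.items()) / 100


def grade_point(percentage):
    """Map a course percentage to the 4.0 scale."""
    # Assumption: fractional percentages are truncated to a whole number.
    whole = int(percentage)
    # Work in tenths of a grade point to avoid floating-point drift:
    # 95 -> 40 tenths, 94 -> 39 tenths, 93 -> 38 tenths, ...
    tenths = 40 - (95 - whole)
    # Cap at 4.0; the floor at 0.0 is an assumption of this sketch.
    tenths = max(0, min(40, tenths))
    return tenths / 10


example = {"Programming Assignments": 90, "Presentation": 80, "Final Exam": 70}
total = course_percentage(example)
print(total, grade_point(total))  # 86.5 3.1
```

Under these assumptions, a student with 90% on assignments, 80% on the presentation, and 70% on the final exam would end up at 86.5%, which maps to 3.1. Treat this as a way to estimate your standing, not as the official calculation.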

Communication & Discussion Boards

  • Each assignment has a dedicated Canvas discussion board. Use these for technical and logistics questions so that others can benefit from answers.
  • Email should be reserved for private matters such as grades or personal circumstances.
  • The instructor will respond to Canvas posts and emails within 48 hours, excluding weekends.

Accessibility and Disability Accommodations

Your experience in this class is important. It is the policy and practice of the University of Washington to create inclusive and accessible learning environments consistent with federal and state law. If you have already established accommodations with Disability Resources for Students (DRS), please activate your accommodations via myDRS so we can discuss how they will be implemented in this course.

If you have not yet established services through DRS, but have a temporary health condition or permanent disability that requires accommodations (conditions include, but are not limited to: mental health, attention-related, learning, vision, hearing, physical, or health impacts), contact DRS directly to set up an Access Plan. DRS facilitates the interactive process that establishes reasonable accommodations. Contact DRS at www.disability.uw.edu.

Safety

Call SafeCampus at 206-685-7233 anytime – no matter where you work or study – to anonymously discuss safety and well-being concerns for yourself or others. SafeCampus’s team of caring professionals will provide individualized support, while discussing short- and long-term solutions and connecting you with additional resources when requested.

Religious Accommodations

Washington state law requires that UW develop a policy for accommodation of student absences or significant hardship due to reasons of faith or conscience, or for organized religious activities. The UW’s policy, including more information about how to request an accommodation, is available at Religious Accommodations Policy (https://registrar.washington.edu/staffandfaculty/religious-accommodations-policy/). Accommodations must be requested within the first two weeks of this course using the Religious Accommodations Request form (https://registrar.washington.edu/students/religious-accommodations-request/).

Schedule

Class schedule

Items in the “Reading” column are to be read before the class meeting they are associated with. Items that are due on a particular day are due at 11:59 PM on that day.

Week | Date | Topic | Reading | Due
1 | Mar 31 | Introduction, course structure | |
1 | Apr 2 | Conceptual overview, data science | What is data science? | Online survey (on Canvas); request an account on the Patas cluster
2 | Apr 7 | Basic system and programming concepts | Think Python Ch. 1: The Way of the Program |
2 | Apr 9 | VS Code basics, version control | The IMDB reviews dataset paper; Data statements for NLP; Version control (read conceptually, ignoring RStudio stuff, etc.) |
3 | Apr 14 | Variables, scope, control flow, FizzBuzz | Note: this looks like a lot, but many of these are only about 4 pages long. Think Python Ch. 2: Variables, Expressions, and Statements; Think Python Ch. 5: Conditionals and Recursion (5.2-5.7); Think Python Ch. 8: Strings (8.1-8.2, 8.4-8.5); Think Python Ch. 10: Lists (10.1-10.5); De Morgan’s law | Assignment 1
3 | Apr 16 | Loops, dictionaries, input/output | Think Python Ch. 4: Case Study: Interface Design (4.1-4.2); Think Python Ch. 7: Iteration (7.1-7.4); Think Python Ch. 11: Dicts (11.1-11.3, 11.5); Input/output (7.1-7.2.1) |
4 | Apr 21 | Text processing | Regular expressions; Tokenization (I would not recommend installing Keras or Gensim just to follow along, though we will probably use Keras later); Unicode; Modules (6.1-6.4.1) |
4 | Apr 23 | Text processing, Unicode, evaluation, PyCharm settings | Think Python Ch. 19: Goodies (19.2-19.3) |
5 | Apr 28 | Metrics, precision, recall | Precision and recall (you can stop when you get to the section headed “In binary classification settings”); Precision and recall 2 | Assignment 2
5 | Apr 30 | Data science, probability, maximum likelihood estimation | Speech and Language Processing Ch. 3: N-Gram Language Models (read for conceptual, not technical, understanding) |
6 | May 5 | Bayes’ theorem, data frames | Stats tutorial (skip the section about R, but make sure to read about Bayes’ theorem. You can also skip “Entropy and Information Gain” and “Inferential statistics”. Read those if you like, since they are generally important, but we probably won’t have time for them.) |
6 | May 7 | Machine learning and matrices, linear regression | Regression and classification |
7 | May 12 | Machine learning, logistic regression, Naive Bayes | Logistic regression; Naive Bayes | Assignment 3
7 | May 14 | Language models, nonlinearity, neural networks | Deep learning for NLP; Nonlinear problems; Testing NLP models |
8 | May 19 | Deep learning, linguistic knowledge in NLP | Ettinger et al. (2017) |
8 | May 21 | Working with linguistic corpora | Aijmer (2021) or Stange (2021) (both found on Canvas→Files→papers) |
9 | May 26 | Visualization, communication | To dissect an octopus; Keras word embedding tutorial (a working version of this is part of your HW5 skeleton); Visualization with Pandas | Assignment 4
9 | May 28 | Presentations | |
10 | Jun 2 | Presentations | |
10 | Jun 4 | Final Exam | | Assignment 5