Legal Natural Language Processing Lab

Master Practical Course

Master Practical Course - Legal Natural Language Processing Lab (IN2106)

Instructors: Shanshan Xu

Course language: English

6 SWS, 10 ECTS

Session Times: TBA

Information Session

17:00, Wed, 10 July 2024

Meeting Recording:

tum-conf.zoom-x.de/rec/share/J4cYzXUM5f2VIixRmJcUslGDp-KXBgldgRsp4Yh6NMT1ct_bG2jt_1Ek05WHaQgC.SG9XdWKyjnNCv24e

Passcode: cqW$3?Xm

[IMPORTANT]

The following short questionnaire is meant to pre-assess whether your background in ML and NLP suffices for the legal data analysis lab, and to give us information for ranking applicants, since we plan to offer only 12 slots. If you are interested in the lab and would like to match with us, please fill out the form. If you are reasonably proficient in Python, have some practical ML experience (e.g., from implementing a classifier), and can answer the "how familiar are you with" questions positively, you should be able to complete the lab successfully.

Questionnaire Link 

Content Outline

The analysis of legal data/text and the design and development of systems that provide valuable functionality to legal practitioners pose various challenges. These include noisy raw data that must be carefully preprocessed, ill-defined tasks for which only small datasets exist and for which learning supervision and evaluation are difficult to obtain, and domain-specific information of various kinds that must be taken into account at many stages of the process.

This lab course provides students with an opportunity to gain practical experience in working with legal data in small teams. The instructors will be offering projects centered around a research question/hypothesis. They will typically involve one or more datasets from a legal domain, one or more formal tasks, and one or more methods to be tried. Over the course of the semester, teams will develop an experimental system/prototype and evaluate it, thereby producing new insight about that hypothesis.

After an initial introduction to the legal informatics topic, students will be matched into teams and assigned projects. Teams will meet with their project mentors regularly to present work updates, discuss progress, and define action items. At the end of each of three milestone intervals, teams will present their progress to the whole cohort and discuss all projects with their peers.

Learning Outcomes

After completing this module, students will have gained practice in planning, implementing, and evaluating a legal data science/informatics project. In particular, they will have gained experience in:

formulating an experimental hypothesis
identifying characteristics of data from the legal domain and explaining how they influence technical aspects of project work
conducting a targeted prior work survey in the legal informatics literature for a given project context
designing an experimental system towards producing insight from data and/or developing new functionality of interest
conducting model evaluation and behavior analysis

Requirements

Students must have experience in machine learning and, ideally, natural language processing. They should have taken the following courses or be sufficiently proficient in the topics and methods they cover:

IN2332: Statistical Modeling and Machine Learning
IN2062: Grundlagen der künstlichen Intelligenz / Foundations of Artificial Intelligence
IN2361: Natural Language Processing
IN2395: Legal Data Science & Informatics

If a student has not taken IN2395, it is expected that they familiarize themselves with background materials relevant to their respective project.

References

2022

  1. Extractive Summarization of Legal Decisions using Multi-task Learning and Maximal Marginal Relevance
    A. Agarwal, Shanshan Xu, and Matthias Grabmair
    In Findings of EMNLP, 2022
  2. Attack on Unfair ToS Clause Detection: A Case Study using Universal Adversarial Triggers
    Shanshan Xu, I. Broda, Rashid Haddad, M. Negrini, and Matthias Grabmair
    In Natural Legal Language Processing (NLLP), 2022