Prefix Generator for Low-resource Event Extraction, PlusLab @ UCLA
- Incorporated useful external information, such as syntax trees, into generative models, which are attractive for event extraction because of their flexibility and efficiency.
- Proposed Prefix Generator, which encodes external information (e.g., Abstract Meaning Representation graphs and robust Optimus representations) through pre-training and maps the resulting representations into prefixes for encoder-decoder models.
- Our method proved to be generally applicable and especially effective in low-resource settings.
- This work will be submitted to ACL 2023.
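The prefix-mapping idea above can be illustrated with a minimal sketch: a learned projection turns a pooled external representation (e.g., an AMR graph encoding) into a fixed number of virtual-token vectors that can be prepended to an encoder-decoder model's hidden states. All names, shapes, and the use of a single linear projection are illustrative assumptions, not the actual implementation.

```python
import numpy as np

def external_repr_to_prefix(ext_repr, proj, prefix_len, hidden_dim):
    """Map a pooled external representation into `prefix_len` virtual
    token embeddings of size `hidden_dim` (hypothetical sketch)."""
    flat = ext_repr @ proj               # (prefix_len * hidden_dim,)
    return flat.reshape(prefix_len, hidden_dim)

rng = np.random.default_rng(0)
ext_dim, prefix_len, hidden_dim = 64, 4, 16
ext_repr = rng.standard_normal(ext_dim)                      # pooled graph encoding
proj = rng.standard_normal((ext_dim, prefix_len * hidden_dim))  # learned in practice
prefix = external_repr_to_prefix(ext_repr, proj, prefix_len, hidden_dim)
print(prefix.shape)  # (4, 16)
```

In a real system the projection would be trained jointly with (or ahead of) the downstream model, and the resulting prefix vectors would typically be injected as extra key/value states in each attention layer rather than as raw input embeddings.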
Domain Relabeling for Subpopulation Shift, IRIS Lab @ Stanford
- Analyzed spurious correlations caused by poor-quality domain labels in current approaches to addressing subpopulation shifts between training and test distributions.
- Proposed integrating candidate features from datasets’ metadata tables to obtain higher-quality domain labels.
- Built a reinforcement learning framework that uses downstream-task performance feedback for each metadata feature to iteratively optimize the domain labels.
- Our method improves worst-group performance on real-world datasets from fields such as healthcare and weather forecasting.
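The feedback loop above can be sketched as a toy epsilon-greedy bandit: each candidate metadata feature is tried as the domain label, and its running value estimate is updated from the worst-group accuracy it yields downstream. The feature names, the bandit formulation, and the feedback function are all hypothetical simplifications of the framework described.

```python
import random

def select_domain_feature(features, worst_group_acc, iters=200, eps=0.2, seed=0):
    """Epsilon-greedy selection of the metadata feature whose use as a
    domain label gives the best worst-group accuracy (toy sketch)."""
    rng = random.Random(seed)
    value = {f: 0.0 for f in features}   # running mean reward per feature
    count = {f: 0 for f in features}
    for _ in range(iters):
        if rng.random() < eps:
            f = rng.choice(features)                      # explore
        else:
            f = max(features, key=lambda x: value[x])     # exploit
        r = worst_group_acc(f)   # feedback: downstream worst-group accuracy
        count[f] += 1
        value[f] += (r - value[f]) / count[f]             # incremental mean
    return max(features, key=lambda x: value[x])

# illustrative, made-up feedback values for candidate metadata features
feedback = {"hospital": 0.72, "scanner_type": 0.55, "patient_age": 0.60}
best = select_domain_feature(list(feedback), lambda f: feedback[f])
print(best)  # 'hospital'
```

In the real setting the reward would come from retraining or fine-tuning a group-robust model under each candidate labeling, which is why an iterative, sample-efficient search over features is useful.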
Reviewing Test Protocols of Distantly Supervised Relation Extraction, THUNLP @ Tsinghua
- Examined two popular relation extraction datasets (NYT10 and Wiki20) for annotation errors introduced by the distant supervision used to build them.
- Proposed an improved relation ontology, performed data cleaning, and constructed manually annotated test sets for NYT10 and Wiki20, correcting 53% of the wrong labels in NYT10.
- Analyzed performance differences of competitive models on manually-annotated and distantly supervised datasets.
- Our conclusions underscore the importance of more accurate evaluation in relation extraction research.
- Paper “Manual Evaluation Matters: Reviewing Test Protocols of Distantly Supervised Relation Extraction” published in Findings of ACL 2021.
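The kind of audit described above boils down to measuring how often distantly supervised labels disagree with manual annotations. A minimal sketch, with made-up toy labels rather than real dataset contents:

```python
def label_error_rate(distant, manual):
    """Fraction of instances whose distantly supervised label disagrees
    with the manual annotation (toy illustration of the audit)."""
    assert len(distant) == len(manual)
    wrong = sum(d != m for d, m in zip(distant, manual))
    return wrong / len(distant)

# toy example: two of four distant labels disagree with manual annotation
distant = ["/people/person/place_of_birth", "NA", "/location/contains", "NA"]
manual  = ["/people/person/place_of_birth", "/location/contains", "NA", "NA"]
print(label_error_rate(distant, manual))  # 0.5
```

Running this disagreement measure per relation type is one way to expose where distant supervision heuristics (e.g., entity-pair matching against a knowledge base) systematically mislabel, which motivates evaluating models on manually annotated test sets.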