Pretraining Influences Language Transfer

This is a Bangkok news story, published by ACL Anthology, that relates primarily to NLP news.

ACL Anthology

Super donors and super recipients: Studying cross-lingual transfer between high-resource and low-resource languages

Summary

Nutrition label

77% Informative

Despite the increasing popularity of multilingualism within the NLP community, numerous languages continue to be underrepresented due to the lack of available resources.

Our findings surprisingly reveal that the optimal language pairs with improved performance do not necessarily align with direct linguistic motivations, with subtoken overlap playing a more crucial role.
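
As a rough illustration of what "subtoken overlap" between two languages could mean, the sketch below computes a Jaccard overlap between the sets of subword pieces seen in two text samples. This is only a sketch under stated assumptions: the summary does not say how the paper measures overlap, `toy_subtokens` is a hypothetical stand-in (character n-grams) for a real subword tokenizer, and `subtoken_overlap` is an illustrative name rather than anything taken from the paper.

```python
def toy_subtokens(text, n=3):
    """Hypothetical stand-in for a subword tokenizer.

    Splits each whitespace-separated word into character n-grams.
    A real study would use the pretrained model's own tokenizer instead.
    """
    pieces = set()
    for word in text.lower().split():
        if len(word) <= n:
            pieces.add(word)
        else:
            pieces.update(word[i:i + n] for i in range(len(word) - n + 1))
    return pieces


def subtoken_overlap(text_a, text_b, n=3):
    """Jaccard overlap (assumed metric) between two samples' subtoken sets."""
    a = toy_subtokens(text_a, n)
    b = toy_subtokens(text_b, n)
    if not a and not b:
        return 0.0  # no subtokens on either side: define overlap as zero
    return len(a & b) / len(a | b)
```

A score near 1 means the two samples share most of their subword pieces, and a score near 0 means they share almost none. Under this reading, two unrelated languages written in the same script could still score higher than two related languages written in different scripts.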

Specific languages tend to be almost universally beneficial for pretraining (super donors), while others benefit from pretraining with almost any language (super recipients).
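
One way to make the donor and recipient idea concrete is to look at a table of per-pair improvements. The sketch below is a minimal illustration, not the paper's procedure: the `gains` table, the "improvement greater than zero" criterion, and the `threshold` fraction are all assumptions introduced here, since the summary does not state how "almost universally" is defined.

```python
def find_super_languages(gains, threshold=0.9):
    """Return (super_donors, super_recipients) from a table of gains.

    gains[donor][recipient] is a hypothetical improvement score for the
    recipient (low-resource) language after continued pretraining on the
    donor (high-resource) language. A positive value counts as "helped".
    threshold is an assumed cutoff for "almost universally".
    """
    donors = sorted(gains)
    recipients = sorted({r for row in gains.values() for r in row})
    if not donors or not recipients:
        return [], []

    def helped(d, r):
        # Missing pairs are treated as "not helped".
        return gains[d].get(r, 0.0) > 0

    super_donors = [
        d for d in donors
        if sum(helped(d, r) for r in recipients) / len(recipients) >= threshold
    ]
    super_recipients = [
        r for r in recipients
        if sum(helped(d, r) for d in donors) / len(donors) >= threshold
    ]
    return super_donors, super_recipients


# Toy data with made-up language labels and scores, for illustration only.
toy_gains = {
    "A": {"x": 1.0, "y": 2.0},
    "B": {"x": 0.5, "y": -1.0},
}
toy_result = find_super_languages(toy_gains, threshold=1.0)
```

In the toy table, donor "A" helps every recipient and recipient "x" is helped by every donor, so they are the only entries that pass a threshold of 1.0.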

The Seventh Workshop on Technologies for Machine Translation of Low-Resource Languages (LoResMT 2024) is being held in Bangkok, Thailand.

The work addresses the gap between 158 high-resource (HR) languages and 31 low-resource (LR) languages. Across 158 × 31 HR-LR language pairs, we investigate how continued pretraining on different languages affects the pretrained model.

VR Score

Informative language

Neutral language

Article tone: formal

Language: English

Language complexity

Offensive language: not offensive

Hate speech: not hateful