AdaMix, a parameter-efficient fine-tuning method, outperforms full model fine-tuning in few-shot NLU tasks across benchmarks like GLUE. Using prompt-based strategies without extra validation or unlabeled data, AdaMix consistently boosts performance with both BERT and RoBERTa encoders, demonstrating stability and efficiency in few-shot scenarios.AdaMix, a parameter-efficient fine-tuning method, outperforms full model fine-tuning in few-shot NLU tasks across benchmarks like GLUE. Using prompt-based strategies without extra validation or unlabeled data, AdaMix consistently boosts performance with both BERT and RoBERTa encoders, demonstrating stability and efficiency in few-shot scenarios.

Smarter AI Training with Few-Shot Natural Language Tasks

2025/10/02 17:00
3 min read
For feedback or concerns regarding this content, please contact us at crypto.news@mexc.com

Abstract and 1. Introduction

  1. Background

    2.1 Mixture-of-Experts

    2.2 Adapters

  2. Mixture-of-Adaptations

    3.1 Routing Policy

    3.2 Consistency regularization

    3.3 Adaptation module merging and 3.4 Adaptation module sharing

    3.5 Connection to Bayesian Neural Networks and Model Ensembling

  3. Experiments

    4.1 Experimental Setup

    4.2 Key Results

    4.3 Ablation Study

  4. Related Work

  5. Conclusions

  6. Limitations

  7. Acknowledgment and References

Appendix

A. Few-shot NLU Datasets B. Ablation Study C. Detailed Results on NLU Tasks D. Hyper-parameter

A Few-shot NLU Datasets

Data. In contrast to the fully supervised setting in the above experiments, we also perform fewshot experiments following the prior study (Wang et al., 2021) on six tasks including MNLI (Williams et al., 2018), RTE (Dagan et al., 2005; Bar Haim et al., 2006; Giampiccolo et al., 2007; Bentivogli et al., 2009), QQP[1] and SST-2 (Socher et al.). The results are reported on their development set following (Zhang et al., 2021). MPQA (Wiebe et al., 2005) and Subj (Pang and Lee, 2004) are used for polarity and subjectivity detection, where we follow (Gao et al., 2021) to keep 2, 000 examples for testing. The few-shot model only has access to |K| labeled samples for any task. Following true few-shot learning setting (Perez et al., 2021; Wang et al., 2021), we do not use any additional validation set for any hyper-parameter tuning or early stopping. The performance of each model is reported after fixed number of training epochs. For a fair comparison, we use the same set of few-shot labeled instances for training as in (Wang et al., 2021). We train each model with 5 different seeds and report average performance with standard deviation across the runs. In the few-shot experiments, we follow (Wang et al., 2021) to train AdaMix via the prompt-based fine-tuning strategy. In contrast to (Wang et al., 2021), we do not use any unlabeled data.

\

B Ablation Study

\ Table 11: Ablation study demonstrating the impact of parameter sharing in AdaMix adapter framework.

\

C Detailed Results on NLU Tasks

The results on NLU tasks are included in Table 1 and Table 13. The performance AdaMix with RoBERTa-large encoder achieves the best performance in terms of different task metrics in the GLUE benchmark. AdaMix with adapters is the

\ \ Table 12: Varying the bottleneck dimension of adapters in AdaMix with BERT-base and RoBERTa-large encoder. * denotes the bottleneck dimension used in AdaMix with adapters.

\ \ only PEFT method which outperforms full model fine-tuning on all the tasks and on average score. Additionally, the improvement brought by AdaMix is more significant with BERT-base as the encoder, demonstrating 2.2% and 1.2% improvement over the performance of full model fine-tuning and the best performing baseline UNIPELT with BERTbase. The improvement is observed to be consistent as that with RoBERTa-large on every task. The NLG results are included in Table 4 and 5.

D Hyper-parameter

Detailed hyper-parameter configuration for different tasks presented in Table 15 and Table 16.

\

:::info Authors:

(1) Yaqing Wang, Purdue University (wang5075@purdue.edu);

(2) Sahaj Agarwal, Microsoft (sahagar@microsoft.com);

(3) Subhabrata Mukherjee, Microsoft Research (submukhe@microsoft.com);

(4) Xiaodong Liu, Microsoft Research (xiaodl@microsoft.com);

(5) Jing Gao, Purdue University (jinggao@purdue.edu);

(6) Ahmed Hassan Awadallah, Microsoft Research (hassanam@microsoft.com);

(7) Jianfeng Gao, Microsoft Research (jfgao@microsoft.com).

:::


:::info This paper is available on arxiv under CC BY 4.0 DEED license.

:::

[1] https://www.quora.com/q/quoradata/

Market Opportunity
null Logo
null Price(null)
--
----
USD
null (null) Live Price Chart
Disclaimer: The articles reposted on this site are sourced from public platforms and are provided for informational purposes only. They do not necessarily reflect the views of MEXC. All rights remain with the original authors. If you believe any content infringes on third-party rights, please contact crypto.news@mexc.com for removal. MEXC makes no guarantees regarding the accuracy, completeness, or timeliness of the content and is not responsible for any actions taken based on the information provided. The content does not constitute financial, legal, or other professional advice, nor should it be considered a recommendation or endorsement by MEXC.
Tags:

You May Also Like

Is Doge Losing Steam As Traders Choose Pepeto For The Best Crypto Investment?

Is Doge Losing Steam As Traders Choose Pepeto For The Best Crypto Investment?

The post Is Doge Losing Steam As Traders Choose Pepeto For The Best Crypto Investment? appeared on BitcoinEthereumNews.com. Crypto News 17 September 2025 | 17:39 Is dogecoin really fading? As traders hunt the best crypto to buy now and weigh 2025 picks, Dogecoin (DOGE) still owns the meme coin spotlight, yet upside looks capped, today’s Dogecoin price prediction says as much. Attention is shifting to projects that blend culture with real on-chain tools. Buyers searching “best crypto to buy now” want shipped products, audits, and transparent tokenomics. That frames the true matchup: dogecoin vs. Pepeto. Enter Pepeto (PEPETO), an Ethereum-based memecoin with working rails: PepetoSwap, a zero-fee DEX, plus Pepeto Bridge for smooth cross-chain moves. By fusing story with tools people can use now, and speaking directly to crypto presale 2025 demand, Pepeto puts utility, clarity, and distribution in front. In a market where legacy meme coin leaders risk drifting on sentiment, Pepeto’s execution gives it a real seat in the “best crypto to buy now” debate. First, a quick look at why dogecoin may be losing altitude. Dogecoin Price Prediction: Is Doge Really Fading? Remember when dogecoin made crypto feel simple? In 2013, DOGE turned a meme into money and a loose forum into a movement. A decade on, the nonstop momentum has cooled; the backdrop is different, and the market is far more selective. With DOGE circling ~$0.268, the tape reads bearish-to-neutral for the next few weeks: hold the $0.26 shelf on daily closes and expect choppy range-trading toward $0.29–$0.30 where rallies keep stalling; lose $0.26 decisively and momentum often bleeds into $0.245 with risk of a deeper probe toward $0.22–$0.21; reclaim $0.30 on a clean daily close and the downside bias is likely neutralized, opening room for a squeeze into the low-$0.30s. Source: CoinMarketcap / TradingView Beyond the dogecoin price prediction, DOGE still centers on payments and lacks native smart contracts; ZK-proof verification is proposed,…
Share
BitcoinEthereumNews2025/09/18 00:14
U.S. Futures Fall And Betting Odds Rise As Government Shutdown Appears Imminent

U.S. Futures Fall And Betting Odds Rise As Government Shutdown Appears Imminent

The post U.S. Futures Fall And Betting Odds Rise As Government Shutdown Appears Imminent appeared on BitcoinEthereumNews.com. Topline U.S. stock futures fell early on Tuesday after a meeting of Congressional leaders from both parties and President Donald Trump failed to reach a deal on legislation to keep the government funded ahead of Wednesday’s deadline for a government shutdown. Vice President J.D. Vance, accompanied by House Speaker Mike Johnson (R-LA), Senate Majority Leader John Thune (R-SD), and Office of Management and Budget Director Russ Vought, is seen at a press conference following a meeting between President Trump and Congressional Democratic leaders. Anadolu via Getty Images Key Facts Dow Futures dropped 0.22% to 46,518 points in premarket trading early on Tuesday, while the benchmark S&P 500 Futures fell 0.15% to 6,703.50 points. The tech-focused Nasdaq Futures also fell 0.12% to 24,806.75 points. The Bureau of Labor Statistics— which produces monthly nonfarm jobs payroll data and is scheduled to do so on Friday—has warned it will suspend all operations if a shutdown occurs, in a move that could further raise concerns about the health of the job market. In addition to this, the White House budget office has signaled it could use a shutdown to carry out mass firings across several government agencies. What Do The Betting Markets Say About The Odds Of A Shutdown? Bettors believe the odds of a government shutdown have increased significantly after congressional leaders from both parties met with Trump at the White House on Monday but failed to reach a deal. Bookmakers on the crypto betting platform Polymarket now believe there is an 83% chance of a U.S. government shutdown in 2025 and a 79% chance of a shutdown by Wednesday. Both numbers have seen a significant spike in the past 24 hours, rising by around 11 percentage points. Bettors on Kalshi also believe there is a 77% chance of a U.S. government shutdown…
Share
BitcoinEthereumNews2025/09/30 21:54
Uniswap wins again in ‘scam token’ lawsuit

Uniswap wins again in ‘scam token’ lawsuit

Uniswap keeps winning in court. Illustration: Andrés Tapia; Source: Shutterstock.
Share
DL News2026/03/04 01:11