Meta-Learning in Neural Networks: A Survey (original) (raw)
Related papers
Meta-Learning with Task-Adaptive Loss Function for Few-Shot Learning
2021 IEEE/CVF International Conference on Computer Vision (ICCV), 2021
In few-shot learning scenarios, the challenge is to generalize and perform well on new unseen examples when only very few labeled examples are available for each task. Model-agnostic meta-learning (MAML) has gained the popularity as one of the representative few-shot learning methods for its flexibility and applicability to diverse problems. However, MAML and its variants often resort to a simple loss function without any auxiliary loss function or regularization terms that can help achieve better generalization. The problem lies in that each application and task may require different auxiliary loss function, especially when tasks are diverse and distinct. Instead of attempting to hand-design an auxiliary loss function for each application and task, we introduce a new meta-learning framework with a loss function that adapts to each task. Our proposed framework, named Meta-Learning with Task-Adaptive Loss Function (MeTAL), demonstrates the effectiveness and the flexibility across various domains, such as few-shot classification and few-shot regression. C. Semi-Supervised Inner-Loop Optimization Recently, metric-based meta-learning algorithms, such as the method from [15], have attempted to make full use of the unlabeled query set by exploiting its feature similarity with the labeled support set. Under such scenario (a.k.a. transductive setting), many recent metric-based meta-learning algorithms have achieved outstanding performance. On the other hand, the transductive setting or transductive inference is rarely explored among optimization-based learners. Recent few works [2, 9]
A Reweighted Meta Learning Framework for Robust Few Shot Learning
ArXiv, 2020
Model-Agnostic Meta-Learning (MAML) is a popular gradient-based meta-learning framework that tries to find an optimal initialization to minimize the expected loss across all tasks during meta-training. However, it inherently assumes that the contribution of each instance/task to the meta-learner is equal. Therefore, it fails to address the problem of domain differences between base and novel classes in few-shot learning. In this work, we propose a novel and robust meta-learning algorithm, called RW-MAML, which learns to assign weights to training instances or tasks. We consider these weights to be hyper-parameters. Hence, we iteratively optimize the weights using a small set of validation tasks and an online approximation in a \emph{bi-bi-level} optimization framework, in contrast to the standard bi-level optimization in MAML. Therefore, we investigate a practical evaluation setting to demonstrate the scalability of our RW-MAML in two scenarios: (1) out-of-distribution tasks and (2)...
Meta-Learning with Temporal Convolutions
ArXiv, 2017
Deep neural networks excel in regimes with large amounts of data, but tend to struggle when data is scarce or when they need to adapt quickly to changes in the task. Recent work in meta-learning seeks to overcome this shortcoming by training a meta-learner on a distribution of similar tasks; the goal is for the meta-learner to generalize to novel but related tasks by learning a high-level strategy that captures the essence of the problem it is asked to solve. However, most recent approaches to meta-learning are extensively hand-designed, either using architectures that are specialized to a particular application, or hard-coding algorithmic components that tell the meta-learner how to solve the task. We propose a class of simple and generic meta-learner architectures, based on temporal convolutions, that is domain- agnostic and has no particular strategy or algorithm encoded into it. We validate our temporal-convolution-based meta-learner (TCML) through experiments pertaining to both...
Meta-Learning with Adaptive Hyperparameters
ArXiv, 2020
Despite its popularity, several recent works question the effectiveness of MAML when test tasks are different from training tasks, thus suggesting various task-conditioned methodology to improve the initialization. Instead of searching for better task-aware initialization, we focus on a complementary factor in MAML framework, inner-loop optimization (or fast adaptation). Consequently, we propose a new weight update rule that greatly enhances the fast adaptation process. Specifically, we introduce a small meta-network that can adaptively generate per-step hyperparameters: learning rate and weight decay coefficients. The experimental results validate that the Adaptive Learning of hyperparameters for Fast Adaptation (ALFA) is the equally important ingredient that was often neglected in the recent few-shot learning approaches. Surprisingly, fast adaptation from random initialization with ALFA can already outperform MAML.
Fast Few-Shot Classification by Few-Iteration Meta-Learning
2021 IEEE International Conference on Robotics and Automation (ICRA), 2021
Autonomous agents interacting with the real world need to learn new concepts efficiently and reliably. This requires learning in a low-data regime, which is a highly challenging problem. We address this task by introducing a fast optimization-based meta-learning method for few-shot classification. It consists of an embedding network, providing a general representation of the image, and a base learner module. The latter learns a linear classifier during the inference through an unrolled optimization procedure. We design an inner learning objective composed of (i) a robust classification loss on the support set and (ii) an entropy loss, allowing transductive learning from unlabeled query samples. By employing an efficient initialization module and a Steepest Descent based optimization algorithm, our base learner predicts a powerful classifier within only a few iterations. Further, our strategy enables important aspects of the base learner objective to be learned during meta-training. To the best of our knowledge, this work is the first to integrate both induction and transduction into the base learner in an optimization-based meta-learning framework. We perform a comprehensive experimental analysis, demonstrating the speed and effectiveness of our approach on four few-shot classification datasets. The Code is available at https://github.com/4rdhendu/FIML.
Omni-Training: Bridging Pre-Training and Meta-Training for Few-Shot Learning
arXiv (Cornell University), 2021
Few-shot learning aims to fast adapt a deep model from a few examples. While pre-training and meta-training can create deep models powerful for few-shot generalization, we find that pre-training and meta-training focuses respectively on cross-domain transferability and cross-task transferability, which restricts their data efficiency in the entangled settings of domain shift and task shift. We thus propose the Omni-Training framework to seamlessly bridge pre-training and meta-training for data-efficient few-shot learning. Our first contribution is a tri-flow Omni-Net architecture. Besides the joint representation flow, Omni-Net introduces two parallel flows for pre-training and meta-training, responsible for improving domain transferability and task transferability respectively. Omni-Net further coordinates the parallel flows by routing their representations via the joint-flow, enabling knowledge transfer across flows. Our second contribution is the Omni-Loss, which introduces a self-distillation strategy separately on the pre-training and meta-training objectives for boosting knowledge transfer throughout different training stages. Omni-Training is a general framework to accommodate many existing algorithms. Evaluations justify that our single framework consistently and clearly outperforms the individual state-of-the-art methods on both cross-task and cross-domain settings in a variety of classification, regression and reinforcement learning problems.
Few-Shot Classification By Few-Iteration Meta-Learning
2020
Learning in a low-data regime from only a few labeled examples is an important, but challenging problem. Recent advancements within meta-learning have demonstrated encouraging performance, in particular, for the task of few-shot classification. We propose a novel optimization-based meta-learning approach for few-shot classification. It consists of an embedding network, providing a general representation of the image, and a base learner module. The latter learns a linear classifier during the inference through an unrolled optimization procedure. We design an inner learning objective composed of (i) a robust classification loss on the support set and (ii) an entropy loss, allowing transductive learning from unlabeled query samples. By employing an efficient initialization module and a Steepest Descent based optimization algorithm, our base learner predicts a powerful classifier within only a few iterations. Further, our strategy enables important aspects of the base learner objective ...
Stress Testing of Meta-learning Approaches for Few-shot Learning
2021
Meta-learning (ML) has emerged as a promising learning method under resource constraints such as few-shot learning. ML approaches typically propose a methodology to learn generalizable models. In this work-in-progress paper, we put the recent ML approaches to a stress test to discover their limitations. Precisely, we measure the performance of ML approaches for few-shot learning against increasing task complexity. Our results show a quick degradation in the performance of initialization strategies for ML (MAML, TAML, and MetaSGD), while surprisingly, approaches that use an optimization strategy (MetaLSTM) perform significantly better. We further demonstrate the effectiveness of an optimization strategy for ML (MetaLSTM++) trained in a MAML manner over a pure optimization strategy. Our experiments also show that the optimization strategies for ML achieve higher transferability from simple to complex tasks.
Deep Meta-Learning: Learning to Learn in the Concept Space
arXiv (Cornell University), 2018
Few-shot learning remains challenging for metalearning that learns a learning algorithm (metalearner) from many related tasks. In this work, we argue that this is due to the lack of a good representation for meta-learning, and propose deep metalearning to integrate the representation power of deep learning into meta-learning. The framework is composed of three modules, a concept generator, a meta-learner, and a concept discriminator, which are learned jointly. The concept generator, e.g. a deep residual net, extracts a representation for each instance that captures its high-level concept, on which the meta-learner performs fewshot learning, and the concept discriminator recognizes the concepts. By learning to learn in the concept space rather than in the complicated instance space, deep meta-learning can substantially improve vanilla meta-learning, which is demonstrated on various few-shot image recognition problems. For example, on 5-way-1-shot image recognition on CIFAR-100 and CUB-200, it im
Comparing Transfer and Meta Learning Approaches on a Unified Few-Shot Classification Benchmark
Cornell University - arXiv, 2021
Meta and transfer learning are two successful families of approaches to few-shot learning. Despite highly related goals, state-of-the-art advances in each family are measured largely in isolation of each other. As a result of diverging evaluation norms, a direct or thorough comparison of different approaches is challenging. To bridge this gap, we perform a cross-family study of the best transfer and meta learners on both a large-scale meta-learning benchmark (Meta-Dataset, MD), and a transfer learning benchmark (Visual Task Adaptation Benchmark, VTAB). We find that, on average, large-scale transfer methods (Big Transfer, BiT) outperform competing approaches on MD, even when trained only on ImageNet. In contrast, meta-learning approaches struggle to compete on VTAB when trained and validated on MD. However, BiT is not without limitations, and pushing for scale does not improve performance on highly out-of-distribution MD tasks. In performing this study, we reveal a number of discrepancies in evaluation norms and study some of these in light of the performance gap. We hope that this work facilitates sharing of insights from each community, and accelerates progress on fewshot learning.