Latest papers
Latest revision as of 23:02, 11 July 2023
This page collects notable research papers related to robotics and artificial intelligence, particularly ones that hobbyists with minimal resources can use towards creating robowaifus. Feel free to add new papers to the list and discuss any papers on the talk page. Papers posted on /robowaifu/ will also eventually appear here.
Search sites
- SemanticScholar - AI-powered research tool
- PapersWithCode
- Google Scholar
- arXiv
- YouChat - hit and miss due to frequent hallucinations, but it sometimes finds good papers
- Journal of Machine Learning Research
- HuggingFace Daily Papers
Social media sources
- @_akhaliq
- @abacaj (small language models)
- @DrJimFan (multimodal generalist agents)
- @gordic_aleksa
- @hardmaru
List of papers
Unsorted
June 2023
- Textbooks Are All You Need
- Fast Segment Anything
- Training Transformers with 4-bit Integers
- RoboCat: A Self-Improving Foundation Agent for Robotic Manipulation
- Full Parameter Fine-tuning for Large Language Models with Limited Resources
- Demystifying GPT Self-Repair for Code Generation
- WizardCoder: Empowering Code Large Language Models with Evol-Instruct
- Agile Catching with Whole-Body MPC and Blackbox Policy Learning
- One-for-All: Generalized LoRA for Parameter-Efficient Fine-tuning
- Augmenting Language Models with Long-Term Memory
- Tracking Everything Everywhere All at Once
- INSTRUCTEVAL: Towards Holistic Evaluation of Instruction-Tuned Large Language Models
- LLMZip: Lossless Text Compression using Large Language Models
- Mega-TTS: Zero-Shot Text-to-Speech at Scale with Intrinsic Inductive Bias
- Deductive Verification of Chain-of-Thought Reasoning
- Bytes Are All You Need: Transformers Operating Directly On File Bytes
- Learning to Modulate pre-trained Models in RL
May 2023
- Plan, Eliminate, and Track -- Language Models are Good Teachers for Embodied Agents
- Distilling Step-by-Step! Outperforming Larger Language Models with Less Training Data and Smaller Model Sizes
- Making the Most of What You Have: Adapting Pre-trained Visual Language Models in the Low-data Regime
- Unlimiformer: Long-Range Transformers with Unlimited Length Input
- Learning to Reason and Memorize with Self-Notes
- Meet in the Middle: A New Pre-training Paradigm
- Direct Preference Optimization: Your Language Model is Secretly a Reward Model
- Large Language Models as Tool Makers
- MEGABYTE: Predicting Million-byte Sequences with Multiscale Transformers
- Do Language Models Know When They're Hallucinating References?
- Large Language Models are In-Context Semantic Reasoners rather than Symbolic Reasoners
- ZipIt! Merging Models from Different Tasks without Training
- Impossible Distillation: from Low-Quality Model to High-Quality Dataset & Model for Summarization and Paraphrasing
- Variable Length Embeddings
- TinyStories: How Small Can Language Models Be and Still Speak Coherent English?
April 2023
- DataComp: In search of the next generation of multimodal datasets
- LaMini-LM: A Diverse Herd of Distilled Models from Large-Scale Instructions
- Learning Agile Soccer Skills for a Bipedal Robot with Deep Reinforcement Learning
- Learning Fine-Grained Bimanual Manipulation with Low-Cost Hardware
- Stable and low-precision training for large-scale vision-language models
- WizardLM: Empowering Large Language Models to Follow Complex Instructions
- Boosting Theory-of-Mind Performance in Large Language Models via Prompting
- Speed Is All You Need: On-Device Acceleration of Large Diffusion Models via GPU-Aware Optimizations
- Scaling Transformer to 1M tokens and beyond with RMT
- Can GPT-4 Perform Neural Architecture Search?
- MiniGPT-4: Enhancing Vision-Language Understanding with Advanced Large Language Models
- Synthetic Data from Diffusion Models Improves ImageNet Classification
- LongForm: Optimizing Instruction Tuning for Long Text Generation with Corpus Extraction
- OpenAssistant Conversations -- Democratizing Large Language Model Alignment
- Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved With Text
- Free Dolly: Introducing the World's First Truly Open Instruction-Tuned LLM
March 2023
- LLaMA-Adapter: Efficient Fine-tuning of Language Models with Zero-init Attention
- Sparks of Artificial General Intelligence: Early experiments with GPT-4
- Reflexion: Language Agents with Verbal Reinforcement Learning
- SemDeDup: Data-efficient learning at web-scale through semantic deduplication
February 2023
- Hyena Hierarchy: Towards Larger Convolutional Language Models
- EfficientTTS 2: Variational End-to-End Text-to-Speech Synthesis and Voice Conversion
- LLaMA: Open and Efficient Foundation Language Models
- SpikeGPT: Generative Pre-trained Language Model with Spiking Neural Networks
- Autonomous Restructuring of Asteroids into Rotating Space Stations
- SwitchPrompt: Learning Domain-Specific Gated Soft Prompts for Classification in Low-Resource Domains
- Symbolic Discovery of Optimization Algorithms
- Learning a Fourier Transform for Linear Relative Positional Encodings in Transformers
- Multimodal Chain-of-Thought Reasoning in Language Models
January 2023
- Looped Transformers as Programmable Computers
- Progressive Prompts: Continual Learning for Language Models
- Memory Augmented Large Language Models are Computationally Universal
- Generative Language Models and Automated Influence Operations: Emerging Threats and Potential Mitigations
- Neural Codec Language Models are Zero-Shot Text to Speech Synthesizers
- SparseGPT: Massive Language Models Can Be Accurately Pruned in One-Shot
- Rethinking with Retrieval: Faithful Large Language Model Inference
December 2022
- Interleaving Retrieval with Chain-of-Thought Reasoning for Knowledge-Intensive Multi-Step Questions
- Self-Instruct: Aligning Language Model with Self Generated Instructions
- Unnatural Instructions: Tuning Language Models with (Almost) No Human Labor
- Point-E: A System for Generating 3D Point Clouds from Complex Prompts
- Constitutional AI: Harmlessness from AI Feedback
- Objaverse: A Universe of Annotated 3D Objects
- NeRFEditor: Differentiable Style Decomposition for Full 3D Scene Editing
November 2022
- Improving Multimodal Interactive Agents with Reinforcement Learning from Human Feedback
- SmoothQuant: Accurate and Efficient Post-Training Quantization for Large Language Models
- Token Turing Machines
- English Contrastive Learning Can Learn Universal Cross-lingual Sentence Embeddings
October 2022
- Entailer: Answering Questions with Faithful and Truthful Chains of Reasoning
- Scaling Instruction-Finetuned Language Models
- lo-fi: distributed fine-tuning without communication
- Large Language Models Can Self-Improve
- Mass-Editing Memory in a Transformer
- Interactive Language: Talking to Robots in Real Time
- CLIP also Understands Text: Prompting CLIP for Phrase Understanding
September 2022
- Learning by Distilling Context
- Dynamic Generation of Interpretable Inference Rules in a Neuro-Symbolic Expert System
- Mega: Moving Average Equipped Gated Attention
- The alignment problem from a deep learning perspective
August 2022
- LLM.int8(): 8-bit Matrix Multiplication for Transformers at Scale
- AlexaTM 20B: Few-Shot Learning Using a Large-Scale Multilingual Seq2Seq Model
- Efficient Long-Text Understanding with Short-Text Models
July 2022
- Confident Adaptive Language Modeling
- Inner Monologue: Embodied Reasoning through Planning with Language Models
- Robust and efficient forward, differential, and inverse kinematics using dual quaternions
June 2022
- On-Device Training Under 256KB Memory
- Video PreTraining (VPT): Learning to Act by Watching Unlabeled Online Videos
- Contrastive Learning as Goal-Conditioned Reinforcement Learning
- Beyond the Imitation Game: Quantifying and extrapolating the capabilities of language models
- A Survey on Sentence Embedding Models Performance for Patent Analysis
- DepthShrinker: A New Compression Paradigm Towards Boosting Real-Hardware Efficiency of Compact Neural Networks
May 2022
- FlashAttention: Fast and Memory-Efficient Exact Attention with IO-Awareness
- GIT: A Generative Image-to-text Transformer for Vision and Language
- NaturalProver: Grounded Mathematical Proof Generation with Language Models
- Large Language Models are Zero-Shot Reasoners
- A Generalist Agent
- Relational Triple Extraction: One Step is Enough
- UL2: Unifying Language Learning Paradigms
- Data Distributional Properties Drive Emergent In-Context Learning in Transformers
- CoCa: Contrastive Captioners are Image-Text Foundation Models
March 2022
- In-context Learning and Induction Heads
- CodeGen: An Open Large Language Model for Code with Multi-Turn Program Synthesis
- Training language models to follow instructions with human feedback
- Parameter-efficient Model Adaptation for Vision Transformers
- Training Compute-Optimal Large Language Models
- Block-Recurrent Transformers
- Delta Tuning: A Comprehensive Study of Parameter Efficient Methods for Pre-trained Language Models
- Tensor Programs V: Tuning Large Neural Networks via Zero-Shot Hyperparameter Transfer
February 2022
- Is Neuro-Symbolic AI Meeting its Promise in Natural Language Processing? A Structured Review
- Transformer Quality in Linear Time
- Transformer Memory as a Differentiable Search Index
January 2022
- Can Wikipedia Help Offline Reinforcement Learning?
- Chain-of-Thought Prompting Elicits Reasoning in Large Language Models
- ZeroPrompt: Scaling Prompt-Based Pretraining to 1,000 Tasks Improves Zero-Shot Generalization
December 2021
- A Mathematical Framework for Transformer Circuits
- Self-attention Does Not Need Memory
- MAGMA -- Multimodal Augmentation of Generative Models through Adapter-based Finetuning
- Improving language models by retrieving from trillions of tokens
- Human Parity on CommonsenseQA: Augmenting Self-Attention with External Attention
- Learning to Prompt for Continual Learning
November 2021
- Reason first, then respond: Modular Generation for Knowledge-infused Dialogue
- LAION-400M: Open Dataset of CLIP-Filtered 400 Million Image-Text Pairs
October 2021
- Multitask Prompted Training Enables Zero-Shot Task Generalization
- The Neural Data Router: Adaptive Control Flow in Transformers Improves Systematic Generalization
- Learning with Algorithmic Supervision via Continuous Relaxations
- Powerpropagation: A sparsity inducing weight reparameterisation
- Mastering Atari Games with Limited Data
- PHNNs: Lightweight Neural Networks via Parameterized Hypercomplex Convolutions
September 2021
- RAFT: A Real-World Few-Shot Text Classification Benchmark
- Scale Efficiently: Insights from Pre-training and Fine-tuning Transformers
- Towards Zero-Label Language Learning
- Is Curiosity All You Need? On the Utility of Emergent Behaviours from Curious Exploration
- Exploration in Deep Reinforcement Learning: From Single-Agent to Multiagent Domain
- Finetuned Language Models Are Zero-Shot Learners
- A Survey of Exploration Methods in Reinforcement Learning
August 2021
June 2021
- What Is Consciousness? Artificial Intelligence, Real Intelligence, Quantum Mind, And Qualia
- Multimodal Few-Shot Learning with Frozen Language Models
- LoRA: Low-Rank Adaptation of Large Language Models
- Going Beyond Linear Transformers with Recurrent Fast Weight Programmers
- Compacter: Efficient Low-Rank Hypercomplex Adapter Layers
- Parameter-efficient Multi-task Fine-tuning for Transformers via Shared Hypernetworks
- Efficient Passage Retrieval with Hashing for Open-domain Question Answering
- Conversational Question Answering: A Survey
May 2021
- CogView: Mastering Text-to-Image Generation via Transformers
- Pretrained Language Models for Text Generation: A Survey
- RepMLP: Re-parameterizing Convolutions into Fully-connected Layers for Image Recognition
April 2021
- Emerging Properties in Self-Supervised Vision Transformers
- MDETR -- Modulated Detection for End-to-End Multi-Modal Understanding
- Learning and Planning in Complex Action Spaces
- 1-bit LAMB: Communication Efficient Large-Scale Large-Batch Training with LAMB's Convergence Speed
- Survey on reinforcement learning for language processing
- Fast and Efficient Locomotion via Learned Gait Transitions
- EfficientNetV2: Smaller Models and Faster Training
March 2021
- Replacing Rewards with Examples: Example-Based Policy Search via Recursive Classification
- Detecting Hate Speech with GPT-3
- Improving and Simplifying Pattern Exploiting Training
- Get Your Vitamin C! Robust Fact Verification with Contrastive Evidence
- Learning Transferable Visual Models From Natural Language Supervision
February 2021
- Linear Transformers Are Secretly Fast Weight Programmers
- Beyond Fully-Connected Layers with Quaternions: Parameterization of Hypercomplex Multiplications with Parameters
- Dynamic Neural Networks: A Survey
January 2021
December 2020
- Attention over learned object embeddings enables complex visual reasoning
- AIR-FI: Generating Covert Wi-Fi Signals from Air-Gapped Computers
- Neurosymbolic AI: The 3rd Wave
November 2020
- 3D imaging from multipath temporal echoes
- Using IPA-Based Tacotron for Data Efficient Cross-Lingual Speaker Adaptation and Pronunciation Enhancement
- Is Private Learning Possible with Instance Encoding?
October 2020
- Language Models are Open Knowledge Graphs
- Vokenization: Improving Language Understanding with Contextualized, Visual-Grounded Supervision
September 2020
- It's Not Just Size That Matters: Small Language Models Are Also Few-Shot Learners
- The Hardware Lottery
August 2020
July 2020
June 2020
- Is SGD a Bayesian sampler? Well, almost
- The Limit of the Batch Size
- Hindsight Expectation Maximization for Goal-conditioned Reinforcement Learning
- Linformer: Self-Attention with Linear Complexity
- Predictive Coding Approximates Backprop along Arbitrary Computation Graphs
May 2020
- Language Models are Few-Shot Learners
- Give Me Convenience and Give Her Death: Who Should Decide What Uses of NLP are Appropriate, and on What Basis?
- ViTAA: Visual-Textual Attributes Alignment in Person Search by Natural Language
- Flowtron: an Autoregressive Flow-based Generative Network for Text-to-Speech Synthesis
- Multi-band MelGAN: Faster Waveform Generation for High-Quality Text-to-Speech
April 2020
March 2020
- OmniTact: A Multi-Directional High Resolution Touch Sensor
- ReZero is All You Need: Fast Convergence at Large Depth
February 2020
- Train Large, Then Compress: Rethinking Model Size for Efficient Training and Inference of Transformers
- Backpropamine: training self-modifying neural networks with differentiable neuromodulated plasticity
- Learning to Continually Learn
- A Survey on Knowledge Graphs: Representation, Acquisition and Applications
- SentenceMIM: A Latent Variable Language Model
January 2020
- ProphetNet: Predicting Future N-gram for Sequence-to-Sequence Pre-training
- Towards a Human-like Open-Domain Chatbot
- Exploiting Cloze Questions for Few Shot Text Classification and Natural Language Inference
- Gradient Surgery for Multi-Task Learning
- Unsupervised Audiovisual Synthesis via Exemplar Autoencoders
December 2019
- Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data
- 12-in-1: Multi-Task Vision and Language Representation Learning
- Deep Learning for Symbolic Mathematics
- ATIS + SpiNNaker: a Fully Event-based Visual Tracking Demonstration
November 2019
October 2019
- Learning Disentangled Representations for Recommendation
- Stabilizing Transformers for Reinforcement Learning
- MelGAN: Generative Adversarial Networks for Conditional Waveform Synthesis
- MIM: Mutual Information Machine
- Modeling Human Motion with Quaternion-based Neural Networks
September 2019
- Fine-Tuning Language Models from Human Preferences
- Neural Machine Translation with Byte-Level Subwords
July 2019
- Large Memory Layers with Product Keys
- Better-than-Demonstrator Imitation Learning via Automatically-Ranked Demonstrations
June 2019
May 2019
- EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks
- Combining Experience Replay with Exploration by Random Network Distillation
April 2019
- Improving Differentiable Neural Computers Through Memory Masking, De-allocation, and Link Distribution Sharpness Control
- Large Batch Optimization for Deep Learning: Training BERT in 76 minutes
January 2019
- Go-Explore: a New Approach for Hard-Exploration Problems
- Paired Open-Ended Trailblazer (POET): Endlessly Generating Increasingly Complex and Diverse Learning Environments and Their Solutions
November 2018
- Hierarchical Policy Design for Sample-Efficient Learning of Robot Table Tennis Through Self-Play
- 3D human pose estimation in video with temporal convolutions and semi-supervised training
- Image Chat: Engaging Grounded Conversations
- Towards Empathetic Open-domain Conversation Models: a New Benchmark and Dataset
October 2018
- Exploration by Random Network Distillation
- Dreaming neural networks: forgetting spurious memories and reinforcing pure ones
- BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
September 2018
- HotpotQA: A Dataset for Diverse, Explainable Multi-hop Question Answering
- Neural Approaches to Conversational AI
July 2018
June 2018
April 2018
- Differentiable plasticity: training plastic neural networks with backpropagation
- The Kanerva Machine: A Generative Distributed Memory
March 2018
- A Short Survey On Memory Based Reinforcement Learning
- Universal Sentence Encoder
- Unsupervised Predictive Memory in a Goal-Directed Agent
- World Models
February 2018
January 2018
- Universal Language Model Fine-Tuning with Subword Tokenization for Polish
- Personalizing Dialogue Agents: I have a dog, do you have pets too?
- Innateness, AlphaZero, and Artificial Intelligence
December 2017
- Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions
- CycleGAN, a Master of Steganography
- Mastering Chess and Shogi by Self-Play with a General Reinforcement Learning Algorithm
November 2017
October 2017
September 2017
August 2017
- Fast, Better Training Trick -- Random Gradient
- Intrinsically Motivated Goal Exploration Processes with Automatic Curriculum Learning
July 2017
June 2017
May 2017
- TriviaQA: A Large Scale Distantly Supervised Challenge Dataset for Reading Comprehension
- Curiosity-driven Exploration by Self-supervised Prediction
March 2017
November 2016
- Visual Dialog
- Google's Multilingual Neural Machine Translation System: Enabling Zero-Shot Translation
October 2016
September 2016
- On Large-Batch Training for Deep Learning: Generalization Gap and Sharp Minima
- Learning to learn with backpropagation of Hebbian plasticity
July 2016
June 2016
May 2016
November 2015
June 2015
- The Ubuntu Dialogue Corpus: A Large Dataset for Research in Unstructured Multi-Turn Dialogue Systems
- Teaching Machines to Read and Comprehend
May 2015
July 2014
December 2012
September 2003
Previously
Instruction tuning
Evol-Instruct: Mass-Producing Open-Domain Instruction Data with Varying Levels of Complexity using Large Language Models (arXiv:2304.12244)
tl;dr The paper proposes a method called Evol-Instruct for creating large amounts of instruction data with different levels of complexity using a large language model (LLM) instead of humans. The generated data is used to fine-tune another LLM called WizardLM. Human evaluations show that Evol-Instruct instructions are better than human-created ones, and WizardLM is preferred over OpenAI ChatGPT for complex tasks. The study suggests that fine-tuning LLMs with AI-evolved instructions is a promising approach for improving their performance.[1]
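The evolution loop at the heart of Evol-Instruct can be sketched as follows. Here `complete` is a placeholder for any LLM completion call, and the mutation templates are loose paraphrases of the paper's in-depth/in-breadth evolution operations, not its exact prompts:

```python
import random

# In-depth evolution: make an existing instruction more complex.
DEEPEN_TEMPLATES = [
    "Add one more constraint or requirement to this instruction:\n{instr}",
    "Rewrite this instruction so it requires multi-step reasoning:\n{instr}",
]
# In-breadth evolution: branch into a new instruction in the same domain.
BREADTH_TEMPLATE = "Write a new instruction in the same domain as:\n{instr}"

def evolve(seed_instructions, complete, rounds=2, rng=random):
    """Grow an instruction pool by repeatedly mutating it with an LLM.

    `complete` is any callable prompt -> text (e.g. a chat-API wrapper).
    Each round, every instruction in the pool spawns one evolved child.
    """
    pool = list(seed_instructions)
    for _ in range(rounds):
        children = []
        for instr in pool:
            if rng.random() < 0.5:  # in-depth: deepen this instruction
                prompt = rng.choice(DEEPEN_TEMPLATES).format(instr=instr)
            else:                   # in-breadth: create a related instruction
                prompt = BREADTH_TEMPLATE.format(instr=instr)
            children.append(complete(prompt))
        pool.extend(children)
    return pool
```

The resulting pool (after filtering out failed evolutions, which the paper does and this sketch omits) is what WizardLM is fine-tuned on.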
2022
November 2022
Large Language Models Are Human-Level Prompt Engineers (arXiv)
tl;dr OpenReview version. Automatic Prompt Engineer (APE) is a method that generates instructions automatically. It proposes a pool of generated instruction candidates and evaluates each one by the zero-shot performance of another LLM following it.[2]
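The selection step reduces to scoring each candidate instruction on a held-out evaluation set and keeping the best. A minimal sketch, where `score` stands in for whatever metric is computed from a second LLM's zero-shot output (e.g. exact-match accuracy):

```python
def select_instruction(candidates, eval_set, score):
    """Pick the candidate instruction with the best average zero-shot score.

    candidates : list of instruction strings (e.g. LLM-generated).
    eval_set   : held-out task examples.
    score      : callable (instruction, example) -> float, assumed to run the
                 downstream model and measure its output quality.
    """
    def average_score(instr):
        return sum(score(instr, ex) for ex in eval_set) / len(eval_set)
    return max(candidates, key=average_score)
```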
2021
August 2021
Computer vision
NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis (arXiv:2108.03880)
tl;dr Multi-view stereo is a core task in 3D computer vision. NeRF methods do not generalize to novel scenes and are slow to train and test. The authors propose to bridge the gap between these two methodologies with a novel network that can recover 3D scene geometry as a distance function.[3]
Simulation
iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks (arXiv:2108.03272)
tl;dr iGibson 2.0 is a novel simulation environment using Bullet that supports the simulation of a more diverse set of household tasks through three key innovations. First, it supports object states, including temperature, wetness level, cleanliness level, and toggled and sliced states, necessary to cover a wider range of tasks. Second, it implements a set of predicate logic functions that map simulator states to logic states like Cooked or Soaked. Third, the simulator can sample valid physical states that satisfy a logic state. This functionality can generate potentially infinite task instances with minimal effort from the users.[4]
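The predicate-logic layer can be illustrated with a toy state-to-predicate mapping. The field names and thresholds below are illustrative only, not iGibson's actual API or values:

```python
from dataclasses import dataclass

# Illustrative thresholds, not iGibson's actual constants.
COOK_TEMP = 70.0   # an object counts as cooked past this peak temperature
SOAK_LEVEL = 0.5   # an object counts as soaked past this wetness level

@dataclass
class ObjectState:
    max_temperature: float  # highest temperature the object has reached
    wetness: float          # current wetness level in [0, 1]

def is_cooked(s: ObjectState) -> bool:
    """Predicate mapping continuous simulator state to the Cooked logic state."""
    return s.max_temperature >= COOK_TEMP

def is_soaked(s: ObjectState) -> bool:
    """Predicate mapping continuous simulator state to the Soaked logic state."""
    return s.wetness >= SOAK_LEVEL
```

Sampling works in the reverse direction: given a target predicate such as Cooked, the simulator draws a physical state (here, any `max_temperature >= COOK_TEMP`) that satisfies it.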
July 2021
Audio processing
SoundStream: An End-to-End Neural Audio Codec (arXiv:2107.03312)
tl;dr A novel neural audio codec that can efficiently compress speech, music and general audio at bitrates normally targeted by speech-tailored codecs. SoundStream at 3kbps outperforms Opus at 12kbps and approaches EVS at 9.6kbps.[5]
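SoundStream's quantizer is a residual vector quantizer: each stage quantizes whatever residual the previous stage left behind, so bitrate scales with the number of stages. A minimal NumPy sketch of the encoding step (real codebooks are learned; these are just arrays):

```python
import numpy as np

def rvq_encode(x, codebooks):
    """Residual vector quantization as used in neural codecs like SoundStream.

    x         : (dim,) embedding vector to quantize.
    codebooks : list of (codebook_size, dim) arrays, one per stage.
    Returns the per-stage code indices and the quantized reconstruction.
    """
    residual = x.copy()
    codes = []
    for cb in codebooks:
        # Nearest codeword to the current residual.
        idx = int(np.argmin(np.sum((cb - residual) ** 2, axis=1)))
        codes.append(idx)
        residual = residual - cb[idx]  # next stage quantizes what is left
    return codes, x - residual  # reconstruction = sum of chosen codewords
```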
June 2021
Multimodal learning
Multimodal Few-Shot Learning with Frozen Language Models (arXiv:2106.13884)
tl;dr When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples. Here, the authors present a simple, yet effective, approach for transferring this few-shot learning ability to a multimodal setting (vision and language).[6]
Optimizers
A Generalizable Approach to Learning Optimizers (arXiv:2106.00958)
tl;dr Learning to update optimizer hyperparameters instead of model parameters directly using novel features, actions, and a reward function.[7]
May 2021
Memory
Not All Memories are Created Equal: Learning to Forget by Expiring (arXiv:2105.06548)
tl;dr The authors propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information, which enables Transformers to scale to attend over tens of thousands of previous timesteps efficiently.[8]
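The core mechanism is a per-memory retention mask: a memory older than its learned span fades out, with a short linear ramp keeping the mask differentiable. A sketch following the paper's clamp(1 + (span - age)/R, 0, 1) form:

```python
def expire_mask(ages, spans, ramp=2.0):
    """Soft retention mask in the style of Expire-Span.

    ages[i]  : how many timesteps ago memory i was written.
    spans[i] : learned expiration span for memory i.
    Returns values in [0, 1]; a memory whose age exceeds its span fades to 0
    over a linear ramp of width `ramp`, so the mask stays differentiable.
    """
    return [min(1.0, max(0.0, (span - age) / ramp + 1.0))
            for age, span in zip(ages, spans)]
```

In the full model this mask multiplies the attention weights, and fully expired memories can be dropped from the cache entirely, which is where the efficiency gain comes from.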
April 2021
Fine-tuning
The Power of Scale for Parameter-Efficient Prompt Tuning (arXiv:2104.08691)
tl;dr In this work, the authors explore "prompt tuning," a simple but effective mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks.[9]
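Mechanically, prompt tuning just prepends a small matrix of trainable vectors to the frozen model's input embeddings; those vectors are the only parameters updated. A shape-level NumPy sketch:

```python
import numpy as np

def prepend_soft_prompt(token_embeddings, soft_prompt):
    """Prompt tuning in one operation: learned prompt vectors are prepended
    to the frozen token embeddings before the transformer runs.

    token_embeddings : (seq_len, d_model) rows from the frozen embedding table.
    soft_prompt      : (prompt_len, d_model) trainable parameters -- the only
                       weights updated during prompt tuning.
    """
    return np.concatenate([soft_prompt, token_embeddings], axis=0)
```

Because only `prompt_len * d_model` parameters are trained per task, many tasks can share one frozen model, which is the paper's main selling point.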
March 2021
Computer vision
NeX: Real-time View Synthesis with Neural Basis Expansion (arXiv:2103.05606)
tl;dr The authors present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce next-level view-dependent effects -- in real time. The method achieves the best overall scores across all major metrics on these datasets with more than 1000× faster rendering time than the state of the art.[10]
October 2020
Computer vision
GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering (arXiv:2010.04595)
tl;dr General Radiance Fields construct an internal representation for each 3D point of a scene from 2D inputs and renders the corresponding appearance and geometry of any 3D scene viewing from an arbitrary angle.[11]
September 2020
Summarization
Learning to Summarize with Human Feedback (arXiv:2009.01325)
tl;dr Human feedback models outperform much larger supervised models and reference summaries on TL;DR.[12]
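The reward model behind this result is trained on human pairwise comparisons with a Bradley-Terry-style loss; a minimal sketch of that loss (the policy optimization against the learned reward is not shown):

```python
import math

def preference_loss(r_chosen, r_rejected):
    """Pairwise comparison loss for reward-model training:
    -log sigmoid(r_chosen - r_rejected), which pushes the reward model to
    score the human-preferred summary above the rejected one."""
    margin = r_chosen - r_rejected
    return -math.log(1.0 / (1.0 + math.exp(-margin)))
```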
December 2019
Meta-learning
Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data (arXiv:1912.07768)
tl;dr This paper investigates the intriguing question of whether learning algorithms can automatically generate training data, learning environments, and curricula in order to help AI agents rapidly learn. GTNs are deep neural networks that generate data and/or training environments that a learner trains on for a few SGD steps before being tested on a target task. It then differentiates through the entire learning process via meta-gradients to update the GTN parameters to improve performance on the target task.[13]
Older papers
See also
References
- ↑ Xu et al. Evol-Instruct: Mass-Producing Open-Domain Instruction Data with Varying Levels of Complexity using Large Language Models. arXiv:2304.12244, 2023.
- ↑ Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba. Large Language Models Are Human-Level Prompt Engineers. arXiv, 2022.
- ↑ Radu Alexandru Rosu, Sven Behnke. NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis. arXiv:2108.03880, 2021.
- ↑ Chengshu Li, Fei Xia, Roberto Martín-Martín, Michael Lingelbach, Sanjana Srivastava, Bokui Shen, Kent Vainio, Cem Gokmen, Gokul Dharan, Tanish Jain, Andrey Kurenkov, Karen Liu, Hyowon Gweon, Jiajun Wu, Li Fei-Fei, Silvio Savarese. iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks. arXiv:2108.03272, 2021.
- ↑ Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, Marco Tagliasacchi. SoundStream: An End-to-End Neural Audio Codec. arXiv:2107.03312, 2021.
- ↑ Maria Tsimpoukelli, Jacob Menick, Serkan Cabi, S. M. Ali Eslami, Oriol Vinyals, Felix Hill. Multimodal Few-Shot Learning with Frozen Language Models. arXiv:2106.13884, 2021.
- ↑ Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba. A Generalizable Approach to Learning Optimizers. arXiv:2106.00958, 2021.
- ↑ Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan. Not All Memories are Created Equal: Learning to Forget by Expiring. arXiv:2105.06548, 2021.
- ↑ Brian Lester, Rami Al-Rfou, Noah Constant. The Power of Scale for Parameter-Efficient Prompt Tuning. arXiv:2104.08691, 2021.
- ↑ Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, Supasorn Suwajanakorn. NeX: Real-time View Synthesis with Neural Basis Expansion. arXiv:2103.05606, 2021.
- ↑ Alex Trevithick, Bo Yang. GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering. arXiv:2010.04595, 2020.
- ↑ Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano. Learning to Summarize with Human Feedback. arXiv:2009.01325, 2020.
- ↑ Felipe Petroski Such, Aditya Rawal, Joel Lehman, Kenneth O. Stanley, Jeff Clune. Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data. arXiv:1912.07768, 2019.