Latest papers

This page requires expansion!
This page needs papers! Probably should set up an automated system so I can just drop in Twitter and arXiv links.
In Robowaifu.tech, papers hold onto you.

This page serves to collect notable research papers related to robotics and artificial intelligence, particularly papers that hobbyists with minimal resources can put to use in creating robowaifus. Feel free to add new papers to the list and discuss any of them on the talk page. Papers posted on /robowaifu/ will also eventually appear here.

Search sites

Social media sources

List of papers

Unsorted

These papers need to be summarized into tl;drs. An automated system for this would be great; see the sketch below for one possible starting point.
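
A minimal sketch of such a system, assuming only the Python standard library and the public arXiv export API: given an arXiv ID, it pulls the title and abstract, which can then be turned into a tl;dr by hand or fed to a summarizer. The function name and output layout are illustrative, not a finished system.

```python
# Fetch the title and abstract for an arXiv ID via the public export API.
import urllib.request
import xml.etree.ElementTree as ET

ATOM = "{http://www.w3.org/2005/Atom}"  # Atom XML namespace used by the API

def fetch_arxiv_metadata(arxiv_id: str) -> dict:
    """Return the title and abstract for a single arXiv ID."""
    url = f"http://export.arxiv.org/api/query?id_list={arxiv_id}"
    with urllib.request.urlopen(url) as response:
        feed = ET.fromstring(response.read())
    entry = feed.find(f"{ATOM}entry")
    return {
        "id": arxiv_id,
        "title": " ".join(entry.find(f"{ATOM}title").text.split()),
        "abstract": " ".join(entry.find(f"{ATOM}summary").text.split()),
    }

if __name__ == "__main__":
    meta = fetch_arxiv_metadata("2304.12244")
    print(meta["title"])
    print(meta["abstract"][:300], "...")
```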

June 2023

May 2023

April 2023

March 2023

February 2023

January 2023

December 2022

November 2022

October 2022

September 2022

August 2022

July 2022

June 2022

May 2022

March 2022

February 2022

January 2022

December 2021

November 2021

October 2021

September 2021

August 2021

June 2021

May 2021

April 2021

March 2021

February 2021

January 2021

December 2020

November 2020

October 2020

September 2020

August 2020

July 2020

June 2020

May 2020

April 2020

March 2020

February 2020

January 2020

December 2019

November 2019

October 2019

September 2019

July 2019

June 2019

May 2019

April 2019

January 2019

November 2018

October 2018

September 2018

July 2018

June 2018

April 2018

March 2018

February 2018

January 2018

December 2017

November 2017

October 2017

September 2017

August 2017

July 2017

June 2017

May 2017

March 2017

November 2016

October 2016

September 2016

July 2016

June 2016

May 2016

November 2015

June 2015

May 2015

July 2014

December 2012

September 2003

Previously

This page requires tidying up!
This page needs to be completely reformatted. The tl;drs will be changed into title text so you can hover over links to get the gist of them without clicking.

Instruction tuning

Evol-Instruct: Mass-Producing Open-Domain Instruction Data with Varying Levels of Complexity using Large Language Models (arXiv:2304.12244)

tl;dr The paper proposes a method called Evol-Instruct for creating large amounts of instruction data with different levels of complexity using a large language model (LLM) instead of humans. The generated data is used to fine-tune another LLM called WizardLM. Human evaluations show that Evol-Instruct instructions are better than human-created ones, and WizardLM is preferred over OpenAI ChatGPT for complex tasks. The study suggests that fine-tuning LLMs with AI-evolved instructions is a promising approach for improving their performance.[1]
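
The evolution step is easy to prototype. Below is a minimal sketch assuming a generic complete(prompt) callable that stands in for whatever LLM is available; the prompt wording is paraphrased from the paper's description, not its exact templates.

```python
# Sketch of Evol-Instruct: repeatedly ask an LLM to rewrite a seed instruction
# into a harder (in-depth) or different-but-related (in-breadth) variant.
import random

DEEPEN = (
    "Rewrite the following instruction into a more complex version by adding "
    "one extra constraint or requirement. Keep it answerable by a human.\n\n"
    "Instruction: {instruction}\n\nRewritten instruction:"
)
BREADTH = (
    "Write a brand-new instruction in the same domain as the one below, but "
    "rarer and more specific.\n\nInstruction: {instruction}\n\nNew instruction:"
)

def evolve(instruction: str, complete, generations: int = 3) -> list[str]:
    """Return the chain of instructions produced by repeated evolution."""
    chain = [instruction]
    for _ in range(generations):
        template = random.choice([DEEPEN, BREADTH])
        chain.append(complete(template.format(instruction=chain[-1])).strip())
    return chain

# Demo with a stand-in "LLM" so the sketch runs without any API key.
fake_llm = lambda prompt: "(model output for: " + prompt.splitlines()[0] + ")"
for step in evolve("Explain how a stepper motor works.", fake_llm):
    print(step)
```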


2022

November 2022

Large Language Models Are Human-Level Prompt Engineers (arXiv)

tl;dr OpenReview version. Automatic Prompt Engineer (APE) is a method that generates instructions automatically. It builds a pool of generated instruction candidates and evaluates each one by the zero-shot performance of another LLM when following that instruction.[2]
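
The selection step can be sketched in a few lines: score each candidate instruction by the zero-shot accuracy it yields on a small labelled set, then keep the best one. The llm callable, the prompt format, and exact-match scoring below are illustrative assumptions, not the paper's exact setup.

```python
# Sketch of APE-style instruction selection by zero-shot performance.
def score_instruction(instruction, eval_set, llm):
    """Fraction of eval examples the model gets right under this instruction."""
    correct = 0
    for question, answer in eval_set:
        prediction = llm(f"{instruction}\n\nInput: {question}\nOutput:")
        correct += prediction.strip().lower() == answer.strip().lower()
    return correct / len(eval_set)

def select_best(candidates, eval_set, llm):
    scored = [(score_instruction(c, eval_set, llm), c) for c in candidates]
    return max(scored)  # (score, instruction) of the best candidate

# Toy demo with a stand-in model so the sketch runs as-is.
def toy_llm(prompt):
    # Pretend the model only answers correctly under the first instruction.
    if "Add the two numbers" not in prompt:
        return "roses are red"
    question = prompt.split("Input: ")[1].split("\n")[0]
    return {"2 + 2": "4", "3 + 5": "8"}[question]

eval_set = [("2 + 2", "4"), ("3 + 5", "8")]
candidates = ["Add the two numbers.", "Write a poem about the input."]
print(select_best(candidates, eval_set, toy_llm))  # (1.0, 'Add the two numbers.')
```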


2021

PROTIP: You can use sshleifer/distilbart-cnn-12-6 to help with summarizing papers. Check the paper template for usage instructions.
2023 update: Leaving this note here as a relic of how much things have progressed. PROTIP: Use GPT-4.
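
For reference, a minimal sketch of the older tip using the Hugging Face transformers summarization pipeline. The model name is real; the length limits and placeholder text are arbitrary, and a full paper would need to be split into chunks because the model's input length is limited.

```python
# Summarize a chunk of text (e.g. an abstract) with distilbart.
from transformers import pipeline

summarizer = pipeline("summarization", model="sshleifer/distilbart-cnn-12-6")

text = (
    "Paste an abstract or paper section here. The model accepts a limited "
    "number of tokens, so longer papers should be split into chunks and "
    "summarized piece by piece, then optionally summarized again."
)
result = summarizer(text, max_length=60, min_length=15, do_sample=False)
print(result[0]["summary_text"])
```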

August 2021

Computer vision

NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis (arXiv:2108.03880)

tl;dr Multi-view stereo is a core task in 3D computer vision. NeRF methods do not generalize to novel scenes and are slow to train and test. The authors propose to bridge the gap between these two methodologies with a novel network that can recover 3D scene geometry as a distance function.[3]


Simulation

iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks (arXiv:2108.03272)

tl;dr iGibson 2.0 is a novel simulation environment using Bullet that supports the simulation of a more diverse set of household tasks through three key innovations. First, it supports object states, including temperature, wetness level, cleanliness level, and toggled and sliced states, necessary to cover a wider range of tasks. Second, it implements a set of predicate logic functions that map the simulator states to logic states like Cooked or Soaked. Third, the simulator can sample valid physical states that satisfy a logic state. This functionality can generate potentially infinite instances of tasks with minimal effort from the users.[4]
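
To make the logic-state idea concrete, here is a toy illustration (not iGibson's actual API): continuous object states are mapped to boolean predicates, and physical states satisfying a chosen predicate are rejection-sampled.

```python
# Toy predicate-logic layer over continuous object states.
import random
from dataclasses import dataclass

@dataclass
class SimObject:
    temperature: float        # degrees Celsius
    wetness: float            # 0..1
    cook_threshold: float = 70.0

def cooked(obj: SimObject) -> bool:
    return obj.temperature >= obj.cook_threshold

def soaked(obj: SimObject) -> bool:
    return obj.wetness >= 0.5

def sample_satisfying(predicate, tries: int = 1000) -> SimObject:
    """Rejection-sample a physical state that satisfies a logic state."""
    for _ in range(tries):
        obj = SimObject(temperature=random.uniform(0, 200), wetness=random.random())
        if predicate(obj):
            return obj
    raise RuntimeError("no satisfying state found")

print(sample_satisfying(cooked))  # e.g. SimObject(temperature=153.2, wetness=0.08, ...)
```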


July 2021

Audio processing

SoundStream: An End-to-End Neural Audio Codec (arXiv:2107.03312)

tl;dr A novel neural audio codec that can efficiently compress speech, music and general audio at bitrates normally targeted by speech-tailored codecs. SoundStream at 3kbps outperforms Opus at 12kbps and approaches EVS at 9.6kbps.[5]


June 2021

Multimodal learning

Multimodal Few-Shot Learning with Frozen Language Models (arXiv:2106.13884)

tl;dr When trained at sufficient scale, auto-regressive language models exhibit the notable ability to learn a new language task after being prompted with just a few examples. Here, the authors present a simple, yet effective, approach for transferring this few-shot learning ability to a multimodal setting (vision and language).[6]


Optimizers

A Generalizable Approach to Learning Optimizers (arXiv:2106.00958)

tl;dr Learning to update optimizer hyperparameters instead of model parameters directly using novel features, actions, and a reward function.[7]


May 2021

Memory

Not All Memories are Created Equal: Learning to Forget by Expiring (arXiv:2105.06548)

tl;dr The authors propose Expire-Span, a method that learns to retain the most important information and expire the irrelevant information, which enables Transformers to scale to attend over tens of thousands of previous timesteps efficiently.[8]
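
A rough sketch of the masking mechanism, assuming PyTorch and toy shapes: each memory gets a predicted span, and attention to memories older than their span is suppressed with a differentiable linear ramp. The tiny span predictor and single attention map here are illustrative only.

```python
# Soft expiration mask over a single self-attention map.
import torch

torch.manual_seed(0)
T, d, max_span, R = 8, 16, 6.0, 2.0         # timesteps, dim, span cap, ramp width

h = torch.randn(T, d)                        # hidden states of past timesteps
span_predictor = torch.nn.Linear(d, 1)
e = max_span * torch.sigmoid(span_predictor(h)).squeeze(-1)  # span per memory

query_time = torch.arange(T).float().unsqueeze(1)   # (T, 1)
memory_time = torch.arange(T).float().unsqueeze(0)  # (1, T)
age = query_time - memory_time                       # age[t, i] = t - i
ramp = ((e.unsqueeze(0) - age) / R).clamp(0.0, 1.0)  # 1 while young, fades to 0
causal = (age >= 0).float()                          # no attending to the future

attn = torch.softmax(h @ h.t() / d ** 0.5, dim=-1) * ramp * causal
attn = attn / attn.sum(dim=-1, keepdim=True).clamp_min(1e-8)  # renormalize
print(attn[-1])   # the last timestep ignores memories past their span
```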


April 2021

Fine-tuning

The Power of Scale for Parameter-Efficient Prompt Tuning (arXiv:2104.08691)

tl;dr In this work, the authors explore "prompt tuning", a simple but effective mechanism for learning "soft prompts" to condition frozen language models to perform specific downstream tasks.[9]
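
A minimal sketch of the idea in PyTorch, with a toy frozen backbone standing in for a real pretrained language model: only the prepended soft-prompt embeddings receive gradients.

```python
# Soft prompt tuning: train a few "virtual token" embeddings, freeze the rest.
import torch
import torch.nn as nn

vocab, d_model, prompt_len, num_classes = 100, 32, 5, 2

# Frozen "pretrained" parts (weights would come from a real checkpoint).
embed = nn.Embedding(vocab, d_model)
backbone = nn.TransformerEncoderLayer(d_model, nhead=4, batch_first=True)
head = nn.Linear(d_model, num_classes)
for module in (embed, backbone, head):
    for p in module.parameters():
        p.requires_grad_(False)

# The only trainable parameters: the soft prompt.
soft_prompt = nn.Parameter(torch.randn(prompt_len, d_model) * 0.02)
optimizer = torch.optim.Adam([soft_prompt], lr=1e-3)

def forward(token_ids):                        # token_ids: (batch, seq)
    x = embed(token_ids)                       # (batch, seq, d_model)
    prompt = soft_prompt.unsqueeze(0).expand(x.size(0), -1, -1)
    x = torch.cat([prompt, x], dim=1)          # prepend virtual tokens
    return head(backbone(x).mean(dim=1))       # pooled classification logits

tokens = torch.randint(0, vocab, (8, 12))
labels = torch.randint(0, num_classes, (8,))
loss = nn.functional.cross_entropy(forward(tokens), labels)
loss.backward()
optimizer.step()
print("trainable parameters:", soft_prompt.numel())
```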


March 2021

Computer vision

NeX: Real-time View Synthesis with Neural Basis Expansion (arXiv:2103.05606)

tl;dr The authors present NeX, a new approach to novel view synthesis based on enhancements of multiplane image (MPI) that can reproduce next-level view-dependent effects -- in real time. The method achieves the best overall scores across all major metrics on these datasets with more than 1000× faster rendering time than the state of the art.[10]


October 2020

Computer vision

GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering (arXiv:2010.04595)

tl;dr A General Radiance Field constructs an internal representation for each 3D point of a scene from 2D inputs and renders the corresponding appearance and geometry of that scene viewed from an arbitrary angle.[11]


September 2020

Summarization

Learning to Summarize with Human Feedback (arXiv:2009.01325)

tl;dr Human feedback models outperform much larger supervised models and reference summaries on the TL;DR dataset.[12]
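
The reward model at the heart of the method is trained with a pairwise comparison loss on human preferences. Below is a toy sketch in PyTorch where a linear layer stands in for a real language model with a scalar head and random vectors stand in for encoded (post, summary) pairs; only the loss is the point.

```python
# Pairwise reward-model loss: prefer the human-chosen summary.
import torch
import torch.nn.functional as F

d = 64
reward_model = torch.nn.Linear(d, 1)          # stand-in: features -> scalar reward
optimizer = torch.optim.Adam(reward_model.parameters(), lr=1e-3)

chosen = torch.randn(16, d)                   # features of preferred summaries
rejected = torch.randn(16, d)                 # features of rejected summaries

r_chosen = reward_model(chosen).squeeze(-1)
r_rejected = reward_model(rejected).squeeze(-1)

# loss = -E[log sigmoid(r_chosen - r_rejected)]
loss = -F.logsigmoid(r_chosen - r_rejected).mean()
optimizer.zero_grad()
loss.backward()
optimizer.step()
print(float(loss))
```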


December 2019

Meta-learning

Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data (arXiv:1912.07768)

tl;dr This paper investigates the intriguing question of whether learning algorithms can automatically generate training data, learning environments, and curricula in order to help AI agents rapidly learn. GTNs are deep neural networks that generate data and/or training environments that a learner trains on for a few SGD steps before being tested on a target task. Training then differentiates through the entire learning process via meta-gradients to update the GTN parameters and improve performance on the target task.[13]
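
A very small sketch of the meta-gradient loop on a toy regression task, assuming PyTorch; the generator architecture, learner, task, and step counts are arbitrary stand-ins for the paper's much larger setup.

```python
# GTN-style loop: train a learner on generated data, then backprop the
# learner's real-task loss through its own SGD steps into the generator.
import torch
import torch.nn.functional as F

torch.manual_seed(0)
real_x = torch.randn(64, 1)
real_y = 2.0 * real_x                        # target task: learn y = 2x

generator = torch.nn.Sequential(
    torch.nn.Linear(4, 16), torch.nn.Tanh(), torch.nn.Linear(16, 2))
gen_opt = torch.optim.Adam(generator.parameters(), lr=1e-2)
inner_lr, inner_steps = 0.1, 3

for outer_step in range(200):
    w = torch.zeros(1, requires_grad=True)   # fresh linear learner each time
    params = [w]
    for _ in range(inner_steps):
        synth = generator(torch.randn(32, 4))
        sx, sy = synth[:, :1], synth[:, 1:]
        inner_loss = F.mse_loss(sx * params[0], sy)
        grads = torch.autograd.grad(inner_loss, params, create_graph=True)
        params = [p - inner_lr * g for p, g in zip(params, grads)]
    outer_loss = F.mse_loss(real_x * params[0], real_y)   # evaluate on real data
    gen_opt.zero_grad()
    outer_loss.backward()                    # meta-gradient into the generator
    gen_opt.step()

print("learner weight after training on synthetic data:", float(params[0]))
```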


Older papers

See also

References

  1. Xu et al. Evol-Instruct: Mass-Producing Open-Domain Instruction Data with Varying Levels of Complexity using Large Language Models. arXiv:2304.12244, 2023.
  2. Yongchao Zhou, Andrei Ioan Muresanu, Ziwen Han, Keiran Paster, Silviu Pitis, Harris Chan, Jimmy Ba. Large Language Models Are Human-Level Prompt Engineers. arXiv, 2022.
  3. Radu Alexandru Rosu, Sven Behnke. NeuralMVS: Bridging Multi-View Stereo and Novel View Synthesis. arXiv:2108.03880, 2021.
  4. Chengshu Li, Fei Xia, Roberto Martín-Martín, Michael Lingelbach, Sanjana Srivastava, Bokui Shen, Kent Vainio, Cem Gokmen, Gokul Dharan, Tanish Jain, Andrey Kurenkov, Karen Liu, Hyowon Gweon, Jiajun Wu, Li Fei-Fei, Silvio Savarese. iGibson 2.0: Object-Centric Simulation for Robot Learning of Everyday Household Tasks. arXiv:2108.03272, 2021.
  5. Neil Zeghidour, Alejandro Luebs, Ahmed Omran, Jan Skoglund, Marco Tagliasacchi. SoundStream: An End-to-End Neural Audio Codec. arXiv:2107.03312, 2021.
  6. Maria Tsimpoukelli, Jacob Menick, Serkan Cabi, S. M. Ali Eslami, Oriol Vinyals, Felix Hill. Multimodal Few-Shot Learning with Frozen Language Models. arXiv:2106.13884, 2021.
  7. Diogo Almeida, Clemens Winter, Jie Tang, Wojciech Zaremba. A Generalizable Approach to Learning Optimizers. arXiv:2106.00958, 2021.
  8. Sainbayar Sukhbaatar, Da Ju, Spencer Poff, Stephen Roller, Arthur Szlam, Jason Weston, Angela Fan. Not All Memories are Created Equal: Learning to Forget by Expiring. arXiv:2105.06548, 2021.
  9. Brian Lester, Rami Al-Rfou, Noah Constant. The Power of Scale for Parameter-Efficient Prompt Tuning. arXiv:2104.08691, 2021.
  10. Suttisak Wizadwongsa, Pakkapon Phongthawee, Jiraphon Yenphraphai, Supasorn Suwajanakorn. NeX: Real-time View Synthesis with Neural Basis Expansion. arXiv:2103.05606, 2021.
  11. Alex Trevithick, Bo Yang. GRF: Learning a General Radiance Field for 3D Scene Representation and Rendering. arXiv:2010.04595, 2020.
  12. Nisan Stiennon, Long Ouyang, Jeff Wu, Daniel M. Ziegler, Ryan Lowe, Chelsea Voss, Alec Radford, Dario Amodei, Paul Christiano. Learning to Summarize with Human Feedback. arXiv:2009.01325, 2020.
  13. Felipe Petroski Such, Aditya Rawal, Joel Lehman, Kenneth O. Stanley, Jeff Clune. Generative Teaching Networks: Accelerating Neural Architecture Search by Learning to Generate Synthetic Training Data. arXiv:1912.07768, 2019.