Labs:
Imperial College London, UK
African Institute for Mathematical Sciences (AIMS)
Alibaba Group Inc.
Allen Institute AI
Apple Inc.
Baidu Research
Beihang University
Beijing Institute of Technology
Boston University
Bytedance AI Lab
CAS Center for Excellence in Brain Science and Intelligence Technology
CETC Big Data Research Institute
Carnegie Mellon University
DeepMind
Digital Reasoning Systems
FAIR
- XLM
- XLM-R
- GCNN: Gated CNN
- SpanBERT
- RoBERTa
- BART
- Gated CNN
- ConvS2S
- MUSE
- Unsupervised Machine Translation with Monolingual Data
- Transformer + Noisy Channel
- CMLM Transformer
- Linformer: Linear Transformer
- Luna: Linear Attention Mechanism
- Dr. NMT
- mBART
- CRISS
- mBART-50
- M2M-100
- Wav2Letter
- TDS Conv
- wav2vec
- Convolutional Transformer
- vq-wav2vec
- Acoustic BERT
- mCPC
- wav2vec 2.0
- XLSR
- wav2vec-U
- HuBERT
- XLS-R
- u-HuBERT: A Unified HuBERT
- Dual-decoder Transformer
- Subword Embedding
- LASER
Fudan University
Google AI
Google Brain
Google Research
- True Bilingual Neural Machine Translation
- mBERT: Multilingual BERT
- AMBER
- mT5: Multilingual T5
- nmT5: NMT + Multilingual T5
- Adapter Layers
- ALBERT
- Google's T5
- ETC: Extended Transformer Construction
- Seq2Seq
- GNMT: Google's NMT
- Transformers
- Tagged Back-Translation
- Reformer: Efficient Transformer
- Hinted Back-Translation
- Scaling Transformer
- Multilingual Google's NMT
- cMNMT: Complete MNMT
- T-T: Transformer Transducer
- ContextNet
- Conformer
- Combined Semi-supervised Learning
- Wave-Tacotron
- SLAM: Speech Language Model
- Word2Vec
Huazhong University of Science and Technology
Hugging Face
Institute of Automation
JD Explore Academy
Jacobs University
Jagiellonian University
Johns Hopkins University
LORIA
MIT
Maastricht University
Meta AI
- MuAViC: A Multilingual Audio-Visual Corpus for Robust Speech Recognition and Robust Speech-to-Text Translation
- XLAVS-R: Cross-Lingual Audio-Visual Speech Representation from Efficient Modality Injection
- A Comprehensive Analysis of Human-centric Audio-Visual Learning in Speech: A Survey
- Auto-AVSR
- data2vec
- AV-HuBERT for AVSR
- data2vec 2.0
- Audio-Visual data2vec