DiffCSE

DiffCSE is an unsupervised contrastive learning framework for learning sentence embeddings from unlabeled data.^[1] It trains a sentence encoder to be sensitive to the difference between an original sentence and an edited version of it. The edited sentence is produced by randomly masking tokens in the original and filling the masked positions with samples from a masked language model. A standard contrastive objective over sentence embeddings is combined with a difference-prediction objective conditioned on the embedding of the original sentence, so that the learned representations capture semantic similarity between sentences while still reflecting such edits. The authors report improvements on semantic textual similarity benchmarks and also evaluate the embeddings on transfer tasks such as text classification.
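
The following is a minimal, dependency-free sketch of the two ingredients described above: building an edited view of a sentence by random masking, and combining a contrastive (InfoNCE-style) loss with a weighted difference-prediction loss. It is an illustration under stated assumptions, not the authors' implementation. The helper names (random_mask, replaced_labels, info_nce, combined_loss), the "[MASK]" placeholder, the temperature of 0.05, and the weight lam are all hypothetical choices made for this example; a real system would use a neural encoder and a masked language model to fill the masked positions.

```python
import math
import random

MASK = "[MASK]"  # placeholder; the actual mask symbol depends on the tokenizer


def random_mask(tokens, mask_ratio, rng):
    """Replace each token with MASK independently with probability mask_ratio."""
    return [MASK if rng.random() < mask_ratio else tok for tok in tokens]


def replaced_labels(original, edited):
    """Per-token labels for difference prediction: 1 if the token was changed."""
    return [int(o != e) for o, e in zip(original, edited)]


def cosine(u, v):
    """Cosine similarity between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def info_nce(anchors, positives, temperature=0.05):
    """Mean InfoNCE loss: anchors[i] should match positives[i] and not the
    other positives in the batch, which act as in-batch negatives."""
    total = 0.0
    for i, anchor in enumerate(anchors):
        logits = [cosine(anchor, p) / temperature for p in positives]
        peak = max(logits)  # subtract the max for numerical stability
        log_denominator = peak + math.log(sum(math.exp(x - peak) for x in logits))
        total += log_denominator - logits[i]
    return total / len(anchors)


def combined_loss(contrastive, difference, lam):
    """Weighted sum of the contrastive and difference-prediction losses."""
    return contrastive + lam * difference


if __name__ == "__main__":
    rng = random.Random(0)
    sentence = "the cat sat on the mat".split()
    masked = random_mask(sentence, mask_ratio=0.3, rng=rng)
    # In a full pipeline a masked language model would fill the MASK slots;
    # here the masked sequence itself stands in for the edited sentence.
    print(masked, replaced_labels(sentence, masked))

    # Toy 2-d embeddings for a batch of two sentences and their second views.
    views_a = [[1.0, 0.0], [0.0, 1.0]]
    views_b = [[0.9, 0.1], [0.1, 0.9]]
    print(combined_loss(info_nce(views_a, views_b), difference=0.5, lam=0.005))
```

The contrastive term is small when each embedding is closest to its own second view and large when it is closer to another sentence in the batch; the weight lam controls how strongly the difference-prediction term influences training.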

A RoBERTa model fine-tuned with DiffCSE is available on Hugging Face.

References

[1] Chuang et al. "DiffCSE: Difference-based Contrastive Learning for Sentence Embeddings." 2022. arXiv:2204.10298
