this post was submitted on 01 Jul 2023
1 points (100.0% liked)

Machine Learning - Learning/Language Models

0 readers
1 users here now

Discussion of models, thier use, setup and options.

Please include models used with your outputs, workflows optional.

Model Catalog

We follow Lemmy’s code of conduct.

Communities

Useful links

founded 1 year ago
MODERATORS
 

Models

Datasets

Repos

Related Papers

Credit:

Tweet

Archive:

@Yampeleg The first model to beat 100% of ChatGPT-3.5 Available on Huggingface

πŸ”₯ OpenChat_8192

πŸ”₯ 105.7% of ChatGPT (Vicuna GPT-4 Benchmark)

Less than a month ago the world witnessed as ORCA [1] became the first model to ever outpace ChatGPT on Vicuna's benchmark.

Today, the race to replicate these results open-source comes to an end.

Minutes ago OpenChat scored 105.7% of ChatGPT.

But wait! There is more!

Not only OpenChat beated Vicuna's benchmark, it did so pulling off a LIMA [2] move!

Training was done using 6K GPT-4 conversations out of the ~90K ShareGPT conversations.

The model comes in three versions: the basic OpenChat model, OpenChat-8192 and OpenCoderPlus (Code generation: 102.5% ChatGPT)

This is a significant achievement considering that it's the first (released) open-source model to surpass the Vicuna benchmark. πŸŽ‰πŸŽ‰

Congratulations to the authors!!


[1] - Orca: The first model to cross 100% of ChatGPT: https://arxiv.org/pdf/2306.02707.pdf [2] - LIMA: Less Is More for Alignment - TL;DR: Using small number of VERY high quality samples (1000 in the paper) can be as powerful as much larger datasets: https://arxiv.org/pdf/2305.11206

no comments (yet)
sorted by: hot top controversial new old
there doesn't seem to be anything here