Machine Learning - Learning/Language Models

0 readers

1 users here now

Discussion of models, thier use, setup and options.

Please include models used with your outputs, workflows optional.

Model Catalog

We follow Lemmy’s code of conduct.

Communities

Useful links

founded 1 year ago

MODERATORS

[email protected]

OpenChat_8192 - The first model to beat 100% of ChatGPT-3.5 (lemmy.intai.tech)

submitted 1 year ago* (last edited 1 year ago) by [email protected] to c/[email protected]

0 comments fedilink hide all child comments

Models

Datasets

openchat_sharegpt4_dataset

Repos

openchat

Related Papers

Credit:

Archive:

@Yampeleg The first model to beat 100% of ChatGPT-3.5 Available on Huggingface

🔥 OpenChat_8192

🔥 105.7% of ChatGPT (Vicuna GPT-4 Benchmark)

Less than a month ago the world witnessed as ORCA [1] became the first model to ever outpace ChatGPT on Vicuna's benchmark.

Today, the race to replicate these results open-source comes to an end.

Minutes ago OpenChat scored 105.7% of ChatGPT.

But wait! There is more!

Not only OpenChat beated Vicuna's benchmark, it did so pulling off a LIMA [2] move!

Training was done using 6K GPT-4 conversations out of the ~90K ShareGPT conversations.

The model comes in three versions: the basic OpenChat model, OpenChat-8192 and OpenCoderPlus (Code generation: 102.5% ChatGPT)

This is a significant achievement considering that it's the first (released) open-source model to surpass the Vicuna benchmark. 🎉🎉

OpenChat: https://huggingface.co/openchat/openchat
OpenChat_8192: https://huggingface.co/openchat/openchat_8192 (best chat)
OpenCoderPlus: https://huggingface.co/openchat/opencoderplus (best coder)
Dataset: https://huggingface.co/datasets/openchat/openchat_sharegpt4_dataset
Code: https://github.com/imoneoi/openchat

Congratulations to the authors!!

[1] - Orca: The first model to cross 100% of ChatGPT: https://arxiv.org/pdf/2306.02707.pdf [2] - LIMA: Less Is More for Alignment - TL;DR: Using small number of VERY high quality samples (1000 in the paper) can be as powerful as much larger datasets: https://arxiv.org/pdf/2305.11206

no comments (yet)

sorted by: hot top controversial new old

there doesn't seem to be anything here