Such a Dummy Programmer: GPT4ALL - A new LLaMa (Large Language Model)

Thursday, 30 March 2023

GPT4ALL - A new LLaMa (Large Language Model)

Posted 29 March 2023, 11:50. GPT4ALL launched about an hour ago.

What if we used AI-generated prompts and responses to train another AI?
That is exactly the idea behind GPT4ALL. Its authors generated roughly 1 million prompt-response pairs using the GPT-3.5-Turbo OpenAI API between March 20 and March 26, 2023, and used them to fine-tune a large language model (LLaMA).

The prompts for those pairs came from three sources:

1. the LAION OIG dataset on Hugging Face - https://huggingface.co/datasets/laion/OIG
2. randomly selected Stack Overflow questions, for coding knowledge
3. BigScience's bloomz-p3 on Hugging Face - https://huggingface.co/bigscience/bloomz-p3
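
Here is a rough sketch of how prompts from several sources like the ones above could be pooled and turned into prompt-response pairs. Everything in it is illustrative: the sample prompts, the function names `sample_prompts` and `build_pairs`, and the `fake_model` stand-in are all made up for this example, and no real API is called. In the actual project the responses came from GPT-3.5-Turbo.

```python
import random
from typing import Callable, Dict, List

def sample_prompts(sources: Dict[str, List[str]], per_source: int, seed: int = 0) -> List[dict]:
    """Randomly pick up to `per_source` prompts from every source."""
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    picked = []
    for name, prompts in sources.items():
        k = min(per_source, len(prompts))
        for prompt in rng.sample(prompts, k):
            picked.append({"source": name, "prompt": prompt})
    return picked

def build_pairs(prompts: List[dict], ask_model: Callable[[str], str]) -> List[dict]:
    """Ask the model for a response to each prompt and keep non-empty answers."""
    pairs = []
    for item in prompts:
        response = ask_model(item["prompt"])
        if response.strip():  # skip empty responses (a simplifying assumption)
            pairs.append({**item, "response": response})
    return pairs

def fake_model(prompt: str) -> str:
    """Hypothetical stand-in for a real chat API call."""
    return f"(answer to: {prompt})"

# Made-up example prompts; the real datasets are far larger.
SOURCES = {
    "laion_oig": ["Explain photosynthesis.", "Write a haiku about rain.", "What is a prime number?"],
    "stackoverflow": ["How do I reverse a list in Python?", "What is a segfault?", "How does git rebase work?"],
    "bigscience_p3": ["Summarize this paragraph.", "Is this review positive?", "Translate 'hello' to French."],
}

sampled = sample_prompts(SOURCES, per_source=2)
pairs = build_pairs(sampled, fake_model)
for pair in pairs:
    print(pair["source"], "|", pair["prompt"], "->", pair["response"])
```

To try this against a real model, `fake_model` would be swapped for a function that sends the prompt to a chat API and returns the text of the reply.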

RESULTS -
bells and some drumrolls
1. Impressive, accurate responses, because the model is fine-tuned on carefully curated data. It is also light enough that running a LLaMA-based model on a laptop with 8 GB of RAM is possible.

2. According to the original paper, the team was able to produce these models with about four days of work, $800 in GPU costs and $500 in OpenAI API spend. The released model, gpt4all-lora, can be trained in about eight hours on a Lambda Labs DGX A100 8x 80GB for a total cost of $100.
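
To put the cost figures above in perspective, here is a small arithmetic sketch. It only uses the numbers quoted from the paper; the variable names are mine, and since the paper says "about eight hours", the hourly rates are rough estimates rather than actual Lambda Labs prices.

```python
# Figures quoted from the paper (US dollars).
GPU_COST = 800        # GPU spend during development
API_COST = 500        # OpenAI API spend for generating the data
RETRAIN_COST = 100    # cost to train the released gpt4all-lora once
RETRAIN_HOURS = 8     # "about eight hours", so treat results as approximate
GPUS = 8              # DGX A100 8x 80GB

total_dev = GPU_COST + API_COST               # total development spend
node_per_hour = RETRAIN_COST / RETRAIN_HOURS  # implied cost of the whole machine per hour
per_gpu_hour = node_per_hour / GPUS           # implied cost per GPU per hour

print(f"Total development spend: ${total_dev}")
print(f"Implied node cost: ${node_per_hour:.2f} per hour")
print(f"Implied GPU cost: ${per_gpu_hour:.2f} per GPU-hour")
```

That works out to $1300 for the whole development effort, and roughly $12.50 per hour for the 8-GPU machine when re-training the released model.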

Yep, it is that affordable. If someone understands the graphs, please comment.
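
As for the 8 GB laptop claim in the results, a back-of-the-envelope sketch shows why it is plausible. It assumes a 7-billion-parameter model and 4-bit quantized weights; both numbers are assumptions brought in for the arithmetic, not figures stated in this post, and the estimate covers the weights only (no activations or runtime overhead).

```python
def weight_memory_gb(n_params: float, bits_per_weight: int) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9

N_PARAMS = 7e9  # assumption: a 7B-parameter model

for bits in (32, 16, 4):
    print(f"{bits:>2}-bit weights: {weight_memory_gb(N_PARAMS, bits):.1f} GB")
```

Under these assumptions the weights take about 28 GB at 32-bit, 14 GB at 16-bit, and 3.5 GB at 4-bit, so only the 4-bit version leaves headroom on an 8 GB machine.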

Technical report can be read here - https://s3.amazonaws.com/static.nomic.ai/gpt4all/2023_GPT4All_Technical_Report.pdf

Github repo - https://github.com/nomic-ai/gpt4all
