126
0
submitted 2 years ago by [email protected] to c/[email protected]
127
1
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

TL;DR (by GPT-4 🤖):

Prompt Engineering, or In-Context Prompting, is a method used to guide Large Language Models (LLMs) towards desired outcomes without changing the model weights. The article discusses various techniques such as basic prompting, instruction prompting, self-consistency sampling, Chain-of-Thought (CoT) prompting, automatic prompt design, augmented language models, retrieval, programming languages, and external APIs. The effectiveness of these techniques can vary significantly among models, necessitating extensive experimentation and heuristic approaches. The article emphasizes the importance of selecting diverse and relevant examples, giving precise instructions, and using external tools to enhance the model's reasoning skills and knowledge base.

Notes (by GPT-4 🤖):

Prompt Engineering: An Overview

  • Introduction
    • Prompt Engineering, also known as In-Context Prompting, is a method to guide the behavior of Large Language Models (LLMs) towards desired outcomes without updating the model weights.
    • The effectiveness of prompt engineering methods can vary significantly among models, necessitating extensive experimentation and heuristic approaches.
    • This article focuses on prompt engineering for autoregressive language models, excluding Cloze tests, image generation, or multimodality models.
  • Basic Prompting
    • Zero-shot and few-shot learning are the two most basic approaches for prompting the model.
    • Zero-shot learning involves feeding the task text to the model and asking for results.
    • Few-shot learning presents a set of high-quality demonstrations, each consisting of both input and desired output, on the target task.
  • Tips for Example Selection and Ordering
    • Examples should be chosen that are semantically similar to the test example.
    • The selection of examples should be diverse, relevant to the test sample, and in random order to avoid biases.
  • Instruction Prompting
    • Instruction prompting involves giving the model direct instructions, which can be more token-efficient than few-shot learning.
    • Models like InstructGPT are fine-tuned with high-quality tuples of (task instruction, input, ground truth output) to better understand user intention and follow instructions.
  • Self-Consistency Sampling
    • Self-consistency sampling involves sampling multiple outputs and selecting the best one out of these candidates.
    • The criteria for selecting the best candidate can vary from task to task.
  • Chain-of-Thought (CoT) Prompting
    • CoT prompting generates a sequence of short sentences that describe the reasoning logic step by step, leading to the final answer (see the code sketch after this list).
    • CoT prompting can be either few-shot or zero-shot.
  • Automatic Prompt Design
    • Automatic Prompt Design involves treating prompts as trainable parameters and optimizing them directly on the embedding space via gradient descent.
  • Augmented Language Models
    • Augmented Language Models are models that have been enhanced with reasoning skills and the ability to use external tools.
  • Retrieval
    • Retrieval helps with tasks that require up-to-date knowledge past the model's pretraining cutoff, or access to an internal/private knowledge base.
    • Many methods for Open Domain Question Answering depend on first doing retrieval over a knowledge base and then incorporating the retrieved content as part of the prompt.
  • Programming Language and External APIs
    • Some models generate programming language statements to resolve natural language reasoning problems, offloading the solution step to a runtime such as a Python interpreter.
    • Other models are augmented with text-to-text API calls, guiding the model to generate API call requests and append the returned result to the text sequence.
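To make a couple of the techniques above concrete, here is a minimal Python sketch that combines few-shot Chain-of-Thought prompting with self-consistency sampling. It assumes the pre-1.0 openai client and an OPENAI_API_KEY in the environment; the example task, prompt wording, and model choice are illustrative assumptions, not taken from the article.

# Minimal sketch: few-shot Chain-of-Thought prompting + self-consistency sampling.
# Assumes the pre-1.0 openai Python client and OPENAI_API_KEY in the environment;
# the example task, prompt wording, and model choice are illustrative assumptions.
import re
from collections import Counter
import openai

FEW_SHOT_COT = """Q: Roger has 5 tennis balls. He buys 2 more cans of 3 tennis balls each. How many tennis balls does he have now?
A: Roger started with 5 balls. 2 cans of 3 balls is 6 balls. 5 + 6 = 11. The answer is 11.

Q: {question}
A:"""

def self_consistent_answer(question: str, n_samples: int = 5) -> str:
    """Sample several CoT completions and return the majority-vote final answer."""
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",  # model choice is an assumption
        messages=[{"role": "user", "content": FEW_SHOT_COT.format(question=question)}],
        temperature=0.7,  # > 0 so the sampled reasoning paths actually differ
        n=n_samples,
    )
    answers = []
    for choice in response["choices"]:
        text = choice["message"]["content"]
        match = re.search(r"The answer is\s*([^.\n]+)", text)
        if match:
            answers.append(match.group(1).strip())
    # Majority vote over the extracted final answers (self-consistency).
    return Counter(answers).most_common(1)[0][0] if answers else ""

print(self_consistent_answer("A farm has 3 coops with 4 hens each and buys 5 more hens. How many hens are there?"))

Sampling at a non-zero temperature is what makes the majority vote meaningful: greedy decoding would return the same reasoning path every time.
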
128
3
submitted 2 years ago by [email protected] to c/[email protected]

From the “About” section:

goblin.tools is a collection of small, simple, single-task tools, mostly designed to help neurodivergent people with tasks they find overwhelming or difficult.

Most tools will use AI technologies in the back-end to achieve their goals. Currently this includes OpenAI's models. As the tools and backend improve, the intent is to move to an open source alternative.

The AI models used are general purpose models, and so the accuracy of their output can vary. Nothing returned by any of the tools should be taken as a statement of truth, only guesswork. Please use your own knowledge and experience to judge whether the result you get is valid.

129
1
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

Original tweet:

https://twitter.com/goodside/status/1672121754880180224

Text:

If you put violence, erotica, etc. in your code Copilot just stops working and I happen to need violence, erotica, etc. in Jupyter for red teaming so I always have to make an evil.py to sequester constants for import.

not wild about this. please LLMs i'm trying to help you

(screenshot of evil.py full of nasty things)

130
5
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

It's coming along nicely, I hope I'll be able to release it in the next few days.

How It Works:

I am a bot that generates summaries of Lemmy comments and posts.

  • Just mention me in a comment or post, and I will generate a summary for you.
  • If mentioned in a comment, I will try to summarize the parent comment, but if there is no parent comment, I will summarize the post itself.
  • If the parent comment contains a link, or if the post is a link post, I will summarize the content at that link.
  • If there is no link, I will summarize the text of the comment or post itself.
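For the curious, here is a minimal Python sketch of the decision logic described in the list above. The data model and helpers are hypothetical stand-ins (the bot's actual code isn't shown in the post); in the real bot, summarize() would call an LLM and fetch_url_text() would download and clean the linked page.

# Minimal sketch of the mention-handling rules listed above.
# All types and helpers are hypothetical stand-ins, not the bot's real code.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Item:
    body: str                         # text of the comment or post
    link: Optional[str] = None        # URL, if the comment contains one or the post is a link post
    parent: Optional["Item"] = None   # parent comment, if the mention happened in a comment
    post: Optional["Item"] = None     # the post the mentioning comment belongs to

def fetch_url_text(url: str) -> str:
    return f"<contents of {url}>"     # placeholder for an HTTP fetch + readability pass

def summarize(text: str) -> str:
    return f"Summary: {text[:60]}..." # placeholder for the LLM call

def handle_mention(mention: Item) -> str:
    # Mentioned in a comment: prefer the parent comment, else fall back to the post.
    target = mention.parent or mention.post or mention
    # If the target contains a link (or is a link post), summarize the linked content;
    # otherwise summarize the comment/post text itself.
    text = fetch_url_text(target.link) if target.link else target.body
    return summarize(text)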

Extra Info in Comments:

Prompt Injection:

Of course it's really easy (but mostly harmless) to break it using prompt injection.

It will only be available in communities that explicitly allow it. I hope it will be useful; I'm generally very satisfied with the quality of the summaries.

131
7
submitted 2 years ago by [email protected] to c/[email protected]

Link to original tweet:

https://twitter.com/sayashk/status/1671576723580936193?s=46&t=OEG0fcSTxko2ppiL47BW1Q

Transcript:

I'd heard that GPT-4's image analysis feature wasn't available to the public because it could be used to break Captcha.

Turns out it's true: The new Bing can break captcha, despite saying it won't: (image)

132
1
submitted 2 years ago by [email protected] to c/[email protected]

This is a fascinating discussion of the relationship between goals and intelligence from an AI safety perspective.

I asked my trusty friend GPT-4 to summarize the video (I downloaded the subtitles and fed them into ChatGPT), but I highly recommend just watching the entire thing if you have the time.

Summary by GPT-4:

Introduction:

  • The video aims to respond to some misconceptions about the Orthogonality Thesis in Artificial General Intelligence (AGI) safety.
  • This arises from a thought experiment where an AGI has a simple goal of collecting stamps, which could cause problems due to unintended consequences.

Understanding 'Is' and 'Ought' Statements (Hume's Guillotine):

  • The video describes the concept of 'Is' and 'Ought' statements. 'Is' statements are about how the world is or will be, while 'Ought' statements are about how the world should be or what we want.
  • Hume's Guillotine suggests that you can never derive an 'Ought' statement using only 'Is' statements. To derive an 'Ought' statement, you need at least one other 'Ought' statement.

Defining Intelligence:

  • Intelligence in AGI systems refers to the ability to take actions in the world to achieve their goals or maximize their utility functions.
  • This involves having or building an accurate model of reality, using it to make predictions, and choosing the best possible actions.
  • These actions are determined by the system's goals, which are 'Ought' statements.

Are Goals Stupid?

  • Some commenters suggested that single-mindedly pursuing one goal (like stamp collecting) is unintelligent.
  • However, this only seems unintelligent from a human perspective with different goals.
  • Intelligence is separate from goals; it is the ability to reason about the world to achieve these goals, whatever they may be.

Can AGIs Choose Their Own Goals?

  • The video suggests that while AGIs can choose their own instrumental goals, changing terminal goals is rare and generally undesirable.
  • Terminal goals can't be considered "stupid", as they can't be judged against anything. They're simply the goals the system has.

Can AGIs Reason About Morality?

  • While a superintelligent AGI could understand human morality, it doesn't mean it would act according to it.
  • Its actions are determined by its terminal goals, not its understanding of human ethics.

The Orthogonality Thesis:

  • The Orthogonality Thesis suggests that any level of intelligence is compatible with any set of goals.
  • The level of intelligence is about effectiveness at answering 'Is' questions, and goals are about 'Ought' questions.
  • Therefore, it's possible to create a powerful intelligence that will pursue any specified goal.
  • The level of an agent's intelligence doesn't determine its goals and vice versa.
133
0
submitted 2 years ago by [email protected] to c/[email protected]

TL;DR (by GPT-4 🤖):

  • Use of AI Tools: The author routinely uses GPT-4 to answer casual and vaguely phrased questions, draft complex documents, and provide emotional support. GPT-4 can serve as a compassionate listener, an enthusiastic sounding board, a creative muse, a translator or teacher, or a devil’s advocate.

  • Large Language Models (LLM) and Expertise: LLMs can often persuasively mimic correct expert responses in a given knowledge domain, such as research mathematics. However, the responses often consist of nonsense when inspected closely. The author suggests that both humans and AI need to develop skills to analyze this new type of text.

  • AI in Mathematical Research: The author believes that the 2023-level AI can already generate suggestive hints and promising leads to a working mathematician and participate actively in the decision-making process. With the integration of tools such as formal proof verifiers, internet search, and symbolic math packages, the author expects that 2026-level AI, when used properly, will be a trustworthy co-author in mathematical research, and in many other fields as well.

  • Impact on Human Institutions and Practices: The author raises questions about how existing human institutions and practices will adapt to the rise of AI. For example, how will research journals change their publishing and referencing practices when AI can generate entry-level math papers for graduate students in less than a day? How will our approach to graduate education change? Will we actively encourage and train our students to use these tools?

  • Challenges and Future Expectations: The author acknowledges that we are largely unprepared to address these questions. There will be shocking demonstrations of AI-assisted achievement and courageous experiments to incorporate them into our professional structures. But there will also be embarrassing mistakes, controversies, painful disruptions, heated debates, and hasty decisions. The greatest challenge will be transitioning to a new AI-assisted world as safely, wisely, and equitably as possible.

134
1
submitted 2 years ago by [email protected] to c/[email protected]

I’ve been following the development of the next Stable Diffusion model, and I’ve seen this approach mentioned.

Seems like this is a way in which AI training is analogous to human learning - we learn quite a lot from fiction, games, simulations and apply this to the real world. I’m sure the same pitfalls apply as well.

135
1
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

Quote:

In this work, we introduce TinyStories, a synthetic dataset of short stories that only contain words that a typical 3 to 4-year-olds usually understand, generated by GPT-3.5 and GPT-4. We show that TinyStories can be used to train and evaluate LMs that are much smaller than the state-of-the-art models (below 10 million total parameters), or have much simpler architectures (with only one transformer block), yet still produce fluent and consistent stories with several paragraphs that are diverse and have almost perfect grammar, and demonstrate reasoning capabilities.

136
2
submitted 2 years ago by [email protected] to c/[email protected]

This is the potential development in AI I'm most interested in. So naturally, I tested this when I first used ChatGPT. In classic ChatGPT fashion, when asked to make a directed acyclic graph representing cause and effect, it could interpret that well enough to make a simple graph... but got the cause-and-effect flow wrong for something as simple as lighting a fire. Haven't tried it again with ChatGPT-4 though.

137
1
ChatGPT: Magic for English Majors (www.oneusefulthing.org)
submitted 2 years ago by [email protected] to c/[email protected]

AI isn’t magic, of course, but what this weirdness practically means is that these new tools, which are trained on vast swathes of humanity’s cultural heritage, can often best be wielded by people who have a knowledge of that heritage. To get the AI to do unique things, you need to understand parts of culture more deeply than everyone else using the same AI systems.

138
3
submitted 2 years ago by [email protected] to c/[email protected]

Original tweet by @emollick: https://twitter.com/emollick/status/1669939043243622402

Tweet text: One reason AI is hard to "get" is that LLMs are bad at tasks you would expect an AI to be good at (citations, facts, quotes, manipulating and counting words or letters) but surprisingly good at things you expect it to be bad at (generating creative ideas, writing with "empathy").

139
3
submitted 2 years ago by [email protected] to c/[email protected]

Excellent Twitter thread by @goodside 🧵:

The wisdom that "LLMs just predict text" is true, but misleading in its incompleteness.

"As an AI language model trained by OpenAI..." is an astoundingly poor prediction of what a typical human would write.

Let's resolve this contradiction — a thread:

For widely used LLM products like ChatGPT, Bard, or Claude, the "text" the model aims to predict is itself written by other LLMs.

Those LLMs, in turn, do not aim to predict human text in general, but specifically text written by humans pretending they are LLMs. There is, at the start of this, a base LLM that works as popularly understood — a model that "just predicts text" scraped from the web.

This is tuned first to behave like a human role-playing an LLM, then again to imitate the "best" of that model's output. Models that imitate humans pretending to be (more ideal) LLMs are known as "instruct models" — because, unlike base LLMs, they follow instructions. They're also known as "SFT models" after the process that re-trains them, Supervised Fine-Tuning.

This describes GPT-3 in 2021.

SFT/instruct models work, but not well. To improve them, their output is graded by humans, so that their best responses can be used for further fine-tuning.

This is "modified SFT," used in the GPT-3 version you may remember from 2022 (text-davinci-002). Eventually, enough examples of human grading are available that a new model, called a "preference model," can be trained to grade responses automatically.

This is RLHF — Reinforcement Learning from Human Feedback. This process produced GPT-3.5 and ChatGPT. Some products, like Claude, go beyond RLHF and apply a further step where model output is corrected and rewritten using feedback from yet another model. The base model is tuned on these responses to yield the final LLM.

This is RLAIF — Reinforcement Learning from AI Feedback. OpenAI's best known model, GPT-4, is likely trained using some other extension of RLHF, but nothing about this process is publicly known. There are likely many improvements to the base model as well, but we can only speculate what they are. So, do LLMs "just predict text"?

Yes, but perhaps without the "just" — the text they predict is abstract, and only indirectly written by humans.

Humans sit at the base of a pyramid with several layers of AI above, and humans pretending to be AI somewhere in the middle.

Added note:

My explanation of RLHF/RLAIF above is oversimplified. RL-tuned models are not literally tuned to predict highly-rated text as in modified SFT — rather, weights are updated via Proximal Policy Optimization (PPO) to maximize the reward given by the preference model. (Also, that last point does somewhat undermine the thesis of this thread, in that RL-tuned LLMs do not literally predict any text, human-written or otherwise. Pedantically, "LLMs just predict text" was true before RLHF, but is now a simplification.)
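
As a concrete illustration of the "grade the outputs, keep the best" idea behind modified SFT, here is a minimal Python sketch. Both the generator and the preference scorer are hypothetical stand-ins; real pipelines use an LLM and a trained reward model, and the later RLHF stage updates weights with PPO rather than simple filtering.

# Minimal sketch of the modified-SFT data collection step described in the thread:
# sample several completions, score them with a preference model, keep the best
# one as a new fine-tuning example. The generator and scorer are stand-ins.
from typing import Callable, List, Tuple

def collect_sft_examples(
    prompts: List[str],
    generate: Callable[[str, int], List[str]],      # samples n candidate completions
    preference_score: Callable[[str, str], float],  # higher = more preferred response
    n_samples: int = 4,
) -> List[Tuple[str, str]]:
    # Build (prompt, best completion) pairs for the next round of fine-tuning.
    dataset = []
    for prompt in prompts:
        candidates = generate(prompt, n_samples)
        best = max(candidates, key=lambda c: preference_score(prompt, c))
        dataset.append((prompt, best))
    return dataset

# Toy stand-ins so the sketch runs; they are not real models.
toy_generate = lambda p, n: [f"{p} candidate {i}" for i in range(n)]
toy_score = lambda p, c: float(len(c))  # pretend graders prefer longer answers
print(collect_sft_examples(["Explain RLHF in one line."], toy_generate, toy_score))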

140
1
submitted 2 years ago by [email protected] to c/[email protected]

OpenAI’s official guide. Short and to the point, no bullshit, covers the basics very well.

141
2
submitted 2 years ago by [email protected] to c/[email protected]

Trick the LLM into revealing a secret password through increasingly difficult levels.

142
1
submitted 2 years ago by [email protected] to c/[email protected]

Microsoft’s new chatbot goes crazy after a journalist uses psychology to manipulate it. The article contains the full transcript and nothing else. It’s a fascinating read.

143
13
submitted 2 years ago by [email protected] to c/[email protected]

Guy trains an LLM on his group chat messages with his best friends with predictable but nevertheless very funny results.

144
1
Unspeakable tokens (www.lesswrong.com)
submitted 2 years ago by [email protected] to c/[email protected]

A deep dive into the inner workings of ChatGPT, and why it stops responding or replies with weird or creepy things to seemingly simple requests.

145
1
submitted 2 years ago by [email protected] to c/[email protected]

An excellent video series by Andrej Karpathy (founding member of OpenAI, then head of AI at Tesla). He teaches how GPTs work from the ground up, using Python. I learned a lot from this course.

Actually Useful AI


Welcome! 🤖

Our community focuses on programming-oriented, hype-free discussion of Artificial Intelligence (AI) topics. We aim to curate content that truly contributes to the understanding and practical application of AI, making it, as the name suggests, "actually useful" for developers and enthusiasts alike.

Be an active member! 🔔

We highly value participation in our community. Whether it's asking questions, sharing insights, or sparking new discussions, your engagement helps us all grow.

What can I post? 📝

In general, anything related to AI is acceptable. However, we encourage you to strive for high-quality content.

What is not allowed? 🚫

General Rules 📜

Members are expected to engage in on-topic discussions, and exhibit mature, respectful behavior. Those who fail to uphold these standards may find their posts or comments removed, with repeat offenders potentially facing a permanent ban.

While we appreciate focus, a little humor and off-topic banter, when tasteful and relevant, can also add flavor to our discussions.

Related Communities 🌐

General

Chat

Image

Open Source

Please message @[email protected] if you would like us to add a community to this list.

Icon base by Lord Berandas under CC BY 3.0 with modifications to add a gradient
