101
9
submitted 2 years ago by [email protected] to c/[email protected]

102
6
submitted 2 years ago by [email protected] to c/[email protected]

LlamaIndex is a simple, flexible data framework for connecting custom data sources to large language models.
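
For a concrete sense of what that means, here is a minimal sketch using the llama_index Python package (roughly the 2023-era API; the data directory and question are placeholders, and imports may differ in newer versions of the library):

    # Index a folder of local documents and query them with an LLM via LlamaIndex.
    # Assumes `pip install llama-index` and an OPENAI_API_KEY in the environment.
    from llama_index import SimpleDirectoryReader, VectorStoreIndex

    # Load whatever files live in ./data (text, PDF, etc.) as Document objects.
    documents = SimpleDirectoryReader("data").load_data()

    # Build an in-memory vector index over the documents.
    index = VectorStoreIndex.from_documents(documents)

    # Ask a question; relevant chunks are retrieved and passed to the LLM as context.
    response = index.as_query_engine().query("What does this project do?")
    print(response)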

103
17
submitted 2 years ago by [email protected] to c/[email protected]

Machine learning can help with the analysis of gliomas, the most common brain tumor, and reduce the time patients spend in the operating room

104
7
submitted 2 years ago by [email protected] to c/[email protected]

NVIDIA offers a consistent, full stack for developing on a GPU-powered on-premises or cloud instance. You can then deploy that AI application on any GPU-powered platform without code changes.

@AutoTLDR

105
11
Becoming an AI engineer (www.ignorance.ai)
submitted 2 years ago by [email protected] to c/[email protected]

I think software engineering will spawn a new subdiscipline, specializing in applications of AI and wielding the emerging stack effectively, just as “site reliability engineer”, “devops engineer”, “data engineer” and “analytics engineer” emerged.

The emerging (and least cringe) version of this role seems to be: AI Engineer.

@AutoTLDR

106
5
submitted 2 years ago by [email protected] to c/[email protected]

Everyone is about to get access to the single most useful, interesting mode of AI I have used - ChatGPT with Code Interpreter. I have had the alpha version of this for a couple months (I was given access as a researcher off the waitlist), and I wanted to give you a little bit of guidance as to why I think this is a really big deal, as well as how to start using it.

@AutoTLDR

107
16
submitted 2 years ago by [email protected] to c/[email protected]

We’re rolling out code interpreter to all ChatGPT Plus users over the next week.

It lets ChatGPT run code, optionally with access to files you've uploaded. You can ask ChatGPT to analyze data, create charts, edit files, perform math, etc.

We’ll be making these features accessible to Plus users on the web via the beta panel in your settings over the course of the next week.

To enable code interpreter:

  • Click on your name
  • Select beta features from your settings
  • Toggle on the beta features you’d like to try

108
5
submitted 2 years ago by [email protected] to c/[email protected]

Starting today, all paying API customers have access to GPT-4. In March, we introduced the ChatGPT API, and earlier this month we released our first updates to the chat-based models. We envision a future where chat-based models can support any use case. Today we’re announcing a deprecation plan for older models of the Completions API, and recommend that users adopt the Chat Completions API.
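
For anyone migrating from the Completions API, here is a minimal sketch of a Chat Completions call with the openai Python package as it looked at the time (the model name and prompt are examples; newer versions of the library use a different client interface):

    # Chat Completions call with the 2023-era `openai` Python package (pre-1.0 interface).
    import openai

    openai.api_key = "sk-..."  # placeholder; read your real key from the environment

    response = openai.ChatCompletion.create(
        model="gpt-4",
        messages=[
            {"role": "system", "content": "You are a helpful assistant."},
            {"role": "user", "content": "Summarize the deprecation plan in one sentence."},
        ],
    )
    print(response["choices"][0]["message"]["content"])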

109
6
submitted 2 years ago by [email protected] to c/[email protected]

Some interesting quotes:

  1. LLMs do both of the things that their promoters and detractors say they do.
  2. They do both of these at the same time on the same prompt.
  3. It is very difficult from the outside to tell which they are doing.
  4. Both of them are useful.

When a search engine is able to do this, it is able to compensate for a limited index size with intelligence. By making reasonable inferences about what page text is likely to satisfy what query text, it can satisfy more intents with fewer documents.

LLMs are not like this. The reasoning that they do is inscrutable and massive. They do not explain their reasoning in a way that we can trust is actually their reasoning, and not simply a textual description of what such reasoning might hypothetically be.

@AutoTLDR

110
8
submitted 2 years ago by [email protected] to c/[email protected]

If you are like me, and you didn't immediately understand why people rave about Copilot, these simple examples by Simon Willison may be useful to you:

111
10
submitted 2 years ago by [email protected] to c/[email protected]

We need scientific and technical breakthroughs to steer and control AI systems much smarter than us. To solve this problem within four years, we’re starting a new team, co-led by Ilya Sutskever and Jan Leike, and dedicating 20% of the compute we’ve secured to date to this effort. We’re looking for excellent ML researchers and engineers to join us.

@[email protected]

112
6
submitted 2 years ago by [email protected] to c/[email protected]

I haven't tried this yet, but I have a feeling that it would fail for anything nontrivial. Nevertheless, the concept is very interesting, and as soon as I get API access to GPT-4, I will try it.

I've recently ported a library from TypeScript to Python with the help of ChatGPT (GPT-4), and it took me about a day. It would be interesting to run this tool on the same codebase and compare the results.

If anyone has GPT-4 API access, I would really appreciate it if they tried running this tool on something simple and wrote about the result in the comments.

113
18
submitted 2 years ago by [email protected] to c/[email protected]

114
16
submitted 2 years ago by [email protected] to c/[email protected]

@AutoTLDR

115
73
submitted 2 years ago by [email protected] to c/[email protected]

Researchers have unearthed hundreds of thousands of cuneiform tablets, but many remain untranslated. Translating an ancient language is a time-intensive process, and only a few hundred experts are qualified to perform it. A recent study describes a new AI that produces high-quality translations of ancient texts.

116
1
submitted 2 years ago by [email protected] to c/[email protected]

As of July 3, 2023, we’ve disabled the Browse with Bing beta feature out of an abundance of caution while we fix this in order to do right by content owners. We are working to bring the beta back as quickly as possible, and appreciate your understanding!

117
3
submitted 2 years ago by [email protected] to c/[email protected]

Some interesting quotes:

Computers were very rigid and I grew up with a certain feeling about what computers can or cannot do. And I thought that artificial intelligence, when I heard about it, was a very fascinating goal, which is to make rigid systems act fluid. But to me, that was a very long, remote goal. It seemed infinitely far away. It felt as if artificial intelligence was the art of trying to make very rigid systems behave as if they were fluid. And I felt that would take enormous amounts of time. I felt it would be hundreds of years before anything even remotely like a human mind would be asymptotically approaching the level of the human mind, but from beneath.

But one thing that has completely surprised me is that these LLMs and other systems like them are all feed-forward. It's like the firing of the neurons is going only in one direction. And I would never have thought that deep thinking could come out of a network that only goes in one direction, out of firing neurons in only one direction. And that doesn't make sense to me, but that just shows that I'm naive.

It also makes me feel that maybe the human mind is not so mysterious and complex and impenetrably complex as I imagined it was when I was writing Gödel, Escher, Bach and writing I Am a Strange Loop. I felt at those times, quite a number of years ago, that as I say, we were very far away from reaching anything computational that could possibly rival us. It was getting more fluid, but I didn't think it was going to happen, you know, within a very short time.

118
1
submitted 2 years ago by [email protected] to c/[email protected]

Interesting discussion on HN.

119
1
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

TL;DR

See comments.

Notes (by GPT-4 🤖):

A Day Without a Copilot: Reflections on Copilot-Driven Development

Introduction

  • The author, Gavin Ray, reflects on the impact of Github Copilot on his software development process.
  • He shares his experience of a day without Copilot, which was a rare occurrence since the Technical Preview.
  • He discusses how Copilot has profoundly changed his development process and experience.

From Monologue to Dialogue

  • Ray appreciates the solitude of coding but also values the collaboration and learning from others.
  • Github Copilot has been a game-changer for him, allowing him to have a dialogue with his code and the collective wisdom of the world without expending energy.
  • Coding has become a collaborative dialogue between Ray and Copilot, shaping the output together.

Fresh Perspectives

  • Copilot provides fresh perspectives, suggesting API designs or implementation details that Ray would not have considered.
  • Not all suggestions are good, but even the bad ones help him think about the problem differently.
  • Ray generates several sets of Copilot suggestions based on the specs before designing or implementing an API, picking the best candidates and tweaking them to create the final implementation.

Copilot-Driven Development

  • Ray describes a phenomenon he calls "Copilot-Driven Development", a process that optimizes for Copilot's suggestions/accuracy.
  • This process includes choosing popular programming languages and well-known libraries, using explicit names and types, writing types and interfaces with specifications and documentation first, implementing tests alongside each implementation, and keeping as much code in a single file as possible during early development (a small illustration follows this list).
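
As an illustration (not taken from the original post), the same style in a few lines of Python: an explicitly named and typed function, its specification written as a docstring before the body, and a test kept right next to the implementation:

    # Hypothetical example of the "Copilot-friendly" style described above.
    def count_overdue_invoices(due_dates: list[str], today: str) -> int:
        """Return how many ISO-formatted due dates fall strictly before `today`."""
        return sum(1 for due_date in due_dates if due_date < today)

    # Test kept alongside the implementation, in the same file during early development.
    def test_count_overdue_invoices() -> None:
        assert count_overdue_invoices(["2023-06-01", "2023-08-01"], "2023-07-04") == 1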

Outcomes of Copilot-Driven Development

  • Ray uses Copilot's suggestions to guide his development process, helping him think about problems differently and make better decisions.
  • This process allows him to see the problem from different perspectives, gain insights, learn from the community, be more efficient, and be more confident in his decisions.

Evolving Roles in Software Development

  • Tools like Github Copilot and ChatGPT highlight a shift in the role of the software developer, allowing developers to leverage the collective wisdom of the community to improve their work.
  • This shift is important in modern software development, where the complexity and scale of projects can make it difficult for a single individual to have all the necessary knowledge and expertise.
  • The use of tools like Github Copilot does not diminish the role of the individual but enables them to focus more on the creative and strategic aspects of development.
  • These tools are redefining the role of the software developer, allowing them to be more effective and efficient in their work, and focus on the most interesting and challenging aspects of the development process.

120
3
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

👋 Hello everyone, welcome to our Weekly Discussion thread!

This week, we’re interested in your thoughts on AI safety: Is it an issue that you believe deserves significant attention, or is it just fearmongering motivated by financial interests?

I've created a poll to gauge your thoughts on these concerns. Please take a moment to select the AI safety issues you believe are most crucial:

VOTE HERE: 🗳️ https://strawpoll.com/e6Z287ApqnN

Here is a detailed explanation of the options:

  1. Misalignment between AI and human values: If an AI system's goals aren't perfectly aligned with human values, it could lead to unintended and potentially catastrophic consequences.

  2. Unintended Side-Effects: AI systems, especially those optimized to achieve a specific goal, might engage in harmful behavior that was not intended, often referred to as "instrumental convergence".

  3. Manipulation and Deception: AI could be used for manipulating information, deepfakes, or influencing behavior without consent, leading to erosion of trust and reality.

  4. AI Bias: AI models may perpetuate or amplify existing biases present in the data they're trained on, leading to unfair outcomes in various sectors like hiring, law enforcement, and lending.

  5. Security Concerns: As AI systems become more integrated into critical infrastructure, the potential for these systems to be exploited or misused increases.

  6. Economic and Social Impact: Automation powered by AI could lead to significant job displacement and increase inequality, causing major socioeconomic shifts.

  7. Lack of Transparency: AI systems, especially deep learning models, are often criticized as "black boxes," where it's difficult to understand the decision-making process.

  8. Autonomous Weapons: The misuse of AI in warfare could lead to lethal autonomous weapons, potentially causing harm on a massive scale.

  9. Monopoly and Power Concentration: Advanced AI capabilities could lead to an unequal distribution of power and resources if controlled by a select few entities.

  10. Dependence on AI: Over-reliance on AI systems could potentially make us vulnerable, especially if these systems fail or are compromised.

Please share your opinion here in the comments!

121
1
submitted 2 years ago by [email protected] to c/[email protected]

@AutoTLDR

122
2
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

123
1
submitted 2 years ago by [email protected] to c/[email protected]

Like many, I've been flabbergasted by the huge advances we've seen in recent years with artificial intelligence. I've spent hours just playing with ChatGPT and Stable Diffusion and am consistently impressed.

I'm also aware of issues surrounding this breed of technology, like with intellectual property and black box biases. On top of that, I'm trying to avoid hype and grift. Where are you seeing AI excel, or where does it have a potential to excel? Are there places where it is being shoehorned into that should be avoided?

124
3
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

Announcement

The bot I announced in this thread is now ready for a limited beta release.

You can see an example summary it wrote here.

How to Use AutoTLDR

  • Just mention it ("@" + "AutoTLDR") in a comment or post, and it will generate a summary for you.
  • If mentioned in a comment, it will try to summarize the parent comment, but if there is no parent comment, it will summarize the post itself.
  • If the parent comment contains a link, or if the post is a link post, it will summarize the content at that link.
  • If there is no link, it will summarize the text of the comment or post itself (the full selection order is sketched after this list).
  • 🔒 If you include the #nobot hashtag in your profile, it will not summarize anything posted by you.
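
To make the precedence above explicit, here is a small sketch of the selection order in Python; this is not the bot's actual code, and all names and fields are made up for illustration:

    # Hypothetical sketch of AutoTLDR's target selection, NOT its real implementation.
    from dataclasses import dataclass
    from typing import Optional

    @dataclass
    class Item:
        author_profile: str
        text: str
        link: Optional[str] = None

    def choose_summary_target(post: Item, parent_comment: Optional[Item]) -> Optional[str]:
        target = parent_comment or post        # a parent comment wins over the post itself
        if "#nobot" in target.author_profile:  # respect the opt-out hashtag
            return None
        if target.link:                        # a link wins over plain text
            return target.link                 # (the bot would fetch and summarize this page)
        return target.text                     # otherwise summarize the text itself

    # Example: mentioned directly under a link post, with no parent comment.
    print(choose_summary_target(Item("my bio", "post body", "https://example.com"), None))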

Beta limitations

  • The bot only works in the [email protected] community.
  • It is limited to 100 summaries per day.

How to try it

  • If you want to test the bot, write a long comment, or include a link in a comment in this thread, and then, in a reply comment, mention the bot.
  • Feel free to test it and try to break it in this thread. Please report any weird behavior you encounter in a PM to me (NOT the bot).
  • You can also use it for its designated purpose anywhere in the AUAI community.

125
2
Understanding GPT tokenizers (simonwillison.net)
submitted 2 years ago* (last edited 2 years ago) by [email protected] to c/[email protected]

This is an excellent overview of tokenization with many interesting examples. I also like Simon's small CLI tools; you can read about them at the end of the post.

As usual, I've asked GPT-4 to write a TL;DR and detailed notes for it.

Notice that it couldn't print the "davidjl" glitch token, and (probably because of its presence) the notes are also incomplete. At first I thought it was because the text of the article was longer than the context window, but the TL;DR contains details the notes don't, so that probably wasn't the case.

I've still decided to copy the notes here because they are generally useful and also demonstrate this weird behavior.

TL;DR (by GPT-4 🤖)

The article discusses the concept of tokenization in large language models like GPT-3/4, LLaMA, and PaLM. These models convert text into tokens (integers) and predict the next tokens. The author explains how English words are usually assigned a single token, while non-English languages often have less efficient tokenization. The article also explores "glitch tokens," which exhibit unusual behavior, and the necessity of counting tokens to ensure OpenAI's models' token limit is not exceeded. The author introduces a Python library called tiktoken and a command-line tool called ttok for this purpose. Understanding tokens can help make sense of how GPT tools generate text.

Notes (by GPT-4 🤖)

Understanding GPT Tokenizers

  • Large language models like GPT-3/4, LLaMA, and PaLM operate in terms of tokens, which are integers representing text. They convert text into tokens and predict the next tokens.
  • OpenAI provides a Tokenizer tool for exploring how tokens work. The author has also built a tool as an Observable notebook.
  • The notebook can convert text to tokens, tokens to text, and run searches against the full token table.

Tokenization Examples

  • English words are usually assigned a single token. For example, "The" is token 464, " dog" is token 3290, and " eats" is token 25365.
  • Capitalization and leading spaces are important in tokenization. For instance, "The" with a capital T is token 464, but " the" with a leading space and a lowercase t is token 262.
  • Languages other than English often have less efficient tokenization. For example, the Spanish sentence "El perro come las manzanas" is encoded into seven tokens, while the English equivalent "The dog eats the apples" is encoded into five tokens (see the sketch after this list).
  • Some languages may have single characters that encode to multiple tokens, such as certain Japanese characters.
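
These examples are easy to reproduce with OpenAI's tiktoken Python library; a minimal sketch, assuming the "r50k_base" encoding to match the older GPT-3 tokenizer the article uses (IDs and counts differ with other encodings):

    # Reproduce the tokenization examples above with tiktoken.
    import tiktoken

    enc = tiktoken.get_encoding("r50k_base")  # the older GPT-2/GPT-3 BPE encoding

    print(enc.encode("The dog eats the apples"))     # the article reports five tokens here
    print(enc.encode("El perro come las manzanas"))  # and seven for the Spanish equivalent

    # Capitalization and leading spaces matter: the article reports 464 vs. 262 for these.
    print(enc.encode("The"), enc.encode(" the"))

    # Tokens round-trip back to the original text.
    print(enc.decode(enc.encode("The dog eats the apples")))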

Glitch Tokens and Token Counting

  • There are "glitch tokens" that exhibit unusual behavior. For example, token 23282—"djl"—is one such glitch token. It's speculated that this token refers to a Reddit user who posted incremented numbers hundreds of thousands of times, and this username ended up getting its own token in the training data.
  • OpenAI's models have a token limit, and it's sometimes necessary to count the number of tokens in a string before passing it to the API to ensure the limit is not exceeded. OpenAI provides a Python library called tiktoken for this purpose.
  • The author also introduces a command-line tool called ttok, which can count tokens in text and truncate text down to a specified number of tokens (a Python equivalent is sketched below).
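
A minimal sketch of doing that counting and truncation directly in Python with tiktoken (ttok wraps essentially this workflow; the model name is only an example):

    # Count tokens for a model and truncate text to a token budget with tiktoken.
    import tiktoken

    enc = tiktoken.encoding_for_model("gpt-3.5-turbo")  # picks the encoding for that model

    text = "some long document " * 1000
    tokens = enc.encode(text)
    print(len(tokens))  # number of tokens the text would consume against the limit

    max_tokens = 256
    truncated_text = enc.decode(tokens[:max_tokens])  # keep only the first 256 tokens' worth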

Token Generation

  • Understanding tokens can help make sense of how GPT tools generate text. For example, names not in the dictionary, like "Pelly", take multiple tokens, but "Captain Gulliver" outputs the token "Captain" as a single chunk.