Implementing Stream-based Real-time Chatbot Responses #6310

perzeuss · 2023-07-20T16:05:07Z

Current Problem:

The current method of interacting with Large Language Models (LLMs) on Discord poses certain challenges for users seeking dynamic and engaging conversations. When a user sends a query or prompt to an LLM, the response is typically delivered as a complete text block. While this approach provides the necessary information, it lacks the fluidity and real-time nature of a human-like conversation.

Imagine sending a question to an LLM and receiving a massive block of text as a response. Scanning through lengthy paragraphs to find the relevant information can be cumbersome and time-consuming. Moreover, the lack of real-time typing makes the interaction feel less interactive and engaging, as if you're merely reading a pre-scripted response.

This current experience also limits the user's ability to influence the direction of the conversation in real-time. Once the LLM generates the response, any further input or follow-up questions may not be as seamless, hindering the back-and-forth exchange that characterizes dynamic conversations.

Feature Proposal:

In light of recent advancements in artificial intelligence, more specifically Large Language Models (LLMs), the way we interact with chatbots is rapidly evolving. One notable advancement is the ability of LLMs to emit their output not as a complete text block but incrementally, as a stream of small pieces (tokens, often rendered character by character). This reflects a more organic, human-like conversation flow, which could change the expected behavior of chatbots.

The aim is to deliver this dynamic, real-time conversational experience within Discord. Envision a scenario where a user presents an AI prompt, and this AI responds within the Discord platform, streaming its response in real-time, akin to the way ChatGPT operates.

To achieve this, we propose a new API endpoint that allows text output from an AI chatbot to be pushed into a Discord message incrementally. As the AI model generates its response step by step, each new piece of text is pushed to Discord and appended to an existing message. On the Discord client, the message would be animated or 'streamed', replicating the look and feel of real-time typing.

While a streamed message is still open, a cursor indicates that more content may follow. The cursor remains until either a timeout is reached or the API endpoint receives an explicit instruction that the message is complete, signaling that no further content will be added.

Technical Considerations:

The new endpoint must facilitate the dynamic push of additional text into existing messages.
Discord must animate or visually 'stream' the pushed text to simulate real-time typing.
After each update, keep a cursor visible until an inactivity timeout elapses or an explicit 'end-of-message' signal is received from the API.
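
The three considerations above can be read as a small state machine. The following is only a sketch of how a server might track a live message; the class name `LiveMessage`, the 30-second default timeout, and the explicit `now` argument are assumptions made for illustration and are not part of the proposal or of Discord's API.

```python
from dataclasses import dataclass

WRITING = "writing"
FINISHED = "finished"


@dataclass
class LiveMessage:
    """Hypothetical server-side model of a live message."""
    content: str
    created_at: float
    timeout_s: float = 30.0  # assumed value; the proposal does not specify one
    status: str = WRITING
    last_update: float = 0.0

    def __post_init__(self) -> None:
        # The inactivity clock starts when the message is created.
        self.last_update = self.created_at

    def is_finished(self, now: float) -> bool:
        """Finished explicitly, or implicitly once the inactivity timeout elapsed."""
        return self.status == FINISHED or (now - self.last_update) >= self.timeout_s

    def show_cursor(self, now: float) -> bool:
        """The cursor is shown for as long as more content may still arrive."""
        return not self.is_finished(now)

    def append(self, text: str, status: str, now: float) -> None:
        """Push additional text into the message and reset the inactivity clock."""
        if status not in (WRITING, FINISHED):
            raise ValueError(f"unknown status: {status!r}")
        if self.is_finished(now):
            raise RuntimeError("message is already finished")
        self.content += text
        self.status = status
        self.last_update = now
```

Passing `now` explicitly keeps the timeout logic deterministic and easy to test; a real service would read it from a clock.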

New API Endpoints Proposal

Creating a Message in Stream Mode

To start a new message in streaming mode, we propose adding an endpoint, modeled on the existing Create Message endpoint, with which a bot can create a new live message.

POST /channels/{channel.id}/messages/live

Parameters:

Accepts a JSON object with the initial content and a status key set to "writing", which starts the real-time message.

{
  "content": "Initial content here...",
  "status": "writing"
}

A successful POST request responds with the initial content, the status, and a message_id that can be used in later PATCH requests to update the message with additional content.

Example usage:

# Creating a new live message
curl -X POST \
  -H "Authorization: Bot <token>" \
  -H "Content-Type: application/json" \
  -d '{"content": "Starting message...", "status": "writing"}' \
  https://discord.com/api/v9/channels/123456789/messages/live

Response:

{
  "content": "Starting message...",
  "status": "writing",
  "message_id": "987654321"
}
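
For bot authors, the request and response above could be wrapped as follows. This is a sketch of the proposed (not existing) endpoint: the helper names `build_create_live_request` and `parse_create_live_response` are made up for illustration, nothing is sent over the network, and the header uses the `Bot` prefix that Discord requires for bot tokens.

```python
import json

API_BASE = "https://discord.com/api/v9"  # base URL taken from the curl example


def build_create_live_request(channel_id: str, content: str, token: str) -> dict:
    """Describe the proposed POST request without sending it."""
    return {
        "method": "POST",
        "url": f"{API_BASE}/channels/{channel_id}/messages/live",
        "headers": {
            "Authorization": f"Bot {token}",
            "Content-Type": "application/json",
        },
        # A new live message always starts in the "writing" state.
        "body": json.dumps({"content": content, "status": "writing"}),
    }


def parse_create_live_response(raw: str) -> str:
    """Return the message_id needed for the later PATCH requests."""
    data = json.loads(raw)
    return data["message_id"]
```

The returned dictionary can be handed to whatever HTTP client the bot already uses.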

Updating an Existing Message

We suggest adding a new endpoint under the Channels Resource. This endpoint will be used for updating a specific existing message within a specific channel, using their respective IDs.

Appending to a stream:

PATCH /channels/{channel.id}/messages/{message.id}/live

Parameters:

Accepts a JSON object containing the content to be appended to the message and a status key to indicate if the message is 'writing' (being updated) or 'finished' (no more content will be appended).

{
  "content": "your content here",
  "status": "writing/finished"
}

Example usage:

# You're continuing to type your message
curl -X PATCH \
  -H "Authorization: Bot <token>" \
  -H "Content-Type: application/json" \
  -d '{"content": "more text...", "status": "writing"}' \
  https://discord.com/api/v9/channels/123456789/messages/987654321/live

# You're finished with your message
curl -X PATCH \
  -H "Authorization: Bot <token>" \
  -H "Content-Type: application/json" \
  -d '{"content": "the final part of the text.", "status": "finished"}' \
  https://discord.com/api/v9/channels/123456789/messages/987654321/live
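
Put together, a bot would send one POST followed by a series of PATCH requests, with only the last one marked as finished. Below is a minimal sketch of that loop against the proposed endpoints. The function `stream_reply` and the injected `send` callable are assumptions for illustration; `send` stands in for a real HTTP client and is expected to return the decoded JSON response.

```python
from typing import Callable, Iterable

# (method, path, payload) -> decoded JSON response; a stand-in for an HTTP client
Send = Callable[[str, str, dict], dict]


def stream_reply(send: Send, channel_id: str, chunks: Iterable[str]) -> str:
    """Create a live message, append the remaining chunks, and mark it finished."""
    it = iter(chunks)
    created = send(
        "POST",
        f"/channels/{channel_id}/messages/live",
        {"content": next(it, ""), "status": "writing"},
    )
    message_id = created["message_id"]
    path = f"/channels/{channel_id}/messages/{message_id}/live"

    # Hold back one chunk so that the last one can carry status "finished".
    pending = next(it, "")
    for chunk in it:
        send("PATCH", path, {"content": pending, "status": "writing"})
        pending = chunk
    send("PATCH", path, {"content": pending, "status": "finished"})
    return message_id
```

If the model produces only a single chunk, the final PATCH carries empty content and simply closes the message.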

Modifications to the Existing API

To ensure a seamless user experience, this feature needs to be integrated with the existing typing indicator mechanism. For the POST request, initiating a new live message should trigger the typing indicator in the associated channel. Similarly, for the PATCH request, the typing indicator should stay active while the 'status' is 'writing' and should stop upon receiving the 'finished' status.
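
The rule described above is small enough to write down directly. This is only a sketch of the intended behavior; the function name `typing_indicator_active` is hypothetical and not part of any existing API.

```python
def typing_indicator_active(method: str, status: str) -> bool:
    """Return True if the typing indicator should be shown after this request."""
    if method == "POST":
        # Creating a live message always starts the indicator.
        return True
    if method == "PATCH":
        # Updates keep the indicator alive until the message is finished.
        return status == "writing"
    raise ValueError(f"unsupported method: {method!r}")
```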

This active text streaming feature would revolutionize chatbot interactions on Discord. It carries the potential to offer a more dynamic, engaging conversation mode with chatbots, thus improving user interactivity drastically. Furthermore, it opens up avenues for several interesting applications on the Discord platform including but not limited to real-time transcription services, collaborative editing, AI-based real-time conversations, and more.

As a community member, I think the introduction of this new feature could be a significant step towards enhancing the overall Discord experience. I look forward to hearing everybody's thoughts, feedback, and any possible suggestions for improvement.

Some parts of the text and the examples were generated by an LLM.

RealAlphabet · 2023-07-22T14:19:35Z

Only one sentence comes to mind.

Waste of bandwidth.

For bots, for Discord and especially for users on limited mobile connections.

2 replies

perzeuss Jul 22, 2023
Author

I completely understand your concerns, especially if you haven't experienced chatting with a Large Language Model (LLM) via Discord before. To give you a better idea, imagine waiting for a large block of text to appear all at once in a chat conversation. You are waiting and waiting and then you need to scroll up to start reading. This lack of a natural flow can make the interaction feel less interactive and more like reading a wall of text.

Now, in terms of potential solutions, one might think of splitting the response into multiple smaller messages, such as sending ten separate messages instead of one long message. However, this approach would bring its own set of problems. For instance, managing the response would become much more complicated: you can no longer interact with just one message and instead need to handle a list of messages. Deleting a coherent, sequential response that is spread across multiple messages could lead to confusion and make the interaction cumbersome.
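
For comparison, the splitting approach might look roughly like this. It is a sketch, not a recommendation: `split_message` is a made-up helper, and the default of 2000 reflects Discord's character limit for a regular message. The bot then has to keep track of every part it sends, which is the bookkeeping problem described above.

```python
def split_message(text: str, limit: int = 2000) -> list[str]:
    """Split text into parts of at most `limit` characters, preferring spaces."""
    parts = []
    remaining = text
    while len(remaining) > limit:
        # Prefer to break at the last space that still fits within the limit.
        cut = remaining.rfind(" ", 0, limit + 1)
        if cut <= 0:
            cut = limit  # no usable space found: fall back to a hard split
        part = remaining[:cut].rstrip()
        if part:
            parts.append(part)
        remaining = remaining[cut:].lstrip()
    if remaining:
        parts.append(remaining)
    return parts
```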

That's where the idea of LLM streaming messages comes into play. By delivering responses character-by-character or in small, incremental portions, the conversation feels more dynamic and lifelike. It allows users to experience the chat with LLMs in a manner that closely resembles a back-and-forth dialogue, promoting engagement and making the conversation more enjoyable.

Moreover, implementing a response cancellation feature could be the key to addressing bandwidth concerns. If a user is not satisfied with an ongoing streamed response, they could choose to cancel it, saving bandwidth for both the user and Discord servers. This option empowers users to have more control over their interactions and data usage.
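
On the bot side, cancellation could be combined with batching so that fewer update requests are sent in the first place. The sketch below makes two assumptions that are not part of the proposal: tokens are grouped until at least `min_chars` characters have accumulated, and `is_cancelled` is a callback the bot supplies (for example, backed by a button press). Both function names are hypothetical.

```python
from typing import Callable, Iterable, Iterator


def batch_chunks(tokens: Iterable[str], min_chars: int = 80) -> Iterator[str]:
    """Group small tokens into larger chunks to reduce the number of updates."""
    buffer = []
    size = 0
    for token in tokens:
        buffer.append(token)
        size += len(token)
        if size >= min_chars:
            yield "".join(buffer)
            buffer = []
            size = 0
    if buffer:
        # Flush whatever is left once the model has stopped generating.
        yield "".join(buffer)


def take_until_cancelled(
    chunks: Iterable[str], is_cancelled: Callable[[], bool]
) -> Iterator[str]:
    """Stop yielding chunks as soon as the user has cancelled the response."""
    for chunk in chunks:
        if is_cancelled():
            return
        yield chunk
```

Chaining the two, as in `take_until_cancelled(batch_chunks(tokens), is_cancelled)`, gives a chunk stream that sends fewer updates and stops early on cancellation.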

I appreciate your feedback and look forward to hearing more perspectives from the community :)

RealAlphabet Jul 23, 2023

What you say about the friction associated with the need to scroll at the beginning of the message makes perfect sense in this context. After reading your post, I've drafted a new suggestion that might be of interest to you in your use case, and which would be a longer-term solution for other bots and not just chatbots. #6313

devsnek · 2023-07-23T01:44:06Z

We have no plans to implement this. The premise also seems flawed, because normal humans on Discord don't send long essays in a live stream, they send multiple shorter messages. You can train your AI to do the same thing if that's your goal.

0 replies

tomx-sh · 2025-04-12T11:11:22Z

April Fool’s Day was like two weeks ago

Could you please elaborate?
Isn't Midjourney a plain illustration of what I'm saying?

mantikafasi Apr 12, 2025

what you are saying sounds like ai slop
