Integration/chat moderation #2538

Draft: wants to merge 7 commits into base: main
84 changes: 84 additions & 0 deletions content/chat/moderation/custom/index.textile
---
title: Custom Moderation
meta_description: "Detect and remove unwanted content in a Chat Room using a custom provider"
product: chat
---

There may be situations where you have trained your own model, or want to apply proprietary logic running on your own infrastructure, to perform moderation.

Ably provides simple APIs that allow your moderation logic to prevent harmful content from appearing in your chat room.

h2(#beforepublish). Before publish

Before publish moderation is where your moderation logic is invoked before the message is published to your chat room. This has the benefit of preventing harmful content from ever entering your chat room, at the cost of some latency in invoking your moderation logic as part of the publish path.

h3(#configuration). Integration configuration

To fine-tune how Ably handles messages according to your use-case, you can configure before publish rule behavior using the following fields.

- Retry timeout := Maximum duration (in milliseconds) that an attempt to invoke the rule may take (including any retries).
- Max retries := Maximum number of retries after the first attempt at invoking the rule.
- Failed action := The action to take in the event that a rule fails to invoke. Options are reject the request or publish anyway.
- Too many requests action := The action to take if your endpoint returns @429 Too Many Requests@, which may happen if your endpoint is overloaded. The options are to fail moderation, or retry.
- Room filter (optional) := A regular expression to match to specific chat rooms.

h3(#api). The API

Ably provides a simple API for integrations to moderate chat messages. There are some nuances for particular transports, which are described on the pages for the individual transports.

h4(#request). Request format

The request has the following JSON format.

```[json]
{
  "source": "string",
  "appId": "string",
  "roomId": "string",
  "site": "string",
  "ruleId": "string",
  "message": {
    "roomId": "string",
    "clientId": "string",
    "text": "string",
    "metadata": {
      "key": "any"
    },
    "headers": {
      "key": "string"
    }
  }
}
```

h4(#response). Response format

```[json]
{
  "action": "accept|reject",
  "rejectionReason": {
    "key": "any"
  }
}
```

* @action@: Must be either @accept@ or @reject@. @accept@ means that the message will be published to the chat room, @reject@ means it will be rejected.
* @rejectionReason@: Optional. If provided with @action: "reject"@, it can contain any information about why the message was rejected. This information may be sent back to clients.
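For illustration, a moderation endpoint implementing this contract could look like the following sketch. The @bannedWords@ list and @moderate@ function are hypothetical stand-ins for your own logic; only the request and response shapes come from the formats above.

```javascript
// Illustrative only: a simple keyword filter standing in for real
// moderation logic. `bannedWords` and `moderate` are hypothetical.
const bannedWords = ['badword'];

function moderate(request) {
  const text = request.message.text.toLowerCase();
  const match = bannedWords.find((word) => text.includes(word));
  if (match) {
    // Reject, optionally explaining why; Ably may surface this to clients.
    return { action: 'reject', rejectionReason: { matchedWord: match } };
  }
  return { action: 'accept' };
}
```

Your HTTP handler would parse the request body shown above, call something like @moderate@, and return the result with a @200@ status code.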

h3(#error-handling). Error handling

If moderation was performed as expected, regardless of the outcome, your endpoint MUST return a status code of @200@. For other codes, Ably will take the following action:

- 4xx (excluding 429) := Ably will not retry moderation. The message will be handled according to your rule configuration.
- 429 := Ably will only retry if your rule configuration permits retries in the @429 Too Many Requests@ case.
- 5xx := Ably will retry moderation with backoff, until it either succeeds, or the retry window is exceeded.

If, by the end of the retry window, Ably has not been able to get a definitive moderation answer from your endpoint, the next action depends on your rule configuration. If you have elected to publish the message anyway, Ably will do so; you can always remove harmful content after the fact using human moderators or community reporting schemes. Alternatively, if you have elected to reject the message, Ably will not publish it.

h2(#after-publish). After publish

After publish moderation is where your moderation logic is invoked after a message is published to the chat room. In this configuration, harmful content may briefly be visible in the room, although most moderation engines are able to process content and instruct its removal almost instantaneously. This configuration is helpful when you need to prioritise latency and performance.

There isn't currently a chat-specific custom API for after publish moderation.

However, you can still use standard Ably "integration rules":/docs/integrations to send chat messages to your infrastructure and then remove any offending content with the REST API.
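As a sketch, deletion from your own infrastructure could look like the following. The REST route and the @buildDeleteRequest@ helper are illustrative assumptions, not a documented Ably API; consult the Chat REST reference for the exact endpoint on your cluster.

```javascript
// Sketch: delete an offending chat message over REST after publish.
// The route below is an assumption for illustration only.
function buildDeleteRequest(apiKey, roomId, serial) {
  const url =
    'https://rest.ably.io/chat/v2/rooms/' +
    encodeURIComponent(roomId) +
    '/messages/' +
    encodeURIComponent(serial) +
    '/delete';
  return {
    url,
    method: 'POST',
    headers: {
      // Ably REST accepts a Base64-encoded API key via basic auth.
      Authorization: 'Basic ' + Buffer.from(apiKey).toString('base64'),
      'Content-Type': 'application/json',
    },
  };
}

// Usage, e.g. from the service receiving your integration rule's messages:
// const req = buildDeleteRequest(process.env.ABLY_API_KEY, 'room-1', messageSerial);
// await fetch(req.url, { method: req.method, headers: req.headers });
```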
50 changes: 50 additions & 0 deletions content/chat/moderation/custom/lambda.textile
---
title: AWS Lambda
meta_description: "Detect and remove unwanted content in a Chat Room using AWS Lambda."
product: chat
languages:
- javascript
- react
- swift
- kotlin
---

The AWS Lambda rule is a powerful way to add a custom moderation solution to Ably Chat. It enables you to run custom moderation logic, or integrate with your preferred moderation provider, by configuring an AWS Lambda function that is invoked before messages are published to a chat room.

This rule is particularly useful when you want to:
* Integrate with a custom moderation service
* Implement your own moderation logic
* Use a moderation provider that isn't directly supported by Ably

h2(#setup). Integration setup

Configure the integration rule in your "Ably dashboard":https://ably.com/accounts/any/apps/any/integrations or using the "Control API":/docs/account/control-api.

The following fields are specific to the Lambda transport. You can also configure the "general fields":/docs/chat/moderation/custom#configuration.

- AWS region := The region where your Lambda function is deployed.
- Function name := The name of your AWS Lambda function.
- AWS authentication scheme := The authentication method to use. Either @AWS Credentials@ or @ARN of an assumable role@. See the "Ably AWS authentication documentation":https://ably.com/docs/integrations/webhooks/lambda#aws-authentication.


h2(#lambda-response). Lambda-specific response

When invoking Lambda functions, the response code received is from the Lambda runtime, not a response code specific to your function. Therefore, for Lambda transports, the standard response format is wrapped so that you can specify a status code in your response. The Lambda-specific response is as follows:

<pre><code class="json">{
"statusCode": "integer",
"body": "{\"action\": \"accept|reject\", \"rejectionReason\": {\"key\": \"any\"}}"
}</code></pre>

* @statusCode@: The HTTP status code of the response.
* @body@: A JSON string that is the serialization of the standard moderation response.

h2(#best_practices). Best Practices

When implementing your Lambda function, consider the following:

* Keep your function execution time as low as possible to minimize latency
* Implement proper error handling and logging
* Consider implementing rate limiting if you're using a third-party moderation service
* Use appropriate IAM roles and permissions for your Lambda function
* Consider implementing caching for frequently occurring content
59 changes: 59 additions & 0 deletions content/chat/moderation/direct/hive-dashboard.textile
---
title: Hive (Dashboard)
meta_description: "Detect and remove unwanted content in a Chat Room using Hive AI, providing human moderators a place to review and act on content."
product: chat
languages:
- javascript
- react
- swift
- kotlin
---

"Hive dashboard":https://hivemoderation.com/dashboard is a powerful all-in-one moderation tool, which enables you to set up rules to combine automated AI moderation with human review.

The Hive Dashboard rule is a rule applied to chat rooms which enables you to use Hive's "moderation dashboard":https://docs.thehive.ai/docs/what-is-the-moderation-dashboard to review messages in your chat room.

h2(#setup). Integration setup

Configure the integration rule in your "Ably dashboard":https://ably.com/accounts/any/apps/any/integrations or using the "Control API":/docs/account/control-api.

The following are the fields specific to Hive dashboard configuration:

- Hive API Key := The API key for your Hive dashboard account.
- Chat Room Filter (optional) := A regular expression to match the chat room name.

h2(#hive_setup). Hive setup

To use the Hive Dashboard rule, you first need to set up a Hive account and an application.

Once you have done this you can get an API key from the "dashboard settings":https://dashboard.thehive.ai/app/_/settings/api_keys which you can then use to set up the rule.

When the rule is enabled, it will send all published messages in the room to the dashboard for review.
Please note that messages get sent to Hive _after_ they have been published, so users will see them in the chat room until they have been deleted by a moderation rule, although in the case of automatic moderation this will be near-instantaneous.
If you want to prevent messages from reaching users until they have been approved automatically by an AI moderation rule, check out the "Hive model only rule":/docs/chat/moderation/direct/hive-model-only.

In order for the Hive dashboard to delete messages, you need to set up an "action":https://docs.thehive.ai/docs/actions in the dashboard.
The setup is as follows:

# Go to the "post actions":https://dashboard.thehive.ai/app/_/actions/post page in the dashboard.
# Click "+ Create new"
# Ensure that the "Action type" is set to "POST"
# Enter a descriptive name for the action, for example "Delete message"
# Set the "Endpoint URL" field to @https://rest.ably.io/chat/v2/moderation/hive/delete@ (if you have a "dedicated cluster":https://ably.com/docs/platform-customization#setting-up-a-custom-environment the domain should be your custom REST domain)
# Under "Customize API Request", add the @Authorization@ header with the value @Bearer <your Base64 encoded Ably API key>@. Make sure to Base64-encode the API key you get from the Ably dashboard.
# Under "Request Body" add the following two params:
** @serial@ := @Post ID@
** @clientId@ := @User ID@
# Click "Save"

Refer to the screenshot below for an example of the action setup:

<a href="@content/screenshots/hive-dashboard-rule/hive-dashboard-action-setup.png" target="_blank">
<img src="@content/screenshots/hive-dashboard-rule/hive-dashboard-action-setup.png" style="width: 70%" alt="Hive Dashboard Action Setup">
</a>
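With this action configured, each deletion will POST a JSON body containing the two parameters mapped above, for example (illustrative values):

```[json]
{
  "serial": "<message serial from Hive's Post ID>",
  "clientId": "<sender's client ID from Hive's User ID>"
}
```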

You should now see the action in the list of actions, and it will be available to human moderators in the dashboard, as well as for use by auto-moderation rules.

Within the Hive dashboard, messages within a room will appear as part of the same group, where the @group_id@ is the @roomId@ of the room, and edits to a message will appear as children of the original message.

For more information on how to use the Hive Dashboard, please refer to the "Hive Dashboard documentation":https://docs.thehive.ai/docs/what-is-the-moderation-dashboard.
41 changes: 41 additions & 0 deletions content/chat/moderation/direct/hive-model-only.textile
---
title: Hive (Model Only)
meta_description: "Detect and remove unwanted content in a Chat Room using Hive AI."
product: chat
languages:
- javascript
- react
- swift
- kotlin
---

"Hive Moderation":https://hivemoderation.com is a powerful suite of moderation tools that can be used to moderate content in chat rooms.

The Hive (model only) rule is a rule applied to chat rooms in Ably Chat which enables you to use Hive's text moderation models to detect and handle inappropriate content before it is published to other users.

h2(#setup). Integration setup

Configure the integration rule in your "Ably dashboard":https://ably.com/accounts/any/apps/any/integrations or using the "Control API":/docs/account/control-api.

The following are the fields specific to Hive (model only) configuration:

- Hive API key := The API key for your Hive account.
- Thresholds := A map of "text moderation classes":https://docs.thehive.ai/reference/text-moderation to "severity":https://docs.thehive.ai/docs/detailed-class-descriptions-text-moderation. When moderating text, any message deemed to be at or above a specified threshold will be rejected and not published to the chat room.
- Model URL (optional) := A custom URL if using a custom moderation model.
- Retry timeout := Maximum duration (in milliseconds) that an attempt to invoke the rule may take (including any retries). The possible range is 0 - 5000ms.
- Max retries := Maximum number of retries after the first attempt at invoking the rule.
- Failed action := The action to take in the event that a rule fails to invoke. Options are reject the request or publish anyway.
- Too many requests action := The action to take in the event that Hive returns @429 Too Many Requests@. Options are to fail rule invocation, or retry.
- Room filter (optional) := A regular expression to match to specific chat rooms.

h2(#text-length). Text length

Hive's models accept content with a maximum length of 1024 characters. If a message is longer than this, Ably automatically breaks the text into smaller requests, with crossover between segments to ensure context is preserved.

Ably will aggregate the model responses, rejecting the message as a whole if one or more of the text segments fail to pass the threshold requirements.

h2(#rejections). Handling rejections

If a message fails moderation and the rule policy is to reject, then it will be rejected by the server.

Moderation rejections will use error code @42213@.
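For example, a client could distinguish moderation rejections from other send failures by checking this code. The @isModerationRejection@ helper below is illustrative; it assumes the error surfaced by the SDK exposes the Ably error code on a @code@ property, as Ably's @ErrorInfo@ does.

```javascript
// Sketch: distinguishing a moderation rejection from other send failures.
const MODERATION_REJECTED = 42213;

function isModerationRejection(err) {
  return typeof err === 'object' && err !== null && err.code === MODERATION_REJECTED;
}

// Illustrative usage with the Chat SDK:
// try {
//   await room.messages.send({ text });
// } catch (err) {
//   if (isModerationRejection(err)) {
//     // show a "message blocked by moderation" notice instead of a generic error
//   }
// }
```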
48 changes: 48 additions & 0 deletions content/chat/moderation/index.textile
---
title: Moderation
meta_description: "Detect and remove unwanted content in a Chat Room."
product: chat
languages:
- javascript
- react
- swift
- kotlin
---

Moderation is a crucial feature for chat rooms and online communities to maintain a safe, respectful, and engaging environment for all participants. Moderators help enforce community guidelines and remove potentially harmful content that can drive users away from an online experience.

Moderation strategies can take many forms. Human moderators can participate in the chat room, using community guidelines to make judgements on the chat content and taking action, such as deleting a message, when it is found to be in violation of standards.

Many modern approaches involve moderation engines and Artificial Intelligence models, which can screen content to filter out harmful messages before they are allowed into the chat room, without the need for human moderators. Many of these are highly configurable, allowing you to screen content across multiple categories to suit your needs.

Hybrid approaches can make the best of both worlds, employing AI to pre-screen messages, with human moderators able to make judgement calls on edge cases or in response to user feedback.

Ably Chat supports a variety of moderation options in chat rooms, to help you keep your participants safe and engaged.

h2(#types). Types of moderation

Moderation with Ably falls into two categories: before and after publish.

h3(#before-publish). Before publish

When using before publish moderation, a message is reviewed by an automated moderation engine (such as an AI model) before it is published to the chat room. This is helpful in sensitive scenarios where inappropriate content being visible in the chat room for even a second is unacceptable, for example, in schools.

This approach provides additional safety guarantees, but may come at the cost of a small amount of latency, as messages must be vetted prior to being published.

h3(#after-publish). After publish

When using after publish moderation, a message is published as normal, but is forwarded to a moderation engine after the fact. This enables you to avoid the latency penalty of vetting content prior to publish, at the expense of bad content being visible in the chat room (at least briefly). Many automated moderation solutions are able to process and delete offending messages within a few seconds of publication.

Please note that message deletion is currently performed as a soft delete, meaning that your application will need to filter out deleted messages that it sees.

h2(#directvscustom). Direct vs Custom

There are a plethora of moderation options available on the market, from simple pattern-matching APIs to fully fledged machine learning models.

Ably provides direct integrations between your chat room and moderation providers, to give you access to powerful moderation platforms with minimal code and setup. If a provider you are looking for is not listed, please get in touch!

Alternatively, you might have a custom solution you wish to integrate with, or a provider that Ably doesn't yet directly support. If this is the case, Ably offers a custom option, where you can utilize serverless functions such as AWS Lambda to reach out to your own infrastructure to moderate chat messages.

h3(#hive). Hive

Hive provides automated content moderation solutions. The first of these is the "model only":https://hivemoderation.com/text-moderation solution, which provides access to a powerful ML model that takes content and categorises it against various criteria, for example, violence or hate speech. For each classification, it also provides an indication of the severity of the infraction. Using this information, you can determine what level of classification is appropriate for your chat room and filter or reject content accordingly.

The second solution is the "dashboard":https://hivemoderation.com/dashboard. This is an all-in-one moderation tool that allows you to combine automated workflows using ML models with human review and decisions to control the content in your chat room.

Ably can integrate your chat rooms directly with both of these solutions, allowing you to get up and running with Hive moderation with minimal code required.
36 changes: 36 additions & 0 deletions src/data/nav/chat.ts
@@ -1,3 +1,4 @@
import { NavProduct } from './types';

export default {
@@ -63,6 +64,41 @@ export default {
},
],
},
{
name: 'Moderation',
pages: [
{
name: 'Introduction',
link: '/docs/chat/moderation',
},
{
name: 'Direct Integrations',
pages: [
{
name: 'Hive (Model Only)',
link: '/docs/chat/moderation/direct/hive-model-only',
},
{
name: 'Hive (Dashboard)',
link: '/docs/chat/moderation/direct/hive-dashboard',
},
],
},
{
name: 'Custom',
pages: [
{
name: 'API Overview',
link: '/docs/chat/moderation/custom',
},
{
name: 'AWS Lambda',
link: '/docs/chat/moderation/custom/lambda',
},
],
},
],
},
],
api: [
{