Add the CHATHISTORY extension specification - refactored #393

Merged
merged 27 commits on Nov 9, 2020
Changes from 7 commits
Commits
bbdba28 Chathistory extension (prawnsalad, Jun 26, 2019)
c633bb7 Require message-tags, not draft/message-tags (prawnsalad, Jun 26, 2019)
e7d5def Examples using msgid instead of draft/msgid (prawnsalad, Jun 26, 2019)
b0f524f Language changes for returned message notes (prawnsalad, Jun 26, 2019)
92b9e20 Backticks are cool (prawnsalad, Jun 26, 2019)
e1a6b43 Defining history content for query buffers; Language change for LATES… (prawnsalad, Jun 27, 2019)
3a39a29 Include the target on FAIL messages (prawnsalad, Sep 16, 2019)
950b019 updates to chathistory spec (slingamn, Jan 31, 2020)
7942807 move "consistent message order" language up in the spec (slingamn, Feb 17, 2020)
215b7f5 Merge pull request #2 from slingamn/chathistory.3 (prawnsalad, Feb 17, 2020)
5d90804 add WIP warnings and draft prefix (slingamn, Feb 24, 2020)
6ede679 clarify dependency relationship (slingamn, Feb 24, 2020)
e9eb159 fix specification of latest * (slingamn, May 6, 2020)
0c271a5 Merge pull request #3 from slingamn/chathistory_draft (prawnsalad, Jul 20, 2020)
caa9cbd fix example pseudocode (slingamn, Oct 5, 2020)
8b41bfb Merge pull request #4 from slingamn/chathistory_pseudocode_revisions (prawnsalad, Oct 5, 2020)
550eb70 add caveat about bouncers not providing msgids (slingamn, Oct 12, 2020)
e5d7359 language and formatting tweaks (slingamn, Oct 13, 2020)
8473d9a mention channel membership issues under security considerations (slingamn, Oct 13, 2020)
0570bd6 update copyright notice (slingamn, Oct 15, 2020)
aaa33fb adjust copyright notice again (slingamn, Oct 15, 2020)
e0deca9 specify the batch parameter (slingamn, Oct 20, 2020)
841ad22 caution about nickname changes (slingamn, Oct 20, 2020)
aab8bff rearrange and clarify some language (slingamn, Oct 30, 2020)
40a919b add some tentative language about nick rewriting (slingamn, Oct 30, 2020)
36b3b44 make the batch type and param into MUSTs (slingamn, Nov 3, 2020)
aa80df2 add INVALID_TARGET fail code (slingamn, Nov 4, 2020)
115 changes: 115 additions & 0 deletions extensions/chathistory.md
@@ -0,0 +1,115 @@
---
title: IRCv3 chathistory extension
layout: spec
work-in-progress: true
copyrights:
  -
    name: "Evan Magaliff"
    period: "2017"
    email: "[email protected]"
  -
    name: "Darren Whitlen"
    period: "2018-2019"
    email: "[email protected]"
---
## Description
This document describes the format of the `chathistory` extension. This enables clients to request messages that were previously sent if they are still available on the server.

The server as mentioned in this document may refer to either an IRC server or an IRC bouncer.

## Implementation
The `chathistory` extension uses the [chathistory][batch/chathistory] batch type and introduces a client command, `chathistory`.

To fully support this extension, clients MUST support the [`batch`][batch], [`server-time`][server-time] and [`message-tags`][message-tags] capabilities.
Contributor

Someone interpreted this language as meaning that chathistory has a hard dependency on batch, server-time, and message-tags, and therefore that chathistory could not necessarily be used to replace znc.in/playback.

I think it's worth clarifying that this is not the case: degraded functionality (in particular, functionality equivalent to znc.in/playback) is still available even without support for any of these caps.


The `chathistory` capability MUST be negotiated. This allows the server and client to act differently when delivering message history on connection.

An ISUPPORT token MUST be sent to the client to state the maximum number of messages a client can request in a single command, represented by an integer, for example `CHATHISTORY=50`. If the value is `0`, the client SHOULD assume that there is no maximum number of messages.
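
As a rough illustration of how a client might consume this token, the following Python sketch reads the `CHATHISTORY` value from a list of ISUPPORT tokens and caps a requested message count accordingly. The function names are hypothetical, and treating a missing token the same as `0` is an assumption of this sketch, not something the text above specifies.

```python
def chathistory_limit(isupport_tokens):
    """Return the advertised maximum as an int, or None when no cap applies.

    Assumption of this sketch: a missing token is treated like CHATHISTORY=0.
    """
    for token in isupport_tokens:
        name, _, value = token.partition("=")
        if name == "CHATHISTORY" and value.isdigit():
            maximum = int(value)
            return maximum if maximum > 0 else None
    return None


def clamp_limit(requested, maximum):
    """Cap a requested message count at the server maximum (None means no cap)."""
    return requested if maximum is None else min(requested, maximum)
```

A client could call `clamp_limit(200, chathistory_limit(tokens))` before building a request, so that it never asks for more messages than the server has advertised.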

### `CHATHISTORY` Command
`CHATHISTORY` content can be requested by the client by sending the `CHATHISTORY` command to the server. A `batch` MUST be returned by the server. If no content exists to return, an empty batch SHOULD be returned to avoid the client waiting for a reply and to indicate that no content is available.


The `chathistory` command uses the following general syntax structure:

CHATHISTORY <subcommand> <target> <timestamp | msgid> <limit>

The `target` parameter specifies a single buffer (channel or nickname) from which history SHOULD be retrieved. Any `timestamp` values or parameters MUST be in the format of `YYYY-MM-DDThh:mm:ss.sssZ`.
Contributor

Consider a client that is disconnected from a bouncer offering chathistory. How can I know which nickname targets may have messaged me while I was disconnected? Someone I have never spoken to from this client may message me while I am disconnected, and the API offers no way to discover which targets are available, so after reconnecting the client doesn’t know there are messages for a specific target.

This is less of a problem for channels because you likely get a JOIN event so you can retrieve the history afterwards. ZNC-Playback solves this problem by allowing wildcards here (although that introduces other problems of its own).

Contributor

I was thinking a special-purpose target like *self that refers to your own direct messages with all other clients, including your self-messages.

Contributor

This is resolved now by the * target (merged into the PR).


If a nickname is given as the `target`, then the server SHOULD include history sent in both directions between the current user and the target nickname, so as to give a full conversation. The server SHOULD attempt to include history involving other nicknames if either the current user's nickname or the target's nickname changed during the requested timeframe.

#### Subcommands

The following subcommands are used to describe how the server should return messages relative to the `timestamp` or `msgid` given.

#### `BEFORE`

CHATHISTORY BEFORE <target> <timestamp=YYYY-MM-DDThh:mm:ss.sssZ | msgid=1234> <limit>

Request up to `limit` messages before and including the given `timestamp` or `msgid`. Exactly one of `timestamp` or `msgid` MUST be given, not both.
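
To make the syntax concrete, here is a small Python sketch of how a client might format a timestamp in the required `YYYY-MM-DDThh:mm:ss.sssZ` form and assemble a `BEFORE` request. The helper names are hypothetical; the sketch only mirrors the command syntax shown above and rejects calls that supply both selectors or neither.

```python
from datetime import datetime, timezone


def format_timestamp(moment):
    """Format a timezone-aware datetime as YYYY-MM-DDThh:mm:ss.sssZ (UTC, milliseconds)."""
    utc = moment.astimezone(timezone.utc)
    return utc.strftime("%Y-%m-%dT%H:%M:%S.") + f"{utc.microsecond // 1000:03d}Z"


def chathistory_before(target, limit, *, timestamp=None, msgid=None):
    """Build a CHATHISTORY BEFORE command from exactly one selector."""
    if (timestamp is None) == (msgid is None):
        raise ValueError("give exactly one of timestamp or msgid")
    if timestamp is not None:
        selector = f"timestamp={format_timestamp(timestamp)}"
    else:
        selector = f"msgid={msgid}"
    return f"CHATHISTORY BEFORE {target} {selector} {limit}"
```

The same pattern would apply to `AFTER` and `AROUND`, which take the same parameters.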

#### `AFTER`
CHATHISTORY AFTER <target> <timestamp=YYYY-MM-DDThh:mm:ss.sssZ | msgid=1234> <limit>
Request up to `limit` messages after and excluding the given `timestamp` or `msgid`. Exactly one of `timestamp` or `msgid` MUST be given, not both.

#### `LATEST`
CHATHISTORY LATEST <target> <* | timestamp=YYYY-MM-DDThh:mm:ss.sssZ | msgid=1234> <limit>
Request the most recent messages that have been sent after and excluding the given `timestamp` or `msgid`. If a `*` is given instead of a timestamp or msgid, the server MUST use the current time as a timestamp. The number of messages returned MUST be equal to or less than `limit`. If a `*` is not given, exactly one of `timestamp` or `msgid` MUST be given, not both.
Contributor

Multiple messages might have the exact same server-time. As a client, I'd likely want to provide an inclusive timestamp instead of an exclusive one and de-duplicate the result locally based on msgid. Otherwise there can be cases where I disconnected while receiving a series of messages with the exact same timestamp and lose one of them.

It might even be better to ask for a timestamp from a few minutes earlier and handle msgid-based de-duplication, since there may be messages I didn't receive that carry an older timestamp because they originated from a different linked server with latency to the server I am connected to.

msgid lookup can face similar problems.


This is useful for retrieving the latest conversation when first joining a channel or opening a query buffer.
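
The msgid-based de-duplication suggested in the review comment above could look roughly like the following Python sketch. It assumes the client has already parsed each message into a dict with hypothetical `msgid` and `time` keys, and that every message carries a msgid. It merges a freshly fetched, possibly overlapping page into the messages the client already holds.

```python
def merge_history(known, fetched):
    """Merge fetched messages into known ones, skipping msgids already seen.

    Each message is a dict with "msgid" and "time" keys (names assumed here).
    """
    seen = {message["msgid"] for message in known}
    merged = list(known)
    for message in fetched:
        if message["msgid"] not in seen:
            seen.add(message["msgid"])
            merged.append(message)
    # UTC timestamps in the fixed YYYY-MM-DDThh:mm:ss.sssZ form sort correctly as strings.
    merged.sort(key=lambda message: message["time"])
    return merged
```

Because the sort is stable, messages that share a timestamp keep the order in which the client first saw them.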

#### `AROUND`
CHATHISTORY AROUND <target> <timestamp=YYYY-MM-DDThh:mm:ss.sssZ | msgid=1234> <limit>
Request a number of messages before and after the `timestamp` or `msgid`, with the total number of returned messages not exceeding `limit`. The implementation may decide how many messages to include before and after the selected message. Exactly one of `timestamp` or `msgid` MUST be given, not both.

This is useful for retrieving conversation context around a single message.

#### `BETWEEN`
CHATHISTORY BETWEEN <target> <timestamp=YYYY-MM-DDThh:mm:ss.sssZ | msgid=1234> <timestamp=YYYY-MM-DDThh:mm:ss.sssZ | msgid=1234> <limit>
Request up to `limit` messages between the given `timestamp` or `msgid` values. The returned messages MUST start from the first message selector (inclusive) and finish at the second (exclusive). The range may run forwards or backwards in time.
Contributor

Can you use a timestamp for one selector and a msgid for the other?
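
As an illustration of one possible reading of the `BETWEEN` semantics, here is a minimal server-side sketch in Python. It handles timestamp selectors only, so it does not answer the mixed-selector question above. It treats the first selector as inclusive and the second as exclusive, and returns results in ascending order whichever direction the range runs. The function name and the `(time, text)` tuple layout are assumptions made for this example.

```python
def select_between(history, start, end, limit):
    """Pick up to `limit` messages from `start` (inclusive) to `end` (exclusive).

    `history` is a list of (time, text) tuples sorted ascending by time, with
    times in the fixed YYYY-MM-DDThh:mm:ss.sssZ form so string comparison works.
    If `start` is later than `end`, the range runs backwards in time.
    """
    if limit <= 0:
        return []
    if start <= end:
        matching = [m for m in history if start <= m[0] < end]
        return matching[:limit]
    matching = [m for m in history if end < m[0] <= start]
    # Walking backwards from `start` keeps the newest messages in the range.
    return matching[-limit:]
```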


#### Returned message notes
The returned messages MUST be in ascending time order and the `server-time` tag SHOULD be the time at which the message was received by the IRC server. The `msgid` tag that identifies each individual message in a response MUST be the `msgid` tag as originally sent by the IRC server.
Contributor

the server-time tag SHOULD be the time at which the message was received by the IRC server

When it comes to linked servers, does this mean that the time will be that of the server I am connected to, or that of the server that processed the message?

If the former, then a message's "server-time" could change depending on which server I've connected to?

Contributor

I interpreted this as "the server that initially processed the message", i.e., the time tag should be the same across all relays and replays of the message.

Intriguingly, the ambiguity you're identifying exists even outside the context of history, and AFAICT is not clarified by the latest server-time spec.

Contributor

I was thinking “the server that initially processed the message” too, although that likely means you might not see messages in ascending time order when they arrive live from other linked servers, which makes some of the commands below tricky to use.

For example, I may receive the following messages, where the various nicks are connected to different linked servers. doe@aus is notable because they are connected to an Australian IRCd, which has the highest latency:

@time=2020-01-23T08:39:42.102Z :[email protected] PRIVMSG #example :Radio is online
@time=2020-01-23T08:39:42.102Z :[email protected] PRIVMSG #example :Now playing XYZ
@time=2020-01-23T08:39:42.042Z :dennis@eu PRIVMSG #example :Hello World
@time=2020-01-23T08:39:42.021Z :kyle@us PRIVMSG #example :Hello World
@time=2020-01-23T08:39:41.944Z :doe@aus PRIVMSG #example :Hello World

Suppose I disconnect after the first message was received or processed, then reconnect and ask for CHATHISTORY LATEST, giving the last message timestamp that I’ve seen as 2020-01-23T08:39:42.102Z. Per the spec, I would then receive no additional messages, despite never having seen the other 4 messages. msgid lookup has similar problems because the server I connect to may not know the order in which I received the messages.

Contributor

Over on prawnsalad#1 we're planning to make most of the commands use timestamps only as parameters. That will allow clients to compensate for skew/latency across the network by adding or subtracting a fuzz duration. (10 seconds seems like a sane value: Unreal disables linking entirely when it detects 30 seconds of skew.)

Contributor

In what sense does removing the ability to specify msgids allow clients to do anything?

The situation where it makes sense to use times and a fudge factor is when you're starting to update your existing partial history which came from another server. The rest of the time it's pointless waste.

Either way, with the time order requirement, a distributed implementation that's not strictly serialising has to say inconsistent things, either by returning history in an order it didn't really happen or by returning messages with timestamps that they didn't really have. I could imagine either causing problems for a client that tries to construct a canonical history, but I also just don't want to be required to do extra work to provide less accurate information.

Contributor

@slingamn slingamn Jan 27, 2020

I just want to get clear on the terms of this debate. If I understand you correctly, your primary objection is to the requirement that results be returned in time order, not to the requirement that queries support timestamps as parameters. As a server vendor, you're going to have to do the work required to answer queries of the form "what messages were sent between 2020-01-27T17:22:09.270Z and 2020-01-27T17:40:09.282Z?" regardless.

Here's my position on this issue:

  1. Since there is no global total order on messages, it is unnecessarily difficult for client developers to correctly use msgids in ordering queries. With timestamps, the client and all servers in the network can agree on the intended semantics of the query. (If clients need to support timestamp fudging regardless to handle the case where they reconnect to a different server, in what sense is "waste" occurring? What is being wasted?)
  2. The preferred compromise is "returning history in an order it didn't really happen", not sending fake timestamps. The only thing that can cause the timestamp order to contradict the "causal order" of messages is clock skew among servers (not latency). Clock skew among well-managed servers should be on the order of a few milliseconds, so this is a rare edge case.
  3. The server-time spec already says "a client SHOULD treat the message as having occurred at the given time instead of its current time": it is implicit in this that the timestamp order overrides the order of message delivery (and I think this is the behavior implemented in many clients that support server-time). [retracted, people think this is factually incorrect and it seems peripheral to the argument]


#### Errors and Warnings
Contributor

Suggestions for more FAIL messages:

  • UNKNOWN_CRITERIA <criteria>: for when someone specifies an invalid criterion like wibble=foo instead of e.g. msgid=abc123.

  • INVALID_CRITERIA <criteria>: for when someone specifies a criterion that's not valid for a command, e.g. the wildcard (*) criterion with a non-LATEST command.

  • INVALID_LIMIT <limit>: for when someone specifies a limit that is out of the allowed range.

Errors are returned using the standard replies syntax.

If the server receives a `CHATHISTORY` command with an unknown subcommand, the `UNKNOWN_COMMAND` error code MUST be returned.
> FAIL CHATHISTORY UNKNOWN_COMMAND the_given_command :Unknown command

If the server receives a `CHATHISTORY` command with missing parameters, the `NEED_MORE_PARAMS` error code MUST be returned.
> FAIL CHATHISTORY NEED_MORE_PARAMS the_given_command :Missing parameters

If no message history can be returned due to an error, the `MESSAGE_ERROR` error code SHOULD be returned.
> FAIL CHATHISTORY MESSAGE_ERROR the_given_command the_given_target [extra_context] :Messages could not be retrieved
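
For illustration, a client might pick these replies apart along the lines of the Python sketch below. The function name and the returned dict layout are hypothetical. The sketch ignores message tags and assumes the description is sent as a trailing parameter introduced by ` :`, as in the examples above.

```python
def parse_fail(line):
    """Parse 'FAIL <command> <code> [<context>...] :<description>'.

    Returns None for lines that are not FAIL replies. Simplification: a
    description sent without the leading ':' would end up in "context".
    """
    head, separator, description = line.partition(" :")
    words = head.split()
    if words and words[0].startswith(":"):
        words = words[1:]  # drop an optional source prefix such as ':irc.host'
    if len(words) < 3 or words[0] != "FAIL":
        return None
    return {
        "command": words[1],
        "code": words[2],
        "context": words[3:],
        "description": description if separator else None,
    }
```

A client could then branch on `code`, for example treating `MESSAGE_ERROR` as a signal to stop waiting for a batch for the target named in `context`.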

### Examples

Requesting the latest conversation upon joining a channel
~~~~
[c] CHATHISTORY LATEST #channel * 50
[s] :irc.host BATCH +ID chathistory #channel
[s] @batch=ID;msgid=1234;time=2019-01-04T14:33:26.123Z :nick!ident@host PRIVMSG #channel :message
[s] @batch=ID;msgid=1235;time=2019-01-04T14:33:38.123Z :nick!ident@host NOTICE #channel :message
[s] @batch=ID;msgid=1238;time=2019-01-04T14:34:17.123Z;+client_tag=val :nick!ident@host PRIVMSG #channel :ACTION message
[s] :irc.host BATCH -ID
~~~~

Requesting further message history than our client currently has
~~~~
[c] CHATHISTORY BEFORE bob timestamp=2019-01-04T14:34:17.123Z 50
[s] :irc.host BATCH +ID chathistory bob
[s] @batch=ID;msgid=1234;time=2019-01-04T14:34:09.123Z :bob!ident@host PRIVMSG alice :hello
[s] @batch=ID;msgid=1235;time=2019-01-04T14:34:10.123Z :alice!ident@host PRIVMSG bob :hi! how are you?
[s] @batch=ID;msgid=1238;time=2019-01-04T14:34:16.123Z :bob!ident@host PRIVMSG alice :I'm good, thank you!
[s] :irc.host BATCH -ID
~~~~
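
On the client side, the exchanges above could be consumed with something like the following Python sketch, which groups raw server lines (without the `[s]` prefix used in the examples) into completed `chathistory` batches keyed by batch reference. The function name is hypothetical, and tag values are left unescaped for brevity.

```python
def collect_chathistory_batches(lines):
    """Return {batch reference: [(tags, line without tags), ...]} for closed batches."""
    open_batches = {}
    finished = {}
    for line in lines:
        tags = {}
        rest = line
        if line.startswith("@"):
            raw_tags, _, rest = line[1:].partition(" ")
            for item in raw_tags.split(";"):
                if item:
                    key, _, value = item.partition("=")
                    tags[key] = value
        words = rest.split(" ")
        if words[0].startswith(":"):
            words = words[1:]  # skip the source when looking for the command
        if len(words) >= 2 and words[0] == "BATCH":
            reference = words[1]
            if reference.startswith("+") and len(words) >= 3 and words[2] == "chathistory":
                open_batches[reference[1:]] = []
            elif reference.startswith("-") and reference[1:] in open_batches:
                finished[reference[1:]] = open_batches.pop(reference[1:])
        elif tags.get("batch") in open_batches:
            open_batches[tags["batch"]].append((tags, rest))
    return finished
```

An empty list for a reference corresponds to the empty batch a server returns when no content is available.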

## Use Cases
The batch type and supporting command are useful for allowing an "infinite scroll" type capability within the client. Upon scrolling to the top of the active window, or on a manual trigger, a client may request `chathistory` from the server and, after receiving the returned content, prepend it to the top of the window. Users can repeat this historic scrolling to retrieve prior history until limitations are met (see below).
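
A minimal Python sketch of the scrolling flow might look like this. It assumes the client keeps its displayed messages as an ascending list of dicts with hypothetical `msgid` and `time` keys. Because `BEFORE` as described above includes the message at the given timestamp, the sketch drops messages it already displays before prepending a page.

```python
def next_page_request(target, displayed, page_size):
    """Build the command that fetches the next page of older history."""
    if not displayed:
        return f"CHATHISTORY LATEST {target} * {page_size}"
    oldest = displayed[0]
    return f"CHATHISTORY BEFORE {target} timestamp={oldest['time']} {page_size}"


def prepend_page(displayed, page):
    """Put a fetched page above the displayed messages, skipping known msgids."""
    known = {message["msgid"] for message in displayed}
    return [message for message in page if message["msgid"] not in known] + displayed
```

When `prepend_page` adds nothing new, the client can take that as a hint that it has reached the start of the available history.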

Upon joining a channel, a client may request the latest messages for the buffer so that the active conversation context may be retrieved.

## Security Considerations
Secure identification of users and clients MUST exist in order to ensure that users cannot obtain history they are not authorised to view. Use of account names, internal account identifiers, or certificate fingerprints SHOULD be strongly considered when matching content to users. If a client requests content for a target that they do not have permission for, e.g. a channel they are banned from, an empty batch SHOULD be returned as if no content exists.

While an ISUPPORT token value of `0` may be used to indicate no message limit exists, servers SHOULD set and enforce a reasonable maximum and properly throttle `CHATHISTORY` commands to prevent abuse.
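
The text above leaves the throttling mechanism to the implementation. One common choice is a token bucket, sketched below in Python. The class name and parameters are hypothetical and not part of this specification, and a server would typically keep one instance per client or account.

```python
class HistoryThrottle:
    """Token bucket: up to `burst` requests at once, refilled at `rate` per second."""

    def __init__(self, rate, burst):
        self.rate = rate
        self.burst = burst
        self.tokens = float(burst)
        self.last = None

    def allow(self, now):
        """Return True if a CHATHISTORY command arriving at time `now` may proceed."""
        if self.last is not None:
            elapsed = now - self.last
            self.tokens = min(self.burst, self.tokens + elapsed * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False
```

In practice `now` would come from a monotonic clock such as `time.monotonic()`. Passing it in explicitly keeps the sketch easy to test.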