Chris Kalafarski edited this page Aug 5, 2017 · 21 revisions

Standards & Specifications for syndicated.media

Version 1.0 (2017-03-14)

1 Introduction

1.1 About this document

This document describes aspects, technological and not, of the various systems used in media production and distribution as they relate to podcasting. In this document, the term podcasting refers to the publishing of audio content on the surface web using standard technologies via a document describing the content and associated metadata.

This document recognizes that it is common to refer to discrete pieces of content (e.g. episodes) or related groups of content (e.g. shows or series) as "podcasts". In order to avoid confusion or ambiguity, this document will not use the term podcast in that context; it refers only to a method of distribution.

This document attempts to define podcasting unambiguously. It includes technical details as well as non-technical descriptions and best practices, and will not strictly adhere to RFC 2119 or similar. The language is designed to be clear, meaningful, and not open to interpretation, without strictly defining key words. Examples are used to give context to ideas, as well as point out important exceptions or clarifications. When terminology has broadly accepted meanings, those terms will not be explicitly defined or restated.

This document may include commentary or other language that is simply there to help the reader understand why a certain standard or practice has been adopted and why it is defined the way it is.

1.2 What is syndicated.media?

Syndicated.media is a community-driven working group focused on the future of podcasting. It exists to ensure that podcasting grows to meet the needs of listeners, creators, producers, publishers, advertisers, and developers, without sacrificing the incredible groundwork that has been established to make it an open and inclusive medium.

Anyone is able to, and everyone is encouraged to, participate in the proceedings of the group. That includes generating ideas, discussing the possible implementation of those ideas, helping determine the priorities of the group, and acting on the results of the group's activities.

1.3 What is podcasting?

Content of various types can make use of podcasting. That content may represent stories, programs, shows, classes, tutorials, news and weather reports, or many other formats. Podcasting is a form of distribution, in much the same way that DVD, FM radio, and cable television are forms of distribution.

At its most fundamental level, distribution of content through podcasting is accomplished by maintaining a well-formed text file (a "feed") on the world wide web which describes the content being distributed, and points to media resources for that content. In many cases a feed will contain information about a single show or series (which is to say, related content created by a single owner or entity), but not always.

Clients (hardware or software designed to utilize content made available through podcasting) access the feeds, wherever they may be on the web, and make use of the metadata and media. Because feeds are available on the web, and the format of all feeds is mostly standardized, there are few limitations on who can build a client, or which podcasting content any given client has access to.

By definition, for the purposes of syndicated.media and the standards it puts forth, if a feed exists only behind a paywall or another means of keeping web content private (e.g., password protection), that feed is not being distributed through podcasting. If content is not available via feed, it is also not being distributed through podcasting.

1.4 What are syndicated.media standards?

Syndicated.media is producing a suite of standards to unambiguously define various aspects of the podcasting platform. These include existing standards (e.g., RSS 2.0 and Media RSS), formalization of industry best practices that have not been explicitly defined, as well as new standards, technical specifications, and best practices.

The collective purpose of these standards is to unambiguously define the formats and methods of transmission of data used for podcasting, as well as the behavior of clients that make use of those data.

1.5 Why are these standards being created?

Due to the nature and history of podcasting as a decentralized, distributed platform, many technical and non-technical aspects of the medium are lacking clarity. These standards are being selected and designed to eliminate the ambiguity that exists within the platform, and introduce new capabilities in a way that better guarantees widespread adoption.

A main goal of the syndicated.media standards is to make the interaction between any consumer and any producer of podcasting data predictable and reliable, provided that each side conforms to the standards. As such, conformance to any specific version of the syndicated.media standard is considered to be all-or-nothing.

2 Syndication Fundamentals

2.1 Feed Format

A feed used for podcasting describes fully (or provides access to resources that, as a whole, describe fully) the content being made available. The feed is a text file that implements a number of standards.

Feeds adhere strictly to the RSS 2.0 Specification, use UTF-8 character encoding, and have a media type of application/rss+xml.

Other parts of the syndicated.media standards may restrict how aspects of RSS 2.0 can be used. That is to say, syndicated.media standards include a subset of the full RSS 2.0 Specification, but all feeds conforming to syndicated.media standards do adhere entirely to the RSS 2.0 Specification.

2.1.1 URL attribute of enclosures

The RSS 2.0 Specification includes the following language in the context of URLs used for enclosure elements:

The url must be an http url.

This is taken to mean that the resources these URLs point to are available over an HTTP protocol, not that the URLs must only include a scheme of exactly http://. As such, other schemes, including https://, are considered valid, as long as the resource is being served via an HTTP protocol, which includes HTTPS.

This interpretation is in line with other parts of the RSS 2.0 Specification that do explicitly allow the https:// scheme, as well as modern expectations of web resources being available securely.
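
As a rough illustration of this interpretation, a client might accept an enclosure URL when its scheme is either http or https. The following is a minimal sketch using Python's standard library; the function name is_acceptable_enclosure_url and the requirement that a host be present are assumptions made for this example, not part of the standard.

```python
from urllib.parse import urlsplit


def is_acceptable_enclosure_url(url: str) -> bool:
    """Return True if the URL uses an HTTP-family scheme (http or https).

    Hypothetical helper: the name and the host check are assumptions
    made for illustration only.
    """
    parts = urlsplit(url)
    # urlsplit lowercases the scheme, so "HTTPS://..." is handled as well.
    return parts.scheme in ("http", "https") and bool(parts.netloc)
```

Under this sketch, https://example.com/ep1.mp3 would be accepted, while ftp://example.com/ep1.mp3 or a bare relative path would not.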

3 Extending RSS

3.1 Syndicated.media Core Extensions Namespace

Syndicated.media has defined a namespace to be used for adding elements to feeds that are not part of the RSS 2.0 Specification itself (as is allowed by the spec).

The namespace name for such elements is:

https://schema.syndicated.media/core/1.0/

3.1.1 Namespace Implementation Note

Clients making use of the syndicated.media core extension namespace (as well as other syndicated.media namespaces, and namespaces designed by other groups) should parse XML data for feeds based on the namespace name, not the prefix used for tags in the feed.

As an example, a feed parser should not search specifically for <itunes:image> tags; it should search for <image> tags within the http://www.itunes.com/dtds/podcast-1.0.dtd namespace that Apple has defined.

There should be no expectation that feeds will use any particular prefix for a given namespace. Even though it is extremely common for feeds to use an itunes: prefix for tags within the Apple namespace, programming parsers to make that assumption is fragile. Nearly all popular XML parsing libraries properly support XML namespace name-to-prefix mapping.
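
The point above can be sketched with Python's standard xml.etree.ElementTree module. The sample feed deliberately binds the Apple namespace to the unusual prefix apple: to show that matching on the namespace name still works. The sample feed, its URLs, and the function name find_apple_image_href are assumptions made for this example.

```python
import xml.etree.ElementTree as ET

# Namespace name defined by Apple, as quoted in the text above.
APPLE_NS = "http://www.itunes.com/dtds/podcast-1.0.dtd"

# Hypothetical feed that uses "apple:" rather than the common "itunes:" prefix.
SAMPLE_FEED = """<rss version="2.0"
     xmlns:apple="http://www.itunes.com/dtds/podcast-1.0.dtd">
  <channel>
    <title>Example Show</title>
    <apple:image href="https://example.com/art.jpg"/>
  </channel>
</rss>"""


def find_apple_image_href(feed_xml):
    """Return the href of the channel's <image> in the Apple namespace, or None."""
    root = ET.fromstring(feed_xml)
    # The "a" prefix is local to this lookup; it is mapped to the namespace
    # name and is independent of whatever prefix the feed itself uses.
    image = root.find("channel/a:image", {"a": APPLE_NS})
    return image.get("href") if image is not None else None
```

A parser that matched on the literal string itunes:image would find nothing in this feed, even though the feed is using the Apple namespace correctly.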

3.1.2 Example core namespace prefix

With the preceding implementation note in mind, code examples in this document and elsewhere will (and should) use the prefix smcore:. A single prefix is used in documentation and examples for the sake of consistency, not as an indication that it is more correct or preferred when making use of the extensions in real-world situations.

As stated in the previous note, it would be incorrect to build a parser that strictly matches on tags with a string literal prefix of smcore:, despite its use in this document.

4 RSS Restrictions

4.1 Unrecognized elements

Several elements defined by the RSS 2.0 spec are not relevant in a modern podcasting system. These elements have lost their usefulness for various reasons, such as a dependency on a protocol that is not widely used, or providing a function that is well outside the needs of podcasting clients or users.

These elements should generally not be included in an RSS feed used for podcasting, and podcasting clients should not attempt to interpret, present, or act on the data defined by them.

The following channel elements are unrecognized:

  • category
  • cloud
  • ttl
  • rating
  • textInput
  • skipHours
  • skipDays

The following item elements are unrecognized:

  • category
  • comments
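
One way a client could honor these lists is to drop the unrecognized elements before interpreting a channel or item. The following is a minimal sketch in Python; the constant names, the helper recognized_children, and the sample feed are assumptions made for this example. Only un-namespaced RSS 2.0 elements are filtered, so an element with the same local name in an extension namespace is left alone.

```python
import xml.etree.ElementTree as ET

# Element names taken from the two lists above.
UNRECOGNIZED_CHANNEL_ELEMENTS = {
    "category", "cloud", "ttl", "rating", "textInput", "skipHours", "skipDays",
}
UNRECOGNIZED_ITEM_ELEMENTS = {"category", "comments"}

# Hypothetical feed used only to exercise the helper.
SAMPLE_FEED = """<rss version="2.0">
  <channel>
    <title>Example Show</title>
    <ttl>60</ttl>
    <category>News</category>
    <item>
      <title>Episode 1</title>
      <comments>https://example.com/ep1/comments</comments>
    </item>
  </channel>
</rss>"""


def recognized_children(element, unrecognized):
    """Return the direct children of element whose tags are not unrecognized.

    ElementTree reports namespaced tags as "{namespace}name", so only plain
    RSS 2.0 element names can match the sets above.
    """
    return [child for child in element if child.tag not in unrecognized]
```

For the sample feed, the channel's remaining children are title and item, and the item's only remaining child is title.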

4.2 Elements with restricted definitions

The definitions of some elements in the RSS 2.0 spec are too broad or permissive to be useful in the context of podcasting. By restricting the scope of such definitions these elements become viable parts of a feed. None of these restricted definitions fundamentally change the purpose of the elements.

4.2.1 image

The channel-level image element is defined by RSS 2.0 as having a maximum width of 144 and a maximum height of 400. Artwork for podcasting content has standardized on square aspect ratios, and the emergence of devices with high pixel densities has put these maximums at odds with expectations.

When used in feeds for podcasting, the images linked to by this element should be as large as is allowed by these maximums, while retaining the square aspect ratio. This means they should be 144 pixels in both dimensions. As this is a fairly diminutive image size on many modern devices, the main purpose of these images should be considered to be a thumbnail.

The image element, even with these restrictions, is considered to have very limited useful value. Clients should not assume it will be included in feeds. Its inclusion in feeds is not encouraged.

4.2.2 date-times

RSS 2.0 requires elements with date-time values (e.g. pubDate and lastBuildDate) to conform to RFC 822, with the exception that "the year may be expressed with two characters or four characters".

In order to reduce unnecessary variability, date-time values for podcast feeds adhering to the syndicated.media standards should follow these additional formatting restrictions:

  • the year is expressed with four characters; never two characters
  • the day-of-week is included (not optional)
  • the day (day-of-month) is expressed with two characters; never one character
  • the time includes seconds (not optional)
  • the value is expressed in Coordinated Universal Time (UTC)
  • the time zone is indicated using UT; never Z, GMT, +0000, etc.

Example: Sat, 07 Sep 2002 07:14:00 UT
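
A feed generator could produce values in this form along the following lines. This is a minimal Python sketch; the function name format_feed_datetime and the decision to reject datetimes without time zone information are assumptions made for this example. Day and month names are written out explicitly rather than taken from strftime, whose %a and %b output depends on the current locale.

```python
from datetime import datetime, timedelta, timezone

DAY_NAMES = ("Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun")
MONTH_NAMES = ("Jan", "Feb", "Mar", "Apr", "May", "Jun",
               "Jul", "Aug", "Sep", "Oct", "Nov", "Dec")


def format_feed_datetime(moment: datetime) -> str:
    """Format an aware datetime using the restrictions listed above."""
    if moment.tzinfo is None:
        # Assumption for this sketch: refuse to guess the zone of a naive value.
        raise ValueError("datetime must carry time zone information")
    utc = moment.astimezone(timezone.utc)  # always express the value in UTC
    return (
        f"{DAY_NAMES[utc.weekday()]}, {utc.day:02d} "
        f"{MONTH_NAMES[utc.month - 1]} {utc.year:04d} "
        f"{utc:%H:%M:%S} UT"
    )


# A value given with a -04:00 offset is converted to UTC before formatting.
eastern = datetime(2002, 9, 7, 3, 14, 0, tzinfo=timezone(timedelta(hours=-4)))
print(format_feed_datetime(eastern))  # Sat, 07 Sep 2002 07:14:00 UT
```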

4.2.3 guid

The use of a guid value as a permalink is not supported. Clients should always behave as though the isPermaLink attribute is not present. Feeds should not include the attribute or, if they must include it for some reason, the value should be something other than true (false would be ideal).
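
On the client side, this amounts to reading the guid text as an opaque identifier and never consulting isPermaLink. The following is a minimal Python sketch; the function name guid_value and the choice to trim surrounding whitespace are assumptions made for this example.

```python
import xml.etree.ElementTree as ET


def guid_value(item):
    """Return the item's guid as an opaque string, or None if it is absent.

    The isPermaLink attribute is deliberately never read, so the result is
    the same whether the attribute is missing, "true", or "false".
    """
    guid = item.find("guid")
    if guid is None or guid.text is None:
        return None
    # Assumption for this sketch: surrounding whitespace is not significant.
    return guid.text.strip() or None
```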