Post Type Discovery

Post Type Discovery specifies algorithms for determining the type of a post from the properties it has and, potentially, the value(s) of those properties. This helps avoid the need for explicit post types, which modern post creation UIs are abandoning.

The Response Type Algorithm in particular specifies how a [[Webmention]] receiver determines whether a webmention is a comment, like, repost, RSVP, or other type of mention. It is widely implemented in practice by Webmention receivers to determine when and how to display various peer-to-peer social responses.

Introduction

Post type discovery defines explicit algorithms for inferring the type of a post from other properties of that post.

Inferring the type of a post helps bridge formats and protocols without explicit post types (e.g. [[h-entry]], [[jf2]], [[micropub]], Atom ([[RFC4287]]), [[RSS-2.0]]) to those with explicit post types (e.g. [[ActivityPub]], [[AS2]]). For more details on those specifications, see the references; for how they relate, see the overview document [[social-web-protocols]]. Post type discovery can apply to any post data structure independent of serialization (e.g. HTML, JSON).

Use Cases

Both post creation user interfaces and post presentation designs are evolving to use the presence or absence of specific properties (and their values) directly, rather than depending on any kind of explicit "post type". So why bother discovering a post type in the first place? This section documents the few use cases known to date.

Synthesizing explicit type formats

There are existing formats that require explicit post types (e.g. ActivityStreams [[AS1]]) or are based on explicit post types (e.g. ActivityStreams2 [[AS2]]), and code that consumes them expects explicit post types. Post type discovery enables automatically synthesizing such formats from posts that merely have a set of content-related properties.

Aggregating responses by type

Modern social web posting systems implement multiple types of responses and provide user interfaces that show aggregate counts or collections of those responses grouped by type, e.g. number of likes, reposts, replies.

Notifications wording and filtering

Many social web implementations provide notifications to a user with response-type-specific wording, e.g. "Alice commented on your photo", "7 people liked your video". Automatically generating the specific wording of these notifications, and deciding which to provide to the user based on preferences and/or filtering, both require determining the specific response type.

Conformance

Conformance Keywords

As well as sections marked as non-normative, all authoring guidelines, diagrams, examples, and notes in this specification are non-normative. Everything else in this specification is normative.

The key words "MAY", "MUST", "MUST NOT", "OPTIONAL", "RECOMMENDED", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", and "SHOULD NOT" are to be interpreted as described in [[!RFC2119]].

Conformance Classes

There are many possible ways a Post Type Discovery implementation can be used and tested. This section describes possible conformance classes from existing implementations and use-cases.

Type Discovery Function

A Type Discovery Function consumes untyped posts from a possibly untyped format or protocol (e.g. the h-entry microformat) and outputs the type (e.g. a simple one-word string like "note" or "article") determined by the algorithm it implements (either the Response Type Algorithm or the Post Type Algorithm).

AS2 Proxy

An AS2 Proxy consumes untyped posts from an untyped format or protocol, and outputs valid [[!AS2]] JSON using the equivalent AS2 Activity and/or Object Types.

Response Type Algorithm

The Response Type Algorithm ("the response algorithm") specifically discovers the type of a response, in particular one received via a [[!Webmention]], by analyzing the "source" from the Webmention, which the response algorithm calls the "response". It is a proper subset of the general Post Type Algorithm (defined below).

1. If the response has an "rsvp" property with a valid value (one of "yes", "no", "maybe", "interested"),
   Then it is an RSVP.
2. If the response has a "repost-of" property with a valid URL,
   Then it is a repost (AKA share).
3. If the response has a "like-of" property with a valid URL,
   Then it is a like (AKA favorite).
4. If the response has an "in-reply-to" property with a valid URL,
   Then it is a reply (AKA comment).
5. Else it is a mention.

Quoted property names in the response algorithm are defined in [[!h-entry]].
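
The response algorithm's steps can be sketched in Python as follows. This is a minimal illustration, not a reference implementation. It assumes the response is given as a dictionary mapping property names to lists of values (the shape of the "properties" object in microformats2 parsed JSON). The names response_type, _plain, and _has_valid_url are hypothetical. This specification does not define "valid URL", so the sketch assumes it means an absolute http(s) URL, and it also assumes that "rsvp" values are compared case-insensitively.

```python
from urllib.parse import urlsplit

RSVP_VALUES = {"yes", "no", "maybe", "interested"}


def _plain(value):
    """Return the plain-text form of a property value (a str or an mf2-style dict)."""
    if isinstance(value, dict):
        value = value.get("value", "")
    return value.strip() if isinstance(value, str) else ""


def _has_valid_url(props, name):
    """True if any value of the named property looks like an absolute http(s) URL.

    Assumption: the specification does not define "valid URL"; this check is one
    possible reading.
    """
    for value in props.get(name, []):
        try:
            parts = urlsplit(_plain(value))
        except ValueError:
            continue
        if parts.scheme in ("http", "https") and parts.netloc:
            return True
    return False


def response_type(props):
    """Apply the response algorithm steps in order and return the first match."""
    # Assumption: rsvp values are matched case-insensitively.
    if any(_plain(v).lower() in RSVP_VALUES for v in props.get("rsvp", [])):
        return "rsvp"
    if _has_valid_url(props, "repost-of"):
        return "repost"
    if _has_valid_url(props, "like-of"):
        return "like"
    if _has_valid_url(props, "in-reply-to"):
        return "reply"
    return "mention"
```

Because the checks run in the listed order, a response with both "rsvp" and "in-reply-to" is classified as an RSVP, and a response with none of the recognized properties falls through to "mention".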

Post Type Algorithm

The Post Type Algorithm ("the algorithm") discovers the type of a post given a data structure representing an item with a flat set of properties (e.g. JSON output from [[microformats2-parsing]]), each with one or more values, by following these steps until reaching the first "it is a(n) ... post" statement at which point the "..." is the discovered post type.

1. If the post is an "event" item (may be a [[microformats2]] root class name of [[!h-event]]),
   Then it is an event.
2. If the post has an "rsvp" property with a valid value (one of "yes", "no", "maybe", "interested"),
   Then it is an RSVP post.
3. If the post has a "repost-of" property with a valid URL,
   Then it is a repost (AKA "share") post.
4. If the post has a "like-of" property with a valid URL,
   Then it is a like (AKA "favorite") post.
5. If the post has an "in-reply-to" property with a valid URL,
   Then it is a reply post.
6. If the post has a "video" property with a valid URL,
   Then it is a video post.
7. If the post has a "photo" property with a valid URL,
   Then it is a photo post.
8. If the post has a "content" property with a non-empty value,
   Then use its first non-empty value as the content
9. Else if the post has a "summary" property with a non-empty value,
   Then use its first non-empty value as the content
10. Else it is a note post.
11. If the post has no "name" property
    or has a "name" property with an empty string value (or no value)
    Then it is a note post.
12. Take the first non-empty value of the "name" property
13. Trim all leading/trailing whitespace
14. Collapse all sequences of internal whitespace to a single space (0x20) character each
15. Do the same with the content
16. If this processed "name" property value is NOT a prefix of the processed content,
    Then it is an article post.
17. Else it is a note post.

Quoted property names in the algorithm are defined in [[!h-entry]].
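
The following Python sketch walks through the Post Type Algorithm steps in order. It is illustrative only and rests on several assumptions: the post is a dictionary in the microformats2 parsed JSON shape ("type" list plus "properties" mapping names to value lists), a "valid URL" is taken to mean an absolute http(s) URL, "rsvp" values are matched case-insensitively, and "whitespace" means whatever Python's str.strip() and the regular expression \s treat as whitespace. All function names (post_type and the underscore-prefixed helpers) are hypothetical.

```python
import re
from urllib.parse import urlsplit

RSVP_VALUES = {"yes", "no", "maybe", "interested"}
# URL-valued steps, in the order the algorithm lists them.
URL_STEPS = [("repost-of", "repost"), ("like-of", "like"),
             ("in-reply-to", "reply"), ("video", "video"), ("photo", "photo")]


def _plain(value):
    """Plain-text form of a value: the string itself, or a dict's "value" entry."""
    if isinstance(value, dict):
        value = value.get("value", "")
    return value if isinstance(value, str) else ""


def _is_url(text):
    """Assumed reading of "valid URL": an absolute http(s) URL."""
    try:
        parts = urlsplit(text.strip())
    except ValueError:
        return False
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def _first_non_empty(props, name):
    """First non-empty plain-text value of a property, or None."""
    for value in props.get(name, []):
        text = _plain(value)
        if text:
            return text
    return None


def _normalize(text):
    """Trim leading/trailing whitespace and collapse internal runs to one space."""
    return re.sub(r"\s+", " ", text.strip())


def post_type(item):
    props = item.get("properties", {})
    if "h-event" in item.get("type", []):
        return "event"
    if any(_plain(v).strip().lower() in RSVP_VALUES for v in props.get("rsvp", [])):
        return "rsvp"
    for name, kind in URL_STEPS:
        if any(_is_url(_plain(v)) for v in props.get(name, [])):
            return kind
    # Content comes from "content", falling back to "summary".
    content = _first_non_empty(props, "content")
    if content is None:
        content = _first_non_empty(props, "summary")
    if content is None:
        return "note"
    name = _first_non_empty(props, "name")
    if name is None:
        return "note"
    # An article has a name that is NOT a prefix of its content.
    if not _normalize(content).startswith(_normalize(name)):
        return "article"
    return "note"
```

For example, a post whose "name" is "My Title" and whose "content" is "Body text" is classified as an article, while a post whose name merely repeats the start of its content is classified as a note.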

Note: for [[RFC4287]], use the Atom entry title for the "name" property, and the Atom entry content for the content as mentioned in the algorithm.

Note: for [[RSS-2.0]], use the RSS item title for the "name" property, and the RSS item description for the content as mentioned in the algorithm.
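
As a rough illustration of the two notes above, the sketch below maps a single Atom entry or RSS item to the "name" and "content" properties the algorithm expects. It is a simplified sketch under stated assumptions: the function names atom_entry_to_props and rss_item_to_props are hypothetical, the input is a standalone XML fragment, and no attempt is made to strip markup from HTML-typed Atom content or RSS descriptions.

```python
import xml.etree.ElementTree as ET

# Atom namespace in ElementTree's "{uri}tag" notation.
ATOM = "{http://www.w3.org/2005/Atom}"


def atom_entry_to_props(entry_xml):
    """Map an Atom <entry>: title -> "name", content -> "content"."""
    entry = ET.fromstring(entry_xml)
    props = {}
    title = entry.findtext(ATOM + "title")
    if title:
        props["name"] = [title]
    content = entry.find(ATOM + "content")
    if content is not None:
        # itertext() also gathers text nested inside XHTML-typed content.
        text = "".join(content.itertext())
        if text:
            props["content"] = [text]
    return props


def rss_item_to_props(item_xml):
    """Map an RSS <item>: title -> "name", description -> "content"."""
    item = ET.fromstring(item_xml)
    props = {}
    title = item.findtext("title")
    if title:
        props["name"] = [title]
    description = item.findtext("description")
    if description:
        props["content"] = [description]
    return props
```

The resulting dictionaries have the same shape as the properties of a parsed h-entry, so the "note" versus "article" steps of the algorithm can be applied to them unchanged.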

Post Type	AS2 equivalent
event	Event
rsvp	Event RSVP
repost	Announce
like	Like
reply	Note with inReplyTo
video	Video
photo	Image
note	Note
article	Article

Methodology

There are two important aspects to the methodology of the Post Type Discovery algorithm: scope (why is something explicitly in the algorithm), and order (why is something where it is in the algorithm).

Scope

The algorithm could attempt to cover innumerable hypothetical post types, or take an evidence-based approach focused on real-world publishing practices. This specification does the latter. It sets a minimum bar of documented real-world publishing practices of different visually apparent post types on the open web at recent (< 1 year old) permalinks, each with at least three independent implementations that have converged on which properties (and potentially which values) imply their visually apparent post types. Because it is evidence-based, this specification will likely expand over time as more apparent post types are published by more convergent implementations.

Order

The algorithm must also specify an order (e.g. of precedence) in which various properties (and their values) imply various post types. The algorithm orders post types by how much "richer" they generally are in terms of content and by how much cognitive effort they show on the part of the author.

Examples

Like Post

Here is an example [[h-entry]] post from the Activity Streams 2.0 Vocabulary examples [[AS2-vocab]]:

<div class="h-entry p-name">
  <span class="p-author h-card">Sally</span>
  liked
  <a class="u-like-of"
    href="http://example.org/notes/1">
    http://example.org/notes/1
  </a>
</div>

Following the algorithm, the step "If the post has a "like-of" property with a valid URL" is satisfied and thus the algorithm returns that the post is a "like" post.

Given this result, an implementation can generate (or process as if it had generated and consumed) the following AS2 JSON. In particular, the "@type": "Like" in this output is what this algorithm determines:

{
  "@type": "Like",
  "actor": {
    "@type": "Person",
    "displayName": "Sally"
  },
  "object": "http://example.org/notes/1"
}
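
The step from the discovered "like" type to the JSON above can be sketched in Python. This is an illustration under assumptions: like_post is a hand-written stand-in for the parsed h-entry (in an assumed microformats2 JSON shape, not actual parser output), and like_to_as2 is a hypothetical helper that only handles posts already discovered to be likes.

```python
import json

# Hand-written stand-in for the parsed h-entry example (assumed shape).
like_post = {
    "type": ["h-entry"],
    "properties": {
        "author": [{"type": ["h-card"],
                    "properties": {"name": ["Sally"]},
                    "value": "Sally"}],
        "like-of": ["http://example.org/notes/1"],
    },
}


def like_to_as2(item):
    """Build the AS2 object shown in the example for a post discovered as a like."""
    props = item["properties"]
    author = props["author"][0]
    return {
        "@type": "Like",  # the part determined by post type discovery
        "actor": {
            "@type": "Person",
            "displayName": author["properties"]["name"][0],
        },
        "object": props["like-of"][0],
    }


print(json.dumps(like_to_as2(like_post), indent=2))
```

Only the "@type" value depends on type discovery; the actor and object are copied from the post's "author" and "like-of" properties.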

FAQ

What about a photo reply

Q: What about a reply that includes a photo?

A: It's a reply.

Q2: Should that show up as a "photo" post?

A2: It should show up as a "reply" and not be in a user's published feed of their photos. The user-centric design here is to treat replies separately, because in practice, when users post replies to others' posts, and include a photo, the photos typically assume the context of that other post, and would look odd outside of it (e.g. in a generic "photos" feed). In addition, by not including reply photos in a user's feed of their photos, it gives the user the freedom to reply to other posts with whatever they wish, including photos, and not have those reply-specific photos pollute their streams of "their stuff" that their followers subscribe to.

A2a: From a presentation perspective, a reply should primarily be displayed as a reply first, and then adapt accordingly to whatever other properties it may have.
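
The precedence behind these answers can be shown with a short Python sketch. It is deliberately simplified: it covers only the URL-valued steps of the Post Type Algorithm, it tests for the presence of a property rather than validating its URL, and the names ORDERED_STEPS, first_matching_type, and photos_feed are hypothetical.

```python
# Property checks in the order the Post Type Algorithm lists them
# (event, RSVP, and note/article steps are omitted for brevity).
ORDERED_STEPS = [
    ("repost-of", "repost"),
    ("like-of", "like"),
    ("in-reply-to", "reply"),
    ("video", "video"),
    ("photo", "photo"),
]


def first_matching_type(props):
    """Return the type of the first step whose property is present, else None."""
    for name, kind in ORDERED_STEPS:
        if props.get(name):
            return kind
    return None


def photos_feed(posts):
    """Keep only posts discovered as photo posts, so reply photos are excluded."""
    return [p for p in posts if first_matching_type(p) == "photo"]


photo_reply = {
    "in-reply-to": ["https://example.org/notes/1"],
    "photo": ["https://example.com/cat.jpg"],
}
plain_photo = {"photo": ["https://example.com/dog.jpg"]}

print(first_matching_type(photo_reply))  # reply
print(len(photos_feed([photo_reply, plain_photo])))  # 1
```

Because "in-reply-to" is checked before "photo", the reply with a photo is discovered as a reply and stays out of the photos feed.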

Is a video tag sufficient

Q: Is a video tag sufficient to imply a video post?

A: No. Video tags can be used for additional content, e.g. inside an article. Relying only on video tag markup would lead to false positives.

Implementations

This section lists implementations of Post Type Discovery, whether in progress, partial, or complete.

Granary

Granary synthesizes ActivityStreams [[AS1]], [[microformats2]], and Atom [[RFC4287]] from various input feeds and sources, and as such has some code that can be considered in progress or even a partial implementation of Post Type Discovery:

Live public demo site: https://granary-demo.appspot.com/
Issue(s) related to implementing Post Type Discovery: #41

p3k

p3k (a CMS) implements Post Type Discovery internally within its [[micropub]] endpoint to automatically add posts to various collections. For example, if a post is a reply, it goes in the "replies" collection; if it is an RSVP, it goes in the "rsvps" and "replies" collections.

Live example: http://aaronparecki.com/

mf2util

mf2util exposes a post_type_discovery function that takes an h-entry and returns "like", "reply", "note", "article", etc.

Live demo: https://kylewm.com/services/mf2util

Change Log

Changes from 1 August 2017 WD to this version

Add Conformance Classes (#10)
Add "event" discovery per sufficient implementations (#19)
Add informative Atom and RSS handling of their known/unchanging elements for determining at least "note" vs "article", generalizing beyond mf2 (#2)
Additional use cases for Response Type Discovery (#33)
Add post types to AS2 equivalents table (#9) (#15)

Changes from 14 June 2017 WD to 1 August 2017 WD

Response Type: move "reply" to last explicitly recognized response type to enable p-summary fallback use-cases (#25)

Changes from 1 March 2017 WD to 14 June 2017 WD

Added new section Response Type Algorithm (#24)
Editorial improvements

Changes from 28 October 2016 WD to 1 March 2017 WD

Explicitly note applies to any post data structure independent of serialization (#13)
Reference Social Web Protocols (#16)
End algorithm with explicit "Else" (#18)
Editorial improvements