Adding a field to the Mastodon API

2023-10-04

When I was younger, I expected everyone in the future would have computer handles we’d mostly go by, and names would be secondary. I gleefully memorized hundreds of online handles for all the folks I knew in school and online.

The character 'Cereal Killer' from the movie 'Hackers' — Also, we’d all dress like this.

It turned out, that didn’t exactly happen. People still keep their (usually) parent-assigned names and use them often.

The character 'Cereal Killer' from the movie 'Hackers', with a three-piece-suit hastily cartoon-sketched on the image — Also, he gives up breaking systems, becomes a quant, and makes $500k a year.

Except online, where many of us still use handles all the time. This gives me a challenge, because unfortunately, in spite of my efforts to the contrary, I have become an olds and can barely remember one name a person goes by anymore. So when faced with someone’s online handle in Mastodon, I have a hell of a time keeping track of who they are in real life.

Conveniently, Mastodon has a system to handle this: you can attach a note to any account you follow, shown only to you, that provides additional context for a user. So, for example, if I follow ‘@bob’ on Mastodon but I don’t remember who that is, I can just attach a note.

A note clarifying that ‘@bob’ is ‘Bob, from accounting’

Unfortunately, that note isn’t surfaced in the timeline when a user posts.

A post in the user’s timeline, showing their name but no note

But with only a few tweaks to the source code, we can get it there!

A post in the user’s timeline, showing the note next to their display name

It turns out this relatively simple change touched on just about every layer of the Mastodon application, from data storage model to the React / Redux presentation layer. I’ll go over how it was implemented. Along the way, I’ll describe how I learned to do this and sketch out the pieces of the architecture I learned about.

This post will be part 1 of 3 when I’m all finished.

Prerequisite knowledge: It will be helpful if the reader is familiar with some fundamentals of Ruby on Rails for this section. I’m going to talk about some of them, but this won’t be a comprehensive dive on either Rails implementation theory or the model-view-controller pattern in general.

Note: I am doing all my work on the glitch-soc fork of Mastodon. I suspect most of what I present here will line up 1-for-1 with baseline Mastodon, but be aware that if line numbers and other details are off, that’s probably why. Nothing glitch-soc changes impacts this project from what I can tell.

Okay! Let’s dive in.

High-level Mastodon server app layout

Ruby on Rails web services are, generally speaking, applications that sit between a data store and a client program (like a web browser) and turn data from the store into data the client can understand. Because it suits this kind of application well, they tend to follow a “model-view-controller” design pattern:

The model is software that knows how to read and write to the data store (usually PostgreSQL in Mastodon’s case). All the model pieces are stored in the app/models directory. Care and feeding of the PostgreSQL database itself lives in db. This includes the migrations (db/migrate), which are scripts that add new things to the database; you split those changes up into separate files so that an already-existing Mastodon node can make tiny tweaks to its live databases without blowing away the whole database and starting over.
The view (app/views) is software that knows how to take a representation of data from the model and repackage it in a way that a client can understand. This might be a JSON wrapper if the client is making an API call, HTML if the client is a web browser requesting a human-readable view of the data, and so on.

Worth noting here: I wouldn’t consider the React web frontend for Mastodon part of the ‘view’ precisely. It’s essentially its own application, running in a web browser, that also uses the model-view-controller paradigm. Simple! Also complicated! We’ll discuss that in post 3 of this series. For now, we can think of the view mostly as “The thing that turns the data requested into a JSON blob” and be right enough to continue.
The controller (app/controllers) knows about the model and the view and knows what to do in response to various triggers (such as “Someone has requested the main timeline” or “The automated fetch process just got twenty-five new posts from a peer Mastodon node; I would like to store them!”). It acts as, among other things, an intermediary between the model and the view and knows when to do various things with each.

Flowchart of Mastodon server components: DB -> model -> controller -> view -> client

Why bother to separate the concerns? It makes it easier to adapt the server application to changes in circumstance. With the view separate from the controller, the controller can select different views depending on how the data should be presented (JSON vs. HTML; one could even support others if useful). Meanwhile, though it happens less often for an app like Mastodon, it could be the case that your entire database changes underneath you. It’s easier to adapt to a change like that if you’re just adjusting an adaptor layer between the database and your “business logic” data structures (in the controller) than if changing the database means you have to rewrite most of the controller.
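To make the separation concrete, here is a minimal plain-Ruby sketch of the three roles. None of these classes exist in Mastodon: PostStore, JsonView, TextView, and TimelineController are hypothetical names I made up for illustration, and an in-memory array stands in for PostgreSQL.

```ruby
require "json"

# Model: the only piece that knows how records are stored.
# (Hypothetical; an array stands in for the database.)
class PostStore
  Post = Struct.new(:id, :author, :text)

  def initialize
    @rows = []
  end

  def add(author, text)
    post = Post.new(@rows.size + 1, author, text)
    @rows << post
    post
  end

  def all
    @rows.dup
  end
end

# Views: each one knows a single output format.
class JsonView
  def render(posts)
    JSON.generate(posts.map(&:to_h))
  end
end

class TextView
  def render(posts)
    posts.map { |post| "#{post.author}: #{post.text}" }.join("\n")
  end
end

# Controller: reacts to a request, asks the model for data,
# and picks the view that matches what the client asked for.
class TimelineController
  VIEWS = { json: JsonView, text: TextView }.freeze

  def initialize(store)
    @store = store
  end

  def show(format)
    VIEWS.fetch(format).new.render(@store.all)
  end
end

store = PostStore.new
store.add("bob", "Quarterly numbers are in.")
controller = TimelineController.new(store)

puts controller.show(:json)
puts controller.show(:text)
```

Swapping the array for a real database would only touch PostStore, and adding a third output format would only add a view class and one entry in VIEWS. That is the whole sales pitch.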

Now that we’re oriented, we can hunt down where that note is stored. And, we will immediately get it wrong. ;)

Step 1: finding the note model

I don’t know much about the note attached to an account, but I can scan the code and see what I can find. I’ll start with just a bare keyword search from the top of the app:

git grep -rn note app/models/

app/models/account.rb:14:#  note                          :text             default(""), not null
app/models/account.rb:107:  validates :note, note_length: { maximum: MAX_NOTE_LENGTH }, if: -> { local? && will_save_change_to_note? }
app/models/account.rb:487:    note&.strip!
app/models/account.rb:509:    [note, display_name, fields.map(&:name), fields.map(&:value)].join(' ')

Okay! That looks promising. Let’s look in account.rb.

class Account < ApplicationRecord
  self.ignored_columns += %w(
    subscription_expires_at
    secret
    remote_url
    salmon_url
    hub_url
    trust_level
  )

In here, I see a lot of validates and scope method calls, to clarify how some of the data is handled when loaded / stored. In particular, :note has a validates against it being shorter than a maximum length on save.

  validates :note, note_length: { maximum: MAX_NOTE_LENGTH }, if: -> { local? && will_save_change_to_note? }

You won’t find a will_save_change_to_note? function defined anywhere in the codebase. Ruby on Rails loves doing automatic code generation and metaprogramming; that function exists because the database table itself has a note column. In general, reading a Rails codebase requires one to get comfortable with the fact that a lot of function calls won’t line up to anything that can be explicitly searched. Occasionally, I have to dive into the Rails source code itself to make a guess at what’s happening (ActiveRecord and its buddies).
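If that sounds like magic, here is a toy illustration of the general trick in plain Ruby. To be clear, this is not how Active Record is actually implemented; TinyRecord and TinyAccount are hypothetical classes, and the column list is hard-coded where Rails would read it from the database schema. The point is only that define_method can mint methods whose names never appear in the source.

```ruby
# Hypothetical stand-in for a Rails model base class.
class TinyRecord
  # Pretend this list was read from the database schema at boot.
  def self.columns(*names)
    names.each do |name|
      # Reader and writer for the column.
      define_method(name) { @attributes[name] }
      define_method("#{name}=") { |value| @attributes[name] = value }

      # Generated predicate: nobody ever types "will_save_change_to_note?".
      define_method("will_save_change_to_#{name}?") do
        @attributes[name] != @saved[name]
      end
    end
  end

  def initialize(attributes = {})
    @saved = attributes.dup
    @attributes = attributes.dup
  end

  # Remember the current values as "what is in the database".
  def save
    @saved = @attributes.dup
  end
end

class TinyAccount < TinyRecord
  columns :note, :display_name
end

account = TinyAccount.new(note: "I do some work for a big company.")
puts account.will_save_change_to_note?   # nothing has changed yet
account.note = "Now with 20% more bio."
puts account.will_save_change_to_note?   # the note differs from what was saved
puts TinyAccount.instance_methods(false).sort.inspect
```

A grep for will_save_change_to_note? in this file finds only the call sites, never a def, which is exactly the experience of reading a Rails codebase.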

We can keep going down this road, but I’m going to cut us off here because it turns out it won’t lead where we want. The note here is actually the note a user makes about themselves on their own site. The UI (in English) calls this “Bio.”

The ‘Bio’ field. Here, Bob has set his bio to “I do some work for a big company.”

In hindsight, this might have been obvious: the note field we found is part of the account, but the note a user sets on an account is information owned by that user, indexed by the account. If it lived in the Account table, the table would have a chunk of data that grew for every user on a Mastodon node that made a note on an account. Inefficient and error-prone for security (those notes should only be visible to the user who creates them!).

So we need to keep looking.

Step 1b: No really, where is the note model?

A trick I like to use when trying to figure something like this out is looking at what symbols are called in the JSON going over the wire. If I log in as the user that has the note identifying Bob as from accounting and navigate to Bob’s account info page as that user, what does the server tell me?

Popping open the browser inspector and watching the network, I see that the HTML page loaded is a single-page app that fetches more data… But if I make a change to the note, my browser sends a POST request to http://localhost:3000/api/v1/accounts/111163837554817823/note. There is often a relationship between API POST handlers and controllers, so let’s look for a note controller… And we find one, at app/controllers/api/v1/accounts/notes_controller.rb.

This handler has some creation logic!

  def create
    if params[:comment].blank?
      AccountNote.find_by(account: current_account, target_account: @account)&.destroy
    else
      @note = AccountNote.find_or_initialize_by(account: current_account, target_account: @account)
      @note.comment = params[:comment]
      @note.save! if @note.changed?
    end
    render json: @account, serializer: REST::RelationshipSerializer, relationships: relationships_presenter
  end

Okay, here’s a smorgasbord of useful information.

AccountNote is the thing that lets us access a note stored against an account. Note how in find_by we provide both the current account and the target account; that’s how we key the information to keep it private per-user.
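The two-part key is the important idea, so here is a small sketch of it with a plain hash. NoteBook is a hypothetical in-memory stand-in for the account notes table, not Mastodon code; it only mimics the shape of the create action above (a blank comment deletes the note, anything else creates or updates it).

```ruby
# Hypothetical in-memory stand-in for the account notes table.
class NoteBook
  def initialize
    # Key: [owner_account_id, target_account_id] => comment text
    @notes = {}
  end

  # A blank comment deletes the note; anything else upserts it.
  def write(owner_id, target_id, comment)
    key = [owner_id, target_id]
    if comment.nil? || comment.strip.empty?
      @notes.delete(key)
    else
      @notes[key] = comment
    end
  end

  def read(owner_id, target_id)
    @notes[[owner_id, target_id]]
  end
end

ME, CAROL, BOB = 1, 2, 3   # made-up account IDs

notes = NoteBook.new
notes.write(ME, BOB, "Bob, from accounting")
notes.write(CAROL, BOB, "Owes me lunch")

puts notes.read(ME, BOB)      # my note about Bob
puts notes.read(CAROL, BOB)   # Carol's note about Bob, invisible to me
notes.write(ME, BOB, "")      # blanking the comment removes my note only
puts notes.read(ME, BOB).inspect
```

Because the owner is half of the key, there is no query I can make as myself that lands on Carol's note.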

What is an AccountNote? The notes_controller.rb file won’t tell you. Rails instead leans heavily on standardizing the directory structure and adds all files in particular locations to an autoloading infrastructure provided by the Active Support framework. So when Ruby tries to pull in an AccountNote symbol the first time, it fails, but autoloading picks up, munges the TitleCase class name into snake_case, sneaks over to where an account_note.rb file lives, and real quick pops that open and interprets it. After that, Ruby has a definition for AccountNote.
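Here is a toy version of that dance, as I understand it. This is a sketch of the concept, not of Rails' actual autoloader: TinyApp, its load_path setting, and the simplified snake_case helper are all hypothetical, and the real inflection rules handle far more cases. It uses Ruby's const_missing hook, which fires when a constant lookup fails.

```ruby
require "tmpdir"

# A toy namespace that loads missing constants from a directory by convention.
module TinyApp
  class << self
    attr_accessor :load_path
  end

  # "AccountNote" -> "account_note" (simplified; ignores acronyms etc.)
  def self.snake_case(class_name)
    class_name.gsub(/([a-z\d])([A-Z])/, '\1_\2').downcase
  end

  # Ruby calls this when TinyApp::Something is not defined yet.
  def self.const_missing(name)
    file = File.join(load_path.to_s, "#{snake_case(name.to_s)}.rb")
    return super unless File.exist?(file)

    require file
    return const_get(name, false) if const_defined?(name, false)

    raise NameError, "#{file} did not define #{name}"
  end
end

Dir.mktmpdir do |dir|
  # Write a file where the naming convention says it should live.
  File.write(File.join(dir, "account_note.rb"), <<~RUBY)
    module TinyApp
      class AccountNote
        def self.loaded_from
          __FILE__
        end
      end
    end
  RUBY

  TinyApp.load_path = dir
  puts TinyApp.const_defined?(:AccountNote, false)   # not loaded yet
  puts TinyApp::AccountNote.loaded_from              # lookup triggers the load
  puts TinyApp.const_defined?(:AccountNote, false)   # now it exists
end
```

Nothing in the calling code names the file. The class name is the only pointer to it, which is why searching for a require line gets you nowhere.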

A quick find starting at the app directory tells us the account_note.rb file happens to live at app/models/account_note.rb.

As a point of personal preference: I despise this pattern for discoverability; I much prefer very explicit references to needed symbols in my source code. But Rails has been around for decades now and I don’t expect any of that to change any time soon.

Within the account note, the comment field is where the text gets stored. The AccountNote model there is dirt-simple; comment has a max length of 2,000.

Step 2: Output the account note

Okay, a sketch of what we want to do here is coming together. We want to provide the comment from the account note when the user wants to look at a status, so the client can show the comment as part of the status’s source. So it’ll probably look something like this:

Fetch a post
Get the account information for that post
Get the AccountNote for that account and the current logged-in user
Return to the client the post, the account info, and the comment

Using the “let the browser help us” trick from earlier, I load up my main timeline and pull in posts. I see that the posts come from a GET request to http://localhost:3000/api/v1/timelines/home. So is there a controller at app/controllers/api/v1/timelines/home_controller.rb? Yep!

There are a couple interesting bits in here.

The controller uses a before_action indicating :require_user!. So we can only fetch home timelines if we’re logged in.
The show method is called on GET requests. It pulls in the statuses and then creates something called a StatusRelationshipsPresenter. Hm, presenters, that’s new. Then it calls render, passing:
- the statuses as the json: param,
- a REST::StatusSerializer as an each_serializer param (each_serializer basically tells render “Oi, that json you just got is gonna be an array of something. Serialize it by passing each element through this each_serializer and concatenating the results into a new array”).
- that status relationship presenter as a relationships parameter
- an HTTP status code
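To pin down what each_serializer means, here is a tiny imitation of that render call in plain Ruby. Everything in it is hypothetical (tiny_render, TinyStatus, and TinyStatusSerializer are names I invented); the real render does far more, but the array handling has roughly this shape.

```ruby
require "json"

TinyStatus = Struct.new(:id, :content)

# Hypothetical serializer: wraps one object and knows its JSON shape.
class TinyStatusSerializer
  def initialize(object, **options)
    @object = object
    @options = options
  end

  def as_json
    { id: @object.id.to_s, content: @object.content }
  end
end

# A toy render: with each_serializer, the payload is treated as a
# collection and every element goes through the serializer individually.
# Extra keyword arguments (like relationships:) ride along to each one.
def tiny_render(json:, each_serializer:, status: 200, **options)
  body = json.map { |item| each_serializer.new(item, **options).as_json }
  { status: status, body: JSON.generate(body) }
end

statuses = [TinyStatus.new(1, "first post"), TinyStatus.new(2, "second post")]
response = tiny_render(json: statuses,
                       each_serializer: TinyStatusSerializer,
                       relationships: nil,
                       status: 200)
puts response[:status]
puts response[:body]
```

The relationships: argument is not something render itself understands; it just gets handed along to every serializer instance.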

Cool, two new things introduced: presenters and serializers.

How a controller renders a response: feeds model to render, which uses the serializer. Serializers can call other serializers. A presenter can feed data from other models to the serializer

The status serializer lives in app/serializers/rest/status_serializer.rb. Glancing at it, it’s a little similar to the model; it maps between the data structure passed into it and how to turn that data structure into JSON. In particular, there are some interesting attribute and belongs_to method calls that set up filters for whether some attributes get reflected into the JSON on the wire and how the status binds to other data in other models (for example, a status belongs_to an account, so to send a status over the wire we will grab the account referenced by the status’s account_id field and serialize that account with the AccountSerializer).

Where is the presenter that got passed in via the relationships parameter? Turns out, it gets tucked away in an instance_options hash; :relationships is the key we can look up the presenter in.
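Here is a toy serializer that imitates the shape of those two ideas: attributes that only appear when a predicate passes, and controller-supplied options landing in instance_options. TinySerializer, PostSerializer, and TinyPost are hypothetical, and this is emphatically not how Active Model Serializers is implemented; it is just enough Ruby to show the data flow.

```ruby
# A toy serializer base class imitating the shape of the serializer DSL.
class TinySerializer
  # Record an attribute name and an optional predicate method name.
  def self.attribute(name, **opts)
    declared << [name, opts[:if]]
  end

  def self.declared
    @declared ||= []
  end

  attr_reader :object, :instance_options

  # Anything passed besides the object is kept in instance_options.
  def initialize(object, **instance_options)
    @object = object
    @instance_options = instance_options
  end

  def as_json
    self.class.declared.each_with_object({}) do |(name, condition), json|
      next if condition && !send(condition)

      # Prefer a method on the serializer; fall back to the wrapped object.
      json[name] = respond_to?(name) ? send(name) : object.public_send(name)
    end
  end
end

TinyPost = Struct.new(:id, :content)

class PostSerializer < TinySerializer
  attribute :id
  attribute :content
  attribute :favourited, if: :relationships?

  def relationships?
    !relationships.nil?
  end

  def favourited
    relationships[:favourites_map].fetch(object.id, false)
  end

  private

  # Whatever the caller passed as relationships: ends up here.
  def relationships
    instance_options[:relationships]
  end
end

post = TinyPost.new(7, "hello")
presenter = { favourites_map: { 7 => true } }   # stand-in for a presenter

p PostSerializer.new(post).as_json
p PostSerializer.new(post, relationships: presenter).as_json
```

Without a presenter the favourited key is simply absent from the output, which mirrors how the real serializer drops per-user fields for logged-out requests.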

Looking at status_relationships_presenter.rb, we find it’s basically a utility that builds up some structured data across all statuses, such as consolidating conversation threads and dropping statuses that trip over a filter. Unfortunately, it doesn’t know anything about comments. But the neighboring AccountRelationshipsPresenter, if you provide it an account ID and a current account ID, does. The comments are a field in account_note_map.

So statuses know their accounts via a belongs_to relationship and the account relationship presenter knows the comment attached to an account if we are a logged-in user. Great! Sounds like we just want to augment the status serializer to build and pass an account relationship presenter for each status, and we’re good.

… Well, we can’t. At least, not as far as I can tell.

There appears to be a hole in the feature set for Active Model Serializers. You can pass a presenter (or any other parameter) from the controller to a serializer via instance_options, but there is no optional parameter for belongs_to to understand that a recursive call to another serializer should carry a presenter with it. I’m not 100% sure, but after poring over both documentation and source code, I found nothing of the sort.

Speaking of documentation: the documentation on Active Model is garbage. That’s not a subjective observation: I mean that the documentation for version 0.10.0 of the ActiveModel::Serializer class is mis-formatted and has “END SERIALIZER MACROS” where the documentation should be for the object, root, and scope instance attributes. I shouldn’t cast aspersions on a community, but I’m not as surprised as I want to be when the style guide for Ruby straight-up says this as “best practice” for documenting your code.

Text that says 'NO COMMENTS': Write self-documenting code and ignore the rest of this section. Seriously! — LOOK at what Rubyists have been demanding your Respect for all this time. They have played us for absolute fools.

In any case, I only have but so much patience for busted documentation and runtime-self-modifying metaprogramming, so let this box serve as the mea culpa that I gave up figuring out if there is an equivalent to instance_options for the belongs_to relationship across serializers, or even if instance_options is transparently passed.

I can’t tell.

Okay. So there’s no easy way to glue the note into the account when the account is serialized as a part of a Mastodon post. So, time to hack it.

The character 'Cereal Killer' from the movie 'Hackers'. Again. — Oh look he’s back.

Actual step 2: screw it, we’re denormalizing

The status serializer knows the relationship presenter and the account. The account serializer, in this context, doesn’t know the account relationship presenter.

Here’s what we’re going to do:

Modify StatusRelationshipsPresenter so it builds an AccountRelationshipsPresenter. That presenter knows about the account for each status we want to display.
Use that accounts presenter to build a new attribute on StatusRelationshipsPresenter, :account_notes_map. This is a map from a status ID to the comment on the account that posted the status, if such a comment exists.
In the StatusSerializer, add a new account_note attribute and populate it using the presenter.

It’s not ideal that the account_note will be riding on the status and not the account, since it’s conceptually part of the account information. But it’ll let us push the data through to the client, and that’s ultimately what matters.
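Before the real patches, here is the core of the plan boiled down to plain Ruby so the data flow is easy to see. TinyStatus, the viewer_notes hash, and the serialize helper are hypothetical stand-ins; in particular, the hash plays the role of whatever the account relationships presenter reports for the logged-in user.

```ruby
TinyStatus = Struct.new(:id, :account_id)

# Stand-in for the per-viewer note data:
# account ID => { comment: "..." } for accounts the viewer has annotated.
viewer_notes = {
  42 => { comment: "Bob, from accounting" }
}

# Denormalize: re-key the notes by status ID instead of account ID.
def build_account_notes_map(statuses, notes_by_account)
  statuses.each_with_object({}) do |status, map|
    map[status.id] = notes_by_account.dig(status.account_id, :comment)
  end
end

# Serializer side: only emit account_note when there is one.
def serialize(status, notes_map)
  json = { id: status.id.to_s }
  note = notes_map[status.id]
  json[:account_note] = note unless note.nil?
  json
end

statuses = [
  TinyStatus.new(1001, 42),   # posted by Bob
  TinyStatus.new(1002, 99),   # posted by someone with no note
  TinyStatus.new(1003, 42)    # Bob again
]

notes_map = build_account_notes_map(statuses, viewer_notes)
p notes_map
statuses.each { |status| p serialize(status, notes_map) }
```

Note the cost of denormalizing: Bob's note gets copied onto every one of his statuses in the response, where a note on the account object would have appeared once per account.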

As this post is already pretty long, I’m going to rip through here. Here are the relevant diff patches. Note that I left some log dumps in here (Rails.logger.info) that should really be deleted.

app/presenters/status_relationships_presenter.rb

@@ -4,7 +4,7 @@ class StatusRelationshipsPresenter
   PINNABLE_VISIBILITIES = %w(public unlisted private).freeze

   attr_reader :reblogs_map, :favourites_map, :mutes_map, :pins_map,
-              :bookmarks_map, :filters_map, :attributes_map
+              :bookmarks_map, :filters_map, :attributes_map, :account_notes_map

   def initialize(statuses, current_account_id = nil, **options)
     if current_account_id.nil?
@@ -14,6 +14,7 @@ class StatusRelationshipsPresenter
       @mutes_map      = {}
       @pins_map       = {}
       @filters_map    = {}
+      @account_notes_map = {}
     else
       statuses            = statuses.compact
       status_ids          = statuses.flat_map { |s| [s.id, s.reblog_of_id] }.uniq.compact
@@ -27,6 +28,8 @@ class StatusRelationshipsPresenter
       @mutes_map       = Status.mutes_map(conversation_ids, current_account_id).merge(options[:mutes_map] || {})
       @pins_map        = Status.pins_map(pinnable_status_ids, current_account_id).merge(options[:pins_map] || {})
       @attributes_map  = options[:attributes_map] || {}
+      @account_notes_map = build_account_notes_map(statuses, current_account_id).merge(options[:account_notes_map] || {})
+      Rails.logger.info "markt - for statuses #{statuses}\nbuilt account notes map #{@account_notes_map}!"
     end
   end

@@ -44,4 +47,15 @@ class StatusRelationshipsPresenter
       end
     end
   end
+
+  # Builds a map from status ID to account note
+  def build_account_notes_map(statuses, current_account_id)
+    account_ids = statuses.map {|status| status.account_id}
+    account_ids.uniq!
+    account_info = AccountRelationshipsPresenter.new(account_ids, current_account_id)
+
+    statuses.each_with_object({}) do |status, account_notes_map|
+      account_notes_map[status.id] = account_info.account_note.dig(status.account_id, :comment)
+    end
+  end
 end

app/serializers/rest/status_serializer.rb

@@ -8,6 +8,7 @@ class REST::StatusSerializer < ActiveModel::Serializer
              :uri, :url, :replies_count, :reblogs_count,
              :favourites_count, :edited_at

+  attribute :account_note, if: :account_note?
   attribute :favourited, if: :current_user?
   attribute :reblogged, if: :current_user?
   attribute :muted, if: :current_user?
@@ -50,6 +51,15 @@ class REST::StatusSerializer < ActiveModel::Serializer
     !current_user.nil?
   end

+  def account_note?
+    Rails.logger.info "Checking account note for status #{object.id}\nRelationships?#{!relationships.nil?}"
+    !((relationships&.account_notes_map&.dig(object.id)).nil?)
+  end
+
+  def account_note
+    relationships&.account_notes_map&.dig(object.id)
+  end
+
   def show_application?
     object.account.user_shows_application? || (current_user? && current_user.account_id == object.account_id)
   end

… finally, I caught one bug while making these changes. While the home controller passes a status relationships presenter through, the controller at app/controllers/api/v1/statuses_controller.rb doesn’t actually create a relationships presenter. So we add one.

app/controllers/api/v1/statuses_controller.rb

@@ -26,7 +26,13 @@ class Api::V1::StatusesController < Api::BaseController
   def show
     cache_if_unauthenticated!
     @status = cache_collection([@status], Status).first
-    render json: @status, serializer: REST::StatusSerializer
+    relationships = nil
+
+    unless current_user.nil?
+      relationships = StatusRelationshipsPresenter.new([@status], current_user&.account_id)
+    end
+
+    render json: @status, serializer: REST::StatusSerializer, relationships: relationships
   end

   def context

We plumb all that together, fire up the server, look at Bob’s timeline, and check the inspector to see what comes through at http://localhost:3000/api/v1/timelines/home…

The API response, showing the new account_note field

… success!
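One consequence of the if: :account_note? guard is that a client has to treat the field as optional. Here is a rough sketch of consuming the response from Ruby. The payload is a made-up, heavily trimmed example (real statuses carry many more fields), and byline is a hypothetical helper, not part of any client library.

```ruby
require "json"

# Hypothetical, heavily trimmed payload for illustration only.
payload = <<~JSON
  [
    { "id": "1", "content": "<p>Invoices are due Friday</p>",
      "account": { "acct": "bob", "display_name": "Bob" },
      "account_note": "Bob, from accounting" },
    { "id": "2", "content": "<p>hello world</p>",
      "account": { "acct": "carol", "display_name": "Carol" } }
  ]
JSON

# Build the text shown next to a post, with the note when there is one.
def byline(status)
  name = status.dig("account", "display_name")
  note = status["account_note"]   # absent when the viewer wrote no note
  note ? "#{name} (#{note})" : name
end

bylines = JSON.parse(payload).map { |status| byline(status) }
puts bylines
```

Statuses from accounts without a note come through exactly as they did before the change, so existing clients that ignore unknown fields should keep working.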

Next steps

Now that we have the data through, we can display it. Next post will be a brief detour into the trials and tribulations of setting up an actual dev environment for Mastodon. Then in part 3, we’ll edit the UI to actually display the account note.