Exporting Mastodon (ActivityPub) posts

(link) , ,

In the attempt to own the content I produce on the internet, I decided to move all my Mastodon posts to here. In order to do that I wrote a little Haskell script that takes the Mastodon exported files (it only reads outbox.json) and produces files in folders notes/, replies/, reposts/ with the correct frontmatter (according to the convention I adopted from IndieKit).

Here is the script, in case somebody finds it useful:

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson
import Data.Time.Format.ISO8601
import Control.Monad
import Control.Applicative
import Data.List.Extra (split)
import Data.Time.Clock (UTCTime(..))
import qualified Data.ByteString.Lazy as BS
import qualified Data.Text as T
import qualified Data.Text.IO as T

data ActivityStreams = AS { orderedItems :: [ASItem] }
deriving (Show)

data ASItem = I
{ itemId :: String
, asObject :: ASObject
, published :: UTCTime
, to :: [String]
} deriving (Show)

data ASObject =
{ url :: T.Text
, content :: T.Text
, inReplyTo :: Maybe T.Text
| Boost { boostUrl :: T.Text }
deriving (Show)

instance FromJSON ActivityStreams where
parseJSON (Object v) = AS <$> v .: "orderedItems"
parseJSON _ = mzero

instance FromJSON ASItem where
parseJSON (Object v) = I <$> v .: "id" <*> v .: "object" <*> v .: "published" <*> v .: "to"
parseJSON _ = mzero

instance FromJSON ASObject where
parseJSON (Object v) = Note <$> v .: "url" <*> v .: "content" <*> v .: "inReplyTo"
parseJSON (String t) = return $ Boost t
parseJSON _ = mzero

handleItem :: ASItem -> IO ()
handleItem item = do
let isoDate = iso8601Show $ published item
packedDate = T.pack isoDate
fileName = concat
[ take 10 isoDate -- extracts the YYYY-MM-DD part
, "-mastodon:"
, (split (== '/') $ itemId item) !! 6 -- extracts the Mastodon post id
, ".html"

case asObject item of
Note u c r -> do
let (folder, replyTo) =
case r of
Just replyUrl -> ("replies/", [ "in-reply-to: " <> replyUrl ])
Nothing -> ("notes/", [])
fullFileName = folder ++ fileName

putStrLn $ fullFileName

T.writeFile fullFileName $ T.unlines $
[ "---"
, "title: ''"
, "date: " <> packedDate
, "mastodon-original: " <> u
] ++ replyTo ++
[ "---"
, c

Boost u -> do
putStrLn $ "reposts/" ++ fileName

T.writeFile ("reposts/" ++ fileName) $ T.unlines
[ "---"
, "title: ''"
, "date: " <> packedDate
, "repost-of: " <> u
, "---"

main :: IO ()
main = do
contents <- BS.readFile "outbox.json"
let maybeAS = eitherDecode contents

case maybeAS of
Right as -> do
putStrLn "Parsed!"

let public = ""
followers = ""

filteredItems = filter (\it -> public `elem` to it || followers `elem` to it)
$ orderedItems as

forM_ (filteredItems) handleItem

Left err -> putStrLn err

One thing to note is that I also decided to publish posts that were originally available only to my followers. This is hardcoded in the url assigned to followers. If you also want to make those previously private posts available change followers to the corresponding url of your profile. Or remove the second branch of || in filteredItems if you only want to export publicly available posts.

Assuming that we saved the haskell file as export.hs then the cabal file export.cabal is as follows:

name:               export
build-type: Simple
cabal-version: >= 1.10

executable export
main-is: export.hs
build-depends: base
, aeson
, time
, extra
, bytestring
, text
ghc-options: -threaded
default-language: Haskell2010

To export everything, it's enought to just run stack build followed by ./export.

The source files are also published as gits.



I've built this website as an experiment, to gather my social activity on the internet. I believe that the internet should be inhabited by many small independent websites which communicate among themselves, as opposed to one or two omnipresent platforms like Facebook or Twitter that manage all our activity for us but also take away our freedoms. Chris Aldrich wrote a nice article about it here.

How does this website work? I use either Indigenous, Quill or Micropublish to write new posts. Once I save, the content is sent to my Github repository via Indiekit. Afterwords, Netlify is notified, which then builds and serves the website. Works like charm ;-)

I don't want to write a tutorial on how to do it, there is a plenty of good resources out there. If you also want to try it out, the good place to start is indiewebify. One tip: start slowly, only build the basic functionality first, see how you like it and only after some time add some more.

Useful references