Exporting Mastodon (ActivityPub) posts

, ,

In the attempt to own the content I produce on the internet, I decided to move all my Mastodon posts to here. In order to do that I wrote a little Haskell script that takes the Mastodon exported files (it only reads outbox.json) and produces files in folders notes/, replies/, reposts/ with the correct frontmatter (according to the convention I adopted from IndieKit).

Here is the script, in case somebody finds it useful:

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson
import Data.Time.Format.ISO8601
import Control.Monad
import Control.Applicative
import Data.List.Extra (split)
import Data.Time.Clock (UTCTime(..))
import qualified Data.ByteString.Lazy as BS
import qualified Data.Text as T
import qualified Data.Text.IO as T

data ActivityStreams = AS { orderedItems :: [ASItem] }
deriving (Show)

data ASItem = I
{ itemId :: String
, asObject :: ASObject
, published :: UTCTime
, to :: [String]
} deriving (Show)

data ASObject =
Note
{ url :: T.Text
, content :: T.Text
, inReplyTo :: Maybe T.Text
}
| Boost { boostUrl :: T.Text }
deriving (Show)

instance FromJSON ActivityStreams where
parseJSON (Object v) = AS <$> v .: "orderedItems"
parseJSON _ = mzero

instance FromJSON ASItem where
parseJSON (Object v) = I <$> v .: "id" <*> v .: "object" <*> v .: "published" <*> v .: "to"
parseJSON _ = mzero

instance FromJSON ASObject where
parseJSON (Object v) = Note <$> v .: "url" <*> v .: "content" <*> v .: "inReplyTo"
parseJSON (String t) = return $ Boost t
parseJSON _ = mzero


handleItem :: ASItem -> IO ()
handleItem item = do
let isoDate = iso8601Show $ published item
packedDate = T.pack isoDate
fileName = concat
[ take 10 isoDate -- extracts the YYYY-MM-DD part
, "-mastodon:"
, (split (== '/') $ itemId item) !! 6 -- extracts the Mastodon post id
, ".html"
]

case asObject item of
Note u c r -> do
let (folder, replyTo) =
case r of
Just replyUrl -> ("replies/", [ "in-reply-to: " <> replyUrl ])
Nothing -> ("notes/", [])
fullFileName = folder ++ fileName

putStrLn $ fullFileName

T.writeFile fullFileName $ T.unlines $
[ "---"
, "title: ''"
, "date: " <> packedDate
, "mastodon-original: " <> u
] ++ replyTo ++
[ "---"
, c
]

Boost u -> do
putStrLn $ "reposts/" ++ fileName

T.writeFile ("reposts/" ++ fileName) $ T.unlines
[ "---"
, "title: ''"
, "date: " <> packedDate
, "repost-of: " <> u
, "---"
]


main :: IO ()
main = do
contents <- BS.readFile "outbox.json"
let maybeAS = eitherDecode contents

case maybeAS of
Right as -> do
putStrLn "Parsed!"

let public = "https://www.w3.org/ns/activitystreams#Public"
followers = "https://mastodon.social/users/jaklt/followers"

filteredItems = filter (\it -> public `elem` to it || followers `elem` to it)
$ orderedItems as

forM_ (filteredItems) handleItem

Left err -> putStrLn err

One thing to note is that I also decided to publish posts that were originally available only to my followers. This is hardcoded in the url assigned to followers. If you also want to make those previously private posts available change followers to the corresponding url of your profile. Or remove the second branch of || in filteredItems if you only want to export publicly available posts.

Assuming that we saved the haskell file as export.hs then the cabal file export.cabal is as follows:

name:               export
version: 0.1.0.0
build-type: Simple
cabal-version: >= 1.10

executable export
main-is: export.hs
build-depends: base
, aeson
, time
, extra
, bytestring
, text
ghc-options: -threaded
default-language: Haskell2010

To export everything, it's enought to just run stack build followed by ./export.

The source files are also published at gist.github.com.

Responses (?)

Indieweb interactions: Like/Reshare/Reply/Bookmark with Quill or Like/Reshare/Reply/Bookmark with Micropublish.