Exporting Mastodon (ActivityPub) posts
In the attempt to own the content I produce on the internet, I decided to move all my Mastodon posts to here. In order to do that I wrote a little Haskell script that takes the Mastodon exported files (it only reads outbox.json
) and produces files in folders notes/
, replies/
, reposts/
with the correct frontmatter (according to the convention I adopted from IndieKit).
Here is the script, in case somebody finds it useful:
{-# LANGUAGE OverloadedStrings #-}
module Main where
import Data.Aeson
import Data.Time.Format.ISO8601
import Control.Monad
import Control.Applicative
import Data.List.Extra (split)
import Data.Time.Clock (UTCTime(..))
import qualified Data.ByteString.Lazy as BS
import qualified Data.Text as T
import qualified Data.Text.IO as T
data ActivityStreams = AS { orderedItems :: [ASItem] }
deriving (Show)
data ASItem = I
{ itemId :: String
, asObject :: ASObject
, published :: UTCTime
, to :: [String]
} deriving (Show)
data ASObject =
Note
{ url :: T.Text
, content :: T.Text
, inReplyTo :: Maybe T.Text
}
| Boost { boostUrl :: T.Text }
deriving (Show)
instance FromJSON ActivityStreams where
parseJSON (Object v) = AS <$> v .: "orderedItems"
parseJSON _ = mzero
instance FromJSON ASItem where
parseJSON (Object v) = I <$> v .: "id" <*> v .: "object" <*> v .: "published" <*> v .: "to"
parseJSON _ = mzero
instance FromJSON ASObject where
parseJSON (Object v) = Note <$> v .: "url" <*> v .: "content" <*> v .: "inReplyTo"
parseJSON (String t) = return $ Boost t
parseJSON _ = mzero
handleItem :: ASItem -> IO ()
handleItem item = do
let isoDate = iso8601Show $ published item
packedDate = T.pack isoDate
fileName = concat
[ take 10 isoDate -- extracts the YYYY-MM-DD part
, "-mastodon:"
, (split (== '/') $ itemId item) !! 6 -- extracts the Mastodon post id
, ".html"
]
case asObject item of
Note u c r -> do
let (folder, replyTo) =
case r of
Just replyUrl -> ("replies/", [ "in-reply-to: " <> replyUrl ])
Nothing -> ("notes/", [])
fullFileName = folder ++ fileName
putStrLn $ fullFileName
T.writeFile fullFileName $ T.unlines $
[ "---"
, "title: ''"
, "date: " <> packedDate
, "mastodon-original: " <> u
] ++ replyTo ++
[ "---"
, c
]
Boost u -> do
putStrLn $ "reposts/" ++ fileName
T.writeFile ("reposts/" ++ fileName) $ T.unlines
[ "---"
, "title: ''"
, "date: " <> packedDate
, "repost-of: " <> u
, "---"
]
main :: IO ()
main = do
contents <- BS.readFile "outbox.json"
let maybeAS = eitherDecode contents
case maybeAS of
Right as -> do
putStrLn "Parsed!"
let public = "https://www.w3.org/ns/activitystreams#Public"
followers = "https://mastodon.social/users/jaklt/followers"
filteredItems = filter (\it -> public `elem` to it || followers `elem` to it)
$ orderedItems as
forM_ (filteredItems) handleItem
Left err -> putStrLn err
One thing to note is that I also decided to publish posts that were originally available only to my followers. This is hardcoded in the url assigned to followers
. If you also want to make those previously private posts available change followers
to the corresponding url of your profile. Or remove the second branch of ||
in filteredItems
if you only want to export publicly available posts.
Assuming that we saved the haskell file as export.hs
then the cabal file export.cabal
is as follows:
name: export
version: 0.1.0.0
build-type: Simple
cabal-version: >= 1.10
executable export
main-is: export.hs
build-depends: base
, aeson
, time
, extra
, bytestring
, text
ghc-options: -threaded
default-language: Haskell2010
To export everything, it's enought to just run stack build
followed by ./export
.
The source files are also published at gist.github.com.
Responses (?)
JavaScript needs to be enabled to show responses. (Although, it is not necessary for posting them.)