Posts

Simple Haskell scripting

It's been quite some time since Martín Escardó told me about the somewhat forgotten Haskell function

interact :: (String -> String) -> IO ()

It takes a function of type String -> String, feeds the entire program input to it, and writes whatever the function returns to the program output. For example, the following Haskell program prints back the first 10 characters of its input.

main :: IO ()
main = interact (take 10)

This becomes really useful when combined with the functions lines :: String -> [String] and unlines :: [String] -> String. Writing Haskell scripts that process line-oriented text data then becomes simple. A typical script looks something like this.

main :: IO ()
main = interact pipe
  where
    pipe = unlines . map linepipe . lines

linepipe :: String -> String
linepipe = ... -- a function that handles a single line of input
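
As a concrete instance of the skeleton above, here is a minimal sketch in which linepipe upper-cases each line. The choice of transformation is only an illustration and is not taken from any real script; any function of type String -> String would fit in its place.

```haskell
import Data.Char (toUpper)

-- Upper-case a single line of input (illustrative choice only).
linepipe :: String -> String
linepipe = map toUpper

-- Split the input into lines, transform each one, and join them again.
pipe :: String -> String
pipe = unlines . map linepipe . lines

main :: IO ()
main = interact pipe
```

Because pipe is a pure function, it can be tried out in GHCi on a string literal before the script is ever run on real input.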

There are quite a few Haskell scripting libraries out there and they get quite a bit of attention. However, I haven't seen many articles praising the simplicity and power of the interact+lines+unlines pattern.

As a bonus, here is one real-world example. Let's say we want to convert the CSV data of this form

Alice;travelling, maths;https://alice.crypto
Bob;espionage;
Jonáš;;
...

into HTML of this form

<ul>
<li><a href="https://alice.crypto">Alice</a> (travelling, maths)</li>
<li>Bob (espionage)</li>
<li>Jonáš</li>
...
</ul>

The following is a simple script I wrote to do just that with interact. I decided to use the Data.Text.Lazy.IO version of interact because I needed to handle Unicode characters properly. The benefit of the lazy version is that the script can handle inputs that don't fit into memory.

#!/usr/bin/env runhaskell

{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text.Lazy as T
import qualified Data.Text.Lazy.IO as T

main :: IO ()
main = do
  putStrLn "<ul>"
  T.interact pipe
  putStrLn "</ul>"
  where
    pipe :: T.Text -> T.Text
    pipe = T.unlines . map linepipe . T.lines

linepipe :: T.Text -> T.Text
linepipe line =
  "<li>" <> name <> hobbies <> "</li>"
  where
    (a:b:c:xs) = T.split (== ';') line

    name | c == "" = a
         | otherwise = "<a href=\"" <> c <> "\">" <> a <> "</a>"

    hobbies | b /= "" = " (" <> b <> ")"
            | otherwise = ""

To run it, type cat data.csv | ./script.sh. This works provided that script.sh is executable and the text package is installed.
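
One caveat: the pattern (a:b:c:xs) in the script only matches lines with at least three fields, so an empty line or a line without its trailing semicolons would stop the script with a pattern-match failure. The following is a sketch of one possible way around that, padding the list of fields with empty strings. The helper name fields is hypothetical and not part of the original script.

```haskell
{-# LANGUAGE OverloadedStrings #-}

import qualified Data.Text.Lazy as T
import qualified Data.Text.Lazy.IO as T

-- Split a line on ';' and pad with empty fields, so that lines with fewer
-- than three fields no longer cause a pattern-match failure.
-- (Hypothetical helper, not part of the original script.)
fields :: T.Text -> (T.Text, T.Text, T.Text)
fields line =
  case T.split (== ';') line ++ repeat "" of
    (a:b:c:_) -> (a, b, c)
    _         -> ("", "", "")  -- unreachable: the padded list is infinite

linepipe :: T.Text -> T.Text
linepipe line = "<li>" <> name <> hobbies <> "</li>"
  where
    (a, b, c) = fields line

    name | T.null c  = a
         | otherwise = "<a href=\"" <> c <> "\">" <> a <> "</a>"

    hobbies | T.null b  = ""
            | otherwise = " (" <> b <> ")"

main :: IO ()
main = do
  putStrLn "<ul>"
  T.interact (T.unlines . map linepipe . T.lines)
  putStrLn "</ul>"
```

With this version a line such as Bob;espionage (no trailing semicolon) is treated the same as Bob;espionage;. Like the original, the sketch does not escape HTML special characters in the input.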

Exporting Mastodon (ActivityPub) posts

In an attempt to own the content I produce on the internet, I decided to move all my Mastodon posts here. To do that, I wrote a little Haskell script that takes the Mastodon export (it only reads outbox.json) and produces files in the folders notes/, replies/ and reposts/ with the correct frontmatter (following the convention I adopted from IndieKit).

Here is the script, in case somebody finds it useful:

{-# LANGUAGE OverloadedStrings #-}
module Main where

import Data.Aeson
import Data.Time.Format.ISO8601
import Control.Monad
import Control.Applicative
import Data.List.Extra (split)
import Data.Time.Clock (UTCTime(..))
import qualified Data.ByteString.Lazy as BS
import qualified Data.Text as T
import qualified Data.Text.IO as T

data ActivityStreams = AS { orderedItems :: [ASItem] }
  deriving (Show)

data ASItem = I
  { itemId :: String
  , asObject :: ASObject
  , published :: UTCTime
  , to :: [String]
  } deriving (Show)

data ASObject =
    Note
      { url :: T.Text
      , content :: T.Text
      , inReplyTo :: Maybe T.Text
      }
  | Boost { boostUrl :: T.Text }
  deriving (Show)

instance FromJSON ActivityStreams where
  parseJSON (Object v) = AS <$> v .: "orderedItems"
  parseJSON _ = mzero

instance FromJSON ASItem where
  parseJSON (Object v) = I <$> v .: "id" <*> v .: "object" <*> v .: "published" <*> v .: "to"
  parseJSON _ = mzero

instance FromJSON ASObject where
  parseJSON (Object v) = Note <$> v .: "url" <*> v .: "content" <*> v .: "inReplyTo"
  parseJSON (String t) = return $ Boost t
  parseJSON _ = mzero


handleItem :: ASItem -> IO ()
handleItem item = do
  let isoDate = iso8601Show $ published item
      packedDate = T.pack isoDate
      fileName = concat
        [ take 10 isoDate -- extracts the YYYY-MM-DD part
        , "-mastodon:"
        , (split (== '/') $ itemId item) !! 6 -- extracts the Mastodon post id
        , ".html"
        ]

  case asObject item of
    Note u c r -> do
      let (folder, replyTo) =
            case r of
              Just replyUrl -> ("replies/", [ "in-reply-to: " <> replyUrl ])
              Nothing -> ("notes/", [])
          fullFileName = folder ++ fileName

      putStrLn $ fullFileName

      T.writeFile fullFileName $ T.unlines $
        [ "---"
        , "title: ''"
        , "date: " <> packedDate
        , "mastodon-original: " <> u
        ] ++ replyTo ++
        [ "---"
        , c
        ]

    Boost u -> do
      putStrLn $ "reposts/" ++ fileName

      T.writeFile ("reposts/" ++ fileName) $ T.unlines
        [ "---"
        , "title: ''"
        , "date: " <> packedDate
        , "repost-of: " <> u
        , "---"
        ]


main :: IO ()
main = do
  contents <- BS.readFile "outbox.json"
  let maybeAS = eitherDecode contents

  case maybeAS of
    Right as -> do
      putStrLn "Parsed!"

      let public = "https://www.w3.org/ns/activitystreams#Public"
          followers = "https://mastodon.social/users/jaklt/followers"

          filteredItems = filter (\it -> public `elem` to it || followers `elem` to it)
                        $ orderedItems as

      forM_ (filteredItems) handleItem

    Left err -> putStrLn err

One thing to note is that I also decided to publish posts that were originally visible only to my followers. This is hardcoded in the URL assigned to followers. If you also want to publish your followers-only posts, change followers to the corresponding URL of your profile. If you only want to export public posts, remove the second operand of || in filteredItems.
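
To make the two filtering options explicit, here is a small sketch that pulls the test out into named predicates over the list of recipients. The names isPublic and isPublicOrFollowers are hypothetical and do not appear in the script; the two URLs are the ones used in the script, and the followers URL has to be replaced with the one for your own profile.

```haskell
-- Audience URLs, as in the script above. The followers URL is specific to
-- one profile and must be replaced with your own.
public, followers :: String
public    = "https://www.w3.org/ns/activitystreams#Public"
followers = "https://mastodon.social/users/jaklt/followers"

-- Keep an item only if it was addressed to the public collection.
-- (Hypothetical name, not part of the original script.)
isPublic :: [String] -> Bool
isPublic recipients = public `elem` recipients

-- Keep an item if it was addressed to the public or to followers.
-- (Hypothetical name, not part of the original script.)
isPublicOrFollowers :: [String] -> Bool
isPublicOrFollowers recipients =
  public `elem` recipients || followers `elem` recipients

main :: IO ()
main = do
  let followersOnly = [followers]
  print (isPublic followersOnly)             -- prints False
  print (isPublicOrFollowers followersOnly)  -- prints True
```

In the export script, filter (isPublic . to) would then select only public posts, while filter (isPublicOrFollowers . to) would reproduce the behaviour described above.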

Assuming that we saved the Haskell file as export.hs, the cabal file export.cabal is as follows:

name:               export
version:            0.1.0.0
build-type:         Simple
cabal-version:      >= 1.10

executable export
  main-is:          export.hs
  build-depends:    base
                  , aeson
                  , time
                  , extra
                  , bytestring
                  , text
  ghc-options:      -threaded
  default-language: Haskell2010

To export everything, it's enough to run stack build followed by ./export.

The source files are also published at gist.github.com.

Welcome

I've built this website as an experiment, to gather my social activity on the internet. I believe that the internet should be inhabited by many small independent websites which communicate among themselves, as opposed to one or two omnipresent platforms like Facebook or Twitter that manage all our activity for us but also take away our freedoms. Chris Aldrich wrote a nice article about it here.

How does this website work? Typically I write new posts, likes or bookmarks in either Indigenous, Quill or Micropublish. Once I save, the content is sent to my GitHub repository via (an old version of) Indiekit. Afterwards, Netlify is notified, which then builds and serves the website. Works like a charm ;-)

I don't want to write a tutorial on how to do it; there are plenty of good resources out there. If you also want to try it out, a good place to start is indiewebify. One tip: start slowly, build only the basic functionality first, see how you like it, and add more only after some time.

This website is intentionally lightweight (in fact, it's less than 512 kB and also less than 250 kB).

Useful references