Haskell Journal

Nov 27, 2024
Day 1
case args of
  ("--url" : url : _) -> ...
  _ -> ...
Day 2
Day 3
Day 4
Day 5
runApp :: Config -> AppM a -> IO (Either e a)
runApp config app =
  runExceptT (runReaderT app config)
  -- the config is passed in as an argument: `ask` only works inside AppM, not in plain IO
  -- runReaderT runs the app and extracts the inner monad (ExceptT e IO a)
  -- runExceptT then runs the inner monad and returns an IO (Either e a)
Right (Left <some_error_message>)
liftEitherAppM :: Either SomeException a -> AppM a
liftEitherAppM = lift . either throwE return

Update: In the morning, I had the idea of looking at open-source Haskell projects to see what patterns they use. I looked at Hakyll and PostgREST: both projects seem to just pass the config to the relevant functions. So for now I will simplify the app and pass Config (which holds the connection pool, so that the inner functions that deal with the DB can get a connection to work with) directly to the functions instead of using the ReaderT monad transformer, until I have more ideas and a better understanding of the monad.
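
A minimal sketch of what "just pass the config" might look like. The `Config` fields and the helper functions below are hypothetical stand-ins (the real `Config` holds a connection pool, not a file path), and only `base` is used:

```haskell
-- Hypothetical stand-in for the real Config, which carries a connection pool.
data Config = Config
  { dbFile   :: FilePath
  , maxFeeds :: Int
  }

-- Each function that needs the config simply takes it as its first argument.
dbLocation :: Config -> String
dbLocation config = "using database at " ++ dbFile config

-- Callers pass the same value down; nothing is hidden in a monad stack.
canAddFeed :: Config -> Int -> Bool
canAddFeed config currentCount = currentCount < maxFeeds config

-- The top level builds (or receives) the config once and hands it on.
runWith :: Config -> IO ()
runWith config = do
  putStrLn (dbLocation config)
  print (canAddFeed config 3)
```

The trade-off is an extra argument on every function that touches the config, in exchange for plain function types that are easier to read while the transformer stack is still unfamiliar.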

Day 6
type App a = Config -> ExceptT AppError IO a
extractData :: [Tag String] -> FeedItem
extractData tags =
  let title = getInnerText $ takeBetween "<title>" "</title>" tags
      linkFromYtFeed = extractLinkHref tags -- youtube specific
      link = case getInnerText $ takeBetween "<link>" "</link>" tags of
        "" -> Nothing
        x -> Just x
      pubDate = case getInnerText $ takeBetween "<pubDate>" "</pubDate>" tags of
        "" -> Nothing
        x -> Just x
      updatedDate = case getInnerText $ takeBetween "<updated>" "</updated>" tags of
        "" -> Nothing
        x -> Just x
      updated = pubDate <|> updatedDate
   in FeedItem {title = title, link = link <|> linkFromYtFeed, updated = fromMaybe "" updated}
Day 7
...
res <- try $ withResource connPool handleSomething :: IO (...)
...
where
  handleSomething :: IO ... -- this is where I play with and finalize the type till the compiler stops complaining
  handleSomething = undefined
try' :: (String -> AppError) -> IO a -> IO (Either AppError a)
try' mkError action = do
  res <- (try :: IO a -> IO (Either SomeException a)) action
  pure $ case res of
    Left e -> Left . mkError $ show e
    Right a -> Right a
Day 8
> rss-digest add <url> # this adds an XML URL to the database, correctly showing an error if the URL is already added or it's an invalid URL.
> rss-digest refresh # this fetches all feed links from all the RSS feeds in the database, and then updates the feed_items table.
> rss-digest purge # nukes the whole thing

Update:

Day 9
Day 10
Day 11
parseDate datetime = fmap utctDay $ firstJust $ map tryParse [fmt1, fmt2, fmt3, fmt4, fmt5, fmt6]
   where
     fmt1 = "%Y-%m-%dT%H:%M:%S%z"
     fmt2 = "%a, %d %b %Y %H:%M:%S %z"
     fmt3 = "%a, %d %b %Y %H:%M:%S %Z"
     fmt4 = "%Y-%m-%dT%H:%M:%S%Z"
     fmt5 = "%Y-%m-%dT%H:%M:%S%Q%z"
     fmt6 = "%Y-%m-%dT%H:%M:%S%Q%Z"
     ...rest of the code
Day 12
Day 13
runApp :: App a -> IO ()
runApp app = do
  let template = $(embedFile "./template.html")
  rdigestPath <- lookupEnv "RDIGEST_FOLDER"
  case rdigestPath of
    Nothing -> showAppError $ GeneralError "It looks like you have not set the RDIGEST_FOLDER env. `export RDIGEST_FOLDER=<full-path-where-rdigest-should-save-data>`"
    Just rdPath -> do
      pool <- newPool (defaultPoolConfig (open (getDBFile rdPath)) close 60.0 10)
      let config = Config {connPool = pool, template = BS.unpack template, rdigestPath = rdPath}
      res <- (try :: IO a -> IO (Either AppError a)) $ app config
      destroyAllResources pool
      either showAppError (const $ return ()) res
parseURL :: String -> Maybe URL
parseURL url = case parseURI url of
  Just uri -> (if uriScheme uri `elem` ["http:", "https:"] then Just url else Nothing)
  Nothing -> Nothing

Can be written as:

parseURL :: String -> Maybe URL
parseURL url = parseURI url >>= \uri -> if uriScheme uri `elem` ["http:", "https:"] then Just url else Nothing
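
The same shape can also be written in do-notation with `guard` from `Control.Monad`. The sketch below is only an illustration: to stay self-contained it replaces `parseURI` and `uriScheme` with a hypothetical `schemeOf` helper that merely splits off the scheme, so it does not validate URLs the way the real parser does.

```haskell
import Control.Monad (guard)

type URL = String

-- Hypothetical stand-in for `uriScheme <$> parseURI url`: returns the text up to
-- and including the first ':' (e.g. "https:"), or Nothing if there is no scheme.
schemeOf :: String -> Maybe String
schemeOf url = case break (== ':') url of
  (scheme, ':' : _) | not (null scheme) -> Just (scheme ++ ":")
  _                                     -> Nothing

-- In the Maybe monad a failed `guard` short-circuits to Nothing,
-- which replaces the explicit if/then/else.
parseURL :: String -> Maybe URL
parseURL url = do
  scheme <- schemeOf url
  guard (scheme `elem` ["http:", "https:"])
  pure url
```
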
getDomain :: Maybe String -> String
getDomain url =
  let maybeURI = url >>= parseURI >>= uriAuthority
   in maybe "" uriRegName maybeURI
Day 14
Day 15
Day 16

One of the things that has been bugging me since I wrapped up most of the rdigest project is that I could not get it to work in a GitHub repo, where I planned to run it on a cron so that digests get created automatically every day and hosted somewhere I can read them from anywhere.

My first attempt was to upload my locally built binary and see if it would work in a GH action running on a macos-latest machine. It did not. I spent a little time on it before my day job took precedence, and because I couldn't get it to work, I decided to park it there and come back later.

Coming back later, I decided to make the rdigest repo build its binaries on ubuntu-latest and release the artefact in the repo. This was exceedingly simple (it probably needed a couple of iterations to get the right configuration options for GHCup, Cabal, GHC, tagging etc.). This worked nicely, and all that was left to do was to consume the release in my rdigest-data repo's cron action.

With a few iterations, I got all of it tied up. The cron ran once every day and updated the digest for that particular day. With GH Pages set up on that repo, I was able to just hit the URL, see a list of all digests, and read each digest at leisure.

But this unearthed a new problem: since I ran the digest update just once a day, and the process would only update the digest for "today" (whatever today was at the time of running the binary), some posts could "slip" the digest depending on the timing. I haven't spent much time thinking about the optimal approach to fixing this, but in the meantime it made sense to just update all digest-days once I have refreshed (and saved posts from) all feeds in my list. It is an ugly, nuclear solution, but it ensures all my digest files are up-to-date.
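
The "update every digest-day" idea can be sketched roughly as below. This is only an illustration under assumptions: `Post`, `postDay`, `postTitle`, and `digestsByDay` are hypothetical names, days are plain ISO date strings, and the real project reads saved posts from its database rather than from a list.

```haskell
import Data.Function (on)
import Data.List (groupBy, sortOn)

-- Hypothetical, simplified post record; the day is an ISO date such as "2024-12-01".
data Post = Post
  { postDay   :: String
  , postTitle :: String
  } deriving (Eq, Show)

-- Group all saved posts by day, so that every digest can be rewritten
-- after a refresh instead of only the digest for "today".
-- sortOn is stable, so posts keep their input order within a day.
digestsByDay :: [Post] -> [(String, [String])]
digestsByDay posts =
  [ (postDay p, map postTitle grp)
  | grp@(p : _) <- groupBy ((==) `on` postDay) (sortOn postDay posts)
  ]
```

With this shape, a post that is fetched late still lands in the digest for its own day, at the cost of rewriting digest files that have not changed.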

Life has been getting in the way a lot recently, so there's been a pause in activity, but I am itching to refine the rdigest codebase to elegantly separate out the functional and imperative bits. I also want to revisit what interesting things the project could grow into, to make it more useful than just producing digests.