The Enterprise

Building a Personal Database with SQLite and Haskell (part 1)

Michael Pankov •

As some of you may know, I and Alexander Turok record a podcast. It's called "Code Speak Loop" and we discuss programming and related stuff.

To somehow manage the topics, guests and schedule, I decided to develop a small-ish tool to store all the planned podcasts in a sensible way.

To be able to easily sort and query the podcasts directory, we're going to use SQLite embedded database. And since I like Haskell, the program is going to be written in this language.

Second part of the post is available here.

Beware: this post is not very educational! It shows some quickly-written dirty Haskell code, which you may find both ugly and unconventional and hard to understand for a novice (and it is).
In case you don't want to follow-through, there's a GitHub repository with this source code.

I'm going to use GHC 7.6.3 on Ubuntu 13.10 x64, which I built from source myself using this instruction. To install SQLite, I had to run
apt-get install libsqlite3-dev sqliteman sqlite3
That also installs a GUI tool for managing SQLite databases: sqliteman. It's pretty straight-forward.

For GHC interfacing with SQLite, we're going to need HDBC and HDBC-sqlite3. But first, let's create the project and a cabal sandbox.
mkdir haskpod
cd haskpod
cabal init

>Package name? [default: haskpod]
>Package version? [default: 0.1.0.0]
>Please choose a license:
>...
>Your choice? [default: (none)] 10
>Author name? [default: Michael Pankov]
>Maintainer email? [default: ...@gmail.com]
>Project homepage URL?
>Project synopsis? A podcast management tool
>Project category:
>...
>Your choice? [default: (none)] 18
>What does the package build:
>...
>Your choice? 2
>What base language is the package written in:
>...
>Your choice? [default: Haskell2010]
>Include documentation on what each field means (y/n)? [default: n] y
>
>Guessing dependencies...
>
>Generating LICENSE...
>Warning: unknown license type, you must put a copy in LICENSE yourself.
>Generating Setup.hs...
>Generating haskpod.cabal...
>
>You may want to edit the .cabal file and add a Description field.
>
touch LICENSE
cabal sandbox init
cabal install --only-dependencies
Then let's create the simplest Main.hs
module Main where

main = putStrLn "Hello world"

And then we can build and run the project!
cabal configure
> ...
cabal build
> ...
cabal run
> ...
> Hello world
Now, to the database.

Let's start with main function, adapted from the cmdlib example.
main = getArgs >>= executeR Main {} >>= \opts -> do
let ts = start opts
Just t = (parseTime defaultTimeLocale "%D %R" ts) :: Maybe UTCTime
gs = csv $ guests opts
ps = csv $ topics opts
p = Podcast (episodeNumber opts) gs ps t
d <- doesFileExist databaseFilePath
when (not d) initDatabase
savePodcast p
print p
executeR is cmdlib's way of processing the arguments when using record style of arguments specification. "%D %R" format of time gives us "06/30/14 12:30 PM UTC".  start, guests, topics and episodeNumber are all defined in the cmdlib record type:
data Main = Main { episodeNumber :: Int
, guests :: String
, topics :: String
, start :: String }
deriving (Typeable, Data, Eq, Show)
I tried to use lists of strings for guests and topics, but it proved not worth it: simpler to just split the string we get from parser. The start time suffered the same problem, so I decided to just parse it myself.

The record fields we described are going to be bound to the actual options passed to program by Attributes instance and a RecordCommand instance from cmdlib:
instance Attributes Main where
attributes _ = group "Options" [
episodeNumber %> [ Help "Episode number", Default (1 :: Integer) ],
guests %> Help "Guests of the episode",
topics %> Help "Topics of the episode",
start %> [ Help "Start date and time"
, Default (Data.DateTime.fromGregorian 2014 1 1 0 0 0) ]
]

instance RecordCommand Main where
mode_summary _ = "Podcast management tool."

The defaults, where not specified, are empty.

Then, we need some extensions for cmdlib to work:
 {-# LANGUAGE FlexibleInstances, MultiParamTypeClasses, DeriveDataTypeable #-}
I didn't delve into them. It seems there's nothing as controversial as Template Haskell, so I just let it be.

I implemented a Podcast record type as well. It is of not much use right now, but it shows the structure of entity we want to store, more accurately.
data Podcast = Podcast { episodeNumber_ :: Int
, guests_ :: [String]
, topics_ :: [String]
, start_ :: UTCTime
}
deriving (Show)

After all this preparation, we can get to actual database initialization:
initDatabase = do
conn <- connectSqlite3 databaseFilePath
run conn ("CREATE TABLE podcasts" ++
" ( id INTEGER PRIMARY KEY ASC " ++
" , number INTEGER NOT NULL " ++
" , topics TEXT NOT NULL " ++
" , guests TEXT NOT NULL " ++
" , start_ INTEGER NOT NULL ) ") []
commit conn
disconnect conn
return ()

databaseFilePath = "podcasts.db"

Notice some interesting things:
  • We have to specify the id as INTEGER PRIMARY KEY ASC to be able to not specify it when doing INSERT later.
  • We have to commit the transaction. There's a nicer to do it: withTransaction, which provides context to wrapped IO action.
  • I used underscore in the end of start, since start was highlighted as a keyword when I tried the query manually. Didn't check it though!
Now we can implement podcast saving:
savePodcast p = do
conn <- connectSqlite3 databaseFilePath
run conn "INSERT INTO podcasts VALUES (NULL, ?, ?, ?, ?)"
[toSql $ episodeNumber_ p, toSql $ unlines $ topics_ p,
toSql $ unlines $ guests_ p, toSql $ start_ p]
commit conn
disconnect conn
return ()
Nothing particularly interesting. toSql uses Convertible instance to convert, pretty much, everything to everything. It even starts to feel like dynamic typing! :)

There's also small helper I used to separate the string of comma-separated guests' names and topics into lists of strings:
csv :: String -> [String]
csv s = case dropWhile isComma s of
"" -> []
s' -> w : csv s''
where (w, s'') =
break isComma s'
where isComma = (== ',')

It's a verbatim copy of words standard function, except for delimiter. Might be quite stupid to copy it like that.

That's all we need to implement saving of our naively defined Podcast records to SQLite. It took me a bit more than an hour, what probably supports the argument that Haskell is a great language for prototyping.

See the next post implementing showing the podcasts we saved. And stay tuned!



Just for reference: haskpod.cabal build-depends section:
  build-depends:       base == 4.6.*
-- we'll need to handle date and time of start of podcast recording
, time == 1.4.*
-- command line arguments handling
, cmdlib == 0.3.*
-- wrappers around time package
, datetime == 0.2.*
-- locale, for time formatting
, old-locale == 1.0.*
-- filesystem querying, i.e. does file exist?
, directory == 1.2.*
-- database interface stuff
, HDBC-sqlite3 == 2.3.*
, HDBC == 2.4.*
And the module header:
module Main where

import Control.Monad
import Data.Time
import Data.DateTime
import Database.HDBC
import Database.HDBC.Sqlite3
import System.Console.CmdLib (Attributes(..), RecordCommand(..), Typeable, Data,
Attribute(..), (%>), executeR, group, getArgs)
import System.Directory
import System.Locale


comments powered by Disqus