Authoring blog posts in Obsidian #
I’m using Gitea, Drone, and Hugo to watch for commits to my Obsidian vault, extract blog posts, and publish them to one of my servers. I run my stuff on Digital Ocean droplets, and I use Caddy for serving static sites.
Why does it work? #
It’s cheap, fast, and simple. Self-hosting means I have more control over what gets published. This could all be accomplished with GitHub Actions, but I’d have to have separate vaults/repositories for public vs private content or I’d have to just make all my notes public.
Why doesn’t it work? #
My original selection of pipeline images and commands was inefficient, incurring unnecessary network traffic and relying on third party package mirrors that suddenly started performing very badly.
Another important detail is media: the directory structure for my Obsidian vault and my site are very different.
I want to write blog posts with screenshots, media files, and more. Obsidian lets you drag and drop attachments, or link them manually with links in the form ![[path/to/attachment.png]].
Finally, Hugo is a great static site generator, but there are better options when you’re looking to publish content authored in Obsidian. In particular, the graph view is something that I’d love to bring into my blog. Luckily, Quartz is built directly on top of Hugo and comes with a theme and some helper utilities.
What are the options? #
The Requirements #
- attachment links must be transformed from ![[attachments/whatever.png]] to ![[notes/post-name/whatever.png]]
- the site must be built with Quartz instead of Hugo
Transforming links #
The first choice is whether I “fix” this during authoring, or during the publishing step. For the former, my options look something like this:
- manually typing the final URL into the note
- creating a complicated template system for generating Hugo shortcodes. In my head, this would use a prompter to let me select what attachment I want to insert, ask for resizing parameters, etc., and then generate a Hugo shortcode or an <img> tag.
None of these are satisfactory to me. I’d love to just drag and drop a piece of media into my note inside Obsidian and simply not have to think about it any further.
This leaves implementing something during the publishing pipeline. Now that I’ve got my Drone pipeline working, it’s the perfect place to do transformations. This path presents a variety of possibilities falling on a spectrum somewhere between a bash script invoking sed and a custom (Golang) program that parses frontmatter and markdown and applies pre-configured transformations.
Quartz #
The Quartz repo has a few built-in options for turning your notes into a website: a Dockerfile, a Makefile, and instructions on how to build everything from scratch. All of these are great, and I played with them all at different times to figure out which was a good fit.
Pipelines: More than meets the eye #
Unsurprisingly, I opted to extend my existing Drone pipeline with a transformer. This part of the pipeline has been in the back of my mind since the beginning, more or less, but it was much more important to get things stable first.
The pipeline I’m finally satisfied with looks like this, with checked boxes indicating what I had implemented at the start of this phase of the project.
- Create a temporary shared directory, /tmp/blog
- Clone the vault repository
- do a submodule update and use git-lfs to pull down attachments
- clone my forked Quartz repository into /tmp/blog
- Copy posts from $VAULT/Resources/blog/post-name.md to /tmp/blog/content/notes/post-name/index.md
- Scan all index.md files in /tmp/blog/content/ for links that look like ![[attachments/whatever.png]], find whatever.png and copy it into the /tmp/blog/content/notes/post-name/ directory for that index.md.
- Scan all index.md files in /tmp/blog/content/ for links that look like ![[attachments/whatever.png]] and edit them to ![[notes/post-name/whatever.png]]
- Run the Quartz build command
- Copy the static site to destination web server
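The copy step in the list above boils down to a path mapping. Here is a minimal Go sketch of that mapping, not the code the pipeline actually runs: postDest and its contentRoot parameter are hypothetical names, and the example paths are illustrative.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// postDest maps a post in the vault to its page-bundle location in the site.
// postDest and contentRoot are hypothetical; in the pipeline above the
// content root would be /tmp/blog/content.
func postDest(src, contentRoot string) string {
	// "post-name.md" -> "post-name"
	name := strings.TrimSuffix(filepath.Base(src), filepath.Ext(src))
	// Each post gets its own directory with an index.md inside it.
	return filepath.Join(contentRoot, "notes", name, "index.md")
}

func main() {
	fmt.Println(postDest("/vault/Resources/blog/post-name.md", "/tmp/blog/content"))
}
```

Giving each post its own directory is what lets the later steps drop attachments next to the index.md that references them.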
Hours and hours of debugging pipelines later #
Drone Volumes #
The linchpin of this whole operation is having a temporary workspace that all these tools can operate on in sequence. To that end, I used Drone’s Temporary Volumes to mount /tmp/blog in all the relevant pipeline steps.
Creating a temporary volume looks like this. I really couldn’t tell you what temp: {} is about; it certainly looks strange, but I never had the spare cycles to investigate.
volumes:
- name: blog
  temp: {}
Once you’ve created the volume, a pipeline step can mount it to a desired path. See below for an example of using your created volume.
Quartz #
Forking Quartz was easy; I’d done so late last year during another attempt to get this blog off the ground.
After a merge to get my fork up to date with upstream, I was able to slot this into the pipeline with the following.
- name: clone-quartz
  image: alpine/git
  volumes:
  - name: blog
    path: /tmp/blog
  commands:
  - git clone -b hugo https://github.com/therealfakemoot/quartz.git /tmp/blog
This sets the stage for building the site, starting with a step I implemented previously: ![[Resources/attachments/copy-posts-checkbox-screenshot.png]]
I opted to stop committing content to a blog repository and instead clone the static site skeleton into the pipeline for a few reasons:
- I already have reproducibility by virtue of building things with Docker and having sources of truth in git.
- It was an unnecessary layer of complexity
- It was an unnecessary inversion of control flow
Configuring Quartz had its rocky moments. I’ve had to wrestle with frontmatter a lot: confusing TOML and YAML syntax can break your build or break certain features like the local graph.
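One cheap guard against mixing the two syntaxes is to check which delimiter opens a post. Hugo reads a leading --- line as YAML frontmatter and a leading +++ line as TOML. The sketch below is a hypothetical helper, not part of my pipeline; frontmatterFormat is a name made up for illustration.

```go
package main

import (
	"fmt"
	"strings"
)

// frontmatterFormat reports which frontmatter syntax a document opens with,
// based only on its first line. This is a hypothetical helper: it does not
// parse or validate the frontmatter itself.
func frontmatterFormat(doc string) string {
	first, _, _ := strings.Cut(doc, "\n")
	switch strings.TrimSpace(first) {
	case "---":
		return "yaml"
	case "+++":
		return "toml"
	default:
		return "unknown"
	}
}

func main() {
	doc := "---\ntitle: Authoring blog posts in Obsidian\n---\nbody text"
	fmt.Println(frontmatterFormat(doc))
}
```

Knowing the format up front makes it easier to spot a key = value line inside a YAML block, or a key: value line inside a TOML one.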
Gathering Media #
This step ended up being pretty fun to work on. I took the opportunity to write this in Go because I knew I could make it fast and correct.
The process is simple:
- Walk a target directory and find an index.md file
- When you find an index.md file, scan it for links of the form [[attachments/whatever.png]]
- Find whatever.png in the vault’s attachments directory and copy it adjacent to its respective index.md file.
walkFunc is what handles step 1. You call err := filepath.Walk(target, walkFunc(attachments)) and it will call your walkFunc for every filesystem object the OS returns.
This piece of code checks if we’ve found a blog post and then chucks it to scanReader.
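func walkFunc(matchChan matches) filepath.WalkFunc {
	return func(path string, info fs.FileInfo, err error) error {
		// Skip entries the walk could not read instead of aborting the whole run.
		if err != nil {
			return nil
		}
		// Only files ending in index.md are blog posts; ignore everything else.
		if info.IsDir() || !strings.HasSuffix(path, "index.md") {
			return nil
		}
		f, err := os.Open(path)
		if err != nil {
			return err
		}
		// Close the file once it has been scanned so handles don't leak.
		defer f.Close()
		scanReader(f, path, matchChan)
		return nil
	}
}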
scanReader iterates line-by-line and uses a regular expression to grab the necessary details from matching links.
type Attachment struct {
	Filename string
	Note     string
}

type matches chan Attachment

// scanReader sends one Attachment on matchChan for every attachment link in r.
func scanReader(r io.Reader, path string, matchChan matches) {
	log.Printf("scanning markdown file: %s", path)
	pat := regexp.MustCompile(`\[\[(Resources/attachments/.*?)\]\]`)
	// The note name is the directory that contains index.md.
	noteFilename := filepath.Base(filepath.Dir(path))
	s := bufio.NewScanner(r)
	for s.Scan() {
		found := pat.FindAllStringSubmatch(s.Text(), -1)
		if len(found) == 0 {
			continue
		}
		log.Printf("media found in %s: %#+v\n", path, found)
		for _, match := range found {
			log.Println("noteFilename:", noteFilename)
			// match[1] is the capture group: the vault-relative attachment path.
			matchChan <- Attachment{Filename: match[1], Note: noteFilename}
		}
	}
	if err := s.Err(); err != nil {
		log.Printf("error scanning %s: %v", path, err)
	}
}
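To see what that regular expression captures, here is a small standalone sketch that runs the same pattern over a single line. findAttachments is a hypothetical helper and the sample line is made up.

```go
package main

import (
	"fmt"
	"regexp"
)

// attachmentLink is the pattern scanReader uses. The capture group holds the
// vault-relative path, and the non-greedy .*? stops at the first closing ]].
var attachmentLink = regexp.MustCompile(`\[\[(Resources/attachments/.*?)\]\]`)

// findAttachments returns the captured path of every attachment link on a line.
// It is a hypothetical helper written for this illustration.
func findAttachments(line string) []string {
	var paths []string
	for _, m := range attachmentLink.FindAllStringSubmatch(line, -1) {
		paths = append(paths, m[1])
	}
	return paths
}

func main() {
	line := "before ![[Resources/attachments/a.png]] and ![[Resources/attachments/b.jpg]] after"
	fmt.Println(findAttachments(line))
	// prints: [Resources/attachments/a.png Resources/attachments/b.jpg]
}
```

The non-greedy match matters when a line holds more than one link: a greedy .* would swallow everything between the first [[ and the last ]].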
Finally, moveAttachment receives a struct containing context (the location of the index.md file and the name of the attachment to copy) and performs a copy.
func moveAttachment(att Attachment, dest string) error {
	// Drop anything after the first "." so the note name maps to its directory.
	destPath := filepath.Join(dest, strings.Split(att.Note, ".")[0])
	log.Println("moving files into:", destPath)
	_, err := copyFile(att.Filename, filepath.Join(destPath, filepath.Base(att.Filename)))
	return err
}

// copyFile copies the regular file src to dst and returns the bytes written.
func copyFile(src, dst string) (int64, error) {
	sourceFileStat, err := os.Stat(src)
	if err != nil {
		return 0, err
	}
	if !sourceFileStat.Mode().IsRegular() {
		return 0, fmt.Errorf("%s is not a regular file", src)
	}
	source, err := os.Open(src)
	if err != nil {
		return 0, err
	}
	defer source.Close()
	destination, err := os.Create(dst)
	if err != nil {
		return 0, err
	}
	defer destination.Close()
	nBytes, err := io.Copy(destination, source)
	return nBytes, err
}
This ended up being the most straightforward part of the process by far. I packed this in a Dockerfile, using build stages to improve caching.
FROM golang:latest as BUILD
WORKDIR /gather-media
COPY go.mod ./
# COPY go.sum ./
RUN go mod download
COPY *.go ./
RUN go build -o /bin/gather-media
Integration into the pipeline is here:
- name: gather-media
  image: code.ndumas.com/ndumas/gather-media:latest
  volumes:
  - name: blog
    path: /tmp/blog
  commands:
  - gather-media -target /tmp/blog/content/notes
Full code can be found here.
Transforming Links #
Link transformation ended up being pretty trivial, but it took way way longer than any of the other steps because of an embarrassing typo in a find invocation. Another Docker image, another appearance of the blog volume.
The typo in my find was using contents/ instead of content/. My code worked perfectly, but the pipeline wasn’t finding any files to run it against.
- name: sanitize-links
  image: code.ndumas.com/ndumas/sanitize-links:latest
  volumes:
  - name: blog
    path: /tmp/blog
  commands:
  - find /tmp/blog/content/ -type f -name 'index.md' -exec sanitize-links {} \;
sanitize-links is a bog-standard sed invocation. My original implementation tried to loop inside the bash script, but I realized I could refactor this into effectively a map() call and simplify things a whole bunch.
The pipeline calls find, which produces a list of filenames. Each filename is individually fed as an argument to sanitize-links. Clean and simple.
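#! /bin/sh
# Usage: sanitize-links path/to/post-name/index.md
echo "scanning $1 for attachments"
# The note name is the directory that contains index.md.
noteName=$(echo "$1" | awk -F'/' '{print $(NF-1)}')
# Rewrite attachment paths in place and log every changed line.
sed -i "s#Resources/attachments#notes/$noteName#w /tmp/changes.txt" "$1"
cat /tmp/changes.txt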
Lots of Moving Pieces #
If you’re reading this post and seeing images embedded, everything is working. I’m pretty happy with how it all came out. Each piece is small and maintainable. Part of me worries that there are too many pieces, though. Since gather-media is written in Go, I could extend it to handle some or all of the other steps.
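If I did fold link rewriting into gather-media, the core of it might look something like the sketch below. This is not existing code: noteName and rewriteLinks are hypothetical names, and unlike the sed command (which has no g flag and so replaces only the first match on each line), strings.ReplaceAll rewrites every occurrence.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// noteName returns the directory that holds an index.md, e.g. "post-name".
// Hypothetical helper; it plays the role of the awk call in sanitize-links.
func noteName(indexPath string) string {
	return filepath.Base(filepath.Dir(indexPath))
}

// rewriteLinks points attachment links at the note's own directory.
// Hypothetical helper; it replaces every occurrence, not just the first per line.
func rewriteLinks(content, indexPath string) string {
	return strings.ReplaceAll(content, "Resources/attachments", "notes/"+noteName(indexPath))
}

func main() {
	in := "![[Resources/attachments/whatever.png]]"
	fmt.Println(rewriteLinks(in, "/tmp/blog/content/notes/post-name/index.md"))
	// prints: ![[notes/post-name/whatever.png]]
}
```

Keeping the rewrite as a pure string-to-string function would make it easy to test without touching the filesystem.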
For the future #
Things I’d like to keep working on
- include shortcodes for images, code snippets, and the like
- customize the CSS a little bit
- customize the layout slightly
Unsolved Mysteries #
- What does temp: {} do? Why is it necessary?