Custom 404 Error Pages With Caddy V2
Getting Started
To provide a consistent basis for evaluating the configuration and behaviour of Caddy I will be leveraging Docker and a Caddy image produced by the Caddy team. At the time of writing, caddy:2.0.0-alpine is a small-footprint version of the latest release and should help keep the evaluation consistent.
Let's create a simple index.html for Caddy to serve, and a minimal Caddyfile configuration to serve it.
# create a webpage
echo "hello world" > index.html
# create a file server Caddyfile config
cat >Caddyfile <<EOL
localhost
root * /usr/share/caddy/
file_server
EOL
We should now be able to start a Caddy process in the foreground using the caddy:2.0.0-alpine Docker image and see what happens:
# start Caddy using Docker in the foreground
docker run --rm -p 80:80 -p 443:443 \
-v $PWD/Caddyfile:/etc/caddy/Caddyfile \
-v $PWD/index.html:/usr/share/caddy/index.html \
caddy:2.0.0-alpine
The index.html file we created is mounted at /usr/share/caddy/index.html, the path specified by the root directive in the Caddyfile configuration. The file_server (static file server) directive in the Caddyfile defaults to index.html and index.txt as index files.
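For reference, the list of index files can also be set explicitly with a file_server block - a sketch, not needed for this walkthrough:
file_server {
  index index.html
}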
Testing it out (in a separate terminal):
# -k allow any certificate
# -L follow redirects
# -D - dump headers to StdOut
curl -k -L -D - localhost/
HTTP/1.1 308 Permanent Redirect
Connection: close
Location: https://localhost
Server: Caddy
Date: ...
Content-Length: 0
HTTP/2 200
accept-ranges: bytes
content-type: text/html; charset=utf-8
etag: ...
last-modified: ...
server: Caddy
content-length: 12
date: ...
hello world
We can see that the response to http://localhost is a 308 Permanent Redirect to https://localhost. Caddy has a built-in feature for Automatic HTTPS which includes automatic redirects. Disabling Automatic HTTPS redirects is not possible in the 2.0.0 release via Caddyfile configuration, but can be done via JSON configuration. Finally, we see hello world - the contents of index.html.
Caddy Configuration Options
Caddy uses a JSON configuration structure which can be generated or adapted from other formats, the most common being the Caddyfile. A Caddyfile leverages a custom DSL to provide a human-friendly syntax, which is ultimately translated into a more verbose and potentially less ambiguous configuration. We can use the built-in caddy adapt command to generate the JSON representation of the Caddyfile we created previously:
$ cat Caddyfile
localhost
root * /usr/share/caddy/
file_server
# JSON output prettified for readability (e.g json_pp)
docker run --rm \
-v $PWD/Caddyfile:/etc/caddy/Caddyfile \
caddy:2.0.0-alpine \
caddy adapt --config /etc/caddy/Caddyfile
{
"apps": {
"http": {
"servers": {
"srv0": {
"listen": [
":443"
],
"routes": [
{
"match": [
{
"host": [
"localhost"
]
}
],
"handle": [
{
"handler": "subroute",
"routes": [
{
"handle": [
{
"handler": "vars",
"root": "/usr/share/caddy/"
},
{
"handler": "file_server",
"hide": [
"/etc/caddy/Caddyfile"
]
}
]
}
]
}
],
"terminal": true
}
]
}
}
}
}
}
This JSON configuration can be cross-referenced against the documentation for a precise explanation of all the details. The general gist is: listen on port 443, match a route when the request is to the host localhost, and handle that matched route by setting a root variable to /usr/share/caddy/ before asking the file_server handler to generate a response. Unlike the Caddyfile in this release, the JSON format also exposes configuration for the Automatic HTTPS feature.
To disable Automatic HTTPS redirects we can make the following amendment:
...
"srv0": {
"automatic_https": {
"disable_redirects": true
},
"listen": [
":443"
],
...
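One way to produce this file is to adapt the Caddyfile to JSON, save the output, and then add the automatic_https block by hand:
# adapt the Caddyfile and save the JSON output to config.json
docker run --rm \
-v $PWD/Caddyfile:/etc/caddy/Caddyfile \
caddy:2.0.0-alpine \
caddy adapt --config /etc/caddy/Caddyfile \
> config.json
# then edit config.json to add the automatic_https block shown above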
With the amended JSON configuration saved to config.json, we can test it out:
# name container caddy-test
docker run --rm -p 80:80 -p 443:443 --name caddy-test \
-v $PWD/config.json:/etc/caddy/config.json \
-v $PWD/index.html:/usr/share/caddy/index.html \
caddy:2.0.0-alpine caddy run --config /etc/caddy/config.json
$ curl -k -L -D - http://localhost/
curl: (52) Empty reply from server
$ curl -k -L -D - https://localhost/
HTTP/2 200
accept-ranges: bytes
content-type: text/html; charset=utf-8
etag: "qbdgzzc"
last-modified: ...
server: Caddy
content-length: 12
date: ...
hello world
By exec’ing into our Caddy container and using lsof we can see what ports Caddy is listening on:
$ docker exec -it caddy-test ash
/ apk add lsof
/ lsof -Pni
COMMAND PID USER FD TYPE DEVICE SIZE/OFF NODE NAME
caddy 1 root 3u IPv4 43292 0t0 TCP 127.0.0.1:2019 (LISTEN)
caddy 1 root 7u IPv6 44102 0t0 TCP *:443 (LISTEN)
Now that automatic HTTPS redirects are disabled, we can see that Caddy is no longer listening on port 80. We can see *:443 as expected, but why is Caddy listening on 127.0.0.1:2019?
Caddy Admin API
Unless told otherwise, the Caddy v2.0.0 release enables an HTTP admin endpoint listening on 127.0.0.1:2019.
The Admin API provides methods to control the Caddy process and configuration. We can exec into the container and use curl to investigate the API.
Get a shell inside the caddy-test container we started previously:
$ docker exec -it caddy-test ash
Install curl:
apk add curl
...
OK: 7 MiB in 19 packages
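The currently running configuration can be read back with a GET request to the /config/ endpoint (output omitted here):
curl -s http://localhost:2019/config/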
Shut down the Caddy process via an unauthenticated HTTP request:
curl -k -L -D - -X POST http://localhost:2019/stop
HTTP/1.1 200 OK
Date: ...
Content-Length: 0
The documentation states:
If you are running untrusted code on your server (yikes 😬), make sure you protect your admin endpoint by isolating processes, patching vulnerable programs, and configuring the endpoint to bind to a permissioned unix socket instead.
I think the documentation here is slightly understated, and perhaps the decision to expose the configuration and process control of Caddy by default over HTTP without authentication is questionable. A typical (and perhaps old-fashioned) setup of a web server would use file permissions to control who can modify the configuration - and probably an init system to look after the web server process. The init system would probably require root permissions to start and stop the service.
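For reference, binding the admin endpoint to a unix socket (which could then be locked down with file permissions) would look roughly like this as a Caddyfile global option - the socket path here is just an example:
{
  admin unix//run/caddy-admin.sock
}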
Before moving on to configure a unix socket with restrictive file permissions as the Admin API endpoint, perhaps we should question whether having a configuration API is actually a good idea at all.
Configuration Reload Behaviour
What happens to existing connections when configuration is updated via an Admin API request?
To find out, let's hook up a PHP script that sleeps for 20 seconds before printing out “completed”. If we perform a config update via the API whilst the request is in progress, we hope to get a “completed” response and a successful update of the configuration.
The index.php script:
cat >index.php <<EOL
<?php
sleep(20);
echo "completed";
EOL
The Caddyfile:
cat >Caddyfile <<EOL
localhost
root * /usr/share/caddy/
php_fastcgi 127.0.0.1:9000
EOL
Adapt Caddyfile to JSON:
docker run --rm \
-v $PWD/Caddyfile:/etc/caddy/Caddyfile \
caddy:2.0.0-alpine \
caddy adapt --config /etc/caddy/Caddyfile \
> config.json
Create a modified copy of config.json (renaming the server from srv0 to srv1 is enough to count as a configuration change):
sed 's/"srv0"/"srv1"/' config.json > config2.json
Run Caddy (having stopped any previously running instance with Ctrl-C):
docker run --rm -p 80:80 -p 443:443 --name caddy-test \
-v $PWD/config.json:/etc/caddy/config.json \
-v $PWD/config2.json:/etc/caddy/config2.json \
-v $PWD/index.php:/usr/share/caddy/index.php \
caddy:2.0.0-alpine caddy run --config /etc/caddy/config.json
Install curl and PHP-FPM in the running container, then start PHP-FPM:
$ docker exec -it caddy-test ash
apk add curl php7-fpm && php-fpm7
Check that PHP-FPM and the index.php script work:
$ curl -k -L -D - https://localhost/index.php
HTTP/2 200
content-type: text/html; charset=UTF-8
server: Caddy
x-powered-by: PHP/7.3.18
content-length: 9
date: ...
completed
Run a PHP request in one terminal and make a request to update the config via API in another terminal:
# Terminal 1
$ curl -k -L -D - https://localhost/index.php
...
# Terminal 2
$ docker exec -it caddy-test ash
curl -D - -X POST "http://localhost:2019/load" \
-H "Content-Type: application/json" \
-d @/etc/caddy/config2.json
HTTP/1.1 200 OK
Date: ...
Content-Length: 0
Connection: close
# Terminal 1
curl: (52) Empty reply from server
We did not get a “completed” response. Trying curl -k -L -D - https://localhost/index.php again should show that the configuration is not broken. Caddy just dropped our request when updating the configuration.
Why use the Admin API?
I could appreciate live config changes via the API being useful in some circumstances if existing connections were honoured and not just dropped. On a small scale - with a single web server buzzing along - perhaps a Red/Black or Blue/Green deployment model would not quite deliver on cost/benefit. But even then there are additional complexities added by the API mechanism. For example, when an update to configuration is made via the API, there is a discrepancy between the in-memory configuration and the on-disk representation. The Caddy documentation does talk about a few ways to handle this:
Important note: This should be obvious, but once you use the API to make a change that is not in your original config file, your config file becomes obsolete. There are a few ways to handle this:
Use the --resume flag of the caddy run command to use the last active config.
Don’t mix the use of config files with changes via the API; have one source of truth.
Export Caddy’s new configuration with a subsequent GET request (less recommended than the first two options).
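The first and third of those options might look roughly like this - a sketch, reusing the container setup from above:
# start Caddy from the last active (autosaved) configuration
caddy run --resume
# export the currently running configuration via the Admin API
curl -s http://localhost:2019/config/ > /etc/caddy/config.json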
For my money the Admin API is just not a problem worth having.
Disabling the Admin API
The Admin API functionality can be disabled via a global option in the Caddyfile as well as via JSON configuration.
Caddyfile:
{
admin off
}
localhost
root * /usr/share/caddy/
file_server
JSON:
{
"admin": {
"disabled": true
},
"apps": {
"http": {
...
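With admin off, repeating the earlier checks inside the container should show that Caddy is no longer listening on 127.0.0.1:2019 and that admin requests fail:
# inside the caddy-test container
lsof -Pni
curl -D - -X POST http://localhost:2019/stop
# expect a connection failure rather than an HTTP response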
Custom 404 error pages via Caddyfile configuration
There are two options presented in the documentation for configuring a customised 404 page response from a file via Caddyfile configuration.
Option 1
handle_errors {
rewrite * /{http.error.status_code}.html
file_server
}
This option depends on a file existing named after the status code, e.g. /404.html.
What happens if a file does not exist for a particular status code? Let's find out.
Create a Caddyfile that uses HTTP Basic authentication. We can then trigger a 401 status by providing incorrect (or no) authentication credentials.
# https://Bob:hiccup@localhost
cat >Caddyfile <<EOL
localhost
root * /usr/share/caddy/
file_server
basicauth /* {
Bob JDJhJDEwJEVCNmdaNEg2Ti5iejRMYkF3MFZhZ3VtV3E1SzBWZEZ5Q3VWc0tzOEJwZE9TaFlZdEVkZDhX
}
handle_errors {
rewrite * /{http.error.status_code}.html
file_server
}
EOL
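The long value for Bob is a base64-encoded bcrypt hash of the password hiccup (matching the URL in the comment above). A value of this form can be generated with Caddy's hash-password command - a sketch:
docker run --rm caddy:2.0.0-alpine \
caddy hash-password --plaintext hiccup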
Create a 404 page:
echo "my 404 page" > 404.html
Run Caddy:
docker run --rm -p 80:80 -p 443:443 \
-v $PWD/Caddyfile:/etc/caddy/Caddyfile \
-v $PWD/index.html:/usr/share/caddy/index.html \
-v $PWD/404.html:/usr/share/caddy/404.html \
caddy:2.0.0-alpine
Make a request for a 404 URL with authentication:
curl -k -L -D - https://Bob:hiccup@localhost/sdsd
HTTP/2 404
content-type: text/html; charset=utf-8
etag: "qbq8dvc"
server: Caddy
content-length: 12
date: ...
my 404 page
A 404 status code and the custom 404 page. Excellent.
Make a request for a 404 URL without authentication:
curl -k -L -D - https://localhost/sdsd
HTTP/2 200
server: Caddy
www-authenticate: Basic realm="restricted"
content-length: 0
date: ...
A 200 status code? Yes - the handle_errors directive has a defined behaviour whereby any error that is not matched by a route (matcher => handler) - i.e. not handled - within handle_errors will by default end up returning a 200 status code. This was confirmed by the author of Caddy as “working as expected”. I would suggest the default behaviour should be to return the original error status code for any error the handle_errors block does not handle. We can add a rough equivalent using a handle block with a respond directive.
handle_errors {
rewrite * /{http.error.status_code}.html
file_server
handle {
respond "{http.error.status_code} {http.error.status_text}" {http.error.status_code}
}
}
We now get a 401 Unauthorized response as expected.
curl -k -L -D - https://localhost/sdsd
HTTP/2 401
server: Caddy
www-authenticate: Basic realm="restricted"
content-length: 16
date: ...
401 Unauthorized
Option 2
handle_errors {
rewrite * /error.html
templates
file_server
}
This option rewrites all error responses to a single file error.html. The templates directive provides an opportunity to leverage Caddy’s templating capabilities.
cat >Caddyfile <<EOL
localhost
root * /usr/share/caddy/
file_server
handle_errors {
rewrite * /error.html
templates
file_server
}
EOL
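To try this out, an error.html needs to exist on disk and be mounted into the container alongside the other files - a minimal placeholder page:
# a plain error page; mount it into /usr/share/caddy/ like index.html
echo "something went wrong" > error.html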
We would now have a Go text/template based template (error.html) to maintain, using an experimental feature of Caddy that can result in 200 status responses if there is a problem rendering the error template. Not ideal, and access to the HTTP status code from within the template context is not entirely forthcoming.
⚠️ Template functions/actions are still experimental, so they are subject to change.
Custom 404 error pages via JSON configuration
The JSON configuration below reproduces the custom 404 page behaviour:
- The root file path variable is set using the vars middleware, and is consumed by the file_server handler later.
- The status_code variable is set using the vars middleware.
- The status_code variable is used by a match block to select a rewrite handler that specifies the custom 404 page file and a file_server handler to serve the file.
"errors": {
"routes": [
{
"handle": [
{
"handler": "vars",
"root": "/usr/share/caddy/"
},
{
"handler": "vars",
"status_code": "{http.error.status_code}"
}
]
},
{
"match": [
{
"vars": {
"status_code": "404"
}
}
],
"handle": [
{
"handler": "rewrite",
"uri": "/404.html"
},
{
"handler": "file_server",
"hide": []
}
],
"terminal": true
}
]
}
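For context, this errors block sits alongside the routes key of the server it applies to in the full JSON configuration - a trimmed sketch based on the srv0 server from earlier:
...
"servers": {
  "srv0": {
    "listen": [
      ":443"
    ],
    "routes": [
      ...
    ],
    "errors": {
      "routes": [
        ...
      ]
    }
  }
}
...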