Yak shaving our SSO

Yes, that is a thing!

Well folks, the time had come to make a decent job of supporting our family on our homelab, and move away from big G. Having already sorted out basic observability with replicated logging and monitoring, plus cross-site backup, it was time to put consistent user services in place.

Other articles will (probably) explain our setup for redundant email, calendar & contacts. We have alternate solutions for photos and music - Phil went minimal with a NAS share to each player, plus home-brew face recognition. Stu was later to the party and went for immich & jellyfin respectively (after using miniDLNA for several years). Stu also runs a home automation package called Domoticz which is quite widely used (just not as loudly trumpeted as Home Assistant!)

Of course, each service needs user authentication. To reduce friction with the family, we would really like to offer common credentials across all services, with single sign-on where relevant. Plus a password change UI. Hmm.

How hard can it be?

We started by just chucking the relevant lines from /etc/passwd & /etc/shadow into git, as we keep a shared repo in sync nightly between our two servers. This is OK for manual synchronisation, but leaves a potentially big time gap in password sync if a user changes their credentials and neither of us is around to cut/paste the files! Also, if we fat-finger this operation we are in trouble… Finally, although our email software (courier-imap, exim4) can use OS authentication, it seems other, more 'cloudy' packages can't access this so easily.
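For flavour, the extraction amounts to something like this (a sketch, assuming the usual Debian convention that human accounts start at UID 1000):

    # Pull the non-system users out of the OS files for the shared repo.
    # Assumes human accounts occupy UIDs 1000..65533 (Debian convention).
    awk -F: '$3 >= 1000 && $3 < 65534' /etc/passwd > users.passwd
    # ...and the matching shadow lines, hashes included.
    while IFS=: read -r user _; do
        grep "^${user}:" /etc/shadow
    done < users.passwd > users.shadow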

So - time to put something more robust in place that we can leverage for all services…

Start from the OS level

PAM provides the Linux/Unix OS API for authentication, and there is good support for both LDAP (Lightweight Directory Access Protocol) and NIS (Network Information Service) back ends to provide centralised network data stores. (NIS was where I started back in about 1992 on HP-UX…) However, NIS is deprecated these days as unreliable and insecure, so… hello LDAP!
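For orientation, the client-side wiring ends up looking something like this (a sketch of the Debian setup using the libnss-ldapd/libpam-ldapd packages; the hostname and base DN here are hypothetical):

    # /etc/nsswitch.conf (fragment): consult local files first, then LDAP via nslcd
    passwd:         files ldap
    group:          files ldap
    shadow:         files ldap

    # /etc/nslcd.conf (fragment): where nslcd finds the directory
    uri  ldap://ldap.example.home/
    base dc=example,dc=home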

OpenLDAP FTW

OpenLDAP has been around for decades, and is a single binary (slapd) backed by a file store. So this is where we start our yak shaving… getting the package installed is trivial on both Debian and Ubuntu. Documentation is… interesting! There has been quite an effort to update slapd recently, moving to an entirely new configuration structure stored in the database itself, and a new integration with PAM using an external daemon (nslcd). Hmm. You can find more on the OpenLDAP roadmap here and the PAM+LDAP design here.

It took some time to find the right steps to get a base configuration that we could ingest data into from the shadow file, since we needed a migration and we don't have clear-text password credentials for our users (rightly so!). Once the migration scripts were written/executed, and the source data filed away in git, we were able to remove the users from the /etc files and hey presto! We could still log in! PAM+LDAP provides a neat separation of system accounts (generally those below 1000, created by package installers and the like directly in the OS files) from our new, portable user accounts.
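To give a feel for it, a migrated entry ends up looking roughly like this (names and DNs hypothetical; the userPassword value is the hash lifted straight from /etc/shadow, tagged {CRYPT} so slapd knows it is a libcrypt-format hash):

    dn: uid=alice,ou=people,dc=example,dc=home
    objectClass: inetOrgPerson
    objectClass: posixAccount
    objectClass: shadowAccount
    uid: alice
    cn: Alice Example
    sn: Example
    uidNumber: 1001
    gidNumber: 1001
    homeDirectory: /home/alice
    loginShell: /bin/bash
    userPassword: {CRYPT}$6$wxyz12$...hash-from-shadow...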

Whilst getting set up, we decided to create an administrator user with permission to edit records. This avoids having to know or share knowledge of the rootDN password (which gives unlimited access to the data) with any client applications. We can then replicate the admin user's credentials. We also installed SSL/TLS certificates to provide secure access to clients that need it, using the system default 'snakeoil' keys :)
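The ACL part looks something like this (a sketch only - our suffix, admin DN and database index are hypothetical; applied with ldapmodify -Y EXTERNAL -H ldapi:///):

    # Grant the dedicated admin entry write access; keep password hashes
    # unreadable by everyone else.
    dn: olcDatabase={1}mdb,cn=config
    changetype: modify
    add: olcAccess
    olcAccess: to attrs=userPassword
      by dn="cn=ldapadmin,dc=example,dc=home" write
      by self write
      by anonymous auth
      by * none
    -
    add: olcAccess
    olcAccess: to *
      by dn="cn=ldapadmin,dc=example,dc=home" write
      by * read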

  • NB: We had already done the donkey work of syncing user and group IDs across our two servers - we were lucky to have very few clashes! Retrofitting common user credentials to disparate systems is always going to be messy… :)

Can I change my password?

We then needed to check what happened when a user executes the 'passwd' utility… it works! Well, yes and no. The LDAP-stored credentials did update, but they also became hashed with a different algorithm - a very old one from the distant past which is NOT secure :( It turns out there is basically a fight over which component creates the hash: the LDAP client software, or the LDAP server itself. There is a 'new' method defined in an LDAP extension, RFC 3062, that allows the LDAP server to create the hash according to a 'policy' set in the config. It turns out this policy is just two values (hash algorithm=CRYPT, which directs slapd to the OS libcrypt() library, and salt format="$6$%.6s", which tells libcrypt to select SHA512 and a 6-character salt), but they are stored in different database sections - sigh. The good news is that nslcd, part of the latest PAM+LDAP integration package (as of Q1 2026), already supports this method and in fact does NOT support any other!
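In LDIF terms, the two settings end up in different places, something like this (a sketch of our reading of the config layout; applied with ldapmodify -Y EXTERNAL -H ldapi:///):

    # On the frontend database: hash RFC 3062 exop passwords via libcrypt()
    dn: olcDatabase={-1}frontend,cn=config
    changetype: modify
    add: olcPasswordHash
    olcPasswordHash: {CRYPT}

    # In the global config: tell libcrypt() to pick SHA512 with a 6-char salt
    dn: cn=config
    changetype: modify
    add: olcPasswordCryptSaltFormat
    olcPasswordCryptSaltFormat: $6$%.6s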

With this setup, we get reasonably secure crypto hashing of passwords. BUT the shadow file was using a “$y$…” salt format. This is a reference to yescrypt, which is a very modern, secure algorithm, so why did we not set this up? Unfortunately, yescrypt is not yet supported by some client tools. These tools do not understand what is in the password field and fail to display, check or update the password correctly. Notable among them is phpLDAPadmin, possibly the most widely used FOSS LDAP management tool out there; boo! So SHA512 CRYPT with a decent salt it is… We have checked that such hashes can be exported back into a shadow password file and still work (as they should, since this was the default for Debian systems before yescrypt was introduced), which satisfies one of our requirements - to be able to back out a user credential to the file-based PAM system at any time, just in case (!)

Can I have a self-service web UI with that?

OK, so the user can change their password from the shell prompt. But let's be honest, how many will really do that? We need a web UI for this. So began many hours of hunting around for LDAP client UI tools that supported this need… and we finally stumbled onto the LDAP Tool Box project. This is a fantastic resource that happens to include exactly what we need - a basic web UI for user password change. Even better, they make this available as either a Debian package (from their own repository) or a container image from docker.io. Nice :) Since we are at the 'throwing it at the wall and seeing if it sticks' stage, we installed the container image. Now, had we done this after the discovery of the password hashing conundrum, it would have saved a lot of time! As it was, the tool pretty much immediately worked, but left some weird password hash that was nothing like the shadow file syntax and would not work there. Here began the journey of discovery about the aforementioned RFC 3062 and allowing the server to do the hashing. The default settings in the client are to create the hash itself (using whatever libcrypt version and configuration is in the container image, not the OS one) and upload the string directly to the LDAP field. Oops. Fortunately, the tool has a config value that lets it use RFC 3062 instead. Win!
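For the record, the container deployment is about this simple (a sketch: the image name is as published on docker.io, but the config mount point may differ between image versions, so check the project's docs):

    # Run the self-service password UI; local config overlaid read-only.
    docker run -d --name ssp \
        -p 8765:80 \
        -v /srv/ssp/config.inc.local.php:/var/www/conf/config.inc.local.php:ro \
        ltbproject/self-service-password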

Can I have an admin UI tool with that?

In addition to finding this great set of user tools, we came across a number of other LDAP 'admin' tools, and we thought it would be prudent to have one of these handy, as cryptic command-line tools are an extra cognitive load during any debugging/emergency situation later… We can recommend phpLDAPadmin; however, be aware of its limitation with yescrypt support. Otherwise it seems excellent, and is again available as a Debian package or container image. We have the container version running right now…

Can we synchronise two servers?

Now, all this is very nice on one server, but what happens if it goes offline? How do our users log in on the other server to continue reading their email if the LDAP service is down? This is where we need replication, so we can run an LDAP server local to each site and keep them in sync as the data is updated.

Let's use OpenLDAP sync…

Those of you more familiar with OpenLDAP will perhaps know it already comes with replication tools… and indeed supports N-way synchronisation between two or more servers. Great! We tried to set this up; I'm sure it is possible somehow, but we really struggled with the complexity of it. First, it won't work without SSL/TLS connections. Then it needs some Access Control List setup to allow servers to talk to each other. Then it needs a cron / systemd timer job to invoke the sync, or separate processes monitoring the DB for changes; then there is a queue of transactions to manage in a file; then… wow. All with cryptic command-line tools. The list went on. Well done if you have set this up successfully on your site - we can see that for thousands of accounts and tens of servers it may prove worth the effort to understand and maintain. We didn't bother; we felt this was too complex to manage and remember what to do in an emergency. It is also very opaque to monitor.

No :(

Let's use Syncthing instead…

We already have reliable, cross-site file replication working nicely thanks to syncthing: a tool available in most distributions with a simple-to-manage replication setup and a decent web UI. Can we leverage this for LDAP sync? Why, yes of course! If one takes a close look at the built-in synchronisation, really all it is doing is exporting records when they change, queueing them for delivery to the other sites and importing them when they arrive. There is a lot of extra complexity in trying to export the minimal amount of changes and account for network failures, slower delivery, etc. There is also a big warning about 'split brain' situations, where changes can occur in multiple servers to the same records at the same time (e.g. during a network outage), and resolving this later is a nightmare. Our situation is much simpler; we only expect the password field to change during self-service requests. This is the only thing we need to sync up. Any other change is manual administrator work and we can replicate it ourselves by whatever means we choose. So with this in mind, we can approach this in a very crude way (a sketch of the two scripts follows the list):

  • run strictly in master/slave mode to avoid split brain. Only change passwords on the master (by only running the self-service web UI tool there).
  • run a sync on a timer at the master - we chose 10 minutes as a reasonable time to replicate a password change. If the system fails in this window, the users will already know about it and can repeat the change action themselves if they wish on the recovered system.
  • export all the password hash fields at once - there are a few tens of user accounts. No point being smart here, just dump them all out to temporary files.
  • identify each file by the user's DN from LDAP. This is guaranteed to be unique as it is also the database key.
  • check if each temporary file is different from the one currently being replicated - if yes, update the replicated copy. Then delete all the temporaries.
  • let syncthing resolve the replication across sites with version monitoring, queuing, recovery, etc.
  • run a monitor script at the receiver that checks each replicated file timestamp every few seconds. The number of files is tiny, so again no point being smart. Just check all timestamps.
  • import the password hash field if a file updates.
  • to bootstrap the system, simply run the dump to make an initial dataset, then start the updater, which will read the initial timestamps to monitor.
  • to restart either end, we just need a replicated file set locally to compare against. This should always be there once bootstrapped.
  • to force an update, simply 'touch' a file in the replicated set and it will be updated in the receiver.
  • if a user is added, their file will appear in the replicated set when dumped. New files will be imported at the receiver then monitored.
  • if a user is removed, their replicated file can simply be deleted and the receiver script restarted to remove the in-memory state.
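Here is a sketch of the two halves (base DN, bind credentials and paths are all hypothetical, and error handling is omitted; our real scripts differ in detail):

    #!/bin/sh
    # Master side, on a 10-minute timer: write one file per DN containing
    # an ldapmodify snippet for the password hash; copy into the syncthing
    # folder only when changed, so timestamps only move on real updates.
    SYNC=/srv/ldap-sync
    TMP=$(mktemp -d)
    ldapsearch -x -D cn=admin,dc=example,dc=home -y /etc/ldap.secret \
        -b ou=people,dc=example,dc=home '(objectClass=posixAccount)' \
        userPassword -o ldif-wrap=no |
    awk '/^dn: /{dn=$2} /^userPassword:/{print dn, $1, $2}' |
    while read -r dn attr hash; do
        printf 'dn: %s\nchangetype: modify\nreplace: userPassword\n%s %s\n' \
            "$dn" "$attr" "$hash" > "$TMP/$dn"
        cmp -s "$TMP/$dn" "$SYNC/$dn" || cp "$TMP/$dn" "$SYNC/$dn"
    done
    rm -rf "$TMP"

    #!/bin/sh
    # Slave side, long-running: re-import any replicated file whose
    # timestamp is newer than our record of the last import.
    SYNC=/srv/ldap-sync
    STATE=/var/lib/ldap-sync-state
    mkdir -p "$STATE"
    while sleep 5; do
        for f in "$SYNC"/*; do
            mark="$STATE/$(basename "$f")"
            [ "$f" -nt "$mark" ] || continue
            ldapmodify -x -D cn=admin,dc=example,dc=home -y /etc/ldap.secret \
                -f "$f" && touch -r "$f" "$mark"
        done
    done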

Note that we do not have to care about the direction of sync with syncthing - it is N-way capable already. We simply either run the sync dump or the update scripts as required for master or slave status of the site. Having the replicated copies locally on the machine provides the startup state we need. This solution does depend on the system clock NOT going backwards between sync runs… but then lots of stuff will likely also break if the system clock is out by that much!

Yes!

Can we do SSO now?

No. LDAP provides a single reference source of authentication credentials and other user metadata to authentication tools like PAM. It does not provide a transferable 'token' that can be used to log in to multiple services. That needs another technology layer…

eMail - LDAP

Some applications, such as a plain email account, do not really support the idea of SSO. The access protocols here are IMAP and SMTP, both of which require a username/password credential in clear text. There is no 'token' style login, so LDAP has provided us a single point of control for the password, but not single sign-on.

Our chosen email IMAP solution uses courier-imap, which operates using the OS account login and home directory to locate the user's Maildir folder. Similarly, our chosen email SMTP solution uses exim4, which also authenticates users against their OS account via PAM and uses their home directory to locate any .forward files and to deliver email to their Maildir.

This is the best we can do with eMail protocol-level access that supports dedicated eMail client software like Thunderbird, Evolution, etc. We might be able to do better with a webmail client… maybe. This is also why GMail desktop services are web based!

jellyfin part 1 - LDAP

jellyfin comes with an LDAP plugin. Yaay! After some poking about, we made this work and could provide authentication externally to the jellyfin user database, so that was nice. But it's not SSO, only single password change. And we still need to create accounts in jellyfin first, then set their authentication check to LDAP…

immich part 1 - No…

immich comes with many plugins - none of which do LDAP authentication :(

Domoticz part 1 - No…

Domoticz does not come with any authentication plugins. It also does not use OS level credential checking :(

OK so how do we do SSO?

Single what now?

Single sign-on requires a token that can be trusted and passed around to different services as an indicator that the user has 'logged on' to a realm and can therefore be allowed access to all services that are also registered with the same realm. One technology layer that does this is Open Authorization, specifically version 2.0. OAuth2.0 supports a range of use cases and deliberately does not specify how client software can register with a realm nor exchange keys with it. There is a layer above OAuth2.0 called OpenID Connect (OIDC) that provides usage profiles of OAuth2.0 and extends the specification so client software can interact with realm software and discover how to manage the user and how to check their token.
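The 'discover' part is pleasingly concrete: every OIDC realm publishes a well-known JSON document listing its endpoints and keys. A quick sanity check against a realm looks like this (hostname and realm name hypothetical; URL layout as used by current Keycloak):

    # Fetch the OIDC discovery document: authorization, token, userinfo
    # and logout endpoints, plus the URI for the token signing keys.
    curl -s https://sso.example.home/realms/home/.well-known/openid-configuration | jq .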

Clearly the client software is our service application, like jellyfin, immich and Domoticz.

So, what is this magic realm software? You may find it under a few different names:

  • OpenID Provider (OP)
  • Identity Provider (IdP)
  • Identity and Access Management (IAM) <– you've probably seen this one from AWS :)
  • Authorization server <– this is the OAuth terminology
  • OIDC server
  • OAuth server

…blah blah blah. There seems to be a lot of twaddle and marketing-speak around this stuff, as it's all big corporates and big money doing security!

We have evaluated a few software packages that can provide this service, and we have (so far) picked Keycloak as the most reasonable-looking balance: it works without huge bootstrap pain, is not too opaque to understand or too badly documented to configure in a few hours, and is not too huge and cumbersome for our little servers (nor do we need all the things the big ones do). It must also be able to reference/import user credentials from LDAP and provide full OIDC, not just OAuth2.0.

We looked at:

  • TinyAuth - this is not an IAM and does not do OIDC. It might fit your need for login controls for a basic web site.
  • Authelia - this might be an IAM. When we had spent the best part of 3 hrs trying to get it even to start up, we moved on. Shame, it looked quite efficient!
  • Authentik - this is a whopper! It worked, but we struggled to get it to load users from LDAP. It sucked up resources. It's also quite commercial in $$$.
  • Keycloak - this is the best balance, we think. I’d expected a JVM in a container to be slow as h*ll, but it’s really not! Also the memory footprint is OK. UI is clean! (A minimal dev-mode invocation is sketched below.)
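A sketch of that minimal invocation (image path as published on quay.io; the admin bootstrap variable names have changed across Keycloak releases, so check the docs for your version):

    # Keycloak in dev mode - evaluation only (no TLS, embedded DB).
    docker run -d --name keycloak -p 8080:8080 \
        -e KC_BOOTSTRAP_ADMIN_USERNAME=admin \
        -e KC_BOOTSTRAP_ADMIN_PASSWORD=change-me \
        quay.io/keycloak/keycloak:latest start-dev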

immich part 2 - OIDC

So… onwards with Keycloak, in a container deployment again. There are guide websites for setting up Authentik with immich; after a bit more yak shaving we got Keycloak to connect up and do the token thing by following the Authentik guide and adjusting for Keycloak's terminology/UI shifts. Yaay! We also got it to import LDAP users after poking it in the right place (apparently this is 'user federation' these days…) and hooray! It can issue tokens from the LDAP credentials!

So we are done, right? For immich - pretty much yes!

Just a tweak to configure immich to allow new users automatically, and that's it. Once the users exist in immich you can share content etc. as normal. Logout is clean and directs the user back to the Keycloak logout page. Great!

jellyfin part 2 - OIDC

More yak shaving. Jellyfin has a plugin that supports OIDC/OAuth2.0. It is 'alpha' by declaration of the owner/developer. It is, however, used in every example I have seen of how to enable OIDC for jellyfin! I guess it is the only one out there… Surprisingly, it does work. The config UI is awful and cannot even show you the previous values you have entered, so you have to retype ALL the data every time you tweak something. Nice.

BUT… it works! Great! You have to make your own HTML/CSS login button by abusing the 'branding' area of the login screen :) Logout does not work; it returns to the jellyfin login screen, but you can press 'login with keycloak' and it goes straight back in. No realm logout is invoked. So I made a new 'logout from keycloak' button on the login screen by abusing the branding HTML area a bit more… It's working OK. It creates new users automatically. Can I remember the options I chose when I entered the form data? NO :( Hope I wrote it down… (I did.)
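For the curious, the branding abuse amounts to something like this (a sketch: the plugin's start-endpoint path varies by plugin version, and the realm URL is hypothetical - substitute your own):

    <!-- Pasted into Jellyfin's branding area on the login screen -->
    <a href="/sso/OID/start/keycloak">Login with Keycloak</a>
    <a href="https://sso.example.home/realms/home/protocol/openid-connect/logout">
        Logout from Keycloak
    </a>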

Domoticz part 2 - OIDC Error…

Oh boy… (!) Domoticz has a funny 'take' on OIDC. Their wiki says they support it, and they describe how to configure a client app to access the server, but they kinda don't say that it only works if you use the Domoticz server as the IAM as well. They solved a different problem - how to enable Android and iOS phone apps to do OIDC authentication so they do not store passwords out there in the big, wide world. Seems a good starting place for using tokens, and credit for going the OIDC route. Shame it is unusable with external IAMs (yet), because it makes several assumptions about what must be included in the token…

We tried, for many hours, to understand the code paths in Domoticz. We can see it has a parser for JWT tokens, which we know Keycloak issues; this follows the OIDC standard. We can also see it checks the token signature against a key stored in the Domoticz database, which is good! We managed to get the key installed from Keycloak and it worked! Yaay! But now the token is rejected, because Domoticz uses the JWT claim fields (OAuth has weird names for stuff) for unusual purposes. Viz (an annotated token sketch follows the list):

  • aud string must contain one value and this must be the client_id which we can store in the Domoticz DB (as a username!) This is OK! We can do this!
  • signature must be PS256 algorithm - we can do this too, and we can import the public key into the client_id user record in the Domoticz DB. OK!
  • issuer URL must be “https://$hostname:$port/” of the Domoticz server. Can't use external IAM then? OOPS :(
  • subject string must be a valid username stored in the Domoticz DB. Keycloak passes UUID values, not under our control. We can't do this. OOPS :(
  • key_id string must be an ASCII number that matches the username record in the Domoticz DB. This is NOT how key_id works in OIDC! OOPS :(
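To make that concrete, here is an annotated sketch (not literal JSON - comments added, values hypothetical) of a Keycloak-issued token set against those expectations:

    // JWT header
    { "alg": "PS256", "kid": "x7RT9wQ..." }  // alg OK; but Domoticz wants kid to be an ASCII number
    // JWT payload
    {
      "iss": "https://sso.example.home/realms/home", // Domoticz wants its own https://host:port/
      "aud": "domoticz-client",                      // OK: single value matching the client_id
      "sub": "9f1c2e4a-5b6d-4e7f-8a9b-0c1d2e3f4a5b", // a UUID; Domoticz wants a DB username
      "exp": 1767225600
    }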

For these reasons, OIDC will currently not work with Domoticz and an external IAM. I have no doubt their own IAM can make a JWT token like they expected, but this is no use to man or beast, as Domoticz cannot import users from LDAP and we cannot shove in the password hashes ourselves, because Domoticz uses MD5 hashing, which we don't. Plus we want SSO, not just password changes.

No SSO for you :(

Domoticz part 3 - OAuth Proxy FTW

We do not give up easily. Domoticz supports other means of providing external credentials; notably, it supports Basic Authorization (and ancient X.509 Authorization) headers. Nice! These accept a username and clear-text password that must match the Domoticz DB user data. Awesome!

So… is there a way to provide a web front-end that can handle OIDC SSO, generate an Authorization header from it, then proxy requests from our browser? You bet there is! Say hello to OAuth2 Proxy - the weapon of choice for adding OIDC capabilities to legacy apps!

This little beauty is available as a pre-built Go binary or container (you know which I chose by now…) and acts as an OIDC client to log in to the realm and request the magic JWT token. Once done, it passes the token back to the web browser in a cookie and becomes an HTTP proxy. Now every time a request flows through, it checks the token is valid, does a renewal with the IAM if required, and passes the request on to the upstream web server you have protected. Nice. If the token expires or is not renewed, boom! 401 unauthorised for you! Job done, right? Yes and no…

Remember we need a Basic Authorization header. This needs a username and password that match Domoticz. Well, it turns out that oauth2-proxy can generate just such a header for us. Great! It also turns out it uses the JWT subject claim field for the username. Not great. This is that pesky UUID we saw before. However, there is also an option to prefer the email string as the username - now we're talking :) We can configure users in Domoticz with email addresses for usernames. Done, right? Not quite - we still need a password. Now, given we have done a real user login in oauth2-proxy to get the JWT token, we can trust the username is legit. The password serves no purpose anymore, as we have already authenticated this username. So we can make them ALL THE SAME and use a static string here just to get by the Domoticz password check logic. OK? Turns out oauth2-proxy has a configuration option to do exactly that already! I love this little guy :)

So that is what we do now: create new users in Domoticz with email usernames and the same fixed-string password, configure that password in oauth2-proxy, and send it in the Authorization header. Bingo! It WORKS! Hooray!
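A sketch of the relevant oauth2-proxy settings in its config-file syntax (URLs, secrets and the upstream port are hypothetical):

    provider = "oidc"
    oidc_issuer_url = "https://sso.example.home/realms/home"
    client_id = "domoticz-client"
    client_secret = "***"
    cookie_secret = "***"                # must be a random 32-byte value
    email_domains = ["*"]
    upstreams = ["http://127.0.0.1:8080/"]   # Domoticz

    # The three options doing the trick described above:
    set_basic_auth = true                # emit a Basic Authorization header upstream
    prefer_email_to_user = true          # use the email claim, not the UUID subject
    basic_auth_password = "our-fixed-string"  # the shared static password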

Domoticz part 4 - Logout how?

So login works fine, but we press Logout in the Domoticz UI and, as if by magic, we go straight back to the Dashboard page! This is to be expected; the Domoticz service has no idea we are connecting via oauth2-proxy, and thus we remain logged in, still sending the Authorization header. How do we log out of this thing?

The guide has some instructions for us: first we call the oauth2-proxy endpoint to remove the session, and this can redirect the browser to the Keycloak logout endpoint for us. Nice! BUT… the Domoticz web UI is all JavaScript and API calls, not web pages at all! What to do?

More yak shaving followed - we tried blocking just the API calls; this results in a screen saying 'Domoticz is Offline', which is hilarious, but useless. We tried logging out of Keycloak, but the proxy just carries on working until the JWT expires and won't refresh. Then we noticed Domoticz has a Custom HTML template facility we can abuse… hehe. A bit like the branding HTML area abuse in jellyfin, we can add a URL button to the Custom menu in Domoticz that calls the required logout URL, and we're done! Hooray!
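The Custom page contents are as dumb as it sounds - roughly this (a sketch; oauth2-proxy's sign-out endpoint takes an 'rd' redirect parameter, and the Keycloak logout URL here is hypothetical):

    <!-- Dropped into a Domoticz Custom HTML page: clear the proxy
         session, then bounce the browser to the realm logout. -->
    <a href="/oauth2/sign_out?rd=https%3A%2F%2Fsso.example.home%2Frealms%2Fhome%2Fprotocol%2Fopenid-connect%2Flogout">
        Logout
    </a>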

Domoticz part 5 - restore the Nginx SSL Proxy

You thought we were done here :)

Some things are left to tidy up. Unlike all our other services, which reside behind our Wireguard VPN and do NOT use SSL (why bother for LAN only?), Domoticz needs to be exposed to the public Internet. This is so we can get to it without fussing about with VPNs… and so our phone apps work.

Prior to implementing SSO, we exposed the Domoticz port via an Nginx proxy that terminated the SSL connection and forwarded requests. This we continue to do for the legacy phone apps that cannot handle SSO. For interest and amusement, we would like to expose the web UI as a public endpoint too. We may consider adding other services to the public roster later…

So - oauth2-proxy comes with some guidance on how to set up nginx to do the donkey work of the proxying, checking with the oauth2-proxy endpoint whether each request should be passed on or not. This is done using the auth_request module in nginx, which makes a sub-request using the cookie from the browser and gets either a 2xx or 401 status code in reply. We need to configure oauth2-proxy to return our Authorization header in the auth check response, and we include this in the proxy request to Domoticz as before. We also need to adjust the redirect URL to return the browser to the public address, not the internal one, when logging in. This means adding the URL to Keycloak and to the whitelist in oauth2-proxy as well. Seems to work well…

BUT we can make it more efficient now we have a better proxy in nginx! If we review the Domoticz code paths, we see that only the API endpoint is checked for authorization. So we can replicate this in nginx with two location blocks: one for “/json”, which catches the API calls, and one for “/”, which passes all others. This is great, BUT - now we get no redirect to login when we open the home page! So we need a third location block which catches ONLY the home page and redirects to the login endpoint for us. Did you know you can have “location = /” in nginx, which matches exactly the / request and nothing else? Cool!
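The whole arrangement ends up shaped roughly like this (a sketch with hypothetical ports - oauth2-proxy on 4180, Domoticz on 8080 - with TLS and the usual proxy-header boilerplate omitted):

    # Hand the proxy's own endpoints (sign_in, callback, sign_out) to oauth2-proxy
    location /oauth2/ {
        proxy_pass http://127.0.0.1:4180;
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
    }

    # Sub-request target for auth_request: no body, just the session cookie
    location = /oauth2/auth {
        proxy_pass http://127.0.0.1:4180;
        proxy_set_header Host $host;
        proxy_set_header Content-Length "";
        proxy_pass_request_body off;
    }

    # API calls: check the session, inject the Basic Authorization header
    location /json {
        auth_request /oauth2/auth;
        error_page 401 = /oauth2/sign_in;
        auth_request_set $auth_header $upstream_http_authorization;
        proxy_set_header Authorization $auth_header;
        proxy_pass http://127.0.0.1:8080;
    }

    # Exactly the home page, nothing else: force the login redirect here
    location = / {
        auth_request /oauth2/auth;
        error_page 401 = /oauth2/sign_in;
        proxy_pass http://127.0.0.1:8080;
    }

    # Everything else (static JS/CSS/images) passes straight through
    location / {
        proxy_pass http://127.0.0.1:8080;
    }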

…and that's how you shave the SSO yak.