DS replication problems

My lack of experience with LDAP in general, and Sun's (iPlanet|Directory Server( Enterprise Edition)?) in particular, has proven to be a bit of a handicap of late.

Case in point: when I upgraded $big_server to Solaris 10 at the end of August, I also upgraded its LDAP server from iPlanet 5.1 to DSEE 6 (same software, different name). At the time I had two problems: I could not get replication to $big_server working over SSL (we have a multi-master configuration; it's not supposed to work with 5.1, but it does/did for us), and replication from $big_server to other machines did not work. There were a lot of things going wrong at that point, so I set up replication in the clear from $little_server, another LDAP server on the LAN, and left it 'til I had more time. It wasn't ideal, but it would do.

I've spent the last two Saturdays trying to figure out why replication wasn't working. I concentrated on getting replication to $big_server working over SSL. This was tough, because the logs didn't tell me much:

Server failed to flush BER data back to client

I swear, this turned up more Googlejuice today than it did a few weeks ago, because this time it turned up the ever-excellent Brandon Hutchinson again. This time he had a truly great set of instructions on installing DSEE6. That led me to this blog entry, which was very helpful: it explains the different sorts of databases you can stick your SSL certs into. (Must learn more about SSL/OpenSSL…)

However, in the end it turned out to be a simple and moderately embarrassing mistake: with DS6, it's not enough to say dsadm add-cert and be done with it; you actually have to tell the server which certificate to use. As Brandon points out, you have to edit =dse.ldif= in order to do so (though to get it to work I had to stop the server, edit the file and start it up again, rather than just edit and restart).
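For the record, here's a rough sketch of how one might check which certificate a =dse.ldif= names before starting the server back up. It rests on assumptions that aren't from anything above: that the setting in question is the nsSSLPersonalitySSL attribute on the cn=RSA,cn=encryption,cn=config entry (which is how I understand Sun's directory servers to store it), and that the file is plain, non-base64 LDIF. The sample text, the nickname "my-server-cert" and the function names are all made up for illustration.

```python
# Hypothetical sample of the relevant part of a dse.ldif; the nickname
# "my-server-cert" is invented for this example.
SAMPLE_DSE = """\
dn: cn=encryption,cn=config
objectClass: top
nsSSL3: on

dn: cn=RSA,cn=encryption,cn=config
objectClass: top
nsSSLPersonalitySSL: my-server-cert
nsSSLToken: internal (software)
nsSSLActivation: on
"""


def unfold(ldif_text):
    """Join LDIF continuation lines (a line starting with one space
    continues the previous line)."""
    lines = []
    for raw in ldif_text.splitlines():
        if raw.startswith(" ") and lines:
            lines[-1] += raw[1:]
        else:
            lines.append(raw)
    return lines


def find_ssl_cert_nickname(ldif_text,
                           entry_dn="cn=RSA,cn=encryption,cn=config",
                           attr="nsSSLPersonalitySSL"):
    """Return the value of `attr` in the entry `entry_dn`, or None.

    Assumption: attribute and entry names are as described in the text
    above; base64-encoded values ("attr:: ...") are not decoded.
    """
    wanted_dn = entry_dn.lower().replace(" ", "")
    current_dn = None
    for line in unfold(ldif_text):
        if not line.strip():
            current_dn = None          # blank line ends the current entry
            continue
        if line.startswith("#"):
            continue                   # LDIF comment
        name, sep, value = line.partition(":")
        if not sep:
            continue
        name = name.strip().lower()
        value = value.strip()
        if name == "dn":
            current_dn = value.lower().replace(" ", "")
        elif current_dn == wanted_dn and name == attr.lower():
            return value
    return None


if __name__ == "__main__":
    print(find_ssl_cert_nickname(SAMPLE_DSE))
```

Against a real server you'd read the instance's own =dse.ldif= rather than the sample string, and compare the result with the certificate name you gave to dsadm add-cert.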

The other thing, replication from $big_server elsewhere, is still not working. I suspect this is my fault; in an attempt to get things working, I decided that the thing to try would be initializing $big_server from $little_server, then the other way around. This did not change things, and now $little_server is unable to push its changes elsewhere. I've since been told this was a mistake on my part; arghh.

Unfortunately, there were other things I screwed up in the original install of DS6 on $big_server (embarrassing and rather pointless to record for Google right now), and I strongly suspect that I'm going to have to reinstall or reinitialize $big_server just to get things into a reasonably coherent state. Fortunately, there aren't that many changes that ever happen on it, so there shouldn't be many to lose or redo if it's wiped.

And thus my Saturday.