Roy's Blog

A Hacker's musings on Code | Tech | Life

In my continuing efforts to entirely self host, I've found that fighting spam is hard. I originally configured SpamAssassin on my mail server quite a few years ago, and to be fair it has done its job. But recently, more spam has been creeping through and my ever growing stack of addons to SA (such as policyd-spf, OpenDKIM, OpenDMARC and others) was eating quite a lot of memory on my poor server.

So I shopped around and found Rspamd. For my needs it sounded wonderful - no more need for MySQL (it's a hard dependency of OpenDMARC) as I much prefer PostgreSQL. SPF, DKIM and DMARC all integrated. Written in C and Lua which is a massive improvement over Perl and Python. Also sports a shiny Web UI to monitor the server and do basic config. Speaking of config, it's still not entirely easy, but it's much easier than configuring the stack I used to have! I did have to patch the build so that it works with OpenSSL-1.1 which is now in pkgsrc. All in all, I anticipated a nice memory reduction once I had it all configured. So far it's using about 200MB less memory, but it's early days. How much better or worse than SA it is at actual spam filtering remains to be seen, but I have high hopes.

While here, I also replaced procmail with PigeonHole. I didn't really need to do this, but I thought "As I'm here.....". Actually the end result is much nicer as I now only have one Spam folder, rather than needing another two folders just for training ham and spam. I just need to hook this final part into how I manage spam on my mlmmj email lists.


[ERROR] Can't open and lock privilege tables: Got error 9 from storage engine

Nice error. Googling for it doesn't reveal much on how to fix it. The good news is that I only use MySQL for Phabricator and PostgreSQL for everything else. The bad news is that my Phabricator instance is no longer working. The worse news is that I get the same error when trying to use backups, so there must be something else in play here.

Ideas on how to resolve this are welcome!


I've been trying to run an IPv6 tunnel without much success - it's far too laggy to use for real work. So I've turned that off, and I just noticed I'm now getting an IPv6 Router Advertisement across my Super Hub3 in modem mode. I've gotten a default route AND a prefix option for 2a02:8800:f000:2120::/64 with the on-link flag set (but sadly, no auto config flag). This prefix is owned by Virgin Media.

So, I can ping the router but nothing else as I don't have a public IPv6 address. No address via RA, no reply to my DHCPv6 solicitations - which is odd as the router says I can get a managed address and other information. Maybe they have yet to turn that part on? Please, turn it on soon Virgin!
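To make the flags mentioned above concrete, here is a minimal sketch of how the relevant bits of a Router Advertisement can be decoded, following the layout in RFC 4861. The function names and the lifetime values are made up for illustration; only the prefix comes from this post, and this is not how dhcpcd itself parses the message.

```python
import ipaddress
import struct


def parse_ra_flags(flags: int) -> dict:
    """Decode the flags byte of a Router Advertisement header (RFC 4861)."""
    return {
        "managed": bool(flags & 0x80),  # M: addresses available via DHCPv6
        "other": bool(flags & 0x40),    # O: other information via DHCPv6
    }


def parse_prefix_info(option: bytes) -> dict:
    """Decode a 32 byte Prefix Information option (type 3, length 4)."""
    (opt_type, opt_len, prefix_len, flags,
     valid, preferred, _reserved, prefix) = struct.unpack("!BBBBIII16s", option)
    if opt_type != 3 or opt_len != 4:
        raise ValueError("not a Prefix Information option")
    network = ipaddress.IPv6Network((ipaddress.IPv6Address(prefix), prefix_len))
    return {
        "prefix": str(network),
        "on_link": bool(flags & 0x80),     # L: prefix is on this link
        "autonomous": bool(flags & 0x40),  # A: may be used for SLAAC
        "valid_lifetime": valid,
        "preferred_lifetime": preferred,
    }


# Hypothetical option resembling the one described: on-link set, autonomous
# clear. The lifetimes (86400 and 14400 seconds) are invented for the example.
example_option = (
    bytes([3, 4, 64, 0x80])
    + struct.pack("!III", 86400, 14400, 0)
    + ipaddress.IPv6Address("2a02:8800:f000:2120::").packed
)

if __name__ == "__main__":
    print(parse_ra_flags(0xC0))
    print(parse_prefix_info(example_option))
```

With the autonomous flag clear, a host has no business forming its own address from the prefix, so the managed flag in the RA header is the only route to an address - and that needs a DHCPv6 server to actually answer.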


After waving a fond farewell to Fossil I give a hearty hello to Phabricator!

The Good

Phabricator is written in PHP which means I don't have to install Yet Another Framework. I use quite a few things that depend on PHP on this site already, such as Grav and RoundCube. So of course, it allows me to self host. Or you can rent a Phabricator VPS @ Phacility.

The sign up process (to my Phabricator instance, not somewhere else) is very straight-forward, allowing email/password with reCAPTCHA or an OAuth2 provider such as Google. So this is very socially acceptable and should be secure from spambots.

The core work is based around the ease of code auditing and review of patches. There is even a pastebin so users can upload config files and logs for analysis. Doing all this in a mailing list over the years results in things being here, there and everywhere .... and then expiring. Having it all centralised means nothing is lost. But more importantly, it's much easier to look at and work with, so this is a massive quality of life improvement.

Tickets (or tasks in Phabricator) are very user friendly, showing a collapsible history with full links to related objects such as commits, reviews, logs, etc. In fact the linking is extremely easy: one can reference some of the more popular objects by using a single letter followed by the id, such as T1. Tickets can be related to one or more Projects and in turn Projects can display Tasks on a Kanban Board.

Phabricator can host your code in your SCM of choice for you and defaults to not allowing destructive changesets, which saves me from messing around with custom hooks. This gives the same feature as Fossil's immutable history on the server - you can still do what you like to your own clone.

It's fast! No, it's not as fast as Fossil, but it's still more than fast enough especially when you consider the extra toys you get - syntax highlighting, desktop notifications (on supported browsers, which is most recent ones), user icons, in-depth tooltips. It's certainly faster than other solutions I've looked at recently and bar Fossil, probably the fastest.

You get a chat room (though it seems to require a NodeJS server on the host for automatic updates) and a wiki. I still use IRC on FreeNode, but the advantage here is that this is web based and persistent so you don't lose anything if you get disconnected. Still, I'm unsure how useful either will be as I don't recall users editing any publicly editable wiki pages I've had over the years - are my man pages really that good? Heh.

The Bad

Phabricator is written in PHP. Now I did say that was a good thing earlier, but it's a double-edged sword. PHP does have a bad reputation for both security and language structure. I would argue that this is no different from how C is today. This is also bad, because my site ran on PHP-7.0 and that was soooo much faster than earlier versions it was silly. But Phabricator didn't support PHP-7 until PHP-7.1 in early Feb this year. Something to think about for long term support, but this equally applies to other languages, especially the Python-2 vs Python-3 issue as my box has two Python versions due mainly to certbot needing Python-2.7.

Phabricator requires MySQL (I installed MariaDB, a fork of MySQL). I was very happy with PostgreSQL but my box does not have the resources to run both. Pretty much all other software I use allows the choice of DB, so this actually took me by surprise. And just like the reaction others have to PHP, I was concerned about using MySQL, but as I'm not really into being a DBA I'm quite happy with MySQL so far.

The linking is really bad for DHCP, because we always talk about T1 and T2 as timers. This is important, because my main product is of course dhcpcd. In Phabricator T1 and T2 are shorthand to link to Task 1 and Task 2. You can fix this by stopping Phabricator from linking via a matching regex, but I quite like the ease of use and solved the problem by changing the AUTO_INCREMENT value in some tables from 1 to 101. This reduces the potential collision with other things, such as Z1, and allows the same workflow. Upstream rejected my change and even went as far as to remove my post describing the fix for anyone else with the same issue, claiming this would make support hard. As it turns out, something with my change isn't quite right - either Phabricator or MySQL resets the AUTO_INCREMENT value. I don't know which one, or what action of mine triggered it, or if it's a general Garbage Collection going on. This could be why they didn't like the change, but hey ho, most of the important tables now have values at 101 and higher so it shouldn't be a problem anymore.
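As a rough illustration of why starting the ids at 101 sidesteps the clash, here is a small sketch. The regex and function below are my own stand-ins and not Phabricator's actual linking rules (which live in its PHP code); it also assumes that a reference only becomes a link when a task with that id can exist.

```python
import re

# Hypothetical stand-in for a "T followed by an id" reference matcher.
TASK_REF = re.compile(r"\bT([1-9]\d*)\b")

# First id handed out once AUTO_INCREMENT has been bumped from 1 to 101.
FIRST_TASK_ID = 101


def linkable_task_ids(text: str, first_id: int = FIRST_TASK_ID) -> list:
    """Return the referenced task ids that could belong to a real task."""
    ids = (int(match.group(1)) for match in TASK_REF.finditer(text))
    return [task_id for task_id in ids if task_id >= first_id]


if __name__ == "__main__":
    comment = "Renew at T1, rebind at T2; see T101 and T250 for the bug."
    print(linkable_task_ids(comment))
```

The DHCP timers T1 and T2 still match the pattern, but no task can ever have those ids, so talking about them in a ticket no longer points at an unrelated task.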

The Continuous Integration support is limited, but it is there. Apparently you can at least call out to Jenkins or BuildBot.

Because Phabricator is based on and developed in a DevOps fashion, there is practically no support for managed releases or milestones. This isn't a problem for me as such, but I would like a feature to track important things that went into a release better.

The Ugly

Phabricator is NOT ugly. It's quite visually appealing. However, it is quite possibly the most complex installation I've ever done as it uses many databases and as many configuration options as sysctl on a good BSD. This wasn't helped by running on NetBSD -current: a gcc built PHP with Phabricator just didn't work, and I spent a long time working out why. My fix was to build everything with clang which required a lot of personal effort from me at the time due to the recent UEFI booting support breaking the build and the new clang-4 compiler not working with the NetBSD build knobs I was using. On the plus side, the Phabricator documentation is good and about 95% of the issues I had were easily searchable on StackOverflow or the (mostly) friendly Phabricator community helped me out in their chat channel - which oddly enough is also a Phabricator application.

The Phabricator workflow for more than one dev, and the best way of submitting patches, is to use the Arcanist tool. They admit it's not great and things should be manageable directly through the SCM. We'll see how that progresses. In the meantime, posting patches to the Differential application is quite easy and allows easy patch review.

I had to stop using Fossil because Fossil is more than just a SCM - it strives to be a complete one stop solution. Obviously that won't work with the desire to use Phabricator for all the good reasons above, so I needed to pick a SCM to use. Luckily Phabricator supports quite a few - GIT, Mercurial and SVN.

But what about the source code control?

Its importance cannot be overstated - the code is everything, the history of the code is everything. This has been known since the dawn of time. At this point though, the SCM just becomes a tool in the box, just like sed.

Eh what?

Every SCM solution out there has pretty much the same set of basic features you need - atomic checkins (ok, CVS lacks this), changesets, branching, tagging. That's pretty much all you need at a basic level - the rest of the features are predominantly driven by workflow.

Tools exist to export data from one to the other, and tools are being created to allow a more transparent bridge, again making the choice of SCM even less important than it was before. The only real issue is the importance of meta data that has no place-holder in the other SCM you want to use. A good example of this would be git's Author vs Committer attributes on a commit.
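For anyone who hasn't looked inside a git commit, the sketch below shows the two identities side by side. It parses the plain text of a commit object, as printed by `git cat-file -p <commit>`; the names, emails, hashes and timestamps in the sample are invented, and the helper is only an illustration of the meta data a bridge has to find a home for.

```python
def parse_commit_identities(raw: str) -> dict:
    """Extract the author and committer header lines from a raw git commit."""
    identities = {}
    for line in raw.splitlines():
        if not line:
            break  # a blank line ends the headers; the message follows
        key, _, value = line.partition(" ")
        if key in ("author", "committer"):
            # Header format: Name <email> unix-timestamp timezone
            ident, timestamp, timezone = value.rsplit(" ", 2)
            name, _, email = ident.partition(" <")
            identities[key] = {
                "name": name,
                "email": email.rstrip(">"),
                "timestamp": int(timestamp),
                "timezone": timezone,
            }
    return identities


# Invented sample: a patch written by one person and committed by another.
SAMPLE_COMMIT = """\
tree 4b825dc642cb6eb9a060e54bf8d69288fbee4904
parent 1111111111111111111111111111111111111111
author Pat Contributor <pat@example.org> 1490000000 +0100
committer Sam Maintainer <sam@example.org> 1490003600 +0000

Fix the T1 renewal timer
"""

if __name__ == "__main__":
    for role, who in parse_commit_identities(SAMPLE_COMMIT).items():
        print(role, who["name"], who["email"])
```

A SCM that only records a single user per checkin has nowhere to put the second identity, so an export has to either drop one or tuck it into the commit message.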

Then, you need to understand that the SCM is only for developers. End users don't care a hoot about it - what they do care about is an easy to use system which handles the lifetime of their issue where discussion, patches, logs, reviews and audits can happen. Hopefully they can even get a fixed build at the end. This is basically part of Application Lifecycle Management.


I've been using Fossil as my SCM for quite a few years now and it has served me well. It replaced my aging Trac (which I've now really retired in the recent server move ... it didn't move) + GIT setup. There is nothing inherently wrong with it and upstream are quite quick to resolve any issues. So let's start with a list of Fossil plus points, in no particular order:

  • BSD license.
  • One binary, easy installation, very low maintenance cost.
  • Integrated CGI web front end.
  • Integrated Wiki, Tickets - which are also distributed.
  • Sane command line UI.
  • Stores everything in a SQLite database.
  • Repository is not joined to the checkout, supports different checkouts from the same cloned repository.
  • I have a Fossil commit bit - my change allows a near perfect Fossil <> GIT bridge.

And naturally, after many years of use, there are some negative points:

  • The ticketing system is very basic and has no email support - you're expected to use each ticket's RSS feed, but this is not clear.
  • It's not extendable.
  • It's possible for an admin in the upstream repo to wipe out parts or the whole of your cloned repo.
  • It's not social.

That's actually a very small list of negative points. It shows that Fossil is a great product, with a great team behind it. Let's address these negatives in more detail though.

The ticketing system is poor

Yes, the tickets are distributed, but that's the only good point. The UI to progress the ticket needs a lot of work and is not intuitive to use. Tickets don't support markdown. It's not clear to the end user that the only feedback they get is an RSS feed. My initial attempt to fix this was about 3 years ago but was met with silence. I could try and improve this by creating a fossil branch just to add RSS icons to the ticket UI.

Fossil is not extendable

This isn't actually that bad, what it does have works well enough (aside from the ticketing). And to be fair, there is a 3rd party library to extend fossil but it doesn't seem to be used by anything I can find. However, based on recent experiences at my day job (where I don't use Fossil), code reviews are turning out to be quite critical and the tools we were using suck. Well, Fossil doesn't have any code review feature nor any easy way of hooking it into an automated build system for continuous integration. So we're left reviewing changes via pastebin where links expire or diffs via email. Now diffs via email have been standard on many open source projects, and still are in many. Most of the time I can read them fine, but sometimes they are hard to review in an email. Sometimes I end up using a code review tool, loading the dhcpcd source code, copying and pasting the diff into it, reviewing and then copying my review comments back into the email. This is hardly ideal and quite time consuming.

Fossil history can be wiped out

Fossil has an ability to delete anything from your cloned repo - it's called shunning. While this is an awesome feature for corporations (I work for one, I understand the problems and wish my day job had this feature), I believe it's entirely unsuitable for open source use. This is my PC, it contains my contributions to a project which could wipe out my copy of said project and published contributions. History, gone. Now, it's entirely likely that this will never happen, I like to believe in the good in people, but the possibility remains someone could push the button. OK, there's a bit more to it than that - the default fossil setting is to auto-sync the shun list. However, the code is disabled for auto-sync (ie sync on commit) but is enabled for the manual pull/push commands. While that removes the item from your checkout (if it's there), it won't actually remove it from the repository itself until the repository is rebuilt, which is sometimes forced on you when upgrading fossil. Yes, this is probably a knee-jerk reaction to a non-issue, but it still grates. This is also a reason why I love to self host and would never consider having GitHub or similar being the one sole place where I publish my work. I have always, and always will do, self host.

Fossil is not social

By its very nature, you can't contribute to the club unless you're in the club - at least not using just Fossil. It's designed (from my perspective anyway) to be a distributed CVS/SVN + wiki + tickets. By this, I mean there is one master repository everyone clones from and pushes to. This makes it impossible to have my own branch outside of the main repo and publish it to others (equivalent of GIT Fork and Pull). It can also be argued that this is a good thing because it encourages people to work together and just like the prior point, this is a good feature for corporate setups. But equally sometimes someone needs to maintain a patchset unsuitable for upstream for valid reasons. This is rare, but it has happened. And I hate losing users for any reason. Could they have a branch they maintain in my repo? Quite possibly, but Fossil's security isn't that granular AFAIK and I would dislike someone messing around in the other branches. Maybe that's anti-social of me, but equally no-one has ever asked for commit access to my repos either.

EDIT: Dr Richard Hipp pointed out privately that this is The Cathedral and the Bazaar. Fossil is the pre-eminent solution for The Cathedral, while others are more suited to the Bazaar.

In summary

Taking the above into account, I can no longer justify the use of Fossil in my Open Source projects. For other projects, Fossil is still an awesome tool if that's all you need.
