Planet Linux Australia
Celebrating Australians & Kiwis in the Linux and Free/Open-Source community...

May 23, 2019

Installing Ubuntu 18.04 using both full-disk encryption and RAID1

I recently set up a desktop computer with two SSDs using a software RAID1 and full-disk encryption (i.e. LUKS). Since this is not a supported configuration in Ubuntu desktop, I had to use the server installation medium.

This is my version of these excellent instructions.

Server installer

Start by downloading the alternate server installer and verifying its signature:

  1. Download the required files:

     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/ubuntu-18.04.2-server-amd64.iso
     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/SHA256SUMS
     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/SHA256SUMS.gpg
    
  2. Verify the signature on the hash file:

     $ gpg --keyid-format long --keyserver hkps://keyserver.ubuntu.com --recv-keys 0xD94AA3F0EFE21092
     $ gpg --verify SHA256SUMS.gpg SHA256SUMS
     gpg: Signature made Fri Feb 15 08:32:38 2019 PST
     gpg:                using RSA key D94AA3F0EFE21092
     gpg: Good signature from "Ubuntu CD Image Automatic Signing Key (2012) <cdimage@ubuntu.com>" [undefined]
     gpg: WARNING: This key is not certified with a trusted signature!
     gpg:          There is no indication that the signature belongs to the owner.
     Primary key fingerprint: 8439 38DF 228D 22F7 B374  2BC0 D94A A3F0 EFE2 1092
    
  3. Verify the hash of the ISO file:

     $ sha256sum ubuntu-18.04.2-server-amd64.iso 
     a2cb36dc010d98ad9253ea5ad5a07fd6b409e3412c48f1860536970b073c98f5  ubuntu-18.04.2-server-amd64.iso
     $ grep ubuntu-18.04.2-server-amd64.iso SHA256SUMS
     a2cb36dc010d98ad9253ea5ad5a07fd6b409e3412c48f1860536970b073c98f5 *ubuntu-18.04.2-server-amd64.iso
    

Then copy it to a USB drive:

dd if=ubuntu-18.04.2-server-amd64.iso of=/dev/sdX

and boot with it.

Manual partitioning

Inside the installer, use manual partitioning to:

  1. Configure the physical partitions first.
  2. Configure the RAID arrays second.
  3. Configure the encrypted partitions last.

Here's the exact configuration I used:

  • /dev/sda1 is 512 MB and used as the EFI partition
  • /dev/sdb1 is 512 MB but not used for anything
  • /dev/sda2 and /dev/sdb2 are both 4 GB (RAID)
  • /dev/sda3 and /dev/sdb3 are both 512 MB (RAID)
  • /dev/sda4 and /dev/sdb4 use up the rest of the disk (RAID)

I only set /dev/sda1 as the EFI partition because I found that adding a second EFI partition would break the installer.

I created the following RAID1 arrays:

  • /dev/sda2 and /dev/sdb2 for /dev/md2
  • /dev/sda3 and /dev/sdb3 for /dev/md0
  • /dev/sda4 and /dev/sdb4 for /dev/md1

I used /dev/md0 as my unencrypted /boot partition.

Then I created the following LUKS partitions:

  • md1_crypt as the / partition using /dev/md1
  • md2_crypt as the swap partition (4 GB) with a random encryption key using /dev/md2
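The installer takes care of this, but for reference a random-key swap like md2_crypt above ends up as a line along these lines in /etc/crypttab (a sketch, not the installer's exact output; cipher and key-size options can be appended):

md2_crypt /dev/md2 /dev/urandom swap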

Post-installation configuration

Once your new system is up, sync the EFI partitions using dd:

dd if=/dev/sda1 of=/dev/sdb1

and create a second EFI boot entry:

efibootmgr -c -d /dev/sdb -p 1 -L "ubuntu2" -l "\EFI\ubuntu\shimx64.efi"

Ensure that the RAID drives are fully synced by keeping an eye on /proc/mdstat and then reboot, selecting "ubuntu2" in the UEFI/BIOS menu.
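For example:

watch -n1 cat /proc/mdstat

will refresh the sync status every second; the arrays are healthy once each md device reports [UU].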

Once you have rebooted, remove the following package to speed up future boots:

apt purge btrfs-progs

To switch to the desktop variant of Ubuntu, install these meta-packages:

apt install ubuntu-desktop gnome

then use debfoster to remove unnecessary packages (in particular the ones that only come with the default Ubuntu server installation).
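debfoster isn't installed by default; if you haven't used it before, the workflow is simply:

apt install debfoster
debfoster

It will then ask about each package in turn, letting you mark what to keep and purging the rest.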

Fixing booting with degraded RAID arrays

Since I have run into RAID startup problems in the past, I expected having to fix up a few things to make degraded RAID arrays boot correctly.

I did not use LVM since I didn't really feel the need to add yet another layer of abstraction on top of my setup, but I found that the lvm2 package must still be installed:

apt install lvm2

with use_lvmetad = 0 in /etc/lvm/lvm.conf.
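If you'd rather not edit the file by hand, a one-liner like this should do it (assuming the stock Ubuntu 18.04 lvm.conf, which defaults to use_lvmetad = 1):

sed -i 's/use_lvmetad = 1/use_lvmetad = 0/' /etc/lvm/lvm.conf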

Then in order to automatically bring up the RAID arrays with 1 out of 2 drives, I added the following script in /etc/initramfs-tools/scripts/local-top/cryptraid:

 #!/bin/sh
 # Start the RAID arrays before cryptsetup runs, even if they are degraded.
 PREREQ="mdadm"
 prereqs()
 {
      echo "$PREREQ"
 }
 case $1 in
 prereqs)
      prereqs
      exit 0
      ;;
 esac

 # --run starts an array even when one of its two drives is missing.
 mdadm --run /dev/md0
 mdadm --run /dev/md1
 mdadm --run /dev/md2

before making that script executable:

chmod +x /etc/initramfs-tools/scripts/local-top/cryptraid

and refreshing the initramfs:

update-initramfs -u -k all

Disable suspend-to-disk

Since I use a random encryption key for the swap partition (to avoid having a second password prompt at boot time), suspend-to-disk is not going to work, so I disabled it by putting the following in /etc/initramfs-tools/conf.d/resume:

RESUME=none

and by adding noresume to the GRUB_CMDLINE_LINUX variable in /etc/default/grub before applying these changes:

update-grub
update-initramfs -u -k all
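For reference, the relevant line in /etc/default/grub should end up looking something like this (assuming you had no other kernel arguments set):

GRUB_CMDLINE_LINUX="noresume"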

Test your configuration

With all of this in place, you should be able to do a final test of your setup:

  1. Shut down the computer and unplug the second drive.
  2. Boot with only the first drive.
  3. Shut down the computer and plug the second drive back in.
  4. Boot with both drives and re-add the second drive to the RAID arrays:

     mdadm /dev/md0 -a /dev/sdb3
     mdadm /dev/md1 -a /dev/sdb4
     mdadm /dev/md2 -a /dev/sdb2
    
  5. Wait until the RAID is done re-syncing and shut down the computer.

  6. Repeat steps 2-5 with the first drive unplugged instead of the second.
  7. Reboot with both drives plugged in.

At this point, you have a working setup that will gracefully degrade to a one-drive RAID array should one of your drives fail.

May 21, 2019

A nerd snipe, in which I learn to read gerber files


So, I had the realisation last night that the biggest cost in getting a PCB made in China is the shipping. The boards are about 50 cents each, and then it’s $25 for shipping (US dollars of course). I should therefore be packing as many boards into a single order as possible to reduce the shipping cost per board.

I have a couple of boards on the trot at the moment: my RFID attendance tracker project (called GangScan), and I’ve just decided to actually get my numitrons working and whipped up a quick breakout board for those. You’ll see more about that one later I’m sure.

I decided to ask my friends in Canberra if they needed any boards made, and one friend presented me with a set of Gerber CAM files and nothing else. That’s a pain because I need to know the dimensions of the board for the quoting system. Of course, I couldn’t find a tool to extract that for me with a couple of minutes of Googling, so… I decided to just learn to read the file format.

Gerber is well specified, with a quite nice specification available online. So it wasn’t too hard to dig out the dimensions layer from the zipped gerber files and then do this:

Contents of file                 | Meaning                                                                     | Dimensional impact
---------------------------------|-----------------------------------------------------------------------------|-------------------
G04 DipTrace 3.3.1.2*            | Comment                                                                     |
G04 BoardOutline.gbr*            | Comment                                                                     |
%MOIN*%                          | File is in inch units                                                       |
G04 #@! TF.FileFunction,Profile* | Comment                                                                     |
G04 #@! TF.Part,Single*          | Comment                                                                     |
%ADD11C,0.005512*%               | Defines an aperture. D11 is a circle with diameter 0.005512 inches          |
%FSLAX26Y26*%                    | Resolution is 2.6, i.e. there are 2 integer places and 6 decimal places     |
G04*                             | Comment                                                                     |
G70*                             | Historic way of setting units to inches                                     |
G90*                             | Historic way of setting coordinates to absolute notation                    |
G75*                             | Sets quadrant mode graphics state parameter to ‘multi quadrant’             |
G01*                             | Sets interpolation mode graphics state parameter to ‘linear interpolation’  |
G04 BoardOutline*                | Comment                                                                     |
%LPD*%                           | Sets the object polarity to dark                                            |
X394016Y394016D2*                | Set current point to 0.394016, 0.394016 (in inches)                         | Top left is 0.394016, 0.394016 inches
D11*                             | Draw the previously defined tiny circle                                     |
Y1194016D1*                      | Draw a vertical line to 1.194016 inches                                     | Board is 1.194016 inches tall
X1931366Y1194358D1*              | Draw a line to 1.931366, 1.194358 inches                                    | Board is 1.931366 inches wide (and not totally square)
Y394358D1*                       | Draw a vertical line to 0.394358 inches                                     |
X394016Y394016D1*                | Draw a line to 0.394016, 0.394016 inches                                    |
M02*                             | End of file                                                                 |

So this board is effectively 3cm by 5cm.
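Having worked through it by hand, the extraction is also easy to script. Here’s a rough Python sketch that prints the coordinate ranges of the outline, assuming this file’s conventions from above (inch units, 2.6 format, absolute coordinates) and a hypothetical file name:

import re

xs, ys = [], []
for line in open("BoardOutline.gbr"):
    # Only look at coordinate/draw commands, skipping comments (G04)
    # and parameter blocks like %FSLAX26Y26*%.
    if line[:1] not in ("X", "Y"):
        continue
    mx = re.search(r"X(-?\d+)", line)
    my = re.search(r"Y(-?\d+)", line)
    if mx:
        xs.append(int(mx.group(1)) / 1e6)  # 2.6 format: 6 decimal places
    if my:
        ys.append(int(my.group(1)) / 1e6)

print("X runs from %.6f to %.6f inches" % (min(xs), max(xs)))
print("Y runs from %.6f to %.6f inches" % (min(ys), max(ys)))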

A nice little nerd snipe to get the morning going.


May 20, 2019

Linux Security Summit 2019 North America: CFP / OSS Early Bird Registration

The LSS North America 2019 CFP is currently open, and you have until May 31st to submit your proposal. (That’s the end of next week!)

If you’re planning on attending LSS NA in San Diego, note that the Early Bird registration for Open Source Summit (which we’re co-located with) ends today.

You can of course just register for LSS on its own, here.

Gangscan 0.6 boards


So I’ve been pottering away for a while working on getting the next version of the gang scan boards working. These ones are much nicer: thicker tracks for signals, better labelling, support for a lipo battery charge circuit, a prototype audio circuit, and some LEDs to indicate status. I had them fabbed at the same place as last time, although the service was much faster this time around.

A gang scan 0.6 board

I haven’t got as far as assembling a board yet — I need to get some wire thin enough for the vias before I can do that. I’ll let you know how I go though.


May 19, 2019

Trigs map


A while ago I had a map of all the trig points in the ACT and links to the posts I’d written during my visits. That had atrophied over time. I’ve just spent some time fixing it up again, and it’s now at https://www.madebymikal.com/trigs_map.html — I hope it’s useful to someone else.


May 18, 2019

Trail run: Lake Tuggeranong to Kambah Pool (return)


This wasn’t the run I’d planned for this day, but here we are. This run follows the Centenary Trail between Kambah Pool and Lake Tuggeranong. Partially shaded, but also on the quiet side of the ridge line where you can’t tell that you’re near the city. Don’t take the tempting river ford; there is a bridge a little further downstream! 14.11km and 296m of vertical ascent.

Be careful of mountain bikers on this popular piece of single track. You’re allowed to run here, but some cyclists don’t leave much time to notice other track users.


Trail run: Tuggeranong Stone Wall loop


The Tuggeranong Stone Wall is a 140-year-old boundary between two former stations. It’s also a nice downhill start to a trail run. This loop involves starting at the Hyperdome, following the wall down, and then continuing along to Pine Island before returning. Partially shaded, and with facilities at the Hyperdome and Pine Island. 6km, and 68m of vertical ascent.


Trail run: Barnes and ridgeline


A first attempt at running to Barnes and Brett trigs, this didn’t work out quite as well as I’d expected (I ran out of time before I’d hit Brett trig). The area wasn’t as steep as I’d expected, being mostly rolling grazing land with fire trails. Lots of gates and no facilities, but stunning views of southern Canberra from the ridgeline. 11.11km and 421m of vertical ascent.


Trail run: Pine Island South to Point Hut with a Hill


This one is probably a little bit less useful to others, as the loop includes a bit more of the suburb than is normal. That said, you could turn this into a suburb avoiding loop quite easily. A nice 11.88km run with a hill climb at the end. A total ascent of 119 metres. There isn’t much shade along the run, but there is some in patches. There are bathrooms at Point Hut and Pine Island.

Be careful of mountain bikers on this popular piece of single track. You’re allowed to run here, but some cyclists don’t leave much time to notice other track users.


Trail run: Cooleman Ridge


This run includes Cooleman and Arawang trig points. Not a lot of shade, but a pleasant run. 9.86km and 264m of vertical ascent.


Trail running guide: Tuggeranong


I’ve been running on trails more recently (I’m super bored with roads and bike paths), but running on trails makes load management harder — often I’m looking for a run of approximately XX length with no more than YY vertical ascent. So I was thinking, maybe I should just write the runs that I do down so that over time I create a menu of options for when I need them.

This page documents my Tuggeranong runs.

Name                                       | Distance (km) | Vertical ascent (m) | Notes                                                                                                                                   | Posts
-------------------------------------------|---------------|---------------------|-----------------------------------------------------------------------------------------------------------------------------------------|-----------------
Cooleman Ridge                             | 9.78          | 264                 | Cooleman and Arawang Trigs. Not a lot of shade and no facilities.                                                                         | 25 April 2019
Pine Island South to Point Hut with a Hill | 11.88         | 119                 | A nice Point Hut and Pine Island loop with a hill climb at the end. Toilets at Point Hut and Pine Island. Not a lot of shade. Beware of mountain bikes! | 21 February 2019
Barnes and ridgeline                       | 11.11         | 421                 | Not a lot of shade and no facilities, but stunning views of southern Canberra.                                                            | 2 May 2019
Lake Tuggeranong to Kambah Pool (return)   | 14.11         | 296                 | Partial shade and great views, but beware the mountain bikes!                                                                             | 11 May 2019
Tuggeranong Stone Wall loop                | 6             | 68                  | Partial shade and facilities at the Hyperdome and Pine Island.                                                                            | 27 April 2019


May 09, 2019

Audiobooks – April 2019

Enlightenment Now: The Case for Reason, Science, Humanism, and Progress by Steven Pinker

Amazingly good book, well argued and with lots of information. The only downside is that he talks to some diagrams [downloadable] at times. Highly recommended. 9/10

A History of Britain, Volume 3: The Fate of Empire 1776 – 2000 by Simon Schama

I didn’t enjoy this all that much. The author tried to use various lives to illustrate themes, but both the themes and the biographies suffered. Huge areas were also left out. 6/10

Where Did You Get This Number? : A Pollster’s Guide to Making Sense of the World by Anthony Salvanto

An overview of (mostly) political polling and its history. Lots of examples from the 2016 US election campaign. Light but interesting. 7/10

Squid Empire: The Rise and Fall of the Cephalopods by Danna Staaf

Pretty much what the title says. I got a little lost with all the similarly named species but the general story was interesting enough and not too long. 6/10

Apollo in the Age of Aquarius by Neil M. Maher

The story of the back and forth between NASA and the 60s counterculture, from the civil rights struggle and the antiwar movement to environmentalism and feminism. Does fairly well. 7/10



May 06, 2019

Visual Studio Code for Linux kernel development

Here we are again - back in 2016 I wrote an article on using Atom for kernel development, but I didn't stay using it for too long, instead moving back to Emacs. Atom had too many shortcomings - it had that distinctive Electron feel, which is a problem for a text editor - you need it to be snappy. On top of that, vim support was mediocre at best, and even as a vim scrub I would find myself trying to do things that weren't implemented.

So in the meantime I switched to spacemacs, which is a very well integrated "vim in Emacs" experience, with a lot of opinionated (but good) defaults. spacemacs was pretty good to me but had some issues - disturbingly long startup times, mediocre completions and go-to-definitions, and integrating any module into spacemacs that wasn't already integrated was a big pain.

After that I switched to Doom Emacs, which is like spacemacs but faster and closer to Emacs itself. It's very user configurable but much less user friendly, and I didn't really change much as my elisp-fu is practically non-existent. I was decently happy with this, but there were still some issues, some of which are just inherent to Emacs itself - like no actually usable inbuilt terminal, despite having (at least) four of them.

Anyway, since 2016 when I used Atom, Visual Studio Code (henceforth referred to as Code) came along and ate its lunch, using the framework (Electron) that was created for Atom. I did try it years ago, but I was very turned off by its Microsoft-ness, its seeming lack of distinguishing features from Atom, and the fact that it didn't feel like a native editor at all. Since it's massively grown in popularity since then, I decided I'd give it a try.

Visual Studio Code

Vim emulation

First things first for me is getting a vim mode going, and Code has a pretty good one of those. The key feature for me is that there's Neovim integration for Ex-commands, filling a lot of shortcomings that come with most attempts at vim emulation. In any case, everything I've tried to do that I'd do in vim (or Emacs) has worked, and there are a ton of options and things to tinker with. Obviously it's not going to do as much as you could do with Vimscript, but it's definitely not bad.

Theming and UI customisation

As far as the editor goes - it's good. A ton of different themes, and you can change the colour of pretty much everything in the config file or in the UI, including icons for the sidebar. There's a huge sore point though: you can't customise the interface outside the editor pretty much at all. There's an extension for loading custom CSS, but it's out of the way, finicky, and if I wanted to write CSS I wouldn't have become a kernel developer.

Extensibility

Extensibility is definitely a strong point; the ecosystem of extensions is good. All the language extensions I've tried have been very fully featured, with a ton of different options and integration into language-specific linters and build tools. This is probably Code's strongest feature - the breadth of the extension ecosystem and the level of quality found within.

Kernel development

Okay, let's get into the main thing that matters - how well does the thing actually edit code. The kernel is tricky. It's huge, it has its own build system, and in my case I build it with cross compilers for another architecture. Also, y'know, it's all in C and built with make, not exactly great for any kind of IDE integration.

The first thing I did was check out the vscode-linux-kernel project by GitHub user "amezin", which is a great starting point. All you have to do is clone the repo, build your kernel (with a cross compiler works fine too), and run the Python script to generate the compile_commands.json file. Once you've done this, go-to-definition (gd in vim mode) works pretty well. It's not flawless, but it does go cross-file, and will pop up a nice UI if it can't figure out which file you're after.
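In practice that workflow is only a few commands. This is a sketch from memory, so treat the kernel path, cross compiler prefix, and script name as assumptions and check the repo's README:

cd ~/src/linux
git clone https://github.com/amezin/vscode-linux-kernel.git .vscode
make ARCH=powerpc CROSS_COMPILE=powerpc64le-linux-gnu- -j$(nproc)
python .vscode/generate_compdb.py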

Code has good built-in git support, so actions like staging files for a commit can be done from within the editor. Ctrl-P lets you quickly navigate to any file with fuzzy-matching (which is impressively fast for a project of this size), and Ctrl-Shift-P will let you search commands, which I've been using for some git stuff.

git command completion in Code

There are some rough edges, though. Code is built around the same concept as so many modern editors: one window per project. So to get things working the way you want, you would open your kernel source as the current project. This makes it a pain to just open something else to edit, like some script, or checking the value of something in firmware, or chucking something in your bashrc.

Auto-triggering builds on change isn't something that makes a ton of sense for the kernel, and it's not present here. The kernel support in the repo above is decent, but it's not going to get you close to what more modern languages can get you in an editor like this.

Oh, and it has a powerpc assembly extension, but I didn't find it anywhere near as good as the one I "wrote" for Atom (I just took the x86 one and switched the instructions), so I'd rather use the C mode.

Terminal

Code has an actually good inbuilt terminal that uses your login shell. You can bring it up with Ctrl-`. The biggest gripe I have always had with Emacs is that you can never have a shell that you can actually do anything in, whether it's eshell or shell or term or ansi-term, you try to do something in it and it doesn't work or clashes with some Emacs command, and then when you try to do something Emacs-y in there it doesn't work. No such issue is present here, and it's a pleasure to use for things like triggering a remote build or doing some git operation you don't want to do with commands in the editor itself.

Not the most important feature, but I do like not having to alt-tab out and lose focus.

Well...is it good?

Yeah, it is. It has shortcomings, but installing Code and using the repo above to get started is probably the simplest way to get a competent kernel development environment going, with more features than most kernel developers (probably) have in their editors. Code is open source and so are its extensions, and it'd be the first thing I recommend to new developers who aren't already super invested into vim or Emacs, and it's worth a try if you have gripes with your current environment.

May 05, 2019

Ignition!


Whilst the chemistry was sometimes over my head, this book is an engaging summary of the history of US liquid rocket fuels during the height of the cold war. Fun to read and interesting as well. I enjoyed it.

Ignition! by John Drury Clark (Technology & Engineering, 1972, 214 pages)


May 04, 2019

Codec2 and FreeDV Update

Quite a lot of Codec2/FreeDV development is going on this year, so much that I have been neglecting the blog! Here is an update…

Github, Travis, and STM32

Early in 2019, the number of active developers had grown to the point where we needed more sophisticated source control, so in March we moved the Codec 2 project to GitHub. One feature I’m enjoying is the collaboration and messaging between developers.

Danilo (DB4PLE) immediately had us up and running with Travis, a tool that automatically builds our software every time it is pushed. This has been very useful in spotting build issues quickly, and reducing the amount of “human in the loop” manual testing.

Don (W7DMR), Danilo, and Richard (KF5OIM) have been doing some fantastic work on the cmake build and test system for the stm32 port of 700D. A major challenge has been building the same code on desktop platforms without breaking the embedded stm32 version, which has tight memory constraints.

We now have a very professional build and test system, and can run sophisticated unit tests from anywhere in the world on remote stm32 development hardware. A single “cmake test all” command can build and run a suite of automated tests on the x86 and stm32 platforms.

The fine stm32 work by Don will soon lead to new firmware for the SM1000, and FreeDV 700D is already running on radios that support the UHSDR firmware.

FreeDV in the UK

Mike (G4ABP) contacted me with some fine analysis of the FreeDV modems on the UK NVIS channel. Mike is part of a daily UK FreeDV net, which was experiencing some problems with loss of sync on FreeDV 700C. Together we have found (and fixed) bugs in FreeDV 700C and 700D.

The UK channel is interesting: high SNR (>10dB), but at times high Doppler spread (>3Hz) which the earlier FreeDV 700C modem may deal with better due to its high sampling rate of the channel phase. In contrast, FreeDV 700D has been designed for moderate Doppler (1Hz), but heavily optimised for low SNR operation. More investigation required here with off air samples to run any potential issues to ground.

I would like to hear from you if you have problems getting FreeDV 700D to work with strong signals! This could be a sign of fast fading “breaking” the modem. By working together, we can improve FreeDV.

FreeDV in Argentina

Jose, LU5DKI, is part of an active FreeDV group in Argentina. They have a Facebook page for their Radio Club Coronel Pringles LU1DIL that describes their activities. They are very happy with the low SNR and interference rejecting capabilities of FreeDV 700D:

Regarding noise FREEDV IS IMMUNE TO NOISE, coincidentally our CLUB is installed in a TELEVISION MONITORING CENTER, where the QRN by the monitors and computers is very intense, it is impossible to listen to a single SSB station, BUT FREEDV LISTENS PERFECTLY IN 40 METERS

Roadmap for 2019

This year I would like to get FreeDV 2020 released, and FreeDV 700D running on the SM1000. A bonus would be some improvement in the speech quality for the lower rate modes.

Reading Further

FreeDV 2020 First On Air Tests
Porting a LDPC Decoder to a STM32 Microcontroller
Universal Ham Software Defined Radio Github page

FreeDV 2020 First On Air Tests

Brad (AC0ZJ), Richard (KF5OIM) and I have been putting together the pieces required for the new FreeDV 2020 mode, which uses LPCNet Neural Net speech synthesis technology developed by Jean-Marc Valin. The goal of this mode is 8kHz audio bandwidth in just 1600 Hz of RF bandwidth. FreeDV 2020 is designed for HF channels where SSB is an “armchair copy” – SNRs of better than 10dB and slow fading.

FreeDV 2020 uses the fine OFDM modem ported to C by Steve (K5OK) for the FreeDV 700D mode. Steve and I have modified this modem so it can operate at the higher bit rate required for FreeDV 2020. In fact, the modem can now be configured on the command line for any bandwidth and bit rate that you like, and even adapt the wonderful LDPC FEC codes developed by Bill (VK5DSP) to suit.

Brad is working on the integration of the FreeDV 2020 mode into the FreeDV GUI program. It’s going well, and he has made 1200 mile transmissions across the US to an SDR using the Linux version. Brad has also done some work on making FreeDV GUI deal with USB sound cards that come and go in a different order.

Mark, VK5QI has just made a 3200km FreeDV transmission from Adelaide, South Australia to a KiwiSDR in the Bay of Islands, New Zealand. He decoded it with the partially working OSX version (we do most of our development on Ubuntu Linux).

I’m surprised as I didn’t think it would work so well over such long paths! There’s a bit of acoustic echo from Mark’s shack but you can get an idea of the speech quality compared to SSB. Thanks Mark!

For the adventurous, the freedv-gui source code 2020 development branch is here. We are currently performing on air tests with the Linux version, and Brad is working on the Windows build.

Reading Further

Steve Ports an OFDM modem from Octave to C
Bill’s (VK5DSP) Low SNR Blog

May 02, 2019

Restricted Sleep Regime

Since moving down to Melbourne my poor sleep has started up again. It’s really hard to say what the main factor driving this is. My doctor down here has put me onto a drug-free way of trying to improve my sleep, and I think I kind of like it. While it’s no silver bullet, it is something I can go back to if I’m having trouble with my sleep, without having to get a prescription.

The basic idea is to maximise sleep efficiency. If you’re only getting n hours sleep a night, only spend n hours a night in bed. This forces you to stay up and go to bed rather late for a few nights. Hopefully, being tired will help you sleep through the night in one large segment. Once you’ve successfully slept through the night a few times, relax your bed time by say fifteen minutes, and get used to that. Slowly over time, you increase the amount of sleep you’re getting, while keeping your efficiency high.

 
Person T has had Person A design a one-page flyer and sent it to Person J... as a single image. Person T is two hours ahead, time-zone wise, and Person A is roughly 12 hours behind.

Person J also wishes to email out the flyer with hyperlinks on each of two names in the image.

Sent as a bare image, she will not fly.

Embedding the image in a PDF would allow only the entire image to possess a single hyperlink.

So... crank up GIMP, open image, select the Move tool, drag Guides from each Ruler to section up the image. Each Guide changes nothing, however its presence allows the Rectangle Select tool to be very precise and consistent.

Now File ⇒ Save the work-file in case you wish to adjust things for another round. Here, I have applied the Cubist tool from the Filters to most of the content, so the idea is conveyed without revealing details of said content.

The next step is to Rectangle Select the top area (in the screenshot above, the left-name area has been Rectangle Selected), then Copy it (Ctrl+C is the keyboard shortcut), then File ⇒ Create ⇒ From Clipboard (Ctrl+Shift+V is the shortcut) to make the copy into a new image, export that image (File ⇒ Export) as a PNG (lossless compression), repeat for the bottom area, then in the central section, for the left, left-name, centre, right-name, right areas.

Open LibreOffice Writer, Insert ⇒ Image the top-area image, right-click, choose Properties; under the Type tab make it “As character”; under the Crop tab set the Scale so it will all fit nicely (58% in this case, which can be tweaked later to suit); OK. Click to the right of the image, press Shift+Enter to insert a NewLine (rather than a paragraph).

Now Insert ⇒ Image the centre left area, then left-name, centre, right-name, right. With the name areas (in this case) I also chose the Hyperlink tab within the Properties dialogue, and pasted the link into the URL field, making that image section click-able. When done, Shift+Enter to make a place for the bottom area.

Finally, Insert ⇒ Image the bottom-area image (and if it does not all butt up squarely, check (Format ⇒ Paragraph) that the Line Spacing for the document’s sole paragraph is set to Single). Now save (for the sake of posterity) and click the “Export as PDF” button.

April 30, 2019

Election Activity Bundle

With the upcoming federal election, many teachers want to do some related activities in class – and we have the materials ready for you! To make selecting suitable resources a bit easier, we have an Election Activity Bundle containing everything you need, available for just $9.90. Did you know that the secret ballot is an Australian […]

FreeDV QSO Party 2019 Part 2

Here is a fine summary of the FreeDV QSO Party 2019, which took place last weekend. I took part, and made several local and interstate contacts, plus listened in to many more.

Thanks so much to AREG for organising the event, and for all the Hams world wide who took part.

It would be great to make some international DX contacts using the mode, in particular to get some operational experience with the modems on long distance channels.

Generating Wideband Speech from Narrowband

I’m slowly getting up to speed on this machine learning caper. I had some free time this week, so I set myself a little machine learning exercise.

LPCNet represents speech using 18 bands that cover the range from 0 to 8000Hz (in the form of MFCCs). However the Wavenet work demonstrated high quality speech using just the Codec 2 2400 bit/s features, which only contain information in the 0 to 4 kHz range. This suggests we can regenerate the speech energy above 4000Hz, from the features beneath 4000Hz. In a speech coding application, this would save bits, as we no longer have to quantise and transmit the high frequency band energies.

So the goals of this project were:

  1. Gain experience in Machine Learning.
  2. Generate reasonable quality speech by synthesising the top 6 bands (3200 to 8000Hz) from the information in the lower 12 bands (which cover 0 to 2800Hz). Doesn’t have to be a perfect reconstruction, after all, we are throwing information away.

Method

As a starting point I set up a Keras model with a couple of Dense layers, and a linear output layer. The band features extracted by dump_data are the log10 of the band energies. Multiplying by 10 gives us the energy in dB. This is quite neat, as the network “loss” function (mean square error) effectively reports the distance in dB^2/10. This gives us a nice objective measure of distortion, and hopefully speech quality.

So the input to the network is the lower 12 bands, and the LPCNet pitch gain (a rough estimate of voicing). The output is the 6 high frequency bands.
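For the curious, a network of that shape is only a few lines of Keras. This is a minimal sketch of the description above, not the code from my repo, and the hidden layer widths are my guesses:

from keras.models import Sequential
from keras.layers import Dense

# Input: the 12 low band energies plus the LPCNet pitch gain (13 values).
# Output: the 6 high frequency band energies, predicted directly.
model = Sequential([
    Dense(64, activation="relu", input_shape=(13,)),
    Dense(64, activation="relu"),
    Dense(6, activation="linear"),
])

# Mean square error over the band energies, giving the objective
# distortion measure discussed above.
model.compile(optimizer="adam", loss="mean_squared_error")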

I wanted to sanity check my network to make sure it was getting results better than random. So I reasoned a trivial algorithm would just set the HF band energies to their mean. The distortion of this trivial algorithm is the variance of the training data. I measured the variance of the HF bands as 200dB^2, and my first attempts were reducing this to 50dB^2, a factor of four improvement. So that’s a good start.

I messed about with the number and size of layers, activation functions, optimisers, batch size, epochs, which produced minor changes. On a whim I decided to remove the mean and that significantly reduced the error to 12dB^2, or a rms error of 3.5dB. That’s within the range of what a (coarse) vector quantiser will achieve – and we are doing it with zero bits.

BTW – is it just me or is NN design just guesswork? Maybe it becomes educated guesswork as experience grows.

The mean of the bands is related to frame energy, and is often sent to the decoder in some form as part of regular quantisation. So is the pitch gain (e.g. in the form of a voicing flag). However this NN only seems to work well when it’s the mean of all 18 bands.

Testing

I used a “vanilla” LPCNet system to test, slightly modified to output band energies rather than the DCT of the band energies. First I generate features using dump_data, then play the features straight into test_lpcnet to synthesise. There is no quantisation. I placed my HF regeneration network in between, optionally replacing the top 6 bands with the estimates from the network.

I use a small database (1E6 vectors) for LPCNet experimentation, as this trains LPCNet in just a few hours. I determined by experiment this was just large enough to synthesise good quality speech on samples from within the training database. I have a script (tinytrain.sh) that trains the network, and generates test samples, automating the process. If my experimental algorithms don’t work with this tiny database, no point going any further, and I haven’t wasted too much time. If the experiments work out, I can then train a better network, e.g. to deal with many different speakers and conditions.

So I used the same 1E6 size database of vectors for training the high band regeneration algorithm. The files are available from my LPCNet Github repo (train_regen.py, test_regen.py, run_regen.sh, plot_regen.m).

Samples

Condition                     | Female | Male
------------------------------|--------|-----
Original                      | Play   | Play
Vanilla LPCNet                | Play   | Play
Regen HF with mean removed    | Play   | Play
Regen HF without mean removed | Play   | Play

The mean removed samples sound rather close to the vanilla. Almost too close. In the other samples without mean removal, the high frequencies are not regenerated as well, although they might still be useful in a low bit rate coding scenario, given they cost zero bits.

Here is a 3D plot of the first 1 second (100 x 10ms frames) of the bands for the female sample above. The little hills roll up and down as the words are articulated. At the high frequency end, the peaks tend to correspond to consonants, e.g. the “ch” in “birch” is in the middle. Right at the end (around frame 100 at the “rear”), we can see the start of the “sss” in “slid”.

The second plot shows the error of the regeneration network:

Below is a plot of a single frame, showing the original and regenerated bands:

Conclusion

Anyhoo, looks like the idea has promise, especially for reducing the bit rate required for speech codecs. Further work will show if the idea works for a wider range of speech samples, and with quantisation. The current model could possibly be improved to take into account adjacent frames using a convnet or RNN. OK, on with the next experiment…

LUV May 2019 Main Meeting: Kali Linux

Location: 
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

7:00 PM to 9:00 PM Tuesday, May 7, 2019

Speakers:

  • errbufferoverfl: Kali Linux


Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.


LUV May 2019 Workshop

12:30 PM to 4:30 PM Saturday, May 25, 2019
Location: 
Infoxchange, 33 Elizabeth St. Richmond

PLEASE NOTE CHANGE OF DATE DUE TO THE ELECTION

This month's meeting will be on 25 May rather than 18 May 2019 due to the election on the usual workshop date.

Please contact the committee at luv-ctte@luv.asn.au with suggestions for topics.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121.  Late arrivals please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria is a subcommittee of Linux Australia.


April 27, 2019

Building new pods for the Spectracom 8140 using modern components

I've mentioned a bunch of times on the time-nuts list that I'm quite fond of the Spectracom 8140 system for frequency distribution. For those not familiar with it, it's simply running a 10MHz signal against a 12v DC power feed so that line-powered pods can tap off the reference frequency and use it as an input to either a buffer (10MHz output pods), decimation logic (1MHz, 100kHz etc.), or a full synthesizer (Versa-pods).

It was only in October last year that I got a house frequency standard going using an old Efratom FRK-LN which now provides the reference; I'd use a GPSDO, but I live in a ground floor apartment without a usable sky view, which of course makes it hard to test some of the GPS projects I'm doing. Despite living in a tiny apartment I have test equipment in two main places, so the 8140 is a great solution to allow me to lock all of them to the house standard.


(The rubidium is in the chunky aluminium chassis underneath the 8140)

Another benefit of the 8140 is that many modern pieces of equipment (such as my [HP/Agilent/]Keysight oscilloscope) have a single connector for reference frequency in/out, and should the external frequency ever go away it will switch back to its internal reference, but also send that back out the connector, which could lead to other devices sharing the same signal switching to it. The easy way to avoid that is to use a dedicated port from a distribution amplifier for each device like this, which works well enough until you have this situation in multiple locations.

As previously mentioned, the 8140 system uses pods to add outputs. While these pods are still available quite cheaply used on eBay (as of this writing for as low as US$8, though ~US$25/pod has been common for a while), the cost of shipping to Australia has recently gone up to the point where I started planning to make my own.

By making my own pods I also get to add features that the original pods didn't have[1]. I started with a quad-output pod with optional internal line termination. This allows me to have feeds for multiple devices with the annoying behaviour I mentioned earlier. The enclosure is a Pomona model 4656, with the board designed to slot in, and offer pads for the BNC pins to solder to for easy assembly.



This pod uses a Linear Technology (now Analog Devices) LTC6957 buffer for the input stage, replacing the discrete transistor & logic gate combined input stage of the original devices. The most notable change is that this stage works reliably down to -30dBm input (possibly further, I couldn't test beyond that), whereas the original pods stop working right around -20dBm.

As it turns out, although it can handle lower input signal levels, in other ways, including power usage, it seems very similar. One notable downside is that the chip tops out at 4v absolute maximum input, so a separate regulator is used just to feed this chip. The main regulator has also been changed from a 7805 to an LD1117 variant.

On this version the output stage is the same TI 74S140 dual 4-input NAND gate as was used on the original pods, just in SOIC form factor.

As with the next board, there is one error on this board: the wire loop that forms the ground connection was intended to fit a U-type pin header, however the footprint I used on the boards was just too tight to allow the pins through, so I've used some thin bus wire instead.



The second major variant I designed was a combo version, allowing sine & square outputs by just switching a jumper, or isolated[2] or line-regenerator (8040TA from Spectracom) versions with a simple sub-board containing just an inductor (TA) or 1:1 transformer (isolated).



This is the second revision of that board, where the 74S140 has been replaced by a modern TI 74LVC1G17 buffer. This version of the pod, set for sine output, uses almost exactly 30mA of power (since both the old & new pods use linear supplies that's the most sensible unit), whereas the original pods are right around 33mA. The empty pads at the bottom-left are simply placeholders for two 100 ohm resistors to add 50 ohm line termination if desired.

The board fits into the Pomona 2390 "Size A" enclosures, or for the isolated version the Pomona 3239 "Size B". This is the reason the BNC connectors have to be extended to reach the board, on the isolated boxes the BNC pins reach much deeper into the enclosure.

If the jumpers were removed, plus the smaller buffer it should be easy to fit a pod into the Pomona "Miniature" boxes too.



I was also due to create some new personal business cards, so I arranged the circuit down to a single layer (the only jumper is the requirement to connect both ground pins on the connectors) and merged it with some text converted to KiCad footprints to make a nice card on some 0.6mm PCBs. The paper in that photo is covering the link to the build instructions, which weren't written at the time (they're *mostly* done now, I may update this post with the link later).

Finally, while I was out travelling at the start of April my new (to me) HP 4395A arrived so I've finally got some spectrum output. The output is very similar between the original and my version, with the major notable difference being that my version is 10dB worse at the third harmonic. I lack the equipment (and understanding) to properly measure phase noise, but if anyone in AU/NZ wants to volunteer their time & equipment for an afternoon I'd love an excuse for a field trip.



Spectrum with input sourced from my house rubidium (natively a 5MHz unit) via my 8140 line. Note that despite saying "ExtRef" the analyzer is synced to its internal 10811 (which is an optional unit, and uses an external jumper, hence the display note).



Spectrum with input sourced from the analyzer's own 10811, and power from the DC bias generator also from the analyzer.


1: Or at least I didn't think they had, I've since found out that there was a multi output pod, and one is currently in the post heading to me.
2: An option on the standard Spectracom pods, albeit a rare one.

April 22, 2019

Pi-hole with DNS over TLS on Fedora

Quick and dirty guide to using Pi-hole with Stubby to provide both advertisement blocking and DNS over TLS. I’m using Fedora 29 ARM server edition on a Raspberry Pi 3.

Download Fedora server ARM edition and write it to an SD card for the Raspberry Pi 3.

sudo fedora-arm-image-installer --resizefs --image=Fedora-Server-armhfp-29-1.2-sda.raw.xz --target=rpi3 --media=/dev/mmcblk0

Make sure your Raspberry Pi can already resolve DNS queries from some other source, such as your router or internet provider.

Log into the Fedora Server Cockpit web interface for the server (port 9090) and enable automatic updates from the Software tab. Otherwise you can do updates manually:

sudo dnf -y update && sudo reboot

Install Stubby

Install Stubby to forward DNS requests over TLS.

sudo dnf install getdns bind-utils

Edit the Stubby config file.

sudo vim /etc/stubby/stubby.yml

Set listen_addresses to localhost 127.0.0.1 on port 53000 (also set your preferred upstream DNS providers, if you want to change the defaults, e.g. CloudFlare).

listen_addresses:
  - 127.0.0.1@53000
  - 0::1@53000

Start and enable Stubby, checking that it’s listening on port 53000.

sudo systemctl restart stubby
sudo ss -lunp |grep 53000
sudo systemctl enable stubby

Stubby should now be listening on port 53000, which we can test with dig. The following command should return an IP address for google.com.

dig @localhost -p 53000 google.com

Next we’ll use Pi-hole as a caching DNS service to forward requests to Stubby (and provide advertisement blocking).

Install Pi-hole

Sadly, Pi-hole doesn’t support SELinux at the moment so set it to permissive mode (or write your own rules).

sudo setenforce 0
sudo sed -i s/^SELINUX=.*/SELINUX=permissive/g /etc/selinux/config

Install Pi-hole from their Git repository.

sudo dnf install git
git clone --depth 1 https://github.com/pi-hole/pi-hole.git Pi-hole
cd "Pi-hole/automated install/"
sudo ./basic-install.sh

The installer will run, install deps and prompt for configuration. When asked what DNS to use, select Custom from the bottom of the list.

Custom DNS servers

Set the server to 127.0.0.1 (note that we cannot set the port here, we’ll do that later)

Use local DNS server

In the rest of the installer, also enable the web interface and web server if you like, and allow it to modify the firewall, else this won’t work at all! 🙂 Make sure you take note of your admin password from the last screen, too.

Finally, add the port to our upstream (localhost) DNS server so that Pi-hole can forward requests to Stubby.

sudo sed -i '/^server=/ s/$/#53000/' /etc/dnsmasq.d/01-pihole.conf
sudo sed -i '/^PIHOLE_DNS_[1-9]=/ s/$/#53000/' /etc/pihole/setupVars.conf
sudo systemctl restart pihole-FTL

If you don’t want to muck around with localhost and ports you could probably add an IP alias and bind your Stubby to that on port 53 instead.

Testing

On a machine on your network, set /etc/resolv.conf to point to the IP address of your Pi-hole server to use it for DNS.

On the Pi-hole, use tcpdump to check incoming DNS requests and ensure they are arriving and being forwarded on the right ports.

sudo tcpdump -Xnn -i any port 53 or port 53000 or port 853

Back on your client machine, ping google.com and with any luck it will resolve.

For a new query, tcpdump on your Pi-hole box should show an incoming request from the client machine to your Pi-hole on port 53, a follow-up localhost request to 53000, then an outward request from your Pi-hole to 853, and finally the returned result back to your client machine.

You should also notice that the payload for the internal DNS queries is plain text, but the remote ones are encrypted.

Web interface

Start browsing around and see if you notice any difference where you’d normally see ads. Then jump onto the web interface on your Pi-hole box and take a look around.

Pi-hole web interface

If that all worked, you could get your DHCP server to point clients to your shiny new Pi-hole box (i.e. use DHCP options 6,<ip_address>).
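For example, if your DHCP server happens to be dnsmasq, that’s a single line in its config (with a hypothetical Pi-hole address of 192.168.1.10):

dhcp-option=6,192.168.1.10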

If you’re feeling extra brave, you could redirect all unencrypted DNS traffic on port 53 back to your internal DNS before it leaves your network, but that might be another blog post…


Vale Polly Samuel (1963-2017): On Dying & Death

Those of you who follow me on Twitter will know some of this already, but I’ve been meaning to write here for quite some time about all this. It’s taken me almost two years to write, because it’s so difficult to find the words to describe this. I’ve finally decided to take the plunge and finish it despite feeling it could be better, but if I don’t I’ll never get this out.

September 2016 was not a good month for my wonderful wife Polly, she’d been having pains around her belly and after prodding the GP she managed to get a blood test ordered. They had suspected gallstones or gastritis but when the call came one evening to come in urgently in the next morning for another blood test we knew something was up. After the blood test we were sent off for an ultrasound of the liver and with that out of the way went out for a picnic on Mount Dandenong for a break. Whilst we were eating we got another phone call from the GP, this time to come and pick up a referral for an urgent MRI. We went to pick it up but when they found out Polly had already eaten they realised they would need to convert to a CT scan. A couple of phone calls later we were booked in for one that afternoon. That evening was another call to come back to see the GP. We were pretty sure we knew what was coming.

The news was not good: Polly had “innumerable” tumours in her liver. Over 5 years after surgery and chemo for her primary breast cancer, and almost at the end of her 5 years of tamoxifen, the cancer had returned. We knew the deal with metastatic cancer, but it was still a shock when the GP said “you know this is not a curable situation”. So the next day (Friday) it was right back to her oncologist who took her off the tamoxifen immediately (as it was no longer working) and scheduled chemotherapy for the following Monday, after an operation to install a PICC line. He also explained what this meant, that this was just a management technique to (hopefully) try and shrink the tumours and make life easier for Polly for a while. It was an open question about how long that while would be, but we knew from the papers online that she had found looking at the statistics that it was likely months, not years, that we had. Polly wrote about it all at the time, far more eloquently than I could, with more detail, on her blog.

Chris, my husband, best pal, and the love of my life for 17 years, and I sat opposite the oncologist. He explained my situation was not good, that it was not a curable situation. I had already read that extensive metastatic spread to the liver could mean a prognosis of 4-6 months but if really favorable as long as 20 months.

The next few months were a whirlwind of chemo, oncology, blood tests, crying, laughing and loving. We were determined to talk about everything, and Polly was determined to prepare as quickly as she could for what was to come. They say you should “put your affairs in order” and that’s just what she did, financially, business-wise (we’d been running an AirBNB and all those bookings had to be canceled ASAP, plus of course all her usual autism consulting work) and personally. I was so fortunate that my work was so supportive and able to be flexible about my hours and days and so I could be around for these appointments.

Over the next few weeks it was apparent that the chemo was working, breathing & eating became far easier for her and a follow up MRI later on showed that the tumours had shrunk by about 75%. This was good news.

October 2016 brought Polly’s 53rd birthday, and so she set about planning a living wake for herself, with a heap of guests, music courtesy of our good friend Scott, a lot of sausages (and other food) and good weather. Polly led the singing and there was an awful lot of merriment. Such a wonderful time and such good memories were made that day.

Polly singing at her birthday party in 2016

That December we celebrated our 16th wedding anniversary together at a lovely farm-stay place in the Yarra Valley before having what we were pretty sure was our last Christmas together.

Polly and Chris at the farm-stay for our wedding anniversary

But then in January came the news we’d been afraid of, the blood results were showing that the first chemo had run out of steam and stopped working, so it was on to chemo regime #2. A week after starting the new regime we took a delayed holiday up to the Blue Mountains in New South Wales (we’d had to cancel previously due to her diagnosis) and spent a long weekend exploring the area and generally having fun.

Polly and Chris at Katoomba, NSW

But in early February it was clear that the second line chemo wasn’t doing anything, and so it was on to the third line chemo. Polly had also been having fluid build up in her abdomen (called ascites) and we knew they would have to start draining that at some point; February was that point. We spent the morning of Valentine’s Day in the radiology ward where they drained around 4 litres from her! The upside was that it made life so much easier again for her. We celebrated by going for Valentine’s dinner to a really wonderful restaurant that we used for special events, something we hadn’t thought possible that morning!

Valentine's Day dinner at Copperfields

Two weeks after that we learned from the oncologist that the third line chemo wasn’t doing anything either and he had to give us the news that there wasn’t any treatment he could offer us that had any prospect of helping. Polly took that in her usual pragmatic and down-to-earth way, telling the oncologist that she didn’t see him as the reaper but as her fairy godfather who had given her months of extra quality time and bringing a smile to his and my face. She also asked whether the PICC line (which meant she couldn’t have a bath, just shower with a protective cover over it) could come out and the answer was “yes”.

The day before that news we had visited the palliative ward there for the first time. Polly had a hard time with hospitals, and so we spent time talking to the staff, visiting rooms, and Polly all the time reframing it to reduce and remove the anxiety. The magic words were “hotel-hospital”, which it really did feel like. We talked with the oncologist about how it all worked and what might happen.

We also had a home palliative team who would come and visit, help with pain management and be available on the phone at all hours to give advice and assist where they could. Polly felt uncertain about them at first as she wasn’t sure what they would make of her language issues and autism, but whilst they did seem a bit fazed at first by someone who was dealing with the fact that they were dying in such a blunt and straightforward manner things soon smoothed out.

None of this stopped us living, we continued to go out walking at our favourite places in our wonderful part of Melbourne, continued to see friends, continued to joke and dance and cry and laugh and cook and eat out.

Polly on minature steam train

Oh, and not forgetting putting a new paved area in so we could have a little outdoor fire area to enjoy with friends!

Chris laying paving slabs for the fire area

Polly and Morghana enjoying the fire!

But over time the ascites was increasing, with each drain being longer, with more fluid, and more taxing for Polly. She had decided that when it got to the point of needing two drains a week, that was enough and it would be time to call it a day. Then, on a Thursday evening after we’d had an afternoon laying paving slabs for another little patio area, Polly was having a bath whilst I was researching some new symptoms that had appeared, and when Polly emerged I showed her what I had found. The symptoms matched what happens when the pressure that causes the ascites gets high enough to push blood back down other pathways, and as we read what else could lie in store Polly decided that was enough.

That night Polly emailed the oncologist to ask them to cancel her drain which was scheduled for the next day and instead to book her into the palliative ward. We then spent our final night together at home, before waking the next day to get the call to confirm that all was arranged from their end and that they would have a room by 10am, but to arrive when was good for us. Friends were informed and Polly and I headed off to the palliative ward, saying goodbye to the cats and leaving our house together for the very last time.

Arriving at the hospital we dropped in to see the oncology, radiology and front-desk staff we knew to chat with them before heading up to the palliative ward to meet the staff there and set up in the room. The oncologist visited and we had a good chat about what would happen with pain relief and sedation once Polly needed it. Shortly after, our close friends Scott and Morghana arrived from different directions, and since I had brought Polly’s laptop and a 4G dongle, Polly’s good Skype pal Marisol joined us virtually too. We shared a (dairy free) Easter egg, some raspberry lemonade and even some saké! We had brought in a portable stereo and CD’s and danced and sang and generally made merry – which was so great.

After a while Polly decided that she was too uncomfortable and needed the pain relief and sedation, so everything was put in its place and we all said our goodbyes to Polly as she was determined to do the final stages on her own, and she didn’t want anyone around in case it caused her to try and hang on longer than she really should. I told her I was so proud of her and so honoured to be her husband for all this time. Then we left, as she wished, with Scott and Morghana coming back with me to the house. We had dinner together at the house and then Morghana left for home and Scott kindly stayed in the spare room.

The next day Scott and I returned to the hospital, Polly was still sleeping peacefully so after a while he and I had a late lunch together, making sure to fulfil Polly’s previous instructions to go enjoy something that she couldn’t, and then we went our separate ways. I had not been home long before I got the call from the hospital – Polly was starting to fade – so I contacted Scott and we both made our way back there again. The staff were lovely, they managed to rustle up some food for us as well as tea and coffee and would come and check on us in the waiting lounge, next door to where Polly was sleeping. At one point the nurse came in and said “you need a hug, she’s still sleeping”. Then, a while after, she came back in and said “I need a hug, she’s gone…”.

I was bereft. Whilst intellectually I knew this was inevitable, the reality of knowing that my life partner of 17 years was gone was so hard. The nurse told us that we could see Polly now, and so Scott and I went to see her to say our final goodbye. She was so peaceful, and I was grateful that things had gone as she wanted and that she had been able to leave on her own terms and without the greater discomforts and pain that she was worried would still be coming. Polly had asked us to leave a CD on, and as we were leaving the nurses said to us “oh, we changed the CD earlier on today because it seemed strange to just have the one on all the time. We put this one on by someone called ‘Donna Williams’, it was really nice.” So they had, unknowingly, put her own music on to play her out.

As you would expect if you had ever met Polly she had put all her affairs in order, including making preparations for her memorial as she wanted to make things as easy for me as possible. I arranged to have it live streamed for friends overseas and as part of that I got a recording of it, which I’m now making public below. Very sadly her niece Jacqueline, who talks at one point about going ice skating with her, has also since died.

Polly and I were so blessed to have 16 wonderful years together, and even at the end the fact that we did not treat death as a taboo and talked openly and frankly about everything (both as a couple and with friends) was such a boon for us. She made me such a better person and will always be part of my life, in so many ways.

Finally, I leave you with part of Polly’s poem & song “Still Awake”:

Time is a thief, which steals the chances that we never get to take.
It steals them while we are asleep.
Let’s make the most of it, while we are still awake.

Polly at Cardinia Reservoir, late evening

This item was originally posted as “Vale Polly Samuel (1963-2017): On Dying & Death”.

April 20, 2019

Now migrated to Drupal 8! kattekrab Sat, 20/04/2019 - 22:08

Leadership, and teamwork. kattekrab Fri, 13/04/2018 - 04:09

Makarrata kattekrab Thu, 14/06/2018 - 20:19

Communication skills for everyone kattekrab Sat, 17/03/2018 - 13:01

DrupalCon Nashville kattekrab Sat, 17/03/2018 - 22:01

Powerful Non Defensive Communication (PNDC) kattekrab Sun, 10/03/2019 - 09:00

I said, let me tell you now kattekrab Sat, 10/03/2018 - 09:56

The Five Whys kattekrab Sat, 16/06/2018 - 09:16

Site building with Drupal kattekrab Sat, 17/02/2018 - 14:05

Six years and 9 months... kattekrab Sat, 27/10/2018 - 13:05

April 17, 2019

Programming an AnyTone AT-D878UV on Linux using Windows 10 and VirtualBox

I recently acquired an AnyTone AT-D878UV DMR radio which is unfortunately not supported by chirp, my usual go-to free software package for programming amateur radios.

Instead, I had to set up a Windows 10 virtual machine so that I could program the radio using the manufacturer's computer programming software (CPS).

Install VirtualBox

Install VirtualBox:

apt install virtualbox virtualbox-guest-additions-iso

and add your user account to the vboxusers group:

adduser francois vboxusers

to make file sharing between the host and the guest work.

Finally, reboot to ensure that group membership and kernel modules are all set.

Create a Windows 10 virtual machine

Create a new Windows 10 virtual machine within VirtualBox. Then download a Windows 10 installation image from Microsoft and start the virtual machine with the .iso file mounted as an optical drive.
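
If you would rather script the VM creation from the host, VBoxManage can do the equivalent of the GUI steps; here is a rough sketch (the VM name, memory size, disk size and ISO filename are all arbitrary choices, not anything this setup requires):

VBoxManage createvm --name win10 --ostype Windows10_64 --register
VBoxManage modifyvm win10 --memory 4096 --cpus 2 --nic1 nat
VBoxManage createmedium disk --filename win10.vdi --size 40960
VBoxManage storagectl win10 --name SATA --add sata
VBoxManage storageattach win10 --storagectl SATA --port 0 --device 0 --type hdd --medium win10.vdi
VBoxManage storageattach win10 --storagectl SATA --port 1 --device 0 --type dvddrive --medium Win10.iso
VBoxManage startvm win10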

Follow the instructions to install Windows 10, paying attention to the various privacy options you will be offered.

Once Windows is installed, mount the host's /usr/share/virtualbox/VBoxGuestAdditions.iso as a virtual optical drive and install the VirtualBox guest additions.

Installing the CPS

With Windows fully set up, it's time to download the latest version of the computer programming software.

Unpack the downloaded file and then install it as administrator (right-click on the .exe and choose "Run as administrator").

Do NOT install the GD driver update or the USB driver; neither appears to be necessary.

Program the radio

First, you'll want to download from the radio to get a starting configuration that you can change.

To do this:

  1. Turn the radio on and wait until it has finished booting.
  2. Plug the USB programming cable into the computer and the radio.
  3. From the CPS menu choose "Set COM port".
  4. From the CPS menu choose "Read from radio".

Save this original codeplug to a file as a backup in case you need to easily reset back to the factory settings.

To program the radio, follow this handy third-party guide since it's much better than the official manual.

You should be able to use the "Write to radio" menu option without any problems once you're done creating your codeplug.

April 13, 2019

Secure ssh-agent usage

ssh-agent was in the news recently due to the matrix.org compromise. The main takeaway from that incident was that one should avoid the ForwardAgent (or -A) functionality when ProxyCommand will do, and consider multi-factor authentication on the server-side, for example using libpam-google-authenticator or libpam-yubico.
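
For instance, instead of ssh -A through a bastion, a ProxyJump entry in ~/.ssh/config keeps your keys on the local machine (the hostnames here are placeholders):

Host internal.example.com
    ProxyJump bastion.example.com

With that in place, ssh internal.example.com tunnels through the bastion without ever exposing your agent to it.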

That said, there are also two options to ssh-add that can help reduce the risk of someone else with elevated privileges hijacking your agent to make use of your ssh credentials.

Prompt before each use of a key

The first option is -c which will require you to confirm each use of your ssh key by pressing Enter when a graphical prompt shows up.

Simply install an ssh-askpass frontend like ssh-askpass-gnome:

apt install ssh-askpass-gnome

and then use this when adding your key to the agent:

ssh-add -c ~/.ssh/key

Automatically removing keys after a timeout

ssh-add -D will remove all identities (i.e. keys) from your ssh agent, but requires that you remember to run it manually once you're done.

That's where the second option comes in. Specifying -t when adding a key will automatically remove that key from the agent once the timeout expires.

For example, I have found that this setting works well at work:

ssh-add -t 10h ~/.ssh/key

where I don't want to have to type my ssh password every time I push a git branch.

At home on the other hand, my use of ssh is more sporadic and so I don't mind a shorter timeout:

ssh-add -t 4h ~/.ssh/key

Making these options the default

I couldn't find a configuration file to make these settings the default and so I ended up putting the following line in my ~/.bash_aliases:

alias ssh-add='ssh-add -c -t 4h'

so that I can continue to use ssh-add as normal and not have to remember to include these extra options.

April 11, 2019

Using a MCP4921 or MCP4922 as a SPI DAC for Audio on Raspberry Pi

I’ve been playing recently with using a MCP4921 as an audio DAC on a Raspberry Pi Zero W, although a MCP4922 would be equivalent (the ’22 is a two channel DAC, the ’21 is a single channel DAC). This post is my notes on where I got to before I decided this approach wasn’t going to work out for me.

My basic requirement was to be able to play sounds on a raspberry pi which already has two SPI buses in use. Thus, adding a SPI DAC seemed like a logical choice. The basic circuit looked like this:

MCP4921 SPI DAC circuit

Driving this circuit looked like this (noting that this code was a prototype and isn’t the best ever). The bit that took a while there was realising that the CS line needs to be toggled between 16-bit writes. Once that had been done (which meant moving to a different spidev call), things were on the up and up.
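
For reference, a minimal sketch of that idea using the py-spidev library (this is not the linked prototype; the bus/device numbers, clock speed and test-tone loop are assumptions):

import math
import spidev

spi = spidev.SpiDev()
spi.open(0, 0)              # assuming the DAC is on SPI bus 0, chip select 0
spi.max_speed_hz = 1000000

def write_dac(value):
    # 0x3000 sets the MCP4921 config bits: channel A, unbuffered,
    # 1x gain, output enabled; the low 12 bits are the sample.
    word = 0x3000 | (value & 0x0FFF)
    # xfer() starts a new transaction per call, so CS toggles
    # between 16-bit writes as the chip requires.
    spi.xfer([word >> 8, word & 0xFF])

# A crude test tone; note there is no sample-rate pacing here,
# which is exactly the timing problem described below.
while True:
    for i in range(100):
        write_dac(int(2047 + 2047 * math.sin(2 * math.pi * i / 100)))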

This was the point I realised that I was at a dead end. I can’t find a way to send the data to the DAC in a way which respects the timing of the audio file. Before I had to do small writes to get the CS line to toggle I could do things fast enough, but not afterwards. Perhaps there’s a DMA option instead, but I haven’t found one yet.

Instead, I think I’m going to go and try PWM based audio. If that doesn’t work, it will be a MAX219 i2c DAC for me!

April 10, 2019

Audiobooks – March 2019

An Economist Gets Lunch: New Rules for Everyday Foodies by Tyler Cowen

A huge amount of practical advice on how and where to find the best food, both locally and abroad. Plus good explanations as to why. 8/10

The Not-Quite States of America: Dispatches from the Territories and Other Far-Flung Outposts of the USA by Doug Mack

Writer tours the not-states of the USA. A bit too fluffy most of the time & too much hanging with US expats. Some interesting bits. 6/10

Shattered: Inside Hillary Clinton’s Doomed Campaign by Jonathan Allen & Amie Parnes

Chronology of the campaign based on background interviews with staffers. A reader needs a good knowledge of the race, since this is assumed. Interesting enough. 7/10

Rush Hour by Iain Gately

A history of commuting (from the early railway era), how it has driven changes in housing, work and society. Plus lots of other random stuff. Very pleasant. 8/10

Share

April 08, 2019

1-Wire home automation tutorial from linux.conf.au 2019, part 2

For the actual on-the-day work, delegates were handed a link to these instructions in github. If you’re playing along at home, you should probably read 1-Wire home automation tutorial from linux.conf.au 2019, part 1 before attempting the work described here. It’s especially important that you know the IP address of your board, for example.

Relay tweaks

The instructions are pretty self-explanatory, although I did get confused about where to connect the relay as I couldn’t find PC8 in my 40 pin header diagrams. That’s because the shields for the tutorial have a separate header which is a bit more convenient:

GPIO header

I was also a bit confused when the relay didn’t work initially, but that turned out to be because I’d misunderstood the wiring. The relay needs to be powered from the 3.3v pin on the 40 pin header, as there is a PCB error which puts 5v on the pins labelled as 3.3v on the GPIO header. I ended up with jumper wires which looked like this:

Cabling the relay

1-Wire issues

Following the tutorial instructions worked well from then on, until I tried to get 1-Wire set up. The owfs2mqtt bridge plugin was logging this:

2019-04-08 19:23:55.075: /opt/OWFS-MQTT-Bridge/lib/Daemon/OneWire.pm:148:Daemon::logError(): Connection to owserver failed: Can't connect owserver: Address not available

Debugging that involved connecting to the owfs2mqtt docker container (hint: ssh to the Orange Pi, do a docker ps, and then run bash inside the docker container for the addon). Running owserver with debug produces this:
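
In concrete terms, that hop looks something like this (substitute your board's address and whatever container ID docker ps reports for the addon):

ssh root@<orange-pi-address>
docker ps
docker exec -it <addon-container-id> /bin/bash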

owserver fails

Sorry to post that as an image; cut and paste from the hassos ssh server doesn’t like me for some reason. I suspect I have a defective DS2482, but I’ll have to wait and see what Allistair says.

1-Wire home automation tutorial from linux.conf.au 2019, part 1

I didn’t get much of a chance to work through the home automation tutorial at linux.conf.au 2019 because I ended up helping others in the room get their Orange Pis booting. Now that things have settled down after the conference, I’ve had a chance to actually do some of the tutorial myself. These are my notes so I can remember what I did later…

Pre-tutorial setup

You need to do the pre-tutorial setup first. I use Ubuntu, which means it’s important that I use 18.10 or greater so that st-link is packaged. Apart from that, the instructions as written just worked.

You also need to download the image for the SD card, which was provided on the day at the conference; the URL for it is on github. Download that image, decompress it, and then flash it to an SD card using something like Balena Etcher. The tutorial used 32 GB SD cards, but the image will fit on something smaller than that.

hassos also doesn’t put anything on the Orange Pi HDMI port when it boots, so your machine is going to look like it didn’t boot. That’s expected. For the tutorial we provided a mapping from board number (MAC address, effectively) to the IP address allocated in the tutorial. At home, if you’re using an Orange Pi that isn’t from the conference, you’re going to have to find another way to determine the IP address of your Orange Pi.

The way we determined MAC addresses and so forth for the boards used at the conference was to boot an Armbian image and then run a simple python script which performed some simple checks of each board by logging into the board over serial. The MAC addresses for the boards handed out on the day are on github.

An Aside: Serial on the Orange Pi Prime

As an aside, the serial on the Orange Pi Prime is really handy, especially with the hassos image. Serial is exposed by a three pin header on the board, which is sort of labelled:

The Orange Pi Prime Serial Port

Noting that you’ll need to bend the pins of the serial header a little if you’re using the shield from the conference:

Serial port connected to a USB to serial converter

The advantage being that suddenly you get useful debugging information! The serial connection is 115200 baud 8N1 (8 data bits, no parity, 1 stop bit) by the way.
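
With a typical USB-to-serial adapter, connecting is a one-liner (the device node may differ on your machine):

screen /dev/ttyUSB0 115200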

Serial debug information from an hassos boot

The hassos image used for the conference allows login as root with no password over serial, which dumps you into a hass interface. Type “login” to get a bash prompt, even though it’s not in the list of commands available. At this point you can use the “ip address” command to work out what address DHCP handed the board.

The actual on-the-day work

So at this point we’re about as ready as people were meant to be when they walked into the room for the tutorial. I’ll write more notes when I complete the actual tutorial.

April 05, 2019

Introducing GangScan

As some of you might know, I am a Scout Leader. One of the things I do for Scouts is I assist in a minor role with the running of Canberra Gang Show, a theatre production for young people.

One of the things Gang Show cares about is that they need to be able to do rapid roll calls and reporting on who is present at any given time — this is used for working out who is absent before a performance (and therefore needs an understudy), as well as ensuring we know where everyone is in an environment that sometimes has its fire suppression systems isolated.

Before I came along, Canberra Gang Show was doing this with a Windows-based attendance tracking application, and 125kHz RFID tags. This system worked just fine, except that the software was clunky and there was only one badge reader — we struggled to explain to youth that they needed to press the “out” button when logging out, and we wanted to be able to have attendance trackers at other locations in the theatre instead of forcing everyone to flow through a single door.

So, I got thinking. How hard can it be to build something a bit better?

Let’s start with some requirements: simple to deploy and manage; free software (both cost and freedom); more badge readers than what we have now; and low cost.

My basic proposal for such a thing is a Raspberry Pi Zero W, with a small LCD screen and a RFID reader. The device scans badges, and displays a confirmation of scan to the user. If the device can talk to a central server it streams events to it; otherwise it queues them until the server is available and then streams them.
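
To make that queue-and-stream behaviour concrete, the core loop can be very small. This is just a sketch of the idea, not the actual GangScan code; the queue path, server URL and event format are all made up:

import json
import os
import requests

QUEUE = "/var/lib/gangscan/queue.jsonl"      # hypothetical on-disk queue
SERVER = "http://server.example.com/events"  # hypothetical endpoint

def record_scan(event):
    # Queue first, so a crash or network outage never loses a scan.
    with open(QUEUE, "a") as f:
        f.write(json.dumps(event) + "\n")
    flush_queue()

def flush_queue():
    # Stream queued events to the server, stopping at the first failure.
    if not os.path.exists(QUEUE):
        return
    with open(QUEUE) as f:
        events = [json.loads(line) for line in f]
    sent = 0
    for event in events:
        try:
            requests.post(SERVER, json=event, timeout=2).raise_for_status()
            sent += 1
        except requests.RequestException:
            break
    # Keep whatever could not be sent for next time.
    with open(QUEUE, "w") as f:
        for event in events[sent:]:
            f.write(json.dumps(event) + "\n")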

Sourcing a simple SPI LCD screen and SPI RFID reader from ebay wasn’t too hard, and we were off! The only real wart was that I wanted to use 13.56 MHz RFID cards, because then I could store some interesting (up to 1 KB) data on the card itself. The first version was simply a ribbon cable:

v0.0, a ribbon cable

Which then led to me having my first PCB ever made. Let’s ignore that it’s the wrong size, shall we?

v0.1, an incorrectly sized PCB

I’m now at the point where the software for the scanner is reasonable, and there is a bare bones server that does enough roll call that it should be functional. I am sure there’s more to be done, but it works enough to demo. One thing I learned while showing off the device at coffee the other day is that it really needs to make a noise when you scan a badge. I’ve ordered a SPI DAC to play with, which might be the solution there. Other next steps include a newer version of the PCB, and some sort of case solution. I’ll do another post when things progress further.

Oh yes, and I’ll eventually release the software too, once it’s in a more workable state.

April 04, 2019

React Isn’t The Problem

As React (via Gutenberg) becomes more present in the WordPress world, I’m seeing some common themes pop up in conversations about it. I spoke a bit about this kind of thing at WordCamp US last year, but if you don’t feel like sitting through a half hour video, let me summarise my thoughts. 🙂

I agree that React is hard. I strongly disagree with the commonly contrasted view that HTML, CSS, PHP, or vanilla JavaScript are easy. They’re all just as hard to work with as React, sometimes more-so, particularly when having to deal with the exciting world of cross-browser compatibility.

The advantage that PHP has over modern JavaScript development isn’t that it’s easy, or that the tooling is better, or more reliable, or anything like that. The advantage is that it’s familiar. If you’re new to web development, React is just as easy as anything else to start with.

I’m honestly shocked when someone manages to wade through the mess of tooling (even pre-Gutenberg) to contribute to WordPress. It’s such an incomprehensible, thankless, unreliable process that the tenacity of anyone who makes it out the other side should be applauded. That said, this high barrier is unacceptable.

I’ve been working in this industry for long enough to have forgotten the number of iterations of my personal development environment I’ve gone through, to get to where I can set up something for myself which isn’t awful. React wasn’t around for all of that time, so that can’t be the reason web development has been hard for as long as I remember. What is, then?

Doing Better

Over the past year or so, I’ve been tinkering with a tool to help deal with the difficulties of contributing to WordPress. That tool is called TestPress; it’s getting pretty close to being usable, at least on MacOS. Windows support is a little less reliable, but getting better. 🙂 If you enjoy tinkering with tools, too, you’re welcome to try out the development version, but it does still have some bugs in it. Feedback and PRs are always welcome! There are some screenshots in this issue that give an idea of what the experience is like, if you’d like to check it out that way.

TestPress is not a panacea: at best, it’s an attempt at levelling the playing field a little bit. You shouldn’t need years of experience to build a reliable development environment, that should be the bare minimum we provide.

React is part of the solution

There’s still a lot of work to do to make web development something that anyone can easily get into. I think React is part of the solution to this, however.

React isn’t without its problems, of course. Modern JavaScript can encourage iteration for the sake of iteration. Certainly, there’s a drive to React-ify All The Things (a trap I’m guilty of falling into, as well). React’s development model is fundamentally different to that of vanilla JavaScript or jQuery, which is why it can seem incomprehensible if you’re already well versed in the old way of doing things: it requires a shift in your mental model of how JavaScript works. This is a hard problem to solve, but it’s not insurmountable.

Perhaps a little controversially, I don’t think that React is guilty of causing the web to become less accessible. At worst, it’s continuing the long standing practice of web standards making accessibility an optional extra. Building anything beyond a basic, non-interactive web page with just HTML and CSS will inevitably cause accessibility issues, unless you happen to be familiar with the mystical combinations of accessible tags, or applying aria attributes, or styling your content in just the right way (and none of the wrong ways).

React (or any component-based development system, really) can improve accessibility for everyone, and we’re seeing this with Gutenberg already. By providing a set of base components for plugin and theme authors to use, we can ensure the correct HTML is produced for screen readers to work with. Much like desktop and mobile app developers don’t need to do anything to make their apps accessible (because it’s baked into the APIs they use to build their apps), web developers should have the same experience, regardless of the complexity of the app they’re building.

Arguing that accessibility needs to be part of the design process is the wrong argument. Accessibility shouldn’t be a consideration, it should be unavoidable.

Do Better

Now, can we do better? Absolutely. There’s always room for improvement. People shouldn’t need to learn React if they don’t want to. They shouldn’t have to deal with the complexities of the WCAG. They should have the freedom to tinker, and the reassurance that they can tinker without breaking everything.

The pre-React web didn’t arrive in its final form, all clean, shiny, and perfect. It took decades of evolution to get there. The post-React web needs some time to evolve, too, but it has the benefit of hindsight: we can compress the decades of evolving into a much shorter time period, provide a fresh start for those who want it, while also providing backwards compatibility with the existing ways of doing things.

April 03, 2019

Easybuild: Building Software with Ease

Building software from source is necessary for performance and development reasons. However, this can come with complex dependency and compiler requirements, which have to be explicitly stated in research computing to ensure replication of results. EasyBuild, originally developed by the Jülich Supercomputing Centre, Ghent University, and the Texas Advanced Computing Center, is a tool that allows the building of software with ease, managing the complex dependencies and toolchains, and integrating by default with the Lmod environment modules system.

This presentation will outline the need for tools like EasyBuild, and describe the framework of Easyblocks, Toolchains, Easyconfig recipes, and extensions. It will also describe how to install and configure EasyBuild, write and contribute configuration files, and use the configurations to install software with a variety of optional parameters, such as rebuilds and partial builds. Finally, it will conclude with a discussion of some of the more advanced options and opportunities for involvement in the EasyBuild community.
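
As a taste of what that looks like in practice, searching for and installing a package together with its full dependency chain is a couple of commands (the easyconfig name here is just an example):

eb --search HPL
eb HPL-2.3-foss-2018b.eb --robot

The --robot option tells EasyBuild to resolve and build any missing dependencies, and the result is exposed through Lmod as a loadable module.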

Easybuild: Building Software with Ease
Presentation to Linux Users of Victoria, 2nd April, 2019

April 01, 2019

Article Review: Curing the Vulnerable Parser

Every once in a while I read papers or articles. Previously, I've just read them myself, but I was wondering if there were more useful things I could do beyond that. So I've written up a summary and my thoughts on an article I read - let me know if it's useful!

I recently read Curing the Vulnerable Parser: Design Patterns for Secure Input Handling (Bratus, et al; USENIX ;login: Spring 2017). It's not a formal academic paper but an article in the Usenix magazine, so it doesn't have a formal abstract I can quote, but in short it takes the long history of parser and parsing vulnerabilities and uses that as a springboard to talk about how you could design better ones. It introduces a toolkit based on that design for more safely parsing some binary formats.

Background

It's worth noting early on that this comes out of the LangSec crowd. They have a pretty strong underpinning philosophy:

The Language-theoretic approach (LANGSEC) regards the Internet insecurity epidemic as a consequence of ad hoc programming of input handling at all layers of network stacks, and in other kinds of software stacks. LANGSEC posits that the only path to trustworthy software that takes untrusted inputs is treating all valid or expected inputs as a formal language, and the respective input-handling routines as a recognizer for that language. The recognition must be feasible, and the recognizer must match the language in required computation power.

A big theme in this article is predictability:

Trustworthy input is input with predictable effects. The goal of input-checking is being able to predict the input’s effects on the rest of your program.

This seems sensible enough at first, but leads to some questionable assertions, such as:

Safety is predictability. When it's impossible to predict what the effects of the input will be (however valid), there is no safety.

They follow this with an example of Ethereum contracts stealing money from the DAO. The example is compelling enough, but again comes with a very strong assertion about the impossibility of securing a language virtual machine:

From the viewpoint of language-theoretic security, a catastrophic exploit in Ethereum was only a matter of time: one can only find out what such programs do by running them. By then it is too late.

I'm not sure that (a) I buy the assertions, or that (b) they provide a useful way to deal with the world as we find it.

Is this even correct?

You can tease out two contentions in the first part of the article:

  • there should be a formal language that describes the data, and
  • this language should be as simple as possible, ideally being regular and context-free.

Neither of these are bad ideas - in fact they're both good ideas - but I don't know that I draw the same links between them and security.

Consider PostScript as a possible counter-example. It's a Turing-complete language, so it absolutely cannot have predictable results. It has a well documented specification and executes in a restricted virtual machine. So let's say that it satisfies only the first plank of their argument.

I'd say that PostScript has a good security record, despite being Turing complete. PostScript has been around since 1985 and, apart from the recent bugs in GhostScript, it doesn't have a long history of bugs and exploits. Maybe this is just because no-one has really looked, or maybe it is possible to have reasonably safe complex languages by restricting the execution environment, as PostScript consciously and deliberately does.

Indeed, if you consider the recent spate of GhostScript bugs, perhaps some may be avoided by stricter compliance with a formal language specification. However, most seem to me to arise from the desirability of implementing some of the PostScript functionality in PostScript itself, and some of the GhostScript-specific, stupendously powerful operators exposed to the language to enable this. The bugs involve tricks to allow a user to get access to these operators. A non-Turing-complete language may be sufficient to prevent these attacks, but it is not necessary: just not doing this sort of meta-programming with such dangerous operators would also have worked. Storing the true values of the security state outside of a language-accessible object would also be good.

Is this a useful way to deal with the world as we find it?

My main problem with the general LangSec approach that this article takes is this: to get to their desired world, we need to rewrite a bunch of things with entirely different language foundations. The article talks about HTML and PDFs as examples of unsafe formats, but I cannot imagine the sudden wholesale replacement of either of these - although I would love to be proven wrong.

Can we get even part of the way with existing standards? Kinda-sorta, but mostly no, and to the authors' credit, they are open about this. They argue that the formal definition the parser accepts should be the "most restrictive input definition" - they specifically require you to "give up attempting to accept arbitrarily complex data", and call for "subsetting of many protocols, formats, encodings and command languages, including eliminating unneeded variability and introducing determinism and static values".

No doubt we would be in a better place if people took up these ideas for future programs. However, for our current set of programs and use cases, this is probably not tractable in any meaningful way.

The rest of the paper

The rest of the paper is reasonably interesting. Their general theory is that you should build your parsers based on a formal definition of a language, and that the parser should convert the input data to a set of objects, and then your business logic should deal with those objects. This is the 'recognizer pattern', and is illustrated below:

The recognizer pattern: separate code parses input according to a formal grammar, creating valid objects that are passed to the business logic
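
To make the pattern concrete, here is a toy recognizer in Python for a deliberately trivial (regular) language; the grammar and message format are invented purely for illustration:

import re
from dataclasses import dataclass

@dataclass
class Greeting:
    name: str

# The formal grammar: greeting = "HELLO " 1*ALPHA
GRAMMAR = re.compile(r"^HELLO ([A-Za-z]+)$")

def recognize(raw: bytes) -> Greeting:
    # The recognizer either produces a valid object or rejects the input.
    m = GRAMMAR.match(raw.decode("ascii"))
    if m is None:
        raise ValueError("input rejected: not in the language")
    return Greeting(name=m.group(1))

def handle(greeting: Greeting) -> str:
    # Business logic only ever sees objects that passed recognition.
    return "Hi, " + greeting.name + "!"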

In short, the article is full of great ideas if you happen to be parsing a simple language, or are designing not just a parser but a full language ecosystem. They do also provide a binary parser toolkit that might be helpful if you are parsing a binary format that can be expressed with a parser combinator.

Overall, however, I think the burden of maintaining old systems is such that a security paradigm that relies on new code is pretty unlikely, and one that relies on new languages is fatally doomed from the outset. New systems should take up these ideas, yes. But I'd really like to see people grappling with how to deal with the complex and irregular languages that we're stuck with (HTML, PDF, etc) in secure ways.

Praise-Singing Poppler Utilities

Last year I gave a presentation at Linux Users of Victoria entitled Being An Acrobat: Linux and PDFs (there was an additional discussion not in the presentation about embedding Javascript in a PDF and some related security issues, but that's for another post). Part of this presentation was singing the praises of Poppler Utilities (named after the Futurama episode, "The Problem with Popplers"). This is probably the single most common suite of tools I use when dealing with PDFs, with the sole exception of reading and creating them. There is an enormous number of times I have encountered an issue with PDFs and resolved it with something from the poppler-utils suite.

The entire relevant slide from the presentation is reproduced as follows:

Derived from xpdf, Poppler is a software library for rendering PDF documents and is used in a variety of PDF readers (including Evince, KPDF, LibreOffice, Inkscape, Okular, Zathura etc). A collection of tools, poppler-utils, is built on Poppler’s API and provides a variety of useful functions, e.g.:

pdffonts - lists the fonts used in a PDF (e.g., pdffonts filename.pdf)
pdfimages - extract images from a PDF (e.g., pdfimages -png filename.pdf images/)
pdfseparate - extract single pages from a PDF (e.g., pdfseparate sample.pdf sample-%d.pdf)
pdftohtml - convert PDF to HTML format retaining formatting
pdftops - convert PDF to printable PS format
pdftotext - extract text from a PDF
pdfunite - merges PDFs (pdfunite page{01..13}.pdf combined.pdf)

Recently I had an experience with this that illustrates a practical example of one of the tools. I am currently doing a MSc in Information Systems at the University of Salford. The course content itself is conducted through the Robert Kennedy College in Switzerland. For each assignment the course accepts uploads for one file and one file only, and only in particular formats as well (e.g., docx, pdf, rtf etc). An additional upload on the RKC system will overwrite one's previously submitted assignment file.

If you have multi-part components to an assignment, you will have to export them to a common format, combine them, and upload them as a single document. In a project management course, I ended up with several files, as the assignment demanded a title page, a slideshow with notes, a main body of the assignment (business case and project product plan), a Gantt chart (created through ProjectLibre), and a reference and appendix file.

At the end of the assignment, I had a Title Page file, a Part A file, a Part B Main Body file, a Part B ProjectLibre file, and a Refs and Appendix file. First I converted them all to PDFs. This is one of the file formats accepted, and is a common export format for the text, slideshow, and project. Then I used pdfunite to combine them into a single file, e.g.:

pdfunite title.pdf parta.pdf partb.pdf gannt.pdf refs.pdf assignment.pdf

Quite clearly RKC has a limited and arguably poorly designed upload system. But the options are either complain about it, give up, or work around it. Information systems science demands the latter, because we will encounter this all the time. We all have to learn how to work around system limitations. When it comes to PDF limitations, I have found that Poppler Utilities are one of the most useful tools available, and I use the various utilities with almost alarming regularity. So here's a few more that I didn't mention in my initial presentation due to time constraints:

pdfdetach - extract embedded documents from a PDF (e.g., pdfdetach --saveall filename.pdf)
pdfinfo - print file information from a PDF (title, subject, keywords, author, creator etc) (e.g., pdfinfo filename.pdf)
pdftocairo - convert pdf to a png/jpeg/tiff/ps/eps/svg using cairo (e.g., pdftocairo -svg filename.pdf filename.svg)
pdftoppm - convert pdf to Portable Pixmap bitmaps (e.g., pdftoppm filename.pdf filename)

March 30, 2019

Digital government: it all starts with open

This is a short video I did on the importance of openness for digital government, for the EngageTech Forum 2018. I’ve had a few people reuse it for other events so I thought I should blog it properly :) Please see the transcript below. 

<Conference introductory remarks>

I wanted to talk about why openness and engagement is so critical for our work in a modern public service.

For me, looking at digital government, it’s not just about digital services, it’s about how we transform governments for the 21st century: how we do service delivery, engagement, collaboration, and how we do policy, legislation and regulation. How we make public services fit for purpose so they can serve you, the people, communities and economy of the 21st century.

For me, a lot of people think about digital and think about technology, but open government is a founding premise, a founding principle for digital government. Open that’s not digital doesn’t scale, and digital that’s not open doesn’t last. That doesn’t just mean looking at things like open source, open content and open APIs, but it means being open. Open to change. Being open to people and doing things with people, not just to people.

There’s a fundamental cultural, technical and process shift that we need to make, and it all starts with open.

<closing conference remarks>

March 29, 2019

Installing an xmonad Environment on NixOS

NixOS Coin by Craige McWhirter

Xmonad is a very fast, reliable and flexible window manager for Linux and other related operating systems. As I recently shifted from Debian + Propellor to NixOS + NixOps, I now needed to redefine my Xmonad requirements for the new platform.

TL;DR

  • Grab my xmonad.nix file and import it in your /etc/nixos/configuration.nix
  • You can also grab my related xmonad.hs, xmobarrc and session files to use the complete setup.

An Example

At the time of writing, I used the below xmonad.nix to install my requirements. My current version can be found here.

# Configuration for my xmonad desktop requirements

{ config, pkgs, ... }:

{

  services.xserver.enable = true;                        # Enable the X11 windowing system.
  services.xserver.layout = "us";                        # Set your preferred keyboard layout.
  services.xserver.desktopManager.default = "none";      # Unset the default desktop manager.
  services.xserver.windowManager = {                     # Open configuration for the window manager.
    xmonad.enable = true;                                # Enable xmonad.
    xmonad.enableContribAndExtras = true;                # Enable xmonad contrib and extras.
    xmonad.extraPackages = hpkgs: [                      # Open configuration for additional Haskell packages.
      hpkgs.xmonad-contrib                               # Install xmonad-contrib.
      hpkgs.xmonad-extras                                # Install xmonad-extras.
      hpkgs.xmonad                                       # Install xmonad itself.
    ];
    default = "xmonad";                                  # Set xmonad as the default window manager.
  };

  services.xserver.desktopManager.xterm.enable = false;  # Disable NixOS default desktop manager.

  services.xserver.libinput.enable = true;               # Enable touchpad support.

  services.udisks2.enable = true;                        # Enable udisks2.
  services.devmon.enable = true;                         # Enable external device automounting.

  services.xserver.displayManager.sddm.enable = true;    # Enable the default NixOS display manager.
  services.xserver.desktopManager.plasma5.enable = true; # Enable KDE, the default NixOS desktop environment.

  # Install any additional fonts that I require to be used with xmonad
  fonts.fonts = with pkgs; [
    opensans-ttf             # Used in my xmobar configuration
  ];

  # Install other packages that I require to be used with xmonad.
  environment.systemPackages = with pkgs; [
    dmenu                    # A menu for use with xmonad
    feh                      # A light-weight image viewer to set backgrounds
    haskellPackages.libmpd   # Shows MPD status in xmobar
    haskellPackages.xmobar   # A Minimalistic Text Based Status Bar
    libnotify                # Notification client for my Xmonad setup
    lxqt.lxqt-notificationd  # The notify daemon itself
    mpc_cli                  # CLI for MPD, called from xmonad
    scrot                    # CLI screen capture utility
    trayer                   # A system tray for use with xmonad
    xbrightness              # X11 brightness and gamma software control
    xcompmgr                 # X compositing manager
    xorg.xrandr              # CLI to X11 RandR extension
    xscreensaver             # My preferred screensaver
    xsettingsd               # A lightweight desktop settings server
  ];

}

This provides my xmonad environment with everything I need for xmonad to run as configured.

Herringback

It occurs to me that I never wrote up the end result of the support ticket I opened with iiNet after discovering significant evening packet loss on our fixed wireless NBN connection in August 2017.

The whole saga took about a month. I was asked to run a battery of tests (ping, traceroute, file download and speedtest, from a laptop plugged directly into the NTD) three times a day for three days, then send all the results in so that a fault could be lodged. I did this, but somehow there was a delay in the results being communicated, so that by the time someone actually looked at them, they were considered stale, and I had to run the whole set of tests all over again. It’s a good thing I work from home, because otherwise there’s no way it would be possible to spend half an hour three times a day running tests like this. Having finally demonstrated significant evening slowdowns, a fault was lodged, and eventually NBN Co admitted that there was congestion in the evenings.

We have investigated and the cell which this user is connected to experiences high utilisation during busy periods. This means that the speed of this service is likely to be reduced, particularly in the evening when more people are using the internet.

nbn constantly monitors the fixed wireless network for sites which require capacity expansion and we aim to upgrade site capacity before congestion occurs, however sometimes demand exceeds expectations, resulting in a site becoming congested.

This site is scheduled for capacity expansion in Quarter 4, 2017 which should result in improved performance for users on the site. While we endeavour to upgrade sites on their scheduled date, it is possible for the date to change.

I wasn’t especially happy with that reply after a support experience that lasted for a month, but some time in October that year, the evening packet loss became less, and the window of time where we experienced congestion shrank. So I guess they did do some sort of capacity expansion.

It’s been mostly the same since then, i.e. slower in the evenings than during the day, but, well, it could be worse than it is. There was one glitch in November or December 2018 (poor speed / connection issues again, but this time during the day) which resulted in iiNet sending out a new router, but I don’t have a record of this, because it was a couple of hours of phone support that for some reason never appeared in the list of tickets in the iiNet toolbox, and even if it had, once a ticket is closed, it’s impossible to click it to view the details of what actually happened. It’s just a subject line, status and last modified date.

Fast forward to Monday March 25 2019 – a day with a severe weather warning for damaging winds – and I woke up to 34% packet loss, ping times all over the place (32-494ms), continual disconnections from IRC and a complete inability to use a VPN connection I need for work. I did the power-cycle-everything dance to no avail. I contemplated a phone call to support, then tethered my laptop to my phone instead in order to get a decent connection, and decided to wait it out, confident that the issue had already been reported by someone else after chatting to my neighbour.

Hideous packet loss, March 2019

Tuesday morning it was still horribly broken, so I unplugged the router from the NTD, plugged a laptop straight in, and started running ping, traceroute and speed tests. Having done that I called support and went through the whole story (massive packet loss, unusable connection). They asked me to run speed tests again, almost all of which failed immediately with a latency error. The one that did complete showed about 8Mbps down, compared to the usual ~20Mbps during the day. So iiNet lodged a fault, and said there was an appointment available on Thursday for someone to come out. I said fine, thank you, and plugged the router back in to the NTD.

Curiously, very shortly after this, everything suddenly went back to normal. If I was a deeply suspicious person, I’d imagine that because I’d just given the MAC address of my router to support, this enabled someone to reset something that was broken at the other end, and fix my connection. But nobody ever told me that anything like this happened; instead I received a phone call the next day to say that the “speed issue” I had reported was just regular congestion and that the tower was scheduled for an upgrade later in the year. I thanked them for the call, then pointed out that the symptoms of this particular issue were completely different to regular congestion and that I was sure that something had actually been broken, but I was left with the impression that this particular feedback would be summarily ignored.

I’m still convinced something was broken, and got fixed. I’d be utterly unsurprised if there had been some problem with the tower on the Sunday night, given the strong winds, and it took ’til mid-Tuesday to get it sorted. But we’ll never know, because NBN Co don’t publish information about congestion, scheduled upgrades, faults and outages anywhere the general public can see it. I’m not even sure they make this information consistently available to retail ISPs. My neighbour, who’s with a different ISP, sent me a notice that says there’ll be maintenance/upgrades occurring on April 18, then again from April 23-25. There’s nothing about this on iiNet’s status page when I enter my address.

There was one time in the past few years though, when there was an outage that impacted me, and it was listed on iiNet’s status page. It said “customers in the area of Herringback may be affected”. I initially didn’t realise that meant me, as I’d never heard of a suburb, region, or area called Herringback. Turns out it’s the name of the mountain our NBN tower is on.

LUV April 2019 Main Meeting: EasyBuild

Apr 2 2019 19:00
Apr 2 2019 21:00
Location: 
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

7:00 PM to 9:00 PM Tuesday, April 2, 2019
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

Speakers:

  • Lev Lafayette: EasyBuild

 

Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.


LUV April 2019 Workshop: Computerbank tour

Apr 27 2019 15:00
Apr 27 2019 19:00
Location: 
Computerbank Victoria, 1 Stawell St. North Melbourne

PLEASE NOTE DIFFERENT TIME AND LOCATION

Brian Salter-Duke will be giving a tour of the new premises of Computerbank Victoria at 1 Stawell St. North Melbourne starting at 15:00, followed by a meeting at Errol's Cafe at 69-71 Errol St. North Melbourne from 17:00 onward.

 

Linux Users of Victoria is a subcommittee of Linux Australia.


March 26, 2019

Brilliant Smart Wifi plug with Tasmota

A couple of weeks ago I was playing with Tuya derived smart light globes (Mirabella Genio from K-Mart Australia in my case, but there are a variety of other options as well). Now Bunnings has the Brilliant Smart Wifi Plug on special for $20 and I decided to give that a go as well, given it is also Tuya derived.

The basic procedure for OTA flashing was the same as flashing the globes, except that you hold down the button on the device for five seconds to put it into flash mode. That all worked brilliantly, until I appear to have fat fingered my wifi details in Tasmota — when I rebooted the device it never appeared on my network.

That would be much more annoying on the globes, but it turns out these smart plugs are really easy to open and that Tuya has documented the pin out of the controlling microprocessor. So, I ended up temporarily soldering some cables to the microprocessor to debug what had gone wrong. It should be noted that as a soldering person I make a great software engineer:

Jumper wires soldered to the serial port.

Once you’ve connected with a serial console, it’s pretty obvious who can’t be trusted to type their wifi password correctly:

I can’t have nice things.

At this point I was in the process of working out how to use esptool to re-flash the plug when I got super lucky. However, first… Where the hell is GPIO0 (the way you turn on flashing mode)? It’s not broken out on the pins for the MCU, but as a kind redditor pointed out, it is exposed on a pad on the back of the board:

The cunningly hidden GPIO0.

…and then I got lucky. You see, to put the MCU into flashing mode you short GPIO0 to ground. I was having trouble getting that to work with esptool, so I had a serial console attached to see what was happening. There I was, shorting GPIO0 out over and over trying to get the magic to work. However, Tasmota also sets up a button on GPIO0 by default, because the sonoffs have a hardware button on that pin. If you hit that button with four short presses, you put the device back into captive portal configuration mode.

Once I was back in that mode I could just use a laptop over wifi to re-enter the wifi password and I’m good to go. In hindsight I didn’t need the serial port if I could have powered the device and shorted that pin four times, but it sure was nice to be told what was happening on the serial console while poking around.
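
For the record, had the button trick not worked, re-flashing over the soldered serial connection with esptool would have been the fallback; something like the following, with GPIO0 held to ground at power-on (the port and the binary filename are assumptions):

esptool.py --port /dev/ttyUSB0 erase_flash
esptool.py --port /dev/ttyUSB0 write_flash 0x0 tasmota.bin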

March 22, 2019

Seeding sccache for Faster Brave Browser Builds

Compiling the Brave Browser (based on Chromium) on Linux can take a really long time and so most developers use sccache to cache objects files and speed up future re-compilations.

Here's the cronjob I wrote to seed my local cache every work day to pre-compile the latest builds:

30 23 * * 0-4   francois  /usr/bin/chronic /home/francois/bin/seed-brave-browser-cache

and here are the contents of that script:

#!/bin/bash
set -e

# Set the path and sccache environment variables correctly
source ${HOME}/.bashrc-brave
export LANG=en_CA.UTF-8

cd ${HOME}/devel/brave-browser-cache

echo "Environment:"
echo "- HOME = ${HOME}"
echo "- PATH = ${PATH}"
echo "- PWD = ${PWD}"
echo "- SHELL = ${SHELL}"
echo "- BASH_ENV = ${BASH_ENV}"
echo

echo $(date)
echo "=> Update repo"
git pull
npm install
npm run init

echo $(date)
echo "=> Delete any old build output"
rm -rf src/out

echo $(date)
echo "=> Debug build"
killall sccache || true
ionice nice timeout 4h npm run build || ionice nice timeout 4h npm run build
ionice nice ninja -C src/out/Debug brave_unit_tests
ionice nice ninja -C src/out/Debug brave_browser_tests
echo

echo $(date)
echo "=>Release build"
killall sccache || true
ionice nice timeout 5h npm run build Release || ionice nice timeout 5h npm run build Release
ionice nice ninja -C src/out/Release brave_unit_tests
ionice nice ninja -C src/out/Release brave_browser_tests
echo

echo $(date)
echo "=> Delete build output"
rm -rf src/out

March 16, 2019

I, Robot

Not the book of the movie, but the collection of short stories by Isaac Asimov. I’ve read this book several times before and enjoyed it, although this time I found it to be more dated than I remembered, both in its characterisations of technology as well as its handling of gender. Still enjoyable, but not the best book I’ve read recently.

I, Robot by Isaac Asimov (Fiction, Spectra, 2004, 224 pages)

The development of robot technology to a state of perfection by future civilizations is explored in nine science fiction stories.

March 10, 2019

Running Home Assistant on Fedora with Docker

Home Assistant is a really great, open source home automation platform written in Python which supports hundreds of components. They have a containerised version called Hass.io which can run on a bunch of hardware and has a built-in marketplace to make the running of addons (like Let’s Encrypt) easy.

I’ve been running Home Assistant on a Raspberry Pi for a couple of years, but I want something that’s more powerful and where I have more control. Here’s how you can use the official Home Assistant containers on Fedora (note that this does not include their Hass.io marketplace).

First, install Fedora Server edition, which comes with the handy web UI for managing the system called Cockpit.

Once you’re up and running, install Docker and the Cockpit plugin.

sudo dnf install -y docker cockpit-docker

Now we can start and enable the Docker daemon and restart cockpit to load the Docker plugin.

sudo systemctl start docker && sudo systemctl enable docker
sudo systemctl restart cockpit

Create a location for the Home Assistant configuration and set the appropriate SELinux context. This lets you modify the configuration directly from the host and restart the container to pick up the change.

sudo mkdir -p /hass/config
sudo chcon -Rt svirt_sandbox_file_t /hass

Start up a container called hass using the Home Assistant Docker image which will start automatically thanks to the restart option. We pass through the /hass/config directory on the host as /config inside the container.

docker run --init -d \
--restart unless-stopped \
--name="hass" \
-v /hass/config/:/config \
-v /etc/localtime:/etc/localtime:ro \
--net=host \
homeassistant/home-assistant

You should be able to see the container starting up.

sudo docker ps

If you need to, you can get the logs for the container.

sudo docker logs hass

Once it’s started you should see port 8123 listening on the host

sudo ss -ltnp |grep 8123

Finally, enable port 8123 on the firewall to access the service on your network.

sudo firewall-cmd --zone=FedoraServer --add-port=8123/tcp
sudo firewall-cmd --runtime-to-permanent

Now browse to the IP address of your server on port 8123 and you should see Home Assistant. Create an account to get started!
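
Whenever you edit the configuration under /hass/config on the host, a quick restart of the container picks up the change:

sudo docker restart hass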

March 08, 2019

Secure Data in Public Configuration Management With Propellor

Steampunk propeller ring by Daniel Proulx

TL;DR

List fields and contexts:

$ propellor --list-fields

Set a field for a particular context:

$ propellor --set 'SshAuthorizedKeys "myuser"' yourServers < authKeys

Dump a field from a specific context:

$ propellor --dump 'SshAuthorizedKeys "myuser"' yourServers

An Example

When using Propellor for configuration management, you can utilise GPG encryption to encrypt data sets. This enables you to leverage public git repositories for your centralised configuration management needs.

To list existing fields, you can run:

$ propellor --list-fields

which will not only list existing fields but will helpfully also list fields that would be used if set:

Missing data that would be used if set:
Field                             Context          Used by
-----                             -------          -------
'Password "myuser"'               'yourDesktops'   your.host.name
'CryptPassword "myuser"'          'yourServers'    your.server.name
'PrivFile "/etc/mail/dkim.key"'   'mailServers'    your.mail.server

You can set these fields with input from either STDIN or files prepared earlier.

For example, if you have public SSH keys you wish to distribute, you can place them into a file, then use that file to populate the fields of an appropriate context. The contents of an example authorized_keys file, which we'll call authKeys, may look like this:

ssh-ed25519 eetohm9doJ4ta2Joo~P2geetoh6aBah9efu4ta5ievoongah5feih2eY4fie9xa1ughi you@host1
ssh-ed25519 choi7moogh<i2Jie6uejoo6ANoMei;th2ahm^aiR(e5Gohgh5Du-oqu1roh6Mie4shie you@host2
ssh-ed25519 baewah%vooPho2Huofaicahnob=i^ph;o1Meod:eugohtiuGeecho2eiwi.a7cuJain6 you@host3

To add these keys to the appropriate users for the hosts of a particular context you could run:

$ propellor --set 'SshAuthorizedKeys "myuser"' yourServers < authKeys

To verify that the fields for this context have the correct data, you can dump it:

$ propellor --dump 'SshAuthorizedKeys "myuser"' yourServers
gpg: encrypted with 256-bit ECDH key, ID 5F4CEXB7GU3AHT1E, created 2019-03-08
      "My User <myuser@my.domain.tld>"
      ssh-ed25519 eetohm9doJ4ta2Joo~P2geetoh6aBah9efu4ta5ievoongah5feih2eY4fie9xa1ughi you@host1
      ssh-ed25519 choi7moogh<i2Jie6uejoo6ANoMei;th2ahm^aiR(e5Gohgh5Du-oqu1roh6Mie4shie you@host2
      ssh-ed25519 baewah%vooPho2Huofaicahnob=i^ph;o1Meod:eugohtiuGeecho2eiwi.a7cuJain6 you@host3

When you next spin up Propellor for the desired hosts, those SSH public keys will be installed into the authorized_keys file for the user myuser on hosts that belong to the yourServers context.

Setting and Storing Passwords

One of the most obvious and practical uses of this feature is to set secure data that needs to be distributed, such as passwords or certificates. We'll use passwords for this example.

Create a hash of the password you wish to distribute:

$ mkpasswd -m sha-512 > /tmp/deleteme
Password:
$ cat /tmp/deleteme
$6$cyxX.TmGPZWuqQu$LxhbVBaUnFmevOVi1V1NApZA0TCcSkK1241eiZwhhBQTm/PpjoLHe3OMnbjeswa6rgzNAq3pXTB4KjvfF1iXA1

Now that we have that file, we can use it as input for Propellor:

$ propellor --set 'CryptPassword "myuser"' yourServers < /tmp/deleteme
Enter private data on stdin; ctrl-D when done:
gpg: encrypted with 256-bit ECDH key, ID 5F4CEXB7GU3AHT1E, created 2019-03-08
      "My User <myuser@my.domain.tld>"
gpg: WARNING: standard input reopened
Private data set.

Tidy up:

$ rm /tmp/deleteme

You're now ready to deploy that password for that user to those servers.

Mirabella Genio smart lights with Tasmota and Home Assistant

One of the things I like about Home Assistant is that it allows you to take hardware from a bunch of various vendors and stitch it together into a single consistent interface. So for example I now have five home automation vendor apps on my phone, but don’t use any of them because Home Assistant manages everything.

A concrete example — we have Philips Hue lights, but they’re not perfect. They’re expensive, require a hub, and need to talk to a Philips data centre to function (i.e. the internet needs to work at my house, which isn’t always true thanks to the failings of the Liberal Party).

I’d been meaning to look at the cheapo smart lights from Kmart for a while, and finally got around to it this week. For $15 you can pick up a dimmable white globe, and for $29 you can have an RGB one. That’s heaps cheaper than the Hue options. Even better, the globes are flashable to run the open source Tasmota stack, which means no web services required!

So here are some instructions on flashing these globes to be useful:

Buy the globes. I bought this warm white dimmable option and this RGB option.

Flash to Tasmota. This was a little bit fiddly, but was mostly about getting the sequence to put the globes into config mode right (turn off for 10 seconds, turn on, turn off, turn on, turn off, turn on). Wait a few seconds and the lamp should blink rapidly, indicating it’s in config mode. For Canberra people, I now have a Raspberry Pi set up to do this easily, so we can run a flashing session sometime if people want.

Configure Tasmota. This is really up to you, but the globes need to know your local wifi details, where your MQTT server is, and stuff like that.

And then configure Home Assistant. The example of how to do that from my house is on github.
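
For reference, the Home Assistant side of that can look roughly like this, using the MQTT light integration. This is a sketch only; the topic names depend entirely on what you configured in Tasmota, so treat globe01 as a placeholder:

light:
  - platform: mqtt
    name: "Kmart Globe"
    command_topic: "cmnd/globe01/POWER"
    state_topic: "tele/globe01/STATE"
    state_value_template: "{{ value_json.POWER }}"
    payload_on: "ON"
    payload_off: "OFF"
    brightness_command_topic: "cmnd/globe01/Dimmer"
    brightness_scale: 100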

March 06, 2019

Authentication in WordPress

WebAuthn is now a W3C recommendation, bringing us one step closer to not having to use passwords anymore. If you’re not familiar with WebAuthn, here’s a little demo (if you don’t own a security key, it’ll probably work best on an Android phone with a fingerprint reader).

That I needed to add a disclaimer for the demo indicates the state of WebAuthn authenticator support. It’s nice when it works, but it’s clearly still in progress, and that progress varies. WebAuthn also doesn’t cover how the authenticator device works, that falls under the proposed CTAP standard. They work together to form the FIDO2 Project. Currently, the most reliable option is to purchase a security key, but quality varies wildly, and needing to carry around an extra dongle just for logging in to sites is no fun.

What WordPress Needs

Anything that replaces passwords needs to provide some extra benefit, without losing the strengths of the password model:

  • Passwords are universally understood as an authentication model.
  • They’re portable: you don’t need a special app or token to use them anywhere.
  • They’re extendable: strong passwords can be enforced as needed. Additional authentication (2FA codes, for example) can be added, too.

Magic login links are an interesting step in this direction. The WordPress mobile apps added magic login support for WordPress.com accounts a while ago; I’d love to see this working on all WordPress sites.

A WebAuthn-based model would be a wonderful future step, once the entire user experience is more polished.

The password-less future hasn’t quite arrived yet, but we’re getting closer.

March 04, 2019

What If?

More correctly titled “you die horribly and it probably involves plasma”, this light-hearted and fun read explores serious answers to silly scientific questions. The footnotes are definitely the best bit. A really enjoyable read.

What If?
Randall Munroe
Humor
Houghton Mifflin Harcourt
September 2, 2014
320 pages

The creator of the incredibly popular webcomic xkcd presents his heavily researched answers to his fans' oddest questions, including “What if I took a swim in a spent-nuclear-fuel pool?” and “Could you build a jetpack using downward-firing machine guns?”

Audiobooks – February 2019

Tamed: Ten Species that Changed our World by Alice Roberts

Plenty of content (14 hours) and not too dumbed down. About 8 of the 10 species are the ones you’d expect. 8/10

It Won’t Be Easy: An Exceedingly Honest (and Slightly Unprofessional) Love Letter to Teaching by Tom Rademacher

A breezy little book about the realities of teaching (at least in the US). Interesting to outsiders & hopefully useful to those in the profession. 7/10

The Hobbit by J. R. R. Tolkien, read by Rob Inglis

A good audio-edition of the book. Unabridged & really the default one for most people. I alternated chapters of this with the excellent Prancing Pony Podcast commentaries on those chapters. 9/10

The Life of Greece: The Story of Civilization, Volume 2 (The Story of Civilization series) by Will Durant

32 hours on the history of Ancient Greece. Seemed to cover just about everything. Written in the 1930s so probably a little out-of-date in places. 7/10

March 03, 2019

Problems with Dreamhost

This site is hosted at Dreamhost and, for reasons I can’t explain right now, isn’t accessible from large chunks of Australia. It seems to work fine from elsewhere though. Dreamhost certainly has an explanation — they allege, in emails that take 24 hours and that you can’t reply to, that it’s because WordPress is using too much RAM.

However, they don’t explain why that’s suddenly happened when it’s been fine for years previously, and they certainly don’t explain why it works from some places but not others, or why other Dreamhost sites are also offline from the locations having issues.

It’s time for a new hosting solution I think, although not bothering to have hosting at all might also be that solution.

LPCNet Quantiser – wideband speech at 1700 bits/s

I’ve been working with Neural Net (NN) speech synthesis using LPCNet.

My interest is digital voice over HF radio. To get a NN codec “on the air” I need a fully quantised version at 2000 bit/s or below. The possibility of 8kHz audio over HF radio is intriguing, so I decided to experiment with quantising the LPCNet features. These consist of 18 spectral energy samples, pitch, and the pitch gain which is effectively a measure of voicing.

So I have built a Vector Quantiser (VQ) for the DCT-ed 18 log-magnitude samples. LPCNet updates these every 10ms, which is a bit too fast for my target bit rate. So I decimate to say 30ms, then use linear interpolation to reconstruct the 10ms frames at the decoder. The spectrum changes slowly (most of the time), so I quantise the difference between frames to save a few bits.
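
As a rough sketch of that decimation and interpolation step (this is just the idea, not the actual codec source; it assumes the features arrive as NumPy rows every 10ms):

import numpy as np

def decimate_and_interpolate(features, M=3):
    """Keep every M-th 10ms feature vector (M=3 gives 30ms updates),
    then linearly interpolate the missing frames back at the decoder."""
    kept = features[::M]                          # what the encoder sends
    out = np.empty(((len(kept) - 1) * M + 1, features.shape[1]))
    for i in range(len(kept) - 1):
        for j in range(M):
            a = j / M                             # interpolation weight
            out[i * M + j] = (1 - a) * kept[i] + a * kept[i + 1]
    out[-1] = kept[-1]                            # last transmitted frame
    return out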

Detailed Results

I’ve developed a script that generates a bunch of samples, plots various statistics, and builds an HTML page to summarise the results. Here is the current page, including samples for the fully quantised prototype codec at three bit rates between around 2000 and 1400 bits/s. If anyone would like more explanation of that page, just ask.

Discussion of Results

I can hear “birch” losing some quality at the 20ms decimation step. When training my own NN, I have had quite a bit of trouble with very rough speech when synthesising “canadian”. I’m learning that roughness in NN synthesis means more training required, the network just hasn’t experienced this sort of speaker before. The “canadian” sample is quite low pitch so I may need some more training material with low pitch speakers.

My quantisation scheme works really well on some of the carefully spoken Harvard sentences (oak, glue), in ideal recording conditions. However with more realistic, quickly spoken speech with real world background noise (separately, wanted) it starts to sound vocoder-ish (albeit a pretty good vocoder).

One factor is the frame rate decimation from 10 to 20-30ms, which I used to get the bit rate beneath 2000 bit/s. A better quantisation scheme, or LPCNet running on 20ms frames, could improve this. Or we could just run it at greater than 2000 bit/s (say for VHF/UHF two way radio).

Comparison to Wavenet

Source Listen
Wavenet, Codec 2 encoder, 2400 bits/s Listen
LPCnet unquantised, 10ms frame rate Listen
Quantised to 1733 bits/s (44bit/30ms) Listen

The “separately” sample from the Wavenet team sounds better to me. Ironically, these samples use my Codec 2 encoder, running at just 8kHz! It’s difficult to draw broad conclusions from this, as we don’t have access to a Wavenet system to try many different samples. All codecs tend to break down under certain conditions and samples.

However it does suggest (i) we can eventually get higher quality from NN synthesis and (ii) it is possible to encode high quality wideband speech with features covering a narrow spectral range (e.g. 200-3800Hz for the Codec 2 encoder). The 18 element vectors (covering DC to 8000Hz) I’m currently using ultimately set the bit rate of my current system. After a few VQ stages the elements are independent Gaussians and reduction in quantiser noise is very slow as bits are added.

The LPCNet engine has several awesome features: it’s open source, runs in real time on regular CPUs, and is available for us to test on a wide variety of samples. The speech quality I am achieving with even my first attempts is rather good compared to any other speech codec I have played with at these bit rates – in either the open or closed source worlds.

Tips and Observations

I’ve started training my own models, and discovered that if you get rough speech – you probably need more data. For example when I tried training on 1E6 vectors, I had a few samples sounding rough when I tested the network. However with 5E6 vectors, it works just fine.

The LPCNet dump_data --train mode program helps you by being very clever. It “fuzzes” the speech frequency, gain, and adds a little noise. If the NN hasn’t experienced a particular combination of features before, it tends to get lost – and you get rough sounding speech.

I found that 10 Epochs of 5E6 vectors gives me good speech quality on my test samples. That takes about a day with my somewhat underpowered GPU. In fact, most of the training seems to happen on the first few Epochs:

Here is a plot of the training and validation loss for my training database:

This plot shows how much the loss changes on each Epoch, not very much, but not zero. I’m unsure if these small gains lead to meaningful improvements over many Epochs:

I looked into the LPCNet pitch and voicing estimation. Like all estimators (including those in Codec 2), they tend to make occasional mistakes. That’s what happens when you try to fit neat signal processing models to real-world biological signals. Anyway, the amazing thing is that LPCNet doesn’t care very much. I have some samples where pitch is all over the place but the speech still sounds OK.

This is really surprising to me. I’ve put a lot of time into the Codec 2 pitch estimators. Pitch errors are very obvious in traditional, model based low bit rate speech codecs. This suggests that with NNs we can get away with less pitch information – which means fewer bits and better compression. Same with voicing. This leads to intriguing possibilities for very low bit rate (a few 100 bit/s) speech coding.

Conclusions, Further Work and FreeDV 2020

Overall I’m pleased with my first attempt at quantisation. I’ve learnt a lot about VQ and NN synthesis and carefully documented (and even scripted) my work. The learning and experimental experience has been very satisfying.

Next I’d like to get one of these candidates on the air, see how it sounds over real world digital radio channels, and find out what happens when we get bit errors. I’m a bit nervous about predictive quantisation on radio channels, as it causes errors to propagate in time. However I have a good HF modem and FEC, and some spare bits to add some non-predictive quantisation if needed.

My design for a new, experimental “FreeDV 2020” mode employing LPCNet uses just 1600 Hz of RF bandwidth for 8kHz bandwidth speech, and should run at 10dB SNR on a moderate fading channel.

Here is a longer example of LPCNet at 1733 bit/s compared to HF SSB at a SNR of 10dB (we can send error free LPCNet through a similar HF channel). The speech sample is from the MP3 source of the Australian weekly WIA broadcast:

Source Listen
SSB simulation at 10dB SNR Listen
LPCNet Quantised to 1733 bits/s (44bit/30ms) Listen
Mixed LPCNet Quantised and SSB (thanks Peter VK2TPM!) Listen

This is really new technology, and there is a lot to explore. The work presented here represents my initial attempt at quantisation with the LPCNet synthesis engine, and is hopefully useful for other people who would like to experiment in the area.

Acknowledgements

Thanks Jean-Marc for developing the LPCnet technology, making the code open source, and answering my many questions.

Links

LPCnet introductory page.

The source code for my quantisation work (and notes on how to use it) is available as a branch on the GitHub LPCNet repo.

WaveNet and Codec 2

March 01, 2019

Connecting a VoIP phone directly to an Asterisk server

On my Asterisk server, I happen to have two on-board ethernet boards. Since I only used one of these, I decided to move my VoIP phone from the local network switch to being connected directly to the Asterisk server.

The main advantage is that this phone, running proprietary software of unknown quality, is no longer available on my general home network. Most importantly though, it no longer has access to the Internet, without my having to firewall it manually.

Here's how I configured everything.

Private network configuration

On the server, I started by giving the second network interface a static IP address in /etc/network/interfaces:

auto eth1
iface eth1 inet static
    address 192.168.2.2
    netmask 255.255.255.0

On the VoIP phone itself, I set the static IP address to 192.168.2.3 and the DNS server to 192.168.2.2. I then updated the SIP registrar IP address to 192.168.2.2.

The DNS server actually refers to an unbound daemon running on the Asterisk server. The only configuration change I had to make was to listen on the second interface and allow the VoIP phone in:

server:
    interface: 127.0.0.1
    interface: 192.168.2.2
    access-control: 0.0.0.0/0 refuse
    access-control: 127.0.0.1/32 allow
    access-control: 192.168.2.3/32 allow

Finally, I opened the right ports on the server's firewall in /etc/network/iptables.up.rules:

-A INPUT -s 192.168.2.3/32 -p udp --dport 5060 -j ACCEPT
-A INPUT -s 192.168.2.3/32 -p udp --dport 10000:20000 -j ACCEPT

Accessing the admin page

Now that the VoIP phone is no longer available on the local network, it's not possible to access its admin page. That's a good thing from a security point of view, but it's somewhat inconvenient.

Therefore I put the following in my ~/.ssh/config to make the admin page available on http://localhost:8081 after I connect to the Asterisk server via ssh:

Host asterisk
    LocalForward 8081 192.168.2.3:80
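
With that in place, assuming the Host alias resolves to the server (via a HostName entry or DNS), reaching the phone is just:

$ ssh asterisk

followed by pointing a browser at http://localhost:8081.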

February 28, 2019

LUV March 2019 Workshop: 30th Anniversary of the Web / Federated Social Media

Mar 16 2019 12:30
Mar 16 2019 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

This month we will celebrate the 30th anniversary of the World Wide Web with a discussion of its past, present and future. Andrew Pam will also demonstrate and discuss the installation, operation and use of federated social media platforms including Diaspora and Hubzilla.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121.  Late arrivals please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria is a subcommittee of Linux Australia.


LUV March 2019 Main Meeting: ZeroTier / Ethics in the computer realm

Mar 5 2019 19:00
Mar 5 2019 21:00
Location: 
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

7:00 PM to 9:00 PM Tuesday, March 5, 2019
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

Speakers:

  • Adrian Close: ZeroTier
  • Enno Davids: Ethics - Weasel words, weasel words, weasel words, bad thing

Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.


Propagating Native Plants

by Alec M. Blomberry & Betty Maloney

Propagating Australian Plants

I have over 1,000 Diospyros geminata (scaly ebony) seedlings growing in my shade house (in used toilet rolls). I'd collected the seeds from the (delicious) fruit in late 2017 (they appear to fruit based on rainfall - no fruit in 2018) and they're all still rather small.

All the literature stated that they were slow growing. I may have been more dismissive of this than I needed to be.

I'm growing these for landscape scale planting, so it's going to be a while between gathering the seeds (mid 2017) and planting the trees (maybe mid 2019).

So I needed to look into other forms of propagation, and either cuttings or aerial layering appear to be the way to go, as I already have large numbers of mature Diospyros geminata on our property or nearby.

The catch being that I know nothing of either cutting or aerial layering and in particular I want to do this at a reasonable scale (ie: possibly thousands).

So this is where Propagating Australian Plants comes in.

Aerial Layering

It's a fairly dry and academic read that feels like it may be more of an introductory guide for botany students than for a lay person such as myself.

Despite being last published in 1994 by a publisher that no longer exists and having a distinct antique feel to it, the information within is crisp and concise with clear and helpful illustrations.

Highly recommended if you're starting to propagate natives as I am.

Although I wish you luck picking up a copy - I got mine from an Op Shop. At least the National Library of Australia appears to have a copy.

February 27, 2019

FreeDV QSO Party 2019 Part 1

My local radio club, the Amateur Radio Experimenters Group (AREG), have organised a special FreeDV QSO Party Weekend from April 27th 0300z to April 28th 0300z 2019. This is a great chance to try out FreeDV, work Australia using open source HF digital voice, and even talk to me!

All the details, including frequencies, times and the point scoring system, are over on the AREG site.

February 26, 2019

5 axis cnc fun!

The 5th axis build came together surprisingly well. I had expected much more resistance getting the unit known to both Fusion 360 and LinuxCNC. There is still some tinkering to be done for sure, but I can get some reasonable results already. The video below gives an overview of the design:



Shown below is a silent movie of a few test jobs I created to see how well tool contact would be maintained during motion in the A and B axes while x, y and z are moved to keep the tool in the right position. This is the flow toolpath in Fusion 360 in action. Unfamiliarity with these CAM paths makes for a learning curve, which is interesting when paired with a custom made 5th axis that you are trying to debug at the same time.



I haven't tested how well the setup works when cutting harder materials like alloy yet. Timber is much quieter and more forgiving for test cuts, while still letting you be reasonably sure about the toolpaths and that you are not going to accidentally crash too deep into the material after a 90 degree rotation.


New Dark Age: Technology and the End of the Future

by James Bridle

New Dark Age: Technology and the End of the Future

It may be my first book for 2019 but I'm going to put it out there: this is my must read book for 2019 already. Considering it was published in 2018 to broad acclaim, it may be a safe call.

tl;dr: Read this book. It's well resourced and thoroughly referenced, and the compiled information and the lines drawn between things may redraw the way you see our industry and the world. If you're already across the issues, the hard facts will still cause you to draw breath.

I read this book in bursts over 4 weeks, each chapter packing its own informative punch. The narrative first grabbed my attention on page 4, where the weakness of learning to code alone was fleshed out.

"Computational thinking is predominant in the world today, driving the worst trends in our societies and interactions, and must be opposed by real systemic literacy." page 4

Where it is argued that systemic literacy is much more important than learning to code - with a humorous but fitting plumbing analogy in the mix.

One of the recurring threads in the book is the titular "New Dark Age", with points drawn back to the various actions present in modern society that are actively reducing our knowledge.

"And so we find ourselves today connected to vast repositories of knowledge, and yet we have not learned to think. In fact, the opposite is true: that which was intended to enlighten the world in practice darkens it. The abundance of information and the plurality of world-views now accessible to us through the Internet are not producing a coherent consensus reality, but one riven by fundamentalist insistence on simplistic narratives, conspiracy theories, and post-factual politics." page 10

Also covered are more well known instances of corporate and government censorship, and the traps of modern convenience technologies.

"When an ebook is purchased from an online service, it remains the property of the seller, its loan subject to revocation at any time - as happened when Amazon remotely deleted thousands of copies of 1984 and Animal Farm from customers' Kindles in 2009. Streaming music and video services filter the media available by legal jurisdiction and algorithmically determine 'personal' preferences. Academic journals determine access to knowledge by institutional affiliation and financial contribution as physical, open-access libraries close down." page 39

It was the "Climate" chapter that packed the biggest punch for me: as an issue I'd considered myself rather well across for the last 30 years, it turns out there was a significant factor I'd missed. A hint that surprise was coming came in an interesting diversion into clear air turbulence.

"An advisory circular on preventing turbulence-related injuries, published by the US Federal Aviation Administration in 2006, states that the frequency of turbulence accidents has increased steadily for more than a decade, from 0.3 accidents per million flights in 1989 to 1.7 in 2003." page 68

The reason for this increase was laid at the feet of increased CO2 levels in the atmosphere by Paul Williams of the National Centre for Atmospheric Science, and the implications were expounded upon in his paper in Nature Climate Change (2013) thusly:

"...in winter, most clear air turbulence measures show a 10-40 per cent increase in the median strength...40-70 per cent increase in the frequency of occurrence of moderate or greater turbulence." page 69

The real punch in the guts came on page 73, where I first came across the concept of "Peak Knowledge" and how climate change is playing its defining role in that decline, where President of the American Meteorological Society William B Gail wonders if:

"we have already passed through 'peak knowledge', just as we may have already passed 'peak oil'." page 73

If you're wondering what that claim was based on, the next few paragraphs can be summarised in the following points:

  • From 1000-1750 CE, CO2 was at 275-285 parts per million (ppm).
  • 295ppm by the start of the 20th century
  • 310ppm by 1950
  • 325ppm in 1970
  • 350ppm in 1988
  • 375ppm by 2004
  • 400ppm by 2015 - the first time in 800,000 years
  • 1,000ppm is projected to be passed by the end of this century.

"At 1,000ppm, human cognitive ability drops by 21%" page 74

Then a couple of bombshells:

"CO2 already reaches 500ppm in industrial cities"

"indoors in poorly ventilated schools, homes and workplaces it can regularly exceed 1,000ppm - substantial numbers of schools in California and Texas measured in 2012 breached 2,000ppm."

The implications of this are fairly obvious.

All this is by the end of chapter 3. It's a gritty, honest look at where we're at and where we're going. It's not pretty, but as the old saying goes, to be forewarned is to be forearmed.

Do yourself a favour, read it.

A (simplified) view of OpenPOWER Firmware Development

I’ve been working on trying to better document the whole flow of code that goes into a build of firmware for an OpenPOWER machine. This is partially to help those not familiar with it get a better grasp of the sheer scale of what goes into that 32/64MB of flash.

I also wanted to convey the components that we heavily re-used from other Open Source projects, what parts are still “IBM internal” (as they relate to the open source workflow) and which bits are primarily contributed to by IBMers (at least at this point in time).

As such, let’s start with the legend of the diagram:

Now, the diagram:

Simplified development flow for OpenPOWER firmware

The end thing that a user with a machine will download and apply (or that comes shipped with a box) is one of the purple “Installable Firmware Release” nodes (bottom center). In this diagram, there are 4 of them: one for POWER9 systems such as the just-announced AC922 system (this is the “OP910 Release” node, which is the witherspoon_defconfig in the op-build tree); one for the p9dsu platform (p9dsu_defconfig in op-build); and one for IBM FSP based systems such as the S812L and S822L systems (or S812/S822 in OPAL mode).

There are more platforms out there, but this diagram is meant to be simplified. The key difference with the p9dsu platform is that this is produced by somebody other than IBM.

All of these releases are based off the upstream op-build project, op-build is the light blue box in the center of the diagram. We do regular X.Y releases and sometimes do X.Y.Z releases. It’s primarily a pull request based workflow currently, so everything goes via a pull request. The op-build project brings together all the POWER specific firmware components (pretty much everything in every other light blue/blue box) along with a Linux kernel and buildroot.

The kernel and buildroot are the two big yellow boxes on the top right. Buildroot brings together a lot of open source components that are in our firmware image (including some power specific ones that we get through upstream buildroot).

For Linux, this is a pretty simplified view of the process, but we primarily ship the stable tree (with maybe up to half a dozen patches).

The skiboot and petitboot components both use a mailing list based workflow (similar to kernel) as well as X.Y and X.Y.Z releases (again, similar to the linux kernel).

On the far left of the diagram, we have Hostboot, SBE and OCC. These are three firmware components that come from the traditional IBM POWER Firmware group, and are shared with the IBM non-OpenPOWER POWER systems (“traditional” POWER). Part of their code comes from an (internal) repository called “ekb”, which also goes into a (very) low level debug tool and the FSP based systems. There’s also an (internal) gerrit instance that’s the primary place where code review/development discussions happen for these components.

In future posts, I’ll probably delve into more specifics of the current development process, and how we may try and change things for the better.

How I do email (at work)

Recently, I blogged on my home email setup, and in that post I hinted that my work setup was rather different. I have entirely separate computing devices for work and personal use, a setup I strongly recommend. This also lets me “go home” from work even when working from home: I use a different physical machine!

Since I work for IBM, I have (at least) two email accounts for work: a Lotus Notes one and an internet standards compliant one. It’s “easy” enough to get the Notes one to forward to the standards compliant one, from which I can just use fetchmail or similar to pull down mail.

I run mail through a rather simple procmail script: it de-mangles some URL mangling that can happen in the current IBM email infrastructure, runs things through SpamAssassin, and delivers to a date based Maildir (or one giant pile for spam).

My ~/.procmailrc looks something like this:

LOGFILE=$HOME/mail_log
LOGABSTRACT=yes

DATE=`date +"%Y%m"`
MAILDIR=Maildir/INBOX
DEFAULT=$DATE/

# filter every message through the URL de-mangler
:0fw
| magic_script_to_unmangle_things

# then through SpamAssassin's client
:0fw
| spamc

# anything SpamAssassin marked as spam goes to the junk pile
:0
* ^X-Spam-Status: Yes
$HOME/Maildir/junkmail/incoming/

I use tail -f mail_log as a really dumb kind of biff replacement.

Now, what do I read and write mail with? Notmuch! It is the only thing that even comes close to being able to deal with a decent flow of mail. I have a couple of saved searches just to track how much mail I pull in a day/week. Today (on Monday), it says 442 today and 10,403 over the past week.

For the most part, my workflow is kind of INBOX-ZERO like, except that I currently view victory as INBOX 2000. Most mail does go into my INBOX; the notable exceptions are two main mailing lists I’m subscribed to mostly as FYI and to search/find things when needed. Those are the Linux Kernel Mailing List (LKML) and the buildroot mailing list. Why notmuch rather than just searching the web for mailing list archives? Notmuch can return the result of a query in less time than it takes light to get to and from the United States in ideal conditions.
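
Those day/week numbers come straight from notmuch queries; the one-off command line equivalents look something like this (exact date syntax per notmuch-search-terms(7), so treat these as approximate):

$ notmuch count date:today..          # mail that arrived today
$ notmuch count tag:inbox             # how deep the inbox currently is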

For work, I don’t sync my mail anywhere. It’s just on my laptop. Not having it on my phone is a feature. I have a notmuch post-new hook that does some initial tagging of mail, and as such I have this in my ~/.notmuch-config:

[new]
tags=new;

My post-new hook looks like this:

#!/bin/bash

# immediately archive all messages from "me"
notmuch tag -new -- tag:new and from:stewart@linux.vnet.ibm.com

# tag all message from lists
notmuch tag +devicetree +list -- tag:new and to:devicetree@vger.kernel.org
notmuch tag +inbox +unread -new -- tag:new and tag:devicetree and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread +list -new -- tag:new and tag:devicetree and not to:stewart@linux.vnet.ibm.com

notmuch tag +linuxppc +list -- tag:new and to:linuxppc-dev@lists.ozlabs.org
notmuch tag +linuxppc +list -- tag:new and cc:linuxppc-dev@lists.ozlabs.org
notmuch tag +inbox +unread -new -- tag:new and tag:linuxppc
notmuch tag +openbmc +list -- tag:new and to:openbmc@lists.ozlabs.org
notmuch tag +inbox +unread -new -- tag:new and tag:openbmc

notmuch tag +lkml +list -- tag:new and to:linux-kernel@vger.kernel.org
notmuch tag +inbox +unread -new -- tag:new and tag:lkml and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:lkml and not tag:linuxppc and not to:stewart@linux.vnet.ibm.com

notmuch tag +qemuppc +list -- tag:new and to:qemu-ppc@nongnu.org
notmuch tag +inbox +unread -new -- tag:new and tag:qemuppc and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:qemuppc and not to:stewart@linux.vnet.ibm.com

notmuch tag +qemu +list -- tag:new and to:qemu-devel@nongnu.org
notmuch tag +inbox +unread -new -- tag:new and tag:qemu and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:qemu and not to:stewart@linux.vnet.ibm.com

notmuch tag +buildroot +list -- tag:new and to:buildroot@buildroot.org
notmuch tag +buildroot +list -- tag:new and to:buildroot@busybox.net
notmuch tag +buildroot +list -- tag:new and to:buildroot@uclibc.org
notmuch tag +inbox +unread -new -- tag:new and tag:buildroot and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:buildroot and not to:stewart@linux.vnet.ibm.com

notmuch tag +ibmbugzilla -- tag:new and from:bugzilla@us.ibm.com

# finally, retag all "new" messages "inbox" and "unread"
notmuch tag +inbox +unread -new -- tag:new

This leaves me with both an inbox and a listinbox. I do not look at the overwhelming majority of mail that hits the listinbox – it’s mostly for following up on individual things. If I started to need to care more about specific topics, I’d probably add something in there for them so I could easily find them.

My notmuch emacs setup has a bunch of saved searches, so my notmuch-hello screen looks something like this:

This gets me a bit of a state-of-the-world-of-email-to-look-at view for the day. I’ll often have meetings first thing in the morning that may reference email I haven’t looked at yet, and this generally lets me quickly find mail related to the problems of the day and start being productive.

How I do email (at home)

I thought I might write something up on how I’ve been doing email both at home and at work. I very much on purpose keep the two completely separate, and have slightly different use cases for both of them.

For work, I do not want mail on my phone. For personal mail, it turns out I do want this on my phone, which is currently an Android phone. Since my work and personal email is very separate, the volume of mail is really, really different. Personal mail is maybe a couple of dozen a day at most. Work is… orders of magnitude more.

Considering I generally prefer free software to non-free software, K9 Mail is the way I go on my phone. I have it set up to point at the IMAP and SMTP servers of my mail provider (FastMail). I also have a google account, and the gmail app works fine for the few bits of mail that go there instead of my regular account.

For my mail accounts, I do an INBOX ZERO like approach (in reality, I’m pretty much nowhere near zero, but today I learned I’m a lot closer than many colleagues). This means I read / respond / do / ignore mail and then move it to an ARCHIVE folder. K9 and Gmail both have the ability to do this easily, so it works well.

Additionally though, I don’t want to care about limits on storage (i.e. expire mail from the server after X days), nor do I want to rely on “the cloud” to be the only copy of things. I also don’t want to have to upload any of the past mail I may be keeping around. I also generally prefer to use notmuch as a mail client on a computer.

For those not familiar with notmuch, it does tags on mail in Maildir, is extremely fast and can actually cope with a quantity of mail. It also has this “archive”/INBOX ZERO workflow which I like.

In order to get mail from FastMail and Gmail onto a machine, I use offlineimap. An important thing to do is to set “status_backend = sqlite” for each Account. It turns out I first hacked on sqlite for offlineimap status a bit over ten years ago – time flies. For each Account I also set presynchook = ~/Maildir/maildir-notmuch-presync (below) and postsynchook = notmuch new. The presynchook is run before we sync, and its job is to move files around based on the tags in notmuch; the postsynchook lets notmuch catch any new mail that’s been fetched.
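
Put together, the relevant chunk of ~/.offlineimaprc looks something like this (a sketch; the account and repository names are placeholders):

[Account fastmail]
localrepository = fastmail-local
remoterepository = fastmail-remote
status_backend = sqlite
presynchook = ~/Maildir/maildir-notmuch-presync
postsynchook = notmuch new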

My maildir-notmuch-presync hook script is:

#!/bin/bash
# FastMail: anything no longer tagged inbox has been dealt with,
# so move it into the server-side Archive folder
notmuch search --output=files not tag:inbox and folder:fastmail/INBOX|xargs -I'{}' mv '{}' "$HOME/Maildir/INBOX/fastmail/Archive/cur/"

# anything tagged spam goes to the Spam folder
notmuch search --output=files folder:fastmail/INBOX and tag:spam |xargs -I'{}' mv '{}' "$HOME/Maildir/INBOX/fastmail/Spam/cur/"

# move mail older than 90 days (and not flagged) out of the server-side
# Archive into a local month-based archive directory
ARCHIVE_DIR=$HOME/Maildir/INBOX/`date +"%Y%m"`/cur/
mkdir -p $ARCHIVE_DIR
notmuch search --output=files folder:fastmail/Archive and date:..90d and not tag:flagged | xargs -I'{}' mv '{}' "$ARCHIVE_DIR"

# Gmail: archived mail still lives in All Mail, so just delete the INBOX copy
notmuch search --output=files not tag:inbox and folder:gmail/INBOX|grep 'INBOX/gmail/INBOX/' | xargs -I'{}' rm '{}'
notmuch search --output=files folder:gmail/INBOX and tag:spam |xargs -I'{}' mv '{}' "$HOME/Maildir/INBOX/gmail/[Gmail].Spam/cur/"

This keeps 90 days of mail on the FastMail server, and archives older mail off into month based archive dirs. This is simply to keep directory sizes from getting too large; you could put everything in one directory, but at some point that gets a bit silly.

I don’t think this is all the most optimal setup I could have, but it does let me read and answer mail on my phone and desktop (as well as use a web client if I want to). There is a bit of needless copying of messages by offlineimap under certain circumstances, but I don’t get enough personal mail for it to be a problem.

pwnm-sync: Synchronizing Patchwork and Notmuch

One of the core bits of infrastructure I use as a maintainer is Patchwork (I wrote about making it faster recently). Patchwork tracks patches sent to a mailing list, allowing me as a maintainer to track the state of them (New|Under Review|Changes Requested|Accepted etc), combine them into patch bundles, look at specific series, test results etc.

One of the core bits of software I use is my email client, notmuch. Most other mail clients are laughably slow and clunky, or just plain annoying for absorbing a torrent of mail and being able to deal with it or just plain ignore it but have it searchable locally.

You may think your mail client is better than notmuch, but you’re wrong.

A key feature of notmuch is tagging email. It doesn’t do the traditional “folders” but instead does tags (if you’ve used gmail, you’d be somewhat familiar).

One of my key work flows as a maintainer is looking at what patches are outstanding for a project, and then reviewing them. This should also be a core part of any contributor to a project too. You may think that a tag:unread and to:project-list@foo query would be enough, but that doesn’t correspond with what’s in patchwork.

So, I decided to make a tool that would add tags to messages in notmuch corresponding with the state of the patch in patchwork. This way, I could easily search for “patches marked as New in patchwork” (or Under Review or whatever) and see what I should be reviewing and looking at merging or commenting on.

Logically, this wouldn’t be that hard, just use the (new) Patchwork REST API to get the state of everything and issue the appropriate notmuch commands.
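
The one-way half of that is simple enough to sketch (this is not pwnm-sync’s actual code; the endpoint parameters and tag naming here are assumptions):

#!/usr/bin/env python3
import requests
import subprocess

# fetch patches in state "new" for one project from the Patchwork REST API
r = requests.get('https://patchwork.ozlabs.org/api/patches/',
                 params={'project': 'skiboot', 'state': 'new'})
for patch in r.json():
    # Patchwork reports the Message-Id with angle brackets; notmuch wants it bare
    msgid = patch['msgid'].strip('<>')
    subprocess.run(['notmuch', 'tag', '+patchwork::new', 'id:' + msgid])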

But just going one way isn’t that interesting, I wanted to be able to change the tags in notmuch and have them sync back up to Patchwork. So, I made that part of the tool too.

Introducing pwnm-sync: a tool to sync patchwork and notmuch.

notmuch-hello tag counts for pwnm-sync tagged patches in patchwork

With this tool I can easily see the patchwork state of any patch that I have in my notmuch database. For projects that I’m a maintainer on (i.e. where I can change the state of patches), if I update the tags of that email and run pwnm-sync again, it’ll update the state in patchwork.

I’ve been using this for a few weeks myself and it’s made my maintainer workflow significantly nicer.

It may also be useful to people who want to find what patches need some review.

The sync time is mostly dependent on how fast your patchwork instance is for API requests. Unfortunately, we need to make some improvements on the Patchwork side of things here, but a full sync of the above takes about 4 minutes for me. You can also add a --epoch option (with a date/time) to say “only fetch things from patchwork since that date”, which makes things a lot quicker for incremental syncs. For me, I typically run it with an epoch of a couple of months ago, and that takes ~20-30 seconds to run. In this case, if you’ve locally updated an old patch, it will still sync that change up to patchwork.

Implementation Details

It’s a python3 script using the notmuch python bindings, the requests-futures module for asynchronous HTTP requests (so we can have the patchwork server assemble the next page of results while we process the previous one), and a local sqlite3 database to store state in so we can work out what changed locally / server side.

Getting it

Head to https://github.com/stewart-ibm/pwnm-sync or just:

git clone https://github.com/stewart-ibm/pwnm-sync.git

Optimizing database access in Django: A patchwork story

tl;dr: I made Patchwork a lot faster by looking at what database queries were being generated and optimizing them either by making Django produce better queries or by adding better indexes.

Introduction to Patchwork

One of the key bits of infrastructure a bunch of maintainers of Open Source Software use is a tool called Patchwork. We use it for a bunch of OpenPOWER firmware development; several Linux subsystems use it, as does freedesktop.org.

The purpose of Patchwork is to supplement the patches-to-a-mailing-list development work flow. It allows a maintainer to see all the patches that have been posted on the list, how many Acked-by/Reviewed-by/Tested-by replies they have, delegate responsibility for a patch to a co-maintainer, track (and change) the state of the patch (e.g. to “Under Review”, “Changes Requested”, or “Accepted”), and create bundles of patches to help in review and testing.

Since patchwork is an open source project itself, there’s several instances of it out there in common use. One of the main instances is https://patchwork.ozlabs.org/ which is (funnily enough) used by a bunch of people connected to OzLabs for projects that are somewhat connected to OzLabs. e.g. the linuxppc-dev project and the skiboot and petitboot projects. There’s also a kernel.org instance, which is used by some kernel subsystems.

Recent versions of Patchwork have added some pretty cool features such as the ability to integrate with CI systems such as Snowpatch which helps maintainers see if patches submitted are likely to break things.

Unfortunately, there have also been some complaints that recent versions of patchwork have gotten slower than previous ones. This may well be the case, or it could just be that the volume of patches is much higher and there’s load on the database. Anyway, after asking a few questions about the size and scope of the patchwork database on ozlabs.org, I went “hrm… this sounds like it shouldn’t really be a problem… perhaps I should look into this”.

Attacking the problem…

Every so often it is revealed that I know a little bit about databases.

Getting a development environment up for Patchwork is amazingly easy thanks to Docker and the great work of the Patchwork maintainers. The only thing you need to load in is an example dataset. I started by importing mail from a few mailing lists I’m subscribed to, which was Good Enough(TM) for an initial look.

Due to how Django forces us to design a database schema though, the suggested method of getting a sample data set will not mirror what occurs in a production system with multiple lists. It’s for this reason that I ended up using a copy of a live dataset for much of my work rather than constructing an artificial one.

Patchwork supports both a MySQL and PostgreSQL database backend. Since the one on ozlabs.org is backed by PostgreSQL, I ended up loading a large dataset into PostgreSQL for most of my work, although I also did some testing with MySQL.

The current patchwork.ozlabs.org instance has a database of around 13GB in size, with about a million patches. You may think this is big; my database brain goes “no, this is actually quite small and everything should be a lot faster than it is, even on quite limited hardware”.

The problem with ORMs

It turns out that Patchwork is written in Django, an ORM (Object-Relational Mapping) framework in Python – and thus something that pretty effectively obfuscates application code from the SQL being run.

There is one thing that Django misses that could be a pretty big general performance boost to many applications: it doesn’t support composite primary keys. For some databases (e.g. MySQL’s InnoDB engine) the PRIMARY KEY is a clustered index – that is, the physical layout of the rows on disk reflect primary key order. You can use this feature to your advantage and have much higher cache hits of your database pages.

Unfortunately though, we cannot do that with Django, so we lose a bunch of possible performance because of it (especially for queries that are going to have to bring in data from disk). In fact, we’re forced to use an ID field that’ll scatter our rows all over the place rather than do something efficient. You can somewhat get back some of the performance by creating covering indexes, but this costs in terms of index maintenance and disk space.

It should be noted that PostgreSQL doesn’t have a similar concept, although there is a (locking) CLUSTER statement that can (as an offline operation for the table) re-arrange existing rows to be in index order. In my testing, this can give a bit of a boost to performance of some of the Patchwork queries.

With MySQL, you’d look at a bunch of statistics on what pages are being brought in and paged out of the InnoDB buffer pool. With PostgreSQL it’s a bit more complex as it relies heavily on the OS page cache.

My main experience is with MySQL-like environments, so I’ve had to re-learn a bunch of PostgreSQL things in this work, which was kind of fun. It may be “because of my upbringing”, but it seems as if there are a lot more resources and documentation out in the wild about optimizing MySQL environments than PostgreSQL ones, especially when it comes to documentation about a bunch of things inside the database server. A lot of credit should go to the MySQL Documentation team – I wish the PostgreSQL documentation was up to the same standard.

Another issue is that fetching BLOBs is generally an expensive operation that you want to avoid unless you’re going to use them. Thus, fetching the whole “object” at once isn’t always optimal. The Django query generation appears to be somewhat buggy when it comes to “hey, don’t fetch these columns, I don’t need them”, so you do have to watch what query is produced not just what query you expect to be produced. For example, [01/11] Improve patch listing performance (~3x).

Another issue with Django is how you go from your Python code to an actual SQL query, especially when the produced SQL query is needlessly complex or inefficient. I’ve seen Django always produce an ORDER BY for one table, even when not needed, I’ve also seen it always join tables even when you’re getting no columns from one of them and there’s no way you’re asking for it. In fact, I had to revert to raw SQL for one of my performance improvements as I just couldn’t beat it into submission: [10/11] Be sensible computing project patch counts.

An ORM can be great for getting apps out quickly, or programming in a familiar way. But like many things, an understanding of what is going on underneath is key for extracting maximum performance.

Also, if you ever hear something like “ORM $x doesn’t scale” then maybe that person just hasn’t looked at how to use the ORM better. The same goes for “database $y doesn’t scale” – especially if it’s a long existing relational database such as MySQL or PostgreSQL.

Speeding up listing current patches for a project

17 SQL queries in 4477ms. More than 4 seconds in the database does not make page load time great.

Fortunately though, the Django development environment lets you really easily dive into what queries are being generated and (at least roughly) where they’re being generated from. There’s a sidebar in your browser that shows how many SQL queries were needed to generate the page and how long they took. The key to making your application go faster is to run fewer queries in less time.

I was incredibly impressed with how easy it was to see what queries were run, where they were run from, and the EXPLAIN output for them.

By clicking on that SQL button on the right side of your browser, you get this wonderful chart of what queries were executed, when, and how long they took. From this, it is incredibly obvious which query is the most problematic: the one that took more than four seconds!

In the dim dark days of web development, you’d have to turn on a Slow Query Log on the database server and then grep through your source code or some other miserable activity. I am so glad I didn’t have to do that.

More than four seconds for a single database query does not make for a nice UX.

This particular query was a real hairy one, the EXPLAIN output from PostgreSQL (and MySQL) was certainly long and involved and would most certainly not feature in the first half of an “Introduction to query optimization” day long workshop. If you haven’t brushed up on various bits of documentation on understanding EXPLAIN, you should! The MySQL EXPLAIN FORMAT=JSON is especially fantastic for getting deep details as to what’s going on with query execution.

The big performance gain here was to have the database be able to execute the query in a much more efficient way by creating two covering indexes for part of the query. To work out what indexes to create, one has to look at the EXPLAIN output and work out why the database is choosing to do either a sequential scan of a large table, or use an index that doesn’t exclude that many rows. In this case, I tweaked the code to slightly change the query that was generated as well as adding a covering index. What we ended up with is something that is dramatically faster.

The main query is ~350x faster than before

You’ll notice that the first query there appears to take a lot more time, but it doesn’t; it just takes a lot more time relative to the (now much faster) main query.

In fact, this particular page is one that people have mentioned at being really, really slow to load. With the main query now about 350 times faster than it was originally, it shouldn’t be a problem anymore.

A future improvement would be to cache the COUNT() for the common case, as it’s pretty easily computed when new patches come in or states change.

The patches that help this particular page have been submitted upstream here:

Making viewing a patch with comments faster

Now that we can list patches faster, can we make other pages that Patchwork has quicker?

One such page is viewing a patch (or cover letter) that has a lot of comments on it. One feature of Patchwork is that it will display all the email replies to a patch or cover letter in the Web UI. But… this seemed slow

On one of the most commented patches I could find, we ended up executing one hundred and seventy seven SQL queries to view it! If we dove into it, a bunch of the queries looked really really similar…

I’ve got 99 queries where I only need 1.

The problem here is that the Patchwork UI is wanting to find out the name of each person who submitted a comment, and is doing that by querying the ID from a table. What it should be doing instead is a SQL JOIN on the original query and just fetching all that information in one go: make the database server do the work, it’s really good at it.

My patch [02/11] 4x performance improvement for viewing patch with many comments does just that by using the select_related() method correctly, as well as being explicit about what information we want to retrieve.
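
In Django terms, the difference looks roughly like this (model and field names here are illustrative, not Patchwork’s actual schema):

# before: one query for the comments, then one more per comment for its author
for comment in Comment.objects.filter(submission=patch):
    print(comment.submitter.name)    # each .submitter access hits the database

# after: a single query with a JOIN, fetching only the columns we need
comments = (Comment.objects.filter(submission=patch)
            .select_related('submitter')
            .only('content', 'submitter__name'))
for comment in comments:
    print(comment.submitter.name)    # already fetched; no extra query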

We’re now down to only a few milliseconds to grab all the comments

With that patch, we’re down to a constant number of queries and around a 3x-7x faster time executing them depending if we have a warm cache or not.

The one time I had to use raw SQL

When viewing a project page (such as https://patchwork.ozlabs.org/project/qemu-devel/ ) it displays the number of patches (archived and not archived) for the project. By looking at what SQL queries are executed to collect these numbers, you’ll notice two things. First, here are the queries:

COUNT() queries can be expensive

First thing you’ll notice is that they took a loooooong time to execute (more than a second each). The second thing, if you look closer, is that they contain a join which is completely unneeded.

I spent a good long while trying to make Django behave, and I just could not. I believe it’s due to the model having some inheritance in it. Writing the query by hand ended up being the best solution, and it gave a significant performance improvement:

Unfortunately, only 4x faster.

Arguably, a better way would be to precompute the count for the archived/non-archived patches and just display them. I (or someone else who knows more about Django) may want to look at that for a future improvement.

Conclusion and final thoughts

There’s a few more places where there could be some optimizations, but currently I cannot get any single page to take more than between 40-400ms in the database when running on my laptop – and that’s Good Enough(TM) for now.

The next steps are getting these patches through a round or two of review, and then getting them into a Patchwork release and deployed out on patchwork.ozlabs.org and see if people can find any new ways to make things slow.

If you’re interested, the full patchset with cover letter is here: [00/11] Performance for ALL THE THINGS!

The diffstat is interesting, as most of the added code is auto-generated by Django for database migrations (adding of indexes).

 .../migrations/0027_add_comment_date_index.py | 23 +++++++++++++++++
 .../0028_add_list_covering_index.py           | 19 ++++++++++++++
 .../0029_add_submission_covering_index.py     | 19 ++++++++++++++
 patchwork/models.py                           | 21 ++++++++++++++--
 patchwork/templates/patchwork/submission.html | 16 ++++++------
 patchwork/views/__init__.py                   |  8 +++++-
 patchwork/views/cover.py                      |  5 ++++
 patchwork/views/patch.py                      |  7 ++++++
 patchwork/views/project.py                    | 25 ++++++++++++++++---
 9 files changed, 128 insertions(+), 15 deletions(-)
 create mode 100644 patchwork/migrations/0027_add_comment_date_index.py
 create mode 100644 patchwork/migrations/0028_add_list_covering_index.py
 create mode 100644 patchwork/migrations/0029_add_submission_covering_index.py
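
For the curious, one of those auto-generated index migrations looks roughly like this (a sketch with placeholder names, not the actual patch):

from django.db import migrations, models

class Migration(migrations.Migration):

    dependencies = [('patchwork', '0026_previous_migration')]

    operations = [
        migrations.AddIndex(
            model_name='patch',
            index=models.Index(
                # cover the columns the list query touches so it can be
                # answered from the index without visiting the table rows
                fields=['archived', 'project', 'state', 'delegate'],
                name='patch_list_covering_idx',
            ),
        ),
    ]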

I think the lesson is that making dramatic improvements to performance of your Django based app does not mean you have to write a lot of code or revert to raw SQL or abandon your ORM. In fact, use it properly and you can get a looong way. It’s just that to use it properly, you’re going to have to understand the layer below the ORM, and not just treat the database as a magic black box.

ccache and op-build

You may have heard of ccache (Compiler Cache) which saves you heaps of real world time when rebuilding a source tree that is rather similar to one you’ve recently built before. It’s really useful in buildroot based projects where you’re building similar trees, or have done a minor bump of some components.

In trying to find a commit which introduced a bug in op-build (OpenPOWER firmware), I noticed that hostboot wasn’t being built using ccache and we were always doing a full build. So, I started digging into it.

It turns out that a bunch of the perl scripts for parsing the Machine Readable Workbook XML in hostboot did a bunch of things like foreach $key (%hash) – which means that the code iterates over the items in hash order rather than an order that would produce predictable output such as “attribute name” or something. So… much messing with that later, I had hostboot generating the same output for the same input on every build.

The next step was to work out why I was still getting a lot of ccache misses. It turns out the default ccache size is 5GB, and a full hostboot build uses around 7.1GB.

So, if building op-build with CCACHE, be sure to set both BR2_CCACHE=y in your config as well as something like BR2_CCACHE_INITIAL_SETUP="--max-size 20G"
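
You can check how the cache is doing (and resize it by hand) with ccache itself:

$ ccache -s          # show cache size, hit and miss statistics
$ ccache -M 20G      # raise the maximum cache size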

Hopefully my patches hit hostboot and op-build soon.

Switching to iPhone: Part 1

I have used Android phones since the first one: the G1. I’m one of the (relatively) few people who has used Android 1.0. I’ve had numerous Android phones since then, mostly the Google flagship.

I have fond memories of the Nexus One and Galaxy Nexus, as well as a bunch of time running Cyanogen (often daily builds, because YOLO) to get more privacy preserving features (or a more recent Android). I had a Sony Z1 Compact for a while which was great bang for buck except for the fact the screen broke whenever you looked at it sideways. Great kudos to the Sony team for being so friendly to custom firmware loads.

I buy my hardware from physical stores. Why? Well, it means that the NSA and others get to spend extra effort to insert hardware modifications (backdoors), as well as the benefit of having a place to go to/set the ACCC on to get my rights under Australian Consumer Law.

My phone before last was a Nexus 5X. There were a lot of good things about this phone; the promise of fast charging via USB-C was one, as was the ever improving performance of the hardware and Android itself. Well… it just got progressively slower, and slower, and slower – as if it was designed to get near unusable by the time of the next Google phone announcement.

Inevitably, my 5X succumbed to the manufacturing defect that resulted in a boot loop. It would start booting, and then spontaneously reboot, in a loop, forever. The remedy? Replace it under warranty! That would take weeks, which isn’t a suitable timeframe in this day and age to be without a phone, so I mulled over buying a Google Pixel or my first ever iPhone (my iPhone owning friends assured me that if such a thing happens with an iPhone that Apple would have swapped it on the spot). Not wanting to give up a lot of the personal freedom that comes with the Android world, I spent the $100 more to get the Pixel, acutely aware that having a phone was now a near $1000/year habit.

The Google Pixel was a fantastic phone (except the price; they should have matched the iPhone price). The camera was the first phone camera I actually went “wow, I’m impressed” over. The eye-watering $279 to replace a cracked screen, the still eye-watering cost of USB-C cables, and the wait to process the HDR photos were all forgiven. It was a good phone. Until, that is, less than a year in, the battery was completely shot. It would power off at less than 40% charge and couldn’t last the trip from Melbourne Airport to Melbourne city.

So, with a flagship phone well within the “reasonable quality” time that consumer law would dictate, I contacted Google after going through all the standard troubleshooting. Google agreed this was not normal and that the phone was defective. I was told that they would mail me a replacement, I could transfer my stuff over and then mail in the broken one. FANTASTIC!! This was soooo much better than the experience with the 5X.

Except that it wasn’t. A week later, I rang back to ask what was going on as I hadn’t received the replacement; it turns out Google had lied to me: I’d have to mail the phone to them, and then another ten business days later I’d have a replacement. Errr…. no, I’ve been here before.

I rang the retailer, JB Hi-Fi; they said it would take them at least three weeks, which I told them was not acceptable nor a “reasonable timeframe” as dictated by consumer law.

So, with a bunch of travel imminent, I bought a big external USB-C battery and kept it constantly connected as without it the battery percentage went down faster than the minutes ticked over. I could sort it out once I was back from travel.

So, I’m back. In fact, I drove back from a weekend away and finally bit the bullet – I went to pick up a phone whose manufacturer has a reputation for supporting their hardware.

I picked up an iPhone.

I figured I should write up how and why I switched phone platforms, and my experiences in doing so. I think my next post will be “Why iPhone and not a different Android”.

CVE-2019-6260: Gaining control of BMC from the host processor

These are the details for CVE-2019-6260 – which has been nicknamed “pantsdown” due to the feeling that we’ve “caught chunks of the industry with their…”, combined with the fact that naming things is hard, so if you pick a bad name somebody would have to come up with a better one before you publish.

I expect OpenBMC to have a statement shortly.

The ASPEED ast2400 and ast2500 Baseboard Management Controller (BMC) hardware and firmware implement Advanced High-performance Bus (AHB) bridges, which allow arbitrary read and write access to the BMC’s physical address space from the host, or from the network if the BMC console UART is attached to a serial concentrator (this is atypical for most systems).

Common configuration of the ASPEED BMC SoC’s hardware features leaves it open to “remote” unauthenticated compromise from the host and from the BMC console. This stems from AHB bridges on the LPC and PCIe buses, another on the BMC console UART (hardware password protected), and the ability of the X-DMA engine to address all of the BMC’s M-Bus (memory bus).

This affects multiple BMC firmware stacks, including OpenBMC, AMI’s BMC, and SuperMicro. It is independent of host processor architecture, and has been observed on systems with x86_64 processors and IBM POWER processors (there is no reason to suggest that other architectures wouldn’t be affected; these are just the ones we’ve been able to get access to).

The LPC, PCIe and UART AHB bridges are all explicitly features of Aspeed’s designs: They exist to recover the BMC during firmware development or to allow the host to drive the BMC hardware if the BMC has no firmware of its own. See section 1.9 of the AST2500 Software Programming Guide.

The typical consequence of external, unauthenticated, arbitrary AHB access is that the BMC fails to ensure all three of confidentiality, integrity and availability for its data and services. For instance it is possible to:

  1. Reflash or dump the firmware of a running BMC from the host
  2. Perform arbitrary reads and writes to BMC RAM
  3. Configure an in-band BMC console from the host
  4. “Brick” the BMC by disabling the CPU clock until the next AC power cycle

Using 1 we can obviously implant any malicious code we like, with the impact of BMC downtime while the flashing and reboot take place. This may take the form of minor, malicious modifications to the officially provisioned BMC image, as we can extract, modify, then repackage the image to be re-flashed on the BMC. As the BMC potentially has no secure boot facility it is likely difficult to detect such actions.

Abusing 3 may require valid login credentials, but combining 1 and 2 we can simply change the locks on the BMC by replacing all instances of the root shadow password hash in RAM with a chosen password hash – one instance of the hash is in the page cache, and from that point forward any login process will authenticate with the chosen password.

We obtain the current root password hash by using 1 to dump the current flash content, then using https://github.com/ReFirmLabs/binwalk to extract the rootfs, then simply loop-mounting the rootfs to access /etc/shadow. At least one BMC stack doesn’t even require this, and instead offers “Press enter for console”.
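As a sketch of the "changing the locks" step: once you have read and write primitives over BMC RAM, the rest is a search-and-replace. Here read_bmc() and write_bmc() are hypothetical stand-ins for whichever bridge is in use, not a real API, and the replacement hash must be the same length as the original so the surrounding shadow-file fields stay intact:

    CHUNK = 1 << 20  # scan RAM a megabyte at a time

    # Placeholders for the actual AHB bridge primitives; NOT a real API.
    def read_bmc(addr: int, length: int) -> bytes: ...
    def write_bmc(addr: int, data: bytes) -> None: ...

    def swap_root_hash(old_hash: bytes, new_hash: bytes,
                       ram_base: int, ram_size: int) -> None:
        assert len(new_hash) == len(old_hash), "must replace in place"
        for base in range(ram_base, ram_base + ram_size, CHUNK):
            # read slightly past the chunk so a hash straddling a
            # boundary is still found
            buf = read_bmc(base, CHUNK + len(old_hash))
            pos = buf.find(old_hash)
            while pos != -1:
                write_bmc(base + pos, new_hash)
                pos = buf.find(old_hash, pos + len(old_hash))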

IBM has internally developed a proof-of-concept application that we intend to open-source, likely as part of the OpenBMC project, that demonstrates how to use the interfaces and probes for their availability. The intent is that it be added to platform firmware test suites as a platform security test case. The application requires root user privilege on the host system for the LPC and PCIe bridges, or normal user privilege on a remote system to exploit the debug UART interface. Access from userspace demonstrates the vulnerability of systems in bare-metal cloud hosting lease arrangements where the BMC is likely in a separate security domain to the host.

OpenBMC Versions affected: Up to at least 2.6, all supported Aspeed-based platforms

It only affects systems using the ASPEED ast2400 and ast2500 SoCs. There has not been any investigation into other hardware.

The specific issues are listed below, along with some judgement calls on their risk.

iLPC2AHB bridge Pt I

State: Enabled at cold start
Description: A SuperIO device is exposed that provides access to the BMC’s address-space
Impact: Arbitrary reads and writes to the BMC address-space
Risk: High – known vulnerability and explicitly used as a feature in some platform designs
Mitigation: Can be disabled by configuring a bit in the BMC’s LPC controller, however see Pt II.

iLPC2AHB bridge Pt II

State: Enabled at cold start
Description: The bit disabling the iLPC2AHB bridge only removes write access – reads are still possible.
Impact: Arbitrary reads of the BMC address-space
Risk: High – we expect the capability and mitigation are not well known, and the mitigation has side-effects
Mitigation: Disable SuperIO decoding on the LPC bus (0x2E/0x4E decode). Decoding is controlled via hardware strapping and can be turned off at runtime, however disabling SuperIO decoding also removes the host’s ability to configure SUARTs, System wakeups, GPIOs and the BMC/Host mailbox

PCIe VGA P2A bridge

State: Enabled at cold start
Description: The VGA graphics device provides a host-controllable window mapping onto the BMC address-space
Impact: Arbitrary reads and writes to the BMC address-space
Risk: Medium – the capability is known to some platform integrators and may be disabled in some firmware stacks
Mitigation: Can be disabled, or writes can be filtered to coarse-grained regions of the AHB, by configuring bits in the System Control Unit

DMA from/to arbitrary BMC memory via X-DMA

State: Enabled at cold start
Description: X-DMA available from VGA and BMC PCI devices
Impact: Misconfiguration can expose the entirety of the BMC’s RAM to the host
AST2400 Risk: High – SDK u-boot does not constrain X-DMA to VGA reserved memory
AST2500 Risk: Low – SDK u-boot restricts X-DMA to VGA reserved memory
Mitigation: X-DMA accesses are configured to remap into VGA reserved memory in u-boot

UART-based SoC Debug interface

State: Enabled at cold start
Description: Pasting a magic password over the configured UART exposes a hardware-provided debug shell. The capability is only exposed on one of UART1 or UART5, and interactions are only possible via the physical IO port (cannot be accessed from the host)
Impact: Misconfiguration can expose the BMC’s address-space to the network if the BMC console is made available via a serial concentrator.
Risk: Low
Mitigation: Can be disabled by configuring a bit in the System Control Unit

LPC2AHB bridge

State: Disabled at cold start
Description: Maps LPC Firmware cycles onto the BMC’s address-space
Impact: Misconfiguration can expose vulnerable parts of the BMC’s address-space to the host
Risk: Low – requires reasonable effort to configure and enable.
Mitigation: Don’t enable the feature if not required.
Note: As a counter-point, this feature is used legitimately on OpenPOWER systems to expose the boot flash device content to the host

PCIe BMC P2A bridge

State: Disabled at cold start
Description: PCI-to-BMC address-space bridge allowing memory and IO accesses
Impact: Enabling the device provides limited access to BMC address-space
Risk: Low – requires some effort to enable, constrained to specific parts of the BMC address space
Mitigation: Don’t enable the feature if not required.

Watchdog setup

State: Required system function, always available
Description: Misconfiguring the watchdog to use “System Reset” mode for BMC reboot will re-open all the “enabled at cold start” backdoors until the firmware reconfigures the hardware otherwise. Rebooting the BMC is generally possible from the host via IPMI “mc reset” command, and this may provide a window of opportunity for BMC compromise.
Impact: May allow arbitrary access to BMC address space via any of the above mechanisms
Risk: Low – “System Reset” mode is unlikely to be used for reboot due to obvious side-effects
Mitigation: Ensure BMC reboots always use “SOC Reset” mode

The CVSS score for these vulnerabilities is: https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:A/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H/E:F/RL:U/RC:C/CR:H/IR:H/AR:M/MAV:L/MAC:L/MPR:N/MUI:N/MS:U/MC:H/MI:H/MA:H

There is some debate about whether this is a local or remote vulnerability; it depends on whether you consider the connection between the BMC and the host processor to be a network or not.

The fix is platform dependent as it can involve patching both the BMC firmware and the host firmware.

For example, we have mitigated these vulnerabilities for OpenPOWER systems, both on the host and BMC side. OpenBMC has a u-boot patch that disables the features:

https://gerrit.openbmc-project.xyz/#/c/openbmc/meta-phosphor/+/13290/

Platforms can opt in to it in the following way:

https://gerrit.openbmc-project.xyz/#/c/openbmc/meta-ibm/+/17146/

The process is opt-in for OpenBMC platforms because platform maintainers know whether their platform uses the affected hardware features. This matters when disabling the iLPC2AHB bridge, as doing so can be a bit of a finicky process.

See also https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/11164 for a WIP OpenBMC Security Architecture document which should eventually contain all these details.

For OpenPOWER systems, the host firmware patches are contained in op-build v2.0.11 and enabled for certain platforms. Again, this is not by default for all platforms as there is BMC work required as well as per-platform changes.

Credit for finding these problems: Andrew Jeffery, Benjamin Herrenschmidt, Jeremy Kerr, Russell Currey, Stewart Smith. There have been many more people who have helped with this issue, and they too deserve thanks.

Tracing flash reads (and writes) during boot

On OpenPOWER POWER9 systems, we typically talk to the flash chips that hold firmware for the host (i.e. the POWER9) processor through a daemon running on the BMC (aka service processor) rather than directly.

We have host firmware map “windows” on the LPC bus to parts of the flash chip. This flash chip can in fact be a virtual one, constructed dynamically from files on the BMC.

Since we’re mapping windows into this flash address space, we have some knowledge of what IO the host is doing to/from the PNOR. We can use this to output data in the blktrace format and feed it into the existing tools used to analyze IO patterns.

So, with a bit of learning of the data format and learning how to drive the various tools, I was ready to patch the BMC daemon (mboxbridge) to get some data out.
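The actual patch emits blktrace-format records so the standard tools can consume them. The sketch below, in Python with invented names rather than the C of mboxbridge, just shows the shape of the instrumentation: every window request gets logged with a timestamp, the flash offset and size, and whether the window was newly created or re-used from the BMC's cache:

    import time

    class WindowTracer:
        def __init__(self, path):
            self.log = open(path, 'w')
            self.t0 = time.monotonic()

        def record(self, flash_offset, size, reused):
            # one line per window event: easy to graph or replay later
            self.log.write("%.6f %#x %#x %s\n" % (
                time.monotonic() - self.t0, flash_offset, size,
                "reuse" if reused else "create"))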

An initial bit of data is a graph of the windows into PNOR opened up during a normal boot (see below).

PNOR windows created over the course of a normal boot.

This shows us that over the course of the boot, we open a bunch of windows, and switch them around a fair bit early on. This makes sense as early in boot we do not yet have DRAM working and page in firmware on-demand into L3 cache.

Later in boot, you can see the loading of larger chunks of firmware into memory. It’s also possible to see that this seems to take longer than it should – and indeed, we have a bug there.

Next, by modifying the code again, I introduced recording of when we used a window that the BMC had already cached. While the host will only see one window at a time, the BMC can keep around the ones it prepared earlier in order to avoid IO to the actual flash chips (which are SPI flash, so aren’t incredibly fast).

Here we can see that we’re likely not doing the most efficient things during boot, and there’s probably room for some optimization.

Normal boot, but including re-used windows rather than just created ones

Finally, in order to get finer grained information, I reduced the window size from one megabyte down to 4096 bytes. This will impose a heavy speed penalty as it’ll mean we will have to create a lot more windows to do the same amount of IO, but it means that since we’re using the page size of hostboot, we’ll see each individual page in/out operation that it does during boot.

So, from the next graph, we can see that there’s several “hot” areas of the image, and on the whole it’s not too many pages. This gives us a hint that a bit of effort to reduce binary image size a little bit could greatly reduce the amount of IO we have to do.

4096-byte (i.e. page-sized) windows, capturing the bits of flash we need to read in several times because we’re low on memory while L3-cache constrained.

The iowatcher tool can also construct a video of the boot showing which “blocks” are being read.

Video of what blocks are read from flash during booting

So, what do we get from this adventure? Well, we get a good list of things to look into in order to improve boot performance, and we get to back these up with data rather than guesswork. Since this also works on unmodified host firmware, we’re measuring what we really boot rather than changing it in order to measure it.

What you need to reproduce this:

Switching to iPhone Part 2: Seriously?

In which I ask of Apple, “Seriously?”.

That was pretty much my reaction with Apple sticking to Lightning connectors rather than going with the USB-C standard. Having USB-C around the place for my last two (Android) phones was fantastic. I could charge a phone, external battery, a (future) laptop, all off the same wall wart and with the same cable. It is with some hilarity that I read that the new iPad Pro has USB-C rather than Lightning.

But Apple’s dongle fetish reigns supreme, and so I get a multitude of damn dongles all for a wonderfully inflated price with an Australia Tax whacked on top.

The most egregious one is the Lightning-to-3.5mm dongle. In the office, I have a good set of headphones. The idea is to block out the sound of an open plan office so I can actually get some concentrating done. With tiny dedicated MP3 players and my previous phones, these sounded great. The Apple dongle? It sounds terrible. Absolutely terrible. The Lightning-to-3.5mm adapter might be okay for small earbuds but it is nearly completely intolerable for any decent set of headphones. I’m now in the market for a Bluetooth headphone amplifier. Another bunch of money to throw at another damn dongle.

Luckily, there seems to be a really good Bluetooth headphone amplifier on Amazon. The same Amazon that no longer ships to Australia. Well, there’s an Australian seller, for six times the price.

Urgh.

February 25, 2019

What Do You Mean "No"?

When building small Linux images, having separate user accounts isn't always at the top of the list of things to include. Petitboot is no different; the most common operations like mounting disks, configuring interfaces, and calling kexec all require root, and Petitboot generally only exists long enough to boot into the next thing, so why not run it all as root?

The picture is less clear when we start to think about what is possible to do in Petitboot by default. If someone comes across an open Petitboot console they're only a few keystrokes away from wiping disks, changing files, or even flashing firmware. Depending on how your system is used that may or may not be something you care about, but over time there have been a few requests to "add a password screen to Petitboot" to at least make it so that the system isn't open season for whoever sees it.

Enter Password:

The most direct way to avoid this would be to slap a password prompt onto Petitboot before any changes can be made. There are two immediate drawbacks to this:

  • The Petitboot UI still runs as root, and
  • Exiting to the shell gives the user root permissions as well.

There is already a mechanism to prevent the user exiting to the shell, but this puts all of our eggs in the basket of petitboot-nc being a secure program. If a user can accidentally or otherwise find a way to exit or crash the UI then they're immediately in a root shell, and while petitboot-nc is a good UI it was never designed to be a hardened program protecting the system.

You Have No Power Here

The idea instead, as of Petitboot v1.10.0, is not to care if the user drops to the shell, because now it's completely unprivileged.

Normal shell

The only process that now runs as root is pb-discover itself; the console, UI, and helper scripts run as a new 'petituser'. For the server and clients to still communicate, the "petitboot.ui" socket permissions are modified to allow processes that are part of the 'petitgroup' to connect. However, if pb-discover notices that a connecting client is in the petitgroup (or, more accurately, that the client isn't running as root), by default it ignores any commands from it that would configure or boot the system.
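The mechanics here are plain Unix: restrict the socket to the group, then check the peer's credentials on connect. A rough Python sketch of the idea follows; pb-discover itself is C, and the socket path here is illustrative:

    import grp, os, socket, struct

    SOCK_PATH = '/run/petitboot.ui'   # illustrative path

    server = socket.socket(socket.AF_UNIX, socket.SOCK_STREAM)
    server.bind(SOCK_PATH)
    # group-only access: root and members of petitgroup may connect
    os.chown(SOCK_PATH, 0, grp.getgrnam('petitgroup').gr_gid)
    os.chmod(SOCK_PATH, 0o660)
    server.listen(1)

    conn, _ = server.accept()
    # SO_PEERCRED tells us who is really on the other end of the socket
    pid, uid, gid = struct.unpack('3i', conn.getsockopt(
        socket.SOL_SOCKET, socket.SO_PEERCRED, struct.calcsize('3i')))
    privileged = (uid == 0)  # unprivileged clients must authenticate first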

A new command, PB_PROTOCOL_ACTION_AUTHENTICATE, lets a client send a password to the server to then be allowed to send all the usual commands like updating the config or booting a specific option. This keeps all the authentication on the server side, avoiding writing any "secure" ncurses code. In the UI the biggest difference is that when trying to change something the user will hit a password field:

Denied

Then the password is sent to the server, checked, and if correct the action goes ahead.

Whose Passwords?

But where does this password come from? Technically it's just the root password. The server computes a hash of the supplied password and compares it against the system's root password hash. Similarly, in the shell, the user can run sudo with the root password to enter a full shell if needed:

Oops

Petitboot of course runs in memory, and writing a root password into the image itself would mean recompiling to change the password, so instead Petitboot pulls the root password from NVRAM. On startup, Petitboot reads the petitboot,password parameter, which is the hash of the root password, and updates /etc/shadow with it. This happens before any clients are up or can connect to the server.
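For illustration, the server-side check amounts to the classic crypt(3) dance: re-hash the supplied password using the salt embedded in the stored hash, then compare. A minimal Python sketch, assuming a crypt-style hash as stored in /etc/shadow (the real code lives in pb-discover, in C):

    import crypt
    import hmac

    def authenticate(supplied_password: str, shadow_hash: str) -> bool:
        # crypt() re-derives the hash using the salt (and algorithm id)
        # already embedded in shadow_hash, e.g. "$6$salt$..."
        computed = crypt.crypt(supplied_password, shadow_hash)
        # constant-time comparison avoids leaking prefix matches
        return computed is not None and hmac.compare_digest(computed,
                                                            shadow_hash)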

Don't Panic

By default no password is set. After all, we don't want people upgrading and then somehow being locked out of their system. For ease of use, and for testing purposes, if no password is configured and the user drops to the shell, it is automatically upgraded to a root shell:

Elevated

To set a password there is a new subscreen in System Configuration:

New Password

This sends an authentication command to the server and, assuming the client is authenticated with the current password as well, pb-discover updates the shadow file and writes the hash back to NVRAM.


User support exists in Petitboot v1.10.0 onwards. You will also need some support in your build system to set up the users; see how op-build did it for an example.

There are still a few items on the TODO list which would be good to have: for example, storing the password hash in an attached TPM if available, and splitting out more of what runs as root. The bootloader parsers in pb-discover, for instance, preferably wouldn't run with root privileges, but they are all part of the one binary.

As always, comments, suggestions, and patches welcome on the list!

February 24, 2019

Audiobooks – January 2019

The Grandmaster: Magnus Carlsen and the Match That Made Chess Great Again by Brin-Jonathan Butler

Not a lot about the match. The author rolls out a bunch of random chess stories and profiles instead. 4/10

The Next American City: The Big Promise of Our Midsize Metros by Mick Cornett

The four-term Mayor of Oklahoma City goes over projects that OKC and other mid-sized cities implemented to improve their cities and fight back against the large metros. 8/10

21 Lessons for the 21st Century by Yuval Noah Harari

Apparently a lot of expanded essays but still a lot of interesting stuff in there. The good ones are great and the bad ones are okay. 7/10

Chasing the Demon: A Secret History of the Quest for the Sound Barrier, and the Band of American Aces Who Conquered It by Dan Hampton

Covers some early aircraft and aerodynamics history, then the lives of pilots who would break the sound barrier & then the actual event (or events!). 7/10


February 18, 2019

Abaddon’s Gate


This is the third book in The Expanse series by James S. A. Corey, which opened with Leviathan Wakes. Just as good as the first two, this is a story about how much a daughter loves her father, perhaps beyond reason, about moral choices, and about politics — just as much as it is the continuation of the story arc around the alien visitor.

Another excellent book, with a bit more emphasis on space battles than previously and an overall enjoyable plot line. Worth a read, to be honest I think the series is getting better.

Abaddon's Gate
James S. A. Corey
Fiction
Hachette UK
June 4, 2013
560 pages

The third book in the New York Times bestselling Expanse series. NOW A MAJOR TV SERIES FROM NETFLIX For generations, the solar system - Mars, the Moon, the Asteroid Belt - was humanity's great frontier. Until now. The alien artefact working through its program under the clouds of Venus has emerged to build a massive structure outside the orbit of Uranus: a gate that leads into a starless dark. Jim Holden and the crew of the Rocinante are part of a vast flotilla of scientific and military ships going out to examine the artefact. But behind the scenes, a complex plot is unfolding, with the destruction of Holden at its core. As the emissaries of the human race try to find whether the gate is an opportunity or a threat, the greatest danger is the one they brought with them.


5th axis getting up to speed

After a little bit of calibration, and some tinkering in the Fusion 360 CPS output filter files, the CNC is starting to take advantage of the new 5th axis. Below is a "Flow" multiaxis toolpath finishing a sphere with a flat endmill.


There are still some issues that I have to address, but it has been a surprisingly good experience so far. I started out cutting 5 sides of a block, then moved on to cutting the edges at an angle. This is the third test, where I rough out the sphere using a 3D adaptive path and then use a Flow multiaxis path to clean things up. Things look better in video, and there is some of that to come.

February 17, 2019

Using D-STAR with Codec 2

Antony Chazapis, SV9OAN, has been extending the D-STAR protocol to use Codec 2. Below is a brief explanation of the project; please also see his GitHub page. Over to Antony:

Instead of designing a new protocol, I propose a D-STAR protocol extension that allows the use of Codec 2 in frames instead of AMBE. There are many Codec 2 variants, so I started with the highest-bitrate one (3200 mode) without FEC, which is fine for over-the-Internet communication. I then modified xlxd to add a new listener for the D-STAR protocol variant (named “DExtra Open”) and ambed to transcode from one codec to two (e.g. AMBE in, Codec 2/AMBE+ out).

In theory you could do transmissions with the new protocol, but the problem is that there are no real clients that implement it – yet. Actually, the only clients available now are my Python-based dv-player and dv-recorder which connect to xlxd, play and record dvtool files respectively. I am thinking about software-based implementations at the moment: Android, iOS, Mac OS, Linux, etc.

Another direction would be to figure out a FEC implementation and do real over-the-air experiments with MMDVM based devices, assuming a microphone input and a speaker output.

73 de SV9OAN

About Antony

Antony Chazapis, SV9OAN, has been licensed since 2009. He is a member of RAAG, RASC, ARRL, and the volunteer group HARES (Hellenic Amateur Radio Emergency Service). Since 2017, he has been serving as the Association Manager for the SOTA award program in Greece. Antony enjoys DXing and exploring what makes digital modes tick. He is the author of PocketPacket, an APRS client application for macOS and iOS. Recently, he has been working on enabling D-STAR communications with Codec 2. He holds a PhD in Computer Science and is currently employed by a Toronto-based tech startup implementing bleeding-edge storage solutions.

Update – Feb 2019

About two months later, Antony has introduced a protocol variant based on Codec 2’s 2400 mode with added FEC. He has also implemented Estrella, a DExtra Open client that allows direct communication through xlxd reflectors (macOS and iOS versions are available now, with more platforms to follow).

Estrella for macOS
Estrella for iOS

February 15, 2019

Thematic Review 3 : Practical Implementations of Information Systems

Following examples of a successful IS implementation (the Bankers Trust of Australia) and an ongoing failure (the UK's Universal Credit project), it is opportune to consider a subset of IS that is common to both projects and determine whether there is a general rule that can be applied. In particular, attention is drawn to the fact that the IT component of the IS project was largely developed in-house in the successful case and fully outsourced in the failed one. Common sense should dictate that this cannot be universalised; surely not every introduction of an IS system must have the IT component developed in-house? One wonders whether there are any thorough studies with practical implementations to explore when in-house or outsourced development is appropriate.

Traylor (2006) looks at this very issue, arguing "Decades of trial, error, and egghead analysis have yielded a consensus conclusion: Buy when you need to automate commodity business processes; build when you're dealing with the core processes that differentiate your company", whilst immediately pointing out that such a theoretical model (doubtless built on numerous empirical examples) may encounter problems with reality: in-house software for standard processes may come with very high transition costs, whereas external commercial packages may actually be a very good fit for strategic plans. Traylor continues: "'Buy to standardize, build to compete' may be terrific as an ideology, but the choices that real companies face are a lot messier ... and more interesting."

Following the arguments of several senior executives from companies with significant IT investments, some variation is noted. The former CIO of PricewaterhouseCoopers argues that "[E]verybody knows" that standardised and off-the-shelf purchases are more cost-effective to implement and maintain, whereas the chief IT architect at MCI argues for in-house development where it can yield incremental revenue or a competitive advantage, whilst at the same time encouraging code re-use. Whilst decision metrics (cost, skill-sets, strategic value, etc.) remain the same in either approach, and most firms have a combination of both, Traylor also points out - almost fifteen years ago now - that open source software offers the best of both, providing standards-compliance and bespoke elements. Emphasis is also placed on the software lifecycle, with the stated claim that 70 percent of software costs occur after implementation.

Even large and security-conscious companies, such as Visa, which have an enormous amount of in-house development, will purchase commercial solutions when there is no competitive advantage to be achieved and the costs of the in-house approach would be greater. Nevertheless, when it comes to commercial software, a clear warning is expressed about developing components outside the commercial package that are nevertheless dependent on it. The result of such an approach is in-house software with critical dependencies outside the control of the organisation, which is both wasteful and dangerous. In many cases, commercial applications offer their own extensions which can fit existing varied business processes.

This is not to say that in-house development should not occur alongside or even with commercial offerings. It should just not be for the same business process (e.g., an accounting package with an in-house accounting extension). An example is given of the District of Columbia city government which has migrated from a myriad of in-house applications to commercial applications, yet still has specific business processes (such as Business Intelligence) which are developed in-house, because commercial applications did not meet the needs of the organisation. Sometimes opportunities even exist for commercialisation with existing vendors to develop new applications, such as the work between the University of Pittsburgh Medical Center and Stentor for a picture-archiving communications system.

Traylor argues that IT planning software is now taking these decisions into account, helping managers determine whether to build or buy, and gives an example of the questions confronting MCI on a system to track third-party services inventory. It comes across as a lot less detailed and sophisticated than one would hope for in a purchase of such importance. What is interesting about the series of posed questions is that they are almost entirely framed in terms of an IT business approach rather than an IS approach - there is little evidence that questions of existing business processes, staff involvement, and so on are included in such software packages. Whilst this does suggest that such packages provide some contribution to the decision, it is also evidence that an IS decision must be of broader scope than an IT decision.

Reference

Traylor, Polly (2006). To build or to buy IT applications?, InfoWorld, 13 Feb, 2006

February 14, 2019

Thematic Review 2 : Practical Implementations of Information Systems

When reviewing practical implementations of information systems (IS), incredible failures provide very valuable lessons, even when they are ongoing. At an estimated £12.8bn, far in excess of the originally estimated £2.2bn (Ballard, 2013), the UK's Universal Credit project will be the single most expensive failed or over-budget custom software project, although when adjusted for inflation the UK's NHS Connecting for Health project (mostly abandoned in 2011) also cost around £12bn. Apparently, if one wishes to study exceptional failures in IS, government in general, the UK in particular, and the subcategory of health and welfare is a good place to start. Whilst this may seem slightly snide, it is backed by empirical evidence. Only the US's FAA Advanced Automation System (c. $4.5bn USD, 1994) is really within a similar scale of failure.

Universal Credit, like many such projects, seems on a prima facie level to be designed on reasonable principles. Announced in 2010, with the objective of simplifying working-age welfare benefits and encouraging the take-up of paid work, it would replace six different means-tested benefits and roll them into a single monthly payment which, as paid work was taken up, would gradually be reduced rather than cut off in an all-or-nothing fashion, following the "negative income tax" proposals of Juliet Rhys-Williams and Milton Friedman (Forget, 2012). The project was meant to start in 2013 and be completed by 2017. Under current plans (there have been at least seven timetable completion changes), this has been pushed out to 2023 (Butler, Walker, 2016).

It is important to note that the majority of problems confronting the Universal Credit project are not just IT problems, although there are plenty of those. Attention to detail is evidently lacking, perhaps overlooked by decision-makers who have minimal visceral understanding of those who require welfare. For example, part-time work can lead to a situation where people work more and are paid less (Toynbee, 2013), and the self-employed could lose thousands of pounds per annum. The transition period from the old welfare payments to Universal Credit meant that people on already marginal incomes experienced delays in payments, with the National Audit Office reporting delays of up to eight months (Richardson, 2018). As a result, there are numerous instances of people falling behind in rent payments, food bank requests rocketing (Savage, Jayanetti, 2017), and some even turning to prostitution (Quayle, Box, 2018). These are, in an economic model, a producer-consumer issue where the producer is a monopoly provider of a good (welfare) which has near-vertical demand due to need. From a business perspective in IS, it reflects not just administrative errors, but also a lack of project planning with sufficient input from users.

However, in addition to these errors, the project has been fraught with IT issues. With 15 different suppliers providing components of the outsourced IT systems in the initial stages (DWP, 2012), the suppliers themselves voiced concerns about the capacity to build a nation-wide reporting system within the short time-frame initially provided (Boffey, 2011). The system requires real-time data from a new Pay As You Earn (PAYE) system being developed with HM Revenue & Customs. The resulting IT system has been described as "completely unworkable, badly designed" and already "out of date" (Jee, 2014), and a subsequent survey of staff found that 90% considered the system "inadequate" (Jee, 2015). An audit of the system found that the IT systems "depend heavily on manual intervention and will only handle a small number of claims".

Information Systems needs to be founded on quantitative and realistic expectations of the capacity of a technology to deliver a good or service, and on project plans that incorporate end-users, as they will be the people who "feel the pain" of a poorly designed system. In the case of welfare, this is no mere consumable: the pain is visceral and, without having to engage in hyperbole, the requirements can even be mortal. The evidence provided so far is that the Universal Credit project has engaged neither in a realistic assessment of the IT requirements and capabilities, nor was it designed with sufficient input from the final recipients of the service, and as a result it has delivered a sub-par project. Overall, it constitutes the worst IS project in history.

References

Ballard, Mark (2013), Universal Credit will cost taxpayers £12.8bn, Computer Weekly, 03 June, 2013.
https://www.computerweekly.com/news/2240185166/Universal-Credit-will-cos...

Boffey, Daniel (2011), Universal credit's 2013 delivery could be derailed by complex IT system, The Observer, 19 June 2011
https://web.archive.org/web/20110620210401/http://www.guardian.co.uk/pol...

Butler, Patrick., Walker, Peter (2016) Universal credit falls five years behind schedule, The Guardian, 20 July, 2016.

DWP Central Freedom of Information Team (2012), FoI request from Mr. Robinson, 18 October, 2012
https://www.whatdotheyknow.com/request/130677/response/322518/attach/3/F...

Forget, Evelyn L. (2012), Advocating negative income taxes: Juliet Rhys-Williams and Milton Friedman, Proceedings of the History of Political Economy Conference Spring 2012.
https://web.archive.org/web/20160303234617/http://econ.duke.edu/uploads/...

Jee, Charlotte (2014), Leaked memo says Universal Credit IT 'completely unworkable', Computer World UK, October 27, 2014
https://www.computerworlduk.com/it-management/leaked-memo-says-universal...

Jee, Charlotte (2015), 90 percent of Universal Credit staff say IT systems 'inadequate', Computer World UK, March 9, 2015
https://www.computerworlduk.com/it-management/90-percent-of-universal-cr...

Richardson, Hannah (2018), Universal Credit 'could cost more than current benefits system', BBC News Education, 15 June 2018

Savage, Michael., Jayanetti, Chaminda (2017), Revealed: universal credit sends rent arrears soaring, The Observer, 16 September, 2017
https://web.archive.org/web/20170917160638/https://www.theguardian.com/s...

Toynbee, Polly (2013), Universal credit is simple: work more and get paid less, The Guardian, 12 July, 2013.

Quayle, Jess., Box, Dan (2018), 'I was forced into prostitution by Universal Credit', BBC News, 19 November, 2018.

February 13, 2019

Thematic Review 1 : Practical Implementations of Information Systems

Following definitional foundations and theoretical models in Information Systems (IS) there is a great desire to find some detailed practical applications. The first, the Arcadia project at Bankers Trust Australia Limited (BTAL) for the derivatives group, is almost ancient history as far as computer technology is concerned - it was initiated in 1994. However, as a very successful project it provides excellent information on the processes of implementing new systems into an organisation (Baster et al., 2001).

The distinction at this time was described as a "B-T gap" ("Business-Technology gap"), where business users assumed that IT solutions were unable to capture the "dynamic, holistic, and imprecise" nature of business information and decisions, whereas IT users tended to place the blame on 'error between keyboard and chair'. Naturally enough, recognising the gap and the need to overcome it was an excellent starting point of the project along with stating an objective of an information system that was "extensible, adaptable, and responsive to rapid changes".

In more contemporary times what was being built would be considered an Enterprise Resource Planning (ERP) system, which makes some of the management choices particularly interesting in that context. Because of stated needs, BTAL rejected the possibility of a vendor-based solution, which was estimated to be more expensive and less flexible, giving "BTAL less control for satisfying internal decision-making strategies and responding quickly to business changes." To further assist the flexibility a component-based approach was applied, allowing for well-structured interfaces between specific business and technology procedures. The component-based model combined both business and technological logic into a layered model, where the core technological infrastructure was subject to minimal changes, with high-level GUI-presentation layer being areas of most iterative functional elaborations, and with databases and scripting incorporating the business logic desired.

One means of bridging the B-T gap was to have project implementation carried out by a combination of IT and business users: developers, business drivers, and business users who provided feedback in testing and validating the components, whilst the drivers had detailed expertise in the desired business functionality. This was seen as different from much of the theoretical literature, which suggested high-level abstraction, whilst the project used an understanding of complex details and development through rapid iteration.

The success of the Arcadia project was based on continuing feedback, follow-up steps to reduce the B-T gap, and the interest expressed by the global operations of BTAL. The major lessons learned from the project include understanding the existence of a B-T gap and the need to bridge it, understanding and leveraging user behaviour and skills whilst minimising interface changes, and keeping the project low-profile whilst recruiting detail-orientated staff as testers. Overall, this detail-orientated, layered approach, which combined business needs with technological capability, should be perceived as a model case study for Information Systems.


Figure 1. Business-Technology layers in the Arcadia project

Reference

Baster, Greg., Konana, Prabhudev., Scott, Judy E. (2001), Business components: A case study of Bankers Trust Australia Limited, Communications of the ACM, 44(5):92-98, May 2001

Video Review of a Social Media Webinar

A video review of the webinar/video presentation of Meet Edgar's "Ten Social Media Tips for 2019 Success", for the MSc course in Information Systems (University of Salford).

Video produced through slides made with LibreOffice Impress and audio with GNOME SoundRecorder, and composed with OpenShot.

Whilst a transcription is not available, the following is the content from the slideshow presentation.


Ten Social Media Tips for 2019

* This is a review of Meet Edgar’s “Ten Social Media Tips for 2019”
* The video (and transcript) is available from the following URL https://meetedgar.com/blog/10-tips-series-2019-success-on-social-media/
* Presenters are Megan and Christina Meet Edgar Community Social Media
* First argument is that social media is a free platform where you can connect with your followers and create a community.
* The presentation does not describe or differentiate between specific social media platforms, but rather gives general advice common to all.
* This immediately limits the value of the presentation; the sort of engagement one has on Facebook is different to Twitter, which is different to YouTube, which is different to LiveJournal (does anyone outside of Russia still actively use LJ?), or even Usenet, due to posting restrictions (word count, imagery), style, and especially community norms.

Tip 1: More Social, Less Media

* The purpose of this suggestion is to encourage authenticity in marketing - "personal branding is now part of company branding" - so businesses should engage directly with followers and add more personality to their social media posts. "People buy from people, not businesses".
* It is claimed that engagement provides for opportunities to connect with followers, find out what drives them, establish brand-loyalty on the basis of brand personality.
* This is all largely true, but over-stated. For example the webinar has completely overlooked the importance of business-to-business transactions (even if these are not usually via social media). The assertion of “brand personality” is backwards; actors (even corporate actors) have personality. A brand is an identifier of a personality.

Tip 2: More Listening, Less Broadcasting

* Discover pain points of consumer, find out what their ideal image of themselves in the future and try to create a situation where your products and services match that image.
* There’s more content on social media than is possible for a consumer to process. It should be stated that this is a factor leading to a greater balance of market power between the supplier of goods and services and the consumer of goods and services (other factors, such as the concentration of capital, are a countervailing trend).
* Argues that producers need to “listen and reply and engage with your followers not just broadcast out your promotions”.

Tip 3: More Video, Less Text

* Argues that "everything we read, every trend we're seeing, every analytics that we're looking at goes to this idea that video is where it's at on social media".
* Produce exclusive video content for followers, and encourage comments.
* Claims that live streams generates the most views.
* Drive engagement through emotional reactions.
* Ironically, the video provided by Meet Edgar is essentially a presentation with images and a voice-over (like this one too, but at least I’m knowingly self-referentially ironic).
* Video is a great medium for immersion in exciting live content. It is not live streams that generate the views but the content of the live streams (otherwise Andy Warhol’s Empire would be a great film).
* In reality, reading text is roughly ten times faster than watching video; text can be easily referenced, and it is technologically a more independent medium.

Tip 4: More New Experiences

* Argues that marketing agents must keep an eye on what’s coming up next in social media, find out which platforms are winning, and direct experiments on those.
* If the platform or the social media posts fail to connect then use it as a learning experience and incorporate those lessons in the base networks.
* This advice is horribly vague and could be a massive sinkhole of time and effort if followed as presented. Instead, they should provide a metric for experimentation, trigger levels for greater levels of investment, etc.

Tip 5: More Content Creation, Less Analytics

* Claims that checking in analytics can be addictive; instead of doing it daily, do it every week or so. Instead, concentrate on content creation and follow meaningful trends over weeks, months, etc.
* This is fairly sound advice, although further practical elaboration on how to correlate analytics information with action would be appreciated. Advice is simply to “really connect the dots on where things are going in your marketing strategy”. That’s not exactly helpful from an information systems point of view.

Tip 6: More Education and Emotion and Less Selling

* The proposal is to walk through the “buyer’s journey” (see also Tip 9) and teach them how to use the product to its fullest, and give them an emotional boost when they succeed with it.
* This makes sense with complex products with high price tags that can cover the cost of servicing the customer (for example a supercomputer with training included). Not exactly sure how this is especially relevant for consumer products sold through social media, unless it is mass transmission (e.g., training videos) rather than individual coaching.
* Something that probably should have been included is the potential of conflict between emotion and education. Many emotional decisions are not educated decisions. The ideal customer is one who is deeply committed to a product with grounded reasons, rather than just rash feelings.

Tip 7: More Values, Less Guessing

* A corporate body should have “brand values” which become known to the social media followers.
* Consistency and commitment to the values in every social media post.
* Values can change through conversations with consumers.
* Again, brands don’t have values but represent values. Values represent the moral integrity of an organisation. It is good corporate practice to liaise with independent organisations and potential critics (e.g., unions, environmental groups, workplace advocates, etc.).

Tip 8: More Customer Hero, Less Product Hero

* The presentation argues that the structure of the Hero’s journey can be applied to the customer and social media marketing; “people are going throughout something, they come to a struggle, they overcome it and they are better off for it at the end.”
* Part of the process is to get user-generated content on success stories and share them as social proof in product reassurance.
* The great insight of Joseph Campbell, “The Hero with A Thousand Faces” in generating the monomyth (the hero ventures from the ordinary world to the supernatural, confronts great and magical adversaries, returns and bestows new powers to the community), has been converted to a shopping expedition.
* Raising any customer or product to a heroic level is worthy of ridicule. Mortal bravery is required, not a high-limit credit card.

Tip 9: More Answers, Less Questions

* Suggestions to seek follower questions to develop a list of frequent questions; make it a consistent fun activity to invite followers to participate in regular question and answer sessions. Note that private message requests are increasing faster than requests on public forums.
* Curious that the authors are suggesting developing a Frequently Asked Questions (FAQ) which have been on the Internet (originally by NASA) since the early 1980s and arguably in publications since the 17th century.
* The matter of private messaging is probably a result of increased awareness of privacy issues.

Tip 10: More Traffic, Less Work

* Propose "working smarter" and "systemizing your social media strategy". The advice doesn't really come down to more than having a time-based plan for social media content.
* Even if one does adopt a project-like approach to social media one has to consider time, product, and cost (the classic project management triad) along with contingencies, or the classic marketing approach of product, price, and place.
* Just because we’re in the supposedly new world of social media and individualised market segments these hard issues can’t be hand-waved away.
* The best way to develop “more traffic, less work” is to have a good product at a good price when users need it, and then they’ll do a lot of the marketing for you!

Concluding Remarks

* There was some good material in the presentation, with regards to individual approaches and customer engagement, along with correctly identifying these as part of other trends in online communication.
* However, much of the material was seriously short on elaboration, over-stated the case significantly, and lacked quantitative evaluation. The signal-to-noise ratio was very low.
* It is extremely difficult to see how the authors could seriously argue that this material reflected “Ten Social Media Tips for 2019”. Much of it was not related to social media, let alone material that is particularly important for 2019.
* Really, the presentation could have done with an information systems perspective: that is, looking at social media marketing as a system, with a technological and business workflow, with market segment inputs, decision points, triggers, contingencies, and so forth. The positive aspects of the presentation were random; the negatives come down to a lack of a systemic perspective.

February 12, 2019

Two White Paper Reviews

Information Systems and Enterprise Resource Planning

If a broad definition of information systems is taken as "usage and adaptation of the IT and the formal and informal processes by all of its users" (Paul, 2007), then Enterprise Resource Planning (ERP) must be recognised as a major IT application which seeks to combine a very wide range of business processes in an organisation in a technologically-mediated manner. Integration is of primary importance, for example, so that the disparate and siloed software applications that manage customers, sales, procurement, production, distribution, accounting, human resources, governance etc are provided common associations through a database system and from which decision-makers can engage in effective and informed business intelligence and enterprise management.

As can be imagined with such scope, effective ERP systems are highly sought after, with a range of well-known major providers (e.g., Oracle, SAP, Infor, Microsoft, Syspro, Pegasus, etc.), and a number of free and open-source solutions as well (e.g., LedgerSMB, metasfresh, Dolibarr, etc.). The main advantages of ERP systems should be self-evident: forecasting, tracking, systems consolidation, a comprehensive workflow of activities, and business quality, efficiency, and collaboration. What are perhaps less well-known are the disadvantages: the twins of expensive customisation or business-process restructuring to fit the software, the possibility of vendor lock-in and transition costs, and, of course, cost.

Panorama Consulting Services is a consulting company specialising in ERP that has been in operation since 2005, providing advice to business and government. They describe themselves as "[o]ne-hundred percent technology agnostic and independent of vendor affiliation" (Panorama Consulting, 2019), and provide a significant number of corporate "White Papers" on ERP-related subjects, which, on a prima facie level, makes them a good candidate to begin some ERP comparisons. In particular, two White Papers are reviewed, "Ten Tips for a Successful ERP Implementation" (Panorama Consulting, 2015) and "Clash of the Titans 2019: An Independent Comparison of SAP, Oracle, Microsoft Dynamics and Infor" (2018); the selection is in part to have a sense of a foundational paper followed by a practical implementation, and to determine whether their own principles of assessment in the former are actually applied in practical cases.

As an aside, mention must be made of the difference between White Papers in government and business. In the Commonwealth tradition, a White Paper is an official policy statement that is developed from an initial brief and an initial statement designed for public consultation (a "Green Paper"). This differs significantly from the business world, where a White Paper has come to mean a brief statement of business policy on a complex topic, and is much closer to a marketing tool (Graham, 2015). Government White Papers are significantly longer, more complex, and, arguably, are often the result of more extensive research.

Review: Ten Tips for a Successful ERP Implementation

The first recommendation from Panorama Consulting is for organisations to establish their own business requirements, which must arise from having defined business processes; it is not possible to define ERP requirements without having the business processes (and presumably the technology) already in place. This relates heavily to the major issues noted by Panorama: 75% of ERP projects take longer than expected, 55% are over budget, and 41% realise less than half of their expected business benefits (based on Panorama's 2015 Annual Report). The reasons given for these problems are "unrealistic expectations", "mismanagement of scope", "unrealistic plans", and "failure to manage the software vendor and project scope".

These issues form the basis of the stated tips, i.e., "focus on business processes and requirements first", "quantify expected benefits" to achieve a healthy ERP return-in-investment, "[e]nsure strong project management and resource commitment", "solicit executive buy-in", "take time to plan up front", "[e]nsure adequate training and change management", "[u]nderstand why you are implementing ERP", "[f]ocus on data migration early in the process", "[l]everage the value of conference room pilots", and finally "clear communication ... to all stakeholders" via a project charter.

Much of the White Paper is dedicated to issues that are less related to ERP as such and more to the project management aspects. Certainly there is some justification for this; after all, the implementation of ERP is a significant project. The highest-level recommendation, however, ties in well with an information systems approach: ensuring that business processes exist in the first place. If a business has not mapped its own processes, the introduction of an ERP will not solve its problems as well as it could, because neither the business procedures nor the culture are in place to use the ERP effectively.

Review: Clash of the Titans 2019

The second, and far more extensive, Panorama ERP White Paper was written some four years after the "Ten Tips". It analyses the major enterprise ERP software providers (SAP, Oracle, Microsoft Dynamics, and Infor), based on Panorama's own ERP Benchmark Survey, conducted between September and October 2018. The survey had some 263 respondents, with some 30% using SAP, 29% using Microsoft Dynamics, 25% using Oracle, and 16% using Infor. The top-level recommendation from Panorama to organisations is to "consider software functionality, deployment options and vendor and product viability" and, returning to the previous White Paper, "[a]n informed decision also requires business process mapping and requirements definition".

The evaluations taken in the survey include: (i) implementation duration; (ii) operational disruption; (iii) single-system vs best-of-breed; (iv) internal vs external resource requirements; (v) significance of ERP in the organisation's digital strategy; and (vi) business initiatives included in the digital strategy. The results included some interpretation by Panorama, although it is worth noting the questions were based more around business results than the underlying technological tools.

Of the ERP systems, SAP had the longest duration for implementation (average of 14.7 months), whereas Infor had the shortest (11.2), and by the same token SAP had the longest average operational disruption (128.5 days) and Infor had the shortest (120.6 days). However, SAP clients tend towards large, complex, and global organisations, whereas Infor is designed for less customisation; as a result, Infor also had the highest tendency to be deployed as a single ERP system (90.5% of the solution set) whereas SAP had the lowest (86.2%), although it should be pointed out that the variation between the providers was not great.

Of some concern is the human resource investment required; "vendors typically recommend a team of at least 8-12 full-time internal resources, this isn't feasible for most organizations". Oracle customers have a high of 50.9% of resources sourced internally, whilst Microsoft Dynamics is at the other end of the scale at 45.4%. Panorama argues that simple customisation can be done in-house, whereas complex configurations require external support. Also of concern is whether or not ERP played a significant role in the organisation's digital strategy, which ranged from a high of 71.4% (Infor) to a low of 45% (Oracle), although the latter is explained by Oracle offering diverse components for specific functions. When functions themselves were analysed, a high of 84.6% included ERP in their business strategy, with a low of 8.3% using eCommerce from an Infor installation.

Critique of the Two White Papers

The initial proposal of reviewing the second paper against the suggestions of the first is difficult, as those questions are not directly addressed in the survey. Whilst the second White Paper does reiterate the concerns of planning and project management in both the short and long term for an organisation considering implementing an ERP system, it does not associate those difficulties with the evident disruptions that ERP implementations cause. In a sense, this can serve as a critique of the second White Paper in its own right: the survey, whilst broad in the range of usage-related questions asked, is also shallow, insofar as it relies on some fairly superficial interpretations by Panorama Consulting of why particular results were generated.

A major issue not directly addressed by either paper is whether ERP systems are justified at all, or have only a limited justification, from an enterprise software perspective. Whilst it would be challenging for an organisation such as Panorama Consulting to give advice that essentially says "No, an ERP system is not for you", it is surprising that in neither White Paper is consideration given to building bespoke systems. These need not be as "complex but easy" (easy to use, complex to change) as the expensive enterprise tools would suggest. After all, an ERP is essentially a graphic interface connecting queries to a database with stored procedures that incorporate business logic in a stateful manner. An ERP that is developed in-house, with proprietary-free storage and logic engines, is relatively easy to change as the business itself develops. Rather than a "solution" it may be worthwhile looking at a "toolkit". It would certainly help with the often-overlooked matter that major ERP solutions embody significant cultural biases (Newman, Zhao, 2008).

Further, neither paper engaged in a comparison of the technological resources required and consumed, such as requisite operating system and existing software requirements, physical system requirements, or latency and bandwidth. Given the effect these have on the end-user experience, on the performance of the system (especially considering the distance between data and processing) and therefore the speed of business intelligence, and on the capacity of the system to cover business needs at scale, one would have thought that such quantitative and objective measures would have been a primary consideration.

The implementation of an ERP system, which suggests an organisation-wide repository of related business data updated from various input functions, certainly seems enticing. However, as the first White Paper correctly argues, an ERP system can only perform as well as the existing business logic and data repositories allow; to the degree that these are incomplete, gaps will surface during implementation. Whilst the two White Papers from Panorama Consulting provide some information necessary for implementation (organisational buy-in, project management) and some information concerning the utilisation of major packages, they lack the necessary systematic perspective for either implementation or comparison, in part because of the lack of critical quantitative and logical evaluations.

References

Graham, Gordon (2015). The White Paper FAQ, http://thatwhitepaperguy.com/white-paper-faq-frequently-asked-questions/, Retrieved 16 March 2015.

Newman, M. and Zhao, Y. (2008), The Process of ERP Implementation and BPR: A Tale from two Chinese SMEs. Information Systems Journal, 18 (4): 405-426.

Panorama Consulting (2015), Ten Tips for a Successful ERP Implementation.

Panorama Consulting (2018), Clash of the Titans 2019: An Independent Comparison of SAP, Oracle, Microsoft Dynamics and Infor.

Panorama Consulting (accessed 2019), We Are Panorama From: https://www.panorama-consulting.com/company/

Paul, Ray J., (2007), Challenges to information systems: time to change, European Journal of Information Systems, 16, p193-195

Two Panorama ERP White Paper Reviews

February 09, 2019

The Disciplinary Vagaries of Information Systems

Introduction

More than thirty years ago, Professor Peter Checkland of the University of Lancaster raised the question of whether information systems (IS) and systems thinking could be united (Checkland, 1988). Almost twenty years later, Ray J. Paul, senior lecturer at the London School of Economics and Political Science, also raised the disciplinary status of the subject, as editor of the European Journal of Information Systems (Paul, 2007). These two papers are illustrative of several others (e.g., Banville and Landry (1989), George et. al. (2005), Firth et. al. (2011), Annabi and McGann (2015)) from information systems as it attempts to find its own disciplinary boundaries among the crowd of academia, research, and vocational activities (c.f., Abraham et. al. (2006), Benamati et. al. (2010)).

The two papers are selected not only to provide an at-a-glance illustration of the time-period of foundational issues within Information Systems as a discipline, but also for the temporal context of each paper, and the differences in their views which, at least in part, reflect those different times. Drawing from these illustrative comments and from the other source material mentioned, some critical issues facing the field of information systems are identified. Rather than attempting to enforce a niche for information systems, a philosophical reconstruction is carried out using formal pragmatics, as developed by the philosopher Karl-Otto Apel (1980) and the social theorist Jurgen Habermas (1984).

Checkland and Paul: A Review

Checkland argues that human beings are "unique" in converting data into information and attributing meaning. Whilst recent research in animal communication suggests that a more relative term would be appropriate (c.f., Pika, et. al. (2018)), the point is made. Information Systems are defined as "organized attempts" by institutions to provide information, with computers as the dominant means of carrying this into effect. For Checkland there is a potential conflict between technical experts who avoid the social aspect and agents of social change who side-step technical expertise, with the discipline lacking a "Newton" to bring the field together.

In elucidating how this situation came about, Checkland notes that the recent history of Information Systems begins with the engineers and statisticians of the 1940s (e.g., Fisher, Wiener, Weaver, Shannon), who were primarily concerned with the signal-transmission of messages and their distortion, with no reference to meaning. This would be developed into a "semantic information theory" by Stamper et. al. at the London School of Economics, to include semantic analysis. Nevertheless, Checkland argues that information systems have "tacitly followed the systems thinking of the 1950s and 1960s". In taking up this argument Checkland applies a phenomenological epistemology where the system is derived from concepts which are themselves a "mutually-creating relationship between perceived reality ... and intellectual concepts (which include 'system')".

Generalised approaches to entities as systems, as an "autonomous whole", were applied in fields as diverse as biology, ecology, engineering, economics, and sociology from the 1940s to the 1960s and beyond. Checkland argues that this approach correlated with the information technology systems of the time, a "hard" systems theory where "organizations are conceptualized as goal-seeking machines and information systems are there to enable the information needs associated with organizational goals to be met". In the 1980s new technologies were making it increasingly possible to view organizations as "discourses, cultures, tribes, political battlegrounds, quasi-families, or communication and task networks", where functionalism was being replaced with "phenomenology and hermeneutics". This correlates with the development of Soft Systems Methodology (SSM), meaning-attribution, and emergent properties, where SSM is a learning system as a whole, and one which makes system models.

With this transformation in technology and systems methodology, Checkland argues that in the future computer projects will rarely use a project life cycle, except when "relatively mechanical administrative procedures are computerized". Instead, projects will be increasingly orientated around tasks of perception and meaning where processes and information flows "increasingly need a social and political perspective", which places computing in the hands of the end-user, and where systems are developed so that they can learn and build their own information system. Checkland concludes that this "process orientation offers help with some of the crux problems of information provision in organizations in the 1990s".

Paul's editorial is quite different from Checkland's paper, the most notable differences being the twenty years between them and the size of the publication; the editorial is a mere three pages compared to Checkland's just over ten. Despite this, Paul manages to raise a great number of critical issues with both brevity and power, and with a great deal of hindsight wisdom when compared to the zeitgeist optimism of Checkland. Thematically, Paul is concerned with the two topics of challenge and change, noting a variety of past editorials concerned with these matters, and notes a connection between them: "It is clear to me that there are challenges to IS, and that it is time to address them – it is time to change."

Five challenges are identified which strike at the very core of Information Systems as a discipline and research project, highlighting its continuing problematic status. Specifically, Paul states the following: "Nobody seems to know who we are outside the IS community", "Demand from students to study IS is generally dropping", and "Research publications in IS do not appear to be publishing the right sort or content of research". In addition, Paul identifies "journal league tables" as being particularly troubling for the discipline, and finally notes that what information systems actually are is still subject to significant debate.

Referring initially to the first four challenges, which are less elaborated, Paul suggests that the lack of IS's public recognition is because of its lack of a distinct identity, often being located within either business studies or computer science, and suggests that a stronger IS community could provide the strength to overcome these barriers. Whilst the second challenge is an empirical measure, it is suggested that community strength would also help with declining enrolments. The third challenge is one of real-world applications versus pure research, with Paul strongly leaning towards the former. As for "journal league tables", Paul makes a case for journals to have a diversity of roles and fitness of purpose.

The final matter is the definition of IS. It is obviously difficult for a discipline to find its place in academia when it is unsure of its own core subject matter, so Paul addresses this last point as a matter of major concentration. Taking an approach of via negativa, Paul tries to define information systems by what it is not. Thus, information systems are not information technology, which comprises the devices that provide the delivery mechanism for information systems. Nor are information systems the processes being used, and nor are they the people using the technology and the processes. Instead, to Paul, information systems are the combination of these contributing factors: "The IS is what emerges from the usage and adaptation of the IT and the formal and informal processes by all of its users."

Critique and Reconstruction

Despite the differences in time, size, and even approach, both the papers from Checkland and Paul refer to the issue of identity for IS as a discipline and research project. Checkland provides a historical and philosophical approach to describing this matter and concludes with an optimistic zeitgeist of transformation, partially developed from a degree of technological determinism. Paul's editorial, in contrast, is more concerned with immediate issues and context, although this too is ultimately grounded in ontological and epistemological issues that confront Information Systems.

Checkland's historical argument of the transformation from "hard systems" approaches to SSM is largely accepted within the discipline, although the argument that this correlates with organisations moving from goal-seeking institutions to some sort of extended family is speculative to say the least. Likewise with the claim of processes moving away from functionalism to more phenomenological and hermeneutic approaches, as if these were at variance with each other; after all, functionalism has both phenomenology and hermeneutics as a philosophical foundation. Whilst Checkland comes close to transcending these subject-centred approaches with references to speech-act theory and the generation of meaning, he has not engaged with the core principle of linguistic philosophy, which notes that shared symbolic values cannot be systematically generated on account of their intersubjectivity (Apel, 1980., Habermas, 1984).

Likewise, Paul's five challenges are not without problems. At least two are not problems unique to IS but are universal to all academic disciplines, namely the conflict between pure research and practical application (see also Benamati et. al. (2010)) and the issue of journal league tables and fitness to purpose. The matter of IS lacking its own distinct disciplinary identity is certainly a challenge; however, as Paul's definitional conclusion indirectly notes, this is because it is a multi-disciplinary subject that draws together the organisational requirements of business studies with the technology of computer science. Finally, Paul takes a fairly idealistic approach in arguing that the drop in IS enrolments is due to a lack of an IS community. As Abraham et al (2006) point out, there were particular historical macroeconomic issues as a major factor, specifically the "dot.com" crash of 2000-2002.

A reconstruction of the disciplinary identity of IS can be founded on the philosophy of formal (or universal) pragmatics. Initiated by the "linguistic turn" of Wittgenstein (1953), this approach identifies language and meaning as something generated between people and, rather than taking a subject-centered view of the world, elaborates this into an intersubjective approach. Uniting epistemological considerations with ontological ones, formal pragmatics maps rationalised world-orientations to specific verification claims, as summarised in Table 1.

Unverifiable: Metaphysics (Physicalist, Symbolist, Idealist), Theology
Verifiable: Reality (Logical and Empirical Philosophy)

Orientations/Worlds (verification)                 | 1. Objective or External World | 2. Intersubjective or Social World | 3. Subjective or Internal World
1. Statements of Truth – Sciences (correspondence) | Scientific facts               | Social facts                       | Unverifiable
2. Statements of Justice – Laws (consensus)        | Unverifiable                   | Legal Norms                        | Moral Norms
3. Statements of Beauty – Arts (sincerity)         | Aesthetic Expressions          | Unverifiable                       | Sensual Expressions

Table 1: Elaborated from Habermas (1984), illustrating the pragmatic complexes

In doing so, IS is placed within multiple complexes; not only does it have to deal with the world-orientation of objective facts (information technology), it is equally involved in the world-orientation of social facts (business processes); where earlier IS focussed more on the former, SSM focussed more on the latter. Both of these are part of the DIKW hierarchy, which refers to the relationship between data, information, knowledge, and wisdom (Rowley, 2007). IS is thus a subset of the wider body of knowledge management, which contains all the specific academic disciplines and world-orientations. Whilst the importance of IS will continue to grow due to the increased development of business processes, information technology, and their integration, it will always be a multi-disciplinary subject as it crosses pragmatic boundaries.

One sociological effect of formal pragmatics is an understanding of society as a combination of institutional systems and a lifeworld of communities. Procedural action occurs through institutional systems, whereas meaning generation occurs through the lifeworld. In this sense, a system cannot generate meaning and, despite the best efforts of public relations experts, cannot determine the response of communities to their slogans, neologisms, etc. Meaning generation is very much asystematic, and the best that organisations can hope for is that they have sufficient inputs from the wild world of external communities to be attentive to what communities think of them. In this sense, the hope of Checkland that institutional systems will somehow transform themselves from procedural to meaning-generating bodies is implausible.

Conclusion

Taking the divergent papers of Checkland and Paul, a common theme was noted as both attempted to find a foundation on which to base Information Systems as a discipline and research project. Checkland's approach was to apply philosophical considerations to the history of IS, and in particular to note the transformation of the engineering and biological perspectives of the 1950s and 1960s into the Soft Systems Methodology of the 1970s and 1980s, with the possibility of institutional transformation aided by changes in technological devices. In contrast, Paul noted difficulties of identity within IS as it crossed disciplinary boundaries and lacked a referential ability to define itself in distinction to other fields of inquiry.

Both these papers contribute to an ongoing debate on the nature and function of IS. A contribution is made here using contemporary pragmatic and linguistic philosophy, with the identification that the challenges confronting IS are actually required for the existence of IS. As long as there is data that is stored in a structured manner as information, and that information is utilised by institutions, it is inevitable that IS will cross the boundaries of instrumental and social technologies. Likewise, as long as meaning is generated intersubjectively, there can be no way for an external body to enforce and define the interpretation of shared symbolic values.

By way of conclusion, it must be said that formal pragmatics is obviously not just applicable to IS. As a theory that looks at whether statements of affairs are even verifiable in a pragmatic sense, it must incorporate the entirety of the knowledge of our species; not in content, of course, but rather in the categorisation of what sorts of statements are possible in the first place. Certainly, there are many things in the universe which we do not yet know, but at least formal pragmatics provides the tools for what sorts of questions can be asked and, when propositions are made, whether that content seeks verification of truth, justice, or beauty. There are perhaps further questions outside these areas of verification, but for them it is perhaps best to refer to two aphorisms of the early Wittgenstein (1922): "The limits of my language mean the limits of my world"; "Whereof one cannot speak, thereof one must be silent."

References

Abraham, T., Beath, C., Bullen, C., Gallagher, K., Goles, T., Kaiser, K., and Simon, J. (2006) IT Workforce Trends: Implications For IS Programs, Communications of the Association for Information Systems, (17)50, p. 1147-1170

Annabi, H., McGann, Sean T., (2015), MIS-Understood: A Longitudinal Analysis of MIS Major Misconceptions, Proceedings of SIGED: IAIM Conference 2015

Apel, Karl-Otto, (1980), (trans. Glyn Adey, David Frisby), Towards a Transformation of Philosophy, Routledge and Kegan Paul, FP 1972

Benamati, J. H., Ozdemir, Z. D., and Smith, H. J. (2010), Aligning Undergraduate IS Curricula with Industry Needs, Communications of the Association for Information Systems, (53)3, p. 152-156

Banville, C., Landry, M., (1989) Can the Field of MIS be Disciplined?, Communications of the ACM, 32 (1), p48-60

Checkland, P.B., (1988) Information Systems and Systems Thinking: Time to Unite?, International Journal of Information Management, 8, p239-248

George, J. F., Valacich, J. S., and Valor, J. (2005), Does Information Systems Still Matter? Lessons for a Maturing Discipline, Communications of the Association for Information Systems (16)8, pp. 219-232

Habermas, J., The Theory of Communicative Action, Volume 1: Reason and the Rationalisation of Society, Beacon Press, 1984 [FP 1981]

Firth, D., King, J., Koch, H., Looney, C. A., Pavlou, P., and Trauth, E. M. (2011), Addressing the Credibility Crisis in IS, Communications of the Association for Information Systems, (28)13, p. 199-212

Paul, Ray J., (2007), Challenges to information systems: time to change, European Journal of Information Systems, 16, p193-195

Pika, Simone., Wilkinson, Ray, Kendrick, Kobin H., Vernes, Sonja C., (2018), Taking turns: bridging the gap between human and animal communication, Proceedings of the Royal Society B, 285(1880)

Rowley, Jennifer (2007). The wisdom hierarchy: representations of the DIKW hierarchy. Journal of Information and Communication Science. 33 (2): p163–180.

Wittgenstein, Ludwig (1922), (trans. Frank P. Ramsey and Charles Kay Ogden), Tractatus Logico-Philosophicus, Kegan Paul, FP 1921

Wittgenstein, Ludwig (1953), (trans. G. E. M. Anscombe), Philosophical Investigations, Macmillan Publishing Company.

February 06, 2019

Encrypted connection between SIP phones using Asterisk

Here is the setup I put together to have two SIP phones connect together over an encrypted channel. Since the two phones do not support encryption, I used Asterisk to provide the encrypted channel over the Internet.

Installing Asterisk

First of all, each VoIP phone is in a different physical location and so I installed an Asterisk server in each house.

One of the servers is a Debian stretch machine and the other runs Ubuntu 18.04 (bionic). Regardless, I used a fairly standard configuration and simply installed the asterisk package on both machines:

apt install asterisk

SIP phones

The two phones, both Snom 300, connect to their local asterisk server on its local IP address and use the same details as I have put in /etc/asterisk/sip.conf:

[1000]
type=friend
qualify=yes
secret=password1
encryption=no
context=internal
host=dynamic
nat=no
canreinvite=yes
mailbox=1000@internal
vmexten=707
dtmfmode=rfc2833
call-limit=2
disallow=all
allow=g722
allow=ulaw

Dialplan and voicemail

The extension number above (1000) maps to the following configuration blurb in /etc/asterisk/extensions.conf:

[home]
exten => 1000,1,Dial(SIP/1000,20)
exten => 1000,n,Goto(in1000-${DIALSTATUS},1)
exten => 1000,n,Hangup
exten => in1000-BUSY,1,Hangup(17)
exten => in1000-CONGESTION,1,Hangup(3)
exten => in1000-CHANUNAVAIL,1,VoiceMail(1000@mailboxes,su)
exten => in1000-CHANUNAVAIL,n,Hangup(3)
exten => in1000-NOANSWER,1,VoiceMail(1000@mailboxes,su)
exten => in1000-NOANSWER,n,Hangup(16)
exten => _in1000-.,1,Hangup(16)

The internal context maps to the following blurb in /etc/asterisk/extensions.conf:

[internal]
include => home
include => iax2users
exten => 707,1,VoiceMailMain(1000@mailboxes)

and 1000@mailboxes maps to the following entry in /etc/asterisk/voicemail.conf:

[mailboxes]
1000 => 1234,home,person@email.com

(with 1234 being the voicemail PIN).
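
It's worth confirming at this point that the phone has registered with its local server. With the stock Asterisk CLI (and chan_sip, as used here), a quick check is:

asterisk -rx "sip show peers"

which should list peer 1000 as reachable, with a latency figure, thanks to qualify=yes.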

Encrypted IAX links

In order to create a virtual link between the two servers using the IAX protocol, I created user credentials on each server in /etc/asterisk/iax.conf:

[iaxuser]
type=user
auth=md5
secret=password2
context=iax2users
allow=g722
allow=speex
encryption=aes128
trunk=no

Then I created an entry for the other server in the same file:

[server2]
type=peer
host=server2.dyn.fmarier.org
auth=md5
secret=password2
username=iaxuser
allow=g722
allow=speex
encryption=yes
forceencrypt=yes
trunk=no
qualify=yes

The second machine contains the same configuration with the exception of the server name (server1 instead of server2) and hostname (server1.dyn.fmarier.org instead of server2.dyn.fmarier.org).
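
Before wiring up the dialplan, it can be useful to confirm that the two servers can see each other over the IAX link. From either machine, using the standard Asterisk CLI:

asterisk -rx "iax2 show peers"

The other server should show a status of OK (again thanks to qualify=yes).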

Speed dial for the other phone

Finally, to allow each phone to ring one another by dialing 2000, I put the following in /etc/asterisk/extensions.conf:

[iax2users]
include => home
exten => 2000,1,Set(CALLERID(all)=Francois Marier <2000>)
exten => 2000,2,Dial(IAX2/server1/1000)

and of course a similar blurb on the other machine:

[iax2users]
include => home
exten => 2000,1,Set(CALLERID(all)=Other Person <2000>)
exten => 2000,2,Dial(IAX2/server2/1000)
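
After adding these blurbs, the dialplan can be reloaded on both machines without restarting Asterisk:

asterisk -rx "dialplan reload"

and dialing 2000 from either phone should then ring the other phone over the encrypted link.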

Firewall rules

Since we are using the IAX protocol instead of SIP, there is only one port to open in /etc/network/iptables.up.rules for the remote server:

# IAX2 protocol
-A INPUT -s x.x.x.x/y -p udp --dport 4569 -j ACCEPT

where x.x.x.x/y is the IP range allocated to the ISP that the other machine is behind.

If you want to restrict traffic on the local network as well, then these ports need to be open for the SIP phone to be able to connect to its local server:

# VoIP phones (internal)
-A INPUT -s 192.168.1.3/32 -p udp --dport 5060 -j ACCEPT
-A INPUT -s 192.168.1.3/32 -p udp --dport 10000:20000 -j ACCEPT

where 192.168.1.3 is the static IP address allocated to the SIP phone.
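
The filename suggests the rules are applied with iptables-restore; if that is indeed how your system loads them, they can be reloaded after editing with:

iptables-restore < /etc/network/iptables.up.rules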

January 31, 2019

20 Years of Linux.conf.au [Memoirs]

On the first night I arrived in Christchurch, New Zealand for Linux.conf.au 2019, a group of around a dozen attendees went to dinner. Amongst them were Steve Hanley and Hugh Blemmings, whom I have known since the early 2000’s from various LCAs around the region. They asked for some memoirs of LCA – something small; what follows were my thoughts, far longer than expected.

Dateline: Just after the Year 2000. The Y2K bug. The first billion seconds of the Unix™ epoch (Sept 9 2001)…

In the summer of 2001, some friends from Perth and I made a trip to a new conference we had heard about called Linux.conf.au. I was a new Debian Linux developer, my friends were similarly developers, sysadmins, etc. What met us was one of the best interactions of like minded individuals we had seen; deeply technical discussions and presentations by key individuals who not only knew their subject matter, but wrote the code, created the community, or otherwise steered a section of the Open Source software movement from around the world.

Linux.conf.au 2001 – day 1

Living on the opposite side of Australia in Perth meant we were intellectually starved of being able to talk face-to-face to key people from this new world of Open Source and Free Software. The distance across the country is almost the same as East to West coast United States, and not many visitors to Melbourne or Sydney make the long trek over the Great Australian Bight to reach Western Australia’s capital.

We found ourselves asking the LCA 2001 organisers if it would be possible in future to run Linux.Conf.Au in Perth one day.

Having had the initial conference (then called the Conference of Australian Linux Users, or CALU) in 1999 in Melbourne, and then Linux.conf.au 2001 in Sydney, it seemed a natural progression for LCA to roam around to a different city each year; it felt almost unfair to those who could not afford to travel to Melbourne or Sydney.

The result from 2001 was that in 2002 it would run in Brisbane, but that we should make a proposal and get organised.

MiniConfs at LCA

In 1999 I went to AusWeb in Ballina, NSW, and ApacheCon 2K in London.

Closing of ApacheCon 2000 in London – developers on stage Q&A

I also went to DebConf 1 in Bordeaux, France. DebConf was run as an adjunct to the larger French Libre Software Meeting (LSM), as Debian felt that its gathering of developers was too small to warrant the organisational overhead for its own conference at that time.

Debian Developers at DebConf1, Bordeaux 2001

I liked the idea of a pre-conference gathering for Debian for Linux.conf.au 2002 in Brisbane – a Mini Conference.

So in parallel to talking about running LCA in Perth for 2003, I asked Raymond Smith, LCA 2002 lead organiser, and the rest of the Brisbane organising team if I could turn up a few days early to Brisbane for the 2002 conference, use one of the rooms for a small pre-event.

The principle was simple: minimal organisation overhead; don’t get in the way of those setting up for LCA.

LCA2003 Bid Preparation

The Puffin/Penguin suit, and a small Tux

In December 2001 we found what was probably closer to a full-size puffin costume at a fancy dress shop – close enough that it could pass for a penguin.

We started to plan a video as a welcome video – to show some of Perth, and what could be expected in coming to the West.

With a logo I designed from a classic Australian yellow road sign, we had a theme of the Road Trip to Perth.

So with the Puffin/Penguin suit in hand, and a few phone calls, we found ourselves with camera kit on New Years’ Day 2002 at the arrivals hall of Perth airport to film segments for a video to play at the close of LCA2002: the story of Tux arriving and making his/her/their way to the conference venue at UWA. Much of the costume performance was Nick Bannon, but also Mark Tearle, and others. I filmed and rough-scripted, Tony Breeds edited the video and sought licensing for the music, generously donated by the band credited.

LCA 2002

The MiniConf at LCA ran smoothly. People arrived from around the world, including then DPL Bdale Garbee.

Debian mini-conf at LCA 2002 in Brisbane: James Bromberger (c) and Bdale Garbee (r) at the Debian Mini-Conf, 2002

The main conference was awesome, as always.

Rusty Russell speaking at Linux.conf.au 2002
Jeremy Allison (l) and Andrew Tridgell (r) talking Samba at Linux.conf.au 2002
Raymond Smith, lead organiser LCA 2002
Ted T’so talking filesystems at LCA 2002. “In Ted we trust”.
LCA2003 closing
James Bromberger (l) and Tony Breeds (r) – co-chairs of LCA 2003
LCA2003 invite video being played at LCA2002 Closing

Post 2002 prep for 2003

We ran monthly, then weekly face to face meetings, we split into teams – web site, papers committee, travel & accommodation, swag, catering, venue, AV and more. Bernard Blackham made significant changes to get us able to process the crypto to talk to the CommSecure payment gateway so we could process registrations (and send signed receipts).

We thought that not many people would come to Perth, a worry that drove us to innovate. Sun Microsystems agreed to sponsor a program we devised called the Sun Regional Delegate Programme, funding a sizable amount of money to fly people from across the state down to Perth to attend.

I left my full time job in November 2002 to work full time on the conference, having planned to start travelling in Europe sometime after LCA in 2003. Hugh Blemmings (then at IBM) sponsored Linus Torvalds to attend, which we kept under wraps.

Tux at Linux.conf.au 2003

A small group worked on making a much better, full size Penguin costume, which days before the opening we proposed to put Linus in as part of the opening welcome.

Holy Penguin Pee, LCA 2003

I sourced some white label wine, designed and had printed some custom labels, naming this the Holy Penguin Pee, which was to be our conference dinner wine (amongst other beverages). While at the time this was a nice little drop, the bottles I now hold some 16 years later are a little less palatable.

Perth 2003

Miniconfs had blown out to several sessions. Attendance was projected to exceed 500 attendees (excluding speakers and organisers).

As the audience gathered for the welcome in the Octagon Theatre at UWA, we had amongst us Tove Torvalds, and their three small girls: Patricia (6), Daniella (4), and Celeste (2).

The opening of Linux.conf.au 2003: James Bromberger (l), Rusty Russell (c), Linus Torvalds (r)

As the conference opened, I took to the stage, and the Penguin waddled on. I commented that we have a mascot, and he’s here; Rusty then joined me, and removed the head of the Penguin to reveal Linus within.

Linus in Penguin costume talking to his daughters

During the earlier rehearsal, the girls were so amused to see their Daddy in a penguin suit; there were some lovely photos of them inspecting the suit, and looking at their Dad change the world while having fun.

On that opening welcome – the morning in the Penguin suit – the temperature was over 40°C.

For a non-profit event, we had so much money left over that it was decided to reduce the profit by ordering pizza for lunch on the last day. Days ahead, we drove to a local Pizza Hut branch, and asked what would be required to order some 300 pizzas, and whether they could deliver them effectively. We cut a cheque (remember those?) two days in advance, and on the day, two minivans stuffed to the roof turned up.

LCA spelt with Pizza Boxes, a slang name for a common form factor of servers around this time, and one of the LCA yellow banners.

Prior to recycling, I suggested we spell out the name of this event in pizza boxes as a fun tribute to the amount of pizza we all consumed as we cut code, and changed the world. This photo embodies LCA (and appears on the Wikipedia page). I think I took the image from the library balcony, but I may be wrong.

LCA2003 was the first time we had full audio recordings of all main conference sessions. Ogg Speex was the best codec at the time, and video was just beyond us. A CD was produced containing all recordings, plus a source copy of the Speex codec.

LCA2003 closed on 25 January 2003.

Then on the 26th (Australia Day) my then girlfriend and I grabbed our bags, and moved to the UK for 1 year (PS: it was 8).

Roaming around the northern hemisphere

My time in Europe got me to FOSDEM and DebConf many times. I was at UKUUG, tripping through Cambridge occasionally, seeing people whom I had previously met from the Debian community at the LSM in Bordeaux. I met new people as well, some of whom have since made the trip to Australia in order to present at LCA.

James Bromberger (far l), Barry White (l), Pierre Denis (c), Simon Wardley (far r), at the Fotango Christmas party (skiing) in Chamonix in 2004.

I spent time at Fotango (part of Canon Europe) working with some awesome Perl developers, and running the data centres and infrastructure.

Returning Home

Upon my return to Perth in 2010, I went back to PLUG, to find a new generation of people who were going to LCA 2011 in Brisbane.

I started a cunning plan with the PLUG crew; we put forward a proposal to the Lottery commission for $10,000 to get equipment for us to set up a single stream for video recording using DVSwitch in order to record the regular PLUG meetings.

Euan (l) and Jason (r) playing with DVSwitch at Perth Artifactory in 2011

It worked; a crew came together, and PLUG had some practice at what running a Video Team required (at the time).  I managed to convince Luke John to put forward a proposal to run LCA – it had been nearly 10 years since it had been in Perth – and thus it came to pass.

I, however, was not going to be front and centre in 2014 (though I did give a presentation on Debian and AWS at the 2014 conference).

But I found a new role I could play. With the additional video kit, and a bit of organising, we grabbed a couch and for one year, created LCA TV – an opportunity to grab on video some of the amazing people who come to Linux.conf.au. While we now have great video of presentations, it’s nice to have a few minutes for chat with those amongst us, captured for posterity.

LCA TV 2014

I want to thank the LA Council, who have had the courage to let LCA wander the region year to year. I want to thank the LCA crews I worked with in 2003 and 2014, but I also want to thank the crew from every year, the speakers who have stood up and spoken, the video teams, and the volunteers.

Looking forward; I want to thank people who haven’t done what they are going to do yet: those who will run LCA in future, and those who will give their time to share their knowledge with others, across countries, languages, companies and more.

Linux.Conf.Au has been central to the success of technical talent in the Linux and Open Source space in this region.


Arriving at CHC Airport, LCA 2019 was present for conference registrations in the terminal.

I have one more person to thank. My then-girlfriend in 2003, now my wife of many years who has put up with me spending so much time attending, planning and running a technical conference over the years. Thanks Andrea.

The Gentle Art of Swedish Death Cleaning


We’ve owned this book for a while, but ironically Catherine lost it for a bit. It seems very topical at the moment because of the Marie Kondo craze, but it’s been floating around our house for probably a year.

The book is written by an 80+ year old and explains the Swedish tradition of sorting your stuff out before you keel over, which seems like a totally reasonable thing to do when the other option is leaving your grieving kids to work out what on earth to do. The book isn’t as applicable to people who are not at the end of their lives — for example, it recommends starting with large things like furniture, and younger people are unlikely to have heaps of unneeded furniture.

That said, there is definitely advice in here that is applicable to other life stages.

The book is composed of a series of generally short chapters. They read a bit like small letters, notes, or blog posts. This makes the book feel very approachable, and it’s quite a fast read.

I enjoyed the book and I think I got some interesting things out of it.

The Gentle Art of Swedish Death Cleaning
Margareta Magnusson, Jane Magnusson
October 19, 2017
144 pages

Döstädning, or the art of death cleaning, is a Swedish phenomenon by which the elderly and their families set their affairs in order. Whether it's sorting the family heirlooms from the junk, downsizing to a smaller place, or setting up a system to help you stop misplacing your keys, death cleaning gives us the chance to make the later years of our lives as comfortable and stress-free as possible. Whatever your age, Swedish death cleaning can be used to help you de-clutter your life, and take stock of what's important. Margareta Magnusson has death cleaned for herself and for many others. Radical and joyous, her guide is an invigorating, touching and surprising process that can help you or someone you love immeasurably, and offers the chance to celebrate and reflect on all the tiny joys that make up a long life along the way.


Introduction to Information Systems & Systems Thinking

What Are Information Systems? (video)

Reflective One Minute Paper

The primary objective here is to define information systems. To do so, one must differentiate between raw, unorganised data and processed, organised, and structured information that is meaningful. Information is necessary for behaviour, decisions, and outcomes, and can be valued by various metrics (timeliness, appropriateness, accuracy, etc.). Information has a life-cycle: Creation (internal or external capture), Existence (Store/Retrieve, Use), Termination (Archive or Destroy).

The scholarly history of Information Systems developed initially from positivist approaches (business and economics), expanding to include interpretative (sociological) and critical (feminist, environmentalist) ones. The use of Information Systems as a discipline has been especially effective in software development and in the restructuring of business processes. Pragmatic processes are concerned with capturing and understanding business processes for analysis, with notational systems (e.g., Business Process Model and Notation) being employed.

Three significant questions arise from this review.

Firstly, is there a possibility of transcendence or overcoming (aufhebung) to break the information life-cycle, where iteration leads to qualitative improvement (destruction of information, conversely, would be a qualitative destruction)? If this is so, then information (and data) should always be at the very least archived and never destroyed.

Secondly, where is Information Systems placed in academia, given these influences? This has been a major issue for researchers and theorists for decades (e.g., Checkland (1988), Banville and Landry (1989), Paul (2007)), which in part reflects its history of attempting to be positivist, interpretative, and critical in its approaches. Perhaps overlooked in this analysis is the importance of the earlier Habermas-Luhmann debates between systems and critical approaches in social theory (Habermas, Luhmann, 1971).

Finally, the possibility is raised for the insights of critical approaches in information systems and the iterative techniques of agile project management (Highsmith, 2010) to open up formal mappings such as the Business Process Model and Notation. Applying new decision points that allow for democratic inputs and dynamically changing the model maps would allow for dynamic projects as well as established operations, whilst at the same time providing rigour to agile project management.

References

Checkland, P.B. (1988), Information Systems and Systems Thinking: Time to Unite?, International Journal of Information Management, 8 p239-248

Banville, Claude., Landry, Maurice (1989), Can the Field of MIS be Disciplined?, Communications of the ACM, Vol 32 No 1, p48-60

Habermas, Jurgen., Luhmann, Niklas (1971), Theorie der Gesellschaft oder Sozialtechnologie?, Suhrkamp

Highsmith, Jim (2010), Agile Project Management, 2nd edition, Addison-Wesley

Paul, Ray J. (2007), Challenges to information systems: time to change, European Journal of Information Systems 16, p193–195. doi:10.1057/palgrave.ejis.3000681

Systems Thinking (video)

Reflective One Minute Paper

The following provides a review of the history of and approaches to the concept of "systems". Etymologically, the word derives from the Greek sustēma, from sun- 'with' and histanai 'set up', meaning uniting, putting together. The first scientific use of the word comes from Carnot, referring to steam thermodynamics in the early-to-mid 19th century, with systems concepts then being applied in the evolutionary and biological sciences (e.g., Darwin, 1850s; Tansley, 1910s), and the biologist Bertalanffy (drawing from Bogdanov) developing a "general systems theory" in the 1930s. Wiener provided the notion of cybernetics, the general study of control and communication, in the 1940s, alongside the computing sciences with von Neumann, Turing, and Shannon.

'Cybernetics' is derived from the Greek word for steersman or helmsman, who provides the control system for a boat or ship; the same root gives us 'government'. Cybernetics can be applied to itself, and as such second-order cybernetics was developed by Beer et al in the 1960s, which included the role of the observer. Soft systems methodology, developed by Checkland, Wilson et al in the 1970s, allowed for normative evaluations to be applied within an interrogative systems approach. Finally, Kauffman, Botkin et al in the 1990s developed a systems approach to complexity which identifies disequilibria dynamics in self-organisation.

Overlooked in the lecture is the extremely important contribution of systems theory within sociology and social theory, deriving from the functional sociology of Weber (1900s), the structuralism of Parsons (1950s), the systems theory of Luhmann (1960s), and the neofunctionalism of Alexander (2000s). Second-order cybernetics also has relevance to Giddens' (1987) double hermeneutic, a distinguishing difference between the social and natural sciences, where the observer is observed as well as the subject.

The structure of a system is a static property and refers to the constituent elements of the system and their relationship to each other. The behaviour is a dynamic property and refers to the effect produced by a system in operation. Feedback is information about the results of a process which is used to change the process. The homeostat, the human being, and the thermostat all are said to maintain homoeostasis or equilibrium, through feedback loops, which was promoted as a theory of everything (e.g., Odum, 1959). Whilst feedback plus systems behaviour was meant to provide a self-regulating system, the observed result is disequilibrium and dynamism at least in nature (equilibrium is present in machines).

Of note was the misuse of language carried from computer engineering to systems theory, e.g., the use of "memory" for "information storage" (primary, secondary, tertiary, dynamic, fixed), and likewise the use of "information theory" instead of "signal transmission theory". Apropos, Shannon (1948) even described "A Mathematical Theory of Communication", an exceptional paper on signal transmission and noise, but one which did not touch upon the pragmatics or semantics of language.

References

Giddens, A., (1987), Social Theory and Modern Sociology, Polity Press, p20-21

Odum, Howard (1959) "The relationships between producer plants and consumer animals, between predator and prey, not to mention the numbers and kinds of organisms in a given environment, are all limited and controlled by the same basic laws which govern non-living systems, such as electric motors and automobiles.", Fundamentals of Ecology, 2nd ed p44

Shannon, C.E., (1948), A Mathematical Theory of Communication, The Bell System Technical Journal, Vol. 27, pp. 379-423 and 623-656, 1948.

LUV February 2019 Main Meeting: All users will eventually die / ZeroTier

Feb 5 2019 19:00
Feb 5 2019 21:00
Location: 
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

7:00 PM to 9:00 PM Tuesday, February 5, 2019
Training Room, Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053


Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.


LUV February 2019 Workshop: TBA

Feb 23 2019 12:30
Feb 23 2019 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

PLEASE NOTE CHANGE OF DATE

Topic to be announced

There will also be the usual casual hands-on workshop, Linux installation, configuration and assistance and advice. Bring your laptop if you need help with a particular issue. This will now occur BEFORE the talks from 12:30 to 14:00. The talks will commence at 14:00 (2pm) so there is time for people to have lunch nearby.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121.  Late arrivals please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria is a subcommittee of Linux Australia.


January 29, 2019

Save the Dates! Linux Security Summit Events for 2019.

There will be two Linux Security Summit (LSS) events again this year.

Stay tuned for CFP announcements!

Distributed Storage is Easier Now: Usability from Ceph Luminous to Nautilus

On January 21, 2019 I presented Distributed Storage is Easier Now: Usability from Ceph Luminous to Nautilus at the linux.conf.au 2019 Systems Administration Miniconf. Thanks to the incredible Next Day Video crew, the video was online the next day, and you can watch it here:

If you’d rather read than watch, the meat of the talk follows, but before we get to that I have two important announcements:

  1. Cephalocon 2019 is coming up on May 19-20, in Barcelona, Spain. The CFP is open until Friday February 1, so time is rapidly running out for submissions. Get onto it.
  2. If you’re able to make it to FOSDEM on February 2-3, there’s a whole Software Defined Storage Developer Room thing going on, with loads of excellent content including What’s new in Ceph Nautilus – project status update and preview of the coming release and Managing and Monitoring Ceph with the Ceph Manager Dashboard, which will cover rather more than I was able to here.

Back to the talk. At linux.conf.au 2018, Sage Weil presented “Making distributed storage easy: usability in Ceph Luminous and beyond”. What follows is somewhat of a sequel to that talk, covering the changes we’ve made in the meantime, and what’s still coming down the track. If you’re not familiar with Ceph, you should probably check out A Gentle Introduction to Ceph before proceeding. In brief though, Ceph provides object, block and file storage in a single, horizontally scalable cluster, with no single points of failure. It’s Free and Open Source software, it runs on commodity hardware, and it tries to be self-managing wherever possible, so it notices when disks fail, and replicates data elsewhere. It does background scrubbing, and it tries to balance data evenly across the cluster. But you do still need to actually administer it.

This leads to one of the first points Sage made this time last year: Ceph is Hard. Status display and logs were traditionally difficult to parse visually, there were (and still are) lots of configuration options, tricky authentication setup, and it was difficult to figure out the number of placement groups to use (which is really an internal detail of how Ceph shards data across the cluster, and ideally nobody should need to worry about it). Also, you had to do everything with a CLI, unless you had a third-party GUI.

I’d like to be able to flip this point to the past tense, because a bunch of those things were already fixed in the Luminous release in August 2017; status display and logs were cleaned up, a balancer module was added to help ensure data is spread more evenly, crush device classes were added to differentiate between HDDs and SSDs, a new in-tree web dashboard was added (although it was read-only, so just cluster status display, no admin tasks), plus a bunch of other stuff.

But we can’t go all the way to saying “Ceph was hard”, because that might imply that everything is now easy. So until we reach that frabjous day, I’m just going to say that Ceph is easier now, and it will continue to get easier in future.

At linux.conf.au in January 2018, we were half way through the Mimic development cycle, and at the time the major usability enhancements planned included:

  • Centralised configuration management
  • Slick deployment in Kubernetes with Rook
  • A vastly improved dashboard based on ceph-mgr and openATTIC
  • Placement Group merging

We got some of that stuff done for Mimic, which was released in June 2018, and more of it is coming in the Nautilus release, which is due out very soon.

In terms of usability improvements, Mimic gave us a new dashboard, inspired by and derived from openATTIC. This dashboard includes all the features of the Luminous dashboard, plus username/password authentication, SSL/TLS support, RBD and RGW management, and a configuration settings browser. Mimic also brought the ability to store and manage configuration options centrally on the MONs, which means we no longer need to set options in /etc/ceph/ceph.conf, replicate that across the cluster, and restart whatever daemons were affected. Instead, you can run `ceph config set ...` to make configuration changes. For initial cluster bootstrap, you can even use DNS SRV records rather than specifying MON hosts in the ceph.conf file.
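
To give a feel for the centralised workflow, here's a minimal sketch (the daemon and option names are illustrative examples, not from the talk; `ceph config ls` lists the options available on your cluster):

# Dump everything stored centrally on the MONs
ceph config dump

# Change an option cluster-wide: no ceph.conf edits, no daemon restarts
ceph config set mon mon_allow_pool_delete true

# Check the value a specific daemon will actually use (mon.a is a placeholder)
ceph config get mon.a mon_allow_pool_delete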

As I mentioned, the Nautilus release is due out really soon, and will include a bunch more good stuff:

  • PG autoscaling (see the sketch after this list)
  • More dashboard enhancements, including:
    • Multiple users/roles, also single sign on via SAML
    • Internationalisation and localisation
    • iSCSI and NFS Ganesha management
    • Embedded Grafana dashboards
    • The ability to mark OSDs up/down/in/out, and trigger scrubs/deep scrubs
    • Storage pool management
    • A configuration settings editor which actually tells you what the configuration settings mean, and do
    • To see what this all looks like, check out Ceph Manager Dashboard Screenshots as of 2019-01-17
  • Blinky lights, that being the ability to turn on or off the ident and fault LEDs for the disk(s) backing a given OSD, so you can find the damn things in your DC.
  • Orchestrator module(s)
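
To give a flavour of what PG autoscaling looks like in practice, here's a sketch using the Nautilus commands (`mypool` is a hypothetical pool name):

# Enable the autoscaler module in ceph-mgr
ceph mgr module enable pg_autoscaler

# Let Ceph adjust pg_num for a pool automatically
ceph osd pool set mypool pg_autoscale_mode on

# See what the autoscaler recommends for each pool
ceph osd pool autoscale-status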

Blinky lights, and some of the dashboard functionality (notably configuring iSCSI gateways and NFS Ganesha), mean that Ceph needs to be able to talk to whatever tool it was that deployed the cluster, which leads to the final big thing I want to talk about for the Nautilus release: the Orchestrator modules.

There’s a bunch of ways to deploy Ceph, and your deployment tool will always know more about your environment, and have more power to do things, than Ceph itself will. But if you’re managing Ceph through the inbuilt dashboard and CLI tools, there are things you want to be able to do as a Ceph admin that Ceph itself can’t do. Ceph can’t deploy a new MDS, or RGW, or NFS Ganesha host. Ceph can’t deploy new OSDs by itself. Ceph can’t blink the lights on a disk on some host if Ceph itself has somehow failed, but the host is still up. For these things, you rely on your deployment tool, whatever it is. So Nautilus will include Orchestrator modules for Ansible, DeepSea/Salt, and Rook/Kubernetes, which allow the Ceph management tools to call out to your deployment tool as necessary to have it perform those tasks. This is the bit I’m working on at the moment.

Beyond Nautilus, Octopus is the next release, due in a bit more than nine months, and on the usability front I know we can expect more dashboard and more orchestrator functionality, but before that, we have the Software Defined Storage Developer Room at FOSDEM on February 2-3 and Cephalocon 2019 on May 19-20. Hopefully some of you reading this will be able to attend :-)

Update 2019-02-04: Check out Sage’s What’s new in Ceph Nautilus FOSDEM talk for much more detail on what’s coming up in Nautilus and beyond.

January 27, 2019

Best Foot Forward


Catherine and I have been huge fans of Adam Hills for ages, so it wasn’t a surprise to me that I’d like a book by him. As an aside, we’ve never seen him live — we had tickets for his show in Canberra in 2013, but some of us ended up in labour in hospital instead, so we had to give those tickets away. One day we’ll manage to see him live though; he just needs to get back to touring Australia more!

Anyways, I enjoyed this book, which as mentioned above wasn’t a surprise. What was a surprise is that he said something interesting which I have been pondering for the last few days…

Basically, it’s nice to get on stage and say things, either entertaining the audience or, in my case, perhaps educating them a little (I give technical conference talks). However, that’s not the most important thing. You need to work out why you’re on that stage before you go out there. What is the overall thing you’re trying to convey? Once you know that, everything else falls into place. I think this is especially true for keynote speeches, which need to appeal to a more general audience than a conference talk, where people can pick from a menu.

What Adam seems to be saying in his comedy (at least to me) is to embrace life and be good to each other. Adam is a super positive guy, which is delightful. There is something very special about someone who lifts up those around them. I hope to be that person one day.

Best Foot Forward
Adam Hills
Autobiography, Hachette Australia, paperback, 353 pages
