Planet Linux Australia
Celebrating Australians & Kiwis in the Linux and Free/Open-Source community...

June 26, 2019

Installing NixOS on a Headless Raspberry Pi 3

NixOS Raspberry Pi Gears by Craige McWhirter

This represents the first step towards being able to build ready-to-run NixOS images for headless Raspberry Pi 3 devices. aarch64 images for NixOS need to be built natively on aarch64 hardware, so the first Pi 3 (the subject of this post) will need a keyboard and screen attached for two commands.

A fair chunk of this post is collated from NixOS on ARM and NixOS on ARM/Raspberry Pi into a coherent, flowing process with additional steps related to the goal of this being a headless Raspberry Pi 3.

Head to Hydra job nixos:release-19.03:nixos.sd_image.aarch64-linux and download the latest successful build, e.g.:

 $ wget https://hydra.nixos.org/build/95346103/download/1/nixos-sd-image-19.03.172980.d5a3e5f476b-aarch64-linux.img

You will then need to write this to your SD Card:

# dd if=nixos-sd-image-19.03.172980.d5a3e5f476b-aarch64-linux.img of=/dev/sdX status=progress

Make sure you replace "/dev/sdX" with the correct location of your SD card.

Once the SD card has been written, attach the keyboard and screen, insert the SD card into the Pi and boot it up.

When the boot process has been completed, you will be thrown to a root prompt where you need to set a password for root and start the ssh service:

[root@pi-tri:~]#
[root@pi-tri:~]# passwd
New password:
Retype new password:
passwd: password updated successfully

[root@pi-tri:~]# systemctl start sshd

You can now complete the rest of this process from the comfort of wherever you normally work.
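For example, assuming your DHCP server handed the Pi the address 192.168.1.50 (substitute your Pi's actual IP address or hostname):

 $ ssh root@192.168.1.50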

After successfully ssh-ing in and examining your disk layout with lsblk, the first step is to move the boot flag off the undersized FAT32 /boot partition and onto the main Linux partition:

# fdisk -l /dev/mmcblk0
Disk /dev/mmcblk0: 7.4 GiB, 7948206080 bytes, 15523840 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x2178694e

Device         Boot  Start      End  Sectors  Size Id Type
/dev/mmcblk0p1 *     16384   262143   245760  120M  b W95 FAT32
/dev/mmcblk0p2      262144 15522439 15260296  7.3G 83 Linux


# echo -e 'a\n1\na\n2\nw' | fdisk /dev/mmcblk0

Welcome to fdisk (util-linux 2.32.1).
Changes will remain in memory only, until you decide to write them.
Be careful before using the write command.


Command (m for help): Partition number (1,2, default 2):
The bootable flag on partition 1 is disabled now.

Command (m for help): Partition number (1,2, default 2):
The bootable flag on partition 2 is enabled now.

Command (m for help): The partition table has been altered.
Syncing disks.

# fdisk -l /dev/mmcblk0
Disk /dev/mmcblk0: 7.4 GiB, 7948206080 bytes, 15523840 sectors
Units: sectors of 1 * 512 = 512 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disklabel type: dos
Disk identifier: 0x2178694e

Device         Boot  Start      End  Sectors  Size Id Type
/dev/mmcblk0p1       16384   262143   245760  120M  b W95 FAT32
/dev/mmcblk0p2 *    262144 15522439 15260296  7.3G 83 Linux

Next we need to configure NixOS to boot the basic system we need, with ssh enabled, root and a single user set up, and the disks configured correctly. I have an example file which at the time of writing looked like this:

# This is an example of a basic NixOS configuration file for a Raspberry Pi 3.
# It's best used as your first configuration.nix file and provides ssh, root
# and user accounts as well as Pi 3 specific tweaks.

{ config, pkgs, lib, ... }:

{
  # NixOS wants to enable GRUB by default
  boot.loader.grub.enable = false;
  # Enables the generation of /boot/extlinux/extlinux.conf
  boot.loader.generic-extlinux-compatible.enable = true;

  # For a Raspberry Pi 2 or 3:
  boot.kernelPackages = pkgs.linuxPackages_latest;

  # !!! Needed for the virtual console to work on the RPi 3, as the default of 16M doesn't seem to be enough.
  # If X.org behaves weirdly (I only saw the cursor) then try increasing this to 256M.
  boot.kernelParams = ["cma=32M"];

  # File systems configuration for using the installer's partition layout
  fileSystems = {
    "/" = {
      device = "/dev/disk/by-label/NIXOS_SD";
      fsType = "ext4";
    };
  };

  # !!! Adding a swap file is optional, but strongly recommended!
  swapDevices = [ { device = "/swapfile"; size = 1024; } ];

  hardware.enableRedistributableFirmware = true; # Enable support for Pi firmware blobs

  networking.hostName = "nixosPi";     # Define your hostname.
  networking.wireless.enable = false;  # Toggles wireless support via wpa_supplicant.

  # Select internationalisation properties.
  i18n = {
    consoleFont = "Lat2-Terminus16";
    consoleKeyMap = "us";
    defaultLocale = "en_AU.UTF-8";
  };

  time.timeZone = "Australia/Brisbane"; # Set your preferred timezone:

  # List services that you want to enable:
  services.openssh.enable = true;  # Enable the OpenSSH daemon.

  # Configure users for your Pi:
  users.mutableUsers = false;     # Remove any users not defined in here

  users.users.root = {
    hashedPassword = "$6$eeqJLxwQzMP4l$GTUALgbCfaqR8ut9kQOOG8uXOuqhtIsIUSP.4ncVaIs5PNlxdvAvV.krfutHafrxNN7KzaM7uksr6bXP5X0Sx1";
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 Voohu4vei4dayohm3eeHeecheifahxeetauR4geigh9eTheey3eedae4ais7pei4ruv4 me@myhost"
    ];
  };

  # Groups to add
  users.groups.myusername.gid = 1000;

  # Define a user account.
  users.users.myusername = {
    isNormalUser = true;
    uid = 1000;
    group = "myusername";
    extraGroups = ["wheel" ];
    hashedPassword = "$6$l2I7i6YqMpeviVy$u84FSHGvZlDCfR8qfrgaP.n7/hkfGpuiSaOY3ziamwXXHkccrOr8Md4V5G2M1KcMJQmX5qP7KOryGAxAtc5T60";
    openssh.authorizedKeys.keys = [
      "ssh-ed25519 Voohu4vei4dayohm3eeHeecheifahxeetauR4geigh9eTheey3eedae4ais7pei4ruv4 me@myhost"
    ];
  };

  # This value determines the NixOS release with which your system is to be
  # compatible, in order to avoid breaking some software such as database
  # servers. You should change this only after NixOS release notes say you
  # should.
  system.stateVersion = "19.03"; # Did you read the comment?
  system.autoUpgrade.enable = true;
  system.autoUpgrade.channel = https://nixos.org/channels/nixos-19.03;
}

Once this is copied into place as /etc/nixos/configuration.nix, you only need to rebuild NixOS using it by running:

# nixos-rebuild switch
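If you prepared the file on your workstation, copying it into place beforehand might look like this (the host name is a placeholder for your Pi's address):

$ scp configuration.nix root@pi-tri:/etc/nixos/configuration.nix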

Now you should have a headless Pi 3 which you can use to build SD card images for other Pi 3s that are fully configured and ready to run.

June 25, 2019

Linux Security Summit North America 2019: Schedule Published

The schedule for the 2019 Linux Security Summit North America (LSS-NA) is published.

This year, there are some changes to the format of LSS-NA. The summit runs for three days instead of two, which allows us to relax the schedule somewhat while also adding new session types.  In addition to refereed talks, short topics, BoF sessions, and subsystem updates, there are now also tutorials (one each day), unconference sessions, and lightning talks.

The tutorial sessions are:

These tutorials will be 90 minutes in length, and they’ll run in parallel with unconference sessions on the first two days (when the space is available at the venue).

The refereed presentations and short topics cover a range of Linux security topics including platform boot security, integrity, container security, kernel self protection, fuzzing, and eBPF+LSM.

Some of the talks I’m personally excited about include:

The schedule last year was pretty crammed, so with the addition of the third day we’ve been able to avoid starting early, and we’ve also added five minute transitions between talks. We’re hoping to maximize collaboration via the more relaxed schedule and the addition of more types of sessions (unconference, tutorials, lightning talks).  This is not a conference for simply consuming talks, but also for participating and getting things done (or started).

Thank you to all who submitted proposals.  As usual, we had many more submissions than can be accommodated in the available time.

Also thanks to the program committee, who spent considerable time reviewing and discussing proposals, and working out the details of the schedule. The committee for 2019 is:

  • James Morris (Microsoft)
  • Serge Hallyn (Cisco)
  • Paul Moore (Cisco)
  • Stephen Smalley (NSA)
  • Elena Reshetova (Intel)
  • John Johansen (Canonical)
  • Kees Cook (Google)
  • Casey Schaufler (Intel)
  • Mimi Zohar (IBM)
  • David A. Wheeler (Institute for Defense Analyses)

And of course many thanks to the event folk at Linux Foundation, who handle all of the logistics of the event.

LSS-NA will be held in San Diego, CA on August 19-21. You can register for LSS-NA on its own, or register for the co-located Open Source Summit and add LSS-NA.

 

June 23, 2019

X-Axis is now ready!

The thread plate is now mounted to the base with thread lock in select locations. The top can still come off easily so I can drill holes to mount the gantry to the alloy tongue that comes out the bottom middle (there is one on the other side too).


Without the 75mm by 50mm by 1/4 inch 6061 alloy angle brackets you could flex the steel in the middle. Now, well... it is not so easy for a human to apply enough force to do it. The thread plate is only supported by 4 columns at the left and right sides. The middle is unsupported to allow the gantry to travel 950mm along it. I think the next build will be more of a vertical mill style than a sliding gantry, to avoid these rigidity challenges.


June 20, 2019

Booting a NixOS aarch64 Image in Qemu

NixOS Gears by Craige McWhirter

To boot a NixOS aarch64 image in qemu, in this example emulating a Raspberry Pi 3 (B), you can use the following command:

 qemu-system-aarch64 -M raspi3 -drive format=raw,file=NIXOS.IMG \
 -kernel ./u-boot-rpi3.bin -serial stdio -d in_asm -m 1024

You will need to replace NIXOS.IMG with the name of the image file you downloaded, e.g. nixos-sd-image-18.09.2568.1e9e709953e-aarch64-linux.img

You will also need to mount the image file and copy out u-boot-rpi3.bin for the -kernel option.
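One way to do that is via a loop device. This is only a sketch, and it assumes u-boot-rpi3.bin lives on the image's first (FAT) partition — check the layout with fdisk -l if in doubt:

 $ sudo losetup -fP --show nixos-sd-image-18.09.2568.1e9e709953e-aarch64-linux.img
 /dev/loop0
 $ sudo mount /dev/loop0p1 /mnt
 $ cp /mnt/u-boot-rpi3.bin .
 $ sudo umount /mnt
 $ sudo losetup -d /dev/loop0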

A nerd snipe, in which I reverse engineer the Aussie Broadband usage API

I was curious about the newly available FTTN NBN service in my area, so I signed up to see what’s what. Of course, I need a usage API so that I can graph my usage in prometheus and grafana as everyone does these days. So I asked Aussie. The response I got was that I was welcome to reverse engineer the REST API that the customer portal uses.

So I did.

I give you my super simple implementation of an Aussie Broadband usage client in Python. Patches of course are welcome.

I’ve now released the library on pypi under the rather innovative name of “aussiebb”, so installing it is as simple as:

$ pip install aussiebb

June 18, 2019

TEN THOUSAND DISKS

In OpenPOWER land we have a project called op-test-framework which (for all its strengths and weaknesses) allows us to test firmware on a variety of different hardware platforms and even emulators like Qemu.

Qemu is a fantastic tool allowing us to relatively quickly test against an emulated POWER model, and of course is a critical part of KVM virtual machines running natively on POWER hardware. However the default POWER model in Qemu is based on the "pseries" machine type, which models something closer to a virtual machine or a PowerVM partition rather than a "bare metal" machine.

Luckily we have Cédric Le Goater who is developing and maintaining a Qemu "powernv" machine type which more accurately models running directly on an OpenPOWER machine. It's an unwritten rule that if you're using Qemu in op-test, you've compiled this version of Qemu!

Teething Problems

Because the "powernv" type does more accurately model the physical system some extra care needs to be taken when setting it up. In particular at one point we noticed that the pretend CDROM and disk drive we attached to the model were.. not being attached. This commit took care of that; the problem was that the PCI topology defined by the layout required us to be more exact about where PCI devices were to be added. By default only three spare PCI "slots" are available but as the commit says, "This can be expanded by adding bridges"...

More Slots!

Never one to stop at a just-enough solution, I wondered how easy it would be to add an extra PCI bridge or two to give the Qemu model more available slots for PCI devices. It turns out, easy enough once you know the correct invocation. For example, adding a PCI bridge in the first slot of the first default PHB is:

-device pcie-pci-bridge,id=pcie.3,bus=pcie.0,addr=0x0

And inserting a device in that bridge just requires us to specify the bus and slot:

-device virtio-blk-pci,drive=cdrom01,id=virtio02,bus=pcie.4,addr=3

Great! Each bridge provides 31 slots, so now we have plenty of room for extra devices.

Why Stop There?

We have three free slots, and we don't have a strict requirement on where devices are plugged in, so lets just plug a bridge into each of those slots while we're here:

-device pcie-pci-bridge,id=pcie.3,bus=pcie.0,addr=0x0 \
-device pcie-pci-bridge,id=pcie.4,bus=pcie.1,addr=0x0 \
-device pcie-pci-bridge,id=pcie.5,bus=pcie.2,addr=0x0

What happens if we insert a new PCI bridge into another PCI bridge? Aside from stressing out our PCI developers, a bunch of extra slots! And then we could plug bridges into those bridges and then..
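Following the same pattern we could, for example, nest another bridge inside one of those bridges and hang a disk off it (the ids, addresses and drive name below are arbitrary examples, and assume a matching -drive id=disk99 has been defined):

-device pcie-pci-bridge,id=pcie.6,bus=pcie.3,addr=0x1 \
-device virtio-blk-pci,drive=disk99,id=virtio99,bus=pcie.6,addr=0x2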


Thus was born "OpTestQemu: Add PCI bridges to support more devices." and the testcase "Petitboot10000Disks". The changes to the Qemu model setup fill up each PCI bridge as long as we have devices to add, but reserve the first slot to add another bridge if we run out of room... and so on..

Officially this is to support adding interesting disk topologies to test Petitboot use cases, stress test device handling, and so on, but while we're here... what happens with 10,000 temporary disks?

======================================================================
ERROR: testListDisks (testcases.Petitboot10000Disks.ConfigEditorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/sam/git/op-test-framework/testcases/Petitboot10000Disks.py", line 27, in setUp
    self.system.goto_state(OpSystemState.PETITBOOT_SHELL)
  File "/home/sam/git/op-test-framework/common/OpTestSystem.py", line 366, in goto_state
    self.state = self.stateHandlers[self.state](state)
  File "/home/sam/git/op-test-framework/common/OpTestSystem.py", line 695, in run_IPLing
    raise my_exception
UnknownStateTransition: Something happened system state="2" and we transitioned to UNKNOWN state.  Review the following for more details
Message="OpTestSystem in run_IPLing and the Exception=
"filedescriptor out of range in select()"
 caused the system to go to UNKNOWN_BAD and the system will be stopping."

Yeah that's probably to be expected without some more massaging. What about a more modest 512?

I: Resetting PHBs and training links...
[   55.293343496,5] PCI: Probing slots...
[   56.364337089,3] PHB#0000:02:01.0 pci_find_ecap hit a loop !
[   56.364973775,3] PHB#0000:02:01.0 pci_find_ecap hit a loop !
[   57.127964432,3] PHB#0000:03:01.0 pci_find_ecap hit a loop !
[   57.128545637,3] PHB#0000:03:01.0 pci_find_ecap hit a loop !
[   57.395489618,3] PHB#0000:04:01.0 pci_find_ecap hit a loop !
[   57.396048285,3] PHB#0000:04:01.0 pci_find_ecap hit a loop !
[   58.145944205,3] PHB#0000:05:01.0 pci_find_ecap hit a loop !
[   58.146465795,3] PHB#0000:05:01.0 pci_find_ecap hit a loop !
[   58.404954853,3] PHB#0000:06:01.0 pci_find_ecap hit a loop !
[   58.405485438,3] PHB#0000:06:01.0 pci_find_ecap hit a loop !
[   60.178957315,3] PHB#0001:02:01.0 pci_find_ecap hit a loop !
[   60.179524173,3] PHB#0001:02:01.0 pci_find_ecap hit a loop !
[   60.198502097,3] PHB#0001:02:02.0 pci_find_ecap hit a loop !
[   60.198982582,3] PHB#0001:02:02.0 pci_find_ecap hit a loop !
[   60.435096197,3] PHB#0001:03:01.0 pci_find_ecap hit a loop !
[   60.435634380,3] PHB#0001:03:01.0 pci_find_ecap hit a loop !
[   61.171512439,3] PHB#0001:04:01.0 pci_find_ecap hit a loop !
[   61.172029071,3] PHB#0001:04:01.0 pci_find_ecap hit a loop !
[   61.425416049,3] PHB#0001:05:01.0 pci_find_ecap hit a loop !
[   61.425934524,3] PHB#0001:05:01.0 pci_find_ecap hit a loop !
[   62.172664549,3] PHB#0001:06:01.0 pci_find_ecap hit a loop !
[   62.173186458,3] PHB#0001:06:01.0 pci_find_ecap hit a loop !
[   63.434516732,3] PHB#0002:02:01.0 pci_find_ecap hit a loop !
[   63.435062124,3] PHB#0002:02:01.0 pci_find_ecap hit a loop !
[   64.177567772,3] PHB#0002:03:01.0 pci_find_ecap hit a loop !
[   64.178099773,3] PHB#0002:03:01.0 pci_find_ecap hit a loop !
[   64.431763989,3] PHB#0002:04:01.0 pci_find_ecap hit a loop !
[   64.432285000,3] PHB#0002:04:01.0 pci_find_ecap hit a loop !
[   65.180506790,3] PHB#0002:05:01.0 pci_find_ecap hit a loop !
[   65.181049905,3] PHB#0002:05:01.0 pci_find_ecap hit a loop !
[   65.432105600,3] PHB#0002:06:01.0 pci_find_ecap hit a loop !
[   65.432654326,3] PHB#0002:06:01.0 pci_find_ecap hit a loop !

(That isn't good)

[   66.177240655,5] PCI Summary:
[   66.177906083,5] PHB#0000:00:00.0 [ROOT] 1014 03dc R:00 C:060400 B:01..07 
[   66.178760724,5] PHB#0000:01:00.0 [ETOX] 1b36 000e R:00 C:060400 B:02..07 
[   66.179501494,5] PHB#0000:02:01.0 [ETOX] 1b36 000e R:00 C:060400 B:03..07 
[   66.180227773,5] PHB#0000:03:01.0 [ETOX] 1b36 000e R:00 C:060400 B:04..07 
[   66.180953149,5] PHB#0000:04:01.0 [ETOX] 1b36 000e R:00 C:060400 B:05..07 
[   66.181673576,5] PHB#0000:05:01.0 [ETOX] 1b36 000e R:00 C:060400 B:06..07 
[   66.182395253,5] PHB#0000:06:01.0 [ETOX] 1b36 000e R:00 C:060400 B:07..07 
[   66.183207399,5] PHB#0000:07:02.0 [PCID] 1af4 1001 R:00 C:010000 (          scsi) 
[   66.183969138,5] PHB#0000:07:03.0 [PCID] 1af4 1001 R:00 C:010000 (          scsi) 

(a lot more of this)

[   67.055196945,5] PHB#0002:02:1e.0 [PCID] 1af4 1001 R:00 C:010000 (          scsi) 
[   67.055926264,5] PHB#0002:02:1f.0 [PCID] 1af4 1001 R:00 C:010000 (          scsi) 
[   67.094591773,5] INIT: Waiting for kernel...
[   67.095105901,5] INIT: 64-bit LE kernel discovered
[   68.095749915,5] INIT: Starting kernel at 0x20010000, fdt at 0x3075d270 168365 bytes

zImage starting: loaded at 0x0000000020010000 (sp: 0x0000000020d30ee8)
Allocating 0x1dc5098 bytes for kernel...
Decompressing (0x0000000000000000 <- 0x000000002001f000:0x0000000020d2e578)...
Done! Decompressed 0x1c22900 bytes

Linux/PowerPC load: 
Finalizing device tree... flat tree at 0x20d320a0
[   10.120562] watchdog: CPU 0 self-detected hard LOCKUP @ pnv_pci_cfg_write+0x88/0xa4
[   10.120746] watchdog: CPU 0 TB:50402010473, last heartbeat TB:45261673150 (10039ms ago)
[   10.120808] Modules linked in:
[   10.120906] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.5-openpower1 #2
[   10.120956] NIP:  c000000000058544 LR: c00000000004d458 CTR: 0000000030052768
[   10.121006] REGS: c0000000fff5bd70 TRAP: 0900   Not tainted  (5.0.5-openpower1)
[   10.121030] MSR:  9000000002009033 <SF,HV,VEC,EE,ME,IR,DR,RI,LE>  CR: 48002482  XER: 20000000
[   10.121215] CFAR: c00000000004d454 IRQMASK: 1 
[   10.121260] GPR00: 00000000300051ec c0000000fd7c3130 c000000001bcaf00 0000000000000000 
[   10.121368] GPR04: 0000000048002482 c000000000058544 9000000002009033 0000000031c40060 
[   10.121476] GPR08: 0000000000000000 0000000031c40060 c00000000004d46c 9000000002001003 
[   10.121584] GPR12: 0000000031c40000 c000000001dd0000 c00000000000f560 0000000000000000 
[   10.121692] GPR16: 0000000000000000 0000000000000000 0000000000000001 0000000000000000 
[   10.121800] GPR20: 0000000000000000 0000000000000000 0000000000000000 0000000000000000 
[   10.121908] GPR24: 0000000000000005 0000000000000000 0000000000000000 0000000000000104 
[   10.122016] GPR28: 0000000000000002 0000000000000004 0000000000000086 c0000000fd9fba00 
[   10.122150] NIP [c000000000058544] pnv_pci_cfg_write+0x88/0xa4
[   10.122187] LR [c00000000004d458] opal_return+0x14/0x48
[   10.122204] Call Trace:
[   10.122251] [c0000000fd7c3130] [c000000000058544] pnv_pci_cfg_write+0x88/0xa4 (unreliable)
[   10.122332] [c0000000fd7c3150] [c0000000000585d0] pnv_pci_write_config+0x70/0x9c
[   10.122398] [c0000000fd7c31a0] [c000000000234fec] pci_bus_write_config_word+0x74/0x98
[   10.122458] [c0000000fd7c31f0] [c00000000023764c] __pci_read_base+0x88/0x3a4
[   10.122518] [c0000000fd7c32c0] [c000000000237a18] pci_read_bases+0xb0/0xc8
[   10.122605] [c0000000fd7c3300] [c0000000002384bc] pci_setup_device+0x4f8/0x5b0
[   10.122670] [c0000000fd7c33a0] [c000000000238d9c] pci_scan_single_device+0x9c/0xd4
[   10.122729] [c0000000fd7c33f0] [c000000000238e2c] pci_scan_slot+0x58/0xf4
[   10.122796] [c0000000fd7c3430] [c000000000239eb8] pci_scan_child_bus_extend+0x40/0x2a8
[   10.122861] [c0000000fd7c34a0] [c000000000239e34] pci_scan_bridge_extend+0x4d4/0x504
[   10.122928] [c0000000fd7c3580] [c00000000023a0f8] pci_scan_child_bus_extend+0x280/0x2a8
[   10.122993] [c0000000fd7c35f0] [c000000000239e34] pci_scan_bridge_extend+0x4d4/0x504
[   10.123059] [c0000000fd7c36d0] [c00000000023a0f8] pci_scan_child_bus_extend+0x280/0x2a8
[   10.123124] [c0000000fd7c3740] [c000000000239e34] pci_scan_bridge_extend+0x4d4/0x504
[   10.123191] [c0000000fd7c3820] [c00000000023a0f8] pci_scan_child_bus_extend+0x280/0x2a8
[   10.123256] [c0000000fd7c3890] [c000000000239b5c] pci_scan_bridge_extend+0x1fc/0x504
[   10.123322] [c0000000fd7c3970] [c00000000023a064] pci_scan_child_bus_extend+0x1ec/0x2a8
[   10.123388] [c0000000fd7c39e0] [c000000000239b5c] pci_scan_bridge_extend+0x1fc/0x504
[   10.123454] [c0000000fd7c3ac0] [c00000000023a064] pci_scan_child_bus_extend+0x1ec/0x2a8
[   10.123516] [c0000000fd7c3b30] [c000000000030dcc] pcibios_scan_phb+0x134/0x1f4
[   10.123574] [c0000000fd7c3bd0] [c00000000100a800] pcibios_init+0x9c/0xbc
[   10.123635] [c0000000fd7c3c50] [c00000000000f398] do_one_initcall+0x80/0x15c
[   10.123698] [c0000000fd7c3d10] [c000000001000e94] kernel_init_freeable+0x248/0x24c
[   10.123756] [c0000000fd7c3db0] [c00000000000f574] kernel_init+0x1c/0x150
[   10.123820] [c0000000fd7c3e20] [c00000000000b72c] ret_from_kernel_thread+0x5c/0x70
[   10.123854] Instruction dump:
[   10.123885] 7d054378 4bff56f5 60000000 38600000 38210020 e8010010 7c0803a6 4e800020 
[   10.124022] e86a0018 54c6043e 7d054378 4bff5731 <60000000> 4bffffd8 e86a0018 7d054378 
[   10.124180] Kernel panic - not syncing: Hard LOCKUP
[   10.124232] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.0.5-openpower1 #2
[   10.124251] Call Trace:

I wonder if I can submit that bug without someone throwing something at my desk.

The X Axis is growing...

The new cnc X axis will be around a meter in length. This presents some issues with material selection as steel that is 1100mm long by 350mm wide and 5mm thick will flex when only supported by the black columns at each end. I have some brackets to shore that up so the fixture plate will not be pushed away or vibrate under cutting load.




The linear rails are longer than the ballscrew to allow the gantry to travel the full length of the ballscrew. In this case a 1 meter ballscrew allows about 950mm of tip to tip travel and thus 850mm of cutter travel. The gantry is 100mm wide, shown as just the mounting plate in the picture above.

The black columns to hold the fixture plate are 38mm square and 60mm high solid steel. They come in at about 500 grams a pop. The steel plate is about 15kg. I was originally going to use 38mm solid square steel stock as the shims under the linear rails but they came in at over 8kg each and the build was starting to get heavy.

The columns are m6 tapped both ends to hold the fixture plate up above the assembly. I will likely laminate some 1.2mm alloy to the base of the fixture plate to mitigate chips falling through the screw fixture holes into the rails and ballscrew.

I have yet to work out the final order of the 1/4 inch 6061 brackets that shore up the 5mm thick fixture plate. Without edge brackets you can flex the steel when it is only supported at the ends. Yes, I can see why vertical mills are made.

I made the plate that will have the gantry attached on the cnc, but had to refixture things as the cnc can not cut something that long in any of its current axes.



It is interesting how much harder 6061 is compared to some of the more economical alloys when machining. You can see the cnc machine facing more resistance, especially on 6mm and larger holes. It will be interesting to see if the cnc can handle drilling steel at some stage.

June 15, 2019

OpenSUSE 15 LXC setup on Ubuntu Bionic 18.04

Similarly to what I wrote for Fedora, here is how I was able to create an OpenSUSE 15 LXC container on an Ubuntu 18.04 (bionic) laptop.

Setting up LXC on Ubuntu

First of all, install lxc:

apt install lxc
echo "veth" >> /etc/modules
modprobe veth

turn on bridged networking by putting the following in /etc/sysctl.d/local.conf:

net.ipv4.ip_forward=1

and applying it using:

sysctl -p /etc/sysctl.d/local.conf

Then allow the right traffic in your firewall (/etc/network/iptables.up.rules in my case):

# LXC containers
-A FORWARD -d 10.0.3.0/24 -m conntrack --ctstate RELATED,ESTABLISHED -j ACCEPT
-A FORWARD -s 10.0.3.0/24 -j ACCEPT
-A INPUT -d 224.0.0.251 -s 10.0.3.1 -j ACCEPT
-A INPUT -d 239.255.255.250 -s 10.0.3.1 -j ACCEPT
-A INPUT -d 10.0.3.255 -s 10.0.3.1 -j ACCEPT
-A INPUT -d 10.0.3.1 -s 10.0.3.0/24 -j ACCEPT

and apply these changes:

iptables-apply

before restarting the lxc networking:

systemctl restart lxc-net.service

Creating the container

Once that's in place, you can finally create the OpenSUSE 15 container:

lxc-create -n opensuse15 -t download -- -d opensuse -r 15 -a amd64

To see a list of all distros available with the download template:

lxc-create -n foo --template=download -- --list

Logging in as root

Start up the container and get a login console:

lxc-start -n opensuse15 -F

In another terminal, set a password for the root user:

lxc-attach -n opensuse15 passwd

You can now use this password to log into the console you started earlier.

Logging in as an unprivileged user via ssh

As root, install a few packages:

zypper install vim openssh sudo man
systemctl start sshd
systemctl enable sshd

and then create an unprivileged user:

useradd francois
passwd francois
cd /home
mkdir francois
chown francois:100 francois/

and give that user sudo access:

visudo  # uncomment "wheel" line
groupadd wheel
usermod -aG wheel francois

Now login as that user from the console and add an ssh public key:

mkdir .ssh
chmod 700 .ssh
echo "<your public key>" > .ssh/authorized_keys
chmod 644 .ssh/authorized_keys

You can now login via ssh. The IP address to use can be seen in the output of:

lxc-ls --fancy
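Then, for example (the address below is a placeholder; use whatever lxc-ls reports for your container):

ssh francois@10.0.3.65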

June 13, 2019

Intersections and connections


Raspberry Pi HAT identity EEPROMs, a simple guide

I’ve been working on an RFID scanner that can best be described as an overly large Raspberry Pi HAT recently. One of the things I am grappling with as I get closer to production boards is that I need to be able to identify what version of the HAT is currently installed — the software can then tweak its behaviour based on the hardware present.

I had toyed with using some spare GPIO lines and “hard coded” links on the HAT to identify board versions to the Raspberry Pi, but it turns out others have been here before and there’s a much better way. The Raspberry Pi folks have defined something called the “Hardware Attached on Top” (HAT) specification which defines an i2c EEPROM which can be used to identify a HAT to the Raspberry Pi.

There are a couple of good resources I’ve found that help you do this thing — sparkfun have a tutorial which covers it, and there is an interesting forum post. However, I couldn’t find a simple tutorial for HAT designers that just covered exactly what they need to know and nothing else. There were also some gaps in those documents compared with my experiences, and I knew I’d need to look this stuff up again in the future. So I wrote this page.

Initial setup

First off, let’s talk about the hardware. I used a 24LC256P DIL i2c EEPROM — these are $2 on ebay, or $6 from Jaycar. The pins need to be wired like this:

  • Pin 1 (A0) → GND (Pi pins 6, 9, 14, 20, 25, 30, 34, 39). All address pins tied to ground will place the EEPROM at address 50, which is the required address in the specification.
  • Pin 2 (A1) → GND
  • Pin 3 (A2) → GND
  • Pin 4 (VSS) → GND
  • Pin 5 (SDA) → Pi pin 27. You should also add a 3.9K pullup resistor from EEPROM pin 5 to 3.3V. You must use this pin for the Raspberry Pi to detect the EEPROM on startup!
  • Pin 6 (SCL) → Pi pin 28. You should also add a 3.9K pullup resistor from EEPROM pin 6 to 3.3V. You must use this pin for the Raspberry Pi to detect the EEPROM on startup!
  • Pin 7 (WP) → Not connected. Write protect; I don’t need this.
  • Pin 8 (VCC) → 3.3V (Pi pins 1 or 17). The EEPROM is capable of being run at 5 volts, but must be run at 3.3 volts to work as a HAT identification EEPROM.

The specification requires that the data pin be on pin 27, the clock pin be on pin 28, and that the EEPROM be at address 50 on the i2c bus, as described in the list above. There is also some mention of pullup resistors in both the data sheet and the HAT specification, but not in a lot of detail. The best I could find was a circuit diagram for a different EEPROM with the pullup resistors shown.

My test EEPROM wired up on a little breadboard looks like this:

My prototype i2c EEPROM circuit

And has a circuit diagram like this:

An ID EEPROM circuit

Next, enable i2c on your Raspberry Pi. You also need to hand edit /boot/config.txt and then reboot. The relevant line of my config.txt looks like this:

dtparam=i2c_vc=on

After reboot you should have an entry at /dev/i2c-0.

GOTCHA: you can’t probe the i2c bus that the HAT standard uses, and I couldn’t get flashing the EEPROM to work on that bus either.

Now time for our first gotcha — the version detection i2c bus is only enabled during boot and then turned off. An i2cdetect on bus zero won’t show the device post boot for this reason. This caused an initial panic attack because I thought my EEPROM was dead, but that was just my twitchy nature showing through.

You can verify your EEPROM works by enabling bus one. To do this, add these lines to /boot/config.txt:

dtparam=i2c_arm=on
dtparam=i2c_vc=on

After a reboot you should have /dev/i2c-0 and /dev/i2c-1. You also need to move the EEPROM to bus 1 in order for it to be detected:

  • Pin 5 (SDA) → Pi pin 3
  • Pin 6 (SCL) → Pi pin 5

You’ll need to move the EEPROM back before you can use it for HAT detection.
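With the EEPROM temporarily on bus 1, you can quickly confirm it responds at address 0x50 (a sketch, assuming the i2c-tools package is installed):

$ sudo i2cdetect -y 1

You should see "50" appear in the output grid.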

Programming the EEPROM

You program the EEPROM with a set of tools provided by the Raspberry Pi folks. Check those out and compile them; they’re not packaged for Raspbian as far as I can find:

pi@raspberrypi:~ $ git clone https://github.com/raspberrypi/hats
Cloning into 'hats'...
remote: Enumerating objects: 464, done.
remote: Total 464 (delta 0), reused 0 (delta 0), pack-reused 464
Receiving objects: 100% (464/464), 271.80 KiB | 119.00 KiB/s, done.
Resolving deltas: 100% (261/261), done.
pi@raspberrypi:~ $ cd hats/eepromutils/
pi@raspberrypi:~/hats/eepromutils $ ls
eepdump.c    eepmake.c            eeptypes.h  README.txt
eepflash.sh  eeprom_settings.txt  Makefile
pi@raspberrypi:~/hats/eepromutils $ make
cc eepmake.c -o eepmake -Wno-format
cc eepdump.c -o eepdump -Wno-format

The file named eeprom_settings.txt is a sample of the settings for your HAT. Fiddle with that until it makes you happy, and then compile it:

$ eepmake eeprom_settings.txt eeprom_settings.eep
Opening file eeprom_settings.txt for read
UUID=b9e3b4e9-e04f-4759-81aa-8334277204eb
Done reading
Writing out...
Done.

And then we can flash our EEPROM, remembering that I’ve only managed to get flashing to work while the EEPROM is on bus 1 (pins 3 and 5):

$ sudo sh eepflash.sh -w -f=eeprom_settings.eep -t=24c256 -d=1
This will attempt to talk to an eeprom at i2c address 0xNOT_SET on bus 1. Make sure there is an eeprom at this address.
This script comes with ABSOLUTELY no warranty. Continue only if you know what you are doing.
Do you wish to continue? (yes/no): yes
Writing...
0+1 records in
0+1 records out
107 bytes copied, 0.595252 s, 0.2 kB/s
Closing EEPROM Device.
Done.

Now move the EEPROM back to bus 0 (pins 27 and 28) and reboot. You should end up with entries in the device tree for the HAT. I get:

$ cd /proc/device-tree/hat/
$ for item in *
> do
>   echo "$item: "`cat $item`
>   echo
> done
name: hat

product: GangScan

product_id: 0x0001

product_ver: 0x0008

uuid: b9e3b4e9-e04f-4759-81aa-8334277204eb

vendor: madebymikal.com

Now I can have my code detect if the HAT is present, and if so what version. Comments welcome!
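As a rough sketch of what that runtime check can look like (paths as per the device-tree entries above; the tr calls strip the trailing NUL bytes these files contain):

if [ -d /proc/device-tree/hat ]; then
    product=$(tr -d '\0' < /proc/device-tree/hat/product)
    version=$(tr -d '\0' < /proc/device-tree/hat/product_ver)
    echo "Found HAT: ${product} version ${version}"
fi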

June 11, 2019

Codec 2 700C Equaliser

During the recent FreeDV QSO party, I was reminded of a problem with the codec used for FreeDV 700D. Some speakers are muffled and hard to understand, while others code quite nicely and are easy to listen to. I’m guessing the issue is around the Vector Quantiser (VQ) used to encode the speech spectrum. As I’ve been working on Vector Quantisation recently for the LPCNet project, I decided to have a fresh look at this problem with Codec 2.

I’ve been talking to a few people about the Codec 2 700C VQ and the idea of an equaliser. Thanks Stefan, Thomas, and Jean Marc for your thoughts.

Vector Quantiser

The FreeDV 700C and FreeDV 700D modes both use Codec 2 700C. Sorry about the confusing nomenclature – FreeDV and Codec 2 aren’t always in lock step.

Vector Quantisers are trained on speech from databases that represent a variety of speakers. These databases tend to have standardised frequency responses across all samples. I used one of these databases to train the VQ for Codec 2 700C. However when used in the real world (e.g. with FreeDV), the codec gets connected to many different microphones and sound cards with varying frequency responses.

For example, the VQ training database might be high pass filtered at 150 Hz, and start falling off at 3600 Hz. A gamer headset used for FreeDV might have low frequency energy down to 20 Hz, and have a gentle high pass slope such that energy at 3 kHz is 6dB louder than energy at 1000 Hz. Another user of FreeDV might have a completely different frequency response on their system.

The VQ naively tries to match the input spectrum. If you have input speech that is shaped differently to the VQ training data, the VQ tends to expend a lot of bits matching spectral shaping rather than concentrating on important features of speech that make it intelligible. The result is synthesised speech that has artefacts, is harder to understand and muffled.

In contrast, commercial radios have it easier, they can control the microphone and input analog signal frequency response to neatly match the codec.

Equaliser Algorithm

I wrote an Octave script (vq_700c_eq.m) to look into the problem and try a few different algorithms. It allows me to analyse speech frame by frame and in batch mode, so I can listen to the results.

The equaliser is set to the average quantiser error for each of the K=20 bands in each vector, a similar algorithm to [1]. For these initial tests, I calculated the equaliser values as the mean quantiser error over the entire sample. This is cheating a bit to get an “early result” and test the general idea. A real world EQ would need to adapt to input speech.
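In symbols (my own notation, not taken from the Octave script): if t_k(n) is band k of the target vector for frame n, and q_k(n) is the corresponding first stage VQ output, then the batch-mode equaliser value for band k over an N frame sample is

e_k = \frac{1}{N} \sum_{n=1}^{N} \big( t_k(n) - q_k(n) \big), \qquad k = 1, \dots, K

and, as I understand the script, the equalised target is then simply t_k(n) - e_k before quantisation.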

Here is a 3D mesh plot of the spectrum of the cq_ref sample evolving over time. For each 40ms frame, we have a K=20 element vector of samples we need to quantise. Note the high levels towards the low frequency end in this sample.

Averaging the first stage VQ error over the sample, we get the following equaliser values:

Note the large values at the start and end of the spectrum. The eq_hi curve is the mean error of just the high energy frames (ignoring silence frames).

Here is a snapshot of a single frame, showing the target vector, the first stage vector quantiser’s best effort, and the error.

Here is the same frame after the equaliser has been applied to the target vector:

In this case the vector quantiser has selected a different vector – it’s not “fighting” the static frequency response so much and can focus on the more perceptually important parts of the speech spectrum. It also means the 2nd stage can address perceptually important features of the vector rather than the static frequency response.

For this frame the variance (mean square error) was halved using the equaliser.

Results

The following table presents the results in terms of the variance (mean square error) in dB^2. The first column is the variance of the input data, samples with a wider spectral range will tend to be higher. The idea of the quantiser is to reduce the variance (quantiser error) as much as possible. It’s a two stage quantiser (9 bits or 512 entries) per stage.

The two right hand columns show the results (variance after 2nd stage) without and with the equaliser, using the Codec 2 700C VQ (which I have labelled train_120 in the source code). On some samples it has quite an effect (cq_ref, cq_freedv_8k), less so on others. After the equaliser, most of the samples are in the 8dB^2 range.

--------------------------------------------------------
Sample        Initial  stg1    stg1_eq     stg2  stg2_eq
--------------------------------------------------------
hts1a         120.40   17.40   16.07       9.34    8.66
hts2a         149.13   18.67   16.85       9.71    8.81
cq_ref        170.07   34.08   20.33      20.07   11.33
ve9qrp_10s     66.09   23.15   14.97      13.14    8.10
vk5qi         134.39   21.52   14.65      12.03    8.05
c01_01_8k     126.75   18.84   14.11      10.19    7.51
ma01_01        91.22   23.96   16.26      14.05    8.91
cq_freedv_8k  118.80   29.60   16.41      17.46    8.97

In the next table, we use a different vector quantiser (all_speech), derived from a different training database. This used much more training data than train_120 above. In this case, the VQ (in general) does a better job, and the equaliser has a smaller effect. Notable exceptions are the hts1a/hts2a samples, which are poorer. They seem messed up no matter what I do. The c01_01_8k/ma01_01 samples are from within the all_speech database, so predictably do quite well.

---------------------------------------------------------
Sample        Initial  stg1    stg1_eq    stg2    stg2_eq
---------------------------------------------------------
hts1a         120.40   20.75   16.63      11.36    9.05
hts2a         149.13   24.31   16.54      12.48    7.90
cq_ref        170.07   22.41   17.54      12.11    9.26
ve9qrp_10s     66.09   15.29   13.88       8.12    7.29
vk5qi         134.39   14.66   13.25       7.95    7.17
c01_01_8k     126.75   10.64   10.17       5.19    5.07
ma01_01        91.22   14.09   13.20       7.05    6.68
cq_freedv_8k  118.80   17.25   13.23       9.13    7.04

Listening to the samples:

  1. The equaliser doesn’t mess up anything. This is actually quite important. We are modifying the speech spectrum so it’s important that the equaliser doesn’t make any samples sound worse if they don’t need equalisation.
  2. In line with the tables above, the equaliser improves samples like cq_ref, cq_freedv_8k and vk5qi on Codec 2 700C. In particular a bass artefact that I sometimes hear can be removed, and (I hope) intelligibility improved.
  3. The second (all_speech) VQ improves the quality of most samples. Also in line with the variance table for this VQ, the equaliser makes a smaller improvement.
  4. hts1a and hts2a are indeed poorer with the all_speech VQ. You can’t win them all.

Here are some samples showing the various VQ and EQ options. For listening, I used a small set of loudspeakers with some bass response, as the artefacts I am interested in often affect the bass end.

Codec 2 700C + train_120 VQ Listen
Codec 2 700C + train_120 VQ + EQ Listen
Codec 2 700C + all_speech VQ Listen
Codec 2 700C + all_speech VQ + EQ Listen

Using my loudspeakers, I can hear the annoying bass artefact being removed by the EQ (first and second samples). In the next two samples (all_speech VQ), the effect of the EQ is less pronounced, as the VQ itself does a better job.

Conclusions and Discussion

The Codec 2 700C VQ can be improved, either by training a new VQ, or adding an equaliser that adjusts the input target vector. A new VQ would break compatibility with Codec 2 700C, so we would need to release a new mode, and push that through to a new FreeDV mode. The equaliser alone could be added to the current Codec 2 700C implementation, without breaking compatibility.

Variance (mean squared error) is a pretty good objective measure of the quantiser performance, and aligned with the listening tests results. Minimising variance is heading in the right direction. This is important, as listening tests are difficult and subjective in nature.

This work might be useful for LPCNet or other NN based codecs, which have a similar set of parameters that require vector quantisation. In particular if we want to use NN techniques at lower bit rates.

There is remaining mystery over the hts1a/hts2a samples. They must have a spectrum that the equaliser can’t adjust effectively and the VQ doesn’t address well. This suggests other equaliser algorithms might do a better job.

Another possibility is training the VQ to handle a wider variety of inputs by including static spectral shaping in the training data. This could be achieved by filtering the spectrally flat training database, and appending the shaped data to the VQ training material. However this would increase the variance the VQ has to deal with, and possibly lead to more bits for a given VQ performance.

Reading Further

[1] I. H. J. Nel and W. Coetzer, “An Adaptive Homomorphic Vocoder at 550 bits/second,” /IEEE South African Symposium on Communications and Signal Processing/, Johannesburg, South Africa, 1990, pp. 131-136.

Codec 2 700C

June 10, 2019

Trail run: the base of Urambi Hill

This one has been on my list for a little while — a nice 10km loop around the bottom of Urambi Hill. I did it as an out and back, although there is a loop option if you cross the bridge that was my turn around point. For the loop option cross the bridge, run a couple of hundred meters to the left and then cross the river again at the ford. Expect to get your feet wet if you choose that option!

Not particularly shady, but nice terrain. There is more vertical ascent than I expected, but it wasn’t crazy. I haven’t posted pictures of this run because it was super foggy when I did it so the pictures are just of white mist.

Securing Linux with Ansible

The Ansible Hardening role from the OpenStack project is a great way to secure Linux boxes in a reliable, repeatable and customisable manner.

It was created by a former colleague of mine, Major Hayden, and while it was spun out of OpenStack, it can be applied generally to a number of the major Linux distros (including Fedora, RHEL, CentOS, Debian, SUSE).

The role is based on the Security Technical Implementation Guide (STIG) out of the United States for RHEL, which provides recommendations on how best to secure a host and the services it runs (category one for highly sensitive systems, two for medium and three for low). This is similar to the Information Security Manual (ISM) we have in Australia, although the STIG is more explicit.

Rules and customisation

There is deviation from the STIG recommendations and it is probably a good idea to read the documentation about what is offered and how it’s implemented. To avoid unwanted breakages, many of the controls are opt-in with variables to enable and disable particular features (see defaults/main.yml).

You probably do not want to blindly enable everything without understanding the consequences. For example, Kerberos support in SSH will be disabled by default (via “security_sshd_disable_kerberos_auth: yes” variable) as per V-72261, so this might break access if you rely on it.

Other features also require values to be enabled. For example, V-71925 of the STIG recommends passwords for new users be restricted to a minimum lifetime of 24 hours. This is not enabled by default in the Hardening role (central systems like LDAP are recommended instead), but can be enabled by setting the following variable for any hosts you want it set on.

security_password_min_lifetime_days: 1

In addition, not all controls are available for all distributions.

For example, V-71995 of the STIG requires umask to be set to 077, however the role does not currently implement this for RHEL based distros.

Run a playbook

To use this role you need to get the code itself, using either Ansible Galaxy or Git directly. Ansible will look in the ~/.ansible/roles/ location by default and find the role, so that makes a convenient spot to clone the repo to.

mkdir -p ~/.ansible/roles
git clone https://github.com/openstack/ansible-hardening \
~/.ansible/roles/ansible-hardening
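Alternatively, Ansible Galaxy can install a role straight from a git URL — the exact syntax varies a little between Ansible versions, so treat this as a sketch:

ansible-galaxy install git+https://github.com/openstack/ansible-hardening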

Next, create an Ansible play which will make use of the role. This is where we will set variables to enable or disable specific controls for hosts which are run using the play. For example, if you’re using a graphical desktop, then you will want to make sure X.Org is not removed (see below). Include any other variables you want to set from the defaults/main.yml file.

cat > play.yml << EOF
---
- name: Harden all systems
  hosts: all
  become: yes
  vars:
    security_rhel7_remove_xorg: no
    security_ntp_servers:
      - ntp.internode.on.net
  roles:
    - ansible-hardening
EOF

Now we can run our play! Ansible uses an inventory of hosts, but we’ll just run this against localhost directly (with the options -i localhost, -c local). It’s probably a good idea to run it with the --check option first, which will not actually make any changes.

If you’re running in Fedora, make sure you also set Python3 as the interpreter.

ansible-playbook -i localhost, -c local \
-e ansible_python_interpreter=/usr/bin/python3 \
--ask-become-pass \
--check \
./play.yml

This will run through the role, executing all of the default tasks while including or excluding others based on the variables in your play.

Running specific sets of controls

If you only want to run a limited set of controls, you can do so by running the play with the relevant --tags option. You can also exclude specific tasks with the --skip-tags option. Note that there are a number of required tasks with the always tag which will be run regardless.

To see all the available tags, run your playbook with the --list-tags option.

ansible-playbook --list-tags ./play.yml

For example, if you want to only run the dozen or so Category III controls you can do so with the low tag (don’t forget that some tasks may still need enabling if you want to run them, and that the always tagged tasks will still be run). Combine tags by comma separating them; so to also run a specific control like V-72057, or controls related to SSH, just add them alongside low.

ansible-playbook -i localhost, -c local \
-e ansible_python_interpreter=/usr/bin/python3 \
--ask-become-pass \
--check \
--tags low,sshd,V-72057 \
./play.yml

Or if you prefer, you can just run everything except a specific set. For example, to exclude Category I controls, skip the high tag. You can also add both options.

ansible-playbook -i localhost, -c local \
-e ansible_python_interpreter=/usr/bin/python3 \
--ask-become-pass \
--check \
--tags sshd,V-72057 \
--skip-tags high \
./play.yml

Once you’re happy, don’t forget to remove the --check option to apply the changes.

June 05, 2019

What is Gang Scan?

Gang Scan is an open source (and free) attendance tracking system based on custom RFID reader boards that communicate back to a server over wifi. The boards are capable of queueing scan events in the case of intermittent network connectivity, and the server provides simple reporting.

June 02, 2019

Audiobooks – May 2019

Springfield Confidential: Jokes, Secrets, and Outright Lies from a Lifetime Writing for The Simpsons by Mike Reiss

Great book. Simpsons insider stories, stuff about show business, funny jokes. 9/10

Combat Crew: The Story of 25 Combat Missions Over Europe From the Daily Journal of a B-17 Gunner by John Comer

Interesting 1st-hand account (with some borrowings from others in unit). Good details and atmosphere from missions and back at base/leave 8/10

Far-Seer by Robert J. Sawyer

“An allegory about Galileo on a planet of intelligent dinosaurs”. 1st in a Trilogy by one of my favorite authors. Balanced between similarities & differences from humans. 7/10

Working Actor: Breaking in, Making a Living, and Making a Life in the Fabulous Trenches of Show Business by David Dean Bottrell

Lots of advice for aspiring actors along with plenty of interesting stories from the author’s career. 8/10

Becoming by Michelle Obama

A good memoir. Lots of coverage of her early life, working career and the White House. Not exhaustive and it skips ahead at time. But very interesting and inspirational. 8/10

Fossil Hunter by Robert J. Sawyer

2nd in the Trilogy. The main human analog here is Darwin with a murder-mystery and God checked in for fun. 7/10

The Wright Brothers by David McCullough

Well written as expected and concentrates on the period when the brothers were actively flying which is the most interesting but avoids their legal battles & some other negatives. 8/10


May 31, 2019

LUV June 2019 Main Meeting: Unlocking insights from Big Data / An Introduction to Packaging

Jun 4 2019 19:00
Jun 4 2019 21:00
Location: 
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

7:00 PM to 9:00 PM Tuesday, June 4, 2019
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

Speakers:

  • Matt Moore: Unlocking insights from Big Data
  • Andrew Worsely: An Introduction to Packaging

 

Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.

LUV June 2019 Workshop: Computer philosopher Ted Nelson

Jun 15 2019 12:30
Jun 15 2019 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

Speaker: Andrew Pam will celebrate the birthday of computer philosopher Ted Nelson with a summary of his work.

There will also be the usual casual hands-on workshop, Linux installation, configuration and assistance and advice. Bring your laptop if you need help with a particular issue. This will now occur BEFORE the talks from 12:30 to 14:00. The talks will commence at 14:00 (2pm) so there is time for people to have lunch nearby.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121.  Late arrivals please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria is a subcommittee of Linux Australia.

May 29, 2019

dm-crypt: Password Prompts Proliferate

Just recently Petitboot added a method to ask the user for a password before allowing certain actions to proceed. Underneath the covers this is checking against the root password, but the UI "pop-up" asking for the password is relatively generic. Something else which has been on the to-do list for a while is support for mounting encrypted partitions, but there wasn't a good way to retrieve the password for them - until now!

With the password problem solved, there isn't too much else to do. If Petitboot sees an encrypted partition it makes a note of it and informs the UI via the device_add interface. Seeing this, the UI shows the device even though there aren't any boot options associated with it yet:

encrypted_hdr

Unlike normal devices in the menu these are selectable; once that happens the user is prompted for the password:

encrypted_password

With password in hand pb-discover will then try to open the device with cryptsetup. If that succeeds the encrypted device is removed from the UI and replaced with the new un-encrypted device:

unencrypted_hdr

That's it! These devices can't be auto-booted from at the moment since the password needs to be manually entered. The UI also doesn't have a way yet to select specific options for cryptsetup, but if you find yourself needing to do so you can run cryptsetup manually from the shell and pb-discover will recognise the new unencrypted device automatically.
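Running it by hand looks something like this (the device path and mapping name here are placeholders for whatever your system uses):

# cryptsetup luksOpen /dev/sda2 luks-data

Once that succeeds, pb-discover should notice the newly mapped device and scan it for boot options as usual.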

This is in Petitboot as of v1.10.3 so go check it out! Just make sure your image includes the kernel and cryptsetup dependencies.

May 27, 2019

1-Wire home automation tutorial from linux.conf.au 2019, part 3

This is the third in a set of posts about the home automation tutorial from linux.conf.au 2019. You should probably read part 1 and part 2 before this post.

In the end Alistair decided that my home automation shield was defective, which was the cause of the errors in the previous post. So I am instead running with the prototype shield that he handed me when I started helping with the tutorial preparation. That shield has some other bugs (misalignments of holes mainly), but is functional apart from that.

I have also decided that I’m not super excited by hassos, and just want to run the orangepi with the OWFS to MQTT gateway into my existing home assistant setup if possible, so I am going to focus on getting that bare component working for now.

To that end, the gateway can be found at https://github.com/InfernoEmbedded/OWFS-MQTT-Bridge, and is a perl script named ha-daemon.pl. I needed to install some dependencies, which in my case were for armbian:

$ apt-get install perl libanyevent-perl cpanminus libdist-zilla-perl libfile-slurp-perl libdatetime-format-strptime-perl
$ dzil listdeps | cpanm --sudo

Then I needed to write a configuration file and put it at ha.toml in the same directory as the daemon. Mine looks like this:

[general]
	timezone="Australia/Sydney"
	discovery_prefix="homeassistant"

[1wire]
	host="localhost"
	port=4304
	timeout=5 # seconds, will reconnect after this if no response
	sensor_period=30 # seconds
	switch_period=10 # seconds
	debug=true

[mqtt]
	host="192.168.1.6"
	port=1883

Now run the gateway like this:

$ perl ha-daemon.pl

I see messages on MQTT that a temperature sensor is being published to home assistant:

homeassistant/sensor/1067C6697351FF_temperature/config {
	"name": "10.67C6697351FF_temperature",
	"current_temperature_topic": "temperature/10.67C6697351FF/state",
	"unit_of_measurement": "°C"
}

However, I do not see temperature readings being published. Having added some debug code to OWFS-MQTT, this appears to be because no temperature is being returned from the read operation:

2019-05-27 17:28:14.833: lib/Daemon/OneWire.pm:73:Daemon::OneWire::readTemperatureDevices(): Reading temperature for device '10.67C6697351FF'
[...snip...]
2019-05-27 17:28:14.867: /usr/local/share/perl/5.24.1/AnyEvent/OWNet.pm:117:Daemon::OneWire::__ANON__(): Read data: $VAR1 = bless( {
                 'payload' => 0,
                 'size' => 0,
                 'version' => 0,
                 'offset' => 0,
                 'ret' => 4294967295,
                 'sg' => 270
               }, 'AnyEvent::OWNet::Response' );
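One way to narrow down whether the problem is in OWFS itself or in the bridge is to query owserver directly. A sketch, assuming the ow-shell tools are installed and owserver is listening on its default port:

$ owdir -s localhost:4304
$ owread -s localhost:4304 /10.67C6697351FF/temperature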

I continue to debug.

May 26, 2019

Mount Bimberi on a Scout Bushwalking Course

Julian Yates kindly ran a bushwalking course for Scouts Australia over the last five days, which covered walking in Uncontrolled Terrain (the definition in the Australian VET scheme for the most difficult bushwalking — significant off track navigation in areas where emergency response will be hard to get). I helped with some of the instruction, but was also there working on my own bushwalking qualifications.

The walk was to Mount Bimberi, which is the highest point in the ACT. We started with a short night walk into Oldfield’s Hut on Friday night after a day of classroom work. The advantage of this was that we started Saturday at Oldfield’s Hut, which offered morning views which did not suck.

On Saturday morning we walked up to Mount Bimberi via Murray’s Gap. This involved following the ACT / NSW border up the hillside, which was reasonably well marked with tape and cairns.

Our route on the way to Bimberi:

And the way back:

On Sunday we walked back out to the cars and did the three hour drive back to Canberra. I’ll include the walk out here for completeness:

May 23, 2019

Installing Ubuntu 18.04 using both full-disk encryption and RAID1

I recently setup a desktop computer with two SSDs using a software RAID1 and full-disk encryption (i.e. LUKS). Since this is not a supported configuration in Ubuntu desktop, I had to use the server installation medium.

This is my version of these excellent instructions.

Server installer

Start by downloading the alternate server installer and verifying its signature:

  1. Download the required files:

     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/ubuntu-18.04.2-server-amd64.iso
     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/SHA256SUMS
     wget http://cdimage.ubuntu.com/ubuntu/releases/bionic/release/SHA256SUMS.gpg
    
  2. Verify the signature on the hash file:

     $ gpg --keyid-format long --keyserver hkps://keyserver.ubuntu.com --recv-keys 0xD94AA3F0EFE21092
     $ gpg --verify SHA256SUMS.gpg SHA256SUMS
     gpg: Signature made Fri Feb 15 08:32:38 2019 PST
     gpg:                using RSA key D94AA3F0EFE21092
     gpg: Good signature from "Ubuntu CD Image Automatic Signing Key (2012) <cdimage@ubuntu.com>" [undefined]
     gpg: WARNING: This key is not certified with a trusted signature!
     gpg:          There is no indication that the signature belongs to the owner.
     Primary key fingerprint: 8439 38DF 228D 22F7 B374  2BC0 D94A A3F0 EFE2 1092
    
  3. Verify the hash of the ISO file:

     $ sha256sum --ignore-missing -c SHA256SUMS
     ubuntu-18.04.2-server-amd64.iso: OK
    

Then copy it to a USB drive:

dd if=ubuntu-18.04.2-server-amd64.iso of=/dev/sdX

and boot with it.

Manual partitioning

Inside the installer, use manual partitioning to:

  1. Configure the physical partitions first.
  2. Configure the RAID arrays second.
  3. Configure the encrypted partitions last.

Here's the exact configuration I used:

  • /dev/sda1 is 512 MB and used as the EFI partition
  • /dev/sdb1 is 512 MB but not used for anything
  • /dev/sda2 and /dev/sdb2 are both 4 GB (RAID)
  • /dev/sda3 and /dev/sdb3 are both 512 MB (RAID)
  • /dev/sda4 and /dev/sdb4 use up the rest of the disk (RAID)

I only set /dev/sda1 as the EFI partition because I found that adding a second EFI partition would break the installer.

I created the following RAID1 arrays:

  • /dev/sda2 and /dev/sdb2 for /dev/md2
  • /dev/sda3 and /dev/sdb3 for /dev/md0
  • /dev/sda4 and /dev/sdb4 for /dev/md1

I used /dev/md0 as my unencrypted /boot partition.

Then I created the following LUKS partitions:

  • md1_crypt as the / partition using /dev/md1
  • md2_crypt as the swap partition (4 GB) with a random encryption key using /dev/md2
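For reference, this is roughly what the installer ends up doing under the hood. It's only a sketch (the device names match the layout above, and the crypttab line for the random-key swap is an assumption about naming), not something you need to run by hand:

# Approximate command-line equivalent of the RAID1 arrays created in the installer
mdadm --create /dev/md2 --level=1 --raid-devices=2 /dev/sda2 /dev/sdb2
mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda3 /dev/sdb3
mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sda4 /dev/sdb4

# /dev/md0 stays unencrypted for /boot; / goes on LUKS over /dev/md1
cryptsetup luksFormat /dev/md1
cryptsetup open /dev/md1 md1_crypt

# The swap uses a throw-away random key via an /etc/crypttab entry along these lines:
# md2_crypt /dev/md2 /dev/urandom swap,cipher=aes-xts-plain64,size=256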

Post-installation configuration

Once your new system is up, sync the EFI partitions using dd:

dd if=/dev/sda1 of=/dev/sdb1

and create a second EFI boot entry:

efibootmgr -c -d /dev/sdb -p 1 -L "ubuntu2" -l '\EFI\ubuntu\shimx64.efi'

Ensure that the RAID drives are fully synced by keeping an eye on /proc/mdstat and then reboot, selecting "ubuntu2" in the UEFI/BIOS menu.
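To watch the resync and confirm the second boot entry exists before rebooting, something like the following works (a sketch; the array and entry names are the ones used above):

# Watch the arrays until the recovery/resync lines disappear
watch -d cat /proc/mdstat
mdadm --detail /dev/md1

# Confirm that both the "ubuntu" and "ubuntu2" EFI entries are present
efibootmgr -v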

Once you have rebooted, remove the following package to speed up future boots:

apt purge btrfs-progs

To switch to the desktop variant of Ubuntu, install these meta-packages:

apt install ubuntu-desktop gnome

then use debfoster to remove unnecessary packages (in particular the ones that only come with the default Ubuntu server installation).

Fixing booting with degraded RAID arrays

Since I have run into RAID startup problems in the past, I expected having to fix up a few things to make degraded RAID arrays boot correctly.

I did not use LVM since I didn't really feel the need to add yet another layer of abstraction on top of my setup, but I found that the lvm2 package must still be installed:

apt install lvm2

with use_lvmetad = 0 in /etc/lvm/lvm.conf.

Then in order to automatically bring up the RAID arrays with 1 out of 2 drives, I added the following script in /etc/initramfs-tools/scripts/local-top/cryptraid:

 #!/bin/sh
 PREREQ="mdadm"
 prereqs()
 {
      echo "$PREREQ"
 }
 case $1 in
 prereqs)
      prereqs
      exit 0
      ;;
 esac

 mdadm --run /dev/md0
 mdadm --run /dev/md1
 mdadm --run /dev/md2

before making that script executable:

chmod +x /etc/initramfs-tools/scripts/local-top/cryptraid

and refreshing the initramfs:

update-initramfs -u -k all

Disable suspend-to-disk

Since I use a random encryption key for the swap partition (to avoid having a second password prompt at boot time), it means that suspend-to-disk is not going to work and so I disabled it by putting the following in /etc/initramfs-tools/conf.d/resume:

RESUME=none

and by adding noresume to the GRUB_CMDLINE_LINUX variable in /etc/default/grub before applying these changes:

update-grub
update-initramfs -u -k all

Test your configuration

With all of this in place, you should be able to do a final test of your setup:

  1. Shutdown the computer and unplug the second drive.
  2. Boot with only the first drive.
  3. Shutdown the computer and plug the second drive back in.
  4. Boot with both drives and re-add the second drive to the RAID array:

     mdadm /dev/md0 -a /dev/sdb3
     mdadm /dev/md1 -a /dev/sdb4
     mdadm /dev/md2 -a /dev/sdb2
    
  5. Wait until the RAID is done re-syncing and shutdown the computer.

  6. Repeat steps 2-5 with the first drive unplugged instead of the second.
  7. Reboot with both drives plugged in.

At this point, you have a working setup that will gracefully degrade to a one-drive RAID array should one of your drives fail.

May 21, 2019

A nerd snipe, in which I learn to read gerber files


So, I had the realisation last night that the biggest cost with getting a PCB made in China is the shipping. The boards are about 50 cents each, and then it’s $25 for shipping (US dollars of course). I should therefore be packing as many boards into a single order as possible to reduce the shipping cost per board.

I have a couple of boards on the trot at the moment: my RFID attendance tracker project (called GangScan), and I’ve just decided to actually get my numitrons working and whipped up a quick breakout board for those. You’ll see more about that one later I’m sure.

I decided to ask my friends in Canberra if they needed any boards made, and one friend presented me with a set of Gerber CAM files and nothing else. That’s a pain because I need to know the dimensions of the board for the quoting system. Of course, I couldn’t find a tool to extract that for me with a couple of minutes of Googling, so… I decided to just learn to read the file format.

Gerber is well specified, with a quite nice specification available online. So it wasn’t too hard to dig out the dimensions layer from the zipped gerber files and then do this:

Contents of file | Meaning | Dimensional impact
G04 DipTrace 3.3.1.2* | Comment
G04 BoardOutline.gbr* | Comment
%MOIN*% | File is in inch units
G04 #@! TF.FileFunction,Profile* | Comment
G04 #@! TF.Part,Single* | Comment
%ADD11C,0.005512*% | Defines an aperture. D11 is a circle with diameter 0.005512 inches
%FSLAX26Y26*% | Resolution is 2.6, i.e. there are 2 integer places and 6 decimal places
G04* | Comment
G70* | Historic way of setting units to inches
G90* | Historic way of setting coordinates to absolute notation
G75* | Sets quadrant mode graphics state parameter to ‘multi quadrant’
G01* | Sets interpolation mode graphics state parameter to ‘linear interpolation’
G04 BoardOutline* | Comment
%LPD*% | Sets the object polarity to dark
X394016Y394016D2* | Set current point to 0.394016, 0.394016 (in inches) | Top left is 0.394016, 0.394016 inches
D11* | Draw the previously defined tiny circle
Y1194016D1* | Draw a vertical line to 1.194016 inches | Board is 1.194016 inches tall
X1931366Y1194358D1* | Draw a line to 1.931366, 1.194358 inches | Board is 1.931366 inches wide (and not totally square)
Y394358D1* | Draw a vertical line to 0.394358 inches
X394016Y394016D1* | Draw a line to 0.394016, 0.394016 inches
M02* | End of file

So this board is effectively 3cm by 5cm.
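If you find yourself doing this more than once, the same extraction can be scripted. This is just a sketch that assumes an inch-unit file in the 2.6 coordinate format used here; it tracks the modal X and Y values and reports the largest coordinates seen in the outline:

awk -F'*' '
/^[XY]/ {
    # X and Y are modal: keep the previous value if a line only updates one axis
    if (match($1, /X-?[0-9]+/)) x = substr($1, RSTART + 1, RLENGTH - 1) + 0
    if (match($1, /Y-?[0-9]+/)) y = substr($1, RSTART + 1, RLENGTH - 1) + 0
    if (x > maxx) maxx = x
    if (y > maxy) maxy = y
}
END { printf "outline extends to %.6f x %.6f inches\n", maxx / 1e6, maxy / 1e6 }
' BoardOutline.gbr

For the file above that prints 1.931366 x 1.194358 inches, matching the table.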

A nice little nerd snipe to get the morning going.


May 20, 2019

Linux Security Summit 2019 North America: CFP / OSS Early Bird Registration

The LSS North America 2019 CFP is currently open, and you have until May 31st to submit your proposal. (That’s the end of next week!)

If you’re planning on attending LSS NA in San Diego, note that the Early Bird registration for Open Source Summit (which we’re co-located with) ends today.

You can of course just register for LSS on its own, here.

Gangscan 0.6 boards


So I’ve been pottering away for a while working on getting the next version of the gang scan boards working. These ones are much nicer: thicker tracks for signals, better labelling, support for a lipo battery charge circuit, a prototype audio circuit, and some LEDs to indicate status. I had them fabbed at the same place as last time, although the service was much faster this time around.

A gang scan 0.6 board

I haven’t got as far as assembling a board yet — I need to get some wire thin enough for the vias before I can do that. I’ll let you know how I go though.


May 19, 2019

Trigs map


A while ago I had a map of all the trig points in the ACT and links to the posts I’d written during my visits. That had atrophied over time. I’ve just spent some time fixing it up again, and it’s now at https://www.madebymikal.com/trigs_map.html — I hope it’s useful to someone else.


May 18, 2019

Trail run: Lake Tuggeranong to Kambah Pool (return)


This wasn’t the run I’d planned for this day, but here we are. This runs along the Centenary Trail between Kambah Pool and Lake Tuggeranong. Partially shaded, but also on the quiet side of the ridge line where you can’t tell that you’re near the city. Don’t take the tempting river ford; there is a bridge a little further downstream! 14.11km and 296m of vertical ascent.

Be careful of mountain bikers on this popular piece of single track. You’re allowed to run here, but some cyclists don’t leave much time to notice other track users.


Trail run: Tuggeranong Stone Wall loop


The Tuggeranong Stone Wall is a 140-year-old boundary between two former stations. It’s also a nice downhill start to a trail run. This loop involves starting at the Hyperdome, following the wall down, and then continuing along to Pine Island before returning. Partially shaded, and with facilities at the Hyperdome and Pine Island. 6km, and 68m of vertical ascent.


Trail run: Barnes and ridgeline


A first attempt at running to Barnes and Brett trigs, this didn’t work out quite as well as I’d expected (I ran out of time before I’d hit Brett trig). The area wasn’t as steep as I’d expected, being mostly rolling grazing land with fire trails. Lots of gates and no facilities, but stunning views of southern Canberra from the ridgeline. 11.11km and 421m of vertical ascent.


Trail run: Pine Island South to Point Hut with a Hill


This one is probably a little bit less useful to others, as the loop includes a bit more of the suburb than is normal. That said, you could turn this into a suburb avoiding loop quite easily. A nice 11.88km run with a hill climb at the end. A total ascent of 119 metres. There isn’t much shade along the run, but there is some in patches. There are bathrooms at Point Hut and Pine Island.

Be careful of mountain bikers on this popular piece of single track. You’re allowed to run here, but some cyclists don’t leave much time to notice other track users.


Trail run: Cooleman Ridge


This run includes Cooleman and Arawang trig points. Not a lot of shade, but a pleasant run. 9.86km and 264m of vertical ascent.


Trail running guide: Tuggeranong


I’ve been running on trails more recently (I’m super bored with roads and bike paths), but running on trails makes load management harder — often I’m looking for a run of approximately XX length with no more than YY vertical ascent. So I was thinking, maybe I should just write the runs that I do down so that over time I create a menu of options for when I need them.

This page documents my Tuggeranong runs.

Name | Distance (km) | Vertical Ascent (m) | Notes | Posts
Cooleman Ridge | 9.78 | 264 | Cooleman and Arawang Trigs. Not a lot of shade and no facilities. | 25 April 2019
Pine Island South to Point Hut with a Hill | 11.88 | 119 | A nice Point Hut and Pine Island loop with a hill climb at the end. Toilets at Point Hut and Pine Island. Not a lot of shade. Beware of mountain bikes! | 21 February 2019
Barnes and ridgeline | 11.11 | 421 | Not a lot of shade and no facilities, but stunning views of southern Canberra. | 2 May 2019
Lake Tuggeranong to Kambah Pool (return) | 14.11 | 296 | Partial shade and great views, but beware the mountain bikes! | 11 May 2019
Tuggeranong Stone Wall loop | 6 | 68 | Partial shade and facilities at the Hyperdome and Pine Island. | 27 April 2019


May 09, 2019

Audiobooks – April 2019

Enlightenment Now: The Case for Reason, Science, Humanism, and Progress by Steven Pinker

Amazingly good book, well argued and with lots of information. The only downside is that he talks to some diagrams [downloadable] at times. Highly recommended. 9/10

A History of Britain, Volume 3: The Fate of Empire 1776–2000 by Simon Schama

I didn’t enjoy this all that much. The author tried to use various lives to illustrate themes, but both the themes and the biographies suffered. Huge areas were also left out. 6/10

Where Did You Get This Number?: A Pollster’s Guide to Making Sense of the World by Anthony Salvanto

An overview of (mostly) political polling and its history. Lots of examples from the 2016 US election campaign. Light but interesting. 7/10

Squid Empire: The Rise and Fall of the Cephalopods by Danna Staaf

Pretty much what the title says. I got a little lost with all the similarly named species, but the general story was interesting enough and not too long. 6/10

Apollo in the Age of Aquarius by Neil M. Maher

The story of the back and forth between NASA and the 60s counterculture from the civil rights struggle and the antiwar movement to environmentalism and feminism. Does fairly well. 7/10



May 06, 2019

Visual Studio Code for Linux kernel development

Here we are again - back in 2016 I wrote an article on using Atom for kernel development, but I didn't stay using it for too long, instead moving back to Emacs. Atom had too many shortcomings - it had that distinctive Electron feel, which is a problem for a text editor - you need it to be snappy. On top of that, vim support was mediocre at best, and even as a vim scrub I would find myself trying to do things that weren't implemented.

So in the meantime I switched to spacemacs, which is a very well integrated "vim in Emacs" experience, with a lot of opinionated (but good) defaults. spacemacs was pretty good to me but had some issues - disturbingly long startup times, mediocre completions and go-to-definitions, and integrating any module into spacemacs that wasn't already integrated was a big pain.

After that I switched to Doom Emacs, which is like spacemacs but faster and closer to Emacs itself. It's very user configurable but much less user friendly, and I didn't really change much as my elisp-fu is practically non-existent. I was decently happy with this, but there were still some issues, some of which are just inherent to Emacs itself - like no actually usable inbuilt terminal, despite having (at least) four of them.

Anyway, since 2016 when I used Atom, Visual Studio Code (henceforth referred to as Code) came along and ate its lunch, using the framework (Electron) that was created for Atom. I did try it years ago, but I was very turned off by its Microsoft-ness, its seeming lack of distinguishing features from Atom, and the fact that it didn't feel like a native editor at all. Since it has massively grown in popularity since then, I decided I'd give it a try.

Visual Studio Code

Vim emulation

First things first for me is getting a vim mode going, and Code has a pretty good one of those. The key feature for me is that there's Neovim integration for Ex-commands, filling a lot of shortcomings that come with most attempts at vim emulation. In any case, everything I've tried to do that I'd do in vim (or Emacs) has worked, and there are a ton of options and things to tinker with. Obviously it's not going to do as much as you could do with Vimscript, but it's definitely not bad.

Theming and UI customisation

As far as the editor goes - it's good. A ton of different themes, and you can change the colour of pretty much everything in the config file or in the UI, including icons for the sidebar. There's a huge sore point though: you can't customise the interface outside the editor pretty much at all. There's an extension for loading custom CSS, but it's out of the way, finicky, and if I wanted to write CSS I wouldn't have become a kernel developer.

Extensibility

Extensibility is definitely a strong point: the ecosystem of extensions is good. All the language extensions I've tried have been very fully featured, with a ton of different options and integration into language-specific linters and build tools. This is probably Code's strongest feature - the breadth of the extension ecosystem and the level of quality found within.

Kernel development

Okay, let's get into the main thing that matters - how well does the thing actually edit code. The kernel is tricky. It's huge, it has its own build system, and in my case I build it with cross compilers for another architecture. Also, y'know, it's all in C and built with make, not exactly great for any kind of IDE integration.

The first thing I did was check out the vscode-linux-kernel project by GitHub user "amezin", which is a great starting point. All you have to do is clone the repo, build your kernel (with a cross compiler works fine too), and run the Python script to generate the compile_commands.json file. Once you've done this, go-to-definition (gd in vim mode) works pretty well. It's not flawless, but it does go cross-file, and will pop up a nice UI if it can't figure out which file you're after.

Code has good built-in git support, so actions like staging files for a commit can be done from within the editor. Ctrl-P lets you quickly navigate to any file with fuzzy-matching (which is impressively fast for a project of this size), and Ctrl-Shift-P will let you search commands, which I've been using for some git stuff.

git command completion in Code

There are some rough edges, though. Code is set on what so many modern editors are set on, which is the "one window per project" concept - so to get things working the way you want, you would open your kernel source as the current project. This makes it a pain to just open something else to edit, like some script, or to check the value of something in firmware, or to chuck something in your bashrc.

Auto-triggering builds on change isn't something that makes a ton of sense for the kernel, and it's not present here. The kernel support in the repo above is decent, but it's not going to get you close to what more modern languages can get you in an editor like this.

Oh, and it has a powerpc assembly extension, but I didn't find it anywhere near as good as the one I "wrote" for Atom (I just took the x86 one and switched the instructions), so I'd rather use the C mode.

Terminal

Code has an actually good inbuilt terminal that uses your login shell. You can bring it up with Ctrl-`. The biggest gripe I have always had with Emacs is that you can never have a shell that you can actually do anything in: whether it's eshell or shell or term or ansi-term, you try to do something in it and it doesn't work or clashes with some Emacs command, and then when you try to do something Emacs-y in there it doesn't work either. No such issue is present here, and it's a pleasure to use for things like triggering a remote build or doing some git operation you don't want to do with commands in the editor itself.

Not the most important feature, but I do like not having to alt-tab out and lose focus.

Well...is it good?

Yeah, it is. It has shortcomings, but installing Code and using the repo above to get started is probably the simplest way to get a competent kernel development environment going, with more features than most kernel developers (probably) have in their editors. Code is open source and so are its extensions, and it'd be the first thing I recommend to new developers who aren't already super invested into vim or Emacs, and it's worth a try if you have gripes with your current environment.

May 05, 2019

Ignition!


Whilst the chemistry was sometimes over my head, this book is an engaging summary of the history of US liquid rocket fuels during the height of the cold war. Fun to read and interesting as well. I enjoyed it.

Ignition! — John Drury Clark, Technology & Engineering, 1972, 214 pages


May 04, 2019

Codec2 and FreeDV Update

Quite a lot of Codec2/FreeDV development going on this year, so much that I have been neglecting the blog! Here is an update…..

Github, Travis, and STM32

Early in 2019, the number of active developers had grown to the point where we needed more sophisticated source control, so in March we moved the Codec 2 project to GitHub. One feature I’m enjoying is the collaboration and messaging between developers.

Danilo (DB4PLE) immediately had us up and running with Travis, a tool that automatically builds our software every time it is pushed. This has been very useful in spotting build issues quickly, and reducing the amount of “human in the loop” manual testing.

Don (W7DMR), Danilo, and Richard (KF5OIM) have been doing some fantastic work on the cmake build and test system for the stm32 port of 700D. A major challenge has been building the same code on desktop platforms without breaking the embedded stm32 version, which has tight memory constraints.

We now have a very professional build and test system, and can run sophisticated unit tests from anywhere in the world on remote stm32 development hardware. A single “cmake test all” command can build and run a suite of automated tests on the x86 and stm32 platforms.

The fine stm32 work by Don will soon lead to new firmware for the SM1000, and FreeDV 700D is already running on radios that support the UHSDR firmware.

FreeDV in the UK

Mike (G4ABP) contacted me with some fine analysis of the FreeDV modems on the UK NVIS channel. Mike is part of a daily UK FreeDV net, which was experiencing some problems with loss of sync on FreeDV 700C. Together we have found (and fixed) bugs with FreeDV 700C and 700D.

The UK channel is interesting: high SNR (>10dB), but at times high Doppler spread (>3Hz), which the earlier FreeDV 700C modem may deal with better due to its high sampling rate of the channel phase. In contrast, FreeDV 700D has been designed for moderate Doppler (1Hz), but heavily optimised for low SNR operation. More investigation is required here, with off air samples, to run any potential issues to ground.

I would like to hear from you if you have problems getting FreeDV 700D to work with strong signals! This could be a sign of fast fading “breaking” the modem. By working together, we can improve FreeDV.

FreeDV in Argentina

Jose, LU5DKI, is part of an active FreeDV group in Argentina. They have a Facebook page for their Radio Club Coronel Pringles LU1DIL that describes their activities. They are very happy with the low SNR and interference rejecting capabilities of FreeDV 700D:

Regarding noise FREEDV IS IMMUNE TO NOISE, coincidentally our CLUB is installed in a TELEVISION MONITORING CENTER, where the QRN by the monitors and computers is very intense, it is impossible to listen to a single SSB station, BUT FREEDV LISTENS PERFECTLY IN 40 METERS

Roadmap for 2019

This year I would like to get FreeDV 2020 released, and FreeDV 700D running on the SM1000. A bonus would be some improvement in the speech quality for the lower rate modes.

Reading Further

FreeDV 2020 First On Air Tests
Porting a LDPC Decoder to a STM32 Microcontroller
Universal Ham Software Defined Radio Github page

FreeDV 2020 First On Air Tests

Brad (AC0ZJ), Richard (KF5OIM) and I have been putting together the pieces required for the new FreeDV 2020 mode, which uses the LPCNet neural net speech synthesis technology developed by Jean-Marc Valin. The goal of this mode is 8kHz audio bandwidth in just 1600 Hz of RF bandwidth. FreeDV 2020 is designed for HF channels where SSB is an “armchair copy” – SNRs of better than 10dB and slow fading.

FreeDV 2020 uses the fine OFDM modem ported to C by Steve (K5OKC) for the FreeDV 700D mode. Steve and I have modified this modem so it can operate at the higher bit rate required for FreeDV 2020. In fact, the modem can now be configured on the command line for any bandwidth and bit rate that you like, and even adapt the wonderful LDPC FEC codes developed by Bill (VK5DSP) to suit.

Brad is working on the integration of the FreeDV 2020 mode into the FreeDV GUI program. It’s going well, and he has made 1200 mile transmissions across the US to a SDR using the Linux version. Brad has also done some work on making FreeDV GUI deal with USB sound cards that come and go in different order.

Mark, VK5QI has just made a 3200km FreeDV transmission from Adelaide, South Australia to a KiwiSDR in the Bay of Islands, New Zealand. He decoded it with the partially working OSX version (we do most of our development on Ubuntu Linux).

I’m surprised as I didn’t think it would work so well over such long paths! There’s a bit of acoustic echo from Mark’s shack but you can get an idea of the speech quality compared to SSB. Thanks Mark!

For the adventurous, the freedv-gui source code 2020 development branch is here. We are currently performing on air tests with the Linux version, and Brad is working on the Windows build.

Reading Further

Steve Ports an OFDM modem from Octave to C
Bill’s (VK5DSP) Low SNR Blog

May 02, 2019

Restricted Sleep Regime

Since moving down to Melbourne my poor sleep has started up again. It’s really hard to say what the main factor driving this is. My doctor down here has put me onto a drug-free way of trying to improve my sleep, and I think I kind of like it. While it’s no silver bullet, it is something I can go back to if I’m having trouble with my sleep, without having to get a prescription.

The basic idea is to maximise sleep efficiency. If you’re only getting n hours of sleep a night, only spend n hours a night in bed. This forces you to stay up and go to bed rather late for a few nights. Hopefully, being tired will help you sleep through the night in one large segment. Once you’ve successfully slept through the night a few times, relax your bed time by, say, fifteen minutes, and get used to that. Slowly over time, you increase the amount of sleep you’re getting, while keeping your efficiency high.

 
Person T has had Person A design a one-page flyer and sent it to Person J... as a single image. Person T is two hours ahead, time-zone wise, and Person A is roughly 12 hours behind.

Person J also wishes to email out the flyer with hyperlinks on each of two names in the image.

Sent as a bare image, she will not fly.

Embedding the image in a PDF would allow only the entire image to possess a single hyperlink.

So... crank up GIMP, open image, select the Move tool, drag Guides from each Ruler to section up the image. Each Guide changes nothing, however its presence allows the Rectangle Select tool to be very precise and consistent.

Now File ⇒ Save the work-file in case you wish to adjust things for another round. Here, I have applied the Cubist tool from the Filters to most of the content, so the idea is conveyed without revealing details of said content.

The next step is to Rectangle Select the top area (in the screenshot above, the left-name area has been Rectangle Selected), then Copy it (Ctrl+C is the keyboard shortcut), then File ⇒ Create ⇒ From Clipboard (Ctrl+Shift+V is the shortcut) to make the copy into a new image, and export that image (File ⇒ Export) as a PNG (lossless compression). Repeat for the bottom area, then in the central section for the left, left-name, centre, right-name and right areas.

Open LibreOffice Writer, Insert ⇒ Image the top-area image, right-click, choose Properties, under the Type tab make it “As character”, under the Crop tab set the Scale so it will all fit nicely (58% in this case, which can be tweaked later to suit), then OK. Click to the right of the image, press Shift+Enter to insert a NewLine (rather than a paragraph).

Now Insert ⇒ Image the centre left area, then left-name, centre, right-name, right. With the name areas (in this case) I also chose the Hyperlink tab within the Properties dialogue, and pasted the link into the URL field, making that image section click-able. When done, Shift+Enter to make a place for the bottom area.

Finally, Insert ⇒ Image the bottom-area image (and if it does not all butt up squarely, check (Format ⇒ Paragraph) that the Line Spacing for the document’s sole paragraph is set to Single). Now save (for the sake of posterior) and click the “Export as PDF” button.

April 30, 2019

Election Activity Bundle

With the upcoming federal election, many teachers want to do some related activities in class – and we have the materials ready for you! To make selecting suitable resources a bit easier, we have an Election Activity Bundle containing everything you need, available for just $9.90. Did you know that the secret ballot is an Australian […]

FreeDV QSO Party 2019 Part 2

Here is a fine summary of the FreeDV QSO Party 2019, which took place last weekend. I took part, and made several local and interstate contacts, plus listened in to many more.

Thanks so much to AREG for organising the event, and for all the Hams world wide who took part.

It would be great to make some international DX contacts using the mode, in particular to get some operational experience with the modems on long distance channels.

Generating Wideband Speech from Narrowband

I’m slowly getting up to speed on this machine learning caper. I had some free time this week, so I set myself a little machine learning exercise.

LPCNet represents speech using 18 bands that cover the range from 0 to 8000Hz (in the form of MFCCs). However the Wavenet work demonstrated high quality speech using just the Codec 2 2400 bit/s features, which only contain information in the 0 to 4 kHz range. This suggests we can regenerate the speech energy above 4000Hz, from the features beneath 4000Hz. In a speech coding application, this would save bits, as we no longer have to quantise and transmit the high frequency band energies.

So the goals of this project were:

  1. Gain experience in Machine Learning.
  2. Generate reasonable quality speech by synthesising the top 6 bands (3200 to 8000Hz) from the information in the lower 12 bands (which cover 0 to 2800Hz). Doesn’t have to be a perfect reconstruction, after all, we are throwing information away.

Method

As a starting point I set up a Keras model with a couple of Dense layers, and a linear output layer. The band features extracted by dump_data are the log10 of the band energies. Multiplying by 10 gives us the energy in dB. This is quite neat, as the network “loss” function (mean square error) effectively reports the distance in dB^2/10. This gives us a nice objective measure of distortion, and hopefully speech quality.

So the input to the network is the lower 12 bands, and the LPCNet pitch gain (a rough estimate of voicing). The output is the 6 high frequency bands.

I wanted to sanity check my network to make sure it was getting results better than random. So I reasoned a trivial algorithm would just set the HF band energies to their mean. The distortion of this trivial algorithm is the variance of the training data. I measured the variance of the HF bands as 200dB^2, and my first attempts were reducing this to 50dB^2, a factor of four improvement. So that’s a good start.

I messed about with the number and size of layers, activation functions, optimisers, batch size, epochs, which produced minor changes. On a whim I decided to remove the mean and that significantly reduced the error to 12dB^2, or a rms error of 3.5dB. That’s within the range of what a (coarse) vector quantiser will achieve – and we are doing it with zero bits.

BTW – is it just me or is NN design just guesswork? Maybe it becomes educated guesswork as experience grows.

The mean of the bands is related to frame energy, and is often sent to the decoder in some form as part of regular quantisation. So is the pitch gain (e.g. in the form of a voicing flag). However this NN only seems to work well when it’s the mean of all 18 bands.

Testing

I used a “vanilla” LPCNet system to test, slightly modified to output band energies rather than the DCT of the band energies. First I generate features using dump_data, then play the features straight into test_lpcnet to synthesise. There is no quantisation. I placed my HF regeneration network in between, optionally replacing the top 6 bands with the estimates from the network.

I use a small database (1E6 vectors) for LPCNet experimentation, as this trains LPCNet in just a few hours. I determined by experiment this was just large enough to synthesise good quality speech on samples from within the training database. I have a script (tinytrain.sh) that trains the network, and generates test samples, automating the process. If my experimental algorithms don’t work with this tiny database, no point going any further, and I haven’t wasted too much time. If the experiments work out, I can then train a better network, e.g. to deal with many different speakers and conditions.

So I used the same 1E6 size database of vectors for training the high band regeneration algorithm. The files are available from my LPCNet Github repo (train_regen.py, test_regen.py, run_regen.sh, plot_regen.m).

Samples

Condition | Female | Male
Original | Play | Play
Vanilla LPCNet | Play | Play
Regen HF with mean removed | Play | Play
Regen HF without mean removed | Play | Play

The mean removed samples sound rather close to the vanilla. Almost too close. In the other samples without mean removal, the high frequencies are not regenerated as well, although they might still be useful in a low bit rate coding scenario, given they cost zero bits.

Here is a 3D plot of the first 1 second (100 x 10ms frames) of the bands for the female sample above. The little hills roll up and down as the words are articulated. At the high frequency end, the peaks tend to correspond to consonants, e.g. the “ch” in “birch” is in the middle. Right at the end (around frame 100 at the “rear”), we can see the start of the “sss” in “slid”.

The second plot shows the error of the regeneration network:

Below is a plot of a single frame, showing the original and regenerated bands:

Conclusion

Anyhoo, looks like the idea has promise, especially for reducing the bit rate required for speech codecs. Further work will show if the idea works for a wider range of speech samples, and with quantisation. The current model could possibly be improved to take into account adjacent frames using a convnet or RNN. OK, on with the next experiment …

LUV May 2019 Main Meeting: Kali Linux

7:00 PM to 9:00 PM, Tuesday, May 7, 2019
Location: Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

Speakers:

  • errbufferoverfl: Kali Linux

Kali Linux

Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.


LUV May 2019 Workshop: RISC-V development board

12:30 PM to 4:30 PM, May 25, 2019
Location: Infoxchange, 33 Elizabeth St. Richmond

PLEASE NOTE CHANGE OF DATE DUE TO THE ELECTION

This month's meeting will be on 25 May rather than 18 May 2019 due to the election on the usual workshop date.

Speaker: Rodney Brown demonstrates his low-cost RISC-V development board running facial recognition software using the on-chip neural network.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121.  Late arrivals please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria is a subcommittee of Linux Australia.


April 27, 2019

Building new pods for the Spectracom 8140 using modern components

I've mentioned a bunch of times on the time-nuts list that I'm quite fond of the Spectracom 8140 system for frequency distribution. For those not familiar with it, it's simply running a 10MHz signal against a 12v DC power feed so that line-powered pods can tap off the reference frequency and use it as an input to either a buffer (10MHz output pods), decimation logic (1MHz, 100kHz etc.), or a full synthesizer (Versa-pods).

It was only in October last year that I got a house frequency standard going using an old Efratom FRK-LN, which now provides the reference; I'd use a GPSDO, but I live in a ground floor apartment without a usable sky view, which of course also makes it hard to test some of the GPS projects I'm doing. Despite living in a tiny apartment I have test equipment in two main places, so the 8140 is a great solution to allow me to lock all of them to the house standard.


(The rubidium is in the chunky aluminium chassis underneath the 8140)

Another benefit of the 8140 is that many modern pieces of equipment (such as my [HP/Agilent/]Keysight oscilloscope) have a single connector for reference frequency in/out, and should the external frequency ever go away it will switch back to its internal reference, but also send that back out the connector, which could lead to other devices sharing the same signal switching to it. The easy way to avoid that is to use a dedicated port from a distribution amplifier for each device like this, which works well enough until you have this situation in multiple locations.

As previously mentioned, the 8140 system uses pods to add outputs. While these pods are still available quite cheaply used on eBay (as of this writing, for as low as US$8, but ~US$25/pod has been common for a while), recently the cost of shipping to Australia has gone up to the point where I started to plan making my own.

By making my own pods I also get to add features that the original pods didn't have[1]. I started with a quad-output pod with optional internal line termination. This allows me to have feeds for multiple devices with the annoying behaviour I mentioned earlier. The enclosure is a Pomona model 4656, with the board designed to slot in, and offer pads for the BNC pins to solder to for easy assembly.



This pod uses a Linear Technology (now Analog Devices) LTC6957 buffer for the input stage, replacing a discrete transistor & logic gate combined input stage in the original devices. The most notable change is that this stage works reliably down to -30dBm input (possibly further, I couldn't test beyond that), whereas the original pods stop working right around -20dBm.

As it turns out, although it can handle lower input signal levels, in other ways including power usage it seems very similar. One notable downside is the chip tops out at 4v absolute maximum input, so a separate regulator is used just to feed this chip. The main regulator has also been changed from a 7805 to an LD1117 variant.

On this version the output stage is the same TI 74S140 dual 4-input NAND gate as was used on the original pods, just in SOIC form factor.

As with the next board there is one error on this board: the wire loop that forms the ground connection was intended to fit a U-type pin header; however, the footprint I used on the boards was just too tight to allow the pins through, so I've used some thin bus wire instead.



The second major variant I designed was a combo version, allowing sine & square outputs by just switching a jumper, or isolated[2] or line-regenerator (8040TA from Spectracom) versions with a simple sub-board containing just an inductor (TA) or 1:1 transformer (isolated).



This is the second revision of that board, where the 74S140 has been replaced by a modern TI 74LVC1G17 buffer. This version of the pod, set for sine output, uses almost exactly 30mA of current (since both the old & new pods use linear supplies that's the most sensible unit), whereas the original pods are right around 33mA. The empty pads at the bottom-left are simply placeholders for two 100 ohm resistors to add 50 ohm line termination if desired.

The board fits into the Pomona 2390 "Size A" enclosures, or for the isolated version the Pomona 3239 "Size B". This is the reason the BNC connectors have to be extended to reach the board, on the isolated boxes the BNC pins reach much deeper into the enclosure.

If the jumpers were removed, plus the smaller buffer it should be easy to fit a pod into the Pomona "Miniature" boxes too.



I was also due to create some new personal business cards, so I arranged the circuit down to a single layer (the only jumper is the requirement to connect both ground pins on the connectors) and merged it with some text converted to KiCad footprints to make a nice card on some 0.6mm PCBs. The paper in that photo is covering the link to the build instructions, which weren't written at the time (they're *mostly* done now, I may update this post with the link later).

Finally, while I was out travelling at the start of April my new (to me) HP 4395A arrived so I've finally got some spectrum output. The output is very similar between the original and my version, with the major notable difference being that my version is 10dB worse at the third harmonic. I lack the equipment (and understanding) to properly measure phase noise, but if anyone in AU/NZ wants to volunteer their time & equipment for an afternoon I'd love an excuse for a field trip.



Spectrum with input sourced from my house rubidium (natively a 5MHz unit) via my 8140 line. Note that despite saying "ExtRef" the analyzer is synced to its internal 10811 (which is an optional unit, and uses an external jumper, hence the display note).



Spectrum with input sourced from the analyzer's own 10811, and power from the DC bias generator also from the analyzer.


1: Or at least I didn't think they had, I've since found out that there was a multi output pod, and one is currently in the post heading to me.
2: An option on the standard Spectracom pods, albeit a rare one.

April 22, 2019

Pi-hole with DNS over TLS on Fedora

Quick and dirty guide to using Pi-hole with Stubby to provide both advertisement blocking and DNS over TLS. I’m using Fedora 29 ARM server edition on a Raspberry Pi 3.

Download Fedora server ARM edition and write it to an SD card for the Raspberry Pi 3.

sudo fedora-arm-image-installer --resizefs --image=Fedora-Server-armhfp-29-1.2-sda.raw.xz --target=rpi3 --media=/dev/mmcblk0

Make sure your Raspberry Pi can already resolve DNS queries from some other source, such as your router or internet provider.

Log into the Fedora Server Cockpit web interface for the server (port 9090) and enable automatic updates from the Software tab. Otherwise you can do updates manually:

sudo dnf -y update && sudo reboot

Install Stubby

Install Stubby to forward DNS requests over TLS.

sudo dnf install getdns bind-utils

Edit the Stubby config file.

sudo vim /etc/stubby/stubby.yml

Set listen_addresses to localhost 127.0.0.1 on port 53000 (also set your preferred upstream DNS providers, if you want to change the defaults, e.g. CloudFlare).

listen_addresses:
  - 127.0.0.1@53000
  - 0::1@53000

Start and enable Stubby, checking that it’s listening on port 53000.

sudo systemctl restart stubby
sudo ss -lunp |grep 53000
sudo systemctl enable stubby

Stubby should now be listening on port 53000, which we can test with dig. The following command should return an IP address for google.com.

dig @localhost -p 53000 google.com

Next we’ll use Pi-hole as a caching DNS service to forward requests to Stubby (and provide advertisement blocking).

Install Pi-hole

Sadly, Pi-hole doesn’t support SELinux at the moment so set it to permissive mode (or write your own rules).

sudo setenforce 0
sudo sed -i s/^SELINUX=.*/SELINUX=permissive/g /etc/selinux/config

Install Pi-hole from their Git repository.

sudo dnf install git
git clone --depth 1 https://github.com/pi-hole/pi-hole.git Pi-hole
cd "Pi-hole/automated install/"
sudo ./basic-install.sh

The installer will run, install deps and prompt for configuration. When asked what DNS to use, select Custom from the bottom of the list.

Custom DNS servers

Set the server to 127.0.0.1 (note that we cannot set the port here, we’ll do that later)

Use local DNS server

In the rest of the installer, also enable the web interface and web server if you like, and allow it to modify the firewall, else this won't work at all! 🙂 Make sure you take note of your admin password from the last screen, too.

Finally, add the port to our upstream (localhost) DNS server so that Pi-hole can forward requests to Stubby.

sudo sed -i '/^server=/ s/$/#53000/' /etc/dnsmasq.d/01-pihole.conf
sudo sed -i '/^PIHOLE_DNS_[1-9]=/ s/$/#53000/' /etc/pihole/setupVars.conf
sudo systemctl restart pihole-FTL

If you don’t want to muck around with localhost and ports you could probably add an IP alias and bind your Stubby to that on port 53 instead.
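That variant would look something like this (a sketch only; the alias address is made up, and you would then point Pi-hole's custom DNS at that address instead of 127.0.0.1#53000):

# Give Stubby its own address to listen on (the address here is just an example)
sudo ip addr add 10.53.53.53/32 dev lo

# Then in /etc/stubby/stubby.yml point listen_addresses at it (port 53 is the default):
#   listen_addresses:
#     - 10.53.53.53
sudo systemctl restart stubby

Depending on how the stubby service is sandboxed you may also need to allow it to bind to a privileged port.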

Testing

On a machine on your network, set /etc/resolv.conf to point to the IP address of your Pi-hole server to use it for DNS.

On the Pi-hole, check incoming DNS requests with tcpdump to ensure the services are listening and forwarding on the right ports.

sudo tcpdump -Xnn -i any port 53 or port 53000 or port 853

Back on your client machine, ping google.com and with any luck it will resolve.

For a new query, tcpdump on your Pi-hole box should show an incoming request from the client machine to your Pi-hole on port 53, a follow-up localhost request to 53000, then an outward request from your Pi-hole to 853, and finally the returned result back to your client machine.

You should also notice that the payloads of the internal DNS queries are plain text, but the remote ones are encrypted.

Web interface

Start browsing around and see if you notice any difference where you’d normally see ads. Then jump onto the web interface on your Pi-hole box and take a look around.

Pi-hole web interface

If that all worked, you could get your DHCP server to point clients to your shiny new Pi-hole box (i.e. use DHCP option 6 with the Pi-hole's IP address).
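As an example, if the DHCP server in question happens to be dnsmasq (an assumption; your router's interface will differ), the relevant line would look like:

# Hand out the Pi-hole (192.168.1.10 is a made-up address) as the DNS server via option 6
dhcp-option=6,192.168.1.10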

If you’re feeling extra brave, you could redirect all unencrypted DNS traffic on port 53 back to your internal DNS before it leaves your network, but that might be another blog post…


Vale Polly Samuel (1963-2017): On Dying & Death

Those of you who follow me on Twitter will know some of this already, but I’ve been meaning to write here for quite some time about all this. It’s taken me almost two years to write, because it’s so difficult to find the words to describe this. I’ve finally decided to take the plunge and finish it despite feeling it could be better, but if I don’t I’ll never get this out.

September 2016 was not a good month for my wonderful wife Polly: she’d been having pains around her belly and after prodding the GP she managed to get a blood test ordered. They had suspected gallstones or gastritis but when the call came one evening to come in urgently the next morning for another blood test we knew something was up. After the blood test we were sent off for an ultrasound of the liver and with that out of the way went out for a picnic on Mount Dandenong for a break. Whilst we were eating we got another phone call from the GP, this time to come and pick up a referral for an urgent MRI. We went to pick it up but when they found out Polly had already eaten they realised they would need to convert to a CT scan. A couple of phone calls later we were booked in for one that afternoon. That evening was another call to come back to see the GP. We were pretty sure we knew what was coming.

The news was not good: Polly had “innumerable” tumours in her liver. Over 5 years after surgery and chemo for her primary breast cancer, and almost at the end of her 5 years of tamoxifen, the cancer had returned. We knew the deal with metastatic cancer, but it was still a shock when the GP said “you know this is not a curable situation”. So the next day (Friday) it was right back to her oncologist, who took her off the tamoxifen immediately (as it was no longer working) and scheduled chemotherapy for the following Monday, after an operation to install a PICC line. He also explained what this meant: that this was just a management technique to (hopefully) try and shrink the tumours and make life easier for Polly for a while. It was an open question how long that while would be, but we knew from the papers she had found online, looking at the statistics, that it was likely months, not years, that we had. Polly wrote about it all at the time, far more eloquently than I could, with more detail, on her blog.

Chris, my husband, best pal, and the love of my life for 17 years, and I sat opposite the oncologist. He explained my situation was not good, that it was not a curable situation. I had already read that extensive metastatic spread to the liver could mean a prognosis of 4-6 months but if really favorable as long as 20 months.

The next few months were a whirlwind of chemo, oncology, blood tests, crying, laughing and loving. We were determined to talk about everything, and Polly was determined to prepare as quickly as she could for what was to come. They say you should “put your affairs in order” and that’s just what she did, financially, business-wise (we’d been running an AirBNB and all those bookings had to be canceled ASAP, plus of course all her usual autism consulting work) and personally. I was so fortunate that my work was so supportive and able to be flexible about my hours and days and so I could be around for these appointments.

Over the next few weeks it was apparent that the chemo was working, breathing & eating became far easier for her and a follow up MRI later on showed that the tumours had shrunk by about 75%. This was good news.

October 2016 was Polly’s 53rd birthday, and so she set about planning a living wake for herself, with a heap of guests, music courtesy of our good friend Scott, a lot of sausages (and other food) and good weather. Polly led the singing and there was an awful lot of merriment. Such a wonderful time and such good memories were made that day.

Polly singing at her birthday party in 2016

That December we celebrated our 16th wedding anniversary together at a lovely farm-stay place in the Yarra Valley before having what we were pretty sure was our last Christmas together.

Polly and Chris at the farm-stay for our wedding anniversary

But then in January came the news we’d been afraid of, the blood results were showing that the first chemo had run out of steam and stopped working, so it was on to chemo regime #2. A week after starting the new regime we took a delayed holiday up to the Blue Mountains in New South Wales (we’d had to cancel previously due to her diagnosis) and spent a long weekend exploring the area and generally having fun.

Polly and Chris at Katoomba, NSW

But in early February it was clear that the second line chemo wasn’t doing anything, and so it was on to the third line chemo. Polly had also been having fluid build up in her abdomen (called ascites) and we knew they would have to start draining that at some point; February was that point. We spent the morning of Valentine’s Day in the radiology ward where they drained around 4 litres from her! The upside of that was it made life so much easier again for her. We celebrated by going for Valentine’s dinner to a really wonderful restaurant that we used for special events, something we hadn’t thought possible that morning!

Valentine's Day dinner at Copperfields

Two weeks after that we learned from the oncologist that the third line chemo wasn’t doing anything either and he had to give us the news that there wasn’t any treatment he could offer us that had any prospect of helping. Polly took that in her usual pragmatic and down-to-earth way, telling the oncologist that she didn’t see him as the reaper but as her fairy godfather who had given her months of extra quality time and bringing a smile to his and my face. She also asked whether the PICC line (which meant she couldn’t have a bath, just shower with a protective cover over it) could come out and the answer was “yes”.

The day before that news we had visited the palliative ward there for the first time. Polly had a hard time with hospitals, and so we spent time talking to the staff and visiting rooms, with Polly all the time reframing it to reduce and remove the anxiety. The magic words were “hotel-hospital”, which it really did feel like. We talked with the oncologist about how it all worked and what might happen.

We also had a home palliative team who would come and visit, help with pain management and be available on the phone at all hours to give advice and assist where they could. Polly felt uncertain about them at first as she wasn’t sure what they would make of her language issues and autism, but whilst they did seem a bit fazed at first by someone who was dealing with the fact that they were dying in such a blunt and straightforward manner things soon smoothed out.

None of this stopped us living, we continued to go out walking at our favourite places in our wonderful part of Melbourne, continued to see friends, continued to joke and dance and cry and laugh and cook and eat out.

Polly on miniature steam train

Oh, and not forgetting putting a new paved area in so we could have a little outdoor fire area to enjoy with friends!

Chris laying paving slabs for fire area Polly and Morghana enjoying the fire!

But over time the ascites was increasing, with each drain being longer, with more fluid, and more taxing for Polly. She had decided that when it would get to the point that she would need two a week then that was enough and time to call it a day. Then, on a Thursday evening after we’d had an afternoon laying paving slabs for another little patio area, Polly was having a bath whilst I was researching some new symptoms that had appeared, and when Polly emerged I showed her what I had found. The symptoms matched what happens when that pressure that causes the ascites gets enough to push blood back down other pathways and as we read what else could lie in store Polly decided that was enough.

That night Polly emailed the oncologist to ask them to cancel her drain which was scheduled for the next day and instead to book her into the palliative ward. We then spent our final night together at home, before waking the next day to get the call to confirm that all was arranged from their end and that they would have a room by 10am, but to arrive when was good for us. Friends were informed and Polly and I headed off to the palliative ward, saying goodbye to the cats and leaving our house together for the very last time.

Arriving at the hospital we dropped in to see the oncology, radiology and front-desk staff we knew, to chat with them before heading up to the palliative ward to meet the staff there and set up in the room. The oncologist visited and we had a good chat about what would happen with pain relief and sedation once Polly needed it. Shortly after, our close friends Scott and Morghana arrived from different directions, and since I had brought Polly’s laptop and a 4G dongle, Polly’s good Skype pal Marisol joined us virtually. We shared a (dairy free) Easter egg, some raspberry lemonade and even some saké! We had brought in a portable stereo and CDs and danced and sang and generally made merry – which was so great.

After a while Polly decided that she was too uncomfortable and needed the pain relief and sedation, so everything was put in its place and we all said our goodbyes to Polly as she was determined to do the final stages on her own, and she didn’t want anyone around in case it caused her to try and hang on longer than she really should. I told her I was so proud of her and so honoured to be her husband for all this time. Then we left, as she wished, with Scott and Morghana coming back with me to the house. We had dinner together at the house and then Morghana left for home and Scott kindly stayed in the spare room.

The next day Scott and I returned to the hospital, Polly was still sleeping peacefully so after a while he and I had a late lunch together, making sure to fulfil Polly’s previous instructions to go enjoy something that she couldn’t, and then we went our separate ways. I had not been home long before I got the call from the hospital – Polly was starting to fade – so I contacted Scott and we both made our way back there again. The staff were lovely, they managed to rustle up some food for us as well as tea and coffee and would come and check on us in the waiting lounge, next door to where Polly was sleeping. At one point the nurse came in and said “you need a hug, she’s still sleeping”. Then, a while after, she came back in and said “I need a hug, she’s gone…”.

I was bereft. Whilst intellectually I knew this was inevitable, the reality of knowing that my life partner of 17 years was gone was so hard. The nurse told us that we could see Polly now, and so Scott and I went to see her to say our final goodbye. She was so peaceful, and I was grateful that things had gone as she wanted and that she had been able to leave on her own terms and without the greater discomforts and pain that she was worried would still be coming. Polly had asked us to leave a CD on, and as we were leaving the nurses said to us “oh, we changed the CD earlier on today because it seemed strange to just have the one on all the time. We put this one on by someone called ‘Donna Williams’, it was really nice.”. So they had, unknowingly, put her own music on to play her out.

As you would expect if you had ever met Polly she had put all her affairs in order, including making preparations for her memorial as she wanted to make things as easy for me as possible. I arranged to have it live streamed for friends overseas and as part of that I got a recording of it, which I’m now making public below. Very sadly her niece Jacqueline, who talks at one point about going ice skating with her, has also since died.

Polly and I were so blessed to have 16 wonderful years together, and even at the end the fact that we did not treat death as a taboo and talked openly and frankly about everything (both as a couple and with friends) was such a boon for us. She made me such a better person and will always be part of my life, in so many ways.

Finally, I leave you with part of Polly’s poem & song “Still Awake”..

Time is a thief, which steals the chances that we never get to take.
It steals them while we are asleep.
Let’s make the most of it, while we are still awake.

Polly at Cardinia Reservoir, late evening

This item originally posted here:

Vale Polly Samuel (1963-2017): On Dying & Death

April 20, 2019

Now migrated to Drupal 8!

Now migrated to Drupal 8! kattekrab Sat, 20/04/2019 - 22:08

Leadership, and teamwork.

Leadership, and teamwork. kattekrab Fri, 13/04/2018 - 04:09

Makarrata

Makarrata kattekrab Thu, 14/06/2018 - 20:19

Communication skills for everyone

Communication skills for everyone kattekrab Sat, 17/03/2018 - 13:01

DrupalCon Nashville

DrupalCon Nashville kattekrab Sat, 17/03/2018 - 22:01

Powerful Non Defensive Communication (PNDC)

Powerful Non Defensive Communication (PNDC) kattekrab Sun, 10/03/2019 - 09:00

I said, let me tell you now

I said, let me tell you now kattekrab Sat, 10/03/2018 - 09:56

The Five Whys

The Five Whys kattekrab Sat, 16/06/2018 - 09:16

Site building with Drupal

Site building with Drupal kattekrab Sat, 17/02/2018 - 14:05

Six years and 9 months...

Six years and 9 months... kattekrab Sat, 27/10/2018 - 13:05

April 17, 2019

Programming an AnyTone AT-D878UV on Linux using Windows 10 and VirtualBox

I recently acquired an AnyTone AT-D878UV DMR radio which is unfortunately not supported by chirp, my usual go-to free software package for programming amateur radios.

Instead, I had to set up a Windows 10 virtual machine so that I could configure the radio using the manufacturer's computer programming software (CPS).

Install VirtualBox

Install VirtualBox:

apt install virtualbox virtualbox-guest-additions-iso

and add your user account to the vboxusers group:

adduser francois vboxusers

to make filesharing between the host and the guest work.

Finally, reboot to ensure that group membership and kernel modules are all set.

Create a Windows 10 virtual machine

Create a new Windows 10 virtual machine within VirtualBox. Then download Windows 10 from Microsoft and start the virtual machine with the .iso file mounted as an optical drive.

Follow the instructions to install Windows 10, paying attention to the various privacy options you will be offered.

Once Windows is installed, mount the host's /usr/share/virtualbox/VBoxGuestAdditions.iso as a virtual optical drive and install the VirtualBox guest additions.

Installing the CPS

With Windows fully set up, it's time to download the latest version of the computer programming software.

Unpack the downloaded file and then install it as Admin (right-click on the .exe).

Do NOT install the GD driver update or the USB driver, they do not appear to be necessary.

Program the radio

First, you'll want to download from the radio to get a starting configuration that you can change.

To do this:

  1. Turn the radio on and wait until it has finished booting.
  2. Plug the USB programming cable into the computer and the radio.
  3. From the CPS menu choose "Set COM port".
  4. From the CPS menu choose "Read from radio".

Save this original codeplug to a file as a backup in case you need to easily reset back to the factory settings.

To program the radio, follow this handy third-party guide since it's much better than the official manual.

You should be able to use the "Write to radio" menu option without any problems once you're done creating your codeplug.

April 13, 2019

Secure ssh-agent usage

ssh-agent was in the news recently due to the matrix.org compromise. The main takeaway from that incident was that one should avoid the ForwardAgent (or -A) functionality when ProxyCommand can do the job, and consider multi-factor authentication on the server side, for example using libpam-google-authenticator or libpam-yubico.

That said, there are also two options to ssh-add that can help reduce the risk of someone else with elevated privileges hijacking your agent to make use of your ssh credentials.

Prompt before each use of a key

The first option is -c which will require you to confirm each use of your ssh key by pressing Enter when a graphical prompt shows up.

Simply install an ssh-askpass frontend like ssh-askpass-gnome:

apt install ssh-askpass-gnome

and then use this option when adding your key to the agent:

ssh-add -c ~/.ssh/key

Automatically removing keys after a timeout

ssh-add -D will remove all identities (i.e. keys) from your ssh agent, but requires that you remember to run it manually once you're done.

That's where the second option comes in. Specifying -t when adding a key will automatically remove that key from the agent after the specified amount of time.

For example, I have found that this setting works well at work:

ssh-add -t 10h ~/.ssh/key

where I don't want to have to type my ssh password every time I push a git branch.

At home on the other hand, my use of ssh is more sporadic and so I don't mind a shorter timeout:

ssh-add -t 4h ~/.ssh/key

Making these options the default

I couldn't find a configuration file to make these settings the default and so I ended up putting the following line in my ~/.bash_aliases:

alias ssh-add='ssh-add -c -t 4h'

so that I can continue to use ssh-add as normal and not have to remember to include these extra options.

April 11, 2019

Using a MCP4921 or MCP4922 as a SPI DAC for Audio on Raspberry Pi


I’ve been playing recently with using a MCP4921 as an audio DAC on a Raspberry Pi Zero W, although a MCP4922 would be equivalent (the ’22 is a two channel DAC, the ’21 is a single channel DAC). This post is my notes on where I got to before I decided that this wasn’t going to work out for me.

My basic requirement was to be able to play sounds on a raspberry pi which already has two SPI buses in use. Thus, adding a SPI DAC seemed like a logical choice. The basic circuit looked like this:

MCP4921 SPI DAC circuit

Driving this circuit looked like this (noting that this code was a prototype and isn’t the best ever). The bit that took a while there was realising that the CS line needs to be toggled between 16-bit writes. Once that had been done (which meant moving to a different spidev call), things were on the up and up.
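Roughly, the driving code boils down to something like the following minimal sketch (a reconstruction rather than the original prototype; it assumes the DAC is wired to SPI bus 0, chip select 0, and that the input is unsigned 8-bit mono samples played at an assumed 8kHz rate):

# Sketch only: send 8-bit samples to an MCP4921 over SPI.
# Bus 0 / CE0 wiring and the 8kHz sample rate are assumptions.
import time
import spidev

SAMPLE_RATE = 8000                  # assumed sample rate in Hz

spi = spidev.SpiDev()
spi.open(0, 0)                      # bus 0, chip select 0
spi.max_speed_hz = 1000000
spi.mode = 0

def write_sample(value12):
    # Command word: 0b0011 = DAC A, unbuffered, 1x gain, output active,
    # followed by 12 bits of data.
    word = 0x3000 | (value12 & 0x0FFF)
    # One xfer() call per 16-bit word, so chip select is asserted and
    # released around each sample.
    spi.xfer([word >> 8, word & 0xFF])

def play(samples):
    period = 1.0 / SAMPLE_RATE
    for s in samples:               # each s is an unsigned 8-bit sample
        write_sample(s << 4)        # scale 8 bits up to 12 bits
        time.sleep(period)          # crude pacing; this is where the timing falls apart

Calling xfer() separately for each 16-bit word is what provides the CS toggle between writes mentioned above.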

This was the point I realised that I was at a dead end. I can’t find a way to send the data to the DAC in a way which respects the timing of the audio file. Before I had to do small writes to get the CS line to toggle I could do things fast enough, but not afterwards. Perhaps there’s a DMA option instead, but I haven’t found one yet.

Instead, I think I’m going to go and try PWM based audio. If that doesn’t work, it will be a MAX219 i2c DAC for me!


April 10, 2019

Audiobooks – March 2019

An Economist Gets Lunch: New Rules for Everyday Foodies by Tyler Cowen

A huge amount of practical advice and how and where to find the best food both locally and abroad. Plus good explanations as to why. 8/10

The Not-Quite States of America: Dispatches from the Territories and Other Far-Flung Outposts of the USA by Doug Mack

Writer tours the not-states of the USA. A bit too fluffy most of the time & too much hanging with US expats. Some interesting bits. 6/10

Shattered: Inside Hillary Clinton’s Doomed Campaign by Jonathan Allen & Amie Parnes

Chronology of the campaign based on background interviews with staffers. A reader needs a good knowledge of the race since this is assumed. Interesting enough. 7/10

Rush Hour by Iain Gately

A history of commuting (from the early railway era), how it has driven changes in housing, work and society. Plus lots of other random stuff. Very pleasant. 8/10


April 08, 2019

1-Wire home automation tutorial from linux.conf.au 2019, part 2


For the actual on-the-day work, delegates were handed a link to these instructions in github. If you’re playing along at home, you should probably read 1-Wire home automation tutorial from linux.conf.au 2019, part 1 before attempting the work described here. It’s especially important that you know the IP address of your board, for example.

Relay tweaks

The instructions are pretty self-explanatory, although I did get confused about where to connect the relay as I couldn’t find PC8 in my 40 pin header diagrams. That’s because the shields for the tutorial have a separate header, which is a bit more convenient:

GPIO header

I was also a bit confused when the relay didn’t work initially, but that turned out to be because I’d misunderstood the wiring. The relay needs to be powered from the 3.3v pin on the 40 pin header, as there is a PCB error which puts 5v on the pins labelled as 3.3v on the GPIO header. I ended up with jumper wires which looked like this:

Cabling the relay

1-Wire issues

Following the tutorial instructions worked well from then on, until I tried to get 1-Wire set up. The owfs2mqtt bridge plugin was logging this:

2019-04-08 19:23:55.075: /opt/OWFS-MQTT-Bridge/lib/Daemon/OneWire.pm:148:Daemon::logError(): Connection to owserver failed: Can't connect owserver: Address not available

Debugging that involved connecting to the owfs2mqtt docker container (hint: ssh to the Orange Pi, do a docker ps, and then run bash inside the docker container for the addon). Running owserver with debug produces this:

owserver fails

Sorry to post that as an image; cut and paste from the hassos ssh server doesn’t like me for some reason. I suspect I have a defective DS2482, but I’ll have to wait and see what Allistair says.


1-Wire home automation tutorial from linux.conf.au 2019, part 1


I didn’t get much of a chance to work through the home automation tutorial at linux.conf.au 2019 because I ended up helping others in the room get their Orange Pi boards booting. Now that things have settled down after the conference, I’ve had a chance to actually do some of the tutorial myself. These are my notes so I can remember what I did later…

Pre-tutorial setup

You need to do the pre-tutorial setup first. I use Ubuntu, which means it’s important that I use 18.10 or greater so that st-link is packaged. Apart from that, the instructions as written just worked.

You also need to download the image for the SD card, which was provided on the day at the conference. The URL for that is from github. Download that image, decompress it, and then flash it to an SD card using something like Balena Etcher. The tutorial used 32GB SD cards, but the image will fit on something smaller than that.

hassos also doesn’t put anything on the Orange Pi HDMI port when it boots, so your machine is going to look like it didn’t boot. That’s expected. For the tutorial we provided a mapping from board number (mac address effectively) to IP address allocated in the tutorial. At home if you’re using an Orange Pi that isn’t from the conference you’re going to have to find another way to determine the IP address of your Orange Pi.

The way we determined MAC addresses and so forth for the boards used at the conference was to boot an Armbian image and then run a simple python script which performed some simple checks of each board by logging into the board over serial. The MAC addresses for the boards handed out on the day are on github.
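As a rough illustration of the idea (a sketch with pyserial rather than the actual conference script; the serial device path and the Armbian login details are assumptions), the core of such a script is just this:

# Sketch only: log into an Armbian board over serial and read its MAC address.
# The port, username and password below are assumptions for illustration.
import serial

def read_mac(port="/dev/ttyUSB0", user=b"root", password=b"1234"):
    with serial.Serial(port, 115200, timeout=5) as ser:
        ser.write(b"\n")
        ser.read_until(b"login:")
        ser.write(user + b"\n")
        ser.read_until(b"Password:")
        ser.write(password + b"\n")
        ser.read_until(b"#")                       # wait for a shell prompt
        ser.write(b"cat /sys/class/net/eth0/address\n")
        ser.readline()                             # discard the echoed command
        return ser.readline().strip().decode()

print(read_mac())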

An Aside: Serial on the Orange Pi Prime

As an aside, the serial on the Orange Pi Prime is really handy, especially with the hassos image. Serial is exposed by a three pin header on the board, which is sort of labelled:

The Orange Pi Prime Serial Port

Noting that you’ll need to bend the pins of the serial header a little if you’re using the shield from the conference:

Serial port connected to a USB to serial converter

The advantage being that suddenly you get useful debugging information! The serial connection is 115200 baud 8N1 (8 data bits, no parity, 1 stop bit) by the way.

Serial debug information from an hassos boot

The hassos image used for the conference allows login as root with no password over serial, which dumps you into a hass interface. Type “login” to get a bash prompt, even though it’s not in the list of commands available. At this point you can use the “ip address” command to work out what address DHCP handed the board.

The actual on-the-day work

So at this point we’re about as ready as people were meant to be when they walked into the room for the tutorial. I’ll write more notes when I complete the actual tutorial.


April 05, 2019

Introducing GangScan


As some of you might know, I am a Scout Leader. One of the things I do for Scouts is I assist in a minor role with the running of Canberra Gang Show, a theatre production for young people.

One of the things Gang Show cares about is that they need to be able to do rapid roll calls and reporting on who is present at any given time — this is used for working out who is absent before a performance (and therefore needs an understudy), as well as ensuring we know where everyone is in an environment that sometimes has its fire suppression systems isolated.

Before I came along, Canberra Gang Show was doing this with a Windows based attendance tracking application, and 125kHz RFID tags. This system worked just fine, except that the software was clunky and there was only one badge reader — we struggled explaining to youth that they need to press the “out” button when logging out, and we wanted to be able to have attendance trackers at other locations in the theatre instead of forcing everyone to flow through a single door.

So, I got thinking. How hard can it be to build something a bit better?

Let’s start with some requirements: simple to deploy and manage; free software (both cost and freedom); more badge readers than what we have now; and low cost.

My basic proposal for such a thing is a Raspberry Pi Zero W, with a small LCD screen and a RFID reader. The device scans badges, and displays a confirmation of scan to the user. If the device can talk to a central server it streams events to it; otherwise it queues them until the server is available and then streams them.
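As a rough sketch of that store-and-forward behaviour (a sketch of the concept rather than the actual GangScan code; the server URL and event format are made up), the queueing logic is conceptually just this:

# Sketch only: queue scan events locally and flush them whenever the
# (hypothetical) server is reachable.
import collections
import time

import requests

SERVER = "http://example.local:8080/events"        # hypothetical endpoint
queue = collections.deque()

def record_scan(badge_id):
    queue.append({"badge": badge_id, "timestamp": time.time()})

def flush_queue():
    while queue:
        event = queue[0]
        try:
            requests.post(SERVER, json=event, timeout=2).raise_for_status()
        except requests.RequestException:
            return                                  # server unreachable; try again later
        queue.popleft()                             # drop only once the server accepted it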

Sourcing a simple SPI LCD screen and SPI RFID reader from ebay wasn’t too hard, and we were off! The only real wart was that I wanted to use 13.56MHz RFID cards, because then I could store some interesting data (up to 1KB) on the card itself. The first version was simply a ribbon cable:

v0.0, a ribbon cable

Which then led to me having my first PCB ever made. Let’s ignore that it’s the wrong size, shall we?

v0.1, an incorrectly sized PCB

I’m now at the point where the software for the scanner is reasonable, and there is a bare bones server that does enough roll call that it should be functional. I am sure there’s more to be done, but it works enough to demo. One thing I learned while showing off the device at coffee the other day is that it really needs to make a noise when you scan a badge. I’ve ordered a SPI DAC to play with, which might be the solution there. Other next steps include a newer version of the PCB, and some sort of case solution. I’ll do another post when things progress further.

Oh yes, and I’ll eventually release the software too, once it’s in a more workable state.


April 04, 2019

React Isn’t The Problem

As React (via Gutenberg) becomes more present in the WordPress world, I’m seeing some common themes pop up in conversations about it. I spoke a bit about this kind of thing at WordCamp US last year, but if you don’t feel like sitting through a half hour video, let me summarise my thoughts. 🙂

I agree that React is hard. I strongly disagree with the commonly contrasted view that HTML, CSS, PHP, or vanilla JavaScript are easy. They’re all just as hard to work with as React, sometimes more-so, particularly when having to deal with the exciting world of cross-browser compatibility.

The advantage that PHP has over modern JavaScript development isn’t that it’s easy, or that the tooling is better, or more reliable, or anything like that. The advantage is that it’s familiar. If you’re new to web development, React is just as easy as anything else to start with.

I’m honestly shocked when someone manages to wade through the mess of tooling (even pre-Gutenberg) to contribute to WordPress. It’s such an incomprehensible, thankless, unreliable process, the tenacity of anyone who makes it out the other side should be applauded. That said, this high barrier is unacceptable.

I’ve been working in this industry for long enough to have forgotten the number of iterations of my personal development environment I’ve gone through, to get to where I can set up something for myself which isn’t awful. React wasn’t around for all of that time, so that can’t be the reason web development has been hard for as long as I remember. What is, then?

Doing Better

Over the past year or so, I’ve been tinkering with a tool to help deal with the difficulties of contributing to WordPress. That tool is called TestPress; it’s getting pretty close to being usable, at least on MacOS. Windows support is a little less reliable, but getting better. 🙂 If you enjoy tinkering with tools, too, you’re welcome to try out the development version, but it does still have some bugs in it. Feedback and PRs are always welcome! There are some screenshots in this issue that give an idea of what the experience is like, if you’d like to check it out that way.

TestPress is not a panacea: at best, it’s an attempt at levelling the playing field a little bit. You shouldn’t need years of experience to build a reliable development environment, that should be the bare minimum we provide.

React is part of the solution

There’s still a lot of work to do to make web development something that anyone can easily get into. I think React is part of the solution to this, however.

React isn’t without its problems, of course. Modern JavaScript can encourage iteration for the sake of iteration. Certainly, there’s a drive to React-ify All The Things (a trap I’m guilty of falling into, as well). React’s development model is fundamentally different to that of vanilla JavaScript or jQuery, which is why it can seem incomprehensible if you’re already well versed in the old way of doing things: it requires a shift in your mental model of how JavaScript works. This is a hard problem to solve, but it’s not insurmountable.

Perhaps a little controversially, I don’t think that React is guilty of causing the web to become less accessible. At worst, it’s continuing the long standing practice of web standards making accessibility an optional extra. Building anything beyond a basic, non-interactive web page with just HTML and CSS will inevitably cause accessibility issues, unless you happen to be familiar with the mystical combinations of accessible tags, or applying aria attributes, or styling your content in just the right way (and none of the wrong ways).

React (or any component-based development system, really) can improve accessibility for everyone, and we’re seeing this with Gutenberg already. By providing a set of base components for plugin and theme authors to use, we can ensure the correct HTML is produced for screen readers to work with. Much like desktop and mobile app developers don’t need to do anything to make their apps accessible (because it’s baked into the APIs they use to build their apps), web developers should have the same experience, regardless of the complexity of the app they’re building.

Arguing that accessibility needs to be part of the design process is the wrong argument. Accessibility shouldn’t be a consideration, it should be unavoidable.

Do Better

Now, can we do better? Absolutely. There’s always room for improvement. People shouldn’t need to learn React if they don’t want to. They shouldn’t have to deal with the complexities of the WCAG. They should have the freedom to tinker, and the reassurance that they can tinker without breaking everything.

The pre-React web didn’t arrive in its final form, all clean, shiny, and perfect. It took decades of evolution to get there. The post-React web needs some time to evolve, too, but it has the benefit of hindsight: we can compress the decades of evolving into a much shorter time period, provide a fresh start for those who want it, while also providing backwards compatibility with the existing ways of doing things.

April 03, 2019

Easybuild: Building Software with Ease

Building software from source is necessary for performance and development reasons. However, this can come with complex dependency and compiler requirements, which have to be explicitly stated in research computing to ensure replication of results. EasyBuild, originally developed by the Julich Supercomputing Centre, the University of Gent, and the Texas Advanced Computing Center, is a tool that allows the building of software with ease, managing the complex dependencies and toolchains, and integrating by default with the Lmod environment modules system.

This presentation will outline the need for tools like Easybuild, describe the framework of Easyblocks, Toolchains, and Easyconfig recipes, and extensions. It will also describe how to install and configure EasyBuild, write and contribute configuration files, and use the configurations to install software with a variety of optional parameters, such as rebuilds and partial builds. Finally, it will conclude with a discussion of some of the more advanced options and opportunities for involvement in the Easybuild community.

Easybuild: Building Software with Ease
Presentation to Linux Users of Victoria, 2nd April, 2019

April 01, 2019

Article Review: Curing the Vulnerable Parser

Every once in a while I read papers or articles. Previously, I've just read them myself, but I was wondering if there were more useful things I could do beyond that. So I've written up a summary and my thoughts on an article I read - let me know if it's useful!

I recently read Curing the Vulnerable Parser: Design Patterns for Secure Input Handling (Bratus, et al; USENIX ;login: Spring 2017). It's not a formal academic paper but an article in the Usenix magazine, so it doesn't have a formal abstract I can quote, but in short it takes the long history of parser and parsing vulnerabilities and uses that as a springboard to talk about how you could design better ones. It introduces a toolkit based on that design for more safely parsing some binary formats.

Background

It's worth noting early on that this comes out of the LangSec crowd. They have a pretty strong underpinning philosophy:

The Language-theoretic approach (LANGSEC) regards the Internet insecurity epidemic as a consequence of ad hoc programming of input handling at all layers of network stacks, and in other kinds of software stacks. LANGSEC posits that the only path to trustworthy software that takes untrusted inputs is treating all valid or expected inputs as a formal language, and the respective input-handling routines as a recognizer for that language. The recognition must be feasible, and the recognizer must match the language in required computation power.

A big theme in this article is predictability:

Trustworthy input is input with predictable effects. The goal of input-checking is being able to predict the input’s effects on the rest of your program.

This seems sensible enough at first, but leads to some questionable assertions, such as:

Safety is predictability. When it's impossible to predict what the effects of the input will be (however valid), there is no safety.

They follow this with an example of Ethereum contracts stealing money from the DAO. The example is compelling enough, but again comes with a very strong assertion about the impossibility of securing a language virtual machine:

From the viewpoint of language-theoretic security, a catastrophic exploit in Ethereum was only a matter of time: one can only find out what such programs do by running them. By then it is too late.

I'm not sure that (a) I buy the assertions, or that (b) they provide a useful way to deal with the world as we find it.

Is this even correct?

You can tease out 2 contentions in the first part of the article:

  • there should be a formal language that describes the data, and
  • this language should be as simple as possible, ideally being regular and context-free.

Neither of these are bad ideas - in fact they're both good ideas - but I don't know that I draw the same links between them and security.

Consider PostScript as a possible counter-example. It's a Turing-complete language, so it absolutely cannot have predictable results. It has a well documented specification and executes in a restricted virtual machine. So let's say that it satisfies only the first plank of their argument.

I'd say that PostScript has a good security record, despite being Turing complete. PostScript has been around since 1985 and, apart from the recent bugs in GhostScript, it doesn't have a long history of bugs and exploits. Maybe this is just because no-one has really looked, or maybe it is possible to have reasonably safe complex languages by restricting the execution environment, as PostScript consciously and deliberately does.

Indeed, if you consider the recent spate of GhostScript bugs, perhaps some may be avoided by stricter compliance with a formal language specification. However, most seem to me to arise from the desirability of implementing some of the PostScript functionality in PostScript itself, and some of the GhostScript-specific, stupendously powerful operators exposed to the language to enable this. The bugs involve tricks to allow a user to get access to these operators. A non-Turing-complete language may be sufficient to prevent these attacks, but it is not necessary: just not doing this sort of meta-programming with such dangerous operators would also have worked. Storing the true values of the security state outside of a language-accessible object would also be good.

Is this a useful way to deal with the world as we find it?

My main problem with the general LangSec approach that this article takes is this: to get to their desired world, we need to rewrite a bunch of things with entirely different language foundations. The article talks about HTML and PDFs as examples of unsafe formats, but I cannot imagine the sudden wholesale replacement of either of these - although I would love to be proven wrong.

Can we get even part of the way with existing standards? Kinda-sorta, but mostly no, and to the authors' credit, they are open about this. They argue that the formal definition of the language should be the "most restrictive input definition" - they specifically require you to "give up attempting to accept arbitrarily complex data", and call for "subsetting of many protocols, formats, encodings and command languages, including eliminating unneeded variability and introducing determinism and static values".

No doubt we would be in a better place if people took up these ideas for future programs. However, for our current set of programs and use cases, this is probably not tractable in any meaningful way.

The rest of the paper

The rest of the paper is reasonably interesting. Their general theory is that you should build your parsers based on a formal definition of a language, and that the parser should convert the input data to a set of objects, and then your business logic should deal with those objects. This is the 'recognizer pattern', and is illustrated below:

The recognizer pattern: separate code parses input according to a formal grammar, creating valid objects that are passed to the business logic
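As a trivial illustration of the pattern (a sketch of my own, not the toolkit from the article), a recognizer for a simple regular language could look like this, with the business logic only ever handling validated objects:

# Sketch only: the recognizer pattern for a made-up "sensor_id:temperature" format.
import re
from dataclasses import dataclass

@dataclass(frozen=True)
class Temperature:
    sensor_id: int
    celsius: float

# The "grammar": 1-4 digits, a colon, then a signed decimal with up to two places.
LINE_RE = re.compile(r"^(\d{1,4}):(-?\d{1,3}(?:\.\d{1,2})?)$")

def recognize(line):
    # Accept only strings matching the grammar; reject everything else outright.
    m = LINE_RE.match(line)
    if m is None:
        raise ValueError("rejected input: %r" % line)
    return Temperature(sensor_id=int(m.group(1)), celsius=float(m.group(2)))

def business_logic(reading):
    # Business logic deals only in validated Temperature objects, never raw input.
    return "over temperature" if reading.celsius > 80 else "ok"

print(business_logic(recognize("17:79.5")))         # ok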

In short, the article is full of great ideas if you happen to be parsing a simple language, or are designing not just a parser but a full language ecosystem. They do also provide a binary parser toolkit that might be helpful if you are parsing a binary format that can be expressed with a parser combinator.

Overall, however, I think the burden of maintaining old systems is such that a security paradigm that relies on new code is pretty unlikely, and one that relies on new languages is fatally doomed from the outset. New systems should take up these ideas, yes. But I'd really like to see people grappling with how to deal with the complex and irregular languages that we're stuck with (HTML, PDF, etc) in secure ways.

Praise-Singing Poppler Utilities

Last year I gave a presentation at Linux Users of Victoria entitled Being An Acrobat: Linux and PDFs (there was an additional discussion not in the presentation about embedding Javascript in a PDF and some related security issues, but that's for another post). Part of this presentation was singing the praises of Poppler Utilities (named after the Futurama episode, "The Problem with Popplers"). This is probably the single most common suite of tools I use when dealing with PDFs, with the sole exception of reading and creating them. There are an enormous number of times I have encountered an issue with PDFs and resolved it with something from the poppler-utils suite.

The entire relevant slide from the presentation is reproduced as follows:

Derived from xpdf, Poppler is a software library for rendering PDF documents and is used in a variety of PDF readers (including Evince, KPDF, LibreOffice, Inkscape, Okular, Zathura etc). A collection of tools, poppler-utils, built on Poppler’s API, provides a variety of useful functions e.g.,

pdffonts - lists the fonts used in a PDF (e.g., pdffonts filename.pdf)
pdfimages - extract images from a PDF (e.g., pdfimages -png filename.pdf images/)
pdfseparate - extract single pages from a PDF (e.g., pdfseparate sample.pdf sample-%d.pdf)
pdftohtml - convert PDF to HTML format retaining formatting
pdftops - convert PDF to printable PS format
pdftotext - extract text from a PDF
pdfunite - merges PDFs (pdfunite page{01..13}.pdf combined.pdf)

Recently I had an experience with this that illustrates a practical example of one of the tools. I am currently doing a MSc in Information Systems at the University of Salford. The course content itself is conducted through the Robert Kennedy College in Switzerland. For each assignment the course accepts uploads for one file and one file only, and only in particular formats as well (e.g., docx, pdf, rtf etc). An additional upload on the RKC system will overwrite one's previously submitted assignment file.

If you have multi-part components to an assignment, you will have to export them to a common format, combine them, and upload them as a single document. In a project management course, I ended up with several files, as the assignment demanded a title page, a slideshow with nodes, a main body of the assignment (business case and project product plan), a Gantt chart (created through ProjectLibre), and a reference and appendix file.

At the end of the assignment, I had a Title Page file, a Part A file, a Part B Main Body file, a Part B ProjectLibre file, and a Refs and Appendix file. First I converted them all to PDFs. This is one of the accepted file formats, and is a common export format for the text, slideshow, and project files. Then I used pdfunite to combine them into a single file, e.g.,

pdfunite title.pdf parta.pdf partb.pdf gantt.pdf refs.pdf assignment.pdf

Quite clearly RKC has a limited and arguably poorly designed upload system. But the options are either complain about it, give up, or work around it. Information systems science demands the latter, because we will encounter this all the time. We all have to learn how to work around system limitations. When it comes to PDF limitations, I have found that Poppler Utilities are one of the most useful tools available, and I find that I use the various utilities with almost alarming regularity. So here's a few more that I didn't mention in my initial presentation due to time constraints:

pdfdetach - extract embedded documents from a PDF (e.g., pdfdetach --saveall filename.pdf)
pdfinfo - print file information from a PDF (title, subject, keywords, author, creator etc) (e.g., pdfinfo filename.pdf)
pdftocairo - convert pdf to a png/jpeg/tiff/ps/eps/svg using cairo (e.g., pdftocairo -svg filename.pdf filename.svg)
pdftoppm - convert pdf to Portable Pixmap bitmaps (e.g., pdftoppm filename.pdf filename)

March 30, 2019

Digital government: it all starts with open

This is a short video I did on the importance of openness for digital government, for the EngageTech Forum 2018. I’ve had a few people reuse it for other events so I thought I should blog it properly :) Please see the transcript below. 

<Conference introductory remarks>

I wanted to talk about why openness and engagement is so critical for our work in a modern public service.

For me, looking at digital government, it’s not just about digital services, it’s about how we transform governments for the 21st century: how we do service delivery, engagement, collaboration, and how we do policy, legislation and regulation. How we make public services fit for purpose so they can serve you, the people, communities and economy of the 21st century.

For me, a lot of people think about digital and think about technology, but open government is a founding premise, a founding principle for digital government. Open that’s not digital doesn’t scale, and digital that’s not open doesn’t last. That doesn’t just mean looking at things like open source, open content and open APIs, but it means being open. Open to change. Being open to people and doing things with people, not just to people.

There’s a fundamental cultural, technical and process shift that we need to make, and it all starts with open.

<closing conference remarks>

March 29, 2019

Installing an xmonad Environment on NixOS

NixOS Coin by Craige McWhirter

Xmonad is a very fast, reliable and flexible window manager for Linux and other related operating systems. As I recently shifted from Debian + Propellor to NixOS + NixOps, I now needed to redefine my Xmonad requirements for the new platform.

TL;DR

  • Grab my xmonad.nix file and import it in your /etc/nixos/configuration.nix
  • You can also grab my related xmonad.hs, xmobarrc and session files to use the complete setup.

An Example

At the time of writing, I used the below xmonad.nix to install my requirements. My current version can be found here.

# Configuration for my xmonad desktop requirements

{ config, pkgs, ... }:

{

  services.xserver.enable = true;                        # Enable the X11 windowing system.
  services.xserver.layout = "us";                        # Set your preferred keyboard layout.
  services.xserver.desktopManager.default = "none";      # Unset the default desktop manager.
  services.xserver.windowManager = {                     # Open configuration for the window manager.
    xmonad.enable = true;                                # Enable xmonad.
    xmonad.enableContribAndExtras = true;                # Enable xmonad contrib and extras.
    xmonad.extraPackages = hpkgs: [                      # Open configuration for additional Haskell packages.
      hpkgs.xmonad-contrib                               # Install xmonad-contrib.
      hpkgs.xmonad-extras                                # Install xmonad-extras.
      hpkgs.xmonad                                       # Install xmonad itself.
    ];
    default = "xmonad";                                  # Set xmonad as the default window manager.
  };

  services.xserver.desktopManager.xterm.enable = false;  # Disable NixOS default desktop manager.

  services.xserver.libinput.enable = true;               # Enable touchpad support.

  services.udisks2.enable = true;                        # Enable udisks2.
  services.devmon.enable = true;                         # Enable external device automounting.

  services.xserver.displayManager.sddm.enable = true;    # Enable the default NixOS display manager.
  services.xserver.desktopManager.plasma5.enable = true; # Enable KDE, the default NixOS desktop environment.

  # Install any additional fonts that I require to be used with xmonad
  fonts.fonts = with pkgs; [
    opensans-ttf             # Used in my xmobar configuration
  ];

  # Install other packages that I require to be used with xmonad.
  environment.systemPackages = with pkgs; [
    dmenu                    # A menu for use with xmonad
    feh                      # A light-weight image viewer to set backgrounds
    haskellPackages.libmpd   # Shows MPD status in xmobar
    haskellPackages.xmobar   # A Minimalistic Text Based Status Bar
    libnotify                # Notification client for my Xmonad setup
    lxqt.lxqt-notificationd  # The notify daemon itself
    mpc_cli                  # CLI for MPD, called from xmonad
    scrot                    # CLI screen capture utility
    trayer                   # A system tray for use with xmonad
    xbrightness              # X11 brightness and gamma software control
    xcompmgr                 # X compositing manager
    xorg.xrandr              # CLI to X11 RandR extension
    xscreensaver             # My preferred screensaver
    xsettingsd               # A lightweight desktop settings server
  ];

}

This provides my xmonad environment with everything I need for xmonad to run as configured.

Herringback

It occurs to me that I never wrote up the end result of the support ticket I opened with iiNet after discovering significant evening packet loss on our fixed wireless NBN connection in August 2017.

The whole saga took about a month. I was asked to run a battery of tests (ping, traceroute, file download and speedtest, from a laptop plugged directly into the NTD) three times a day for three days, then send all the results in so that a fault could be lodged. I did this, but somehow there was a delay in the results being communicated, so that by the time someone actually looked at them, they were considered stale, and I had to run the whole set of tests all over again. It’s a good thing I work from home, because otherwise there’s no way it would be possible to spend half an hour three times a day running tests like this. Having finally demonstrated significant evening slowdowns, a fault was lodged, and eventually NBN Co admitted that there was congestion in the evenings.

We have investigated and the cell which this user is connected to experiences high utilisation during busy periods. This means that the speed of this service is likely to be reduced, particularly in the evening when more people are using the internet.

nbn constantly monitors the fixed wireless network for sites which require capacity expansion and we aim to upgrade site capacity before congestion occurs, however sometimes demand exceeds expectations, resulting in a site becoming congested.

This site is scheduled for capacity expansion in Quarter 4, 2017 which should result in improved performance for users on the site. While we endeavour to upgrade sites on their scheduled date, it is possible for the date to change.

I wasn’t especially happy with that reply after a support experience that lasted for a month, but some time in October that year, the evening packet loss became less, and the window of time where we experienced congestion shrank. So I guess they did do some sort of capacity expansion.

It’s been mostly the same since then, i.e. slower in the evenings than during the day, but, well, it could be worse than it is. There was one glitch in November or December 2018 (poor speed / connection issues again, but this time during the day) which resulted in iiNet sending out a new router, but I don’t have a record of this, because it was a couple of hours of phone support that for some reason never appeared in the list of tickets in the iiNet toolbox, and even if it had, once a ticket is closed, it’s impossible to click it to view the details of what actually happened. It’s just a subject line, status and last modified date.

Fast forward to Monday March 25 2019 – a day with a severe weather warning for damaging winds – and I woke up to 34% packet loss, ping times all over the place (32-494ms), continual disconnections from IRC and a complete inability to use a VPN connection I need for work. I did the power-cycle-everything dance to no avail. I contemplated a phone call to support, then tethered my laptop to my phone instead in order to get a decent connection, and decided to wait it out, confident that the issue had already been reported by someone else after chatting to my neighbour.

hideous-packet-loss-march-2019

Tuesday morning it was still horribly broken, so I unplugged the router from the NTD, plugged a laptop straight in, and started running ping, traceroute and speed tests. Having done that I called support and went through the whole story (massive packet loss, unusable connection). They asked me to run speed tests again, almost all of which failed immediately with a latency error. The one that did complete showed about 8Mbps down, compared to the usual ~20Mbps during the day. So iiNet lodged a fault, and said there was an appointment available on Thursday for someone to come out. I said fine, thank you, and plugged the router back in to the NTD.

Curiously, very shortly after this, everything suddenly went back to normal. If I was a deeply suspicious person, I’d imagine that because I’d just given the MAC address of my router to support, this enabled someone to reset something that was broken at the other end, and fix my connection. But nobody ever told me that anything like this happened; instead I received a phone call the next day to say that the “speed issue” I had reported was just regular congestion and that the tower was scheduled for an upgrade later in the year. I thanked them for the call, then pointed out that the symptoms of this particular issue were completely different to regular congestion and that I was sure that something had actually been broken, but I was left with the impression that this particular feedback would be summarily ignored.

I’m still convinced something was broken, and got fixed. I’d be utterly unsurprised if there had been some problem with the tower on the Sunday night, given the strong winds, and it took ’til mid-Tuesday to get it sorted. But we’ll never know, because NBN Co don’t publish information about congestion, scheduled upgrades, faults and outages anywhere the general public can see it. I’m not even sure they make this information consistently available to retail ISPs. My neighbour, who’s with a different ISP, sent me a notice that says there’ll be maintenance/upgrades occurring on April 18, then again from April 23-25. There’s nothing about this on iiNet’s status page when I enter my address.

There was one time in the past few years though, when there was an outage that impacted me, and it was listed on iiNet’s status page. It said “customers in the area of Herringback may be affected”. I initially didn’t realise that meant me, as I’d never heard of a suburb, region, or area called Herringback. Turns out it’s the name of the mountain our NBN tower is on.

LUV April 2019 Main Meeting: EasyBuild

Apr 2 2019 19:00
Apr 2 2019 21:00
Location: 
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

7:00 PM to 9:00 PM Tuesday, April 2, 2019
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

Speakers:

  • Lev Lafayette: EasyBuild

 

Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.

April 2, 2019 - 19:00


LUV April 2019 Workshop: Computerbank tour

Apr 27 2019 15:00
Apr 27 2019 19:00
Location: 
Computerbank Victoria, 1 Stawell St. North Melbourne

PLEASE NOTE DIFFERENT TIME AND LOCATION

Brian Salter-Duke will be giving a tour of the new premises of Computerbank Victoria at 1 Stawell St. North Melbourne starting at 15:00, followed by a meeting at Errol's Cafe at 69-71 Errol St. North Melbourne from 17:00 onward.

 

Linux Users of Victoria is a subcommittee of Linux Australia.

April 27, 2019 - 15:00

March 26, 2019

Brilliant Smart Wifi plug with Tasmota


A couple of weeks ago I was playing with Tuya derived smart light globes (Mirabella Genio from K-Mart Australia in my case, but there are a variety of other options as well). Now Bunnings has the Brilliant Smart Wifi Plug on special for $20 and I decided to give that a go as well, given it is also Tuya derived.

The basic procedure for OTA flashing was the same as flashing the globes, except that you hold down the button on the device for five seconds to put it into flash mode. That all worked brilliantly, until I appear to have fat fingered my wifi details in Tasmota — when I rebooted the device it never appeared on my network.

That would be much more annoying on the globes, but it turns out these smart plugs are really easy to open and that Tuya has documented the pin out of the controlling microprocessor. So, I ended up temporarily soldering some cables to the microprocessor to debug what had gone wrong. It should be noted that as a soldering person I make a great software engineer:

Jumper wires soldered to the serial port.

Once you’ve connected with a serial console, its pretty obvious who can’t be trusted to type their wifi password correctly:

I can’t have nice things.

At this point I was in the process of working out how to use esptool to re-flash the plug when I got super lucky. However, first… Where the hell is GPIO0 (the way you turn on flashing mode)? Its not broken out on the pins for the MCU, but as a kind redditor pointed out, it is exposed on a pad on the back of the board:

The cunningly hidden GPIO0.

…and then I got lucky. You see, to put the MCU into flashing mode you short GPIO0 to ground. I was having trouble getting that to work with esptool, so I had a serial console attached to see what was happening. There I was, shorting GPIO0 out over and over trying to get the magic to work. However, Tasmota also sets up a button on GPIO0 by default, because the sonoffs have a hardware button on that pin. If you hit that button with four short presses, you put the device back into captive portal configuration mode.

Once I was back in that mode I could just use a laptop over wifi to re-enter the wifi password and I was good to go. In hindsight I didn’t need the serial port at all, since I could have just powered the device and shorted that pin four times, but it sure was nice to be told what was happening on the serial console while poking around.


March 22, 2019

Seeding sccache for Faster Brave Browser Builds

Compiling the Brave Browser (based on Chromium) on Linux can take a really long time and so most developers use sccache to cache object files and speed up future re-compilations.

Here's the cronjob I wrote to seed my local cache every work day to pre-compile the latest builds:

30 23 * * 0-4   francois  /usr/bin/chronic /home/francois/bin/seed-brave-browser-cache

and here are the contents of that script:

#!/bin/bash
set -e

# Set the path and sccache environment variables correctly
source ${HOME}/.bashrc-brave
export LANG=en_CA.UTF-8

cd ${HOME}/devel/brave-browser-cache

echo "Environment:"
echo "- HOME = ${HOME}"
echo "- PATH = ${PATH}"
echo "- PWD = ${PWD}"
echo "- SHELL = ${SHELL}"
echo "- BASH_ENV = ${BASH_ENV}"
echo

echo $(date)
echo "=> Clean up repo and delete old build output"
rm -rf src/out node_modules src/brave/node_modules
git clean -f -d
git checkout HEAD package-lock.json

echo $(date)
echo "=> Update repo"
git pull
npm install
npm run init

echo $(date)
echo "=> Debug build"
killall sccache || true
ionice nice timeout 4h npm run build || ionice nice timeout 4h npm run build
ionice nice ninja -C src/out/Debug brave_unit_tests
ionice nice ninja -C src/out/Debug brave_browser_tests
echo

echo $(date)
echo "=>Release build"
killall sccache || true
ionice nice timeout 5h npm run build Release || ionice nice timeout 5h npm run build Release
ionice nice ninja -C src/out/Release brave_unit_tests
ionice nice ninja -C src/out/Release brave_browser_tests
echo

echo $(date)
echo "=> Delete build output"
rm -rf src/out

March 16, 2019

I, Robot


Not the book of the movie, but the collection of short stories by Isaac Asimov. I’ve read this book several times before and enjoyed it, although this time I found it to be more dated than I remembered, both in its characterisation of technology as well as its handling of gender. Still enjoyable, but not the best book I’ve read recently.

I, Robot
Isaac Asimov
Fiction
Spectra
2004
224

The development of robot technology to a state of perfection by future civilizations is explored in nine science fiction stories.


March 10, 2019

Running Home Assistant on Fedora with Docker

Home Assistant is a really great, open source home automation platform written in Python which supports hundreds of components. They have a containerised version called Hass.io which can run on a bunch of hardware and has a built-in marketplace to make the running of addons (like Let’s Encrypt) easy.

I’ve been running Home Assistant on a Raspberry Pi for a couple of years, but I want something that’s more powerful and where I have more control. Here’s how you can use the official Home Assistant containers on Fedora (note that this does not include their Hass.io marketplace).

First, install Fedora Server edition, which comes with the handy web UI for managing the system called Cockpit.

Once you’re up and running, install Docker and the Cockpit plugin.

sudo dnf install -y docker cockpit-docker

Now we can start and enable the Docker daemon and restart cockpit to load the Docker plugin.

sudo systemctl start docker && sudo systemctl enable docker
sudo systemctl restart cockpit

Create a location for the Home Assistant configuration and set the appropriate SELinux context. This lets you modify the configuration directly from the host and restart the container to pick up the change.

sudo mkdir -p /hass/config
sudo chcon -Rt svirt_sandbox_file_t /hass

Start up a container called hass using the Home Assistant Docker image which will start automatically thanks to the restart option. We pass through the /hass/config directory on the host as /config inside the container.

docker run --init -d \
--restart unless-stopped \
--name="hass" \
-v /hass/config/:/config \
-v /etc/localtime:/etc/localtime:ro \
--net=host \
homeassistant/home-assistant

You should be able to see the container starting up.

sudo docker ps

If you need to, you can get the logs for the container.

sudo docker logs hass

Once it’s started, you should see port 8123 listening on the host.

sudo ss -ltnp |grep 8123

Finally, enable port 8123 on the firewall to access the service on your network.

sudo firewall-cmd --zone=FedoraServer --add-port=8123/tcp
sudo firewall-cmd --runtime-to-permanent

Now browse to the IP address of your server on port 8123 and you should see Home Assistant. Create an account to get started!

March 08, 2019

Secure Data in Public Configuration Management With Propellor

Steampunk propeller ring by Daniel Proulx

TL;DR

List fields and contexts:

$ propellor --list-fields

Set a field for a particular context:

$ propellor --set 'SshAuthorizedKeys "myuser"' yourServers < authKeys

Dump a field from a specific context:

$ propellor --dump 'SshAuthorizedKeys "myuser"' yourServers

An Example

When using Propellor for configuration management, you can utilise GPG encryption to encrypt data sets. This enables you to leverage public git repositories for your centralised configuration management needs.

To list existing fields, you can run:

$ propellor --list-fields

which will not only list existing fields but will helpfully also list fields that would be used if set:

Missing data that would be used if set:
Field                             Context          Used by
-----                             -------          -------
'Password "myuser"'               'yourDesktops'   your.host.name
'CryptPassword "myuser"'          'yourServers'    your.server.name
'PrivFile "/etc/mail/dkim.key"'   'mailServers'    your.mail.server

You can set these fields with input from either STDIN or files prepared earlier.

For example, if you have public SSH keys you wish to distribute, you can place them into a file and then use that file to populate the fields of an appropriate context. The contents of an example authorized_keys file, which we'll call authKeys, may look like this:

ssh-ed25519 eetohm9doJ4ta2Joo~P2geetoh6aBah9efu4ta5ievoongah5feih2eY4fie9xa1ughi you@host1
ssh-ed25519 choi7moogh<i2Jie6uejoo6ANoMei;th2ahm^aiR(e5Gohgh5Du-oqu1roh6Mie4shie you@host2
ssh-ed25519 baewah%vooPho2Huofaicahnob=i^ph;o1Meod:eugohtiuGeecho2eiwi.a7cuJain6 you@host3

To add these keys to the appropriate users for the hosts of a particular context you could run:

$ propellor --set 'SshAuthorizedKeys "myuser"' yourServers < authKeys

To verify that the fields for this context have the correct data, you can dump it:

$ propellor --dump 'SshAuthorizedKeys "myuser"' yourServers
gpg: encrypted with 256-bit ECDH key, ID 5F4CEXB7GU3AHT1E, created 2019-03-08
      "My User <myuser@my.domain.tld>"
      ssh-ed25519 eetohm9doJ4ta2Joo~P2geetoh6aBah9efu4ta5ievoongah5feih2eY4fie9xa1ughi you@host1
      ssh-ed25519 choi7moogh<i2Jie6uejoo6ANoMei;th2ahm^aiR(e5Gohgh5Du-oqu1roh6Mie4shie you@host2
      ssh-ed25519 baewah%vooPho2Huofaicahnob=i^ph;o1Meod:eugohtiuGeecho2eiwi.a7cuJain6 you@host3

When you next spin Propellor for the desired hosts, those SSH public keys will be installed into the authorized_keys file for the user myuser on hosts that belong to the yourServers context.

Setting and Storing Passwords

One of the most obvious and practical uses of this feature is to set secure data that needs to be distributed, such as passwords or certificates. We'll use passwords for this example.

Create a hash of the password you wish to distribute:

$ mkpasswd -m sha-512 > /tmp/deleteme
Password:
$ cat /tmp/deleteme
$6$cyxX.TmGPZWuqQu$LxhbVBaUnFmevOVi1V1NApZA0TCcSkK1241eiZwhhBQTm/PpjoLHe3OMnbjeswa6rgzNAq3pXTB4KjvfF1iXA1

Now that we have that file, we can use it as input for Propellor:

$ propellor --set 'CryptPassword "myuser"' yourServers < /tmp/deleteme
Enter private data on stdin; ctrl-D when done:
gpg: encrypted with 256-bit ECDH key, ID 5F4CEXB7GU3AHT1E, created 2019-03-08
      "My User <myuser@my.domain.tld>"
gpg: WARNING: standard input reopened
Private data set.

Tidy up:

$ rm /tmp/deleteme

You're now ready to deploy that password for that user to those servers.

Mirabella Genio smart lights with Tasmota and Home Assistant


One of the things I like about Home Assistant is that it allows you to take hardware from a bunch of various vendors and stitch it together into a single consistent interface. So for example I now have five home automation vendor apps on my phone, but don’t use any of them because Home Assistant manages everything.

A concrete example — we have Philips Hue lights, but they’re not perfect. They’re expensive, require a hub, and need to talk to a Philips data centre to function (i.e. the internet needs to work at my house, which isn’t always true thanks to the failings of the Liberal Party).

I’d been meaning to look at the cheapo smart lights from Kmart for a while, and finally got around to it this week. For $15 you can pickup a dimmable white globe, and for $29 you can have a RGB one. That’s heaps cheaper than the Hue options. Even better, the globes are flashable to run the open source Tasmota stack, which means no web services required!

So here are some instructions on flashing these globes to be useful:

Buy the globes. I bought this warm white dimmable and this RGB option.

Flash to Tasmota. This was a little bit fiddly, but was mostly about getting the sequence to put the globes into config mode right (turn off for 10 seconds, turn on, turn off, turn on, turn off, turn on). Wait a few seconds and then expect the lamp to blink rapidly, indicating it's in config mode. For Canberra people I now have a Raspberry Pi setup to do this easily, so we can run a flashing session sometime if people want.

Configure tasmota. This is really up to you, but the globes need to know local wifi details, where your MQTT server is, and stuff like that.

And then configure Home Assistant. The example of how to do that from my house is on github.


March 06, 2019

Authentication in WordPress

WebAuthn is now a W3C recommendation, bringing us one step closer to not having to use passwords anymore. If you’re not familiar with WebAuthn, here’s a little demo (if you don’t own a security key, it’ll probably work best on an Android phone with a fingerprint reader).

That I needed to add a disclaimer for the demo indicates the state of WebAuthn authenticator support. It’s nice when it works, but it’s clearly still in progress, and that progress varies. WebAuthn also doesn’t cover how the authenticator device works, that falls under the proposed CTAP standard. They work together to form the FIDO2 Project. Currently, the most reliable option is to purchase a security key, but quality varies wildly, and needing to carry around an extra dongle just for logging in to sites is no fun.

What WordPress Needs

Anything that replaces passwords needs to provide some extra benefit, without losing the strengths of the password model:

  • Passwords are universally understood as an authentication model.
  • They’re portable: you don’t need a special app or token to use them anywhere.
  • They’re extendable: strong passwords can be enforced as needed. Additional authentication (2FA codes, for example) can be added, too.

Magic login links are an interesting step in this direction. The WordPress mobile apps added magic login support for WordPress.com accounts a while ago; I’d love to see this working on all WordPress sites.

A WebAuthn-based model would be a wonderful future step, once the entire user experience is more polished.

The password-less future hasn’t quite arrived yet, but we’re getting closer.

March 04, 2019

What If?


More correctly titled “you die horribly and it probably involves plasma”, this light-hearted and fun read explores serious answers to silly scientific questions. The footnotes are definitely the best bit. A really enjoyable read.

What If?
Randall Munroe
Humor
Houghton Mifflin Harcourt
September 2, 2014
320 pages

The creator of the incredibly popular webcomic xkcd presents his heavily researched answers to his fans' oddest questions, including “What if I took a swim in a spent-nuclear-fuel pool?” and “Could you build a jetpack using downward-firing machine guns?”


Audiobooks – February 2019

Tamed: Ten Species that Changed our World by Alice Roberts

Plenty of content (14 hours) and not too dumbed down. About 8 of the 10 species are the ones you’d expect. 8/10

It Won’t Be Easy: An Exceedingly Honest (and Slightly Unprofessional) Love Letter to Teaching by Tom Rademacher

A breezy little book about the realities of teaching (at least in the US). Interesting to outsiders & hopefully useful to those in the profession. 7/10

The Hobbit by J. R. R. Tolkien, read by Rob Inglis

A good audio-edition of the book. Unabridged & really the default one for most people. I alternated chapters of this with the excellent Prancing Pony Podcast commentaries on those chapters. 9/10

The Life of Greece: The Story of Civilization, Volume 2 (The Story of Civilization series) by Will Durant

32 hours on the history of Ancient Greece. Seemed to cover just about everything. Written in the 1930s so probably a little out-of-date in places. 7/10


March 03, 2019

Problems with Dreamhost


This site is hosted at Dreamhost, and for reasons I can’t explain right now isn’t accessible from large chunks of Australia. It seems to work fine from elsewhere though. Dreamhost certainly has an explanation — they allege, in emails that take 24 hours and that you can’t reply to, that it’s because WordPress is using too much RAM.

However, they don’t explain why that’s suddenly happened when it’s been fine for years, and they certainly don’t explain why it works from some places but not others, or why other Dreamhost sites are also unreachable from the networks having issues.

It’s time for a new hosting solution I think, although not bothering to have hosting might also be that solution.


LPCNet Quantiser – wideband speech at 1700 bits/s

I’ve been working with Neural Net (NN) speech synthesis using LPCNet.

My interest is digital voice over HF radio. To get a NN codec “on the air” I need a fully quantised version at 2000 bit/s or below. The possibility of 8kHz audio over HF radio is intriguing, so I decided to experiment with quantising the LPCNet features. These consist of 18 spectral energy samples, pitch, and the pitch gain which is effectively a measure of voicing.

So I have built a Vector Quantiser (VQ) for the DCT-ed 18 log-magnitude samples. LPCNet updates these every 10ms, which is a bit too fast for my target bit rate. So I decimate to say 30ms, then use linear interpolation to reconstruct the 10ms frames at the decoder. The spectrum changes slowly (most of the time), so I quantise the difference between frames to save a few bits.
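To make the decimate-then-interpolate idea concrete, here is a minimal Python/numpy sketch (my illustration, not the author's actual code; the function name and frame counts are made up) of keeping every third 10ms feature vector and linearly interpolating the 10ms frames back at the decoder:

import numpy as np

def decimate_and_interpolate(features, dec=3):
    # features: one row per 10ms frame, e.g. 18 DCT-ed log-magnitude samples per row.
    # Keep every dec-th frame (dec=3 gives a 30ms update rate) ...
    kept_idx = np.arange(0, len(features), dec)
    kept = features[kept_idx]
    # ... then rebuild the 10ms frames by linear interpolation, per feature dimension.
    out = np.empty_like(features[:kept_idx[-1] + 1])
    for k in range(features.shape[1]):
        out[:, k] = np.interp(np.arange(len(out)), kept_idx, kept[:, k])
    return out

# toy example: 100 frames (1 second) of 18 coefficients
frames = np.random.randn(100, 18)
reconstructed = decimate_and_interpolate(frames, dec=3)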

Detailed Results

I’ve developed a script that generates a bunch of samples, plots various statistics, and builds a HTML page to summarise the results. Here is the current page, including samples for the fully quantised prototype codec at three bit rates between around 2000 and 1400 bits/s. If anyone would like more explanation of that page, just ask.

Discussion of Results

I can hear “birch” losing some quality at the 20ms decimation step. When training my own NN, I have had quite a bit of trouble with very rough speech when synthesising “canadian”. I’m learning that roughness in NN synthesis means more training required, the network just hasn’t experienced this sort of speaker before. The “canadian” sample is quite low pitch so I may need some more training material with low pitch speakers.

My quantisation scheme works really well on some of the carefully spoken Harvard sentences (oak, glue), in ideal recording conditions. However with more realistic, quickly spoken speech with real world background noise (separately, wanted) it starts to sound vocoder-ish (albeit a pretty good vocoder).

One factor is the frame rate decimation from 10 to 20-30ms, which I used to get the bit rate beneath 2000 bit/s. A better quantisation scheme, or LPCNet running on 20ms frames, could improve this. Or we could just run it at greater than 2000 bit/s (say for VHF/UHF two way radio).

Comparison to Wavenet

Source Listen
Wavenet, Codec 2 encoder, 2400 bits/s Listen
LPCnet unquantised, 10ms frame rate Listen
Quantised to 1733 bits/s (44bit/30ms) Listen

The “separately” sample from the Wavenet team sounds better to me. Ironically, these samples use my Codec 2 encoder, running at just 8kHz! It’s difficult to draw broad conclusions from this, as we don’t have access to a Wavenet system to try many different samples. All codecs tend to break down under certain conditions and samples.

However it does suggest (i) we can eventually get higher quality from NN synthesis and (ii) it is possible to encode high quality wideband speech with features covering a narrow spectral range (e.g. 200-3800Hz for the Codec 2 encoder). The 18 element vectors (covering DC to 8000Hz) I’m currently using ultimately set the bit rate of my current system. After a few VQ stages the elements are independent Gaussians and reduction in quantiser noise is very slow as bits are added.

The LPCNet engine has several awesome features: it’s open source, runs in real time on regular CPUs, and is available for us to test on a wide variety of samples. The speech quality I am achieving with even my first attempts is rather good compared to any other speech codecs I have played with at these bit rates – in either the open or closed source worlds.

Tips and Observations

I’ve started training my own models, and discovered that if you get rough speech – you probably need more data. For example when I tried training on 1E6 vectors, I had a few samples sounding rough when I tested the network. However with 5E6 vectors, it works just fine.

The LPCNet dump_data --train mode program helps you by being very clever. It “fuzzes” the speech frequency, gain, and adds a little noise. If the NN hasn’t experienced a particular combination of features before, it tends to get lost – and you get rough sounding speech.

I found that 10 Epochs of 5E6 vectors gives me good speech quality on my test samples. That takes about a day with my somewhat underpowered GPU. In fact, most of the training seems to happen on the first few Epochs:

Here is a plot of the training and validation loss for my training database:

This plot shows how much the loss changes on each Epoch, not very much, but not zero. I’m unsure if these small gains lead to meaningful improvements over many Epochs:

I looked into the LPCNet pitch and voicing estimation. Like all estimators (including those in Codec 2), they tend to make occasional mistakes. That’s what happens when you try to fit neat signal processing models to real-world biological signals. Anyway, the amazing thing is that LPCNet doesn’t care very much. I have some samples where pitch is all over the place but the speech still sounds OK.

This is really surprising to me. I’ve put a lot of time into the Codec 2 pitch estimators. Pitch errors are very obvious in traditional, model based low bit rate speech codecs. This suggests that with NNs we can get away with less pitch information – which means fewer bits and better compression. Same with voicing. This leads to intriguing possibilities for very low bit rate (a few 100 bit/s) speech coding.

Conclusions, Further Work and FreeDV 2020

Overall I’m pleased with my first attempt at quantisation. I’ve learnt a lot about VQ and NN synthesis and carefully documented (and even scripted) my work. The learning and experimental experience has been very satisfying.

Next I’d like to get one of these candidates on the air, see how it sounds over real world digital radio channels, and find out what happens when we get bit errors. I’m a bit nervous about predictive quantisation on radio channels, as it causes errors to propagate in time. However I have a good HF modem and FEC, and some spare bits to add some non-predictive quantisation if needed.

My design for a new, experimental “FreeDV 2020” mode employing LPCNet uses just 1600 Hz of RF bandwidth for 8kHz bandwidth speech, and should run at 10dB SNR on a moderate fading channel.

Here is a longer example of LPCNet at 1733 bit/s compared to HF SSB at a SNR of 10dB (we can send error free LPCNet through a similar HF channel). The speech sample is from the MP3 source of the Australian weekly WIA broadcast:

Source Listen
SSB simulation at 10dB SNR Listen
LPCNet Quantised to 1733 bits/s (44bit/30ms) Listen
Mixed LPCNet Quantised and SSB (thanks Peter VK2TPM!) Listen

This is really new technology, and there is a lot to explore. The work presented here represents my initial attempt at quantisation with the LPCNet synthesis engine, and is hopefully useful for other people who would like to experiment in the area.

Acknowledgements

Thanks Jean-Marc for developing the LPCnet technology, making the code open source, and answering my many questions.

Links

LPCnet introductory page.

The source code for my quantisation work (and notes on how to use it) is available as a branch on the GitHub LPCNet repo.

WaveNet and Codec 2

March 01, 2019

Connecting a VoIP phone directly to an Asterisk server

On my Asterisk server, I happen to have two on-board ethernet interfaces. Since I only used one of these, I decided to move my VoIP phone from the local network switch to being connected directly to the Asterisk server.

The main advantage is that this phone, running proprietary software of unknown quality, is no longer available on my general home network. Most importantly though, it no longer has access to the Internet, without my having to firewall it manually.

Here's how I configured everything.

Private network configuration

On the server, I started by giving the second network interface a static IP address in /etc/network/interfaces:

auto eth1
iface eth1 inet static
    address 192.168.2.2
    netmask 255.255.255.0

On the VoIP phone itself, I set the static IP address to 192.168.2.3 and the DNS server to 192.168.2.2. I then updated the SIP registrar IP address to 192.168.2.2.

The DNS server actually refers to an unbound daemon running on the Asterisk server. The only configuration change I had to make was to listen on the second interface and allow the VoIP phone in:

server:
    interface: 127.0.0.1
    interface: 192.168.2.2
    access-control: 0.0.0.0/0 refuse
    access-control: 127.0.0.1/32 allow
    access-control: 192.168.2.3/32 allow

Finally, I opened the right ports on the server's firewall in /etc/network/iptables.up.rules:

-A INPUT -s 192.168.2.3/32 -p udp --dport 5060 -j ACCEPT
-A INPUT -s 192.168.2.3/32 -p udp --dport 10000:20000 -j ACCEPT

Accessing the admin page

Now that the VoIP phone is no longer available on the local network, it's not possible to access its admin page. That's a good thing from a security point of view, but it's somewhat inconvenient.

Therefore I put the following in my ~/.ssh/config to make the admin page available on http://localhost:8081 after I connect to the Asterisk server via ssh:

Host asterisk
    LocalForward 8081 192.168.2.3:80

February 28, 2019

LUV March 2019 Workshop: 30th Anniversary of the Web / Federated Social Media

Mar 16 2019 12:30
Mar 16 2019 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

This month we will celebrate the 30th anniversary of the World Wide Web with a discussion of its past, present and future.  Andrew Pam will also demonstrate and discuss the installation, operation and use of federated social media platforms including Diaspora and Hubzilla.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121.  Late arrivals please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria is a subcommittee of Linux Australia.


LUV March 2019 Main Meeting: ZeroTier / Ethics in the computer realm

Mar 5 2019 19:00
Mar 5 2019 21:00
Location: 
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

PLEASE NOTE LATER START TIME

7:00 PM to 9:00 PM Tuesday, March 5, 2019
Kathleen Syme Library, 251 Faraday Street Carlton VIC 3053

Speakers:

  • Adrian Close: ZeroTier
  • Enno Davids: Ethics - Weasel words, weasel words, weasel words, bad thing

 

Many of us like to go for dinner nearby after the meeting, typically at Brunetti's or Trotters Bistro in Lygon St.  Please let us know if you'd like to join us!

Linux Users of Victoria is a subcommittee of Linux Australia.


Propagating Native Plants

by Alec M. Blomberry & Betty Maloney

Propagating Australian Plants

I have over 1,000 Diospyros Geminata (scaly ebony) seedlings growing in my shade house (in used toilet rolls). I'd collected the seeds from the (delicious) fruit in late 2017 (they appear to fruit based on rainfall - no fruit in 2018) and they're all still rather small.

All the literature stated that they were slow growing. I may have been more dismissive of this than I needed to be.

I'm growing these for landscape scale planting, so it's going to be a while between gathering the seeds (mid 2017) and planting the trees (maybe mid 2019).

So I needed to look into other forms of propagation, and either cutting or aerial layering appear to be the way to go, as I already have large numbers of mature Diospyros Geminata on our property or nearby.

The catch being that I know nothing of either cutting or aerial layering and in particular I want to do this at a reasonable scale (ie: possibly thousands).

So this is where Propagating Australian Plants comes in.

Aerial Layering

It's a fairly dry and academic read that feels like it may be more of an introductory guide for botany students than for a lay person such as myself.

Despite being last published in 1994 by a publisher that no longer exists and having a distinct antique feel to it, the information within is crisp and concise with clear and helpful illustrations.

Highly recommended if you're starting to propagate natives as I am.

Although I wish you luck picking up a copy - I got mine from an Op Shop. At least the National Library of Australia appears to have a copy.

February 27, 2019

FreeDV QSO Party 2019 Part 1

My local radio club, the Amateur Radio Experimenters Group (AREG), have organised a special FreeDV QSO Party Weekend from April 27th 0300z to April 28th 0300z 2019. This is a great chance to try out FreeDV, work Australia using open source HF digital voice, and even talk to me!

All the details, including frequencies, times and the point scoring system, are over on the AREG site.

February 26, 2019

5 axis cnc fun!

The 5th axis build came together surprisingly well. I had expected much more resistance getting the unit to be known to both Fusion 360 and LinuxCNC. There is still some tinkering to be done for sure, but I can get some reasonable results already. The video below gives an overview of the design:



Shown below is a silent movie of a few test jobs I created to see how well tool contact would be maintained during motion in the A and B axes while X, Y and Z are moved to keep the tool in the right position. This is the flow toolpath in Fusion 360 in action. Unfamiliarity with these CAM paths makes for a learning curve, which is interesting when paired with a custom-made 5th axis that you are trying to debug at the same time.



I haven't tested how well the setup works when cutting harder materials like alloy yet. It is much quieter and more forgiving to test cutting on timber, where you can be reasonably sure about the toolpaths and that you are not going to accidentally crash too deep into the material after a 90 degree rotation.


New Dark Age: Technology and the End of the Future

by James Bridle

New Dark Age: Technology and the End of the Future

It may be my first book for 2019 but I'm going to put it out there: this is my must read book for 2019 already. Considering it was published in 2018 to broad acclaim, it may be a safe call.

tl;dr; Read this book. It's well resourced, thoroughly referenced and well thought out, with the compiled information and the lines drawn between facts potentially redrawing the way you see our industry and the world. If you're already across the issues, the hard facts will still cause you to draw breath.

I read this book in bursts over 4 weeks, each chapter packing its own informative punch. The narrative first grabbed my attention on page 4, where the weakness of learning to code alone was fleshed out.

"Computational thinking is predominant in the world today, driving the worst trends in our societies and interactions, and must be opposed by real systemic literacy." page 4

Where it is argued that systemic literacy is much more important than learning to code - with a humorous but fitting plumbing analogy in the mix.

One of the recurring threads in the book is the titular "New Dark Age", with points being drawn back to the various actions present in modern society that are actively reducing our knowledge.

"And so we find ourselves today connected to vast repositories of knowledge, and yet we have not learned to think. In fact, the opposite is true: that which was intended to enlighten the world in practice darkens it. The abundance of information and the plurality of world-views now accessible to us through the Internet are not producing a coherent consensus reality, but one riven by fundamentalist insistence on simplistic narratives, conspiracy theories, and post-factual politics." page 10

Also covered are more well known instances of corporate and government censorship, the traps of the modern convenience technologies.

"When an ebook is purchased from an online service, it remains the property of the seller, it's loan subject to revocation at any time - as happened when Amazon remotely deleted thousands of copies of 1984 and Animal Farm from customers' Kindles in 2009. Streaming music and video services filter the media available by legal jurisdiction and algorithmically determine 'personal' preferences. Academic journals determine access to knowledge by institutional affiliation and financial contribution as physical, open-access libraries close down." page 39

It was the "Climate" chapter that packed the biggest punch for me, as an issue I considered myself rather well across over the last 30 years, it turns out there was a significant factor I'd missed. A hint that surprise was coming came in an interesting diversion into clear air turbulence.

"An advisory circular on preventing turbulence-related injuries, published by the US Federal Aviation Administration in 2006, states that the frequency of turbulence accidents has increased steadily for more than a decade, from 0.3 accidents per million flights in 1989 to 1.7 in 2003." page 68

The reason for this increase was laid at the feet of increased CO2 levels in the atmosphere by Paul Williams of the National Centre for Atmospheric Science, and the implications were expounded upon in his Nature Climate Change (2013) paper thusly:

"...in winter, most clear air turbulence measures show a 10-40 per cent increase in the median strength...40-70 per cent increase in the frequency of occurrence of moderate or greater turbulence." page 69

The real punch in the guts came on page 73, where I first came across the concept of "Peak Knowledge" and how climate change is playing its defining role in that decline, and where the President of the American Meteorological Society, William B Gail, wonders if:

"we have already passed through 'peak knowledge", just as we may have already passed 'peak oil'." page 73

Wondering what that claim was based on, the next few paragraphs of information can be summarised in the following points:

  • From 1000 - 1750 CE CO2 was at 275-285 parts / million.
  • 295ppm by the start of the 20th century
  • 310ppm by 1950
  • 325ppm in 1970
  • 350ppm in 1988
  • 375ppm by 2004
  • 400ppm by 2015 - the first time in 800,000 years
  • 1,000ppm is projected to be passed by the end of this century.

"At 1,000ppm, human cognitive ability drops by 21%" page 74

Then a couple of bombshells:

"CO2 already reaches 500ppm in industrial cities"

"indoors in poorly ventilated schools, homes and workplaces it can regularly exceed 1,000ppm - substantial numbers of schools in California and Texas measured in 2012 breached 2,000ppm."

The implications of this are fairly obvious.

All this is by the end of chapter 3. It's a gritty, honest look at where we're at and where we're going. It's not pretty, but as the old saying goes, to be forewarned is to be forearmed.

Do yourself a favour, read it.

A (simplified) view of OpenPOWER Firmware Development

I’ve been working on trying to better document the whole flow of code that goes into a build of firmware for an OpenPOWER machine. This is partially to help those not familiar with it get a better grasp of the sheer scale of what goes into that 32/64MB of flash.

I also wanted to convey the components that we heavily re-used from other Open Source projects, what parts are still “IBM internal” (as they relate to the open source workflow) and which bits are primarily contributed to by IBMers (at least at this point in time).

As such, let’s start with the legend of the diagram:

Now, the diagram:

Simplified development flow for OpenPOWER firmware

The end thing that a user with a machine will download and apply (or that comes shipped with a box) is the purple “Installable Firmware Release” nodes (bottom center). In this diagram, there are 4 of them. One for POWER9 systems such as the just-announced AC922 system (this is the “OP910 Release” node, which is the witherspoon_defconfig in the op-build tree); one for the p9dsu platform (p9dsu_defconfig in op-build) and one is for IBM FSP based systems such as the S812L and S822L systems (or S812/S822 in OPAL mode).

There are more platforms out there, but this diagram is meant to be simplified. The key difference with the p9dsu platform is that this is produced by somebody other than IBM.

All of these releases are based off the upstream op-build project, op-build is the light blue box in the center of the diagram. We do regular X.Y releases and sometimes do X.Y.Z releases. It’s primarily a pull request based workflow currently, so everything goes via a pull request. The op-build project brings together all the POWER specific firmware components (pretty much everything in every other light blue/blue box) along with a Linux kernel and buildroot.

The kernel and buildroot are the two big yellow boxes on the top right. Buildroot brings together a lot of open source components that are in our firmware image (including some power specific ones that we get through upstream buildroot).

For Linux, this is a pretty simplified view of the process, but we primarily ship the stable tree (with maybe up to half a dozen patches).

The skiboot and petitboot components both use a mailing list based workflow (similar to kernel) as well as X.Y and X.Y.Z releases (again, similar to the linux kernel).

On the far left of the diagram, we have Hostboot, SBE and OCC. These are three firmware components that come from the traditional IBM POWER Firmware group, and are shared with the IBM non-OpenPOWER POWER systems (“traditional” POWER). These components have part of their code come from an (internal) repository called “ekb” which also goes into a (very) low level debug tool and the FSP based systems. There’s also an (internal) gerrit instance that’s the primary place where code review and development discussions happen for these components.

In future posts, I’ll probably delve into more specifics of the current development process, and how we may try and change things for the better.

How I do email (at work)

Recently, I blogged on my home email setup and in that post, I hinted that my work setup was rather different. I have entirely separate computing devices for work and personal use, a setup I strongly recommend. This also lets me “go home” from work even when working from home: I use a different physical machine!

Since I work for IBM I have (at least) two email accounts for work: a Lotus Notes one and an internet standards compliant one. It’s “easy” enough to get the Notes one to forward to the standards compliant one, from which I can just use fetchmail or similar to pull down mail.

I run mail through a rather simple procmail script: it de-mangles some URL mangling that can happen in the current IBM email infrastructure, runs things through SpamAssassin and delivers to a date based Maildir (or one giant pile for spam).

My ~/.procmailrc looks something like this:

LOGFILE=$HOME/mail_log
LOGABSTRACT=yes

DATE=`date +"%Y%m"`
MAILDIR=Maildir/INBOX
DEFAULT=$DATE/

:0fw
| magic_script_to_unmangle_things

:0fw
| spamc

:0
* ^X-Spam-Status: Yes
$HOME/Maildir/junkmail/incoming/

I use tail -f mail_log as a really dumb kind of biff replacement.

Now, what do I read and write mail with? Notmuch! It is the only thing that even comes close to being able to deal with a decent flow of mail. I have a couple of saved searches just to track how much mail I pull in a day/week. Today (on Monday), it says 442 today and 10,403 over the past week.

For the most part, my workflow is kind of INBOX-ZERO like, except that I currently view victory as INBOX 2000. Most mail does go into my INBOX; the notable exceptions are two main mailing lists I’m subscribed to mostly as FYI and to search/find things when needed. Those are the Linux Kernel Mailing List (LKML) and the buildroot mailing list. Why notmuch rather than just searching the web for mailing list archives? Notmuch can return the result of a query in less time than it takes light to get to and from the United States in ideal conditions.

For work, I don’t sync my mail anywhere. It’s just on my laptop. Not having it on my phone is a feature. I have a notmuch post-new hook that does some initial tagging of mail, and as such I have this in my ~/.notmuch-config:

[new]
tags=new;

My post-new hook looks like this:

#!/bin/bash

# immediately archive all messages from "me"
notmuch tag -new -- tag:new and from:stewart@linux.vnet.ibm.com

# tag all message from lists
notmuch tag +devicetree +list -- tag:new and to:devicetree@vger.kernel.org
notmuch tag +inbox +unread -new -- tag:new and tag:devicetree and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread +list -new -- tag:new and tag:devicetree and not to:stewart@linux.vnet.ibm.com

notmuch tag +linuxppc +list -- tag:new and to:linuxppc-dev@lists.ozlabs.org
notmuch tag +linuxppc +list -- tag:new and cc:linuxppc-dev@lists.ozlabs.org
notmuch tag +inbox +unread -new -- tag:new and tag:linuxppc
notmuch tag +openbmc +list -- tag:new and to:openbmc@lists.ozlabs.org
notmuch tag +inbox +unread -new -- tag:new and tag:openbmc

notmuch tag +lkml +list -- tag:new and to:linux-kernel@vger.kernel.org
notmuch tag +inbox +unread -new -- tag:new and tag:lkml and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:lkml and not tag:linuxppc and not to:stewart@linux.vnet.ibm.com

notmuch tag +qemuppc +list -- tag:new and to:qemu-ppc@nongnu.org
notmuch tag +inbox +unread -new -- tag:new and tag:qemuppc and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:qemuppc and not to:stewart@linux.vnet.ibm.com

notmuch tag +qemu +list -- tag:new and to:qemu-devel@nongnu.org
notmuch tag +inbox +unread -new -- tag:new and tag:qemu and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:qemu and not to:stewart@linux.vnet.ibm.com

notmuch tag +buildroot +list -- tag:new and to:buildroot@buildroot.org
notmuch tag +buildroot +list -- tag:new and to:buildroot@busybox.net
notmuch tag +buildroot +list -- tag:new and to:buildroot@uclibc.org
notmuch tag +inbox +unread -new -- tag:new and tag:buildroot and to:stewart@linux.vnet.ibm.com
notmuch tag +listinbox +unread -new -- tag:new and tag:buildroot and not to:stewart@linux.vnet.ibm.com

notmuch tag +ibmbugzilla -- tag:new and from:bugzilla@us.ibm.com

# finally, retag all "new" messages "inbox" and "unread"
notmuch tag +inbox +unread -new -- tag:new

This leaves me with both an inbox and a listinbox. I do not look at the overwhelming majority of mail that hits the listinbox – it’s mostly for following up on individual things. If I started to need to care more about specific topics, I’d probably add something in there for them so I could easily find them.

My notmuch emacs setup has a bunch of saved searches, so my notmuch-hello screen looks something like this:

This gets me a bit of a state-of-the-world-of-email-to-look-at view for the day. I’ll often have meetings first thing in the morning that may reference email I haven’t looked at yet, and this generally lets me quickly find mail related to the problems of the day and start being productive.

How I do email (at home)

I thought I might write something up on how I’ve been doing email both at home and at work. I very much on purpose keep the two completely separate, and have slightly different use cases for both of them.

For work, I do not want mail on my phone. For personal mail, it turns out I do want this on my phone, which is currently an Android phone. Since my work and personal email is very separate, the volume of mail is really, really different. Personal mail is maybe a couple of dozen a day at most. Work is… orders of magnitude more.

Considering I generally prefer free software to non-free software, K9 Mail is the way I go on my phone. I have it set up to point at the IMAP and SMTP servers of my mail provider (FastMail). I also have a google account, and the gmail app works fine for the few bits of mail that go there instead of my regular account.

For my mail accounts, I do an INBOX ZERO like approach (in reality, I’m pretty much nowhere near zero, but today I learned I’m a lot closer than many colleagues). This means I read / respond / do / ignore mail and then move it to an ARCHIVE folder. K9 and Gmail both have the ability to do this easily, so it works well.

Additionally though, I don’t want to care about limits on storage (i.e. expire mail from the server after X days), nor do I want to rely on “the cloud” to be the only copy of things. I also don’t want to have to upload any of the past mail I may be keeping around. I also generally prefer to use notmuch as a mail client on a computer.

For those not familiar with notmuch, it does tags on mail in Maildir, is extremely fast and can actually cope with a quantity of mail. It also has this “archive”/INBOX ZERO workflow which I like.

In order to get mail from FastMail and Gmail onto a machine, I use offlineimap. An important thing to do is to set “status_backend = sqlite” for each Account. It turns out I first hacked on sqlite for offlineimap status a bit over ten years ago – time flies. For each Account I also set presynchook = ~/Maildir/maildir-notmuch-presync (below) and a postsynchook = notmuch new. The presynchook is run before we sync, and its job is to move files around based on the tags in notmuch and the postsynchook lets notmuch catch any new mail that’s been fetched.

My maildir-notmuch-presync hook script is:

#!/bin/bash
notmuch search --output=files not tag:inbox and folder:fastmail/INBOX|xargs -I'{}' mv '{}' "$HOME/Maildir/INBOX/fastmail/Archive/cur/"

notmuch search --output=files folder:fastmail/INBOX and tag:spam |xargs -I'{}' mv '{}' "$HOME/Maildir/INBOX/fastmail/Spam/cur/"
ARCHIVE_DIR=$HOME/Maildir/INBOX/`date +"%Y%m"`/cur/
mkdir -p $ARCHIVE_DIR
notmuch search --output=files folder:fastmail/Archive and date:..90d and not tag:flagged | xargs -I'{}' mv '{}' "$ARCHIVE_DIR"

# Gmail
notmuch search --output=files not tag:inbox and folder:gmail/INBOX|grep 'INBOX/gmail/INBOX/' | xargs -I'{}' rm '{}'
notmuch search --output=files folder:gmail/INBOX and tag:spam |xargs -I'{}' mv '{}' "$HOME/Maildir/INBOX/gmail/[Gmail].Spam/cur/"

So this keeps 90 days of mail on the FastMail server, and archives older mail off into month based archive dirs. This is simply to keep directory sizes from getting too large; you could put everything in one directory… but at some point that gets a bit silly.

I don’t think this is all the most optimal setup I could have, but it does let me read and answer mail on my phone and desktop (as well as use a web client if I want to). There is a bit of needless copying of messages by offlineimap under certain circumstances, but I don’t get enough personal mail for it to be a problem.

pwnm-sync: Synchronizing Patchwork and Notmuch

One of the core bits of infrastructure I use as a maintainer is Patchwork (I wrote about making it faster recently). Patchwork tracks patches sent to a mailing list, allowing me as a maintainer to track the state of them (New|Under Review|Changes Requested|Accepted etc), combine them into patch bundles, look at specific series, test results etc.

One of the core bits of software I use is my email client, notmuch. Most other mail clients are laughably slow and clunky, or just plain annoying for absorbing a torrent of mail and being able to deal with it or just plain ignore it but have it searchable locally.

You may think your mail client is better than notmuch, but you’re wrong.

A key feature of notmuch is tagging email. It doesn’t do the traditional “folders” but instead does tags (if you’ve used gmail, you’d be somewhat familiar).

One of my key work flows as a maintainer is looking at what patches are outstanding for a project, and then reviewing them. This should also be a core part of any contributor to a project too. You may think that a tag:unread and to:project-list@foo query would be enough, but that doesn’t correspond with what’s in patchwork.

So, I decided to make a tool that would add tags to messages in notmuch corresponding with the state of the patch in patchwork. This way, I could easily search for “patches marked as New in patchwork” (or Under Review or whatever) and see what I should be reviewing and looking at merging or commenting on.

Logically, this wouldn’t be that hard, just use the (new) Patchwork REST API to get the state of everything and issue the appropriate notmuch commands.

But just going one way isn’t that interesting, I wanted to be able to change the tags in notmuch and have them sync back up to Patchwork. So, I made that part of the tool too.

Introducing pwnm-sync: a tool to sync patchwork and notmuch.

notmuch-hello tag counts for pwnm-sync tagged patches in patchwork

With this tool I can easily see the patchwork state of any patch that I have in my notmuch database. For projects that I’m a maintainer on (i.e. where I can change the state of patches), if I update the tags on that email and run pwnm-sync again, it’ll update the state in patchwork.

I’ve been using this for a few weeks myself and it’s made my maintainer workflow significantly nicer.

It may also be useful to people who want to find what patches need some review.

The sync time is mostly dependent on how fast your patchwork instance is for API requests. Unfortunately, we need to make some improvements on the Patchwork side of things here, but a full sync of the above takes about 4 minutes for me. You can also add a --epoch option (with a date/time) to say “only fetch things from patchwork since that date” which makes things a lot quicker for incremental syncs. For me, I typically run it with an epoch of a couple of months ago, and that takes ~20-30 seconds to run. In this case, if you’ve locally updated an old patch, it will still sync that change up to patchwork.

Implementation Details

It’s a python3 script using the notmuch python bindings, the requests-futures module for asynchronous HTTP requests (so we can have the patchwork server assemble the next page of results while we process the previous one), and a local sqlite3 database to store state in so we can work out what changed locally / server side.
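As a rough illustration of the patchwork-to-notmuch direction only, here is a small sketch of my own (not pwnm-sync itself); the project name, field names, pagination parameters and tag scheme are assumptions based on the Patchwork 2.x REST API and the classic notmuch Python bindings:

#!/usr/bin/python3
import notmuch
from requests_futures.sessions import FuturesSession

API = "https://patchwork.ozlabs.org/api/patches/"
PROJECT = "skiboot"                      # hypothetical project to sync

session = FuturesSession()
future = session.get(API, params={"project": PROJECT, "per_page": 100})
patches = future.result().json()         # one page of patch objects

db = notmuch.Database(mode=notmuch.Database.MODE.READ_WRITE)
for patch in patches:
    msgid = patch["msgid"].strip("<>")   # Patchwork records the Message-Id
    state = patch["state"]               # e.g. "new", "under-review", "accepted"
    for msg in db.create_query("id:" + msgid).search_messages():
        msg.add_tag("pw-" + state)       # mirror the patchwork state as a notmuch tag
db.close()

The real tool also tracks previous state in sqlite so it can push local tag changes back up to Patchwork; the sketch above only covers the one-way case.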

Getting it

Head to https://github.com/stewart-ibm/pwnm-sync or just:

git clone https://github.com/stewart-ibm/pwnm-sync.git

Optimizing database access in Django: A patchwork story

tl;dr: I made Patchwork a lot faster by looking at what database queries were being generated and optimizing them either by making Django produce better queries or by adding better indexes.

Introduction to Patchwork

One of the key bits of infrastructure a bunch of maintainers of Open Source Software use is a tool called Patchwork. We use it for a bunch of OpenPOWER firmware development, several Linux subsystems use it as well as freedesktop.org.

The purpose of Patchwork is to supplement the patches-to-a-mailing-list development work flow. It allows a maintainer to see all the patches that have been posted on the list, how many Acked-by/Reviewed-by/Tested-by replies they have, delegate responsibility for the patch to a co-maintainer, track (and change) the state of the patch (e.g. to “Under Review”, “Changes Requested”, or “Accepted”), and create bundles of patches to help in review and testing.

Since patchwork is an open source project itself, there’s several instances of it out there in common use. One of the main instances is https://patchwork.ozlabs.org/ which is (funnily enough) used by a bunch of people connected to OzLabs for projects that are somewhat connected to OzLabs. e.g. the linuxppc-dev project and the skiboot and petitboot projects. There’s also a kernel.org instance, which is used by some kernel subsystems.

Recent versions of Patchwork have added some pretty cool features such as the ability to integrate with CI systems such as Snowpatch which helps maintainers see if patches submitted are likely to break things.

Unfortunately, there’s also been some complaints that recent versions of patchwork have gotten slower than previous ones. This may well be the case, or it could just be that the volume of patches is much higher and there’s load on the database. Anyway, after asking a few questions about what the size and scope was of the patchwork database on ozlabs.org, I went “hrm… this sounds like it shouldn’t really be a problem… perhaps I should look into this”.

Attacking the problem…

Every so often it is revealed that I know a little bit about databases.

Getting a development environment up for Patchwork is amazingly easy thanks to Docker and the great work of the Patchwork maintainers. The only thing you need to load in is an example dataset. I started by importing mail from a few mailing lists I’m subscribed to, which was Good Enough(TM) for an initial look.

Due to how Django forces us to design a database schema though, the suggested method of getting a sample data set will not mirror what occurs in a production system with multiple lists. It’s for this reason that I ended up using a copy of a live dataset for much of my work rather than constructing an artificial one.

Patchwork supports both a MySQL and PostgreSQL database backend. Since the one on ozlabs.org is backed by PostgreSQL, I ended up loading a large dataset into PostgreSQL for most of my work, although I also did some testing with MySQL.

The current patchwork.ozlabs.org instance has a database of around 13GB in size, with about a million patches. You may think this is big; my database brain goes “no, this is actually quite small and everything should be a lot faster than it is, even on quite limited hardware”.

The problem with ORMs

It turns out that Patchwork is written in Django, a Python web framework with its own ORM (Object-Relational Mapping) layer – and thus something that pretty effectively obfuscates application code from the SQL being run.

There is one thing that Django misses that could be a pretty big general performance boost to many applications: it doesn’t support composite primary keys. For some databases (e.g. MySQL’s InnoDB engine) the PRIMARY KEY is a clustered index – that is, the physical layout of the rows on disk reflect primary key order. You can use this feature to your advantage and have much higher cache hits of your database pages.

Unfortunately though, we cannot do that with Django, so we lose a bunch of possible performance because of it (especially for queries that are going to have to bring in data from disk). In fact, we’re forced to use an ID field that’ll scatter our rows all over the place rather than do something efficient. You can somewhat get back some of the performance by creating covering indexes, but this costs in terms of index maintenance and disk space.

It should be noted that PostgreSQL doesn’t have a similar concept, although there is a (locking) CLUSTER statement that can (as an offline operation for the table) re-arrange existing rows to be in index order. In my testing, this can give a bit of a boost to performance of some of the Patchwork queries.

With MySQL, you’d look at a bunch of statistics on what pages are being brought in and paged out of the InnoDB buffer pool. With PostgreSQL it’s a bit more complex as it relies heavily on the OS page cache.

My main experience is with MySQL-like environments, so I’ve had to re-learn a bunch of PostgreSQL things in this work, which was kind of fun. It may be “because of my upbringing” but it seems as if there’s a lot more resources and documentation out in the wild about optimizing MySQL environments than PostgreSQL ones, especially when it comes to documentation around a bunch of things inside the database server. A lot of credit should go to the MySQL Documentation team – I wish the PostgreSQL documentation was up to the same standard.

Another issue is that fetching BLOBs is generally an expensive operation that you want to avoid unless you’re going to use them. Thus, fetching the whole “object” at once isn’t always optimal. The Django query generation appears to be somewhat buggy when it comes to “hey, don’t fetch these columns, I don’t need them”, so you do have to watch what query is produced not just what query you expect to be produced. For example, [01/11] Improve patch listing performance (~3x).

Another issue with Django is how you go from your Python code to an actual SQL query, especially when the produced SQL query is needlessly complex or inefficient. I’ve seen Django always produce an ORDER BY for one table, even when not needed, I’ve also seen it always join tables even when you’re getting no columns from one of them and there’s no way you’re asking for it. In fact, I had to revert to raw SQL for one of my performance improvements as I just couldn’t beat it into submission: [10/11] Be sensible computing project patch counts.

An ORM can be great for getting apps out quickly, or programming in a familiar way. But like many things, an understanding of what is going on underneath is key for extracting maximum performance.

Also, if you ever hear something like “ORM $x doesn’t scale” then maybe that person just hasn’t looked at how to use the ORM better. The same goes for if they say “database $y doesn’t scale”- especially if it’s a long existing relational database such as MySQL or PostgreSQL.

Speeding up listing current patches for a project

17 SQL queries in 4477ms. More than 4 seconds in the database does not make page load time great.

Fortunately though, the Django development environment lets you really easily dive into what queries are being generated and (at least roughly) where they’re being generated from. There’s a sidebar in your browser that shows how many SQL queries were needed to generate the page and how long they took. The key to making your application go faster is to run fewer queries in less time.

I was incredibly impressed with how easy it was to see what queries were run, where they were run from, and the EXPLAIN output for them.

By clicking on that SQL button on the right side of your browser, you get this wonderful chart of what queries were executed, when, and how long they took. From this, it is incredibly obvious which query is the most problematic: the one that took more than four seconds!

In the dim dark days of web development, you’d have to turn on a Slow Query Log on the database server and then grep through your source code or some other miserable activity. I am so glad I didn’t have to do that.

More than four seconds for a single database query does not make for a nice UX.

This particular query was a real hairy one, the EXPLAIN output from PostgreSQL (and MySQL) was certainly long and involved and would most certainly not feature in the first half of an “Introduction to query optimization” day long workshop. If you haven’t brushed up on various bits of documentation on understanding EXPLAIN, you should! The MySQL EXPLAIN FORMAT=JSON is especially fantastic for getting deep details as to what’s going on with query execution.

The big performance gain here was to have the database be able to execute the query in a much more efficient way by creating two covering indexes for part of the query. To work out what indexes to create, one has to look at the EXPLAIN output and work out why the database is choosing to do either a sequential scan of a large table, or use an index that doesn’t exclude that many rows. In this case, I tweaked the code to slightly change the query that was generated as well as adding a covering index. What we ended up with is something that is dramatically faster.
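For illustration, adding such an index in Django ends up as a small, mostly auto-generated migration. This is only a sketch: the model name, field list and migration dependency below are placeholders rather than the exact ones from the Patchwork series.

from django.db import migrations, models

class Migration(migrations.Migration):
    # Hypothetical predecessor; the real dependency is whatever migration came before.
    dependencies = [("patchwork", "0026_previous_migration")]
    operations = [
        migrations.AddIndex(
            model_name="patch",
            index=models.Index(
                fields=["project", "state", "archived", "date"],
                name="patch_list_covering_idx",
            ),
        ),
    ]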

The main query is ~350x faster than before

You’ll notice that it appears that the first query there takes a lot more time, but it doesn’t; it only takes more time relative to the now much faster main query.

In fact, this particular page is one that people have mentioned at being really, really slow to load. With the main query now about 350 times faster than it was originally, it shouldn’t be a problem anymore.

A future improvement would be to cache the COUNT() for the common case, as it’s pretty easily computed when new patches come in or states change.

The patches that help this particular page have been submitted upstream here:

Making viewing a patch with comments faster

Now that we can list patches faster, can we make other pages that Patchwork has quicker?

One such page is viewing a patch (or cover letter) that has a lot of comments on it. One feature of Patchwork is that it will display all the email replies to a patch or cover letter in the Web UI. But… this seemed slow

On one of the most commented patches I could find, we ended up executing one hundred and seventy seven SQL queries to view it! Diving into it, a bunch of the queries looked really, really similar…

I’ve got 99 queries where I only need 1.

The problem here is that the Patchwork UI is wanting to find out the name of each person who submitted a comment, and is doing that by querying the ID from a table. What it should be doing instead is a SQL JOIN on the original query and just fetching all that information in one go: make the database server do the work, it’s really good at it.

My patch [02/11] 4x performance improvement for viewing patch with many comments does just that by using the select_related() method correctly, as well as being explicit about what information we want to retrieve.
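In Django terms the change looks roughly like the sketch below; the model and field names are hypothetical, loosely shaped like Patchwork's comment and person tables, and this is not the actual patch.

# Before: one query for the comments, then one more query per submitter (N+1).
comments = Comment.objects.filter(submission=submission)
for c in comments:
    print(c.submitter.name)          # each attribute access hits the database again

# After: a single JOINed query, fetching only the columns we actually need.
comments = (Comment.objects
            .filter(submission=submission)
            .select_related("submitter")
            .only("content", "date", "submitter__name", "submitter__email"))
for c in comments:
    print(c.submitter.name)          # no extra queries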

We’re now only a few milliseconds to grab all the comments

With that patch, we’re down to a constant number of queries and around a 3x-7x faster time executing them depending if we have a warm cache or not.

The one time I had to use raw SQL

When viewing a project page (such as https://patchwork.ozlabs.org/project/qemu-devel/ ) it displays the number of patches (archived and not archived) for the project. By looking at what SQL queries are executed to collect these numbers, you’ll notice two things. First, here are the queries:

COUNT() queries can be expensive

First thing you’ll notice is that they took a loooooong time to execute (more than a second each). The second thing, if you look closer, is that they contain a join which is completely unneeded.

I spent a good long while trying to make Django behave, and I just could not. I believe it’s due to the model having some inheritance in it. Writing the query by hand ended up being the best solution, and it gave a significant performance improvement:
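The raw version is essentially a hand-written GROUP BY over the patch table, run through Django's cursor API. A sketch, with illustrative table and column names rather than Patchwork's exact schema:

from django.db import connection

def patch_counts(project_id):
    # Count archived and non-archived patches in one pass, with no extra JOIN.
    with connection.cursor() as cursor:
        cursor.execute(
            "SELECT archived, COUNT(*) FROM patchwork_patch "
            "WHERE project_id = %s GROUP BY archived",
            [project_id],
        )
        return dict(cursor.fetchall())   # {False: active_count, True: archived_count}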

Unfortunately, only 4x faster.

Arguably, a better way would be to precompute the count for the archived/non-archived patches and just display them. I (or someone else who knows more about Django) may want to look at that for a future improvement.

Conclusion and final thoughts

There’s a few more places where there could be some optimizations, but currently I cannot get any single page to take more than 40-400ms in the database when running on my laptop – and that’s Good Enough(TM) for now.

The next steps are getting these patches through a round or two of review, and then getting them into a Patchwork release and deployed out on patchwork.ozlabs.org and see if people can find any new ways to make things slow.

If you’re interested, the full patchset with cover letter is here: [00/11] Performance for ALL THE THINGS!

The diffstat is interesting, as most of the added code is auto-generated by Django for database migrations (adding of indexes).

 .../migrations/0027_add_comment_date_index.py | 23 +++++++++++++++++
 .../0028_add_list_covering_index.py           | 19 ++++++++++++++
 .../0029_add_submission_covering_index.py     | 19 ++++++++++++++
 patchwork/models.py                           | 21 ++++++++++++++--
 patchwork/templates/patchwork/submission.html | 16 ++++++------
 patchwork/views/__init__.py                   |  8 +++++-
 patchwork/views/cover.py                      |  5 ++++
 patchwork/views/patch.py                      |  7 ++++++
 patchwork/views/project.py                    | 25 ++++++++++++++++---
 9 files changed, 128 insertions(+), 15 deletions(-)
 create mode 100644 patchwork/migrations/0027_add_comment_date_index.py
 create mode 100644 patchwork/migrations/0028_add_list_covering_index.py
 create mode 100644 patchwork/migrations/0029_add_submission_covering_index.py

I think the lesson is that making dramatic improvements to performance of your Django based app does not mean you have to write a lot of code or revert to raw SQL or abandon your ORM. In fact, use it properly and you can get a looong way. It’s just that to use it properly, you’re going to have to understand the layer below the ORM, and not just treat the database as a magic black box.

ccache and op-build

You may have heard of ccache (Compiler Cache) which saves you heaps of real world time when rebuilding a source tree that is rather similar to one you’ve recently built before. It’s really useful in buildroot based projects where you’re building similar trees, or have done a minor bump of some components.

In trying to find a commit which introduced a bug in op-build (OpenPOWER firmware), I noticed that hostboot wasn’t being built using ccache and we were always doing a full build. So, I started digging into it.

It turns out that a bunch of the perl scripts for parsing the Machine Readable Workbook XML in hostboot did a bunch of things like foreach $key (%hash) – which means that the code iterates over the items in hash order rather than an order that would produce predictable output such as “attribute name” or something. So… much messing with that later, I had hostboot generating the same output for the same input on every build.

Next step was to work out why I was still getting a lot of CCACHE misses. It turns out the default ccache size is 5GB. A full hostboot build uses around 7.1GB of that.

So, if building op-build with CCACHE, be sure to set both BR2_CCACHE=y in your config as well as something like BR2_CCACHE_INITIAL_SETUP="--max-size 20G"

Hopefully my patches hit hostboot and op-build soon.

Switching to iPhone: Part 1

I have used Android phones since the first one: the G1. I’m one of the (relatively) few people who has used Android 1.0. I’ve had numerous Android phones since then, mostly the Google flagship.

I have fond memories of the Nexus One and Galaxy Nexus, as well as a bunch of time running Cyanogen (often daily builds, because YOLO) to get more privacy preserving features (or a more recent Android). I had a Sony Z1 Compact for a while which was great bang for buck except for the fact the screen broke whenever you looked at it sideways. Great kudos to the Sony team for being so friendly to custom firmware loads.

I buy my hardware from physical stores. Why? Well, it means that the NSA and others get to spend extra effort to insert hardware modifications (backdoors), as well as the benefit of having a place to go to/set the ACCC on to get my rights under Australian Consumer Law.

My phone before last was a Nexus 5X. There were a lot of good things about this phone; the promise of fast charging via USB-C was one, as was the ever improving performance of the hardware and Android itself. Well… it just got progressively slower, and slower, and slower – as if it was designed to get near unusable by the time of the next Google phone announcement.

Inevitably, my 5X succumbed to the manufacturing defect that resulted in a boot loop. It would start booting, and then spontaneously reboot, in a loop, forever. The remedy? Replace it under warranty! That would take weeks, which isn’t a suitable timeframe in this day and age to be without a phone, so I mulled over buying a Google Pixel or my first ever iPhone (my iPhone owning friends assured me that if such a thing happens with an iPhone that Apple would have swapped it on the spot). Not wanting to give up a lot of the personal freedom that comes with the Android world, I spent the $100 more to get the Pixel, acutely aware that having a phone was now a near $1000/year habit.

The Google Pixel was a fantastic phone (except the price, they should have matched the iPhone price). The camera was the first phone camera I actually went “wow, I’m impressed” over. The eye-watering $279 to replace a cracked screen, the still eye-watering cost of USB-C cables, and the wait to process the HDR photos were all forgiven. It was a good phone. Until, that is, less than a year in, the battery was completely shot. It would power off when less than 40% and couldn’t last the trip from Melbourne airport to Melbourne city.

So, with a flagship phone well within the “reasonable quality” time that consumer law would dictate, I contacted Google after going through all the standard troubleshooting. Google agreed this was not normal and that the phone was defective. I was told that they would mail me a replacement, I could transfer my stuff over and then mail in the broken one. FANTASTIC!! This was soooo much better than the experience with the 5X.

Except that it wasn’t. A week later, I rang back to ask what was going on as I hadn’t received the replacement; it turned out Google had lied to me – I’d have to mail the phone to them, and then another ten business days later I’d have a replacement. Errr…. no, I’ve been here before.

I rang the retailer, JB Hi-Fi; they said it would take them at least three weeks, which I told them was not acceptable nor a “reasonable timeframe” as dictated by consumer law.

So, with a bunch of travel imminent, I bought a big external USB-C battery and kept it constantly connected as without it the battery percentage went down faster than the minutes ticked over. I could sort it out once I was back from travel.

So, I’m back. In fact, I drove back from a weekend away and finally bit the bullet – I went to pick up a phone whose manufacturer has a reputation for supporting their hardware.

I picked up an iPhone.

I figured I should write up the how and the why of switching phone platforms, and my experiences doing it. I think my next post will be “Why iPhone and not a different Android”.

CVE-2019-6260: Gaining control of BMC from the host processor

These are the details for CVE-2019-6260 – which has been nicknamed “pantsdown”, both because of the feeling that we’ve “caught chunks of the industry with their…”, and because naming things is hard, so if you pick a bad name somebody has to come up with a better one before you publish.

I expect OpenBMC to have a statement shortly.

The ASPEED ast2400 and ast2500 Baseboard Management Controller (BMC) hardware and firmware implement Advanced High-performance Bus (AHB) bridges, which allow arbitrary read and write access to the BMC’s physical address space from the host, or from the network if the BMC console uart is attached to a serial concentrator (this is atypical for most systems).

Common configuration of the ASPEED BMC SoC’s hardware features leaves it open to “remote” unauthenticated compromise from the host and from the BMC console. This stems from AHB bridges on the LPC and PCIe buses, another on the BMC console UART (hardware password protected), and the ability of the X-DMA engine to address all of the BMC’s M-Bus (memory bus).

This affects multiple BMC firmware stacks, including OpenBMC, AMI’s BMC, and SuperMicro’s. It is independent of host processor architecture, and has been observed on systems with x86_64 processors and IBM POWER processors (there is no reason to suggest that other architectures wouldn’t be affected; these are just the ones we’ve been able to get access to).

The LPC, PCIe and UART AHB bridges are all explicitly features of Aspeed’s designs: They exist to recover the BMC during firmware development or to allow the host to drive the BMC hardware if the BMC has no firmware of its own. See section 1.9 of the AST2500 Software Programming Guide.

The typical consequence of external, unauthenticated, arbitrary AHB access is that the BMC fails to ensure all three of confidentiality, integrity and availability for its data and services. For instance it is possible to:

  1. Reflash or dump the firmware of a running BMC from the host
  2. Perform arbitrary reads and writes to BMC RAM
  3. Configure an in-band BMC console from the host
  4. “Brick” the BMC by disabling the CPU clock until the next AC power cycle

Using 1 we can obviously implant any malicious code we like, with the impact of BMC downtime while the flashing and reboot take place. This may take the form of minor, malicious modifications to the officially provisioned BMC image, as we can extract, modify, then repackage the image to be re-flashed on the BMC. As the BMC potentially has no secure boot facility it is likely difficult to detect such actions.

Abusing 3 may require valid login credentials, but combining 1 and 2 we can simply change the locks on the BMC by replacing all instances of the root shadow password hash in RAM with a chosen password hash – one instance of the hash is in the page cache, and from that point forward any login process will authenticate with the chosen password.

We obtain the current root password hash by using 1 to dump the current flash content, then using https://github.com/ReFirmLabs/binwalk to extract the rootfs, then simply loop-mounting the rootfs to access /etc/shadow. At least one BMC stack doesn’t require this, and instead offers “Press enter for console”.
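A rough sketch of that flow, assuming you’ve already dumped the BMC flash to a file – the file names are placeholders, and the rootfs type varies between BMC stacks (OpenBMC images are commonly squashfs). The grep at the end pulls out the root hash we want to locate (and replace) in BMC RAM:

$ binwalk -e bmc-flash.img        # carve the embedded filesystems out of the dump
$ mkdir rootfs
# mount -o loop,ro _bmc-flash.img.extracted/<rootfs-image> rootfs
# grep '^root:' rootfs/etc/shadow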

IBM has internally developed a proof-of-concept application that we intend to open-source, likely as part of the OpenBMC project, that demonstrates how to use the interfaces and probes for their availability. The intent is that it be added to platform firmware test suites as a platform security test case. The application requires root user privilege on the host system for the LPC and PCIe bridges, or normal user privilege on a remote system to exploit the debug UART interface. Access from userspace demonstrates the vulnerability of systems in bare-metal cloud hosting lease arrangements where the BMC is likely in a separate security domain to the host.

OpenBMC Versions affected: Up to at least 2.6, all supported Aspeed-based platforms

It only affects systems using the ASPEED ast2400 and ast2500 SoCs. There has not been any investigation into other hardware.

The specific issues are listed below, along with some judgement calls on their risk.

iLPC2AHB bridge Pt I

State: Enabled at cold start
Description: A SuperIO device is exposed that provides access to the BMC’s address-space
Impact: Arbitrary reads and writes to the BMC address-space
Risk: High – known vulnerability and explicitly used as a feature in some platform designs
Mitigation: Can be disabled by configuring a bit in the BMC’s LPC controller, however see Pt II.

iLPC2AHB bridge Pt II

State: Enabled at cold start
Description: The bit disabling the iLPC2AHB bridge only removes write access – reads are still possible.
Impact: Arbitrary reads of the BMC address-space
Risk: High – we expect the capability and mitigation are not well known, and the mitigation has side-effects
Mitigation: Disable SuperIO decoding on the LPC bus (0x2E/0x4E decode). Decoding is controlled via hardware strapping and can be turned off at runtime, however disabling SuperIO decoding also removes the host’s ability to configure SUARTs, System wakeups, GPIOs and the BMC/Host mailbox

PCIe VGA P2A bridge

State: Enabled at cold start
Description: The VGA graphics device provides a host-controllable window mapping onto the BMC address-space
Impact: Arbitrary reads and writes to the BMC address-space
Risk: Medium – the capability is known to some platform integrators and may be disabled in some firmware stacks
Mitigation: Can be disabled, or writes can be filtered to coarse-grained regions of the AHB, by configuring bits in the System Control Unit

DMA from/to arbitrary BMC memory via X-DMA

State: Enabled at cold start
Description: X-DMA available from VGA and BMC PCI devices
Impact: Misconfiguration can expose the entirety of the BMC’s RAM to the host
AST2400 Risk: High – SDK u-boot does not constrain X-DMA to VGA reserved memory
AST2500 Risk: Low – SDK u-boot restricts X-DMA to VGA reserved memory
Mitigation: X-DMA accesses are configured to remap into VGA reserved memory in u-boot

UART-based SoC Debug interface

State: Enabled at cold start
Description: Pasting a magic password over the configured UART exposes a hardware-provided debug shell. The capability is only exposed on one of UART1 or UART5, and interactions are only possible via the physical IO port (cannot be accessed from the host)
Impact: Misconfiguration can expose the BMC’s address-space to the network if the BMC console is made available via a serial concentrator.
Risk: Low
Mitigation: Can be disabled by configuring a bit in the System Control Unit

LPC2AHB bridge

State: Disabled at cold start
Description: Maps LPC Firmware cycles onto the BMC’s address-space
Impact: Misconfiguration can expose vulnerable parts of the BMC’s address-space to the host
Risk: Low – requires reasonable effort to configure and enable.
Mitigation: Don’t enable the feature if not required.
Note: As a counter-point, this feature is used legitimately on OpenPOWER systems to expose the boot flash device content to the host

PCIe BMC P2A bridge

State: Disabled at cold start
Description: PCI-to-BMC address-space bridge allowing memory and IO accesses
Impact: Enabling the device provides limited access to BMC address-space
Risk: Low – requires some effort to enable, constrained to specific parts of the BMC address space
Mitigation: Don’t enable the feature if not required.

Watchdog setup

State: Required system function, always available
Description: Misconfiguring the watchdog to use “System Reset” mode for BMC reboot will re-open all the “enabled at cold start” backdoors until the firmware reconfigures the hardware otherwise. Rebooting the BMC is generally possible from the host via the IPMI “mc reset” command, and this may provide a window of opportunity for BMC compromise.
Impact: May allow arbitrary access to BMC address space via any of the above mechanisms
Risk: Low – “System Reset” mode is unlikely to be used for reboot due to obvious side-effects
Mitigation: Ensure BMC reboots always use “SOC Reset” mode
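To make the watchdog point concrete: triggering that reboot from the host is a one-liner with in-band IPMI (this assumes the usual Linux ipmi_si/ipmi_devintf drivers are loaded, and yes, it really does reboot your BMC, so don’t run it casually):

# ipmitool mc reset cold

If the watchdog has been left in “System Reset” mode, that reboot also resets the hardware and re-arms all the “enabled at cold start” bridges above.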

The CVSS score for these vulnerabilities is: https://nvd.nist.gov/vuln-metrics/cvss/v3-calculator?vector=AV:A/AC:L/PR:N/UI:N/S:U/C:H/I:H/A:H/E:F/RL:U/RC:C/CR:H/IR:H/AR:M/MAV:L/MAC:L/MPR:N/MUI:N/MS:U/MC:H/MI:H/MA:H

There is some debate about whether this is a local or a remote vulnerability; it depends on whether you consider the connection between the BMC and the host processor to be a network.

The fix is platform dependent as it can involve patching both the BMC firmware and the host firmware.

For example, we have mitigated these vulnerabilities for OpenPOWER systems, both on the host and BMC side. OpenBMC has a u-boot patch that disables the features:

https://gerrit.openbmc-project.xyz/#/c/openbmc/meta-phosphor/+/13290/

Platforms can opt into this in the following way:

https://gerrit.openbmc-project.xyz/#/c/openbmc/meta-ibm/+/17146/

The process is opt-in for OpenBMC platforms because platform maintainers know whether their platform uses the affected hardware features. This is important when disabling the iLPC2AHB bridge, as that can be a bit of a finicky process.

See also https://gerrit.openbmc-project.xyz/c/openbmc/docs/+/11164 for a WIP OpenBMC Security Architecture document which should eventually contain all these details.

For OpenPOWER systems, the host firmware patches are contained in op-build v2.0.11 and enabled for certain platforms. Again, this is not by default for all platforms as there is BMC work required as well as per-platform changes.

Credit for finding these problems: Andrew Jeffery, Benjamin Herrenschmidt, Jeremy Kerr, Russell Currey, Stewart Smith. There have been many more people who have helped with this issue, and they too deserve thanks.

Tracing flash reads (and writes) during boot

On OpenPOWER POWER9 systems, we typically talk to the flash chips that hold firmware for the host (i.e. the POWER9) processor through a daemon running on the BMC (aka service processor) rather than directly.

We have host firmware map “windows” on the LPC bus to parts of the flash chip. This flash chip can in fact be a virtual one, constructed dynamically from files on the BMC.

Since we’re mapping windows into this flash address space, we have some knowledge as to what IO the host is doing to/from the PNOR. We can use this to output data in the blktrace format and feed it into existing tools used to analyze IO patterns.

So, with a bit of learning of the data format and learning how to drive the various tools, I was ready to patch the BMC daemon (mboxbridge) to get some data out.
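The nice thing about emitting the blktrace binary format is that the existing tooling works on it unchanged. Roughly (the trace file names here are made up, and iowatcher’s option details can vary between versions, so check its man page):

$ blkparse -i pnor-trace                    # decode the binary events to text as a sanity check
$ iowatcher -t pnor-trace -o pnor-boot.svg  # plot the IO over the course of the boot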

An initial bit of data is a graph of the windows into PNOR opened up during a normal boot (see below).

PNOR windows created over the course of a normal boot.

This shows us that over the course of the boot, we open a bunch of windows, and switch them around a fair bit early on. This makes sense as early in boot we do not yet have DRAM working and page in firmware on-demand into L3 cache.

Later in boot, you can see the loading of larger chunks of firmware into memory. It’s also possible to see that this seems to take longer than it should – and indeed, we have a bug there.

Next, by modifying the code again, I introduced recording of when we used a window that the BMC had already cached. While the host will only see one window at a time, the BMC can keep around the ones it prepared earlier in order to avoid IO to the actual flash chips (which are SPI flash, so aren’t incredibly fast).

Here we can see that we’re likely not doing the most efficient things during boot, and there’s probably room for some optimization.

Normal boot, but including re-used windows rather than just created ones

Finally, in order to get finer grained information, I reduced the window size from one megabyte down to 4096 bytes. This will impose a heavy speed penalty as it’ll mean we will have to create a lot more windows to do the same amount of IO, but it means that since we’re using the page size of hostboot, we’ll see each individual page in/out operation that it does during boot.

So, from the next graph, we can see that there’s several “hot” areas of the image, and on the whole it’s not too many pages. This gives us a hint that a bit of effort to reduce binary image size a little bit could greatly reduce the amount of IO we have to do.

4096 byte (i.e. page) sized windows, capturing the bits of flash we need to read in several times due to being low on memory while we’re constrained to L3 cache.

The iowatcher tool also can construct a video of the boot and what “blocks” are being read.

Video of what blocks are read from flash during booting
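That video comes from iowatcher’s movie mode. Roughly, and again with placeholder file names (you’ll also need ffmpeg or png2theora around for the encoding step):

$ iowatcher -t pnor-trace --movie -o pnor-boot.mp4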

So, what do we get from this adventure? Well, we get a good list of things to look into in order to improve boot performance, and we get to back these up with data rather than guesswork. Since this also works on unmodified host firmware, we’re measuring what we really boot rather than changing it in order to measure it.

What you need to reproduce this:

Switching to iPhone Part 2: Seriously?

In which I ask of Apple, “Seriously?”.

That was pretty much my reaction with Apple sticking to Lightning connectors rather than going with the USB-C standard. Having USB-C around the place for my last two (Android) phones was fantastic. I could charge a phone, external battery, a (future) laptop, all off the same wall wart and with the same cable. It is with some hilarity that I read that the new iPad Pro has USB-C rather than Lightning.

But Apple’s dongle fetish reigns supreme, and so I get a multitude of damn dongles all for a wonderfully inflated price with an Australia Tax whacked on top.

The most egregious one is the Lightning-to-3.5mm dongle. In the office, I have a good set of headphones. The idea is to block out the sound of an open plan office so I can actually get some concentrating done. With tiny dedicated MP3 players and my previous phones, these sounded great. The Apple dongle? It sounds terrible. Absolutely terrible. The Lightning-to-3.5mm adapter might be okay for small earbuds but it is nearly completely intolerable for any decent set of headphones. I’m now in the market for a Bluetooth headphone amplifier. Another bunch of money to throw at another damn dongle.

Luckily, there seems to be a really good Bluetooth headphone amplifier on Amazon. The same Amazon that no longer ships to Australia. Well, there’s an Australian seller, for six times the price.

Urgh.