Planet Linux Australia
Celebrating Australians & Kiwis in the Linux and Free/Open-Source community...

October 17, 2017

Checking Your Passwords Against the Have I Been Pwned List

Two months ago, Troy Hunt, the security professional behind Have I been pwned?, released an incredibly comprehensive password list in the hope that it would allow web developers to steer their users away from passwords that have been compromised in past breaches.

While the list released by HIBP is hashed, the plaintext passwords are out there and one should assume that password crackers have access to them. So if you use a password on that list, you can be fairly confident that it's very easy to guess or crack your password.

I wanted to check my active passwords against that list to check whether or not any of them are compromised and should be changed immediately. This meant that I needed to download the list and do these lookups locally since it's not a good idea to send your current passwords to this third-party service.

I put my tool up on Launchpad / PyPI and you are more than welcome to give it a go. Install Postgres and Psycopg2 and then follow the README instructions to setup your database.

October 13, 2017

This Week in HASS – term 4, week 2

This week our youngest students are looking at transport in the past, slightly older students consider places that are special to people around the world and our oldest students are considering reasons why people might leave their homes to become migrants. Foundation/Prep/Kindy to Year 3 Students in standalone Foundation/Prep/Kindy classes (Unit F.4), as well as […]

October 09, 2017

LUV Main November 2017 Meeting: TBA

Nov 8 2017 18:30
Nov 8 2017 20:30
Nov 8 2017 18:30
Nov 8 2017 20:30
Location: 
Mail Exchange Hotel, 688 Bourke St, Melbourne VIC 3000

PLEASE NOTE NEW LOCATION AND DATE DUE TO MELBOURNE CUP DAY

Wednesday, November 8, 2017
6:30 PM to 8:30 PM
Mail Exchange Hotel
688 Bourke St, Melbourne VIC 3000

Speakers:

  • TBA

 

Mail Exchange Hotel, 688 Bourke St, Melbourne VIC 3000

Food and drinks will be available on premises.

Linux Users of Victoria is a subcommittee of Linux Australia.

November 8, 2017 - 18:30

October 07, 2017

New Lithium Battery Pack for my EV

Eight years ago I installed a pack of 36 Lithium cells in my EV. After about 50,000km and several near-death battery pack experiences (over discharge) the range decreased beneath a useful level so I have just purchased a new pack.

Same sort of cells, CALB 100AH, 3.2V per cell (80km range). The pack was about AUD$6,000 delivered and took an afternoon to install. I’ve adjusted my Zivan NG3 to cut out at an average of 3.6 v/cell (129.6V), and still have the BMS system that will drop out the charger if any one cell exceeds 4.1V.

The original pack was rated at 10 years (3000 cycles) and given the abuse we subjected it to I’m quite pleased it lasted 8 years. I don’t have a fail-safe battery management system like a modern factory EV so we occasionally drove the car when dead flat. While I could normally pick this problem quickly from the instrumentation my teenage children tended to just blissfully drive on. Oh well, this is an experimental hobby, and mistakes will be made. The Wright brothers broke a few wings……

I just took the car with it’s new battery pack for a 25km test drive and all seems well. The battery voltage is about 118V at rest, and 114V when cruising at 60 km/hr. It’s not dropping beneath 110V during acceleration, much better than the old pack which would sag beneath 100V. I guess the internal resistance of the new cells is much lower.

I plan to keep driving my little home-brew EV until I can by a commercial EV with a > 200km range here in Australia for about $30k, which I estimate will happen around 2020.

It’s nice to have my little EV back on the road.

October 06, 2017

This Week in HASS – term 4, week 1

The last term of the school year – traditionally far too short and crowded with many events, both at and outside of school. OpenSTEM’s® Understanding Our World® program for HASS + Science ensures that not only are the students kept engaged with interesting material, but that teachers can relax, knowing that all curriculum-relevant material is […]

October 05, 2017

MS Gong ride

I have returned to cycling a couple weeks ago and I am taking part in the MS Sydney to the Gong Ride - The Ride to Fight Multiple Sclerosis.

Though it would be a huge fun and a great challenge to ride over 80km along the Sydney coast, this is a fundraising event and entry fee only covers event staging costs. Every dollar you DONATE will go directly to ensuring the thousands of Australians with multiple sclerosis are able to receive the support and care they need to live well.

Please DONATE now to support my ride and change the lives of Australians living with multiple sclerosis.

Make a Donation!

Thank you for your support.

PS: Please visit fund raising pages of my friends Natasha and Eric who have inspired me to return to cycling and take this ride!

October 04, 2017

DevOps Days Auckland 2017 – Wednesday Session 3

Sanjeev Sharma – When DevOps met SRE: From Apollo 13 to Google SRE

  • Author of Two DevOps Bookks
  • Apollo 13
    • Who were the real heroes? The guys back at missing control. The Astronaunts just had to keep breathing and not die
  • Best Practice for Incident management
    • Prioritize
    • Prepare
    • Trust
    • Introspec
    • Consider Alternatives
    • Practice
    • Change it around
  • Big Hurdles to adoption of DevOps in Enterprise
    • Literature is Only looking at one delivery platform at a time
    • Big enterprise have hundreds of platforms with completely different technologies, maturity levels, speeds. All interdependent
    • He Divides
      • Industrialised Core – Value High, Risk Low, MTBF
      • Agile/Innovation Edge – Value Low, Risk High, Rapid change and delivery, MTTR
      • Need normal distribution curve of platforms across this range
      • Need to be able to maintain products at both ends in one IT organisation
  • 6 capabilities needed in IT Organisation
    • Planning and architecture.
      • Your Delivery pipeline will be as fast as the slowest delivery pipeline it is dependent on
    • APIs
      • Modernizing to Microservices based architecture: Refactoring code and data and defining the APIs
    • Application Deployment Automation and Environment Orchestration
      • Devs are paid code, not maintain deployment and config scripts
      • Ops must provide env that requires devs to do zero setup scripts
    • Test Service and Environment Virtualisation
      • If you are doing 2week sprints, but it takes 3-weeks to get a test server, how long are your sprints
    • Release Management
      • No good if 99% of software works but last 1% is vital for the business function
    • Operational Readiness for SRE
      • Shift between MTBF to MTTR
      • MTTR  = Mean time to detect + Mean time to Triage + Mean time to restore
      • + Mean time to pass blame
    • Antifragile Systems
      • Things that neither are fragile or robust, but rather thrive on chaos
      • Cattle not pets
      • Servers may go red, but services are always green
    • DevOps: “Everybody is responsible for delivery to production”
    • SRE: “(Everybody) is responsible for delivering Continuous Business Value”

Share

DevOps Days Auckland 2017 – Wednesday Session 2

Marcus Bristol (Pushpay) – Moving fast without crashing

  • Low tolerance for errors in production due to being in finance
  • Deploy twice per day
  • Just Culture – Balance safety and accountability
    • What rule?
    • Who did it?
    • How bad was the breach?
    • Who gets to decide?
  • Example of Retributive Culture
    • KPIs reflect incidents.
    • If more than 10% deploys bad then affect bonus
    • Reduced number of deploys
  • Restorative Culture
  • Blameless post-mortem
    • Can give detailed account of what happened without fear or retribution
    • Happens after every incident or near-incident
    • Written Down in Wiki Page
    • So everybody has the chance to have a say
    • Summary, Timeline, impact assessment, discussion, Mitigations
    • Mitigations become highest-priority work items
  • Our Process
    • Feature Flags
    • Science
    • Lots of small PRs
    • Code Review
    • Testers paired to devs so bugs can be fixed as soon as found
    • Automated tested
    • Pollination (reviews of code between teams)
    • Bots
      • Posts to Slack when feature flag has been changed
      • Nags about feature flags that seems to be hanging around in QA
      • Nags about Flags that have been good in prod for 30+ days
      • Every merge
      • PRs awaiting reviews for long time (days)
      • Missing postmortun migrations
      • Status of builds in build farm
      • When deploy has been made
      • Health of API
      • Answer queries on team member list
      • Create ship train of PRs into a build and user can tell bot to deploy to each environment

Share

October 03, 2017

DevOps Days Auckland 2017 – Wednesday Session 1

Michael Coté – Not actually a DevOps Talk

Digital Transformation

  • Goal: deliver value, weekly reliably, with small patches
  • Management must be the first to fail and transform
  • Standardize on a platform: special snow flakes are slow, expensive and error prone (see his slide, good list of stuff that should be standardize)
  • Ramping up: “Pilot low-risk apps, and ramp-up”
  • Pair programming/working
    • Half the advantage is people speed less time on reddit “research”
  • Don’t go to meetings
  • Automate compliance, have what you do automatic get logged and create compliance docs rather than building manually.
  • Crafting Your Cloud-Native Strategy

Sajeewa Dayaratne – DevOps in an Embedded World

  • Challenges on Embedded
    • Hardware – resource constrinaed
    • Debugging – OS bugs, Hardware Bugs, UFO Bugs – Oscilloscopes and JTAG connectors are your friend.
    • Environment – Thermal, Moisture, Power consumption
    • Deploy to product – Multi-month cycle, hard of impossible to send updates to ships at sea.
  • Principles of Devops , equally apply to embedded
    • High Frequency
    • Reduce overheads
    • Improve defect resolution
    • Automate
    • Reduce response times
  • Navico
    • Small Sonar, Navigation for medium boats, Displays for sail (eg Americas cup). Navigation displays for large ships
    • Dev around world, factory in Mexico
  • Codebase
    • 5 million lines of code
    • 61 Hardware Products supported – Increasing steadily, very long lifetimes for hardware
    • Complex network of products – lots of products on boat all connected, different versions of software and hardware on the same boat
  • Architecture
    • Old codebase
    • Backward compatible with old hardware
    • Needs to support new hardware
    • Desire new features on all products
  • What does this mean
    • Defects were found too late
    • Very high cost of bugs found late
    • Software stabilization taking longer
    • Manual test couldn’t keep up
    • Cost increasing , including opportunity cost
  • Does CI/CD provide answer?
    • But will it work here?
    • Case Study from HP. Large-Scale Agile Development by Gary Gruver
  • Our Plan
    • Improve tolls and archetecture
    • Build Speeds
    • Automated testing
    • Code quality control
  • Previous VCS
    • Proprietary tool with limit support and upgrades
    • Limited integration
    • Lack of CI support
    • No code review capacity
  • Move to git
    • Code reviews
    • Integrated CI
    • Supported by tools
  • Archetecture
    • Had a configurable codebase already
    • Fairly common hardware platform (only 9 variations)
    • Had runtime feature flags
    • But
      • Cyclic dependancies – 1.5 years to clean these up
      • Singletons – cut down
      • Promote unit testability – worked on
      • Many branches – long lived – mega merges
  • Went to a single Branch model, feature flags, smaller batch sizes, testing focused on single branch
  • Improve build speed
    • Start 8 hours to build Linux platform, 2 hours for each app, 14+ hours to build and package a release
    • Options
      • Increase speed
      • Parallel Builds
    • What did
      • ccache.clcache
      • IncrediBuild
      • distcc
    • 4-5hs down to 1h
  • Test automation
    • Existing was mock-ups of the hardware to not typical
    • Started with micro-test
      • Unit testing (simulator)
      • Unit testing (real hardware)
    • Build Tools
      • Software tools (n2k simulator, remote control)
      • Hardware tools ( Mimic real-world data, re purpose existing stuff)
    • UI Test Automation
      • Build or Buy
      • Functional testing vs API testing
      • HW Test tools
      • Took 6 hours to do full test on hardware.
  • PipeLine
    • Commit -> pull request
    • Automated Build / Unit Tests
    • Daily QA Build
  • Next?
    • Configuration as code
    • Code Quality tools
    • Simulate more hardware
    • Increase analytics and reporting
    • Fully simulated test env for dev (so the devs don’t need the hardware)
    • Scale – From internal infrastructure to the cloud
    • Grow the team
  • Lessons Learnt
    • Culture!
    • Collect Data
    • Get Executive Buy in
    • Change your tolls and processes if needed
    • Test automation is the key
      • Invest in HW
      • Silulate
      • Virtualise
    • Focus on good software design for Everything

Share

Ikea wireless charger in CNC mahogany case

I notice that Ikea sell their wireless chargers without a shell for insertion into desks. The "desk" I chose is a curve cut profile in mahogany that just happens to have the same fit as an LG G3/4/5 type phone. The design changed along the way to a more upright one which then required a catch to stop the phone sliding off.


This was done in Fusion360 which allows bringing in STL files of things like phones and cutting those out of another body. It took a while to work out the ball end toolpath but I finally worked out how to get something that worked reasonably well. The chomps in the side allow fingers to securely lift the phone off the charger.

It will be interesting to play with sliced objects in wood. Layering 3D cuts to build up objects that are 10cm (or about 4 layers) tall.

DevOps Days Auckland 2017 – Tuesday Session 3

Mirror, mirror, on the wall: testing Conway’s Law in open source communities – Lindsay Holmwood

  • The map between the technical organisation and the technical structure.
  • Easy to find who owns something, don’t have to keep two maps in your head
  • Needs flexibility of the organisation structure in order to support flexibility in a technical design
  • Conway’s “Law” really just adage
  • Complexity frequently takes the form of hierarchy
  • Organisations that mirror perform badly in rapidly changing and innovative enviroments

Metrics that Matter – Alison Polton-Simon (Thoughtworks)

  • Metrics Mania – Lots of focus on it everywhere ( fitbits, google analytics, etc)
  • How to help teams improve CD process
  • Define CD
    • Software consistently in a deployable state
    • Get fast, automated feedback
    • Do push-button deployments
  • Identifying metrics that mattered
    • Talked to people
    • Contextual observation
    • Rapid prototyping
    • Pilot offering
  • 4 big metrics
    • Deploy ready builds
    • Cycle time
    • Mean time between failures
    • Mean time to recover
  • Number of Deploy-ready builds
    • How many builds are ready for production?
    • Routine commits
    • Testing you can trust
    • Product + Development collaboration
  • Cycle Time
    • Time it takes to go from a commit to a deploy
    • Efficient testing (test subset first, faster testing)
    • Appropriate parallelization (lots of build agents)
    • Optimise build resources
  • Case Study
    • Monolithic Codebase
    • Hand-rolled build system
    • Unreliable environments ( tests and builds fail at random )
    • Validating a Pull Request can take 8 hours
    • Coupled code: isolated teams
    • Wide range of maturity in testing (some no test, some 95% coverage)
    • No understanding of the build system
    • Releases routinely delay (10 months!) or done “under the radar”
  • Focus in case study
    • Reducing cycle time, increasing reliability
    • Extracted services from monolith
    • Pipelines configured as code
    • Build infrastructure provisioned as docker and ansible
    • Results:
      • Cycle time for one team 4-5h -> 1:23
      • Deploy ready builds 1 per 3-8 weeks -> weekly
  • Mean time between failures
    • Quick feedback early on
    • Robust validation
    • Strong local builds
    • Should not be done by reducing number of releases
  • Mean time to recover
    • How long back to green?
    • Monitoring of production
    • Automated rollback process
    • Informative logging
  • Case Study 2
    • 1.27 million lines of code
    • High cyclomatic complexity
    • Tightly coupled
    • Long-running but frequently failing testing
    • Isolated teams
    • Pipeline run duration 10h -> 15m
    • MTTR Never -> 50 hours
    • Cycle time 18d -> 10d
    • Created a dashboard for the metrics
  • Meaningless Metrics
    • The company will build whatever the CEO decides to measure
    • Lines of code produced
    • Number of Bugs resolved. – real life duplicates Dilbert
    • Developers Hours / Story Points
    • Problems
      • Lack of team buy-in
      • Easy to agme
      • Unintended consiquences
      • Measuring inputs, not impacts
  • Make your own metrics
    • Map your path to production
    • Highlights pain points
    • collaborate
    • Experiment

 

Share

DevOps Days Auckland 2017 – Tuesday Session 2

Using Bots to Scale incident Management – Anthony Angell (Xero)

  • Who we are
    • Single Team
    • Just a platform Operations team
  • SRE team is formed
    • Ops teams plus performance Engineering team
  • Incident Management
    • In Bad old days – 600 people on a single chat channel
    • Created Framework
    • what do incidents look like, post mortems, best practices,
    • How to make incident management easy for others?
  • ChatOps (Based on Hubot)
    • Automated tour guide
    • Multiple integrations – anything with Rest API
    • Reducing time to restore
    • Flexability
  • Release register – API hook to when changes are made
  • Issue report form
    • Summary
    • URL
    • User-ids
    • how many users & location
    • when started
    • anyone working on it already
    • Anything else to add.
  • Chat Bot for incident
    • Populates for an pushes to production channel, creates pagerduty alert
    • Creates new slack channel for incident
    • Can automatically update status page from chat and page senior managers
    • Can Create “status updates” which record things (eg “restarted server”), or “yammer updates” which get pushed to social media team
    • Creates a task list automaticly for the incident
    • Page people from within chat
    • At the end: Gives time incident lasted, archives channel
    • Post Mortum
  • More integrations
    • Report card
    • Change tracking
    • Incident / Alert portal
  • High Availability – dockerisation
  • Caching
    • Pageduty
    • AWS
    • Datadog

 

Share

October 02, 2017

DevOps Days Auckland 2017 – Tuesday Session 1

DevSecOps – Anthony Rees

“When Anthrax and Public Enemy came together, It was like Developers and Operations coming together”

  • Everybody is trying to get things out fast, sometimes we forget about security
  • Structural efficiency and optimised flow
  • Compliance putting roadblock in flow of pipeline
    • Even worse scanning in production after deployment
  • Compliance guys using Excel, Security using Shell-scripts, Develops and Operations using Code
  • Chef security compliance language – InSpec
    • Insert Sales stuff here
  • ispec.io
  • Lots of pre-written configs available

Immutable SQL Server Clusters – John Bowker (from Xero)

  • Problem
    • Pet Based infrastructure
    • Not in cloud, weeks to deploy new server
    • Hard to update base infrastructure code
  • 110 Prod Servers (2 regions).
  • 1.9PB of Disk
  • Octopus Deploy: SQL Schemas, Also server configs
  • Half of team in NZ, Half in Denver
    • Data Engineers, Infrastructure Engineers, Team Lead, Product Owner
  • Where we were – The Burning Platform
    • Changed mid-Migration from dedicated instances to dedicated Hosts in AWS
    • Big saving on software licensing
  • Advantages
    • Already had Clustered HA
    • Existing automation
    • 6 day team, 15 hours/day due to multiple locations of team
  • Migration had to have no downtime
    • Went with node swaps in cluster
  • Split team. Half doing migration, half creating code/system for the node swaps
  • We learnt
    • Dedicated hosts are cheap
    • Dedicated host automation not so good for Windows
    • Discovery service not so good.
    • Syncing data took up to 24h due to large dataset
    • Powershell debugging is hard (moving away from powershell a bit, but powershell has lots of SQL server stuff built in)
    • AWS services can timeout, allow for this.
  • Things we Built
    • Lots Step Templates in Octopus Deploy
    • Metadata Store for SQL servers – Dynamite (Python, Labda, Flask, DynamoDB) – Hope to Open source
    • Lots of PowerShell Modules
  • Node Swaps going forward
    • Working towards making this completely automated
    • New AMI -> Node swap onto that
    • Avoid upgrade in place or running on old version

Share

Linux Security Summit 2017 Roundup

The 2017 Linux Security Summit (LSS) was held last month in Los Angeles over the 14th and 15th of September.  It was co-located with Open Source Summit North America (OSSNA) and the Linux Plumbers Conference (LPC).

LSS 2017 sign at conference

LSS 2017

Once again we were fortunate to have general logistics managed by the Linux Foundation, allowing the program committee to focus on organizing technical content.  We had a record number of submissions this year and accepted approximately one third of them.  Attendance was very strong, with ~160 attendees — another record for the event.

LSS 2017 Attendees

LSS 2017 Attendees

On the day prior to LSS, attendees were able to access a day of LPC, which featured two tracks with a security focus:

Many thanks to the LPC organizers for arranging the schedule this way and allowing LSS folk to attend the day!

Realtime notes were made of these microconfs via etherpad:

I was particularly interested in the topic of better integrating LSM with containers, as there is an increasingly common requirement for nesting of security policies, where each container may run its own apparently independent security policy, and also a potentially independent security model.  I proposed the approach of introducing a security namespace, where all security interfaces within the kernel are namespaced, including LSM.  It would potentially solve the container use-cases, and also the full LSM stacking case championed by Casey Schaufler (which would allow entirely arbitrary stacking of security modules).

This would be a very challenging project, to say the least, and one which is further complicated by containers not being a first class citizen of the kernel.   This leads to security policy boundaries clashing with semantic functional boundaries e.g. what does it mean from a security policy POV when you have namespaced filesystems but not networking?

Discussion turned to the idea that it is up to the vendor/user to configure containers in a way which makes sense for them, and similarly, they would also need to ensure that they configure security policy in a manner appropriate to that configuration.  I would say this means that semantic responsibility is pushed to the user with the kernel largely remaining a set of composable mechanisms, in relation to containers and security policy.  This provides a great deal of flexibility, but requires those building systems to take a great deal of care in their design.

There are still many issues to resolve, both upstream and at the distro/user level, and I expect this to be an active area of Linux security development for some time.  There were some excellent followup discussions in this area, including an approach which constrains the problem space. (Stay tuned)!

A highlight of the TPMs session was an update on the TPM 2.0 software stack, by Philip Tricca and Jarkko Sakkinen.  The slides may be downloaded here.  We should see a vastly improved experience over TPM 1.x with v2.0 hardware capabilities, and the new software stack.  I suppose the next challenge will be TPMs in the post-quantum era?

There were further technical discussions on TPMs and container security during subsequent days at LSS.  Bringing the two conference groups together here made for a very productive event overall.

TPMs microconf at LPC with Philip Tricca presenting on the 2.0 software stack.

This year, due to the overlap with LPC, we unfortunately did not have any LWN coverage.  There are, however, excellent writeups available from attendees:

There were many awesome talks.

The CII Best Practices Badge presentation by David Wheeler was an unexpected highlight for me.  CII refers to the Linux Foundation’s Core Infrastructure Initiative , a preemptive security effort for Open Source.  The Best Practices Badge Program is a secure development maturity model designed to allow open source projects to improve their security in an evolving and measurable manner.  There’s been very impressive engagement with the project from across open source, and I believe this is a critically important effort for security.

CII Bade Project adoption (from David Wheeler’s slides).

During Dan Cashman’s talk on SELinux policy modularization in Android O,  an interesting data point came up:

We of course expect to see application vulnerability mitigations arising from Mandatory Access Control (MAC) policies (SELinux, Smack, and AppArmor), but if you look closely this refers to kernel vulnerabilities.   So what is happening here?  It turns out that a side effect of MAC policies, particularly those implemented in tightly-defined environments such as Android, is a reduction in kernel attack surface.  It is generally more difficult to reach such kernel vulnerabilities when you have MAC security policies.  This is a side-effect of MAC, not a primary design goal, but nevertheless appears to be very effective in practice!

Another highlight for me was the update on the Kernel Self Protection Project lead by Kees, which is now approaching its 2nd anniversary, and continues the important work of hardening the mainline Linux kernel itself against attack.  I would like to also acknowledge the essential and original research performed in this area by grsecurity/PaX, from which this mainline work draws.

From a new development point of view, I’m thrilled to see the progress being made by Mickaël Salaün, on Landlock LSM, which provides unprivileged sandboxing via seccomp and LSM.  This is a novel approach which will allow applications to define and propagate their own sandbox policies.  Similar concepts are available in other OSs such as OSX (seatbelt) and BSD (pledge).  The great thing about Landlock is its consolidation of two existing Linux kernel security interfaces: LSM and Seccomp.  This ensures re-use of existing mechanisms, and aids usability by utilizing already familiar concepts for Linux users.

Overall I found it to be an incredibly productive event, with many new and interesting ideas arising and lots of great collaboration in the hallway, lunch, and dinner tracks.

Slides from LSS may be found linked to the schedule abstracts.

We did not have a video sponsor for the event this year, and we’ll work on that again for next year’s summit.  We have discussed holding LSS again next year in conjunction with OSSNA, which is expected to be in Vancouver in August.

We are also investigating a European LSS in addition to the main summit for 2018 and beyond, as a way to help engage more widely with Linux security folk.  Stay tuned for official announcements on these!

Thanks once again to the awesome event staff at LF, especially Jillian Hall, who ensured everything ran smoothly.  Thanks also to the program committee who review, discuss, and vote on every proposal, ensuring that we have the best content for the event, and who work on technical planning for many months prior to the event.  And of course thanks to the presenters and attendees, without whom there would literally and figuratively be no event :)

See you in 2018!

 

Stone Axes and Aboriginal Stories from Victoria

In the most recent edition of Australian Archaeology, the journal of the Australian Archaeological Association, there is a paper examining the exchange of stone axes in Victoria and correlating these patterns of exchange with Aboriginal stories in the 19th century. This paper is particularly timely with the passing of legislation in the Victorian Parliament on […]

September 28, 2017

Process Monitoring

Since forking the Mon project to etbemon [1] I’ve been spending a lot of time working on the monitor scripts. Actually monitoring something is usually quite easy, deciding what to monitor tends to be the hard part. The process monitoring script ps.monitor is the one I’m about to redesign.

Here are some of my ideas for monitoring processes. Please comment if you have any suggestions for how do do things better.

For people who don’t use mon, the monitor scripts return 0 if everything is OK and 1 if there’s a problem along with using stdout to display an error message. While I’m not aware of anyone hooking mon scripts into a different monitoring system that’s going to be easy to do. One thing I plan to work on in the future is interoperability between mon and other systems such as Nagios.

Basic Monitoring

ps.monitor tor:1-1 master:1-2 auditd:1-1 cron:1-5 rsyslogd:1-1 dbus-daemon:1- sshd:1- watchdog:1-2

I’m currently planning some sort of rewrite of the process monitoring script. The current functionality is to have a list of process names on the command line with minimum and maximum numbers for the instances of the process in question. The above is a sample of the configuration of the monitor. There are some limitations to this, the “master” process in this instance refers to the main process of Postfix, but other daemons use the same process name (it’s one of those names that’s wrong because it’s so obvious). One obvious solution to this is to give the option of specifying the full path so that /usr/lib/postfix/sbin/master can be differentiated from all the other programs named master.

The next issue is processes that may run on behalf of multiple users. With sshd there is a single process to accept new connections running as root and a process running under the UID of each logged in user. So the number of sshd processes running as root will be one greater than the number of root login sessions. This means that if a sysadmin logs in directly as root via ssh (which is controversial and not the topic of this post – merely something that people do which I have to support) and the master process then crashes (or the sysadmin stops it either accidentally or deliberately) there won’t be an alert about the missing process. Of course the correct thing to do is to have a monitor talk to port 22 and look for the string “SSH-2.0-OpenSSH_”. Sometimes there are multiple instances of a daemon running under different UIDs that need to be monitored separately. So obviously we need the ability to monitor processes by UID.

In many cases process monitoring can be replaced by monitoring of service ports. So if something is listening on port 25 then it probably means that the Postfix “master” process is running regardless of what other “master” processes there are. But for my use I find it handy to have multiple monitors, if I get a Jabber message about being unable to send mail to a server immediately followed by a Jabber message from that server saying that “master” isn’t running I don’t need to fully wake up to know where the problem is.

SE Linux

One feature that I want is monitoring SE Linux contexts of processes in the same way as monitoring UIDs. While I’m not interested in writing tests for other security systems I would be happy to include code that other people write. So whatever I do I want to make it flexible enough to work with multiple security systems.

Transient Processes

Most daemons have a second process of the same name running during the startup process. This means if you monitor for exactly 1 instance of a process you may get an alert about 2 processes running when “logrotate” or something similar restarts the daemon. Also you may get an alert about 0 instances if the check happens to run at exactly the wrong time during the restart. My current way of dealing with this on my servers is to not alert until the second failure event with the “alertafter 2” directive. The “failure_interval” directive allows specifying the time between checks when the monitor is in a failed state, setting that to a low value means that waiting for a second failure result doesn’t delay the notification much.

To deal with this I’ve been thinking of making the ps.monitor script automatically check again after a specified delay. I think that solving the problem with a single parameter to the monitor script is better than using 2 configuration directives to mon to work around it.

CPU Use

Mon currently has a loadavg.monitor script that to check the load average. But that won’t catch the case of a single process using too much CPU time but not enough to raise the system load average. Also it won’t catch the case of a CPU hungry process going quiet (EG when the SETI at Home server goes down) while another process goes into an infinite loop. One way of addressing this would be to have the ps.monitor script have yet another configuration option to monitor CPU use, but this might get confusing. Another option would be to have a separate script that alerts on any process that uses more than a specified percentage of CPU time over it’s lifetime or over the last few seconds unless it’s in a whitelist of processes and users who are exempt from such checks. Probably every regular user would be exempt from such checks because you never know when they will run a file compression program. Also there is a short list of daemons that are excluded (like BOINC) and system processes (like gzip which is run from several cron jobs).

Monitoring for Exclusion

A common programming mistake is to call setuid() before setgid() which means that the program doesn’t have permission to call setgid(). If return codes aren’t checked (and people who make such rookie mistakes tend not to check return codes) then the process keeps elevated permissions. Checking for processes running as GID 0 but not UID 0 would be handy. As an aside a quick examination of a Debian/Testing workstation didn’t show any obvious way that a process with GID 0 could gain elevated privileges, but that could change with one chmod 770 command.

On a SE Linux system there should be only one process running with the domain init_t. Currently that doesn’t happen in Stretch systems running daemons such as mysqld and tor due to policy not matching the recent functionality of systemd as requested by daemon service files. Such issues will keep occurring so we need automated tests for them.

Automated tests for configuration errors that might impact system security is a bigger issue, I’ll probably write a separate blog post about it.

I think I found a bug in python's unittest.mock library

Mocking is a pretty common thing to do in unit tests covering OpenStack Nova code. Over the years we've used various mock libraries to do that, with the flavor de jour being unittest.mock. I must say that I strongly prefer unittest.mock to the old mox code we used to write, but I think I just accidentally found a fairly big bug.

The problem is that python mocks are magical. Its an object where you can call any method name, and the mock will happily pretend it has that method, and return None. You can then later ask what "methods" were called on the mock.

However, you use the same mock object later to make assertions about what was called. Herein is the problem -- the mock object doesn't know if you're the code under test, or the code that's making assertions. So, if you fat finger the assertion in your test code, the assertion will just quietly map to a non-existent method which returns None, and your code will pass.

Here's an example:

    #!/usr/bin/python3
    
    from unittest import mock
    
    
    class foo(object):
        def dummy(a, b):
            return a + b
    
    
    @mock.patch.object(foo, 'dummy')
    def call_dummy(mock_dummy):
        f = foo()
        f.dummy(1, 2)
    
        print('Asserting a call should work if the call was made')
        mock_dummy.assert_has_calls([mock.call(1, 2)])
        print('Assertion for expected call passed')
    
        print()
        print('Asserting a call should raise an exception if the call wasn\'t made')
        mock_worked = False
        try:
            mock_dummy.assert_has_calls([mock.call(3, 4)])
        except AssertionError as e:
            mock_worked = True
            print('Expected failure, %s' % e)
    
        if not mock_worked:
            print('*** Assertion should have failed ***')
    
        print()
        print('Asserting a call where the assertion has a typo should fail, but '
              'doesn\'t')
        mock_worked = False
        try:
            mock_dummy.typo_assert_has_calls([mock.call(3, 4)])
        except AssertionError as e:
            mock_worked = True
            print('Expected failure, %s' % e)
            print()
    
        if not mock_worked:
            print('*** Assertion should have failed ***')
            print(mock_dummy.mock_calls)
            print()
    
    
    if __name__ == '__main__':
        call_dummy()
    


If I run that code, I get this:

    $ python3 mock_assert_errors.py 
    Asserting a call should work if the call was made
    Assertion for expected call passed
    
    Asserting a call should raise an exception if the call wasn't made
    Expected failure, Calls not found.
    Expected: [call(3, 4)]
    Actual: [call(1, 2)]
    
    Asserting a call where the assertion has a typo should fail, but doesn't
    *** Assertion should have failed ***
    [call(1, 2), call.typo_assert_has_calls([call(3, 4)])]
    


So, we should have been told that typo_assert_has_calls isn't a thing, but we didn't notice because it silently failed. I discovered this when I noticed an assertion with a (smaller than this) typo in its call in a code review yesterday.

I don't really have a solution to this right now (I'm home sick and not thinking straight), but it would be interesting to see what other people think.

Tags for this post: python unittest.mock mock testing
Related posts: Implementing SCP with paramiko; Packet capture in python; A pythonic example of recording metrics about ephemeral scripts with prometheus; mbot: new hotness in Google Talk bots; Starfish Prime; Calculating a SSH host key with paramiko

Comment

September 24, 2017

What Makes Humans Different From Most Other Mammals?

Well, there are several things that makes us different from other mammals – although perhaps fewer than one might think. We are not unique in using tools, in fact we discover more animals that use tools all the time – even fish! We pride ourselves on being a “moral animal”, however fairness, reciprocity, empathy and […]

Drupal Puppies

Over the years Drupal distributions, or distros as they're more affectionately known, have evolved a lot. We started off passing around database dumps. Eventually we moved onto using installations profiles and features to share par-baked sites.

There are some signs that distros aren't working for people using them. Agencies often hack a distro to meet client requirements. This happens because it is often difficult to cleanly extend a distro. A content type might need extra fields or the logic in an alter hook may not be desired. This makes it difficult to maintain sites built on distros. Other times maintainers abandon their distributions. This leaves site owners with an unexpected maintenance burden.

We should recognise how people are using distros and try to cater to them better. My observations suggest there are 2 types of Drupal distributions; starter kits and targeted products.

Targeted products are easier to deal with. Increasingly monetising targeted distro products is done through a SaaS offering. The revenue can funds the ongoing development of the product. This can help ensure the project remains sustainable. There are signs that this is a viable way of building Drupal 8 based products. We should be encouraging companies to embrace a strategy built around open SaaS. Open Social is a great example of this approach. Releasing the distros demonstrates a commitment to the business model. Often the secret sauce isn't in the code, it is the team and services built around the product.

Many Drupal 7 based distros struggled to articulate their use case. It was difficult to know if they were a product, a demo or a community project that you extend. Open Atrium and Commerce Kickstart are examples of distros with an identity crisis. We need to reconceptualise most distros as "starter kits" or as I like to call them "puppies".

Why puppies? Once you take a puppy home it becomes your responsibility. Starter kits should be the same. You should never assume that a starter kit will offer an upgrade path from one release to the next. When you install a starter kit you are responsible for updating the modules yourself. You need to keep track of security releases. If your puppy leaves a mess on the carpet, no one else will clean it up.

Sites build on top of a starter kit should diverge from the original version. This shouldn't only be an expectation, it should be encouraged. Installing a starter kit is the starting point of building a unique fork.

Project pages should clearly state that users are buying a puppy. Prospective puppy owners should know if they're about to take home a little lap dog or one that will grow to the size of a pony that needs daily exercise. Puppy breeders (developers) should not feel compelled to do anything once releasing the puppy. That said, most users would like some documentation.

I know of several agencies and large organisations that are making use of starter kits. Let's support people who are adopting this approach. As a community we should acknowledge that distros aren't working. We should start working out how best to manage the transition to puppies.

September 23, 2017

On Equal Rights

This is probably old news now, but I only saw it this morning, so here we go:

In case that embedded tweet doesn’t show up properly, that’s an editorial in the NT News which says:

Voting papers have started to drop through Territory mailboxes for the marriage equality postal vote and I wanted to share with you a list of why I’ll be voting yes.

1. I’m not an arsehole.

This resulted in predictable comments along the lines of “oh, so if I don’t share your views, I’m an arsehole?”

I suppose it’s unlikely that anyone who actually needs to read and understand what I’m about to say will do so, but just in case, I’ll lay this out as simply as I can:

  • A personal belief that marriage is a thing that can only happen between a man and a woman does not make you an arsehole (it might make you on the wrong side of history, or a lot of other things, but it does not necessarily make you an arsehole).
  • Voting “no” to marriage equality is what makes you an arsehole.

The survey says “Should the law be changed to allow same-sex couples to marry?” What this actually means is, “Should same-sex couples have the same rights under law as everyone else?”

If you believe everyone should have the same rights under law, you need to vote yes regardless of what you, personally, believe the word “marriage” actually means – this is to make sure things like “next of kin” work the way the people involved in a relationship want them to.

If you believe that there are minorities that should not have the same rights under law as everyone else, then I’m sorry, but you’re an arsehole.

(Personally I think the Marriage Act should be ditched entirely in favour of a Civil Unions Act – that way the word “marriage” could go back to simply meaning whatever it means to the individuals being married, and to their god(s) if they have any – but this should in no way detract from the above. Also, this vote shouldn’t have happened in the first place; our elected representatives should have done their bloody jobs and fixed the legislation already.)

Converting Mbox to Maildir

MBox is the original and ancient format for storing mail on Unix systems, it consists of a single file per user under /var/spool/mail that has messages concatenated. Obviously performance is very poor when deleting messages from a large mail store as the entire file has to be rewritten. Maildir was invented for Qmail by Dan Bernstein and has a single message per file giving fast deletes among other performance benefits. An ongoing issue over the last 20 years has been converting Mbox systems to Maildir. The various ways of getting IMAP to work with Mbox only made this more complex.

The Dovecot Wiki has a good page about converting Mbox to Maildir [1]. If you want to keep the same message UIDs and the same path separation characters then it will be a complex task. But if you just want to copy a small number of Mbox accounts to an existing server then it’s a bit simpler.

Dovecot has a mb2md.pl script to convert folders [2].

cd /var/spool/mail
mkdir -p /mailstore/example.com
for U in * ; do
  ~/mb2md.pl -s $(pwd)/$U -d /mailstore/example.com/$U
done

To convert the inboxes shell code like the above is needed. If the users don’t have IMAP folders (EG they are just POP users or use local Unix MUAs) then that’s all you need to do.

cd /home
for DIR in */mail ; do
  U=$(echo $DIR| cut -f1 -d/)
  cd /home/$DIR
  for FOLDER in * ; do
    ~/mb2md.pl -s $(pwd)/$FOLDER -d /mailstore/example.com/$U/.$FOLDER
  done
  cp .subscriptions /mailstore/example.com/$U/ subscriptions
done

Some shell code like the above will convert the IMAP folders to Maildir format. The end result is that the users will have to download all the mail again as their MUA will think that every message had been deleted and replaced. But as all servers with significant amounts of mail or important mail were probably converted to Maildir a decade ago this shouldn’t be a problem.

LUV Main October 2017 Meeting: The Tor software and network

Oct 3 2017 18:30
Oct 3 2017 20:30
Oct 3 2017 18:30
Oct 3 2017 20:30
Location: 
Mail Exchange Hotel, 688 Bourke St, Melbourne VIC 3000

PLEASE NOTE NEW LOCATION

Tuesday, October 3, 2017
6:30 PM to 8:30 PM
Mail Exchange Hotel
688 Bourke St, Melbourne VIC 3000

Speakers:

  • Russell Coker, Tor

Tor is free software and an open network that helps you defend against traffic analysis, a form of network surveillance that threatens personal freedom and privacy, confidential business activities and relationships, and state security.

Russell Coker has done lots of Linux development over the years, mostly involved with Debian.

Mail Exchange Hotel, 688 Bourke St, Melbourne VIC 3000

Food and drinks will be available on premises.

Linux Users of Victoria is a subcommittee of Linux Australia.

October 3, 2017 - 18:30

LUV October 2017 Workshop: Hands-on with Tor

Oct 21 2017 12:30
Oct 21 2017 16:30
Oct 21 2017 12:30
Oct 21 2017 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

Hands-on with Tor

Following on from Russell Coker's well-attended Tor presentation at the October main meeting, he will cover torbrowser-launcher, torchat, ssh (as an example of a traditionally non-tor app that can run with it), and how to write a basic torchat in shell.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121 (enter via the garage on Jonas St.) Late arrivals, please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria is a subcommittee of Linux Australia.

October 21, 2017 - 12:30

read more

September 22, 2017

Stupid Solutions to Stupid Problems: Hardcoding Your SSH Key in the Kernel

The "problem"

I'm currently working on firmware and kernel support for OpenCAPI on POWER9.

I've recently been allocated a machine in the lab for development purposes. We use an internal IBM tool running on a secondary machine that triggers hardware initialisation procedures, then loads a specified skiboot firmware image, a kernel image, and a root file system directly into RAM. This allows us to get skiboot and Linux running without requiring the usual hostboot initialisation and gives us a lot of options for easier tinkering, so it's super-useful for our developers working on bringup.

When I got access to my machine, I figured out the necessary scripts, developed a workflow, and started fixing my code... so far, so good.

One day, I was trying to debug something and get logs off the machine using ssh and scp, when I got frustrated with having to repeatedly type in our ultra-secret, ultra-secure root password, abc123. So, I ran ssh-copy-id to copy over my public key, and all was good.

Until I rebooted the machine, when strangely, my key stopped working. It took me longer than it should have to realise that this is an obvious consequence of running entirely from an initrd that's reloaded every boot...

The "solution"

I mentioned something about this to Jono, my housemate/partner-in-stupid-ideas, one evening a few weeks ago. We decided that clearly, the best way to solve this problem was to hardcode my SSH public key in the kernel.

This would definitely be the easiest and most sensible way to solve the problem, as opposed to, say, just keeping my own copy of the root filesystem image. Or asking Mikey, whose desk is three metres away from mine, whether he could use his write access to add my key to the image. Or just writing a wrapper around sshpass...

One Tuesday afternoon, I was feeling bored...

The approach

The SSH daemon looks for authorised public keys in ~/.ssh/authorized_keys, so we need to have a read of /root/.ssh/authorized_keys return a specified hard-coded string.

I did a bit of investigation. My first thought was to put some kind of hook inside whatever filesystem driver was being used for the root. After some digging, I found out that the filesystem type rootfs, as seen in mount, is actually backed by the tmpfs filesystem. I took a look around the tmpfs code for a while, but didn't see any way to hook in a fake file without a lot of effort - the tmpfs code wasn't exactly designed with this in mind.

I thought about it some more - what would be the easiest way to create a file such that it just returns a string?

Then I remembered sysfs, the filesystem normally mounted at /sys, which is used by various kernel subsystems to expose configuration and debugging information to userspace in the form of files. The sysfs API allows you to define a file and specify callbacks to handle reads and writes to the file.

That got me thinking - could I create a file in /sys, and then use a bind mount to have that file appear where I need it in /root/.ssh/authorized_keys? This approach seemed fairly straightforward, so I decided to give it a try.

First up, creating a pseudo-file. It had been a while since the last time I'd used the sysfs API...

sysfs

The sysfs pseudo file system was first introduced in Linux 2.6, and is generally used for exposing system and device information.

Per the sysfs documentation, sysfs is tied in very closely with the kobject infrastructure. sysfs exposes kobjects as directories, containing "attributes" represented as files. The kobject infrastructure provides a way to define kobjects representing entities (e.g. devices) and ksets which define collections of kobjects (e.g. devices of a particular type).

Using kobjects you can do lots of fancy things such as sending events to userspace when devices are hotplugged - but that's all out of the scope of this post. It turns out there's some fairly straightforward wrapper functions if all you want to do is create a kobject just to have a simple directory in sysfs.

#include <linux/kobject.h>

static int __init ssh_key_init(void)
{
        struct kobject *ssh_kobj;
        ssh_kobj = kobject_create_and_add("ssh", NULL);
        if (!ssh_kobj) {
                pr_err("SSH: kobject creation failed!\n");
                return -ENOMEM;
        }
}
late_initcall(ssh_key_init);

This creates and adds a kobject called ssh. And just like that, we've got a directory in /sys/ssh/!

The next thing we have to do is define a sysfs attribute for our authorized_keys file. sysfs provides a framework for subsystems to define their own custom types of attributes with their own metadata - but for our purposes, we'll use the generic bin_attribute attribute type.

#include <linux/sysfs.h>

const char key[] = "PUBLIC KEY HERE...";

static ssize_t show_key(struct file *file, struct kobject *kobj,
                        struct bin_attribute *bin_attr, char *to,
                        loff_t pos, size_t count)
{
        return memory_read_from_buffer(to, count, &pos, key, bin_attr->size);
}

static const struct bin_attribute authorized_keys_attr = {
        .attr = { .name = "authorized_keys", .mode = 0444 },
        .read = show_key,
        .size = sizeof(key)
};

We provide a simple callback, show_key(), that copies the key string into the file's buffer, and we put it in a bin_attribute with the appropriate name, size and permissions.

To actually add the attribute, we put the following in ssh_key_init():

int rc;
rc = sysfs_create_bin_file(ssh_kobj, &authorized_keys_attr);
if (rc) {
        pr_err("SSH: sysfs creation failed, rc %d\n", rc);
        return rc;
}

Woo, we've now got /sys/ssh/authorized_keys! Time to move on to the bind mount.

Mounting

Now that we've got a directory with the key file in it, it's time to figure out the bind mount.

Because I had no idea how any of the file system code works, I started off by running strace on mount --bind ~/tmp1 ~/tmp2 just to see how the userspace mount tool uses the mount syscall to request the bind mount.

execve("/bin/mount", ["mount", "--bind", "/home/ajd/tmp1", "/home/ajd/tmp2"], [/* 18 vars */]) = 0

...

mount("/home/ajd/tmp1", "/home/ajd/tmp2", 0x18b78bf00, MS_MGC_VAL|MS_BIND, NULL) = 0

The first and second arguments are the source and target paths respectively. The third argument, looking at the signature of the mount syscall, is a pointer to a string with the file system type. Because this is a bind mount, the type is irrelevant (upon further digging, it turns out that this particular pointer is to the string "none").

The fourth argument is where we specify the flags bitfield. MS_MGC_VAL is a magic value that was required before Linux 2.4 and can now be safely ignored. MS_BIND, as you can probably guess, signals that we want a bind mount.

(The final argument is used to pass file system specific data - as you can see it's ignored here.)

Now, how is the syscall actually handled on the kernel side? The answer is found in fs/namespace.c.

SYSCALL_DEFINE5(mount, char __user *, dev_name, char __user *, dir_name,
                char __user *, type, unsigned long, flags, void __user *, data)
{
        int ret;

        /* ... copy parameters from userspace memory ... */

        ret = do_mount(kernel_dev, dir_name, kernel_type, flags, options);

        /* ... cleanup ... */
}

So in order to achieve the same thing from within the kernel, we just call do_mount() with exactly the same parameters as the syscall uses:

rc = do_mount("/sys/ssh", "/root/.ssh", "sysfs", MS_BIND, NULL);
if (rc) {
        pr_err("SSH: bind mount failed, rc %d\n", rc);
        return rc;
}

...and we're done, right? Not so fast:

SSH: bind mount failed, rc -2

-2 is ENOENT - no such file or directory. For some reason, we can't find /sys/ssh... of course, that would be because even though we've created the sysfs entry, we haven't actually mounted sysfs on /sys.

rc = do_mount("sysfs", "/sys", "sysfs",
              MS_NOSUID | MS_NOEXEC | MS_NODEV, NULL);

At this point, my key worked!

Note that this requires that your root file system has an empty directory created at /sys to be the mount point. Additionally, in a typical Linux distribution environment (as opposed to my hardware bringup environment), your initial root file system will contain an init script that mounts your real root file system somewhere and calls pivot_root() to switch to the new root file system. At that point, the bind mount won't be visible from children processes using the new root - I think this could be worked around but would require some effort.

Kconfig

The final piece of the puzzle is building our new code into the kernel image.

To allow us to switch this important functionality on and off, I added a config option to fs/Kconfig:

config SSH_KEY
        bool "Andrew's dumb SSH key hack"
        default y
        help
          Hardcode an SSH key for /root/.ssh/authorized_keys.

          This is a stupid idea. If unsure, say N.

This will show up in make menuconfig under the File systems menu.

And in fs/Makefile:

obj-$(CONFIG_SSH_KEY)           += ssh_key.o

If CONFIG_SSH_KEY is set to y, obj-$(CONFIG_SSH_KEY) evaluates to obj-y and thus ssh-key.o gets compiled. Conversely, obj-n is completely ignored by the build system.

I thought I was all done... then Andrew suggested I make the contents of the key configurable, and I had to oblige. Conveniently, Kconfig options can also be strings:

config SSH_KEY_VALUE
        string "Value for SSH key"
        depends on SSH_KEY
        help
          Enter in the content for /root/.ssh/authorized_keys.

Including the string in the C file is as simple as:

const char key[] = CONFIG_SSH_KEY_VALUE;

And there we have it, a nicely configurable albeit highly limited kernel SSH backdoor!

Conclusion

I've put the full code up on GitHub for perusal. Please don't use it, I will be extremely disappointed in you if you do.

Thanks to Jono for giving me stupid ideas, and the rest of OzLabs for being very angry when they saw the disgusting things I was doing.

Comments and further stupid suggestions welcome!

NCSI - Nice Network You've Got There

A neat piece of kernel code dropped into my lap recently, and as a way of processing having to inject an entire network stack into by brain in less-than-ideal time I thought we'd have a look at it here: NCSI!

NCSI - Not the TV Show

NCSI stands for Network Controller Sideband Interface, and put most simply it is a way for a management controller (eg. a BMC like those found on our OpenPOWER machines) to share a single physical network interface with a host machine. Instead of two distinct network interfaces you plug in a single cable and both the host and the BMC have network connectivity.

NCSI-capable network controllers achieve this by filtering network traffic as it arrives and determining if it is host- or BMC-bound. To know how to do this the BMC needs to tell the network controller what to look out for, and from a Linux driver perspective this the focus of the NCSI protocol.

NCSI Overview

Hi My Name Is 70:e2:84:14:24:a1

The major components of what NCSI helps facilitate are:

  • Network Controllers, known as 'Packages' in this context. There may be multiple separate packages which contain one or more Channels.
  • Channels, most easily thought of as the individual physical network interfaces. If a package is the network card, channels are the individual network jacks. (Somewhere a pedant's head is spinning in circles).
  • Management Controllers, or our BMC, with their own network interfaces. Hypothetically there can be multiple management controllers in a single NCSI system, but I've not come across such a setup yet.

NCSI is the medium and protocol via which these components communicate.

NCSI Packages

The interface between Management Controller and one or more Packages carries both general network traffic to/from the Management Controller as well as NCSI traffic between the Management Controller and the Packages & Channels. Management traffic is differentiated from regular traffic via the inclusion of a special NCSI tag inserted in the Ethernet frame header. These management commands are used to discover and configure the state of the NCSI packages and channels.

If a BMC's network interface is configured to use NCSI, as soon as the interface is brought up NCSI gets to work finding and configuring a usable channel. The NCSI driver at first glance is an intimidating combination of state machines and packet handlers, but with enough coffee it can be represented like this:

NCSI State Diagram

Without getting into the nitty gritty details the overall process for configuring a channel enough to get packets flowing is fairly straightforward:

  • Find available packages.
  • Find each package's available channels.
  • (At least in the Linux driver) select a channel with link.
  • Put this channel into the Initial Config State. The Initial Config State is where all the useful configuration occurs. Here we find out what the selected channel is capable of and its current configuration, and set it up to recognise the traffic we're interested in. The first and most basic way of doing this is configuring the channel to filter traffic based on our MAC address.
  • Enable the channel and let the packets flow.

At this point NCSI takes a back seat to normal network traffic, transmitting a "Get Link Status" packet at regular intervals to monitor the channel.

AEN Packets

Changes can occur from the package side too; the NCSI package communicates these back to the BMC with Asynchronous Event Notification (AEN) packets. As the name suggests these can occur at any time and the driver needs to catch and handle these. There are different types but they essentially boil down to changes in link state, telling the BMC the channel needs to be reconfigured, or to select a different channel. These are only transmitted once and no effort is made to recover lost AEN packets - another good reason for the NCSI driver to periodically monitor the channel.

Filtering

Each channel can be configured to filter traffic based on MAC address, broadcast traffic, multicast traffic, and VLAN tagging. Associated with each of these filters is a filter table which can hold a finite number of entries. In the case of the VLAN filter each channel could match against 15 different VLAN IDs for example, but in practice the physical device will likely support less. Indeed the popular BCM5718 controller supports only two!

This is where I dived into NCSI. The driver had a lot of the pieces for configuring VLAN filters but none of it was actually hooked up in the configure state, and didn't have a way of actually knowing which VLAN IDs were meant to be configured on the interface. The bulk of that work appears in this commit where we take advantage of some useful network stack callbacks to get the VLAN configuration and set them during the configuration state. Getting to the configuration state at some arbitrary time and then managing to assign multiple IDs was the trickiest bit, and is something I'll be looking at simplifying in the future.


NCSI! A neat way to give physically separate users access to a single network controller, and if it works right you won't notice it at all. I'll surely be spending more time here (fleshing out the driver's features, better error handling, and making the state machine a touch more readable to start, and I haven't even mentioned HWA), so watch this space!

September 17, 2017

Those Dirty Peasants!

It is fairly well known that many Europeans in the 17th, 18th and early 19th centuries did not follow the same routines of hygiene as we do today. There are anecdotal and historical accounts of people being dirty, smelly and generally unhealthy. This was particularly true of the poorer sections of society. The epithet “those […]

September 16, 2017

Trying Drupal

While preparing for my DrupalCamp Belgium keynote presentation I looked at how easy it is to get started with various CMS platforms. For my talk I used Contentful, a hosted content as a service CMS platform and contrasted that to the "Try Drupal" experience. Below is the walk through of both.

Let's start with Contentful. I start off by visiting their website.

Contentful homepage

In the top right corner is a blue button encouraging me to "try for free". I hit the link and I'm presented with a sign up form. I can even use Google or GitHub for authentication if I want.

Contentful signup form

While my example site is being installed I am presented with an overview of what I can do once it is finished. It takes around 30 seconds for the site to be installed.

Contentful installer wait

My site is installed and I'm given some guidance about what to do next. There is even an onboarding tour in the bottom right corner that is waving at me.

Contentful dashboard

Overall this took around a minute and required very little thought. I never once found myself thinking come on hurry up.

Now let's see what it is like to try Drupal. I land on d.o. I see a big prominent "Try Drupal" button, so I click that.

Drupal homepage

I am presented with 3 options. I am not sure why I'm being presented options to "Build on Drupal 8 for Free" or to "Get Started Risk-Free", I just want to try Drupal, so I go with Pantheon.

Try Drupal providers

Like with Contentful I'm asked to create an account. Again I have the option of using Google for the sign up or completing a form. This form has more fields than contentful.

Pantheon signup page

I've created my account and I am expecting to be dropped into a demo Drupal site. Instead I am presented with a dashboard. The most prominent call to action is importing a site. I decide to create a new site.

Pantheon dashboard

I have to now think of a name for my site. This is already feeling like a lot of work just to try Drupal. If I was a busy manager I would have probably given up by this point.

Pantheon create site form

When I submit the form I must surely be going to see a Drupal site. No, sorry. I am given the choice of installing WordPress, yes WordPress, Drupal 8 or Drupal 7. Despite being very confused I go with Drupal 8.

Pantheon choose application page

Now my site is deploying. While this happens there is a bunch of items that update above the progress bar. They're all a bit nerdy, but at least I know something is happening. Why is my only option to visit my dashboard again? I want to try Drupal.

Pantheon site installer page

I land on the dashboard. Now I'm really confused. This all looks pretty geeky. I want to try Drupal not deal with code, connection modes and the like. If I stick around I might eventually click "Visit Development site", which doesn't really feel like trying Drupal.

Pantheon site dashboard

Now I'm asked to select a language. OK so Drupal supports multiple languages, that nice. Let's select English so I can finally get to try Drupal.

Drupal installer, language selection

Next I need to chose an installation profile. What is an installation profile? Which one is best for me?

Drupal installer, choose installation profile

Now I need to create an account. About 10 minutes I already created an account. Why do I need to create another one? I also named my site earlier in the process.

Drupal installer, configuration form part 1
Drupal installer, configuration form part 2

Finally I am dropped into a Drupal 8 site. There is nothing to guide me on what to do next.

Drupal site homepage

I am left with a sense that setting up Contentful is super easy and Drupal is a lot of work. For most people wanting to try Drupal they would have abandoned someway through the process. I would love to see the conversion stats for the try Drupal service. It must miniscule.

It is worth noting that Pantheon has the best user experience of the 3 companies. The process with 1&1 just dumps me at a hosting sign up page. How does that let me try Drupal?

Acquia drops onto a page where you select your role, then you're presented with some marketing stuff and a form to request a demo. That is unless you're running an ad blocker, then when you select your role you get an Ajax error.

The Try Drupal program generates revenue for the Drupal Association. This money helps fund development of the project. I'm well aware that the DA needs money. At the same time I wonder if it is worth it. For many people this is the first experience they have using Drupal.

The previous attempt to have simplytest.me added to the try Drupal page ultimately failed due to the financial implications. While this is disappointing I don't think simplytest.me is necessarily the answer either.

There needs to be some minimum standards for the Try Drupal page. One of the key item is the number of clicks to get from d.o to a working demo site. Without this the "Try Drupal" page will drive people away from the project, which isn't the intention.

If you're at DrupalCon Vienna and want to discuss this and other ways to improve the marketing of Drupal, please attend the marketing sprints.

AttachmentSize
try-contentful-1.png342.82 KB
try-contentful-2.png214.5 KB
try-contentful-3.png583.02 KB
try-contentful-5.png826.13 KB
try-drupal-1.png1.19 MB
try-drupal-2.png455.11 KB
try-drupal-3.png330.45 KB
try-drupal-4.png239.5 KB
try-drupal-5.png203.46 KB
try-drupal-6.png332.93 KB
try-drupal-7.png196.75 KB
try-drupal-8.png333.46 KB
try-drupal-9.png1.74 MB
try-drupal-10.png1.77 MB
try-drupal-11.png1.12 MB
try-drupal-12.png1.1 MB
try-drupal-13.png216.49 KB

September 14, 2017

New Dates for Human Relative + ‘Explorer Classroom’ Resources

During September, National Geographic is featuring the excavations of Homo naledi at Rising Star Cave in South Africa in their Explorer Classroom, in tune with new discoveries and the publishing of dates for this enigmatic little hominid. A Teacher’s Guide and Resources are available and classes can log in to see live updates from the […]

September 10, 2017

Guess the Artefact #3

This week’s Guess the Artefact challenge centres around an artefact used by generations of school children. There are some adults who may even have used these themselves when they were at school. It is interesting to see if modern students can recognise this object and work out how it was used. The picture below comes […]

September 09, 2017

Observing Reliability

Last year I wrote about how great my latest Thinkpad is [1] in response to a discussion about whether a Thinkpad is still the “Rolls Royce” of laptops.

It was a few months after writing that post that I realised that I omitted an important point. After I had that laptop for about a year the DVD drive broke and made annoying clicking sounds all the time in addition to not working. I removed the DVD drive and the result was that the laptop was lighter and used less power without missing any feature that I desired. As I had installed Debian on that laptop by copying the hard drive from my previous laptop I had never used the DVD drive for any purpose. After a while I got used to my laptop being like that and the gaping hole in the side of the laptop where the DVD drive used to be didn’t even register to me. I would prefer it if Lenovo sold Thinkpads in the T series without DVD drives, but it seems that only the laptops with tiny screens are designed to lack DVD drives.

For my use of laptops this doesn’t change the conclusion of my previous post. Now the T420 has been in service for almost 4 years which makes the cost of ownership about $75 per year. $1.50 per week as a tax deductible business expense is very cheap for such a nice laptop. About a year ago I installed a SSD in that laptop, it cost me about $250 from memory and made it significantly faster while also reducing heat problems. The depreciation on the SSD about doubles the cost of ownership of the laptop, but it’s still cheaper than a mobile phone and thus not in the category of things that are expected to last for a long time – while also giving longer service than phones usually do.

One thing that’s interesting to consider is the fact that I forgot about the broken DVD drive when writing about this. I guess every review has an unspoken caveat of “this works well for me but might suck badly for your use case”. But I wonder how many other things that are noteworthy I’m forgetting to put in reviews because they just don’t impact my use. I don’t think that I am unusual in this regard, so reading multiple reviews is the sensible thing to do.

TLS Authentication on Freenode and OFTC

In order to easily authenticate with IRC networks such as OFTC and Freenode, it is possible to use client TLS certificates (also known as SSL certificates). In fact, it turns out that it's very easy to setup both on irssi and on znc.

Generate your TLS certificate

On a machine with good entropy, run the following command to create a keypair that will last for 10 years:

openssl req -nodes -newkey rsa:2048 -keyout user.pem -x509 -days 3650 -out user.pem -subj "/CN=<your nick>"

Then extract your key fingerprint using this command:

openssl x509 -sha1 -noout -fingerprint -in user.pem | sed -e 's/^.*=//;s/://g'

Share your fingerprints with NickServ

On each IRC network, do this:

/msg NickServ IDENTIFY Password1!
/msg NickServ CERT ADD <your fingerprint>

in order to add your fingerprint to the access control list.

Configure ZNC

To configure znc, start by putting the key in the right place:

cp user.pem ~/.znc/users/<your nick>/networks/oftc/moddata/cert/

and then enable the built-in cert plugin for each network in ~/.znc/configs/znc.conf:

<Network oftc>
    ...
            LoadModule = cert
    ...
</Network>
    <Network freenode>
    ...
            LoadModule = cert
    ...
</Network>

Configure irssi

For irssi, do the same thing but put the cert in ~/.irssi/user.pem and then change the OFTC entry in ~/.irssi/config to look like this:

{
  address = "irc.oftc.net";
  chatnet = "OFTC";
  port = "6697";
  use_tls = "yes";
  tls_cert = "~/.irssi/user.pem";
  tls_verify = "yes";
  autoconnect = "yes";
}

and the Freenode one to look like this:

{
  address = "chat.freenode.net";
  chatnet = "Freenode";
  port = "7000";
  use_tls = "yes";
  tls_cert = "~/.irssi/user.pem";
  tls_verify = "yes";
  autoconnect = "yes";
}

That's it. That's all you need to replace password authentication with a much stronger alternative.

September 07, 2017

Linux Plumbers Conference Sessions for Linux Security Summit Attendees

Folks attending the 2017 Linux Security Summit (LSS) next week may be also interested in attending the TPMs and Containers sessions at Linux Plumbers Conference (LPC) on the Wednesday.

The LPC TPMs microconf will be held in the morning and lead by Matthew Garret, while the containers microconf will be run by Stéphane Graber in the afternoon.  Several security topics will be discussed in the containers session, including namespacing and stacking of LSM, and namespacing of IMA.

Attendance on the Wednesday for LPC is at no extra cost for registered attendees of LSS.  Many thanks to the LPC organizers for arranging this!

There will be followup BOF sessions on LSM stacking and namespacing at LSS on Thursday, per the schedule.

This should be a very productive week for Linux security development: see you there!

September 06, 2017

Software Freedom Day 2017 and LUV Annual General Meeting

Sep 16 2017 12:00
Sep 16 2017 18:00
Sep 16 2017 12:00
Sep 16 2017 18:00
Location: 
Electron Workshop, 31 Arden St. North Melbourne

 

Software Freedom Day 2017

It's that time of the year where we celebrate our freedoms in technology and raise a ruckus about all the freedoms that have been eroded away. The time of the year we look at how we might keep our increasingly digital lives under our our own control and prevent prying eyes from seeing things they shouldn't. You guessed it: It's Software Freedom Day!

LUV would like to acknowledge Electron Workshop for the venue.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

September 16, 2017 - 12:00

read more

September 05, 2017

Space station trio returns to Earth: NASA’s Peggy Whitson racks up 665-day record | GeekWire

https://www.geekwire.com/2017/space-station-trio-returns-earth-nasas-peggy-whitson-sets-665-day-record/ NASA astronaut Peggy Whitson and two other spacefliers capped off a record-setting orbital mission with their return from the International Space Station.

September 04, 2017

Understanding BlueStore, Ceph’s New Storage Backend

On June 1, 2017 I presented Understanding BlueStore, Ceph’s New Storage Backend at OpenStack Australia Day Melbourne. As the video is up (and Luminous is out!), I thought I’d take the opportunity to share it, and write up the questions I was asked at the end.

First, here’s the video:

The bit at the start where the audio cut out was me asking “Who’s familiar with Ceph?” At this point, most of the 70-odd people in the room put their hands up. I continued with “OK, so for the two people who aren’t…” then went into the introduction.

After the talk we had a Q&A session, which I’ve paraphrased and generally cleaned up here.

With BlueStore, can you still easily look at the objects like you can through the filesystem when you’re using FileStore?

There’s not a regular filesystem anymore, so you can’t just browse through it. However you can use `ceph-objectstore-tool` to “mount” an offline OSD’s data via FUSE and poke around that way. Some more information about this can be found in Sage Weil’s recent blog post: New in Luminous: BlueStore.

Do you have real life experience with BlueStore for how IOPS performance scales?

We (SUSE) haven’t released performance numbers yet, so I will instead refer you to Sage Weil’s slides from Vault 2017, and Allan Samuel’s slides from SCALE 15x, which together include a variety of performance graphs for different IO patterns and sizes. Also, you can expect to see more about this on the Ceph blog in the coming weeks.

What kind of stress testing has been done for corruption in BlueStore?

It’s well understood by everybody that it’s sort of important to stress test these things and that people really do care if their data goes away. Ceph has a huge battery of integration tests, various of which are run on a regular basis in the upstream labs against Ceph’s master and stable branches, others of which are run less frequently as needed. The various downstreams all also run independent testing and QA.

Wouldn’t it have made sense to try to enhance existing POSIX filesystems such as XFS, to make them do what Ceph needs?

Long answer: POSIX filesystems still need to provide POSIX semantics. Changing the way things work (or adding extensions to do what Ceph needs) in, say, XFS, assuming it’s possible at all, would be a big, long, scary, probably painful project.

Short answer: it’s really a different use case; better to build a storage engine that fits the use case, than shoehorn in one that doesn’t.

Best answer: go read New in Luminous: BlueStore ;-)

September 03, 2017

This Week in HASS – term 3, week 9

OpenSTEM’s ® Understanding Our World® Units are designed to cover 9 weeks of the term, because we understand that life happens. Sports carnivals, excursions and other special events are also necessary parts of the school year and even if the calendar runs according to plan, having a little bit of breathing space at the end of […]

September 01, 2017

memcmp() for POWER8 - part II

This entry is a followup to part I which you should absolutely read here before continuing on.

Where we left off

We concluded that while a vectorised memcmp() is a win, there are some cases where it won't quite perform.

The overhead of enabling ALTIVEC

In the kernel we explicitly don't touch ALTIVEC unless we need to, this means that in the general case we can leave the userspace registers in place and not have do anything to service a syscall for a process.

This means that if we do want to use ALTIVEC in the kernel, there is some setup that must be done. Notably, we must enable the facility (a potentially time consuming move to MSR), save off the registers (if userspace we using them) and an inevitable restore later on.

If all this needs to be done for a memcmp() in the order of tens of bytes then it really wasn't worth it.

There are two reasons that memcmp() might go for a small number of bytes, firstly and trivially detectable is simply that parameter n is small. The other is harder to detect, if the memcmp() is going to fail (return non zero) early then it also wasn't worth enabling ALTIVEC.

Detecting early failures

Right at the start of memcmp(), before enabling ALTIVEC, the first 64 bytes are checked using general purpose registers. Why the first 64 bytes, well why not? In a strange twist of fate 64 bytes happens to be the amount of bytes in four ALTIVEC registers (128 bits per register, so 16 bytes multiplied by 4) and by utter coincidence that happens to be the stride of the ALTIVEC compare loop.

What does this all look like

Well unlike part I the results appear slightly less consistent across three runs of measurement but there are some very key differences with part I. The trends do appear to be the same across all three runs, just less pronounced - why this is is unclear.

The difference between run two and run three clipped at deltas of 1000ns is interesting: Sample 2: Deltas below 1000ns

vs

Sample 3: Deltas below 1000ns

The results are similar except for a spike in the amount of deltas in the unpatched kernel at around 600ns. This is not present in the first sample (deltas1) of data. There are a number of reasons why this spike could have appeared here, it is possible that the kernel or hardware did something under the hood, prefetch could have brought deltas for a memcmp() that would otherwise have yielded a greater delta into the 600ns range.

What these two graphs do both demonstrate quite clearly is that optimisations down at the sub 100ns end have resulted in more sub 100ns deltas for the patched kernel, a significant win over the original data. Zooming out and looking at a graph which includes deltas up to 5000ns shows that the sub 100ns delta optimisations haven't noticeably slowed the performance of long duration memcmp(), Samply 2: Deltas below 5000ns.

Conclusion

The small amount of extra development effort has yielded tangible results in reducing the low end memcmp() times. This second round of data collection and performance analysis only confirms the that for any significant amount of comparison, a vectorised loop is significantly quicker.

The results obtained here show no downside to adopting this approach for all power8 and onwards chips as this new version of the patch solves the performance regression for small compares.

August 30, 2017

An Overview of SSH

SSH (Secure Shell) is secure means, mainly used on Linux and other UNIX-like systems, to access remote systems, designed as a replacement for a variety of insecure protocols (e.g., telnet, rlogin etc). This presentation will cover the core useage of the protocol, its development (SSH-1 and SSH-2), the architecture (client-server, public key authentication), installation and implementation, and some handy elaborations and enhancements, and real and imagined vulnerabilities.
Plenty of examples will be provided throughout along with the opportunity to test the protocol.

read more

August 29, 2017

LUV Main September 2017 Meeting: Cygwin and Virtualbox

Sep 5 2017 18:30
Sep 5 2017 20:30
Sep 5 2017 18:30
Sep 5 2017 20:30
Location: 
The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

Tuesday, September 5, 2017

6:30 PM to 8:30 PM
The Dan O'Connell Hotel
225 Canning Street, Carlton VIC 3053

Speakers:

  • Duncan Roe, Cygwin
  • Steve Roylance, Virtualbox

Cygwin is a large collection of GNU and Open Source tools which provide functionality similar to a Linux distribution on Windows.  It allows easy porting of many Unix programs without the need for extensive changes to the source code.

The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

Food and drinks will be available on premises.

Before and/or after each meeting those who are interested are welcome to join other members for dinner.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

September 5, 2017 - 18:30

read more

August 28, 2017

Creative kid’s piano + vocal composition

An inspirational song from an Australian youngster.  He explains his background at the start.

August 27, 2017

This Week in HASS – term 3, week 8

This week our younger students are putting together a Class Museum, while older students are completing their Scientific Report. Foundation/Prep/Kindy to Year 3 Students in Foundation/Prep/Kindy (Units F.3 and F-1.3), as well as those in Years 1 (Unit 1.3). 2 (Unit 2.3) and 3 (Unit 3.3) are all putting together Class Museums of items of […]

August 24, 2017

Mental Health Resources for New Dads

Right now, one in seven new fathers experiences high levels of psychological distress and as many as one in ten experience depression or anxiety. Often distressed fathers remain unidentified and unsupported due to both a reluctance to seek help for themselves and low levels of community understanding that the transition to parenthood is a difficult period for fathers, as well as mothers.

The project is hoping to both increase understanding of stress and distress in new fathers and encourage new fathers to take action to manage their mental health.

This work is being informed by research commissioned by beyondblue into men’s experiences of psychological distress in the perinatal period.

Informed by the findings of the Healthy Dads research, three projects are underway to provide men with the knowledge, tools and support to stay resilient during the transition to fatherhood.

https://www.medicalert.org.au/news-and-resources/becoming-a-healthy-dad

August 22, 2017

The Attention Economy

In May 2017, James Williams, a former Google employee and doctoral candidate researching design ethics at Oxford University, won the inaugural Nine Dots Prize.

James argues that digital technologies privilege our impulses over our intentions, and are gradually diminishing our ability to engage with the issues we most care about.

Possibly a neat followup on our earlier post on “busy-ness“.

August 20, 2017

This Week in HASS – term 3, week 7

This week students are starting the final sections of their research projects and Scientific Reports. Our younger students are also preparing to set up a Class Museum. Foundation/Prep/Kindy to Year 3 Our youngest students (Unit F.3) also complete a Scientific Report. By becoming familiar with the overall layout and skills associated with the scientific process […]

CNC Z Axis with 150mm or more of travel

Many of the hobby priced CNC machines have limited Z Axis movement. This coupled with limited clearance on the gantry force a limited number of options for work fixtures. For example, it is very unlikely that there will be clearance for a vice on the cutting bed of a cheap machine.

I started tinkering around with a Z Axis assembly which offers around 150mm of travel. The assembly also uses bearing blocks that should help overcome the tensions that drilling and cutting can offer.


The assembly is designed to be as thin as possible. The spindle mount is a little wider which allows easy bolting onto the spindle mount plate which attaches to these bearings and drive nut. The width of the assembly is important because it will limit the travel in the Y axis if it can interact with the gantry in any way.

Construction is mainly done in 1/4 and 1/2 inch 6061 alloy. The black bracket at the bottom is steel. This seemed like a reasonable choice since that bracket was going to be key to holding the weight and attachment to the gantry.

The Z axis shown above needs to be combined with a gantry height extension when attaching to a hobby CNC to be really effective. Using a longer travel Z axis like this would allow higher gantries which combined allow for easier fixturing and also pave the way for a 4/5th axis to fit under the cutter.

August 13, 2017

This Week in HASS – term 3, week 6

This week all our students are hard at work examining the objects they are using for their research projects. For the younger students these are objects that will be used to generate a Class Museum. For the older students, the objects of study relate to their chosen topic in Australian History. Foundation / Prep / […]

August 10, 2017

Tools for talking

I gave a talk a couple of years ago called Tools for Talking.

I'm preparing a new talk, which, in some ways, is a sequel to this one. As part of that prep, I thought it might be useful to write some short summaries of each of the tools outlined here, with links to resources on them.

  • Powerful Non Defensive Communication
  • Non Violent Communication
  • Active Listening
  • Appreciative Inquiry
  • Transactional Analysis
  • The Drama Triangle vs
  • The Empowerment Dynamic
  • The 7 Cs

So I might try to make a start on that over the next week or so.

 

In the meantime, here's the slides:

And here's the video of the presentation at DrupalCon Barcelona

Larger format CNC

Having access to a wood cutting CNC machine that can do a full sheet of plywood at once has led me to an initial project for a large sconce stand. The sconce is 210mm square at the base and the DAR ash I used was 140mm across. This lead to the four edge grain glue ups in the middle of the stand.


The design was created in Fusion 360 by just seeing what might look good. Unfortunately the sketch export as DXF presented some issues on the import side. This was part of why a littler project like this was a good first choice rather than a more complex whole sheet of ply.

To get around the DXF issue the tip was to select a face of a body and create a sketch from that face. Then export the created sketch as DXF which seemed to work much better. I don't know what I had in the original sketch that I created the body from that the DXF export/import didn't like. Maybe the dimensions, maybe the guide lines, hard to know without a bisect. The CNC was using the EnRoute software, so I had to work out how to bounce things from Fusion over to EnRoute and then get some help to reCAM things on that side and setup tabs et al.

One tip for others would be to use the DAR timber to form a glue up before arriving at a facility with a larger cut surface. Fewer pieces means less tabs/bridges and easier reCAM. A preformed blue panel would also have let me used more advanced designs such as n and u slots to connect two pieces instead of edge grains to connect four.

Overall it was a fun build and the owner of the sconce will love having it slightly off the table top so it can more easily be seen.

pristine-tar and git-buildpackage Work-arounds

I recently ran into problems trying to package the latest version of my planetfilter tool.

This is how I was able to temporarily work-around bugs in my tools and still produce a package that can be built reproducibly from source and that contains a verifiable upstream signature.

pristine-tar being is unable to reproduce a tarball

After importing the latest upstream tarball using gbp import-orig, I tried to build the package but ran into this pristine-tar error:

$ gbp buildpackage
gbp:error: Pristine-tar couldn't checkout "planetfilter_0.7.4.orig.tar.gz": xdelta3: target window checksum mismatch: XD3_INVALID_INPUT
xdelta3: normally this indicates that the source file is incorrect
xdelta3: please verify the source file with sha1sum or equivalent
xdelta3 decode failed! at /usr/share/perl5/Pristine/Tar/DeltaTools.pm line 56.
pristine-tar: command failed: pristine-gz --no-verbose --no-debug --no-keep gengz /tmp/user/1000/pristine-tar.mgnaMjnwlk/wrapper /tmp/user/1000/pristine-tar.EV5aXIPWfn/planetfilter_0.7.4.orig.tar.gz.tmp
pristine-tar: failed to generate tarball

So I decided to throw away what I had, re-import the tarball and try again. This time, I got a different pristine-tar error:

$ gbp buildpackage
gbp:error: Pristine-tar couldn't checkout "planetfilter_0.7.4.orig.tar.gz": xdelta3: target window checksum mismatch: XD3_INVALID_INPUT
xdelta3: normally this indicates that the source file is incorrect
xdelta3: please verify the source file with sha1sum or equivalent
xdelta3 decode failed! at /usr/share/perl5/Pristine/Tar/DeltaTools.pm line 56.
pristine-tar: command failed: pristine-gz --no-verbose --no-debug --no-keep gengz /tmp/user/1000/pristine-tar.mgnaMjnwlk/wrapper /tmp/user/1000/pristine-tar.EV5aXIPWfn/planetfilter_0.7.4.orig.tar.gz.tmp
pristine-tar: failed to generate tarball

I filed bug 871938 for this.

As a work-around, I simply symlinked the upstream tarball I already had and then built the package using the tarball directly instead of the upstream git branch:

ln -s ~/deve/remote/planetfilter/dist/planetfilter-0.7.4.tar.gz ../planetfilter_0.7.4.orig.tar.gz
gbp buildpackage --git-tarball-dir=..

Given that only the upstream and master branches are signed, the .delta file on the pristine-tar branch could be fixed at any time in the future by committing a new .delta file once pristine-tar gets fixed. This therefore seems like a reasonable work-around.

git-buildpackage doesn't import the upstream tarball signature

The second problem I ran into was a missing upstream signature after building the package with git-buildpackage:

$ lintian -i planetfilter_0.7.4-1_amd64.changes
E: planetfilter changes: orig-tarball-missing-upstream-signature planetfilter_0.7.4.orig.tar.gz
N: 
N:    The packaging includes an upstream signing key but the corresponding
N:    .asc signature for one or more source tarballs are not included in your
N:    .changes file.
N:    
N:    Severity: important, Certainty: certain
N:    
N:    Check: changes-file, Type: changes
N: 

This problem (and the lintian error I suspect) is fairly new and hasn't been solved yet.

So until gbp import-orig gets proper support for upstream signatures, my work-around was to copy the upstream signature in the export-dir output directory (which I set in ~/.gbp.conf) so that it can be picked up by the final stages of gbp buildpackage:

ln -s ~/deve/remote/planetfilter/dist/planetfilter-0.7.4.tar.gz.asc ../build-area/planetfilter_0.7.4.orig.tar.gz.asc

If there's a better way to do this, please feel free to leave a comment (authentication not required)!

August 07, 2017

NBN Fixed Wireless – Four Years On

It’s getting close to the fourth anniversary of our NBN fixed wireless connection. Over that time, speaking as someone who works from home, it’s been generally quite good. 22-24 Mbps down and 4-4.5 Mbps up is very nice. That said, there have been a few problems along the way, and more recently evenings have become significantly irritating.

There were some initial teething problems, and at least three or four occasions where someone was performing “upgrades” during business hours over the course of several consecutive days. These upgrade periods wouldn’t have affected people who are away at work or school or whatever during the day, as by the time they got home, the connection would have been back up. But for me, I had to either tether my mobile phone to my laptop, or go down to a cafe or friend’s place to get connectivity.

There’s also the icing problem, which occurs a couple of times a year when snow falls below 200-300 metres for a few days. No internet, and also no mobile phone.

These are all relatively isolated incidents though. What’s been happening more recently is our connection speed in the evenings has gone to hell. I don’t tend to do streaming video, and my syncing several GB of software mirrors happens automatically in the wee hours while I’m asleep, so my subjective impression for some time has just been that “things were kinda slower during the evenings” (web browsing, pushing/pulling from already cloned git repos, etc.). I vented about this on Twitter in mid-June but didn’t take any further action at the time.

Several weeks later, on the evening of July 28, I needed to update and rebuild a Ceph package for openSUSE and SLES. The specifics aren’t terribly relevant to this post, but the process (which is reasonably automated) involves running something like `git clone git@github.com:SUSE/ceph.git && cd ceph && git submodule update --init --recursive`, which in turn downloads a few GB of data. I’ve done this several times in the past, and it usually takes an hour, or maybe a bit more. So you start it up, then go make a meal, come back and you’re done.

Not so on that Friday evening. It took six hours.

I ran a couple of speed tests:

I looked at my smokeping graphs:

smokeping-2017-07-28

That’s awfully close to 20% packet loss in the evenings. It happens every night:

smokeping-last-10-days

And it’s been happening for a long time:

smokeping-last-400-days

Right now, as I’m writing this, the last three hours show an average of 15.57% packet loss:

smokeping-last-three-hours

So I’ve finally opened a support ticket with iiNet. We’ll see what they say. It seems unlikely that this is a problem with my equipment, as my neighbour on the same wireless tower has also had noticeable speed problems for at least the last couple of months. I’m guessing it’s either not enough backhaul, or the local NBN wireless tower is underprovisioned (or oversubscribed). I’m leaning towards the latter, as in recent times the signal strength indicators on the NTD flick between two amber and three green lights in the evenings, whereas during the day it’s three green lights all the time.

memcmp() for POWER8

Userspace

When writing C programs in userspace there is libc which does so much of the heavy lifting. One important thing libc provides is portability in performing syscalls, that is, you don't need to know the architectural details of performing a syscall on each architecture your program might be compiled for. Another important feature that libc provides for the average userspace programmer is highly optimised routines to do things that are usually performance critical. It would be extremely inefficient for each userspace programmer if they had to implement even the naive version of these functions let alone optimised versions. Let us take memcmp() for example, I could trivially implement this in C like:

int memcmp(uint8_t *p1, uint8_t *p2, int n)
{
    int i;

    for (i = 0; i < n; i++) {
        if (p1[i] < p2[i])
            return -1;
        if (p1[i] > p2[i])
            return 1;
    }

    return 0;
}

However, while it is incredibly portable it is simply not going to perform, which is why the nice people who write libc have highly optimised ones in assembly for each architecture.

Kernel

When writing code for the Linux kernel, there isn't the luxury of a fully featured libc since it expects (and needs) to be in userspace, therefore we need to implement the features we need ourselves. Linux doesn't need all the features but something like memcmp() is definitely a requirement.

There have been some recent optimisations in glibc from which the kernel could benefit too! The question to be asked is, does the glibc optimised power8_memcmp() actually go faster or is it all smoke and mirrors?

Benchmarking memcmp()

With things like memcmp() it is actually quite easy to choose datasets which can make any implementation look good. For example; the new power8_memcmp() makes use of the vector unit of the power8 processor, in order to do so in the kernel there must be a small amount of setup code so that the rest of the kernel knows that the vector unit has been used and it correctly saves and restores the userspace vector registers. This means that power8_memcmp() has a slightly larger overhead than the current one, so for small compares or compares which are different early on then the newer 'faster' power8_memcmp() might actually not perform as well. For any kind of large compare however, using the vector unit should outperform a CPU register load and compare loop. It is for this reason that I wanted to avoid using micro benchmarks and use a 'real world' test as much as possible.

The biggest user of memcmp() in the kernel, at least on POWER is Kernel Samepage Merging (KSM). KSM provides code to inspect all the pages of a running system to determine if they're identical and deduplicate them if possible. This kind of feature allows for memory overcommit when used in a KVM host environment as guest kernels are likely to have a lot of similar, readonly pages which can be merged with no overhead afterwards. In order to determine if the pages are the same KSM must do a lot of page sized memcmp().

Performance

Performing a lot of page sized memcmp() is the one flaw with this test, the sizes of the memcmp() don't vary, hopefully the data will be 'random' enough that we can still observe differences in the two approaches.

My approach for testing involved getting the delta of ktime_get() across calls to memcmp() in memcmp_pages() (mm/ksm.c). This actually generated massive amounts of data, so, for consistency the following analysis is performed on the first 400MB of deltas collected.

The host was compiled with powernv_defconfig and run out of a ramdisk. For consistency the host was rebooted between each run so as to not have any previous tests affect the next. The host was rebooted a total of six times, the first three with my 'patched' power8_memcmp() kernel was booted the second three times with just my data collection patch applied, the 'vanilla' kernel. Both kernels are based off 4.13-rc3.

Each boot the following script was run and the resulting deltas file saved somewhere before reboot. The command line argument was always 15.

#!/bin/sh

ppc64_cpu --smt=off

#Host actually boots with ksm off but be sure
echo 0 > /sys/kernel/mm/ksm/run

#Scan a lot of pages
echo 999999 > /sys/kernel/mm/ksm/pages_to_scan

echo "Starting QEMUs"
i=0
while [ "$i" -lt "$1" ] ; do
    qemu-system-ppc64 -smp 1 -m 1G -nographic -vga none \
        -machine pseries,accel=kvm,kvm-type=HV \
        -kernel guest.kernel  -initrd guest.initrd \
        -monitor pty -serial pty &
    i=$(expr $i + 1);
done

echo "Letting all the VMs boot"
sleep 30

echo "Turning KSM om"
echo 1 > /sys/kernel/mm/ksm/run

echo "Letting KSM do its thing"
sleep 2m

echo 0 > /sys/kernel/mm/ksm/run

dd if=/sys/kernel/debug/ksm/memcmp_deltas of=deltas bs=4096 count=100

The guest kernel was a pseries_le_defconfig 4.13-rc3 with the same ramdisk the host used. It booted to the login prompt and was left to idle.

Analysis

A variety of histograms were then generated in an attempt to see how the behaviour of memcmp() changed between the two implementations. It should be noted here that the y axis in the following graphs is a log scale as there were a lot of small deltas. The first observation is that the vanilla kernel had more smaller deltas, this is made particularly evident by the 'tally' points which are a running total of all deltas with less than the tally value.

Sample 1 - Deltas below 200ns Graph 1 depicting the vanilla kernel having a greater amount of small (sub 20ns) deltas than the patched kernel. The green points rise faster (left to right) and higher than the yellow points.

Still looking at the tallies, graph 1 also shows that the tally of deltas is very close by the 100ns mark, which means that the overhead of power8_memcmp() is not too great.

The problem with looking at only deltas under 200ns is that the performance results we want, that is, the difference between the algorithms is being masked by things like cache effects. To avoid this problem is may be wise to look at longer running (larger delta) memcmp() calls.

The following graph plots all deltas below 5000ns - still relatively short calls to memcmp() but an interesting trend emerges: Sample 1 - Deltas below 5000ns Graph 2 shows that above 500ns the blue (patched kernel) points appear to have all shifted left with respect to the purple (vanilla kernel) points. This shows that for any memcmp() which will take more than 500ns to get a result it is favourable to use power8_memcmp() and it is only detrimental to use power8_memcmp() if the time will be under 50ns (a conservative estimate).

It is worth noting that graph 1 and graph 2 are generated by combining the first run of data collected from the vanilla and patched kernels. All the deltas for both runs are can be viewed separately here for vanilla and here for patched. Finally, the results from the other four runs look very much identical and provide me with a fair amount of confidence that these results make sense.

Conclusions

It is important to separate possible KSM optimisations with generic memcmp() optimisations, for example, perhaps KSM shouldn't be calling memcmp() if it suspects the first byte will differ. On the other hand, things that power8_memcmp() could do (which it currently doesn't) is check the length parameter and perhaps avoid the overhead of enabling kernel vector if the compare is less than some small amount of bytes.

It does seem like at least for the 'average case' glibcs power8_memcmp() is an improvement over what we have now.

Future work

A second round of data collection and plotting of delta vs position of first byte to differ should confirm these results, this would mean a more invasive patch to KSM.

August 06, 2017

This Week in HASS – term 3, week 5

This week students in all year levels are working on their research project for the term. Our youngest students are looking at items and pictures from the past, while our older students are collecting source material for their project on Australian history. Foundation/Prep/Kindy to Year 3 The focus of this term is an investigation into […]

Time Synchronization with NTP and systemd

I recently ran into problems with generating TOTP 2-factor codes on my laptop. The fact that some of the codes would work and some wouldn't suggested a problem with time keeping on my laptop.

This was surprising since I've been running NTP for a many years and have therefore never had to think about time synchronization. After realizing that ntpd had stopped working on my machine for some reason, I found that systemd provides an easier way to keep time synchronized.

The new systemd time synchronization daemon

On a machine running systemd, there is no need to run the full-fledged ntpd daemon anymore. The built-in systemd-timesyncd can do the basic time synchronization job just fine.

However, I noticed that the daemon wasn't actually running:

$ systemctl status systemd-timesyncd.service 
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/systemd-timesyncd.service.d
           └─disable-with-time-daemon.conf
   Active: inactive (dead)
Condition: start condition failed at Thu 2017-08-03 21:48:13 PDT; 1 day 20h ago
     Docs: man:systemd-timesyncd.service(8)

referring instead to a mysterious "failed condition". Attempting to restart the service did provide more details though:

$ systemctl restart systemd-timesyncd.service 
$ systemctl status systemd-timesyncd.service 
● systemd-timesyncd.service - Network Time Synchronization
   Loaded: loaded (/lib/systemd/system/systemd-timesyncd.service; enabled; vendor preset: enabled)
  Drop-In: /lib/systemd/system/systemd-timesyncd.service.d
           └─disable-with-time-daemon.conf
   Active: inactive (dead)
Condition: start condition failed at Sat 2017-08-05 18:19:12 PDT; 1s ago
           └─ ConditionFileIsExecutable=!/usr/sbin/ntpd was not met
     Docs: man:systemd-timesyncd.service(8)

The above check for the presence of /usr/sbin/ntpd points to a conflict between ntpd and systemd-timesyncd. The solution of course is to remove the former before enabling the latter:

apt purge ntp

Enabling time synchronization with NTP

Once the ntp package has been removed, it is time to enable NTP support in timesyncd.

Start by choosing the NTP server pool nearest you and put it in /etc/systemd/timesyncd.conf. For example, mine reads like this:

[Time]
NTP=ca.pool.ntp.org

before restarting the daemon:

systemctl restart systemd-timesyncd.service 

That may not be enough on your machine though. To check whether or not the time has been synchronized with NTP servers, run the following:

$ timedatectl status
...
 Network time on: yes
NTP synchronized: no
 RTC in local TZ: no

If NTP is not enabled, then you can enable it by running this command:

timedatectl set-ntp true

Once that's done, everything should be in place and time should be kept correctly:

$ timedatectl status
...
 Network time on: yes
NTP synchronized: yes
 RTC in local TZ: no

August 01, 2017

QEMU for ARM Processes

I’m currently doing some embedded work on ARM systems. Having a virtual ARM environment is of course helpful. For the i586 class embedded systems that I run it’s very easy to setup a virtual environment, I just have a chroot run from systemd-nspawn with the --personality=x86 option. I run it on my laptop for my own development and on a server my client owns so that they can deal with the “hit by a bus” scenario. I also occasionally run KVM virtual machines to test the boot image of i586 embedded systems (they use GRUB etc and are just like any other 32bit Intel system).

ARM systems have a different boot setup, there is a uBoot loader that is fairly tightly coupled with the kernel. ARM systems also tend to have more unusual hardware choices. While the i586 embedded systems I support turned out to work well with standard Debian kernels (even though the reference OS for the hardware has a custom kernel) the ARM systems need a special kernel. I spent a reasonable amount of time playing with QEMU and was unable to make it boot from a uBoot ARM image. The Google searches I performed didn’t turn up anything that helped me. If anyone has good references for getting QEMU to work for an ARM system image on an AMD64 platform then please let me know in the comments. While I am currently surviving without that facility it would be a handy thing to have if it was relatively easy to do (my client isn’t going to pay me to spend a week working on this and I’m not inclined to devote that much of my hobby time to it).

QEMU for Process Emulation

I’ve given up on emulating an entire system and now I’m using a chroot environment with systemd-nspawn.

The package qemu-user-static has staticly linked programs for emulating various CPUs on a per-process basis. You can run this as “/usr/bin/qemu-arm-static ./staticly-linked-arm-program“. The Debian package qemu-user-static uses the binfmt_misc support in the kernel to automatically run /usr/bin/qemu-arm-static when an ARM binary is executed. So if you have copied the image of an ARM system to /chroot/arm you can run the following commands like the following to enter the chroot:

cp /usr/bin/qemu-arm-static /chroot/arm/usr/bin/qemu-arm-static
chroot /chroot/arm bin/bash

Then you can create a full virtual environment with “/usr/bin/systemd-nspawn -D /chroot/arm” if you have systemd-container installed.

Selecting the CPU Type

There is a huge range of ARM CPUs with different capabilities. How this compares to the range of x86 and AMD64 CPUs depends on how you are counting (the i5 system I’m using now has 76 CPU capability flags). The default CPU type for qemu-arm-static is armv7l and I need to emulate a system with a armv5tejl. Setting the environment variable QEMU_CPU=pxa250 gives me armv5tel emulation.

The ARM Architecture Wikipedia page [2] says that in armv5tejl the T stands for Thumb instructions (which I don’t think Debian uses), the E stands for DSP enhancements (which probably isn’t relevant for me as I’m only doing integer maths), the J stands for supporting special Java instructions (which I definitely don’t need) and I’m still trying to work out what L means (comments appreciated).

So it seems clear that the armv5tel emulation provided by QEMU_CPU=pxa250 will do everything I need for building and testing ARM embedded software. The issue is how to enable it. For a user shell I can just put export QEMU_CPU=pxa250 in .login or something, but I want to emulate an entire system (cron jobs, ssh logins, etc).

I’ve filed Debian bug #870329 requesting a configuration file for this [1]. If I put such a configuration file in the chroot everything would work as desired.

To get things working in the meantime I wrote the below wrapper for /usr/bin/qemu-arm-static that calls /usr/bin/qemu-arm-static.orig (the renamed version of the original program). It’s ugly (I would use a config file if I needed to support more than one type of CPU) but it works.

#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>

int main(int argc, char **argv)
{
  if(setenv("QEMU_CPU", "pxa250", 1))
  {
    printf("Can't set $QEMU_CPU\n");
    return 1;
  }
  execv("/usr/bin/qemu-arm-static.orig", argv);
  printf("Can't execute \"%s\" because of qemu failure\n", argv[0]);
  return 1;
}

NBN FTTN

Unfortunate for us our home only got FTTN NBN connection. but like others I thought I would share the speed improvement results from cleaning up wiring inside your own home. we have 2 phone sockets 1 in the bedroom and one in the kitchen. by removing the cable from the kitchen to the bedroom, we managed to increase our maximum line rate from 14.2Mbps upload and 35.21Mbps download to 20Mbps upload and 47 Mbps download.

Bedroom Phone Line connected.
Line Statistics Post Wiring clean up

we’ve also put a speed change request from the 12/5 plan to the 50/20 plan so next month we should be enjoying a bit more of an NBN.

To think that with FTTH you could of had up to 4 100/40 connections. and you wouldn’t of had to pay someone to rewire your phone sockets.

Update:

speed change has gone through

NBN ModemModem statistics on 50/20 speed

July 31, 2017

Running a Tor Relay

I previously wrote about running my SE Linux Play Machine over Tor [1] which involved configuring ssh to use Tor.

Since then I have installed a Tor hidden service for ssh on many systems I run for clients. The reason is that it is fairly common for them to allow a server to get a new IP address by DHCP or accidentally set their firewall to deny inbound connections. Without some sort of VPN this results in difficult phone calls talking non-technical people through the process of setting up a tunnel or discovering an IP address. While I can run my own VPN for them I don’t want their infrastructure tied to mine and they don’t want to pay for a 3rd party VPN service. Tor provides a free VPN service and works really well for this purpose.

As I believe in giving back to the community I decided to run my own Tor relay. I have no plans to ever run a Tor Exit Node because that involves more legal problems than I am willing or able to deal with. A good overview of how Tor works is the EFF page about it [2]. The main point of a “Middle Relay” (or just “Relay”) is that it only sends and receives encrypted data from other systems. As the Relay software (and the sysadmin if they choose to examine traffic) only sees encrypted data without any knowledge of the source or final destination the legal risk is negligible.

Running a Tor relay is quite easy to do. The Tor project has a document on running relays [3], which basically involves changing 4 lines in the torrc file and restarting Tor.

If you are running on Debian you should install the package tor-geoipdb to allow Tor to determine where connections come from (and to not whinge in the log files).

ORPort [IPV6ADDR]:9001

If you want to use IPv6 then you need a line like the above with IPV6ADDR replaced by the address you want to use. Currently Tor only supports IPv6 for connections between Tor servers and only for the data transfer not the directory services.

Data Transfer

I currently have 2 systems running as Tor relays, both of them are well connected in a European DC and they are each transferring about 10GB of data per day which isn’t a lot by server standards. I don’t know if there is a sufficient number of relays around the world that the share of the load is small or if there is some geographic dispersion algorithm which determined that there are too many relays in operation in that region.

July 30, 2017

This Week in HASS – term 3, week 4

This week younger students start investigating how we can find out about the past. This investigation will be conducted over the next 3 weeks and will culminate in a Scientific Report. Older students are considering different sources of historical information and how they will use these sources in their research. Foundation/Prep/Kindy to Year 3 Students […]

July 29, 2017

QSO Today Podcast

Eric, 4Z1UG, has kindly interviewed me for his fine QSO Today Podcast.

Apache Mesos on Debian

I decided to try packaging Mesos for Debian/Stretch. I had a spare system with a i7-930 CPU, 48G of RAM, and SSDs to use for building. The i7-930 isn’t really fast by today’s standards, but 48G of RAM and SSD storage mean that overall it’s a decent build system – faster than most systems I run (for myself and for clients) and probably faster than most systems used by Debian Developers for build purposes.

There’s a github issue about the lack of an upstream package for Debian/Stretch [1]. That upstream issue could probably be worked around by adding Jessie sources to the APT sources.list file, but a package for Stretch is what is needed anyway.

Here is the documentation on building for Debian [2]. The list of packages it gives as build dependencies is incomplete, it also needs zlib1g-dev libapr1-dev libcurl4-nss-dev openjdk-8-jdk maven libsasl2-dev libsvn-dev. So BUILDING this software requires Java + Maven, Ruby, and Python along with autoconf, libtool, and all the usual Unix build tools. It also requires the FPM (Fucking Package Management) tool, I take the choice of name as an indication of the professionalism of the author.

Building the software on my i7 system took 79 minutes which includes 76 minutes of CPU time (I didn’t use the -j option to make). At the end of the build it turned out that I had mistakenly failed to install the Fucking Package Management “gem” and it aborted. At this stage I gave up on Mesos, the pain involved exceeds my interest in trying it out.

How to do it Better

One of the aims of Free Software is that bugs are more likely to get solved if many people look at them. There aren’t many people who will devote 76 minutes of CPU time on a moderately fast system to investigate a single bug. To deal with this software should be prepared as components. An example of this is the SE Linux project which has 13 source modules in the latest release [3]. Of those 13 only 5 are really required. So anyone who wants to start on SE Linux from source (without considering a distribution like Debian or Fedora that has it packaged) can build the 5 most important ones. Also anyone who has an issue with SE Linux on their system can find the one source package that is relevant and study it with a short compile time. As an aside I’ve been working on SE Linux since long before it was split into so many separate source packages and know the code well, but I still find the separation convenient – I rarely need to work on more than a small subset of the code at one time.

The requirement of Java, Ruby, and Python to build Mesos could be partly due to language interfaces to call Mesos interfaces from Ruby and Python. Ohe solution to that is to have the C libraries and header files to call Mesos and have separate packages that depend on those libraries and headers to provide the bindings for other languages. Another solution is to have autoconf detect that some languages aren’t installed and just not try to compile bindings for them (this is one of the purposes of autoconf).

The use of a tool like Fucking Package Management means that you don’t get help from experts in the various distributions in making better packages. When there is a FOSS project with a debian subdirectory that makes barely functional packages then you will be likely to have an experienced Debian Developer offer a patch to improve it (I’ve offered patches for such things on many occasions). When there is a FOSS project that uses a tool that is never used by Debian developers (or developers of Fedora and other distributions) then the only patches you will get will be from inexperienced people.

A software build process should not download anything from the Internet. The source archive should contain everything that is needed and there should be dependencies for external software. Any downloads from the Internet need to be protected from MITM attacks which means that a responsible software developer has to read through the build system and make sure that appropriate PGP signature checks etc are performed. It could be that the files that the Mesos build downloaded from the Apache site had appropriate PGP checks performed – but it would take me extra time and effort to verify this and I can’t distribute software without being sure of this. Also reproducible builds are one of the latest things we aim for in the Debian project, this means we can’t just download files from web sites because the next build might get a different version.

Finally the fpm (Fucking Package Management) tool is a Ruby Gem that has to be installed with the “gem install” command. Any time you specify a gem install command you should include the -v option to ensure that everyone is using the same version of that gem, otherwise there is no guarantee that people who follow your documentation will get the same results. Also a quick Google search didn’t indicate whether gem install checks PGP keys or verifies data integrity in other ways. If I’m going to compile software for other people to use I’m concerned about getting unexpected results with such things. A Google search indicates that Ruby people were worried about such things in 2013 but doesn’t indicate whether they solved the problem properly.

July 28, 2017

RegTech – a primer for the uninitiated

Whilst working at AUSTRAC I wrote a brief about RegTech which was quite helpful. I was given permission to blog the generically useful parts of it for general consumption :) Thanks Leanne!

Overview – This brief is the most important thing you will read in planning transformation! Government can’t regulate in the way we have traditionally done. Traditional approaches are too small, too slow and too ineffective. We need to explore new ways to regulate and achieve the goal of a stronger financial sector resistance to abuse that leverages data, automation, machine learning, technology and collaboration. We are here to help!

The key here is to put technology at the heart of the business strategy, rather than as simply an implementation mechanism. By embracing technology thinking, which means getting geeks into the strategy and policy rooms, we can build the foundation of a modern, responsive, agile, proactive and interactive regulator that can properly scale.

The automation of compliance with RegTech has the potential to overcome individual foibles and human error in a way that provides the quantum leap in culture and compliance that our regulators, customers, policy makers and the community are increasingly demanding… The Holy Grail is when we start to actually write regulation and legislation in code. Imagine the productivity gains and compliance savings of instantaneous certified compliance… We are now in one of the most exciting phases in the development of FinTech since the inception of e-banking.Treasurer Morrison, FinTech Australia Summit, Nov 2016

On the back of the FinTech boom, there is a growth in companies focused on “RegTech” solutions and services to merge technology and regulation/compliance needs for a more 21st century approach to the problem space. It is seen as a logical next step to the FinTech boom, given the high costs and complexity of regulation in the financial sector, but the implications for the broader regulatory sector are significant. The term only started being widely used in 2015. Other governments have started exploring this space, with the UK Government investing significantly.

Core themes of RegTech can be summarised as: data; automation; security; disruption; and enabling collaboration. There is also an overall drive towards everything being closer to real-time, with new data or information informing models, responses and risk in an ongoing self-adjusting fashion.

  • Data driven regulation – better monitoring, better use of available big and small data holdings to inform modelling and analysis (rather than always asking a human to give new information), assessment on the fly, shared data and modelling, trends and forecasting, data analytics for forward looking projections rather than just retrospective analysis, data driven risk and adaptive modelling, programmatic delivery of regulations (regulation as a platform).
  • Automation – reporting, compliance, risk modelling of transactions to determine what should be reported as “suspicious”, system to system registration and escalation, use of machine learning and AI, a more blended approach to work combining humans and machines.
  • Security – biometrics, customer checks, new approaches to KYC, digital identification and assurance, sharing of identity information for greater validation and integrity checking.
  • Disruptive technologies – blockchain, cloud, machine learning, APIs, cryptography, augmented reality and crypto-currencies just to start!
  • Enabling collaboration – for-profit regulation activities, regulation/compliance services and products built on the back of government rules/systems/data, access to distributed ledgers, distributed risk models and shared data/systems, broader private sector innovation on the back of regulator open data and systems.

Some useful references for the more curious:

July 27, 2017

LUV Main August 2017 Meeting

Aug 1 2017 18:30
Aug 1 2017 20:30
Aug 1 2017 18:30
Aug 1 2017 20:30
Location: 
The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

Tuesday, August 1, 2017

6:30 PM to 8:30 PM
The Dan O'Connell Hotel
225 Canning Street, Carlton VIC 3053

Speakers:

  • Tony Cree, CEO Aboriginal Literacy Foundation (to be confirmed)
  • Russell Coker, QEMU and ARM on AMD64

Russell Coker will demonstrate how to use QEMU to run software for ARM CPUs on an x86 family CPU.

The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

Food and drinks will be available on premises.

Before and/or after each meeting those who are interested are welcome to join other members for dinner.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

August 1, 2017 - 18:30

LUV Beginners August Meeting: Secure Shell (SSH)

Aug 26 2017 12:30
Aug 26 2017 16:30
Aug 26 2017 12:30
Aug 26 2017 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

Secure Shell (SSH)

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121 (enter via the garage on Jonas St.) Late arrivals, please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

August 26, 2017 - 12:30

read more

July 25, 2017

Forking Mon and DKIM with Mailing Lists

I have forked the “Mon” network/server monitoring system. Here is a link to the new project page [1]. There hasn’t been an upstream release since 2010 and I think we need more frequent releases than that. I plan to merge as many useful monitoring scripts as possible and support them well. All Perl scripts will use strict and use other best practices.

The first release of etbe-mon is essentially the same as the last release of the mon package in Debian. This is because I started work on the Debian package (almost all the systems I want to monitor run Debian) and as I had been accepted as a co-maintainer of the Debian package I put all my patches into Debian.

It’s probably not a common practice for someone to fork upstream of a package soon after becoming a comaintainer of the Debian package. But I believe that this is in the best interests of the users. I presume that there are other collections of patches out there and I hope to merge them so that everyone can get the benefits of features and bug fixes that have been separate due to a lack of upstream releases.

Last time I checked mon wasn’t in Fedora. I believe that mon has some unique features for simple monitoring that would be of benefit to Fedora users and would like to work with anyone who wants to maintain the package for Fedora. I am also interested in working with any other distributions of Linux and with non-Linux systems.

While setting up the mailing list for etbemon I wrote an article about DKIM and mailing lists (primarily Mailman) [2]. This explains how to setup Mailman for correct operation with DKIM and also why that seems to be the only viable option.

July 23, 2017

This Week in HASS – term 3, week 3

This week our youngest students are playing games from different places around the world, in the past. Slightly older students are completing the Timeline Activity. Students in Years 4, 5 and 6 are starting to sink their teeth into their research project for the term, using the Scientific Process. Foundation/Prep/Kindy to Year 3 This week […]

test post

test posting from wordpress.com

01 – [Jul-24 13:35 API] Volley error on https://public-api.wordpress.com/rest/v1.1/sites/4046490/posts/366/?context=edit&locale=en_AU – exception: null
02 – [Jul-24 13:35 API] StackTrace: com.android.volley.ServerError
at com.android.volley.toolbox.BasicNetwork.performRequest(BasicNetwork.java:179)
at com.android.volley.NetworkDispatcher.run(NetworkDispatcher.java:114)

03 – [Jul-24 13:35 API] Dispatching action: PostAction-PUSHED_POST
04 – [Jul-24 13:35 POSTS] Post upload failed. GENERIC_ERROR: The Jetpack site is inaccessible or returned an error: transport error – HTTP status code was not 200 (403) [-32300]
05 – [Jul-24 13:35 POSTS] updateNotificationError: Error while uploading the post: The Jetpack site is inaccessible or returned an error: transport error – HTTP status code was not 200 (403) [-32300]
06 – [Jul-24 13:35 EDITOR] Focus out callback received

July 20, 2017

New Dates for Earliest Archaeological Site in Aus!

This morning news was released of a date of 65,000 years for archaeological material at the site of Madjedbebe rock shelter in the Jabiluka mineral lease area, surrounded by Kakadu National Park. The site is on the land of the Mirarr people, who have partnered with archaeologists from the University of Queensland for this investigation. […]

July 17, 2017

XDP on Power

This post is a bit of a break from the standard IBM fare of this blog, as I now work for Canonical. But I have a soft spot for Power from my time at IBM - and Canonical officially supports 64-bit, little-endian Power - so when I get a spare moment I try to make sure that cool, officially-supported technologies work on Power before we end up with a customer emergency! So, without further ado, this is the story of XDP on Power.

XDP

eXpress Data Path (XDP) is a cool Linux technology to allow really fast processing of network packets.

Normally in Linux, a packet is received by the network card, an SKB (socket buffer) is allocated, and the packet is passed up through the networking stack.

This introduces an inescapable latency penalty: we have to allocate some memory and copy stuff around. XDP allows some network cards and drivers to process packets early - even before the allocation of the SKB. This is much faster, and so has applications in DDOS mitigation and other high-speed networking use-cases. The IOVisor project has much more information if you want to learn more.

eBPF

XDP processing is done by an eBPF program. eBPF - the extended Berkeley Packet Filter - is an in-kernel virtual machine with a limited set of instructions. The kernel can statically validate eBPF programs to ensure that they terminate and are memory safe. From this it follows that the programs cannot be Turing-complete: they do not have backward branches, so they cannot do fancy things like loops. Nonetheless, they're surprisingly powerful for packet processing and tracing. eBPF programs are translated into efficient machine code using in-kernel JIT compilers on many platforms, and interpreted on platforms that do not have a JIT. (Yes, there are multiple JIT implementations in the kernel. I find this a terrifying thought.)

Rather than requiring people to write raw eBPF programs, you can write them in a somewhat-restricted subset of C, and use Clang's eBPF target to translate them. This is super handy, as it gives you access to the kernel headers - which define a number of useful data structures like headers for various network protocols.

Trying it

There are a few really interesting project that are already up and running that allow you to explore XDP without learning the innards of both eBPF and the kernel networking stack. I explored the samples in the bcc compiler collection and also the samples from the netoptimizer/prototype-kernel repository.

The easiest way to get started with these is with a virtual machine, as recent virtio network drivers support XDP. If you are using Ubuntu, you can use the uvt-kvm tooling to trivially set up a VM running Ubuntu Zesty on your local machine.

Once your VM is installed, you need to shut it down and edit the virsh XML.

You need 2 vCPUs (or more) and a virtio+vhost network card. You also need to edit the 'interface' section and add the following snippet (with thanks to the xdp-newbies list):

<driver name='vhost' queues='4'>
    <host tso4='off' tso6='off' ecn='off' ufo='off'/>
    <guest tso4='off' tso6='off' ecn='off' ufo='off'/>
</driver>

(If you have more than 2 vCPUs, set the queues parameter to 2x the number of vCPUs.)

Then, install a modern clang (we've had issues with 3.8 - I recommend v4+), and the usual build tools.

I recommend testing with the prototype-kernel tools - the DDOS prevention tool is a good demo. Then - on x86 - you just follow their instructions. I'm not going to repeat that here.

POWERful XDP

What happens when you try this on Power? Regular readers of my posts will know to expect some minor hitches.

XDP does not disappoint.

Firstly, the prototype-kernel repository hard codes x86 as the architecture for kernel headers. You need to change it for powerpc.

Then, once you get the stuff compiled, and try to run it on a current-at-time-of-writing Zesty kernel, you'll hit a massive debug splat ending in:

32: (61) r1 = *(u32 *)(r8 +12)
misaligned packet access off 0+18+12 size 4
load_bpf_file: Permission denied

It turns out this is because in Ubuntu's Zesty kernel, CONFIG_HAS_EFFICIENT_UNALIGNED_ACCESS is not set on ppc64el. Because of that, the eBPF verifier will check that all loads are aligned - and this load (part of checking some packet header) is not, and so the verifier rejects the program. Unaligned access is not enabled because the Zesty kernel is being compiled for CPU_POWER7 instead of CPU_POWER8, and we don't have efficient unaligned access on POWER7.

As it turns out, IBM never released any officially supported Power7 LE systems - LE was only ever supported on Power8. So, I filed a bug and sent a patch to build Zesty kernels for POWER8 instead, and that has been accepted and will be part of the next stable update due real soon now.

Sure enough, if you install a kernel with that config change, you can verify the XDP program and load it into the kernel!

If you have real powerpc hardware, that's enough to use XDP on Power! Thanks to Michael Ellerman, maintainer extraordinaire, for verifying this for me.

If - like me - you don't have ready access to Power hardware, you're stuffed. You can't use qemu in TCG mode: to use XDP with a VM, you need multi-queue support, which only exists in the vhost driver, which is only available for KVM guests. Maybe IBM should release a developer workstation. (Hint, hint!)

Overall, I was pleasantly surprised by how easy things were for people with real ppc hardware - it's encouraging to see something not require kernel changes!

eBPF and XDP are definitely growing technologies - as Brendan Gregg notes, now is a good time to learn them! (And those on Power have no excuse either!)

July 16, 2017

This Week in HASS – term 3, week 2

This week older students start their research projects for the term, whilst younger students are doing the Timeline Activity. Our youngest students are thinking about the places where people live and can join together with older students as buddies to Build A Humpy together. Foundation/Prep/Kindy to Year 3 Students in stand-alone Foundation/Prep/Kindy classes (Unit F.3), […]

July 13, 2017

Bitcoin: ASICBoost – Plausible or not?

So the first question: is ASICBoost use plausible in the real world?

There are plenty of claims that it’s not:

  • “Much conspiracy around today. I don’t believe SegWit non-activation has anything to do with AsicBoost!” – Timo Hanke, one of the patent applicants, on twitter
  • “there’s absolutely nothing but baseless accusations flying around” – Emin Gun Sirer’s take, linked from the Bitmain statement
  • “no company would ever produce a chip that would have a switch in to hide that it’s actually an ASICboost chip.” – Sam Cole formerly of KNCMiner which went bankrupt due to being unable to compete with Bitmain in 2016
  • “I believe their claim about not activating ASICBoost. It is very small money for them.” – Guy Corem of SpoonDoolies, who independently discovered ASICBoost
  • “No one is even using Asicboost.” – Roger Ver (/u/memorydealers) on reddit

A lot of these claims don’t actually match reality though: ASICBoost is implemented in Bitmain miners sold to the public, and since it defaults to off, a switch to hide it is obviously easily possible since it’s disabled by default, contradicting Sam Cole’s take. There’s plenty of circumstantial evidence of ASICBoost-related transaction structuring in blocks, contradicting the basis on which Emin Gun Sirer’s dismisses the claims. The 15%-30% improvement claims that Guy Corem and Sam Cole cite are certainly large enough to be worth looking into — and  Bitmain confirms to have done on testnet. Even Guy Corem’s claim that they only amount to $2,000,000 in savings per year rather than $100,000,000 seems like a reason to expect it to be in use, rather than so little that you wouldn’t bother.

If ASICBoost weren’t in use on mainnet it would probably be relatively straightforward to prove that: Bitmain could publish the benchmarks results they got when testing on testnet, and why that proved not to be worth doing on mainnet, and provide instructions for their customers on how to reproduce their results, for instance. Or Bitmain and others could support efforts to block ASICBoost from being used on mainnet, to ensure no one else uses it, for the greater good of the network — if, as they claim, they’re already not using it, this would come at no cost to them.

To me, much of the rhetoric that’s being passed around seems to be a much better match for what you would expect if ASICBoost were in use, than if it was not. In detail:

  • If ASICBoost were in use, and no one had any reason to hide it being used, then people would admit to using it, and would do so by using bits in the block version.
  • If ASICBoost were in use, but people had strong reasons to hide that fact, then people would claim not to be using it for a variety of reasons, but those explanations would not stand up to more than casual analysis.
  • If ASICBoost were not in use, and it was fairly easy to see there is no benefit to it, then people would be happy to share their reasoning for not using it in detail, and this reasoning would be able to be confirmed independently.
  • If ASICBoost were not in use, but the reasons why it is not useful require significant research efforts, then keeping the detailed reasoning private may act as a competitive advantage.

The first scenario can be easily verified, and does not match reality. Likewise the third scenario does not (at least in my opinion) match reality; as noted above, many of the explanations presented are superficial at best, contradict each other, or simply fall apart on even a cursory analysis. Unfortunately that rules out assuming good faith — either people are lying about using ASICBoost, or just dissembling about why they’re not using it. Working out which of those is most likely requires coming to our own conclusion on whether ASICBoost makes sense.

I think Jimmy Song had some good posts on that topic. His first, on Bitmain’s ASICBoost claims finds some plausible examples of ASICBoost testing on testnet, however this was corrected in the comments as having been performed by Timo Hanke, rather than Bitmain. Having a look at other blocks’ version fields on testnet seems to indicate that there hasn’t been much other fiddling of version fields, so presumably whatever testing of ASICBoost was done by Bitmain, fiddling with the version field was not used; but that in turn implies that Bitmain must have been testing covert ASICBoost on testnet, assuming their claim to have tested it on testnet is true in the first place (they could quite reasonably have used a private testnet instead). Two later posts, on profitability and ASICBoost and Bitmain’s profitability in particular, go into more detail, mostly supporting Guy Corem’s analysis mentioned above. Perhaps interestingly, Jimmy Song also made a proposal to the bitcoin-dev shortly after Greg’s original post revealing ASICBoost and prior to these posts; that proposal would have endorsed use of ASICBoost on mainnet, making it cheaper and compatible with segwit, but would also have made use of ASICBoost readily apparent to both other miners and patent holders.

It seems to me there are three different ways to look at the maths here, and because this is an economics question, each of them give a different result:

  • Greg’s maths splits miners into two groups each with 50% of hashpower. One group, which is unable to use ASICBoost is assumed to be operating at almost zero profit, so their costs to mine bitcoins are only barely below the revenue they get from selling the bitcoin they mine. Using this assumption, the costs of running mining equipment are calculated by taking the number of bitcoin mined per year (365*24*6*12.5=657k), multiplying that by the price at the time ($1100), and halving the costs because each group only mines half the chain. This gives a cost of mining for the non-ASICBoost group of $361M per year. The other group, which uses ASICBoost, then gains a 30% advantage in costs, so only pays 70%, or $252M, a comparative saving of approximately $100M per annum. This saving is directly proportional to hashrate and ASICBoost advantage, so using Guy Corem’s figures of 13.2% hashrate and 15% advantage, this reduces from $95M to $66M, saving about $29M per annum.
  • Guy Corem’s maths estimates Bitmain’s figures directly: looking at the AntPool hashpower share, he estimates 500PH/s in hashpower (or 13.2%); he uses the specs of the AntMiner S9 to determine power usage (0.1 J/GH); he looks at electricity prices in China and estimates $0.03 per kWh; and he estimates the ASICBoost advantage to be 15%. This gives a total cost of 500M GH/s * 0.1 J/GH / 1000 W/kW * $0.03 per kWh * 24 * 365 which is $13.14 M per annum, so a 15% saving is just under $2M per annum. If you assume that the hashpower was 50% and ASICBoost gave a 30% advantage instead, this equates to about 1900 PH/s, and gives a benefit of just under $15M per annum. In order to get the $100M figure to match Greg’s result, you would also need to increase electricity costs by a factor of six, from 3c per kWH to 20c per kWH.
  • The approach I prefer is to compare what your hashpower would be keeping costs constant and work out the difference in revenue: for example, if you’re spending $13M per annum in electricity, what is your profit with ASICBoost versus without (assuming that the difficulty retargets appropriately, but no one else changes their mining behaviour). Following this line of thought, if you have 500PH/s with ASICBoost giving you a 30% boost, then without ASICBoost, you have 384 PH/s (500/1.3). If that was 13.2% of hashpower, then the remaining 86.8% of hashpower is 3288 PH/s, so when you stop using ASICBoost and a retarget occurs, total hashpower is now 3672 PH/s (384+3288), and your percentage is now 10.5%. Because mining revenue is simply proportional to hashpower, this amounts to a loss of 2.7% of the total bitcoin reward, or just under $20M per year. If you match Greg’s assumptions (50% hashpower, 30% benefit) that leads to an estimate of $47M per annum; if you match Guy Corem’s assumptions (13.2% hashpower, 15% benefit) it leads to an estimate of just under $11M per annum.

So like I said, that’s three different answers in each of two scenarios: Guy’s low end assumption of 13.2% hashpower and a 15% advantage to ASICBoost gives figures of $29M/$2M/$11M; while Greg’s high end assumptions of 50% hashpower and 30% advantage give figures of $100M/$15M/$47M. The differences in assumptions there is obviously pretty important.

I don’t find the assumptions behind Greg’s maths realistic: in essence, it assumes that mining be so competitive that it is barely profitable even in the short term. However, if that were the case, then nobody would be able to invest in new mining hardware, because they would not recoup their investment. In addition, even if at some point mining were not profitable, increases in the price of bitcoin would change that, and the price of bitcoin has been increasing over recent months. Beyond that, it also assumes electricity prices do not vary between miners — if only the marginal miner is not profitable, it may be that some miners have lower costs and therefore are profitable; and indeed this is likely the case, because electricity prices vary over time due to both seasonal and economic factors. The method Greg uses does is useful for establishing an upper limit, however: the only way ASICBoost could offer more savings than Greg’s estimate would be if every block mined produced less revenue than it cost in electricity, and miners were making a loss on every block. (This doesn’t mean $100M is an upper limit however — that estimate was current in April, but the price of bitcoin has more than doubled since then, so the current upper bound via Greg’s maths would be about $236M per year)

A downside to Guy’s method from the point of view of outside analysis is that it requires more information: you need to know the efficiency of the miners being used and the cost of electricity, and any error in those estimates will be reflected in your final figure. In particular, the cost of electricity needs to be a “whole lifecycle” cost — if it costs 3c/kWh to supply electricity, but you also need to spend an additional 5c/kWh in cooling in order to keep your data-centre operating, then you need to use a figure of 8c/kWh to get useful results. This likely provides a good lower bound estimate however: using ASICBoost will save you energy, and if you forget to account for cooling or some other important factor, then your estimate will be too low; but that will still serve as a loose lower bound. This estimate also changes over time however; while it doesn’t depend on price, it does depend on deployed hashpower — since total hashrate has risen from around 3700 PH/s in April to around 6200 PH/s today, if Bitmain’s hashrate has risen proportionally, it has gone from 500 PH/s to 837 PH/s, and an ASICBoost advantage of 15% means power cost savings have gone from $2M to $3.3M per year; or if Bitmain has instead maintained control of 50% of hashrate at 30% advantage, the savings have gone from $15M to $25M per year.

The key difference between my method and both Greg’s and Guy’s is that they implicitly assume that consuming more electricity is viable, and costs simply increase proportionally; whereas my method assumes that this is not viable, and instead that sufficient mining hardware has been deployed that power consumption is already constrained by some other factor. This might be due to reaching the limit of what the power company can supply, or the rating of the wiring in the data centre, or it might be due to the cooling capacity, or fire risk, or some other factor. For an operation spanning multiple data centres this may be the case for some locations but not others — older data centres may be maxed out, while newer data centres are still being populated and may have excess capacity, for example. If setting up new data centres is not too difficult, it might also be true in the short term, but not true in the longer term — that is having each miner use more power due to disabling ASICBoost might require shutting some miners down initially, but they may be able to be shifted to other sites over the course of a few weeks or month, and restarted there, though this would require taking into account additional hosting costs beyond electricity and cooling. As such, I think this is a fairly reasonable way to produce an plausible estimate, and it’s the one I’ll be using. Note that it depends on the bitcoin price, so the estimates this method produces have also risen since April, going from $11M to $24M per annum (13.2% hash, 15% advantage) or from $47M to $103M (50% hash, 30% advantage).

The way ASICBoost works is by allowing you to save a few steps: normally when trying to generate a proof of work, you have to do essentially six steps:

  1. A = Expand( Chunk1 )
  2. B = Compress( A, 0 )
  3. C = Expand( Chunk2 )
  4. D = Compress( C, B )
  5. E = Expand( D )
  6. F = Compress( E )

The expected process is to do steps (1,2) once, then do steps (3,4,5,6) about four billion (or more) times, until you get a useful answer. You do this process in parallel across many different chips. ASICBoost changes this process by observing that step (3) is independent of steps (1,2) — so by finding a variety of Chunk1s — call them Chunk1-A, Chunk1-B, Chunk1-C and Chunk1-D that are each compatible with a common Chunk2. In that case, you do steps (1,2) four times for each different Chunk1, then do step (3) four billion (or more) times, and do steps (4,5,6) 16 billion (or more) times, to get four times the work, while saving 12 billion (or more) iterations of step (3). Depending on the number of Chunk1’s you set yourself up to find, and the relative weight of the Expand versus Compress steps, this comes to (n-1)/n / 2 / (1+c/e), where n is the number of different Chunk1’s you have. If you take the weight of Expand and Compress steps as about equal, it simplifies to 25%*(n-1)/n, and with n=4, this is 18.75%. As such, an ASICBoost advantage of about 20% seems reasonably practical to me. At 50% hash and 20% advantage, my estimates for ASICBoost’s value are $33M in April, and $72M today.

So as to the question of whether you’d use ASICBoost, I think the answer is a clear yes: the lower end estimate has risen from $2M to $3.3M per year, and since Bitmain have acknowledged that AntMiner’s support ASICBoost in hardware already, the only additional cost is finding collisions which may not be completely trivial, but is not difficult and is easily automated.

If the benefit is only in this range, however, this does not provide a plausible explanation for opposing segwit: having the Bitcoin industry come to a consensus about how to move forward would likely increase the bitcoin price substantially, definitely increasing Bitmain’s mining revenue — even a 2% increase in price would cover their additional costs. However, as above, I believe this is only a lower bound, and a more reasonable estimate is on the order of $11M-$47M as of April or $24M-$103M as of today. This is a much more serious range, and would require an 11%-25% increase in price to not be an outright loss; and a far more attractive proposition would be to find a compromise position that both allows the industry to move forward (increasing the price) and allows ASICBoost to remain operational (maintaining the cost savings / revenue boost).

 

It’s possible to take a different approach to analysing the cost-effectiveness of mining given how much you need to pay in electricity costs. If you have access to a lot of power at a flat rate, can deal with other hosting issues, can expand (or reduce) your mining infrastructure substantially, and have some degree of influence in how much hashpower other miners can deploy, then you can derive a formula for what proportion of hashpower is most profitable for you to control.

In particular, if your costs are determined by an electricity (and cooling, etc) price, E, in dollars per kWh and performance, r, in Joules per gigahash, then given your hashrate, h in terahash/second, your power usage in watts is (h*1e3*r), and you run this for 600 seconds on average between each block (h*r*6e5 Ws), which you divide by 3.6M to convert to kWh (h*r/6), then multiply by your electricity cost to get a dollar figure (h*r*E/6). Your revenue depends on the hashrate of the everyone else, which we’ll call g, and on average you receive (p*R*h/(h+g)) every 600 seconds where p is the price of Bitcoin in dollars and R is the reward (subsidy and fees) you receive from a block. Your profit is just the difference, namely h*(p*R/(h+g) – r*E/6). Assuming you’re able to manufacture and deploy hashrate relatively easily, at least in comparison to everyone else, you can optimise your profit by varying h while the other variables (bitcoin price p, block reward R, miner performance r, electricity cost E, and external hashpower g) remain constant (ie, set the derivative of that formula with respect to h to zero and simplify) which gives a result of 6gpR/Er = (g+h)^2.

This is solvable for h (square root both sides and subtract g), but if we assume Bitmain is clever and well funded enough to have already essentially optimised their profits, we can get a better sense of what this means. Since g+h is just the total bitcoin hashrate, if we call that t, and divide both sides, we get 6gpR/Ert = t, or g/t = (Ert)/(6pR), which tells us what proportion of hashrate the rest of the network can have (g/t) if Bitmain has optimised its profits, or, alternative we can work out h/t = 1-g/t = 1-(Ert)/(6pR) which tells us what proportion of hashrate Bitmain will have if it has optimised its profits.  Plugging in E=$0.03 per kWH, r=0.1 J/GH, t=6e6 TH/s, p=$2400/BTC, R=12.5 BTC gives a figure of 0.9 – so given the current state of the network, and Guy Corem’s cost estimate, Bitmain would optimise its day to day profits by controlling 90% of mining hashrate. I’m not convinced $0.03 is an entirely reasonable figure, though — my inclination is to suspect something like $0.08 per kWh is more reasonable; but even so, that only reduces Bitmain’s optimal control to around 73%.

Because of that incentive structure, if Bitmain’s current hashrate is lower than that amount, then lowering manufacturing costs for own-use miners by 15% (per Sam Cole’s estimates) and lowering ongoing costs by 15%-30% by using ASICBoost could have a compounding effect by making it easier to quickly expand. (It’s not clear to me that manufacturing a line of ASICBoost-only miners to reduce manufacturing costs by 15% necessarily makes sense. For one thing, this would come at a cost of not being able to mine with them while they are state of the art, then sell them on to customers once a more efficient model has been developed, which seems like it might be a good way to manage inventory. For another, it vastly increases the impact of ASICBoost not being available: rather than simply increasing electricity costs by 15%-30%, it would mean reducing output to 10%-25% of what it was, likely rendering the hardware immediately obsolete)

Using the same formula, it’s possible to work out a ratio of bitcoin price (p) to hashrate (t) that makes it suboptimal for a manufacturer to control a hashrate majority (at least just due to normal mining income): h/t < 0.5, 1-Ert/6pR < 0.5, so t > 3pR/Er. Plugging in p=2400, R=12.5, e=0.08, r=0.1, this gives a total hash rate of 11.25M TH/s, almost double the current hash rate. This hashrate target would obviously increase as the bitcoin price increases, halve if the block reward halves (if a fall in the inflation subsidy is not compensated by a corresponding increase in fee income eg), increase if the efficiency of mining hardware increases, and decrease if the cost of electricity increases. For a simpler formula, assuming the best hosting price is $0.08 per kWh, and while the Antminer S9’s efficiency at 0.1 J/GH is state of the art, and the block reward is 12.5 BTC, the global hashrate in TH/s should be at least around 5000 times the price (ie 3R/Er = 4787.5, near enough to 5000).

Note that this target also sets a limit on the range at which mining can be profitable: if it’s just barely better to allow other people to control >50% of miners when your cost of electricity is E, then for someone else whose cost of electricity is 2*E or more, optimal profit is when other people control 100% of hashrate, that is, you don’t mine at all. Thus if the best large scale hosting globally costs $0.08/kWh, then either mining is not profitable anywhere that hosting costs $0.16/kWh or more, or there’s strong centralisation pressure for a mining hardware manufacturer with access to the cheapest electrictiy to control more than 50% of hashrate. Likewise, if Bitmain really can do hosting at $0.03/kWh, then either they’re incentivised to try to control over 50% of hashpower, or mining is unprofitable at $0.06/kWh and above.

If Bitmain (or any mining ASIC manufacturer) is supplying the majority of new hashrate, they actually have a fairly straightforward way of achieving that goal: if they dedicate 50-70% of each batch of ASICs built for their own use, and sell the rest, with the retail price of the sold miners sufficient to cover the manufacturing cost of the entire batch, then cashflow will mostly take care of itself. At $1200 retail price and $500 manufacturing costs (per Jimmy Song’s numbers), that strategy would imply targeting control of up to about 58% of total hashpower. The above formula would imply that’s the profit-maximising target at the current total hashrate and price if your average hosting cost is about $0.13 per kWh. (Those figures obviously rely heavily on the accuracy of the estimated manufacturing costs of mining hardware; at $400 per unit and $1200 retail, that would be 67% of hashpower, and about $0.09 per kWh)

Strategies like the above are also why this analysis doesn’t apply to miners who buy their hardware rather from a vendor, rather than building their own: because every time they increase their own hash rate (h), the external hashrate (g) also increases as a direct result, it is not valid to assume that g is constant when optimising h, so the partial derivative and optimisation is in turn invalid, and the final result is not applicable.

 

Bitmain’s mining pool, AntPool, obviously doesn’t directly account for 58% or more of total hashpower; though currently they’re the pool with the most hashpower at about 20%. As I understand it, Bitmain is also known to control at least BTC.com and ConnectBTC which add another 7.6%. The other “Emergent Consensus” supporting pools (Bitcoin.com, BTC.top, ViaBTC) account for about 22% of hashpower, however, which brings the total to just under 50%, roughly the right ballpark — and an additional 8% or 9% could easily be pointed at other public pools like slush or f2pool. Whether the “emergent consensus” pools are aligned due to common ownership and contractual obligations or simply similar interests is debatable, though. ViaBTC is funded by Bitmain, and Canoe was built and sold by Bitmain, which means strong contractual ties might exist, however  Jihan Wu, Bitmain’s co-founder, has disclaimed equity ties to BTC.top. Bitcoin.com is owned by Roger Ver, but I haven’t come across anything implying a business relationship between Bitmain and Bitcoin.com beyond supplier and customer. However John McAffee’s apparently forthcoming MGT mining pool is both partnered with Bitmain and advised by Roger Ver, so the existence of tighter ties may be plausible.

It seems likely to me that Bitmain is actually behaving more altruistically than is economically rational according to the analysis above: while it seems likely to me that Bitcoin.com, BTC.top, ViaBTC and Canoe have strong ties to Bitmain and that Bitmain likely has a high level of influence — whether due to contracts, business relationships or simply due to the loyalty and friendship — this nevertheless implies less control over the hashpower than direct ownership and management, and likely less profit. This could be due to a number of factors: perhaps Bitmain really is already sufficiently profitable from mining that they’re focusing on building their business in other ways; perhaps they feel the risks of centralised mining power are too high (and would ultimately be a risk to their long term profits) and are doing their best to ensure that mining power is decentralised while still trying to maximise their return to their investors; perhaps the rate of expansion implied by this analysis requires more investment than they can cover from their cashflow, and additional hashpower is funded by new investors who are simply assigned ownership of a new mining pool, which may helps Bitmain’s investors assure themselves they aren’t being duped by a pyramid scheme and gives more of an appearance of decentralisation.

It seems to me therefore there could be a variety of ways in which Bitmain may have influence over a majority of hashpower:

  • Direct ownership and control, that is being obscured in order to avoid an economic backlash that might result from people realising over 50% of hashpower is controlled by one group
  • Contractual control despite independent ownership, such that customers of Bitmain are committed to follow Bitmain’s lead when signalling blocks in order to maintain access to their existing hardware, or to be able to purchase additional hardware (an account on reddit appearing to belong to the GBMiners pool has suggested this is the case)
  • Contractual control due to offering essential ongoing services, eg support for physical hosting, or some form of mining pool services — maintaining the infrastructure for covert ASICBoost may be technically complex enough that Bitmain’s customers cannot maintain it themselves, but that Bitmain could relatively easily supply as an ongoing service to their top customers.
  • Contractual influence via leasing arrangements rather than sale of hardware — if hardware is leased to customers, or financing is provided, Bitmain could retain some control of the hardware until the leasing or financing term is complete, despite not having ownership
  • Coordinated investment resulting in cartel-like behaviour — even if there is no contractual relationship where Bitmain controls some of its customers in some manner, it may be that forming a cartel of a few top miners allows those miners to increase profits; in that case rather than a single firm having control of over 50% of hashrate, a single cartel does. While this is technically different, it does not seem likely to be an improvement in practice. If such a cartel exists, its members will not have any reason to compete against each other until it has maximised its profits, with control of more than 70% of the hashrate.

 

So, conclusions:

  • ASICBoost is worth using if you are able to. Bitmain is able to.
  • Nothing I’ve seen suggest Bitmain is economically clueless; so since ASICBoost is worth doing, and Bitmain is able to use it on mainnet, Bitmain are using it on mainnet.
  • Independently of ASICBoost, Bitmain’s most profitable course of action seems to be to control somewhere in the range of 50%-80% of the global hashrate at current prices and overall level of mining.
  • The distribution of hashrate between mining pools aligned with Bitmain in various ways makes it plausible, though not certain, that this may already be the case in some form.
  • If all this hashrate is benefiting from ASICBoost, then my estimate is that the value of ASICBoost is currently about $72M per annum
  • Avoiding dominant mining manufacturers tending towards supermajority control of hashrate requires either a high global hashrate or a relatively low price — the hashrate in TH/s should be about 5000 times the price in dollars.
  • The current price is about $2400 USD/BTC, so the corresponding hashrate to prevent centralisation at that price point is 12M TH/s. Conversely, the current hashrate is about 6M TH/s, so the maximum price that doesn’t cause untenable centralisation pressure is $1200 USD/BTC.

CFP for Percona Live Europe Dublin 2017 closes July 17 2017!

I’ve always enjoyed the Percona Live Europe events, because I consider them to be a lot more intimate than the event in Santa Clara. It started in London, had a smashing success last year in Amsterdam (conference sold out), and by design the travelling conference is now in Dublin from September 25-27 2017.

So what are you waiting for when it comes to submitting to Percona Live Europe Dublin 2017? Call for presentations close on July 17 2017, the conference has a pretty diverse topic structure (MySQL [and its diverse ecosystem including MariaDB Server naturally], MongoDB and other open source databases including PostgreSQL, time series stores, and more).

And I think we also have a pretty diverse conference committee in terms of expertise. You can also register now. Early bird registration ends August 8 2017.

I look forward to seeing you in Dublin, so we can share a pint of Guinness. Sláinte.

July 12, 2017

Toggling Between Pulseaudio Outputs when Docking a Laptop

In addition to selecting the right monitor after docking my ThinkPad, I wanted to set the correct sound output since I have headphones connected to my Ultra Dock. This can be done fairly easily using Pulseaudio.

Switching to a different pulseaudio output

To find the device name and the output name I need to provide to pacmd, I ran pacmd list-sinks:

2 sink(s) available.
...
  * index: 1
    name: <alsa_output.pci-0000_00_1b.0.analog-stereo>
    driver: <module-alsa-card.c>
...
    ports:
        analog-output: Analog Output (priority 9900, latency offset 0 usec, available: unknown)
            properties:

        analog-output-speaker: Speakers (priority 10000, latency offset 0 usec, available: unknown)
            properties:
                device.icon_name = "audio-speakers"

From there, I extracted the soundcard name (alsa_output.pci-0000_00_1b.0.analog-stereo) and the names of the two output ports (analog-output and analog-output-speaker).

To switch between the headphones and the speakers, I can therefore run the following commands:

pacmd set-sink-port alsa_output.pci-0000_00_1b.0.analog-stereo analog-output
pacmd set-sink-port alsa_output.pci-0000_00_1b.0.analog-stereo analog-output-speaker

Listening for headphone events

Then I looked for the ACPI event triggered when my headphones are detected by the laptop after docking.

After looking at the output of acpi_listen, I found jack/headphone HEADPHONE plug.

Combining this with the above pulseaudio names, I put the following in /etc/acpi/events/thinkpad-dock-headphones:

event=jack/headphone HEADPHONE plug
action=su francois -c "pacmd set-sink-port alsa_output.pci-0000_00_1b.0.analog-stereo analog-output"

to automatically switch to the headphones when I dock my laptop.

Finding out whether or not the laptop is docked

While it is possible to hook into the docking and undocking ACPI events and run scripts, there doesn't seem to be an easy way from a shell script to tell whether or not the laptop is docked.

In the end, I settled on detecting the presence of USB devices.

I ran lsusb twice (once docked and once undocked) and then compared the output:

lsusb  > docked 
lsusb  > undocked 
colordiff -u docked undocked 

This gave me a number of differences since I have a bunch of peripherals attached to the dock:

--- docked  2017-07-07 19:10:51.875405241 -0700
+++ undocked    2017-07-07 19:11:00.511336071 -0700
@@ -1,15 +1,6 @@
 Bus 001 Device 002: ID 8087:8000 Intel Corp. 
 Bus 001 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub
-Bus 003 Device 081: ID 0424:5534 Standard Microsystems Corp. Hub
-Bus 003 Device 080: ID 17ef:1010 Lenovo 
 Bus 003 Device 001: ID 1d6b:0003 Linux Foundation 3.0 root hub
-Bus 002 Device 041: ID xxxx:xxxx ...
-Bus 002 Device 040: ID xxxx:xxxx ...
-Bus 002 Device 039: ID xxxx:xxxx ...
-Bus 002 Device 038: ID 17ef:100f Lenovo 
-Bus 002 Device 037: ID xxxx:xxxx ...
-Bus 002 Device 042: ID 0424:2134 Standard Microsystems Corp. Hub
-Bus 002 Device 036: ID 17ef:1010 Lenovo 
 Bus 002 Device 002: ID xxxx:xxxx ...
 Bus 002 Device 004: ID xxxx:xxxx ...
 Bus 002 Device 001: ID 1d6b:0002 Linux Foundation 2.0 root hub

I picked 17ef:1010 as it appeared to be some internal bus on the Ultra Dock (none of my USB devices were connected to Bus 003) and then ended up with the following port toggling script:

#!/bin/bash

if /usr/bin/lsusb | grep 17ef:1010 > /dev/null ; then
    # docked
    pacmd set-sink-port alsa_output.pci-0000_00_1b.0.analog-stereo analog-output
else
    # undocked
    pacmd set-sink-port alsa_output.pci-0000_00_1b.0.analog-stereo analog-output-speaker
fi

July 11, 2017

Linux Security Summit 2017 Schedule Published

The schedule for the 2017 Linux Security Summit (LSS) is now published.

LSS will be held on September 14th and 15th in Los Angeles, CA, co-located with the new Open Source Summit (which includes LinuxCon, ContainerCon, and CloudCon).

The cost of LSS for attendees is $100 USD. Register here.

Highlights from the schedule include the following refereed presentations:

There’s also be the usual Linux kernel security subsystem updates, and BoF sessions (with LSM namespacing and LSM stacking sessions already planned).

See the schedule for full details of the program, and follow the twitter feed for the event.

This year, we’ll also be co-located with the Linux Plumbers Conference, which will include a containers microconference with several security development topics, and likely also a TPMs microconference.

A good critical mass of Linux security folk should be present across all of these events!

Thanks to the LSS program committee for carefully reviewing all of the submissions, and to the event staff at Linux Foundation for expertly planning the logistics of the event.

See you in Los Angeles!

July 10, 2017

This Week in HASS – term 3, week 1

Today marks the start of a new term in Queensland, although most states and territories have at least another week of holidays, if not more. It’s always hard to get back into the swing of things in the 3rd term, with winter cold and the usual round of flus and sniffles. OpenSTEM’s 3rd term units […]

Bitcoin: ASICBoost and segwit2x – Background

I’ve been trying to make heads or tails of what the heck is going on in Bitcoin for a while now. I’m not sure I’ve actually made that much progress, but I’ve at least got some thoughts that seem coherent now.

First, this post is background for people playing along at home who aren’t familiar with the issues or jargon: Bitcoin is a currency based on an electronic ledger that essentially tracks how much Bitcoin exists, and how someone can be authorised to transfer it to someone else; that ledger is currently about 100GB in size, growing at a rate of about a gigabyte a week. The ledger is updated by miners, who compete by doing otherwise pointless work running cryptographic hashes (and in so doing obtain a “proof of work”), and in return receive a reward (denominated in bitcoin) made up from fees by people transacting and an inflation subsidy. Different miners are competing in an essentially zero-sum game, because fees and inflation are essentially a fixed amount that is (roughly) divided up amongst miners according to how much work they do — so while you get more reward for doing more work, it comes at a cost of other miners receiving less reward.

Because the ledger only grows by (about) a gigabyte each week (or a megabyte per block, which is roughly every ten minutes), there is a limit on how many transactions can be included each week (ie, supply is limited), which both increases fees and limits adoption — so for quite a while now, people in the bitcoin ecosystem with a focus on growth have wanted to work out ways to increase the transaction rate. Initial proposals in mid 2015 suggested allowing miners to regularly determine the limit with no official upper bound (nominally “BIP100“, though never actually formally submitted as a proposal), or to increase by a factor of eight within six months, then double every two years after that, until reaching almost 200 times the current size by 2036 (BIP101), or to increase at a rate of about 17% per annum (suggested on the mailing list, but never formally proposed BIP103). These proposals had two major risks: locking in a lot of growth that may turn out to be unnecessary or actively harmful, and requiring what is called a “hard fork”, which would render the existing bitcoin software unable to track the ledger after the change took affect with the possible outcome that two ledgers would coexist and would in turn cause a range of problems. To reduce the former risk, a minimal compromise proposal was made to “kick the can down the road” and just double the ledger growth rate, then figure out a more permanent solution down the road (BIP102) (or to double it three times — to 2MB, 4MB then 8MB — over four years, per Adam Back). A few months later, some of the devs figured out a way to more or less achieve this that also doesn’t require a hard fork, and comes with a host of other benefits, and proposed an update called “segregated witness” at the December 2015 Scaling Bitcoin conference.

And shortly after that things went completely off the rails, and have stayed that way since. Ultimately there seem to be two camps: one group is happy to deploy segregated witness, and is eager to make further improvements to Bitcoin based on that (this is my take on events); while the other group does not, perhaps due to some combination of being opposed to the segregated witness changes directly, wanting a more direct upgrade immediately, being afraid deploying segregated witness will block other changes, or wanting to take control of the bitcoin codebase/roadmap from the current developers (take this with a grain of salt: these aren’t opinions I share or even find particularly reasonable, so I can’t do them justice when describing them; cf ViaBTC’s post to get that side of the argument made directly, eg)

Most recently, and presumably on the basis that the opposed group are mostly worried that deploying segregated witness will prevent or significantly delay a more direct increase in capacity, a bitcoin venture capitalist, Barry Silbert, organised an agreement amongst a number of companies including many miners, to both activate segregated witness within the next month, and to do a hard fork capacity increase by the end of the year. This is the “segwit2x” project; named because it takes segregated witness, (“segwit”) and then additionally doubles its capacity increase (“2x”). This agreement is not supported by any of the existing dev team, and is being developed by Jeff Garzik (who was behind BIP100 and BIP102 mentioned above) in a forked codebase renamed “btc1“, so if successful, this may also satisfy members of the opposed group motivated by a desire to take control of the bitcoin codebase and roadmap, despite that not being an explicit part of the agreement itself.

To me, the arguments presented for opposing segwit don’t really seem plausible. As far as future development goes, a roadmap was put out in December 2015 and endorsed by many developers that explicitly included a hard fork for increased capacity (“moderate block size increase proposals (such as 2/4/8 …)”), among many other things, so the risk of no further progress happening seems contrary to the facts to me. The core bitcoin devs are extremely capable in my estimation, so replacing them seems a bad idea from the start, but even more than that, they take a notably hands off approach to dictating where Bitcoin will go in future — so, to my mind, it seems like a more sensible thing to try would be working with them to advance the bitcoin ecosystem in whatever direction you want, rather than to try to replace them outright. In that context, it seems particularly notable to me that in the eighteen months between the segregated witness proposal and the segwit2x agreement, there hasn’t been any serious attempt to propose a hard fork capacity increase that meets the core dev’s quality standards; for instance there has never been any code for BIP100, and of the various hard forking codebases that have arisen by advocates of the hard fork approach — Bitcoin XT, Bitcoin Classic, Bitcoin Unlimited, btc1, and Bitcoin ABC — none have been developed in a way that’s suitable for the changes to be reviewed and merged into core via a pull request in the normal fashion. Further, since one of the main criticisms of a hard fork is that deployment costs are higher when it is done in a short time frame (weeks or a few months versus a year or longer), that lack of engagement over the past 18 months followed by a desperate rush now seems particularly poor to me.

A different explanation for the opposition to segwit became public in April, however. ASICBoost is a patent-pending optimisation to the way Bitcoin miners do the work that entitles them to extend the ledger (for which they receive the rewards described earlier), and while there are a few ways of making use of ASICBoost, perhaps the most effective way turns out to be incompatible with segwit. There are three main alternatives to the covert, segwit-incompatible approach, all of which have serious downsides. The first, overt ASICBoost via modifying the block version reveals that you’re using ASICBoost, which would either (a) encourage other miners to also use the optimisation reducing your profits, (b) give the patent holder cause to charge you royalties or cause other problems (assuming the patent is eventually granted and deemed valid), or (c) encourage the bitcoin community at large to change the ecosystem rules so that the optimisation no longer works. The second, mining empty blocks via ASICBoost means you don’t gain any fee income, reducing your revenue and hence profit. And the third, rolling the extranonce to find a collision rather than combining partial transaction trees increases the preparation work by a factor of ten or so, which is probably enough to outweigh the savings from the optimisation in the first place.

If ASICBoost were being used by a significant number of miners, and segregated witness prevents its continued use in practice, then we suddenly have a very plausible explanation for much of the apparent madness: the loss of the optimisation could significantly increase some miners’ costs or reduce their revenue, reducing profit either way (a high end estimate of $100,000,000 per year was given in the original explanation), which would justify significant investment in blocking that change. Further, an honest explanation of the problem would not be feasible, because this would be just as bad as doing the optimisation overtly — it would increase competition, alert the potential patent owners, and might cause the optimisation to be deliberately disabled — all of which would also negatively affect profits. As a result, there would be substantial opposition to segwit, but the reasons presented in public for this opposition would be false, and it would not be surprising if the people presenting these reasons only give half-hearted effort into providing evidence — their purpose is simply to prevent or at least delay segwit, rather than to actually inform or build a new consensus. To this line of thinking the emphasis on lack of communication from core devs or the desire for a hard fork block size increase aren’t the actual goal, so the lack of effort being put into resolving them over the past 18 months from the people complaining about them is no longer surprising.

With that background, I think there are two important questions remaining:

  1. Is it plausible that preventing ASICBoost would actually cost people millions in profit, or is that just an intriguing hypothetical that doesn’t turn out to have much to do with reality?
  2. If preserving ASICBoost is a plausible motivation, what will happen with segwit2x, given that by enabling segregated witness, it does nothing to preserve ASICBoost?

Well, stay tuned…

July 08, 2017

One Million Jobs for Spartan

Whilst it is a loose metric, our little cluster, "Spartan", at the University of Melbourne ran its 1 millionth job today after almost exactly a year since launch.

The researcher in question is doing their PhD in biochemistry. The project is a childhood asthma study:

"The nasopharynx is a source of microbes associated with acute respiratory illness. Respiratory infection and/ or the asymptomatic colonisation with certain microbes during childhood predispose individuals to the development of asthma.

Using data generated from 16S rRNA sequencing and metagenomic sequencing of nasopharyn samples, we aim to identify which specific microbes and interactions are important in the development of asthma."

Moments like this is why I do HPC.

Congratulations to the rest of the team and to the user community.

read more

July 07, 2017

Using the Nitrokey HSM with GPG in macOS

Getting yourself set up in macOS to sign keys using a Nitrokey HSM with gpg is non-trivial. Allegedly (at least some) Nitrokeys are supported by scdaemon (GnuPG’s stand-in abstraction for cryptographic tokens) but it seems that the version of scdaemon in brew doesn’t have support.

However there is gnupg-pkcs11-scd which is a replacement for scdaemon which uses PKCS #11. Unfortunately it’s a bit of a hassle to set up.

There’s a bunch of things you’ll want to install from brew: opensc, gnupg, gnupg-pkcs11-scd, pinentry-mac, openssl and engine_pkcs11.

brew install opensc gnupg gnupg-pkcs11-scd pinentry-mac \
    openssl engine-pkcs11

gnupg-pkcs11-scd won’t create keys, so if you’ve not made one already, you need to generate yourself a keypair. Which you can do with pkcs11-tool:

pkcs11-tool --module /usr/local/lib/opensc-pkcs11.so -l \
    --keypairgen --key-type rsa:2048 \
    --id 10 --label 'Danielle Madeley'

The --id can be any hexadecimal id you want. It’s up to you to avoid collisions.

Then you’ll need to generate and sign a self-signed X.509 certificate for this keypair (you’ll need both the PEM form and the DER form):

/usr/local/opt/openssl/bin/openssl << EOF
engine -t dynamic \
    -pre SO_PATH:/usr/local/lib/engines/engine_pkcs11.so \
    -pre ID:pkcs11 \
    -pre LIST_ADD:1 \
    -pre LOAD \
    -pre MODULE_PATH:/usr/local/lib/opensc-pkcs11.so
req -engine pkcs11 -new -key 0:10 -keyform engine \
    -out cert.pem -text -x509 -days 3640 -subj '/CN=Danielle Madeley/'
x509 -in cert.pem -out cert.der -outform der
EOF

The flag -key 0:10 identifies the token and key id (see above when you created the key) you’re using. If you want to refer to a different token or key id, you can change these.

And import it back into your HSM:

pkcs11-tool --module /usr/local/lib/opensc-pkcs11.so -l \
    --write-object cert.der --type cert \
    --id 10 --label 'Danielle Madeley'

You can then configure gnupg-agent to use gnupg-pkcs11-scd. Edit the file ~/.gnupg/gpg-agent.conf:

scdaemon-program /usr/local/bin/gnupg-pkcs11-scd
pinentry-program /usr/local/bin/pinentry-mac

And the file ~./gnupg/gnupg-pkcs11-scd.conf:

providers nitrokey
provider-nitrokey-library /usr/local/lib/opensc-pkcs11.so

gnupg-pkcs11-scd is pretty nifty in that it will throw up a (pin entry) dialog if your token is not available, and is capable of supporting multiple tokens and providers.

Reload gpg-agent:

gpg-agent --server gpg-connect-agent << EOF
RELOADAGENT
EOF

Check your new agent is working:

gpg --card-status

Get your key handle (grip), which is the 40-character hex string after the phrase KEY-FRIEDNLY (sic):

gpg-agent --server gpg-connect-agent << EOF
SCD LEARN
EOF

Import this key into gpg as an ‘Existing key’, giving the key grip above:

gpg --expert --full-generate-key

You can now use this key as normal, create sub-keys, etc:

gpg -K
/Users/danni/.gnupg/pubring.kbx
-------------------------------
sec> rsa2048 2017-07-07 [SCE]
 1172FC7B4B5755750C65F9A544B80C280F80807C
 Card serial no. = 4B43 53233131
uid [ultimate] Danielle Madeley <danielle@madeley.id.au>

echo -n "Hello World" | gpg --armor --clearsign --textmode

Side note: the curses-based pinentry doesn’t deal with piping content into stdin, which is why you want pinentry-mac.

Terminal console showing a gpg signing command. Over the top is a dialog box prompting the user to insert her Nitrokey token

You can also import your certificate into gpgsm:

gpgsm --import < ca-certificate
gpgsm --learn-card

And that’s it, now you can sign your git tags with your super-secret private key, or whatever it is you do. Remember that you can’t exfiltrate the secret keys from your HSM in the clear, so if you need a backup you can create a DKEK backup (see the SmartcardHSM docs), or make sure you’ve generated that revocation certificate, or just decided disaster recovery is for dweebs.

July 06, 2017

Broadband Speeds, 2 Years Later

Two years ago, considering the blocksize debate, I made two attempts to measure average bandwidth growth, first using Akamai serving numbers (which gave an answer of 17% per year), and then using fixed-line broadband data from OFCOM UK, which gave an answer of 30% per annum.

We have two years more of data since then, so let’s take another look.

OFCOM (UK) Fixed Broadband Data

First, the OFCOM data:

  • Average download speed in November 2008 was 3.6Mbit
  • Average download speed in November 2014 was 22.8Mbit
  • Average download speed in November 2016 was 36.2Mbit
  • Average upload speed in November 2008 to April 2009 was 0.43Mbit/s
  • Average upload speed in November 2014 was 2.9Mbit
  • Average upload speed in November 2016 was 4.3Mbit

So in the last two years, we’ve seen 26% increase in download speed, and 22% increase in upload, bringing us down from 36/37% to 33% over the 8 years. The divergence of download and upload improvements is concerning (I previously assumed they were the same, but we have to design for the lesser of the two for a peer-to-peer system).

The idea that upload speed may be topping out is reflected in the Nov-2016 report, which notes only an 8% upload increase in services advertised as “30Mbit” or above.

Akamai’s State Of The Internet Reports

Now let’s look at Akamai’s Q1 2016 report and Q1-2017 report.

  • Annual global average speed in Q1 2015 – Q1 2016: 23%
  • Annual global average speed in Q1 2016 – Q1 2017: 15%

This gives an estimate of 19% per annum in the last two years. Reassuringly, the US and UK (both fairly high-bandwidth countries, considered in my previous post to be a good estimate for the future of other countries) have increased by 26% and 19% in the last two years, indicating there’s no immediate ceiling to bandwidth.

You can play with the numbers for different geographies on the Akamai site.

Conclusion: 19% Is A Conservative Estimate

17% growth now seems a little pessimistic: in the last 9 years the US Akamai numbers suggest the US has increased by 19% per annum, the UK by almost 21%.  The gloss seems to be coming off the UK fixed-broadband numbers, but they’re still 22% upload increase for the last two years.  Even Australia and the Philippines have managed almost 21%.

July 04, 2017

python-pkcs11 with the Nitrokey HSM

So my Nitrokey HSM arrived and it works great, thanks to the Nitrokey peeps for sending me one.

Because the OpenSC PKCS #11 module is a little more lightweight than some of the other vendors, which often implement mechanisms that are not actually supported by the hardware (e.g. the Opencryptoki TPM module), I wrote up some documentation on how to use the device, focusing on how to extract the public keys for using outside of PKCS #11, as the Nitrokey doesn’t implement any of the public key functions.

Nitrokey with python-pkcs11

This also encouraged me to add a whole bunch more of the import/extraction functions for the diverse key formats, including getting very frustrated at the lack of documentation for little things like how OpenSSL stores EC public keys (the answer is as SubjectPublicKeyInfo from X.509), although I think there might be some operating system specific glitches with encoding some DER structures. I think I need to move from pyasn1 to asn1crypto.

July 02, 2017

The Science of Cats

Ah, the comfortable cat! Most people agree that cats are experts at being comfortable and getting the best out of life, with the assistance of their human friends – but how did this come about? Geneticists and historians are continuing to study how cats and people came to live together and how cats came to […]

June 26, 2017

Codec 2 Wideband

I’m spending a month or so improving the speech quality of a couple of Codec 2 modes. I have two aims:

  1. Make the 700 bit/s codec sound better, to improve speech quality on low SNR HF channels (beneath 0dB).
  2. Develop a higher quality mode in the 2000 to 3000 bit/s range, that can be used on HF channels with modest SNRs (around 10dB)

I ran some numbers on the new OFDM modem and LDPC codes, and turns out we can get 3000 bit/s of codec data through a 2000 Hz channel at down to 7dB SNR.

Now 3000 bit/s is broadband for me – I’ve spent years being very frugal with my bits while I play in low SNR HF land. However it’s still a bit low for Opus which kicks in at 6000 bit/s. I can’t squeeze 6000 bit/s through a 2000 Hz RF channel without higher order QAM constellations which means SNRs approaching 20dB.

So – what can I do with 3000 bit/s and Codec 2? I decided to try wideband(-ish) audio – the sort of audio bandwidth you get from Skype or AM broadcast radio. So I spent a few weeks modifying Codec 2 to work at 16 kHz sample rate, and Jean Marc gave me a few tips on using DCTs to code the bits.

It’s early days but here are a few samples:

Description Sample
1 Original Speech Listen
2 Codec 2 Model, orignal amplitudes and phases Listen
3 Synthetic phase, one bit voicing, original amplitudes Listen
4 Synthetic phase, one bit voicing, amplitudes at 1800 bit/s Listen
5 Simulated analog SSB, 300-2600Hz BPF, 10dB SNR Listen

Couple of interesting points:

  • Sample (2) is as good as Codec 2 can do, its the unquantised model parameters (harmonic phases and amplitudes). It’s all down hill from here as we quantise or toss away parameters.
  • In (3) I’m using a one bit voicing model, this is very vocoder and shouldn’t work this well. MBE/MELP all say you need mixed excitation. Exploring that conundrum would be a good Masters degree topic.
  • In (3) I can hear the pitch estimator making a few mistakes, e.g. around “sheet” on the female.
  • The extra 4kHz of audio bandwidth doesn’t take many more bits to encode, as the ear has a log frequency response. It’s maybe 20% more bits than 4kHz audio.
  • You can hear some words like “well” are muddy and indistinct in the 1800 bit/s sample (4). This usually means the formants (spectral) peaks are not well defined, so we might be tossing away a little too much information.
  • The clipping on the SSB sample (5) around the words “depth” and “hours” is an artifact of the PathSim AGC. But dat noise. It gets really fatiguing after a while.

Wideband audio is a big paradigm shift for Push To Talk (PTT) radio. You can’t do this with analog radio: 2000 Hz of RF bandwidth, 8000 Hz of audio bandwidth. I’m not aware of any wideband PTT radio systems – they all work at best 4000 Hz audio bandwidth. DVSI has a wideband codec, but at a much higher bit rate (8000 bits/s).

Current wideband codecs shoot for artifact-free speech (and indeed general audio signals like music). Codec 2 wideband will still have noticeable artifacts, and probably won’t like music. Big question is will end users prefer this over SSB, or say analog FM – at the same SNR? What will 8kHz audio sound like on your HT?

We shall see. I need to spend some time cleaning up the algorithms, chasing down a few bugs, and getting it all into C, but I plan to be testing over the air later this year.

Let me know if you want to help.

Play Along

Unquantised Codec 2 with 16 kHz sample rate:

$ ./c2sim ~/Desktop/c2_hd/speech_orig_16k.wav --Fs 16000 -o - | play -t raw -r 16000 -s -2 -

With “Phase 0” synthetic phase and 1 bit voicing:

$ ./c2sim ~/Desktop/c2_hd/speech_orig_16k.wav --Fs 16000 --phase0 --postfilter -o - | play -t raw -r 16000 -s -2 -

Links

FreeDV 2017 Road Map – this work is part of the “Codec 2 Quality” work package.

Codec 2 page – has an explanation of the way Codec 2 models speech with harmonic amplitudes and phases.

Guess the Artefact! – #2

Today’s Guess the Artefact! covers one of a set of artefacts which are often found confusing to recognise. We often get questions about these artefacts, from students and teachers alike, so here’s a chance to test your skills of observation. Remember – all heritage and archaeological material is covered by State or Federal legislation and should never be removed from its context. If possible, photograph the find in its context and then report it to your local museum or State Heritage body (the Dept of Environment and Heritage Protection in Qld; the Office of Environment and Heritage in NSW; the Dept of Environment, Planning and Sustainable Development in ACT; Heritage Victoria; the Dept of Environment, Water and Natural Resources in South Australia; the State Heritage Office in WA and the Heritage Council – Dept of Tourism and Culture in NT).

This artefact is made of stone. It measures about 12 x 8 x 3 cm. It fits easily and comfortably into an adult’s hand. The surface of the stone is mostly smooth and rounded, it looks a little like a river cobble. However, one side – the right-hand side in the photo above – is shaped so that 2 smooth sides meet in a straight, sharpish edge. Such formations do not occur on naturally rounded stones, which tells us that this was shaped by people and not just rounded in a river. The smoothed edges meeting in a sharp edge tell us that this is ground-stone technology. Ground stone technology is a technique used by people to create smooth, sharp edges on stones. People grind the stone against other rocks, occasionally using sand and water to facilitate the process, usually in a single direction. This forms a smooth surface which ends in a sharp edge.

Neolithic Axe

Ground stone technology is usually associated with the Neolithic period in Europe and Asia. In the northern hemisphere, this technology was primarily used by people who were learning to domesticate plants and animals. These early farmers learned to grind grains, such as wheat and barley, between two stones to make flour – thus breaking down the structure of the plant and making it easier to digest. Our modern mortar and pestle is a descendant of this process. Early farmers would have noticed that these actions produced smooth and sharp edges on the stones. These observations would have led them to apply this technique to other tools which they used and thus develop the ground-stone technology. Here (picture on right) we can see an Egyptian ground stone axe from the Neolithic period. The toolmaker has chosen an attractive red and white stone to make this axe-head.

In Japan this technology is much older than elsewhere in the northern hemisphere, and ground-stone axes have been found dating to 30,000 years ago during the Japanese Palaeolithic period. Until recently these were thought to be the oldest examples of ground-stone technology in the world. However, in 2016, Australian archaeologists Peter Hiscock, Sue O’Connor, Jane Balme and Tim Maloney reported in an article in the journal Australian Archaeology, the finding of a tiny flake of stone (just over 1 cm long and 1/2 cm wide) from a ground stone axe in layers dated to 44,000 to 49,000 years ago at the site of Carpenter’s Gap in the Kimberley region of north-west Australia. This tiny flake of stone – easily missed by anyone not paying close attention – is an excellent example of the extreme importance of ‘archaeological context’. Archaeological material that remains in its original context (known as in situ) can be dated accurately and associated with other material from the same layers, thus allowing us to understand more about the material. Anything removed from the context usually can not be dated and only very limited information can be learnt.

The find from the Kimberley makes Australia the oldest place in the world to have ground-stone technology. The tiny chip of stone, broken off a larger ground-stone artefact, probably an axe, was made by the ancestors of Aboriginal people in the millennia after they arrived on this continent. These early Australians did not practise agriculture, but they did eat various grains, which they leaned to grind between stones to make flour. It is possible that whilst processing these grains they learned to grind stone tools as well. Our artefact, shown above, is undated. It was found, totally removed from its original context, stored under an old house in Brisbane. The artefact is useful as a teaching aid, allowing students to touch and hold a ground-stone axe made by Aboriginal people in Australia’s past. However, since it was removed from its original context at some point, we do not know how old it is, or even where it came from exactly.

Our artefact is a stone tool. Specifically, it is a ground stone axe, made using technology that dates back almost 50,000 years in Australia! These axes were usually made by rubbing a hard stone cobble against rocks by the side of a creek. Water from the creek was used as a lubricant, and often sand was added as an extra abrasive. The making of ground-stone axes often left long grooves in these rocks. These are called ‘grinding grooves’ and can still be found near some creeks in the landscape today, such as in Kuringai Chase National Park in Sydney. The ground-stone axes were usually hafted using sticks and lashings of plant fibre, to produce a tool that could be used for cutting vegetation or other uses. Other stone tools look different to the one shown above, especially those made by flaking stone; however, smooth stones should always be carefully examined in case they are also ground-stone artefacts and not just simple stones!

LUV Beginners July Meeting: Teaching programming using video games

Jul 15 2017 12:30
Jul 15 2017 16:30
Jul 15 2017 12:30
Jul 15 2017 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

Andrew Pam will be demonstrating a range of video games that run natively on Linux and explicitly include programming skills as part of the game including SpaceChem, InfiniFactory, TIS-100, Shenzen I/O, Else Heart.Break(), Hack 'n' Slash and Human Resource Machine.  He will seek feedback on the suitability of these games for teaching programming skills to non-programmers and the possibility of group play in a classroom or workshop setting.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121 (enter via the garage on Jonas St.) Late arrivals, please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

July 15, 2017 - 12:30

read more

LUV Main July 2017 Meeting

Jul 4 2017 18:30
Jul 4 2017 20:30
Jul 4 2017 18:30
Jul 4 2017 20:30
Location: 
The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

PLEASE NOTE NEW LOCATION

Tuesday, July 4, 2017
6:30 PM to 8:30 PM
The Dan O'Connell Hotel
225 Canning Street, Carlton VIC 3053

Speakers:

• To be announced

Come have a drink with us and talk about Linux.  If you have something cool to show, please bring it along!

The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

Food and drinks will be available on premises.

Before and/or after each meeting those who are interested are welcome to join other members for dinner.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

July 4, 2017 - 18:30

June 24, 2017

Duolingo Plus is Extremely Broken

After using Duolingo for over a year and accumulating almost 100,000 points I thought it would do the right thing and pay for the Plus service. It was exactly the right time as I would be travelling overseas and the ability to do lessons offline and have them sync later seemed ideal.

For the first few days it seemed to be operating fine; I had downloaded the German tree and was working my way through it. Then I downloaded the French tree, and several problems started to emerge.

read more

June 22, 2017

Hire me!

tl;dr: I’ve recently moved to the San Francisco Bay Area, received my US Work Authorization, so now I’m looking for somewhere  to work. I have a résumé and an e-mail address!

I’ve worked a lot in Free and Open Source Software communities over the last five years, both in Australia and overseas. While much of my focus has been on the Python community, I’ve also worked more broadly in the Open Source world. I’ve been doing this community work entirely as a volunteer, most of the time working in full-time software engineering jobs which haven’t related to my work in the Open Source world.

It’s pretty clear that I want to move into a job where I can use the skills I’ve been volunteering for the last few years, and put them to good use both for my company, and for the communities I serve.

What I’m interested in doing fits best into a developer advocacy or community management sort of role. Working full-time on helping people in tech be better at what they do would be just wonderful. That said, my background is in code, and working in software engineering with a like-minded company would also be pretty exciting (better still if I get to write a lot of Python).

  • Something with a strong developer relations element. I enjoy working with other developers, and I love having the opportunity to get them excited about things that I’m excited about. As a conference organiser, I’m very aware of the line between terrible marketing shilling, and genuine advocacy by and for developers: I want to help whoever I work for end up on the right side of that line.
  • Either in San Francisco, North of San Francisco, or Remote-Friendly. I live in Petaluma, a lovely town about 50 minutes north of San Francisco, with my wonderful partner, Josh. We’re pretty happy up here, but I’m happy to regularly commute as far as San Francisco. I’ll consider opportunities in other cities, but they’d need to primarily be remote.
  • Relevant to Open Source. The Open Source world is where my experience is, it’s where I know people, and it’s the world where I can be most credible. This doesn’t mean I need to be working on open source itself, but I’d love to be able to show up at OSCON or linux.conf.au and be excited to have my company’s name on my badge.

Why would I be good at this? I’ve been working on building and interacting with communities of developers, especially in the Free and Open Source Software world, for the last five years.

You can find a complete list of what I’ve done in my résumé, but here’s a selection of what I think’s notable:

  • Co-organised two editions of PyCon Australia, and led the linux.conf.au 2017 team. I’ve led PyCon AU, from inception, to bidding, to the successful execution for two years in a row. As the public face of PyCon AU, I made sure that the conference had the right people interested in speaking, and that we had many from Australian Python community interested in attending. I took what I learned at PyCon AU and applied it to run linux.conf.au 2017, where our CFP attracted its largest ever response (beating the previous record by more than 30%).
  • Developed Registrasion, an open source conference ticket system. I designed and developed a ticket sales system that allowed for automation of the most significant time sinks that linux.conf.au and PyCon Australia registration staff had experienced in previous years. Registrasion was Open Sourced, and several other conferences are considering adopting it.
  • Given talks at countless open source and developer events, both in Australia, and overseas. I’ve presented at OSCON, PyCons in five countries, and myriad other conferences. I’ve presented on a whole lot of technical topics, and I’ve recently started talking more about the community-level projects I’ve been involved with.
  • Designed, ran, and grew PyCon Australia’s outreach and inclusion programmes. Each year, PyCon Australia has offered upwards of $10,000 (around 10% of conference budget) in grants to people who otherwise wouldn’t be able to attend the conference: this is not just speakers, but people whose presence would improve the conference just by being there. I’ve led a team to assess applications for these grants, and lead our outreach efforts to make sure we find the right people to receive these grants.
  • Served as a council member for Linux Australia. Linux Australia is the peak body for Open Source communities in Australia, as well as underwriting the region’s more popular Open Source and Developer conferences. In particular, I led a project to design governance policies to help make sure the conferences we underwrite are properly budgeted and planned.

So, if you know of anything going at the moment, I’d love to hear about it. I’m reachable by e-mail (mail@chrisjrn.com) but you can also find me on Twitter (@chrisjrn), or if you really need to, LinkedIn.

June 18, 2017

HASS Additional Activities

OK, so you’ve got the core work covered for the term and now you have all those reports to write and admin to catch up on. Well, the OpenSTEM™ Understanding Our World® HASS plus Science material has heaps of activities which help students to practise core curricular skills and can keep students occupied. Here are some ideas:

 Aunt Madge’s Suitcase Activity

Aunt Madge

Aunt Madge is a perennial favourite with students of all ages. In this activity, students use clues to follow Aunt Madge around the world trying to return her forgotten suitcase. There’s a wide range of locations to choose from on every continent – both natural and constructed places. This activity can be tailored for group work, or the whole class, and by adjusting the number of locations to be found, the teacher can adjust to the available time, anywhere from 10-15 minutes to a whole lesson. Younger students enjoy matching the pictures of locations and trying to find the countries on the map. Older students can find out further information about the locations on the information sheets. Teachers can even choose a theme for the locations (such as “Ancient History” or “Aboriginal Places”) and see if students can guess what it is.

 Ancient Sailing Ships Activity

Sailing Ships (History + Science)Science

Students in Years 3 to 6 have undertaken the Ancient Sailing Ships activity this term, however, there is a vast scope for additional aspects to this activity. Have students compared the performance of square-rigged versus lateen sails? How about varying the number of masts? Have students raced the vessels against each other? (a water trough and a fan is all that’s needed for some exciting races) Teachers can encourage the students to examine the effects of other changes to ship design, such as adding a keel or any other innovations students can come up with, which can be tested. Perhaps classes or grades can even race their ships against each other.

Trade and Barter Activity

Students in years 5 and 6 in particular enjoy the Trade and Barter activity, which teaches them the basics of Economics without them even realising it! This activity covers so many different aspects of the curriculum, that it is always a good one to revisit, even though it was not in this term’s units. Students enjoy the challenge and will find the activity different each time. It is a particularly good choice for a large chunk of time, or for smaller groups; perhaps a more experienced group can coach other students. The section of the activity which has students developing their own system of writing is one that lends itself to extension and can even be spun off as a separate activity.

Games from the Past

Kids Playing TagKids Playing Tag

Students of all ages enjoy many of the games listed in the resource Games From The Past. Several of these games are best done whilst running around outside, so if that is an option, then choose from the Aboriginal, Chinese or Zulu games. Many of these games can be played by large groups. Older students might like to try recreating some of the rules for some of the games of Ancient Egypt or the Aztecs. If this resource wasn’t part of the resources for your particular unit, it can be downloaded from the OpenSTEM™ site directly.

 

Class Discussions

The b) and c) sections of the Teacher Handbooks contain suggestions for topics of discussion – such as Women Explorers or global citizenship, or ideas for drawings that the students can do. These can also be undertaken as additional activities. Teachers could divide students into groups to research and explore particular aspects of these topics, or stage debates, allowing students to practise persuasive writing skills as well.

OpenSTEM A0 world map: Country Outlines and Ice Age CoastlineAdding events to a timeline, or the class calendar, also good ways to practise core skills.

The OpenSTEM™ Our World map is used as the perfect complement to many of the Understanding Our World® units. This map comes blank and country names are added to the map during activities. The end of term is also a good chance for students to continue adding country names to the map. These can be cut out of the resource World Countries, which supplies the names in a suitable font size. Students can use the resource World Maps to match the country names to their locations.

We hope you find these suggestions useful!

Enjoy the winter holidays – not too long now to a nice, cosy break!

June 11, 2017

This Week in HASS – term 2, week 9

The OpenSTEM™ Understanding Our World® units have only 9 weeks per term, so this is the last week! Our youngest students are looking at some Aboriginal Places; slightly older older students are thinking about what their school and local area were like when their parents and grandparents were children; and students in years 3 to 6 are completing their presentations and anything else that might be outstanding from the term.

Foundation/Prep/Kindy

Students in the stand-alone Foundation/Prep/Kindy class (Unit F.2) examine Aboriginal Places this week. Students examine which places are special to Aboriginal people, and how these places should be cared for by Aboriginal people and the broader community. Several of the Australian places in the Aunt Madge’s Suitcase Activity can be used to support this discussion in the classroom. Students in an integrated Foundation/Prep/Kindy and Year 1 class (Unit F.6), as well as Year 1 (Unit 1.2), 2 (Unit 2.2) and 3 (Unit 3.2) students consider life in the times of their parents and grandparents, with specific reference to their school, or the local area studied during this unit. Teachers may wish to invite older members of the community (including interested parents and/or grandparents) in to the class to describe their memories of the area in former years. Were any of them past students of the school? This is a great opportunity for students to come up with their own questions about life in past times.

Years 3 to 6

Aunt Madge

Students in Year 3 (Unit 3.6), 4 (Unit 4.2), 5 (Unit 5.2) and 6 (Unit 6.2) are finishing off their presentations and any outstanding work this week. Sometimes the middle of term can be very rushed and so it’s always good to have some breathing space at the end to catch up on anything that might have been squeezed out before. For those classes where everyone is up-to-date and looking for extra activities, the Aunt Madge’s Suitcase Activity is always popular with students and can be used to support their learning. Teachers may wish to select a range of destinations appropriate to the work covered during the term and encourage students to think about how those destinations relate to the material covered in class. Destinations may be selected by continent or theme – e.g. natural places or historical sites. A further advantage of Aunt Madge is that the activity can be tailored to fit the available time – from 5 or 10 minutes for a single destination, to 45 minutes or more for a full selection; and played in groups, or as a whole class, allowing some students to undertake the activity while other students may be catching up on other work. Students may also wish to revisit aspects of the Ancient Sailing Ships Activity and expand on their investigations.

Although this is the last week of this term’s units, we will have some more suggestions for extra activities next week – particularly those that keep the students busy while teachers attend to marking or compiling of reports.

Mysterious 400 Bad Request in Django debug mode

While upgrading Libravatar to a more recent version of Django, I ran into a mysterious 400 error.

In debug mode, my site was working fine, but with DEBUG = False, I would only a page containing this error:

Bad Request (400)

with no extra details in the web server logs.

Turning on extra error logging

To see the full error message, I configured logging to a file by adding this to settings.py:

LOGGING = { 'version': 1, 'disable_existing_loggers': False, 'handlers': { 'file': { 'level': 'DEBUG', 'class': 'logging.FileHandler', 'filename': '/tmp/debug.log', }, }, 'loggers': { 'django': { 'handlers': ['file'], 'level': 'DEBUG', 'propagate': True, }, }, }

Then I got the following error message:

Invalid HTTP_HOST header: 'www.example.com'. You may need to add u'www.example.com' to ALLOWED_HOSTS.

Temporary hack

Sure enough, putting this in settings.py would make it work outside of debug mode:

ALLOWED_HOSTS = ['*']

which means that there's a mismatch between the HTTP_HOST from Apache and the one that Django expects.

Root cause

The underlying problem was that the Libravatar config file was missing the square brackets around the ALLOWED_HOSTS setting.

I had this:

ALLOWED_HOSTS = 'www.example.com'

instead of:

ALLOWED_HOSTS = ['www.example.com']

June 09, 2017

LUV Beginners June Meeting: Debian 9 release party!

Jun 17 2017 12:30
Jun 17 2017 16:30
Jun 17 2017 12:30
Jun 17 2017 16:30
Location: 
Infoxchange, 33 Elizabeth St. Richmond

Debian Linux version 9 (codename "Stretch") is scheduled for release on 17 June 2017.  Join us in celebrating the release and assisting anyone who would like to install or upgrade to the new version!


There will also be the usual casual hands-on workshop, Linux installation, configuration and assistance and advice. Bring your laptop if you need help with a particular issue. This will now occur BEFORE the talks from 12:30 to 14:00. The talks will commence at 14:00 (2pm) so there is time for people to have lunch nearby.

The meeting will be held at Infoxchange, 33 Elizabeth St. Richmond 3121 (enter via the garage on Jonas St.) Late arrivals, please call (0421) 775 358 for access to the venue.

LUV would like to acknowledge Infoxchange for the venue.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

June 17, 2017 - 12:30

June 08, 2017

Heredocs with Gaussian and Slurm

Gaussian is a well-known computational chemistry package, and sometimes subject to debate over its license (e.g., the terms state researchers who develop competing software packages are not permitted to use the software, compare performance etc). Whilst I have some strong opinions about such a license, this will be elaborated at another time. The purpose here is to illustrate the use of heredocs with Slurm.

read more

June 06, 2017

Applied PKCS#11

The most involved thing I’ve had to learn this year is how to actually use PKCS #11 to talk to crypto hardware. It’s actually not that clear. Most of the examples are buried in random bits of C from vendors like Oracle or IBM; and the spec itself is pretty dense. Especially when it comes to understanding how you actually use it, and what all the bits and pieces do.

In honour of our Prime Minister saying he should have NOBUS access into our cryptography, which is why we should all start using hardware encryption modules (did you know you can use your TPM) and thus in order to save the next girl 6 months of poking around on a piece of hardware she doesn’t really *get*, I started a document: Applied PKCS#11.

The later sections refer to the API exposed by python-pkcs11, but the first part is generally relevant. Hopefully it makes sense, I’m super keen to get feedback if I’ve made any huge logical leaps etc.

June 05, 2017

LUV Main June 2017 Meeting

Jun 6 2017 18:30
Jun 6 2017 20:30
Jun 6 2017 18:30
Jun 6 2017 20:30
Location: 
The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

PLEASE NOTE NEW LOCATION

Tuesday, June 6, 2017
6:30 PM to 8:30 PM
The Dan O'Connell Hotel
225 Canning Street, Carlton VIC 3053

Speakers:

• To be announced

Come have a drink with us and talk about Linux.  If you have something cool to show, please bring it along!

The Dan O'Connell Hotel, 225 Canning Street, Carlton VIC 3053

Food and drinks will be available on premises.

Before and/or after each meeting those who are interested are welcome to join other members for dinner.

Linux Users of Victoria Inc., is an incorporated association, registration number A0040056C.

June 6, 2017 - 18:30

June 04, 2017

This Week in HASS – term 2, week 8

This week we are starting into the last stretch of the term. Students are well into their final sections of work. Our youngest students are thinking about how we care for places, slightly older students are displaying their posters and older students are giving their presentations.

Foundation/Prep/Kindy to Year 3

Our youngest students doing the stand-alone Foundation/Prep/Kindy unit (F.2) are thinking about how we look after different places this week. Students in integrated Foundation/Prep/Kindy and Year 1 classes, doing Unit F.6, are displaying their posters on an issue in their local environment. These posters were prepared in proceeding weeks and can now be displayed either at school or in a local library or hall. The teacher may choose to invite parents to view the posters as well. Students in Years 1 (Unit 1.2), 2 (Unit 2.2) and 3 (Unit 3.2) also have posters to display on a range of issues, either at the school, in a local place, such as a park, or even a local heritage place. Discussions around points of view and the intended audience of the posters can help students to gain a more in-depth understanding and critique their own work.

Years 3 to 6

Students in Years 3 (Unit 3.6), 4 (Unit 4.2), 5 (Unit 5.2) and 6 (Unit 6.2) are in the second of 3 weeks set aside for their presentations. The presentations cover a significant body of work and thus a 3 weeks of lessons are set aside for the presentations, as well as for finishing any other sections of work not yet completed. Year 3 students are considering extreme climate areas of Australia and other parts of the world, such as the Sahara Desert, Arctic and Antarctica and Mount Everest, by studying explorers such as Edmund Hillary and Tenzing Norgay, Robert Scott and Pawel Strzelecki. Year 4 students are studying explorers and the environments and animals of Africa and South America, such as Francisco Pizarro, the Giant Vampire Bat, Vasco Da Gama and the Cape Lion. Year 5 students are studying explorers, environments and animals of North America, such as Henry Hudson, Hernando de Soto and the Great Auk. Year 6 students are studying explorers, environments and indigenous peoples of Asia, such as Vitus Bering, Zheng He, Marco Polo, the Mongols and the Rus.

Speaking in June 2017

I will be at several events in June 2017:

  • db tech showcase 2017 – 16-17 June 2017 – Tokyo, Japan. I’m giving a talk about best practices around MySQL High Availability.
  • O’Reilly Velocity 2017 – 19-22 June 2017 – San Jose, California, USA. I’m giving a tutorial about best practices around MySQL High Availability. Use code CC20 for a 20% discount.

I look forward to meeting with you at either of these events, to discuss all things MySQL (High Availability, security, cloud, etc.), and how Percona can help you.

As I write this, I’m in Budva, Montenegro, for the Percona engineering meeting.

Six is the magic number

I have talked about controlling robot arms with 4 or 5 motors and the maths involved in turning a desired x,y,z target into servo angles. Things get a little too interesting with 6 motors as you end up with a great deal of solutions to a positioning problem and need to work out a 'best' choice.


So I finally got MoveIt! to work to control a six motor arm using ROS. I now also know that using MoveIt on lower order arms isn't going to give you much love. Six is the magic number (plus claw motor) to get things working and patience is your best friend in getting the configuration and software setup going.

This was great as MoveIt was the last corner of the ROS stack that I hadn't managed to get to work for me. The great part is that the knowledge I gained playing with MoveIt will work on larger more accurate and expensive robot arms.

June 03, 2017

The Why and How of HPC-Cloud Hybrids with OpenStack

High performance computing and cloud computing have traditionally been seen as separate solutions to separate problems, dealing with issues of performance and flexibility respectively. In a diverse research environment however, both sets of compute requirements can occur. In addition to the administrative benefits in combining both requirements into a single unified system, opportunities are provided for incremental expansion.

read more

June 01, 2017

Update on python-pkcs11

I spent a bit of time fleshing out the support matrix for python-pkcs11 and getting things that aren’t SoftHSM into CI for integration testing (there’s still no one-command rollout for BuildBot connected to GitHub, but I got there in the end).

The nice folks at Nitrokey are also sending me some devices to widen the compatibility matrix. Also happy to make it work with CloudHSM if someone at Amazon wants to hook me up!

I also put together API docs that hopefully help to explain how to actually use the thing and added support for RFC3279 to pyasn1_modules (so you can encode your elliptic curve parameters).

Next goal is to open up my Django HSM integrations to add encrypted database fields, encrypted file storage and various other offloads onto the HSM. Also look at supporting certificate objects for all that wonderful stuff.

May 29, 2017

Fedora 25 + Lenovo X1 Carbon 4th Gen + OneLink+ Dock

As of May 29th 2017, if you want to do something crazy like use *both* ports of the OneLink+ dock to use monitors that aren’t 640×480 (but aren’t 4k), you’re going to need a 4.11 kernel, as everything else (for example 4.10.17, which is the latest in Fedora 25 at time of writing) will end you in a world of horrible, horrible pain.

To install, run this:

sudo dnf install \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/kernel-4.11.3-200.fc25.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/kernel-core-4.11.3-200.fc25.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/kernel-cross-headers-4.11.3-200.fc25.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/kernel-devel-4.11.3-200.fc25.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/kernel-modules-4.11.3-200.fc25.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/kernel-tools-4.11.3-200.fc25.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/kernel-tools-libs-4.11.3-200.fc25.x86_64.rpm \
https://kojipkgs.fedoraproject.org//packages/kernel/4.11.3/200.fc25/x86_64/perf-4.11.3-200.fc25.x86_64.rpm

This grabs a kernel that’s sitting in testing and isn’t yet in the main repositories. However, I can now see things on monitors, rather than 0 to 1 monitor (most often 0). You can also dock/undock and everything doesn’t crash in a pile of fail.

I remember a time when you could fairly reliably buy Intel hardware and have it “just work” with the latest distros. It’s unfortunate that this is no longer the case, and it’s more of a case of “wait six months and you’ll still have problems”.

Urgh.

(at least Wayland and X were bug for bug compatible?)

Configuring docker to use rexray and Ceph for persistent storage

For various reasons I wanted to play with docker containers backed by persistent Ceph storage. rexray seemed like the way to do that, so here are my notes on getting that working...

First off, I needed to install rexray:

    root@labosa:~/rexray# curl -sSL https://dl.bintray.com/emccode/rexray/install | sh
    Selecting previously unselected package rexray.
    (Reading database ... 177547 files and directories currently installed.)
    Preparing to unpack rexray_0.9.0-1_amd64.deb ...
    Unpacking rexray (0.9.0-1) ...
    Setting up rexray (0.9.0-1) ...
    
    rexray has been installed to /usr/bin/rexray
    
    REX-Ray
    -------
    Binary: /usr/bin/rexray
    Flavor: client+agent+controller
    SemVer: 0.9.0
    OsArch: Linux-x86_64
    Branch: v0.9.0
    Commit: 2a7458dd90a79c673463e14094377baf9fc8695e
    Formed: Thu, 04 May 2017 07:38:11 AEST
    
    libStorage
    ----------
    SemVer: 0.6.0
    OsArch: Linux-x86_64
    Branch: v0.9.0
    Commit: fa055d6da595602715bdfd5541b4aa6d4dcbcbd9
    Formed: Thu, 04 May 2017 07:36:11 AEST
    


Which is of course horrid. What that script seems to have done is install a deb'd version of rexray based on an alien'd package:

    root@labosa:~/rexray# dpkg -s rexray
    Package: rexray
    Status: install ok installed
    Priority: extra
    Section: alien
    Installed-Size: 36140
    Maintainer: Travis CI User <travis@testing-gce-7fbf00fc-f7cd-4e37-a584-810c64fdeeb1>
    Architecture: amd64
    Version: 0.9.0-1
    Depends: libc6 (>= 2.3.2)
    Description: Tool for managing remote & local storage.
     A guest based storage introspection tool that
     allows local visibility and management from cloud
     and storage platforms.
     .
     (Converted from a rpm package by alien version 8.86.)
    


If I was building anything more than a test environment I think I'd want to do a better job of installing rexray than this, so you've been warned.

Next to configure rexray to use Ceph. The configuration details are cunningly hidden in the libstorage docs, and aren't mentioned at all in the rexray docs, so you probably want to take a look at the libstorage docs on ceph. First off, we need to install the ceph tools, and copy the ceph authentication information from the the ceph we installed using openstack-ansible earlier.

    root@labosa:/etc# apt-get install ceph-common
    root@labosa:/etc# scp -rp 172.29.239.114:/etc/ceph .
    The authenticity of host '172.29.239.114 (172.29.239.114)' can't be established.
    ECDSA key fingerprint is SHA256:SA6U2fuXyVbsVJIoCEHL+qlQ3xEIda/MDOnHOZbgtnE.
    Are you sure you want to continue connecting (yes/no)? yes
    Warning: Permanently added '172.29.239.114' (ECDSA) to the list of known hosts.
    rbdmap                       100%   92     0.1KB/s   00:00    
    ceph.conf                    100%  681     0.7KB/s   00:00    
    ceph.client.admin.keyring    100%   63     0.1KB/s   00:00    
    ceph.client.glance.keyring   100%   64     0.1KB/s   00:00    
    ceph.client.cinder.keyring   100%   64     0.1KB/s   00:00    
    ceph.client.cinder-backup.keyring   71     0.1KB/s   00:00  
    root@labosa:/etc# modprobe rbd
    


You also need to configure rexray. My first attempt looked like this:

    root@labosa:/var/log# cat /etc/rexray/config.yml
    libstorage:
      service: ceph
    


And the rexray output sure made it look like it worked...

    root@labosa:/etc# rexray service start
    ● rexray.service - rexray
       Loaded: loaded (/etc/systemd/system/rexray.service; enabled; vendor preset: enabled)
       Active: active (running) since Mon 2017-05-29 10:14:07 AEST; 33ms ago
     Main PID: 477423 (rexray)
        Tasks: 5
       Memory: 1.5M
          CPU: 9ms
       CGroup: /system.slice/rexray.service
               └─477423 /usr/bin/rexray start -f
    
    May 29 10:14:07 labosa systemd[1]: Started rexray.
    


Which looked good, but /var/log/syslog said:

    May 29 10:14:08 labosa rexray[477423]: REX-Ray
    May 29 10:14:08 labosa rexray[477423]: -------
    May 29 10:14:08 labosa rexray[477423]: Binary: /usr/bin/rexray
    May 29 10:14:08 labosa rexray[477423]: Flavor: client+agent+controller
    May 29 10:14:08 labosa rexray[477423]: SemVer: 0.9.0
    May 29 10:14:08 labosa rexray[477423]: OsArch: Linux-x86_64
    May 29 10:14:08 labosa rexray[477423]: Branch: v0.9.0
    May 29 10:14:08 labosa rexray[477423]: Commit: 2a7458dd90a79c673463e14094377baf9fc8695e
    May 29 10:14:08 labosa rexray[477423]: Formed: Thu, 04 May 2017 07:38:11 AEST
    May 29 10:14:08 labosa rexray[477423]: libStorage
    May 29 10:14:08 labosa rexray[477423]: ----------
    May 29 10:14:08 labosa rexray[477423]: SemVer: 0.6.0
    May 29 10:14:08 labosa rexray[477423]: OsArch: Linux-x86_64
    May 29 10:14:08 labosa rexray[477423]: Branch: v0.9.0
    May 29 10:14:08 labosa rexray[477423]: Commit: fa055d6da595602715bdfd5541b4aa6d4dcbcbd9
    May 29 10:14:08 labosa rexray[477423]: Formed: Thu, 04 May 2017 07:36:11 AEST
    May 29 10:14:08 labosa rexray[477423]: time="2017-05-29T10:14:08+10:00" level=error
    msg="error starting libStorage server" error.driver=ceph time=1496016848215
    May 29 10:14:08 labosa rexray[477423]: time="2017-05-29T10:14:08+10:00" level=error
    msg="default module(s) failed to initialize" error.driver=ceph time=1496016848216
    May 29 10:14:08 labosa rexray[477423]: time="2017-05-29T10:14:08+10:00" level=error
    msg="daemon failed to initialize" error.driver=ceph time=1496016848216
    May 29 10:14:08 labosa rexray[477423]: time="2017-05-29T10:14:08+10:00" level=error
    msg="error starting rex-ray" error.driver=ceph time=1496016848216
    


That's because the service is called rbd it seems. So, the config file ended up looking like this:

    root@labosa:/var/log# cat /etc/rexray/config.yml
    libstorage:
      service: rbd
    
    rbd:
      defaultPool: rbd
    


Now to install docker:

    root@labosa:/var/log# sudo apt-get update
    root@labosa:/var/log# sudo apt-get install linux-image-extra-$(uname -r) \
        linux-image-extra-virtual
    root@labosa:/var/log# sudo apt-get install apt-transport-https \
        ca-certificates curl software-properties-common
    root@labosa:/var/log# curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo apt-key add -
    root@labosa:/var/log# sudo add-apt-repository \
        "deb [arch=amd64] https://download.docker.com/linux/ubuntu \
        $(lsb_release -cs) \
        stable"
    root@labosa:/var/log# sudo apt-get update
    root@labosa:/var/log# sudo apt-get install docker-ce
    


Now let's make a rexray volume.

    root@labosa:/var/log# rexray volume ls
    ID  Name  Status  Size
    root@labosa:/var/log# docker volume create --driver=rexray --name=mysql \
        --opt=size=1
    A size of 1 here means 1gb
    mysql
    root@labosa:/var/log# rexray volume ls
    ID         Name   Status     Size
    rbd.mysql  mysql  available  1
    


Let's start the container.

    root@labosa:/var/log# docker run --name some-mysql --volume-driver=rexray \
        -v mysql:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql
    Unable to find image 'mysql:latest' locally
    latest: Pulling from library/mysql
    10a267c67f42: Pull complete 
    c2dcc7bb2a88: Pull complete 
    17e7a0445698: Pull complete 
    9a61839a176f: Pull complete 
    a1033d2f1825: Pull complete 
    0d6792140dcc: Pull complete 
    cd3adf03d6e6: Pull complete 
    d79d216fd92b: Pull complete 
    b3c25bdeb4f4: Pull complete 
    02556e8f331f: Pull complete 
    4bed508a9e77: Pull complete 
    Digest: sha256:2f4b1900c0ee53f344564db8d85733bd8d70b0a78cd00e6d92dc107224fc84a5
    Status: Downloaded newer image for mysql:latest
    ccc251e6322dac504e978f4b95b3787517500de61eb251017cc0b7fd878c190b
    


And now to prove that persistence works and that there's nothing up my sleeve...
    root@labosa:/var/log# docker run -it --link some-mysql:mysql --rm mysql \
        sh -c 'exec mysql -h"$MYSQL_PORT_3306_TCP_ADDR" \
        -P"$MYSQL_PORT_3306_TCP_PORT" -uroot -p"$MYSQL_ENV_MYSQL_ROOT_PASSWORD"'
    mysql: [Warning] Using a password on the command line interface can be insecure.
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 3
    Server version: 5.7.18 MySQL Community Server (GPL)
    
    Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> show databases;
    +--------------------+
    | Database           |
    +--------------------+
    | information_schema |
    | mysql              |
    | performance_schema |
    | sys                |
    +--------------------+
    4 rows in set (0.00 sec)
    
    mysql> create database demo;
    Query OK, 1 row affected (0.03 sec)
    
    mysql> use demo;
    Database changed
    mysql> create table foo(val char(5));
    Query OK, 0 rows affected (0.14 sec)
    
    mysql> insert into foo(val) values ('a'), ('b'), ('c');
    Query OK, 3 rows affected (0.08 sec)
    Records: 3  Duplicates: 0  Warnings: 0
    
    mysql> select * from foo;
    +------+
    | val  |
    +------+
    | a    |
    | b    |
    | c    |
    +------+
    3 rows in set (0.00 sec)
    


Now let's re-create the container and prove the data remains.

    root@labosa:/var/log# docker stop some-mysql
    some-mysql
    root@labosa:/var/log# docker rm some-mysql
    some-mysql
    root@labosa:/var/log# docker run --name some-mysql --volume-driver=rexray \
        -v mysql:/var/lib/mysql -e MYSQL_ROOT_PASSWORD=my-secret-pw -d mysql
    99a7ccae1ad1865eb1bcc8c757251903dd2f1ac7d3ce4e365b5cdf94f539fe05
    
    root@labosa:/var/log# docker run -it --link some-mysql:mysql --rm mysql \
        sh -c 'exec mysql -h"$MYSQL_PORT_3306_TCP_ADDR" -\
        P"$MYSQL_PORT_3306_TCP_PORT" -uroot -p"$MYSQL_ENV_MYSQL_ROOT_PASSWORD"'
    mysql: [Warning] Using a password on the command line interface can be insecure.
    Welcome to the MySQL monitor.  Commands end with ; or \g.
    Your MySQL connection id is 3
    Server version: 5.7.18 MySQL Community Server (GPL)
    
    Copyright (c) 2000, 2017, Oracle and/or its affiliates. All rights reserved.
    
    Oracle is a registered trademark of Oracle Corporation and/or its
    affiliates. Other names may be trademarks of their respective
    owners.
    
    Type 'help;' or '\h' for help. Type '\c' to clear the current input statement.
    
    mysql> use demo;
    Reading table information for completion of table and column names
    You can turn off this feature to get a quicker startup with -A
    
    Database changed
    mysql> select * from foo;
    +------+
    | val  |
    +------+
    | a    |
    | b    |
    | c    |
    +------+
    3 rows in set (0.00 sec)
    
So there you go.

Tags for this post: docker ceph rbd rexray
Related posts: So you want to setup a Ceph dev environment using OSA; Juno nova mid-cycle meetup summary: containers

Comment

LilacSat-1 Codec 2 in Space!

On May 25th LilacSat-1 was launched from the ISS. The exiting news is that it contains an analog FM to Codec 2 repeater. I’ve been in touch with Wei Mingchuan, BG2BHC during the development phase, and it’s wonderful to see the satellite in orbit. He reports that some Hams have had preliminary contacts.

The LilacSat-1 team have developed their own waveform, that uses a convolutional code running over BPSK at 9600 bit/s. Wei reports a MDS of about -127 dBm on a USRP B210 SDR which is quite respectable and much better than analog FM. GNU radio modules are available to support reception. I think it’s great that Wei and team have used open source (including Codec 2) to develop their own novel systems, in this case a hybrid FM/digital system with custom FEC and modulation.

Now I need to get organised with some local hams and find out how to work this satellite myself!

Part 2 – Making a LilacSat-1 Contact

On Saturday 3 June 2017 Mark VK5QI, Andy VK5AKH and I just made our first LilacSat-1 contact at 12:36 local time on a lovely sunny winter day here in Adelaide! Mark did a fine job setting up a receive station in his car, and Andy put together the video below showing both ends of the conversation:

The VHF tx and UHF rx stations were only 20m apart but the path to LilacSat-1 was about 400km each way. Plenty of signal as you can see from the error free scatter diagram.

I’m fairly sure there is something wrong with the audio (perhaps levels into the codec), as the decoded Codec 2 1300 bit/s signal is quite distorted. I can also hear similar distortion on other LilicSat-1 contacts I have listened too.

Let me show you what I mean. Here is a sample of my voice from LilacSat-1, and another sample of my voice that I encoded locally using the Codec 2 c2enc/c2dec command line tools.

There is a clue in this QSO – one end of the contact is much clearer than the other:

I’ll take a closer look at the Codec 2 bit stream from the satellite over the next few days to see if I can spot any issues.

Well done to LilacSat-1 team – quite a thrill for me to send my own voice through my own codec into space and back!

Part 3 – Level Analysis

Sunday morning 4 June after a cup of coffee! I added a little bit of code to codec2.c:codec2_decode_1300() to dump the energy quantister levels:

    e_index = unpack_natural_or_gray(bits, &nbit, E_BITS, c2->gray);
    e[3] = decode_energy(e_index, E_BITS);
    fprintf(stderr, "%d %f\n", e_index, e[3]);

The energy of the current frame is encoded as a 5 bit binary number. It’s effectively the “AF gain” or “volume” of the current 40ms frame of speech. We unpack the bits and use a look up table to get the actual energy.

We can then run the Codec 2 command line decoder with the LilacSat-1 Codec 2 data Mark captured yesterday to extract a file of energies:

./c2dec 1300 ~/Desktop/LilacSat-1/lilacsat_dgr.c2 - 2>lilacsat1_energy.txt | play -t raw -r 8000 -s -2 - trim 30 6

The lilacsat1_energy.txt file contains the energy quantiser index and decoded energy in a table (matrix) that I can load into Octave and plot. I also ran the same text on the reference cq_freedv file used in Part 2 above:

So the top plot is the input speech “cq freedv ….”, and the middle plot the resulting energy quantiser index values. The energy bounces about with the level of the input speech. Now the bottom plot is from the LilacSat-1 sample. It is “red lined” – hard up against the upper limits of the quantiser. This could explain the audio distortion we are hearing.

Wei emailed me overnight and other Hams (e.g. Bob N6RFM) have discovered that reducing the Mic gain on the uplink FM radios indeed improves the audio quality. Wei is looking into in-flight adjustments of the gain between the FM rx and Codec 2 tx on LilacSat-1.

Note to self – I should look into quantiser ranges to make Codec 2 robust to people driving it with different levels.

Part 4 – Some Improvements

Sunday morning 4 June 11:36am pass: Mark set up his VHF tx in my car, and we played the cq_freedv canned wave file using a laptop and signalink so we could easily vary the tx drive:

Fortunately I have plenty of power available in my Electric Vehicle – we just tapped across 13.2V worth of Lithium cells in the rear pack:

We achieved better results, but not quite as good as using the source file directly without a journey through the VHF FM uplink:

LilacSat-1 3 June high mic gain

LilacSat-1 4 June low mic gain

encoded locally (no VHF FM uplink)

There is still quite a lot of noise on the decoded audio, probably from the VHF uplink. Codec 2 performs poorly in the presence of high levels of background noise. As we are under-deviating, the SNR of the FM uplink will be reduced, further increasing noise. However Wei has just emailed me that his team is reducing the “AF gain” between the VHF rx and Codec 2 on LilacSat-1 so we should hear some improvements on the next few passes.

Note to self #2 – add some noise reduction inside of Codec 2 to make it more robust to different input signal conditions.

Links

The LilacSat-1 page has links to GNU Radio modules that can be used to receive signals from the satellite.

Mark, VK5QI, describes he car’s exotic antennas system and how it was used on todays LilacSat-1 contact.

LilacSat-1 HowTo, Mark and I have documented the set up procedure for LilacSat-1, and written some scripts to help automate the process.