Sunday, 2 October 2016

Ergonomics of the Internet of Things - 1

BLUF: The Internet of Things (IoT) will have lots of small, low-powered devices with complex functionality. This is a challenging context for good ergonomics. By and large, the IoT community won't even try; they'll take the same approach to usability as they have to security. A low-powered device that can't present files in a usable order is a perhaps obscure example, but a good one. Message to the engineers: Just don't do it.

My Sony Walkman mp3 player has given good service over the years, but is not quite functioning correctly any more, and my music collection has grown well beyond its capacity. So the prospect of a small cheap mp3 player that takes a large capacity MicroSD card was too tempting. Bought.

Unusable: the tracks would not play in the right order when copied over from my PC.

MP3 file tagging is quite hard to get right, and it matters, especially for classical music (the bulk of my files).
What follows is a bit of background and a summary of what I had to do to fix it, the result of a good bit of digging around and trial and error. Even if decent instructions came with the device, it is too big a demand to make of a user who just wants to listen to music. The engineers who thought they had made an acceptable compromise in the interests of a low-power device were wrong.

In FAT32, directory entries (including a creation date/time) are stored in the order the files were written to the disc, and the mp3 player simply reads them back and shows the files in that order. That makes life difficult when you want to view or play files in alphabetical or numerical order. Windows applications can sort the files by name and play them in sequence, but small devices such as mp3 players are more limited, apparently because of their low-power constraints.

I used Drivesort after trying some other applications to sort the files on the MicroSD card into alphabetical and numerical order. I see other people have had problems with Drivesort, but it is free and it worked for me. I used a mixture of Long Name Sort and Short Name Sort. I had to do it folder by folder, which was pretty tedious. There is a subdirectory function but I couldn't get it to work.
My MicroSD card came in exFAT format, so I had to reformat it to FAT32 before I could use Drivesort. Windows wouldn't do it, so I used guiformat (free, but PayPal donations welcome).
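
For anyone comfortable with a little scripting, there is an alternative to sorting after the fact: copy the files onto the card in the order you want them played, so that the directory entries are created (and therefore played back) in that order. A minimal Python sketch, assuming the card shows up as a drive letter; the paths are hypothetical and will need adjusting to your own layout.

    import shutil
    from pathlib import Path

    # Hypothetical source and destination folders - adjust to your own layout.
    SRC = Path(r"C:\Music\Beethoven Symphony No 9")   # album folder on the PC
    DST = Path(r"E:\Music\Beethoven Symphony No 9")   # same folder on the card

    DST.mkdir(parents=True, exist_ok=True)
    # Copy tracks one at a time in name order, so the directory entries on the
    # FAT32 card are written in that order and a simple player plays them in it.
    for track in sorted(SRC.glob("*.mp3"), key=lambda p: p.name.lower()):
        shutil.copy2(track, DST / track.name)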

After the event for me, I hope, but this looks like a useful resource on file sorting for mp3 players.

Thursday, 22 September 2016

What autonomous ships are really about

It is easy for technical people (and others) to take technical ideas at face value. These days, this is often a serious mistake. In his brilliant monograph 'The Epic Struggle for the Internet of Things', reviewed here, Bruce Sterling puts us right. Google spent billions buying Nest, not for its thermostat technology, but to stake a claim in home automation. A technical device can be used to support a narrative for the future. So it is with autonomous ships.
First, a small paleofuture exercise. Go back five or six years. What was 'the future of shipping' then? Perhaps how big container ships might get, or what alternative fuels we might be using. Look at the ships in this piece by Wärtsilä from 2010 about shipping in 2030. Those ships have bridges and people. No mention of autonomous ships by anybody, probably.
Now, to the present. Go to any shipping event and ask about 'the future of shipping'. Autonomous ships will be mentioned within the first three sentences, and Rolls-Royce will be named. Rolls-Royce has put itself at the centre of the dominant narrative for the whole industry. This positioning is worth a fortune, and RR has done it at almost zero cost. A contribution to an EU research project or two, some press releases, some snazzy graphics from Ålesund - pennies. The autonomous ship has been the device used to stake that claim. Please don't mistake it for a technical exercise.

Friday, 17 June 2016

Pre-empting Hindsight Bias in Automated Warfare

"His Majesty made you a major because he believed you would know when not to obey his orders." Prince Frederick Karl (cited by Von Moltke)

The killer robot community is debating concepts such as Meaningful Human Control (MHC) and 'appropriate human judgment' with a view to their operationalisation in practical use. For the purpose of this post, the various terms are bundled into the abbreviation MHC.

After things have gone wrong, the challenge for incident analysis is to avoid 'hindsight bias'. To learn from an incident, it is necessary to find out why it made sense at the time: "to reconstruct the evolving mindset", to quote Sidney Dekker. There is a long history of the wrong people getting the blame for an incident - usually some poor soul at the 'sharp end' (Woods).

In a world of highly automated systems, the distinction between 'human' and 'machine' becomes blurred. In most systems, there are a number of human stakeholders to consider, and a through-life perspective is frequently useful.

In a combat situation, 'control' is an aspiration rather than a continuing reality, and losers will have lost 'control' before the battle - e.g. after the opponent has got inside their OODA loop. What is a realistic baseline for MHC in combat? We have to be able to determine this without hindsight bias.
How would an investigator determine the presence or absence of MHC in the reconstruction of an incident? It would be virtue signalling of the lowest order to wait until after an incident and then decide how to determine the presence or absence of MHC.

One aspect of such a determination is to de-couple decision making from outcomes. The classic paper on this topic is '"Either a medal or a corporal": The effects of success and failure on the evaluation of decision making and decision makers' by Raanan Lipshitz.
There is, of course, a sizeable literature on decision quality, e.g. Keren and de Bruin.

The game of 'consequences' developed here is intended to provide food for thought, and an aid to discussion of what an investigator would need to know to make a determination of MHC. It comprises short sections of dialogue. The allocation of function to human or machine, and the outcomes, are open to chance variation.
The information required to determine MHC might help in system specification, including the specifics of a 'human window'. It is not always the case that automation provides such a window - especially in the case of Machine Learning. So, how do we determine MHC in a combat situation? Try some of the exercises and see how much you would need to know. If the exercises here don't help make a determination - what would?

Please let me know in comments below, or on Twitter @BrianSJ3

As an aside, there are proven approaches to take in system development that can provide assurance of decision quality. This is not entirely a new challenge to the world of Human-System Integration. "What assurances are there that weapon systems developed can be operated and maintained by the people who must use them?"
[Guidelines for Assessing Whether Human Factors Were Considered in the Weapon Systems Acquisition Process FPCD-82-5, US GAO, 1981]

Sunday, 3 January 2016

Human aspects of automation - The 'Jokers'

I propose four 'Jokers' to be considered in the design and operation of automated / autonomous systems. These are not 'risks' as normally managed, though there may be ways of managing them for people who have taken the red pill. The Jokers are:
  • Affect dilemma: Users WILL attribute a personality to your system and act on it, which may or may not match the behaviour of the system.
  • Risk compensation: Users WILL use systems installed for safety purposes to achieve commercial gain.
  • Automation bias: Users WILL trust the system when they shouldn't.
  • Moral buffering: Remoteness brings moral and ethical distance. Users WILL become morally disengaged.
The Jokers need to be addressed during design and operation. There are no simple means of 'mitigating' or 'treating' them. To a large extent, engineers have got away with minor informal treatment of the (unrecognised) Jokers. This won't be possible with Robotics and Autonomous Systems.

Affect dilemma

Whether you intend it or not, your computer will be assigned a personality by its users, e.g. the Tamagotchi effect. This doesn't just apply to social robots; nuisance alarms and other such 'technical' features will be used by users in assigning a personality to the computer, and this will drive their interaction with it. This seems to be an area well short of having 'best practice', and may just need lots of monitoring, with corrective action where possible. Giving the interface personality human values sounds like a good start.

Risk compensation

Wikipedia has a good entry on risk compensation. Despite it being a well-accepted phenomenon, I have yet to encounter its explicit treatment in design, operation, or regulation. I should be delighted to hear of its appearance in even a single safety case. 'Shared Space' stands out as a cultural oddity.
Risk compensation triggered by regulation is termed the Peltzman Effect.
[Note: Wilde's risk homeostasis is not being discussed here.]

Automation bias

"The automation's fine when it works" Margareta Lützhöft. Problems can arise when it doesn't. The reliability of modern automation means that it makes sense for the user to rely on it without checking. A summary from a paper by Missy Cumming:
"Known as automation bias, humans have a tendency to disregard or not search for contradictory information in light of a computer-generated solution that is accepted as correct (Mosier & Skitka, 1996; Parasuraman & Riley, 1997).  Automation bias is particularly problematic when intelligent decision support is needed in large problem spaces with time pressure like what is needed in command and control domains such as emergency path planning and resource allocation (Cummings, 2004). Moreover, automated decision aids designed to reduce human error can actually cause new errors in the operation of a system.  In an experiment in which subjects were required to both monitor low fidelity gauges and participate in a tracking task, 39 out of 40 subjects committed errors of commission, i.e. these subjects almost always followed incorrect automated directives or recommendations, despite the fact that contraindications existed and verification was possible (Skitka et al., 1999). "
Kathleen Mosier has shown that automation bias is surprisingly resistant to extra users or training, and that automation can lead to new, different types of error. AFAIK, automation bias is not addressed in Human Reliability Analysis, or explicitly addressed in design or operation. It is recognised as a concern in reports by the CAA and Eurocontrol.
The blame-the-human language of 'over-reliance' is unwelcome but unsurprising. It raises the question of what optimal reliance would be. "The reason that current research does not unequivocally support the presence of complacency is that none of the research known has rigorously defined optimal behaviour in supervisory monitoring" (Moray & Inagaki, 2000).
Measures of trust, including trustworthiness, trustedness, and trust miscalibration, may need to be part of the answer. The Yagoda trust scale is of potential use in this context.
It could reasonably be argued that automation bias is a consequence of the affect dilemma. My grounds for keeping two separate Jokers are that, even when not independent, they are separate concerns from a design or operational point of view.

Moral buffering

Dumping your boyfriend by text message. Letting people go by email. "Distant punishment" in modern warfare. Moral buffering. The moral buffer is described by Missy Cummings.
"The concept of moral buffering is related to but not the same as Bandura's (2002) idea of moral disengagement in which people disengage in moral self-censure in order to engage in reprehensible conduct. A moral buffer adds an additional layer of ambiguity and possible diminishment of accountability and responsibility through an artifact or process, such as a computer interface or automated recommendations. Moral buffers can be the conduits for moral disengagement, which is precisely the reason for the need to examine ethical issues in interface design."
People can exploit moral buffering to generate the 'Agency Problem' as set out by Nassim Nicholas Taleb:
"Solution to the AGENCY PROBLEM: Never get on a plane unless the person piloting it is also on board.
Generalization: no-one should be allowed to declare war, make a prediction, express an opinion, publish an academic paper, manage a firm, treat a patient, etc. without having something to lose (or win) from the outcome."
Taleb links the agency problem to 'skin in the game'.
A classic demonstration of moral buffering is the 'Button Defense' in 'How To Murder Your Wife' - "Edna will never know".

The Jokers are due to appear in a paper in the Safety Critical Systems Club Newsletter, which will give them a proper citation. To be added when published this month.
There is some overlap between the Jokers and BS8611. To be the subject of a future post.

Wednesday, 30 December 2015

Providing assurance of machine decision making

"All Models Are Wrong But Some Are Useful" - George Box

The aim of Human-Machine Teams (HMT) is to make rapid decisions under changing situations characterised by uncertainty. The aim of much modern automation is to enable machines to make such decisions for use by people or other machines. The process of converting incomplete, uncertain, conflicting, context-sensitive data to an outcome or decision needs to be effective, efficient, and to provide some freedom from risk. It also may need to reflect human values, legislation, social justice etc. How can the designer or operator of such an automated system provide assurance of the quality of decision making (potentially to customers, users, regulators, society at large)? 'Transparency' is part of the answer, but the practical meaning of transparency has still to be worked out.

The philosopher Jurgen Habermas has proposed that action can be considered from a number of viewpoints. To simplify the description given in McCarthy (1984), purposive-rational action comprises instrumental action and strategic action. Strategic action is part-technical, part-social; it refers to the decision-making procedure, sits at the decision-theory level (e.g. the choice between maximin and maximax criteria), and needs supplementing by values and maxims. It may be that Value Sensitive Design forms a useful supplement to Human-Centred Design to address values.
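
To make the decision-theory level concrete: the same payoff matrix gives different answers under different criteria, which is exactly the kind of choice that values and maxims have to supplement. A toy Python sketch, with invented numbers:

    # Toy payoff matrix: rows are candidate actions, columns are possible
    # states of the world. The numbers are invented purely for illustration.
    payoffs = {
        "act cautiously": [4, 4, 5],
        "act boldly":     [1, 3, 9],
    }

    # Maximin (pessimistic): pick the action whose worst-case payoff is best.
    maximin_choice = max(payoffs, key=lambda action: min(payoffs[action]))

    # Maximax (optimistic): pick the action whose best-case payoff is best.
    maximax_choice = max(payoffs, key=lambda action: max(payoffs[action]))

    print(maximin_choice)  # act cautiously
    print(maximax_choice)  # act boldly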

The Myth of Rationality

"Like the magician who consults a chicken’s entrails, many organizational decision makers insist that the facts and figures be examined before a policy decision is made, even though the statistics provide unreliable guides as to what is likely to happen in the future. And, as with the magician, they or their magic are not discredited when events prove them wrong. (…) It is for this reason that anthropologists often refer to rationality as the myth of modern society, for, like primitive myth, it provides us with a comprehensive frame of reference, or structure of belief, through which we can negotiate day-to-day experience and make it intelligible."
Gareth Morgan

The Myth of Rationality is discussed, for example, here. The limits of rationality (or perhaps its irrelevance) in military situations should be obvious. If you need a refresher, try the Star Trek episode 'The Galileo Seven'. The myth of the rational manager is discussed here. This is not to say that vigilant decision making is a bad thing - quite the opposite. As Lee Frank points out, rationality is not the same as being able to rationalise.

The need for explanation / transparency

The need for transparency and/or observability is discussed in a previous post here. There is an interaction between meeting this need and the approach to decision making. AFAIK the types of Machine Learning (ML) currently popular with the majors cannot produce a rationalisation/explanation for decisions/outcomes, which would seem a serious shortcoming for applications such as healthcare. If I am a customer, how can I gain assurance that a system will give the various users the explanations they need?

Approach to decision making

"It is the mark of an educated man to look for precision in each class of things just so far as the nature of the subject admits; it is evidently equally foolish to accept probable reasoning from a mathematician and to demand from a rhetorician scientific proofs." - Aristotle
At some point, someone has to decide how the machine is to convert data to outcomes (what might have been called an Inference Engine at one point). There is a wide range of choices: the numeric/symbolic split, algorithms, heuristics, statistics, ML, neural nets, rule induction. In some cases, the form of decision making is inherent in the tool used, e.g. a constraint-based planning tool, forward-chaining production system, or truth maintenance system. There are choices to be made in search (depth vs. breadth) and in the types of logic or reasoning to be used. There were attempts before the AI winter to match problem type to implementation, but IMHO they didn't finish the job, and worked-up methodologies such as CommonKADS would be a hard sell now. So, what guidance is available to system designers, and what forms of assurance can be offered to customers at design time? Genuine question.
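
For readers who haven't met one, a forward-chaining production system is easy to sketch, and the sketch makes the wider point that the chosen form of reasoning shapes what explanation the machine can later offer. A toy example in Python, with invented rules and facts:

    # Toy forward-chaining production system: keep firing any rule whose
    # conditions are all present in working memory until nothing new is added.
    rules = [
        ({"radar contact", "no transponder"}, "unidentified track"),
        ({"unidentified track", "closing fast"}, "raise alert"),
    ]

    facts = {"radar contact", "no transponder", "closing fast"}

    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            if conditions <= facts and conclusion not in facts:
                facts.add(conclusion)
                changed = True

    print(sorted(facts))  # now includes 'unidentified track' and 'raise alert'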

Sunday, 20 December 2015

Human Machine Teaming - Data Quality Management

"A mathematician is a man who is willing to assume anything except responsibility." (Theodore von Karman)

"Rapid, effective decision making under conditions of uncertainty whilst retaining Meaningful Human Control (MHC)" is the sort of mantra associated with Human Machine Teaming (HMT). A purely mathematical approach to risk and uncertainty is unlikely to match the needs of real world operation, as Wall St. has discovered.

So, during the design of a system where the data are potentially incomplete, uncertain, contradictory, etc., how does the designer offer assurance that data quality is being addressed in an appropriate manner? Or are we doomed to hand-crafted systems accepted on the basis of "trust me"?

Not all forms of uncertainty should be treated in the same way; this applies to data fusion, say, and most other tasks. It is my impression that the literature on data quality and information quality is not being used widely in the AI, ML, HMT community just now - I'd be delighted to be corrected on that.

ISO/IEC 25012 "Software Engineering – Software Product Quality Requirements and Evaluation (SQuaRE) – Data Quality Model" (2008) categorises quality attributes into fifteen characteristics from two perspectives: inherent and system-dependent. This framework may or may not be appropriate to all applications of HMT, but it makes the point that there is more to data quality than just "uncertainty". Richard Y Wang has proposed that "incorporating quality information explicitly in the development of information systems can be surprisingly useful" in the context of military image recognition.
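
A rough illustration of what 'incorporating quality information explicitly' might look like in practice is to carry a few quality attributes along with each data value. The Python sketch below uses a hypothetical subset of attributes inspired by ISO/IEC 25012, not the standard's actual model:

    from dataclasses import dataclass
    from datetime import datetime

    # Sketch of carrying quality information alongside a data value.
    @dataclass
    class TaggedValue:
        value: float
        source: str            # provenance of the reading
        timestamp: datetime    # currency - when it was obtained
        completeness: float    # fraction of expected inputs present, 0..1
        credibility: float     # assessed believability of the source, 0..1

    bearing = TaggedValue(
        value=47.5,
        source="sensor B",
        timestamp=datetime(2015, 12, 20, 14, 32),
        completeness=1.0,
        credibility=0.6,
    )

    # Downstream processing can now reason about the quality of the datum as
    # well as its value, e.g. discounting low-credibility inputs in fusion.
    print(bearing)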

HMT takes place in the context of Organisational Information Processing. The good news is that this is quite well-developed for flows within an organisation (less so for dealing with an opposing organisation). The bad news is that Weick is hard work. The key term is equivocality, and I suggest that the HMT community use it as an umbrella term, embracing 'uncertainty' and other such parameters. Media richness theory helps.

"A man's gotta know his limitations" (Clint Eastwood). "So does a robot" (BSJ)

A key driver for data quality management is whether a system (or agent, etc.) assumes an open world or a closed one. Closed-world processing has to know the fine details, e.g. how a Google self-driving car interacts with a cyclist on a fixed-wheel bicycle. By contrast, GeckoSystems takes an open-world approach to 'sense and avoid' and doesn't have to know these fine details. It would seem that closed-world processing needs explicit treatment of data quality to avoid brittleness.
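
The difference is easier to see in a toy example. Under a closed-world assumption, anything missing from the knowledge base is treated as false (and hence safe), which is where the brittleness comes from; an open-world treatment keeps 'unknown' as a distinct answer. A hypothetical sketch in Python, with invented facts:

    known_hazards = {"pedestrian", "car", "lorry"}

    def hazardous_closed_world(obstacle):
        # Closed world: anything not listed is assumed not to be a hazard.
        return obstacle in known_hazards

    def hazardous_open_world(obstacle):
        # Open world: absence from the list means "unknown", not "safe",
        # so the sensible default is to avoid it anyway.
        return "hazard" if obstacle in known_hazards else "unknown - avoid"

    print(hazardous_closed_world("fixed-wheel cyclist"))  # False - brittle
    print(hazardous_open_world("fixed-wheel cyclist"))    # unknown - avoid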

Time flies like an arrow, fruit flies like a banana.

At some point, the parameters acquire meaning, or semantic values. "We won’t be surfing with search engines any more. We’ll be trawling with engines of meaning." (Bruce Sterling). The parameters may be classified on the basis of a folksonomy, or the results of knowledge elicitation. So far as I can see, the Semantic Revolution has a way to run before achieving dependable performance. Roger Schank has been fairly blunt about the present state of the art. Semantic parameters are likely to have contextual sensitivity, which may be hard to characterise.

If a system is to support human decision making, then it may need to provide information well beyond that required analytically for the derivation of a mathematical solution. Accordingly, the system may need to manage data about the quality of processing. For robotic state estimation, the user may need more than a point best estimate. Confidence estimates may need to be expressed in operational terms, rather than mathematical ones. Indeed, the HMT may need to reason about uncertainty as much as under uncertainty.
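
As a purely hypothetical sketch of 'confidence in operational terms': the estimate below carries its own quality information and renders it in language a user can act on, rather than as a bare point value. The field names and thresholds are invented for illustration.

    from dataclasses import dataclass

    @dataclass
    class PositionEstimate:
        range_m: float           # point best estimate of range
        ci95_halfwidth_m: float  # 95% confidence interval half-width
        stale_s: float           # age of the newest contributing data

        def operational_summary(self):
            # Translate the mathematical confidence into operational language.
            if self.ci95_halfwidth_m < 5 and self.stale_s < 2:
                grade = "good enough to act on"
            elif self.ci95_halfwidth_m < 20:
                grade = "usable, but seek confirmation"
            else:
                grade = "too uncertain - do not act on this alone"
            return "%.0f m +/- %.0f m (%s)" % (self.range_m, self.ci95_halfwidth_m, grade)

    estimate = PositionEstimate(range_m=120.0, ci95_halfwidth_m=3.0, stale_s=1.0)
    print(estimate.operational_summary())  # 120 m +/- 3 m (good enough to act on)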

This post is scrappy and home-brewed. Suggestions for improvement are welcome. If I am anywhere near right, then the state of art needs advancing quite swiftly. As a customer I wouldn't know how to gain assurance that the management of data quality would support safe and effective operation, and as a project manager, I wouldn't know how to offer such assurance.

Update: This is nice on unknown unknowns and approaches to uncertainty in data:

Friday, 11 December 2015

Human-Machine Teaming - meeting the Centaur challenge

At the centre of the US DoD Third Offset is Human-Machine Teaming (HMT), with five building blocks:
  1. Machine Learning
  2. Autonomy / AI
  3. Human-Machine Collaboration
  4. Assisted human operations
  5. Autonomous weapons.
The analogy with Centaur Chess is a powerful one, and potentially offers the best use of both people (H) and machines (M). However, this approach is not easy to implement. This post is a quick look at some issues of design and development for HMT. Other aspects of HMT will be addressed in subsequent posts (hopefully).

1. Human-Centred Automation

The problems of SAGE, one of the first automated systems, were well-documented in 1963. Most automated systems built now still have the same problems. "H" managed to get the UK MoD to use the phrase "so-called unmanned systems" to reflect their reality. There are people working on autonomous systems who really believe there will be no human involvement or oversight. These people will, of course, build systems that don't work. In summary, the state of the art is not good - an engineering-led technical focus leads to "human error".
The principles of human-centred automation were set out by Billings in 1991:
  • To command effectively, the human operator must be involved.
  • To be involved, the human operator must be informed.
  • The human operator must be able to monitor automated systems.
  • Automation systems must be predictable.
  • The automated system must also be able to monitor the human operator.
  • Each of the elements of the system must have knowledge of the other’s intent. 
We know a great deal about the human aspects of automation. The problem is getting this knowledge applied.
There is a considerable literature on the technical aspects of HMT, including work on the Pilot's Associate / Electronic Crewmember. The challenge is getting this expertise used.

2. Human-System Integration process

Human-System Integration (HSI) is more talked-about than done. For HMT, HSI has to be pretty central to design, development, and operation. This will require enlightened engineers, programmers, risk managers etc. There are standards etc. for HSI (e.g. the Incremental Commitment Model), though these do not address HMT-specific matters.

The state of Cognitive Systems Engineering (CSE) is lamentable. I can take some share of the blame here, having dropped my topics of interest in the AI winter (the day job got in the way). Nearly all of it is academic as opposed to practical. Some of the more visible approaches have very little cognition, minimal systems thinking, and no connection with engineering. Gary Klein's work is probably the best place to find practical resources (starting with Decision Centred Design).
MANPRINT: the integration of people and machines may go very deep and require closer coupling of Human Factors Engineering and Human Resources (selection, training, career structures, etc.) than has been the case to date. Not easy at scale.
Simulation-based design is probably the way to achieve iteration through to wargaming to support operation. Obviously there are issues of fidelity (realism) here, but they should be manageable.

3. Capability, ownership, responsibilities

The industrial capability to deliver HMT is limited, and the small pool of expertise is divided by the AI winter. Caveat emptor will be vital, and specialist capability evaluation tools for HMT don't exist (though HSI capability evaluation tools could be expanded to do the job). 'Saw one, did one, taught one' won't work here unless you want to fail.
The data (big or otherwise), algorithms, heuristics, rules, concepts, folksonomies etc. are core to military operations (and may be sensitive). It would be best if they were owned and managed by a responsible military organisation, rather than a contractor. In a sense, they could be considered an expansion of doctrine.

4. Test, acceptance

If HMT is to work as a team, then it may well be that the M becomes personally tailored to the individual H. This goes against military procurement in general, and raises questions about how to conduct T&E and acceptance. If the M evolves in harmony with the H, then this raises further difficulties. Not insuperable, but certainly challenging. Probably simpler in the context of the extended military responsibility proposed above.

5. State of the art

We are seeing the return of hype in AI. Sadly, it seems little was learned from the problems of the previous phase, exacerbated by somewhat impractical hype on ethics.
It is still as much craft as engineering to build responsible systems; there is a real shortage of good design guidance. HMT has been the province of the lab, and has not been translated into anything resembling mainstream system acquisition. Much to do.