Her is being heralded as a wonderful vision of humanistic, frictionless design. It's worth picking apart how its vision of design could work in practice, so we know what we're signing up for. But before we go any further, let me warn you - below there will be spoilers.
Naturally, we can't take Her as a literal description of how the technology would work. It's a film, and it has to tell a story, glossing over certain things in the process. But valuable lessons can be drawn without being tied to a specific implementation of a technology. For instance, Minority Report is a useful text for us to think about how touch interfaces might work, even if using a touch interface on a mounted screen is a bit uncomfortable in real life.
There are a couple of major examples of user interface used in Her. The first example is shown at the main character's office. Theodore works at an internet company that prepares hand-written letters for clients to present as their own.
The workers use the voice control of their computers to compose and edit their letters. In practice, this would mean a fairly loud workplace (imagine everyone in your office talking to their computers at conversational volume). You could ask everyone to whisper, but that introduces hardware problems (you'd need more sensitive microphones) and potentially voice-strain (whispering all the time is difficult).
That said, voice-control and voice-capture software does already exist, so we can look at how people use it to get a sense of how well it would work in practice.
Software such as Dragon Dictate allows you to talk to your computer and have your speech rendered as text. It's not perfect - you have to speak punctuation aloud and it takes a while for the software to calibrate your accent and vernacular - but it's definitely useful for situations where you have to capture documentation. Importantly, it can be very useful for people who are unable to operate a keyboard efficiently.
However, it's not particularly common for writers to use it. This is because for many writers, composing a text is very different from speaking a text. We tend to have different written styles from our conversational styles, and composing work as we speak it aloud is quite difficult. Many authors and writers edit as they go, which is awkward with a spoken interface. The software has to be able to differentiate commands from text, and has to be able to navigate through the written text to find the edit point. Using a keyboard and mouse to highlight and move text is a lot faster than speaking aloud a new sentence or trying to direct the software to find the right segment of text.
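To make the command-versus-text problem concrete, here is a minimal sketch in Python. It is not how Dragon Dictate or any real product works; the `COMMANDS` vocabulary and the `render` function are hypothetical names invented for illustration, and the sketch assumes speech has already been split into separate phrases.

```python
# Hypothetical command vocabulary, invented for this sketch.
# Real dictation software defines its own (and far larger) set.
COMMANDS = {
    "comma": ",",
    "full stop": ".",
    "new paragraph": "\n\n",
}


def render(phrases):
    """Turn a list of spoken phrases into text.

    A phrase that exactly matches a known command is executed;
    anything else is typed out as dictated text.
    """
    out = ""
    for phrase in phrases:
        spoken = phrase.strip()
        key = spoken.lower()
        if key in COMMANDS:
            # Punctuation and breaks attach directly to the preceding word.
            out = out.rstrip(" ") + COMMANDS[key]
        else:
            # Separate dictated text from what came before with one space,
            # unless we are at the start of the text or of a new paragraph.
            if out and not out.endswith(("\n", " ")):
                out += " "
            out += spoken
    return out


letter = render(["Dear Anna", "comma", "I miss you", "full stop"])
print(letter)
```

Even this toy version shows the friction: a writer who wants the literal words "full stop" in a letter has no way to say so, because the software will always read them as a command.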
Similarly, voice-command software such as Siri already exists on smartphones. This software - while quite complex in terms of its ability to parse statements for intent - is still restricted to conducting relatively simple tasks. Its value lies in being able to determine intent from parsing text and referring to prior actions and preferences.
It's here that we start to approach the kind of intelligent software agent that Samantha is presented as being. She is able to undertake complex tasks based on contextual information, conversational input, and prior action and preferences.
Samantha the OS is presented as 'intuitive' and able to learn from the owner's interactions. While the initial setup is an amusing little riff on the 'tell me the good things you remember about your mother' scene from Blade Runner, the reality is that learning the user's preferences in detail would require a large input of information, gathered in an excruciating, privacy-violating fashion.
This is, oddly enough, one of the few things in the film that's spelled out explicitly. Samantha reads all of Theodore's email and learns his embarrassing secrets, pressuring him to go on a date before he is emotionally ready. This sense of inappropriateness feels much worse when presented in the friendly voice of Siri or Samantha. It's akin to the 'uncanny valley' effect - the closer our technology gets to being human, the more we notice mistakes and disjunctures.
While there is a suggestion that Samantha's intelligence is what allows her to understand Theodore, the reality is that 'intuition' is just a way of saying 'socially fluent' - and ultimately we learn our social fluency through our experience of other people. Even if we achieve the creation of 'hard AI' - and we do so in a way that these intelligences remain interested in us - we'll need to allow for them to learn all about us to be 'intuitive.'
This is the key to understanding what this frictionless, humanistic design vision really requires. Samantha understands what Theodore wants - she is intuitive - because she knows all his secrets. While her friendly persona offsets the feeling of violation, this does not change the underlying dynamic. In fact, cutesy imagery, relaxing colours and friendly language are precisely the kind of tools designers use to encourage you to accept this collection of personal information. When you fill out your details on Facebook, when you allow Amazon to provide personalised suggested purchases, you are interacting with software in this same way.
None of this is to suggest that this vision of computer interaction is intrinsically wrong. But design is ultimately about negotiating acceptable tradeoffs, and it's worth understanding what we're giving up.
Barry Saunders is a user experience architect and software designer.