CS 4249 - Phenomena and Theories of HCI
Transcript of CS 4249 - Phenomena and Theories of HCI
Selective Attention
Tasks that we find easy require less attention, and we employ automatic processing for them.
Tasks that are complex or novel require controlled processing.
Practice allows us to convert controlled processes into automatic ones.
CS 4249 - Phenomena and Theories of Human Computer Interaction
Kan Min-Yen
Cognitive Models
Human Processing and Abilities
How and why do affective models matter in HCI?
Emotion and Affective Models
Understanding how people make sense of their physical and virtual environments
Navigation
Foundations: phenomena and theories
HCI design requires understanding both computers and humans.
Human Factors
Phenomena
How do people …?
What observations can we make about people?
Theories
Our explanations for the phenomena
Surprisingly, less agreement than we might expect
Ties into psychology, cognitive sciences, ergonomics
Leads us to:
Guidelines, frameworks and heuristics for usability
Student presentations have presented, and will continue to present, the basic component theories in HCI.
This week and Week 10 will bring these theories together so that we'll better understand ourselves.
Outline
Human Information Processing
Memory and Attention
Human Abilities
Cognitive Models
Navigation and Wayfinding
Crowdsourcing
Social, Emotional and Affective Factors
Social Interaction
Human communication
Challenges in Computer Supported Collaborative Work
Memory
Attention
Models of Memory
Perception
Visual
Auditory
Haptics
Goals of Theories
Descriptive
Predictive
Models of Attention
1. Not a simple information store
Generally agreed that it has a short- and long-term division
2. Not a passive repository. Active processes change its structure.
3. Not unaffected by the type of material to be stored.
1. Sensory Stores:
2. Short Term / Working Memory:
3. Long Term Memory:
Permastore
Memory Stores and Processes
Accessibility: can we retrieve the stored memory?
Availability: is the memory actually stored in our brain?
Forgetting
Working Memory
Central Executive: assigns processing to the two slave modules
Articulatory Loop: Inner Ear (~2 secs) and Inner Voice (repeating to yourself to remember)
Visuo-Spatial Sketchpad: Inner Eye (remembering an image that you've just seen)
Stroop (1935) Effect
Why are these models important?
Models of working memory tell us why certain abilities are impaired: we don't process both verbal and visual information in parallel very well. Our understanding of multitasking and cognitive load supports these findings.
Long Term Store
The synthesis of proteins allows neurons to form the pathways that encode our LT memory.
This process is called encoding or storage
Retrieval is then the process of recalling memory.
The active process of elaboration can change the effectiveness of retrieval.
Rehearsal keeps things in short term memory.
Decay or displacement moves things out of the WM.
Miller's 7 +/- 2 is perhaps the most famous fact about WM, and it is often misapplied by HCI designers. Why?
Also, there's been some evidence that in today's generation the number is closer to 4 +/- 1 (Cowan, 2002). Think about why.
Studies place LT memory in different places depending on whether it is declarative* or nondeclarative (implicit).
*Declarative memory is stored in the temporal lobe.
Joshua Foer: Feats of memory anyone can do
Working Memory Model revision tutorial
"Cells that fire together, wire together"
Hebbian Theory suggests we forget when we don't use our memories. Basis for some (recent) A.I. theories of learning.
Interference Theory suggests that other acts of remembering or learning can supplant retrieval cues for the original memory.
Retrieval Failure Theory suggests we can't retrieve things when the wrong retrieval cue is employed (cf elaboration).
As HCI practitioners, it's likely that we'll want to use one channel as the primary channel and use another channel to reinforce it.
Design to promote recognition rather than recall.
Provide users with different methods for encoding information and procedures.
Divided Attention (Multitasking)
Divided Attention as capacity allocation
Automatic vs. Controlled Processing
Depends on the nature of the tasks and how much attention each demands
Enduring Dispositions
Momentary Intentions
Arousal (Capacity)
Allocation Policy
Tangent: Alan Kay and Tim Gallwey
Alan Kay on Learning and Computer Science
"The best way to predict the future is to invent it."
For HCI and problem solving: Use the appropriate channel.
Alan Kay also cites Tim Gallwey's "The Inner Game of Tennis": you can learn tennis in an afternoon if you don't try too hard.
Alan Kay: Doing with Images Makes Symbols Pt 2 (1987)
Inner Game of Tennis (Tim Gallwey method)
Visual Perception
We construct meaning and recognition from sensory store.
Normally sighted people perceive a stable, 3D, color world. Thus we have constancy in these same dimensions. Violating these, we get surprising illusions.
Müller-Lyer Illusion
Necker Cube
Red car by day
Red car by night
Images from Wikipedia
Visual Perception
As humans, we are particularly sensitive to motion and human likeness.
*works on all mammals
The phenomenon of Mori's Uncanny Valley can be seen as one manifestation of this.
Due to Alan Kay
Motor Skills
Motor Ability
In studies of physiology and psychology there are actually two homunculi: one describing sensing and another describing our motor control.
Kahneman's (1973) capacity allocation model
Depth Perception
Perceiving* depth is based on a number of recognized cues.
Primary cues are largely based on the optic image received by the eyes and on muscle control.
Secondary cues are based on only a single image (i.e., one eye):
Light and Shade
Linear Perspective (e.g., Müller-Lyer arrows)
Motion Parallax, Texture Gradient
Relative Size
Gestalt Perception
The Gestalt psychologists were a group working to look at the perception of wholes from parts* (cf. holistic cognitive models).
Closure
*Applies to other things aside from vision
Color Perception
From visionweb.com
Rods are more numerous and highly sensitive, but cannot see color.
Cones are wavelength-specific and detail-oriented, but need lots of light.
Visual attention orients the cone-rich portion of the retina.
Auditory Perception
Loudness is measured in (deci)bels
The bel scale is logarithmic: 40 dB is 10 times the sound intensity of 30 dB (although it is perceived as only roughly twice as loud)
We hear better at lower pitches (frequencies)
Normal conversation goes from 100 Hz - 4 kHz
But humans can hear up to 22 kHz
Auditory Perception
The Nyquist-Shannon sampling theorem tells us that we need to sample at more than twice the highest frequency in a signal to reconstruct it perfectly.
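As a rough illustration of the sampling rule, the sketch below computes the minimum sampling rate for a given highest frequency and shows how a tone that is sampled too slowly folds back (aliases) to a lower apparent frequency. The function names are made up for this example; the 22 kHz figure is the hearing limit quoted above.

```python
def nyquist_rate(highest_frequency_hz: float) -> float:
    """Lower bound on the sampling rate (Hz): twice the highest frequency."""
    return 2.0 * highest_frequency_hz


def apparent_frequency(tone_hz: float, sample_rate_hz: float) -> float:
    """Frequency a pure tone appears to have once it has been sampled.

    Tones above half the sampling rate fold back (alias) to a lower frequency.
    """
    folded = tone_hz % sample_rate_hz
    return min(folded, sample_rate_hz - folded)


if __name__ == "__main__":
    print(nyquist_rate(22_000))                 # 44000.0
    print(apparent_frequency(5_000, 44_000))    # 5000  (captured faithfully)
    print(apparent_frequency(30_000, 44_000))   # 14000 (aliased)
```

A 30 kHz tone sampled at 44 kHz is indistinguishable from a 14 kHz tone, which is why the signal has to be band-limited before sampling.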
The sampling rate for CDs (44.1 kHz) reflects this.
The one and same Claude Shannon of information theory
Haptic Perception and Kinaesthetics
Haptics refers to touch, whereas kinaesthetics refers to our sense of our body's position.
Force feedback and vibration are newer methods of interaction
We are much more sensitive to sensation in certain body parts
l: sensory homunculus r: Penfield's motor homunculus
Listen for Joshua's description of distributed cognition.
Listen to why he thinks Miller's 7+/-2 is changing.
Crowdsourcing
In our class, we've explored Fitts' Law and the Keystroke Level Model as examples of predictive laws for motor ability.
Fitts' Law, revisited
Selective Attention
Levels of Processing
Keystroke Level Model
GOMS (Goals, Operators, Methods and Selection)
Distributed and External Cognition
Seven Stages of Action
Plans and Scripts
Pleasure Models
Emotional Design Models
Collective Effort Model
Conversational Analysis
Information Scent
Berrypicking
We've already seen what Fitts' "law" is: a model to predict how long it will take for a pointing action to take place.
It is based on the ratio between the distance covered (D) and the width of the target (W): the larger the ratio, the longer the movement takes, with time growing logarithmically rather than linearly.
Fitts' Law, revisited
Since distance is a large factor in Fitts' law, it has been a justification for developing pie menus.
It is a one-dimensional measure that doesn't account for the "height" or "depth" of a target (in 2D or 3D space). Think about when it is appropriate to use Fitts' law and what modifications you have to make for other scenarios.
Fitts' Law
Keystroke Level Model
GOMS
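A minimal sketch of the Fitts' law prediction, assuming the commonly used "Shannon" formulation MT = a + b * log2(D/W + 1). The constants a and b below are placeholders chosen only for illustration; in practice they are fitted to measurements for a given device and user group.

```python
import math


def index_of_difficulty(distance: float, width: float) -> float:
    """Index of difficulty in bits: log2(D/W + 1)."""
    if distance < 0 or width <= 0:
        raise ValueError("distance must be >= 0 and width must be > 0")
    return math.log2(distance / width + 1.0)


def movement_time(distance: float, width: float,
                  a: float = 0.1, b: float = 0.15) -> float:
    """Predicted pointing time in seconds: a + b * ID.

    a and b are hypothetical constants, not measured values.
    """
    return a + b * index_of_difficulty(distance, width)


if __name__ == "__main__":
    print(index_of_difficulty(700, 100))   # 3.0 bits
    # Same distance, wider target: predicted to be faster.
    print(movement_time(700, 100) > movement_time(700, 350))   # True
```

Because the difficulty is a logarithm of the ratio, doubling both D and W leaves the prediction unchanged, which is one reason the one-dimensional form needs care in 2D or 3D layouts.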
Hick Hyman Law
Keystroke Level Model
Fitts' Law predicts the time to complete a pointing operation. But it is just one of many possible (human) operations in a computer system. KLM puts these together by defining a set of different operators, where each operator is given a time estimate to complete.
We're now done with our lecture on human abilities.
For you to think about:
- How do the models we discussed apply to mobile phone applications?
- How can we apply the varying types of perception we have to best support HCI? (cf Guiard's Model of Bimanual Skill)
To find a task's completion time, one decomposes the task into its set of keystroke-level operators and sums up the times of all the operators.
Later, other models add context to these notions.
Models for how we think and decide.
Started with an empirical motivation to measure individuals.
Human Processor Model & GOMS
The Human Processor Model and GOMS were both developed by Card, Moran and Newell. The Human Processor Model models individuals as a set of three processes: perceptual, motor and cognitive. Sound familiar? Both are predictive models of a single person, conceptualizing a task as a series of processes whose times can be added up sequentially.
Goals, Operators, Methods and Selectors
GOMS can be thought of as an elaboration on the human processor model. A task to be accomplished has a Goal and Methods (series of steps to accomplish it). Methods can be broken down hierarchically into a series of Operators that carry out a task. Selectors are the decision logic (if-then-else rules) that specify what to do after observing an outcome.
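The additive idea behind the Keystroke Level Model and the Goal / Method / Operator / Selector structure can be sketched in a few lines. The operator times are approximate values commonly quoted for KLM and should be read as placeholders, not measurements. The goal (deleting a selected file), the two methods and the selection rule are all hypothetical, invented for this example.

```python
# Approximate operator times in seconds (commonly quoted KLM estimates;
# placeholders for illustration only).
OPERATOR_TIMES = {
    "K": 0.2,   # press a key or button
    "P": 1.1,   # point with the mouse at a target
    "H": 0.4,   # move a hand between keyboard and mouse
    "M": 1.35,  # mental preparation
}

# Goal: delete the selected file. Each hypothetical method is a sequence of operators.
METHODS = {
    "menu": ["H", "M", "P", "K", "M", "P", "K"],  # reach mouse, open menu, pick "Delete"
    "shortcut": ["M", "K"],                       # recall and press the Delete key
}


def predict_time(operators):
    """Sum the operator times sequentially, as KLM does."""
    return sum(OPERATOR_TIMES[op] for op in operators)


def select_method(knows_shortcut: bool) -> str:
    """Selection rule: an if-then-else choice between methods for the goal."""
    return "shortcut" if knows_shortcut else "menu"


if __name__ == "__main__":
    for knows in (True, False):
        method = select_method(knows)
        print(method, round(predict_time(METHODS[method]), 2))
```

The dictionary of methods plays the role of the GOMS hierarchy, and the selection rule is where a fuller model would encode the user's knowledge and context.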
If this sounds like a programming language, that's because it almost is one. Simulations of task completion time are done in computer-facilitated versions of GOMS.
Norman's Gulfs of Execution and Evaluation
Levels of Processing
Situated Action
Scripts
Activity Theory
Gulf of Execution - The user does not know how to accomplish her task
Gulf of Evaluation - The user does not know how to check on the effects of her actions
These "gulfs" often arise because there's a mismatch between the mental models of the user and of the designer.
A good way to provide a mental model is to draw on the user's experience using a metaphor: "Think of swiping as opening a sliding window"
Quick Q: Can you find all three red terms in Bill Verplank's sketch?
Instead of looking at the structure of memory, Levels of Processing proposed that memory is simply the by-product of the analysis we do to interpret the world.
The more complex the analysis, the stronger and more durable the memory trace (cf long term memory, and the Foer video on elaboration).
Why is this important? If we want a person to remember something, LoP asserts that we need them to think more deeply (semantically, again elaborate) about the subject.
Your turn: In some ways, this contradicts the HCI principle of lowering the cognitive load of the user. What do you think?
Hick-Hyman Law
The key part of Hick's law is that it is logarithmic, that is, it's not linear.
As with Fitts' law, psychologists wanted to create a predictive model, here for dealing with choice (selectors in GOMS) (cf Shannon's work on Information Theory).
Tangent: Often, choices carry different amounts of entropy. Common choices carry less information (entropy) and are thus faster to make. To make a choice that's rare takes more time.
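A small sketch of the logarithmic relationship, assuming the usual form RT = a + b * log2(n + 1) for n equally likely choices, plus Shannon's measure of the information carried by a single choice. The constants a and b are placeholders, not fitted values, and the function names are made up for this example.

```python
import math


def hick_time(n_choices: int, a: float = 0.2, b: float = 0.15) -> float:
    """Predicted decision time (s) for n equally likely choices: a + b * log2(n + 1)."""
    if n_choices < 1:
        raise ValueError("need at least one choice")
    return a + b * math.log2(n_choices + 1)


def surprisal_bits(probability: float) -> float:
    """Information (bits) carried by one choice made with the given probability."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability must be in (0, 1]")
    return -math.log2(probability)


def entropy_bits(probabilities) -> float:
    """Average information (Shannon entropy) over a set of choice probabilities."""
    return sum(p * surprisal_bits(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    # Going from 3 to 7 choices costs the same extra time as going from 7 to 15.
    print(round(hick_time(7) - hick_time(3), 3), round(hick_time(15) - hick_time(7), 3))
    print(surprisal_bits(0.5), surprisal_bits(0.125))   # 1.0 3.0
    print(entropy_bits([0.5, 0.25, 0.25]))              # 1.5
```

A common choice (probability 0.5) carries 1 bit while a rare one (probability 0.125) carries 3 bits, which is the sense in which rare choices are predicted to take longer.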
E.g., the response time for avoiding sudden obstacles on the road (a falling tree) is longer than for avoiding more common moving obstacles (errant pedestrians).
Scripts also have metadata that describe when they are applicable.
For us, this is how scripts relate to situated action. Using metaphor, we can overcome the gulfs of execution and evaluation by invoking scripts and the situated knowledge of other experiences.
Plans, as defined by Schank, are the series of actions needed to satisfy goals. I think of this as equivalent to GOMS's goals and methods. To me, GOMS offers a more compelling (operational) description of this.
Originated in the precursors to Artificial Intelligence.
A key factor is that missing descriptions are supplied by recall of other related experiences.
(and Plans*)
Distributed and External Cognition
Comparing Contextual Models
"Situated" in the contexts in which an activity occurs. These include the organizational, social and technological contexts.
HCI is a collaboration between the users and the system designers (cf Norman's Gulfs).
A key point is that situated action states that much real human activity happens in the now, in the immediate and personal context.
This notion of immediacy is an opposing viewpoint to the regularity of scripts and plans. In certain situations, our actions are opportunistic, flexible, reactive and individualized.
An activity is the unit of analysis and is diagrammed to show the actors (subject), outcome (object) and artifacts (tools).
In AT, the activity is the context. In subsequent revisions, the activity schema evolved to encompass rules (working practices; scripts), community and division of labor. Thinking about the parts of the schema can help identify mismatches (contradictions) when a part of the activity changes.
The unit of analysis here is the system.
In a DC analysis, we look for the structure of how both people and artifacts interact.
There can be many analyses for the different people and artifacts involved.
It is different from Activity Theory, which looks at these and other factors holistically as an activity.
The 3 models have different foci, but all are valuable for specific analyses.
Persistent structures: Activity theory and distributed cognition are similar, but distributed cognition puts humans on parity with artifacts; AT keeps the actor as the focus. Situated action emphasizes temporary or improvisatory artifacts.
People vs. Artifacts: AT is human-centric. Distributed cognition again puts both as equal parts of a whole.
From one limited perspective, Situated Action describes people as reactive to their immediate environment.
http://csalt.lancs.ac.uk/alt/lucy/
Summary
Cognitive Models started as descriptive and explanatory models to help HCI understand how people make decisions.
1st generation: linear, isolated problem solving.
2nd generation: interaction in the environment, with others and artifacts, to identify problem areas within a system.
Affective Computing
Goals of Affective Computing
1. Recognizing Emotion
2. Synthesizing Emotion
3. Causing Emotional Responses
Psychological Theories of Emotion
Ekman, Friesen and Ellsworth (1972) and Plutchik (1980): six to eight basic universal emotions.
Affective Computing
Detecting and Recognizing Emotions
Overt input: facial expression, voice intonation, gesture, movement, posture and pupillary dilation
Intrusive input: respiration, skin conductance, pulse, blood pressure, perspiration and temperature.
Current A.I. technologies for emotion recognition are getting better (70-90% for different emotions), but they rely on intrusive detection.
Tangent: Speech recognition technology and problems with hyperarticulation.
Persuasive Design
http://library.thinkquest.org/25500/index2.htm
Alternative models (Russell and Fernandez-Dols, 1997) distinguish only degrees on two dimensions: arousal and pleasure.
Generally agreed that there are 3 components:
1. The subjective experience of emotion
2. Accompanying physiological changes
3. Change in higher level behavior
Emotions affect the state of mind and the type of thinking done.
Fear and anxiety trigger a survival instinct, manifesting in higher but narrowly concentrated effort (depth-first processing).
Happiness allows more tolerant behavior and allows people to accept and generate more alternatives (breadth-first processing).
We can design to promote certain types of behavior by looking at the connection between the visceral and behavioral elements. We can use our understanding of emotion and pleasure to cause changes in behavior: starting, stopping or continuing a behavior by tapping on these elements.
Outline:
Models of Emotions
Application to Persuasion and Gamification
Pleasure Models
Towards Gamification
Gamification: The use (not extension) of design (not technology or practices) elements (not whole) characteristic of games (not play) in non-game contexts.
Summary
Affective computing: employing an understanding of emotions to design systems better.
Affective theory gives designers more ways to analyze a system to look for potential conflicts.
Important longitudinal effects: emotion, moods and pleasure are changing and individualized.
Large changes on the horizon; plenty of opportunity for more development. cf gamification.
Jane McGonigal: The game that can give you 10 extra years of life
Suits' The Grasshopper argues that in a utopia where material needs are fulfilled, game playing is left as the ultimate form of good.
Examine the connections between the pleasure model and the challenges that Jane asked her audience to play.
"Usability" is equated with psycho-pleasure - designing around human error and mitigating negative emotion
Keep this model in mind as we continue to think about what HCI really is about.
Fogg's Behavior Model (2009) suggests three components to such persuasive design:
1. Motivation
2. Ability
3. Triggers
Study of gamification is not new: this subject used to be called funology or serious gaming.
We'll return to this notion in crowdsourcing at the very end of this course.
Human Communication
Group Formation and Norms
How do groups form? a) Forming, b) Storming, c) Norming, d) Performing and e) Decay
Spatial and Temporal Matrix
Discourse and Conversational Analysis
Attend to not only content but also prosody (pauses, intonation, rhythm).
Also includes non-verbal communication (NVC):
Body language, posture
Computer Supported Collaborative Work (CSCW)
Individuals are already quite different in their workflow and values; this is compounded in group environments.
Grudin's (1994) challenges, informed by Ackerman:
Tradeoffs between awareness and privacy
3 principles of social translucence:
1. Visibility: two-way, both participant and observers. Can generate privacy issues. Opacity.
2. Awareness: Situational awareness of the system and actors' context (situated action)
3. Accountability: Knowledge connecting actions to identity (social loafing)
Summary
Designing for navigation
Account for the different activities that people undertake in a space.
A focus of other disciplines that we can piggyback from:
urban planning, interior design.
Paths - "Pave the cowpath"
Signage - Show landmarks, and paths by use
Landmarks - In guiding others, people overwhelmingly use these in their directions, whether in a map or path context.
Quick Question 1: There are also big differences between web and physical navigation. What's one of the largest differences?
Quick Question 2: What's the connection between landmarks and social networking systems?
Summary
Navigation
Information Foraging and Berrypicking
Group norms and cultural differences (cf Hofstede):
- Individualism/Collectivism "Conformity": Asch
- Power distance (Hofstede)
- Uncertainty avoidance
- Long/Short Term Orientation
- Masculinity/Femininity
Asch Conformity Experiment
Space / Time matrix:
Same space, same time: F2F meetings; Meeting support tools
Same space, different time: Stickies; Project Management / Version Control
Different space, different time: Email; Shared Information Spaces; Threaded discussions
Different space, same time: Tele-conference; Instant messaging
Understanding social phenomena is still very much work in progress. We are informed by studies of culture, but global Web 2.0 sites are still covering new ground.
Social bookmarking, information trails, profile management, likes, +1 are also examples.
1. Disparity between work and benefit
2. Critical mass (under- and over-use)
3. Social and motivational factors (cf social loafing)
4. Exception as normal (cf availability)
5. Independent use
6. Evaluation (i.e., longitudinal evaluation)
For tasks that are not completely well-formed.
Rely on multiple sources, use strategies that work in other contexts.
Connections to distributed cognition and CSCW.
Exploration (Map / Browse / "Hot")
Context, and relation of objects to other objects, holistic understanding
Wayfinding (Path / Search / "Cold")
Goal-driven use for a particular purpose or destination
Object Identification
Landmarks
To think about: We're certainly not at the logical endpoint of "likes" and other "+1"-like systems. In terms of awareness and privacy, what more can be done?
What are the related methodologies in (web) information seeking tasks?
View of the World from 9th Avenue, The New Yorker, Saul Steinberg
Useful analogies from physical wayfinding
Different models of behavior from searching / browsing
Not only past use but ease of use and information scent
Teleportation via search causes disorientation
Search engines always available
Better support for web orienteering important, also a province of information retrieval.
Eric Whitacre: A virtual choir 2,000 voices strong
Asking the general public to do work. Related to human-based computation.
Last Words
"Any sufficiently advanced technology is indistinguishable from magic." - Arthur C Clarke
Technology will continue to shrink, to the point it becomes invisible and ubiquitous.
Human Computer Interaction => Human Human Interaction
Origins: Captcha + Distributed Proofreading
http://en.wikipedia.org/wiki/The_Turk
Captcha: Completely Automated Public Turing test to tell Computers and Humans Apart
Alan Turing!
Distributed Proofreading: Scholars wanted to proofread the scanned and recognized output of classic, out-of-copyright literature for the betterment of humanity.
Luis von Ahn: Re-Captcha
Leverage this to tell people apart from robots.
Leverage this to have people help computers.
Games with a Purpose (GWAP)
Get people to play games to help do something useful (i.e., substituting entertainment for payment).
cf Jane McGonigal's earlier talk about games.
now synthesizing new proteins!
The Long Tail
Chris Anderson's description of the effect of the web.
How Amazon, eBay, Netflix function and profit from web scale.
Quick Question: What is the ramification for universal usability?
Image credits: Wikipedia; walodesign.de; thestrategyguysite.com; sapdesignguild.org; http://www.slideshare.net/dings/persuasive-web-design-how-to-separate-users-from-their-bad-behaviours; A Behavior Model for Persuasive Design - BJ Fogg; TheraminTree's Conformity YouTube video; ibm.com; twistedphysics.typepad.com; recaptcha.net; nature.com; worldwithoutoil.org