Article
Professional Website Usability
Part 3: Preparing for a Web Usability Study
Preparing for your Website Usability study involves two steps:
- Developing your evaluation method or tool
- Finalizing logistics
Let's walk through them both in detail.
1. Developing your Evaluation Method or Tool
Regardless of how you plan to administer your usability study, you'll want to establish an approach for recording the raw data generated by your participants. This may take the form of using recording equipment for later review and analysis or designing a hard-copy or online form that participants or observers will complete. Because resources were available, my experience has generally included one observer for each participant with the observer making the notations on the evaluation tool. This frees up the participant so they can focus on the task at hand. If I didn't have access to these observers, I certainly would use a camera or other recording device to capture the participant's information.
Developing the evaluation tool is an important step in the process, so include time in your plan to draft the tool, have your team or staff review it, get management approval, and have an objective third party walk through it once or twice. Time the walkthrough and modify the tool as needed. The evaluation tool itself can be a Word document or take the form of a web form for automated input. Regardless of how you produce the tool, it should have several basic components.
- A number assigned to each participant (for anonymity)
- The observer's name (in case you need to follow up with them later)
The five parts below make up the contents of the evaluation tools I've created, but you can certainly develop your own.
Specific to Website studies, I've gotten all the information I needed using these basic components; it is a simple approach and it works. The main limit on the length of the evaluation criteria is the time allotted for the study. If you have a 30-minute study, you want an evaluation tool that most participants can finish in less than that, leaving time for those who are more diligent or not as well paced as the others (and there are always a few stragglers!).
With the time constraint, you'll want to be selective in your activities and questions and prioritize the activities you most need to test. I've managed 5-minute, 10-minute, 30-minute and 1-hour studies. If the study is more than 1 hour, you might want to rethink or re-prioritize your questions and activities; asking a participant to give more than an hour of their time sitting at a computer and working through what they might perceive as a difficult assignment may not be very appealing.
If the study takes too long to complete, or is perceived to be boring or dragging on, it may hurt participation in this or future studies. If your incentive is really good, participants may be willing to stay, but I would bet that the quality of their output drops significantly after an hour. An alternative is to break a two-hour session in two by providing a meal in the middle, then resuming for the remaining hour.
These are the five parts I've used for Website Usability studies. The testing criteria are the questions, statements and activities that take place under each of the five parts:
Part 1 - General Survey
Includes a few general questions about web use (How often do you use the web? Have you ever visited our website? When you did, what were you primarily looking for?); you could certainly add other questions here if you needed more specific information from your participants about how they want to interact with your company (How would you like to communicate with us? From our company, do you prefer instructor-led or online training? Do you wish to subscribe to our newsletter?) This section includes the questions you need to better understand your user base and should be no more than 5-7 open-ended questions.
Scoring the General Survey
Responses will be text, so read through each carefully to learn more about your participant, how they use the web, how they use your website and how they want to interact with you. The information from these results can build on your understanding of the demographics associated with web use relative to your industry, user or specific websites. These results can be used to improve business process, create new communications strategies or even identify new market niches.
Part 2 - Treasure Hunt
In the Treasure Hunt, the participant is asked to take a deeper dive into the site to find, retrieve or download specific pieces of information or perform a function or transaction. For example, you may have the following statement, "From the home page, find the benefits change form and modify your status from single to married." You could make these activities increasingly difficult. For example, select several items on the website that you feel are particularly buried; you'll want to test these especially if they are more than 3 clicks away from the home page.
In addition, make the direction more complex, for example, "From the home page, which path would you follow to find and download a leave slip; without returning to the home page, describe three features of the telephone directory." If the site includes interactivity, add a few exercises to test the usability of these features and have the participant perform a mock interaction. If you want to perform the Treasure Hunt, place it as the first section of your study so the participants don't have a chance to become more familiar with your site through other activities—you want first-impression material here!
Scoring the Treasure Hunt
(I'm not a statistician or rocket scientist, but I made up this intuition scoring system and it has worked well for me.) These exercises produce text responses, which are difficult to score directly. Instead of a numeric score, use four categories and ask the participant to circle the one most relevant to their experience: a) I completed this task, b) I'm not sure if I completed this task, c) I did not complete this task, or d) I gave up. Then, when you return to the raw results, you can count how many were a, b, c or d and determine participant ease or difficulty from these findings. For example, let's say you have 10 participants and you convert the a-d choices into numbers by counting up all your a's, b's, c's and d's.
Total up each to see how many participants fell into each category. By specifying ranges for the number of participants who completed the task, you can then determine the level of intuitiveness (0-3=low intuitive, 4-7=medium intuitive and 8-10=high intuitive). If you have 10 participants and all 10 chose a) I completed this task, you can surmise that the activity was high intuitive (they found what they were looking for and you did a great job!). On the other hand, if 8 of the 10 chose d) I gave up, you have a low intuitive score and need to do more work to place that item more intuitively for the user. When you prioritize which changes to make, focus on those low to medium intuitive scores first.
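If you collect the circled answers electronically, the tally can be automated. The following is only a minimal sketch in Python, not a definitive implementation. It assumes exactly 10 participants and assumes the 0-3, 4-7 and 8-10 ranges apply to the number of "a" (completed) answers; the function names and the sample data are hypothetical.

```python
from collections import Counter

OPTIONS = "abcd"  # a = completed, b = not sure, c = did not complete, d = gave up


def tally_responses(responses):
    """Count how many participants circled each option for one task."""
    counts = Counter(responses)
    return {option: counts[option] for option in OPTIONS}


def intuitiveness(completed):
    """Map the number of 'a' answers (out of 10 participants) to a level."""
    if not 0 <= completed <= 10:
        raise ValueError("expected a count between 0 and 10")
    if completed <= 3:
        return "low"
    if completed <= 7:
        return "medium"
    return "high"


# Hypothetical task where 8 of 10 participants gave up.
task = tally_responses(["d"] * 8 + ["a", "b"])
print(task)                      # {'a': 1, 'b': 1, 'c': 0, 'd': 8}
print(intuitiveness(task["a"]))  # low
```

With a different number of participants, the ranges would need to be rescaled before the levels mean anything.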
Part 3 - Anticipation/Intuition
This section may include a few exercises to confirm that your assumptions about labeling and navigation are meaningful to the user (e.g., when they click the link, they will get exactly what they expected to get). This section can be done in a two-column table with "Label/Link" in the first column header (with your labels below, one on each row) and "Expectation" in the second column header. The observer then notes in the second column what the participant expected to find. Users expect to find what they need easily and quickly on your site, so check that you're referring to these items correctly and that the names and identifiers make sense and are intuitive to the user.
Scoring Anticipation/Intuition
This test with the table will net text responses, so you'll want to read through the comments in the second column to see where labeling or navigation cues didn't work well for the participants. If the majority noted a similar issue, I would tend to agree that more work is needed on that particular item. If you can't score it, go with what the majority said, I always say! On the other hand, if you restructure the question to give simple statements like, "The screen layout works well for me," you could use a 1-5 scale (1=strongly disagree, 2=disagree, 3=agree somewhat, 4=agree or 5=strongly agree) and have the participant circle the number that most closely matches their opinion about the statement.
I also like to give room for the participant to tell me why the item didn't work well for them (I request more information on any scores under 3). Then, you could total up all the individual scores and divide by the number of participants to produce the average. With 10 participants, if 5 participants strongly agree (5x5=25), 3 participants agree somewhat (3x3=9) and 2 participants strongly disagree (2x1=2), add up the totals (25+9+2=36) and divide by 10 (your number of participants). 36 divided by 10 = 3.6. On the 1-5 scale, 3.6 sits in the middle-to-high range, so some work still needs to be done. Closer to 1 would indicate dissatisfaction and that more work is required; closer to 5 would mean that the participants were more satisfied and less work is required. You have the detail behind the scoring to determine which areas need work. You get the idea!
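The average in the 10-participant example can be checked with a few lines of Python. This is just a sketch of the arithmetic; the function name and the dictionary layout (rating mapped to the number of participants who chose it) are assumptions made for illustration.

```python
def average_score(participants_by_rating):
    """Average a 1-5 rating given {rating: number of participants}."""
    total_points = sum(rating * n for rating, n in participants_by_rating.items())
    participants = sum(participants_by_rating.values())
    return total_points / participants


# 5 strongly agree (5), 3 agree somewhat (3), 2 strongly disagree (1)
ratings = {5: 5, 3: 3, 1: 2}
print(average_score(ratings))  # 3.6
```

The total is 25 + 9 + 2 = 36 points across 10 participants, which gives the same 3.6.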
Part 4 - Terms and Language Use
Within the evaluation tool, ask whether the writing was clear and easy to understand and whether the participant encountered any terms they didn't understand. Web writers and developers often use language, terms and acronyms specific to their company or industry, and it's important to check whether you've used any on your website that will be misunderstood.
Scoring Terms and Language
There is no system for scoring language use, but you'll want to review the feedback to identify where any issues were uncovered and make changes to the site accordingly. You may find that trends develop where several participants gave the same feedback. It would be a good idea to focus on these items first and address the remainder after these high-priority items are resolved.
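If the observers' notes are typed up, spotting those trends can be as simple as counting how many participants flagged each term. The sketch below is one possible approach in Python, not a scoring system; the function name, the two-participant threshold and the sample terms are all hypothetical.

```python
from collections import Counter


def recurring_terms(terms_by_participant, min_participants=2):
    """Return (term, count) pairs flagged by at least min_participants people."""
    counts = Counter()
    for terms in terms_by_participant.values():
        # Use a set so each participant counts once per term, ignoring case.
        counts.update({term.strip().lower() for term in terms})
    repeated = [(term, n) for term, n in counts.items() if n >= min_participants]
    # Most-flagged terms first, then alphabetical for a stable order.
    return sorted(repeated, key=lambda pair: (-pair[1], pair[0]))


# Hypothetical notes keyed by participant number.
notes = {1: ["FSA", "COBRA"], 2: ["fsa"], 3: ["PTO", "FSA"], 4: ["cobra"]}
print(recurring_terms(notes))  # [('fsa', 3), ('cobra', 2)]
```

Terms at the top of the list are the ones to fix first; anything mentioned by only one participant can wait.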
Part 5 - Look & Feel
Close out your evaluation tool by soliciting feedback on the color scheme used, the readability of the text and font sizes, and the consistent use of a metaphor, theme or template through graphics from page to page at both high levels and deeper levels of the site. You could use statements like, "The text and lettering are legible and readable." Other items to test might be: "The colors were appropriate for the content and messages being presented," "Graphics and pictures were not distracting," or "The theme for the site was a good representation for the content."
Scoring Look & Feel
You can use the 1-5 scale (1=strongly disagree, 2=disagree, 3=agree somewhat, 4=agree or 5=strongly agree) and have the participant circle the number that most closely matches their opinion about the statement. Provide room for more information about items that were scored 3 or lower and focus on these items to prioritize which changes to make.
Within each of the five parts, make room on the evaluation tool for the observer to make their notations or attach additional sheets. This is the document that will be used to tally all the results, so give your observers the room they need to express their observations clearly.
Now to the second part in preparing for your Website Usability Test...