
Your guide to the 4 types of usability testing

By Amanda Stockwell | Updated Jun 24, 2024


Usability testing is one of the most commonly used user experience research methods. It’s a powerful tool to evaluate a product or service’s ease of use. 

While every usability test consists of observing users interact with designs, you can vary the context and framing to serve slightly different goals.

In this article, we’re going to focus on the following types of usability tests:

  • The traditional usability assessment

  • Comparative usability testing

  • Explorative usability testing

  • Validation testing/benchmarking

What is usability testing?

Regardless of methodology, all usability tests are a way to assess the functionality of a website, app, or other digital product or service by observing real users as they complete tasks within the UI. 

All usability tests can take place in person or remotely.

  • In-person testing is considered the gold standard, because UX researchers have the opportunity to observe body language and any other nuanced dynamics at play when observing someone face-to-face. However, it’s not always feasible to hold usability tests in person, particularly if your user base spans multiple countries.

  • Remote testing is the more common route to take. Remote research tools have vastly improved over the years, and with modern tooling, remote testing is a valuable method for collecting high-quality data from users no matter where in the world they are. Remote testing is also more efficient and less costly. 

Usability tests are often live-moderated.

That means UX researchers are present in real time with the participant, even if not physically in the same space. Moderated usability testing allows research participants to provide real-time feedback. It also gives moderators the opportunity to dig into user responses and behaviors.

In unmoderated usability tests, UX researchers predefine the tasks and directions, and participants complete them on their own time.

Unmoderated studies are best suited for straightforward tasks on higher-fidelity or live products, and they can be very effective for gathering a large sample.

Usability testing best practices

To ensure your usability tests yield valuable insights and actionable results, always be sure to:

  • Define the objectives: Clearly outline what you aim to learn from the test. Specific, measurable goals will focus your efforts and help you gather relevant data.

  • Conduct pilot testing: A trial run of your usability test will surface any issues with your test design or materials. This helps refine your approach before involving actual participants.

  • Prioritize user comfort: Create a welcoming environment for participants, whether in-person or remote. Explain the process thoroughly and reassure them that you're testing the product, not their abilities.

  • Prioritize findings with the data: After collecting data, focus on the most impactful issues. Prioritize the most common problems that impact key user flows.

  • Encourage feedback: Ask participants to verbalize their thoughts as they complete tasks. This provides valuable context for their actions and helps uncover underlying UX issues.

Now, let’s dive into the four categories of usability tests. 

Traditional usability assessment

A traditional usability test is aimed at assessing the ease of use of an interface or product. 

This could be a low-fidelity sketch or non-interactive design, a prototype, or a live product. You’ll often hear this referred to as the testing stimulus or artifact.

A session consists of a single participant being asked to perform a set of prescribed tasks with the stimulus. A moderator observes their behavior, listens to their feedback, and asks follow-up questions to understand or uncover the reasoning behind an action. 

The tasks are generally meant to represent commonly performed actions with the design or areas where the team has made changes or hypothesized issues. 

Each round of testing should include at least 5 participants from the same user group, so you can uncover most of the issues before you hit diminishing returns.
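To make the diminishing returns concrete, here is a minimal sketch of a commonly cited problem-discovery model, in which each participant has the same independent chance of running into any given issue. The function name and the 31% default are assumptions for illustration, not figures from this article; your own detection rate may be quite different.

```python
def share_of_issues_found(participants: int, p_detect: float = 0.31) -> float:
    """Expected share of issues seen at least once.

    p_detect is an ASSUMED average chance that a single participant
    encounters a given issue; it is illustrative, not measured.
    """
    return 1 - (1 - p_detect) ** participants


# Each added participant contributes less than the one before.
for n in (1, 3, 5, 10, 15):
    print(f"{n:>2} participants: {share_of_issues_found(n):.0%} of issues")
```

Under these assumptions, five participants surface roughly 84% of issues, and doubling the group to ten adds comparatively little.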

Depending on your goals, you might be interested in recording things like the frequency with which participants successfully complete a task, the average time it takes participants to complete the task, the alternative places participants look for an item, or places where participants get stuck. 

You may focus more on efficacy or on efficiency, but again, the primary objective is to determine how well a design works for the prescribed tasks and context, and to uncover areas that might need improvement.
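If you log each task attempt, the completion rate and average time on task mentioned above are simple to compute. The sketch below is one possible way to do it; the `TaskAttempt` record, its fields, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass
class TaskAttempt:
    participant: str
    completed: bool
    seconds: float  # time spent on the task


def summarize(attempts: list[TaskAttempt]) -> dict:
    """Return the completion rate and mean time of successful attempts."""
    successes = [a for a in attempts if a.completed]
    return {
        "completion_rate": len(successes) / len(attempts),
        "mean_seconds_successful": (
            mean([a.seconds for a in successes]) if successes else None
        ),
    }


# Hypothetical results for one task across five participants.
attempts = [
    TaskAttempt("P1", True, 40.0),
    TaskAttempt("P2", True, 55.0),
    TaskAttempt("P3", False, 120.0),
    TaskAttempt("P4", True, 65.0),
    TaskAttempt("P5", True, 80.0),
]
summary = summarize(attempts)
print(summary)
```

Timing only the successful attempts is a design choice here: it keeps abandoned attempts from skewing the average, though you may prefer to report both.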

Comparative usability testing

While a traditional usability test evaluates just one interface, you might want to compare variations of the same design or different design directions to accomplish the same goal. 

Very often, this is done during the earlier stages of product development when the team is still exploring potential options, or when there is a team dispute about direction.

The setup for a comparative test is similar to that of a traditional usability test, in that you determine specific tasks to have users complete and observe them doing so.

The difference here is that you have them complete the same task on more than one design. 

Ideally, you should have 2-3 significantly different solutions. More than that can be difficult for participants to keep track of and accurately assess. 

When conducting comparative usability tests, make sure:

  • Every participant does the same tasks on all designs.

  • Moderators rotate the order of designs to offset potential bias.
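One simple way to rotate the order is to cycle through every possible ordering of the designs as you schedule participants. This is a sketch only; the function name, participant IDs, and design labels are placeholders.

```python
from itertools import cycle, permutations


def assign_orders(participants: list[str], designs: list[str]) -> dict:
    """Give each participant the next ordering of designs, in rotation."""
    orders = cycle(permutations(designs))
    return {p: next(orders) for p in participants}


# Hypothetical study: four participants, two design directions.
orders = assign_orders(["P1", "P2", "P3", "P4"], ["A", "B"])
for participant, order in orders.items():
    print(participant, "sees", " then ".join(order))
```

With 2 designs there are only 2 orderings, so an even number of participants keeps things balanced. With 3 designs there are 6 orderings, so recruiting in multiples of 6 gives each ordering equal exposure.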

Note that you aren’t simply looking to see which design participants prefer.

You may end up hearing preferences, but you should have specific success criteria related to measurable things like task completion, speed to completion, number of errors, perceived usability, or perceived satisfaction. 

If a person says they prefer the look of one design but isn’t able to complete the task successfully, that might be something to explore in follow-up research.

Also, if you need to explore the performance of a variation of a live design, you may want to consider something like A/B testing rather than this sort of comparative test.

You also may not end up with one design that “wins” by outperforming the others across the board. In that case, you will want to look at qualitative feedback, and you may combine the most successful elements of the different designs for another round of testing.

Explorative usability testing

We mentioned that usability tests are centered around participants performing preset tasks. 

But what if you want to know how users act in real life?

You might want to turn to methods like diary studies or contextual inquiries, but you can also modify usability test efforts to be more generative and help you learn about your users’ behavior in the context of a proposed design. 

In this case, usability testing can be like a variation of concept testing, where you’re looking for more general reactions.

To perform this hybrid kind of exploration, you’ll still want to have defined research goals and target areas, but your tasks and success criteria may not be quite as prescribed.

Instead of assessing a particular path, you’ll be looking to observe how a user will accomplish a general goal.

For instance, in a traditional usability test of an ecommerce site, you might have one task designed to evaluate the search feature and another for browsing categories. 

In a more exploratory version of usability testing, you could ask users to find a product that fits their needs, observing their path and probing to understand their choices, then asking them to rate their perceived ease of use. 

This approach helps you evaluate the overall product and gain insights into users’ real behaviors and decision-making, rather than focusing on individual components of the interface.

Validation testing/benchmarking

Validation can be sort of a dirty word in relation to usability tests, because it implies that you already have a perfect solution and you’re performing a usability test to prove that you’re correct.

That said, there might be times when you want to establish a baseline of usability metrics and then compare performance at a larger scale than a single comparative usability study, hoping to show the difference in performance. 

We usually call this benchmarking. This is most often relevant when you have a live version of a design and want to see if a new proposed design will be worth implementing, you want to compare your solution to a competitor or industry standard, or you want to track your product’s progress over time. 

This is a summative evaluation, where the goal is to assess a completed design’s performance rather than inform the design process. 

When you conduct benchmarking, you set a number of metrics that define success in your context. For perceived usability, you may want to use a standardized questionnaire such as the System Usability Scale (SUS) or the Usability Metric for User Experience (UMUX) as a starting point.

Then, collect this information for your live product or the item you want to compare, and collect the same information for the new design. You can then compare the data to help you determine how the two experiences compare.
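As an illustration, here is a sketch of standard SUS scoring applied to a baseline and a new design. SUS has 10 items answered on a 1 to 5 scale: odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response, and the sum is multiplied by 2.5 to give a score from 0 to 100. The response data below is invented purely for the example.

```python
from statistics import mean


def sus_score(responses: list[int]) -> float:
    """Score one participant's 10 SUS answers (each 1-5) on a 0-100 scale."""
    if len(responses) != 10 or any(r not in range(1, 6) for r in responses):
        raise ValueError("SUS needs exactly 10 responses between 1 and 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


# Hypothetical responses: two participants per design.
baseline = [[3] * 10, [4, 2] * 5]
new_design = [[4, 2] * 5, [5, 1] * 5]

baseline_mean = mean(sus_score(r) for r in baseline)
new_mean = mean(sus_score(r) for r in new_design)
print(f"Baseline SUS: {baseline_mean}, new design SUS: {new_mean}")
```

A gap between two averages is only a starting point. With real data, you would want a large enough sample, and ideally a statistical test, before concluding that one design outperforms the other.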


The best kind of usability test for you is really going to come back to your research goals, so be sure to carefully think through what you’re trying to learn and match the method to your objective. 

Regardless of the kind of usability test you run, remember that it’s standard practice to compensate participants for their time and that an incentive strategy is an important component of your planning.

The simplest way to send incentives to research participants just about anywhere in the world is with Tremendous. Sign up now and send your first incentive in minutes, or chat with our sales team.


What is the difference between user testing and usability testing?

User testing encompasses various methods of evaluating a product with users, including interviews and surveys. Usability testing specifically focuses on observing users as they interact with a product to complete predefined tasks, assessing its ease of use and functionality.

How do you implement usability testing?

To implement usability testing, start by defining clear objectives and creating a test plan. Recruit appropriate participants, prepare your testing materials, and conduct the tests either in-person or remotely. Analyze the results, prioritize findings, and use the insights to improve your product's design and functionality.

When should you do usability testing?

Ideally, usability testing should be conducted throughout the product or service’s lifecycle. Early-stage testing can validate concepts, while later tests with live products can help refine specific features. Regular usability testing can help you catch issues early and ensure continuous improvement.

How long should a usability test last?

A usability test session can range anywhere from 15 minutes to an hour. Unmoderated sessions are typically shorter than moderated sessions. Your usability test should allow enough time to complete tasks and gather feedback. The exact length may vary depending on the complexity of your product and the number of tasks assigned.

How to do quick usability testing?

For quick usability testing, focus on a few key tasks that are fairly simple to complete. You’ll also want to keep the session short so participants do not get fatigued or overwhelmed. If your test is unmoderated, ensure that all tasks are explained clearly.

Who conducts usability testing?

Usability testing is practiced across various sectors. Tech companies, software developers, financial institutions, and healthcare providers all rely on usability testing to improve the functionality of their digital products. Of course, usability testing also extends to physical products like cars and household appliances to enhance the user experience.   

How much should I budget for a usability test?

The budget for a usability test can vary widely depending on factors like the number of participants, the target audience, and the testing method (in-person tends to cost more than remote). You may be able to keep costs down by offering quality participant incentives: a $60 reward with 2,000 redemption options will be more appealing than a $60 gift card that can only be used with a limited number of retailers.

Published June 24, 2024

Updated June 24, 2024
