VWO Glossary (https://vwo.com/glossary/)

Proximity Principle
https://vwo.com/glossary/proximity-principle/ | Tue, 21 Jan 2025

What is the proximity principle?

The proximity principle suggests that elements placed close together are perceived as part of the same group, sharing similar traits and attributes. This concept is part of Gestalt theory, which aids in optimizing digital experiences, creating architectural designs, building marketing strategies, selling products, and more.

For example, UX designers often place icons for various third-party payment gateways next to each other on an eCommerce checkout page. This arrangement allows visitors to quickly recognize familiar options. 

Due to the principle of proximity, users also intuitively assume that unfamiliar icons represent additional payment gateways they can explore. This enhances clarity and builds trust during the checkout process.

Payment gateway page.
Image source: WooCommerce

Applications of the proximity principle

Here are some applications of the proximity principle:

1. Architectural design

The proximity principle is essential in designing physical structures, infrastructures, and buildings that are easy to navigate. Grouping related elements together ensures smooth flow and convenience for visitors.

For instance, in a shopping mall, organizing different floors based on store types, such as clothing, entertainment, and dining, makes it easier for visitors to navigate through the large space without confusion.

2. Product placement 

Many retailers use the proximity principle to organize and display products strategically, encouraging customers to buy more. For example, a store may create a dedicated “sugar-free” section, grouping chocolates, desserts, bread, and other items. This setup subtly nudges health-conscious shoppers to explore and purchase multiple products from the same category.

3. UX design 

The principle of proximity plays a key role in UX design. It helps structure menus, highlight sections, improve content readability, and make key functions easily identifiable. 

A great example is a website’s header menu. Here, navigation links directing users to different pages are grouped, while whitespace distinctly separates the search bar. This visual distinction helps users instantly differentiate between navigation menus and the search function, enhancing usability and user experience.

Semrush blog menu
Image source: Semrush

4. Marketing and branding 

Marketers use the proximity principle to align products with the ideas and beliefs of their target audience. A great example is Red Bull. The brand consistently places its product alongside extreme sports and adventure activities. Over time, this strategic association has ingrained Red Bull as a symbol of energy, thrill, and adrenaline-fueled experiences.

Red Bull Poster Ad
Image source: Red Bull

Proximity principle and conversion rate optimization (CRO)

The proximity principle can be a powerful tool in CRO efforts. By grouping related fields in lead generation forms, users can fill them out more easily. Similarly, organizing blog posts by topic on the homepage or grouping payment options during checkout can enhance user experience. These changes can then be tested to measure their impact on conversion rates.

For example, Serpent Forge, an independent men’s jewelry brand, ran an A/B test using VWO Testing on their product page. In the original design, the PayPal badge was placed directly above the add-to-cart button. The variation, however, displayed multiple payment option badges in the same position, grouped together.

The results were impressive, as the variation increased add-to-cart clicks by 82%.

Serpent Forge VWO Case Study

This demonstrates how a simple change based on a psychological principle, like the proximity principle, can create a significant positive impact.

Final thoughts 

In today’s digital world, improving user experience is more important than ever. A proven way to do this is by applying psychological principles like the proximity principle. Concepts like these can improve user interactions and conversion rates.

A platform like VWO makes it easy to test such concepts on your website or app with its intuitive features. Request a free demo today and witness the impact for yourself.

Baseline Conversion
https://vwo.com/glossary/baseline-conversion/ | Tue, 05 Nov 2024

What is baseline conversion?

The baseline conversion represents the conversion rate of a website or page that is not the result of recent marketing efforts, promotions, or optimization campaigns. For instance, if an ed-tech website has a 2% course enrollment rate from organic channels, that 2% becomes its baseline conversion. 

Why is monitoring baseline conversion important?

Monitoring baseline conversion is crucial for several reasons:

  • Benchmark for improvement: It provides a clear view of your current digital performance and acts as a standard to measure the success of future optimization campaigns.
  • Strategic planning and prioritization: Regularly checking baseline conversion rates helps you plan and prioritize upcoming conversion optimization efforts.
  • Precision in experimentation: Baseline conversion is key in A/B testing, as it helps determine the sample size needed. 
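The last point can be made concrete with a standard two-proportion power calculation, where the baseline conversion rate determines how many visitors each variant needs. This is a generic illustrative sketch, not any specific tool's formula:

```python
from statistics import NormalDist

def sample_size_per_variant(baseline, mde_relative, alpha=0.05, power=0.8):
    """Approximate visitors needed per variant for a two-sided
    two-proportion test. `baseline` is the control conversion rate,
    `mde_relative` the minimum relative uplift you want to detect."""
    p1 = baseline
    p2 = baseline * (1 + mde_relative)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # ~1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # ~0.84 for 80% power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance / (p2 - p1) ** 2
    return int(n) + 1  # round up to whole visitors

# A low baseline needs far more visitors to detect the same relative lift:
print(sample_size_per_variant(0.02, 0.10))  # 2% baseline, 10% relative MDE
print(sample_size_per_variant(0.20, 0.10))  # 20% baseline, 10% relative MDE
```

The lower the baseline, the larger the sample required, which is why an accurate baseline measurement is the first input to any test plan.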

How do you find out your baseline conversion?

Determining the baseline conversion is quite simple. Here’s how you can do it:

First, select the metric you want to analyze for a baseline conversion. For example, if you’re interested in the add-to-cart rate, use your web or data analytics tool to calculate the percentage of visitors who click the add-to-cart button out of all visitors who view that page. This percentage represents your baseline conversion. 

Most analytics platforms allow you to track this rate over time, letting you observe monthly comparisons and trend patterns. 
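In code, the calculation described above is simply a ratio of event counts. A minimal sketch with hypothetical numbers:

```python
def baseline_conversion(conversions, total_visitors):
    """Baseline conversion rate as a percentage of all visitors
    who completed the target action (e.g. add-to-cart)."""
    if total_visitors <= 0:
        raise ValueError("total_visitors must be positive")
    return 100 * conversions / total_visitors

# Hypothetical month of analytics data: 1,840 add-to-cart clicks
# out of 46,000 product-page visitors.
rate = baseline_conversion(1840, 46000)
print(f"Baseline add-to-cart rate: {rate:.1f}%")  # 4.0%
```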

Another approach is to do A/A testing using an A/B testing tool. In this test, traffic is split evenly, with each segment showing identical webpage versions. While A/A testing primarily verifies the statistical reliability of the testing tool, it can also provide a baseline conversion since it measures conversions without any actual changes to the page.

Tips to improve the baseline conversion

Once you’ve calculated your baseline conversion, the next step is comparing it to industry benchmarks. If your rate falls short, here are some strategies to boost it:

a. Experimentation

Run A/B tests to see what works best. Split your traffic to show one group the original version and another group a modified experience. 

A/B testing can be done on any platform—web, mobile, or app—and allows you to measure how changes impact conversions. If you see a significant uplift, you can confidently deploy the improved version for all users.
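Whether an uplift is significant can be checked with a classic two-proportion z-test. This is a generic frequentist sketch with hypothetical numbers; commercial A/B testing platforms may use different (for example, Bayesian) statistics:

```python
from statistics import NormalDist

def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates.
    conv_* are conversion counts, n_* are visitor counts."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    p_pool = (conv_a + conv_b) / (n_a + n_b)
    se = (p_pool * (1 - p_pool) * (1 / n_a + 1 / n_b)) ** 0.5
    z = (p_b - p_a) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))

# Hypothetical test: 400/10,000 conversions (control) vs. 480/10,000 (variation).
p = two_proportion_p_value(400, 10_000, 480, 10_000)
print(f"p-value: {p:.4f}")
```

A p-value below the conventional 0.05 threshold would support deploying the variation.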

b. Personalization

Personalization is another effective method. Tailoring your copy, web elements, widgets, and sections based on user segments—such as location, age, or buying habits—enables you to create a more contextual and relevant experience, which can lead to higher conversions.

c. Qualitative research

While quantitative data shows that conversion rates are low, qualitative research explains why they are low. Tools like heatmaps, session recordings, form analytics, and on-page surveys reveal visitor pain points and frustrations. This insight helps generate targeted ideas for optimizing the user experience and boosting conversions.

By combining these strategies, you can take meaningful steps to improve your baseline conversion effectively.

Improve baseline conversion with VWO

VWO, the digital experience optimization platform, offers multiple tools to help you improve baseline conversion. Here are some powerful capabilities in VWO you can leverage:

a. VWO Insights

With VWO Insights, you can pinpoint areas causing low conversions using heatmaps, session recordings, on-page surveys, and form analytics. These insights provide a clear view of visitor frustrations and help identify solutions that drive better conversions.

b. VWO Testing

VWO Testing lets you run A/B tests across web, mobile, and app platforms. This popular experimentation feature allows you to test and validate optimization ideas, enabling safe and effective improvements to your baseline conversion.

c. VWO Web Rollouts

VWO Web Rollouts allow for cosmetic changes that enhance baseline performance. An intuitive drag-and-drop editor makes it easy to deploy these tweaks.

d. VWO Personalize

VWO Personalize lets you tailor experiences for specific visitor segments. This feature allows you to use a drag-and-drop editor to deliver personalized nudges that encourage more visitors to convert.

e. VWO Feature Management

If you’re considering more substantial changes, like a new search algorithm, VWO Feature Management helps you safely test complex updates that could significantly uplift baseline conversions.

Ready to elevate your baseline conversion? Start a 30-day all-inclusive free trial today and explore VWO’s capabilities firsthand!

Conclusion 

Improving your baseline conversion rate is essential for a successful digital marketing strategy. Start by calculating this baseline through your web analytics tool. Once identified, you can enhance it through strategies like A/B testing and personalization. Tools like VWO simplify this process, enabling you to manage all aspects of conversion optimization from a single dashboard.

North Star Metric (NSM)
https://vwo.com/glossary/north-star-metric-nsm/ | Mon, 28 Oct 2024

A north star metric (NSM) is a single, critical measurement that reflects the long-term value a company provides to its customers. It effectively represents your product’s success and captures the essence of the core value your product delivers. 

By focusing on this singular metric, teams can align their strategies and actions, fostering a culture of accountability and driving collaboration across departments. 

This alignment ensures that every team member, from marketing to product development, is working towards the same goal. This leads to enhanced customer value, sustained growth, and a competitive advantage in the marketplace.

North Star Framework
Image source: LinkedIn

Importance of NSM

NSM provides a vital strategic focus for businesses and helps steer decisions at every level. Here are the key reasons why having a clear NSM is so critical to your business:

1. Focus

A defined NSM helps your teams prioritize their efforts and allocate resources where they matter the most. By concentrating on a single, critical measure, teams avoid distractions and focus on delivering the most impact.

For example, Netflix identifies its NSM as time spent watching content per user. With this focus, Netflix prioritizes initiatives like improving content recommendations and optimizing user experience, rather than getting sidetracked by other secondary metrics like app downloads.

Netflix North Star
Image source: Medium

2. Alignment

The NSM ensures that all departments and teams are working towards a unified goal. This shared objective fosters collaboration and prevents the siloed efforts that can detract from overall company growth.

For example, a SaaS company might define its NSM as the number of active monthly users. All departments, like product, marketing, and customer success, should work toward increasing this metric. Product teams would focus on user-friendly features, marketing teams would create campaigns to onboard more users, and customer success would ensure users remain engaged.

3. Simplicity

Businesses often drown in a sea of metrics, but an NSM simplifies your approach. By boiling down success to a single metric, it becomes easier for everyone, from executives to entry-level employees, to understand what’s driving the company’s growth. 

4. Long-term vision

While many metrics focus on short-term performance, the north star metric encourages a more sustainable and forward-looking approach. It aligns the company’s efforts with long-term growth by continually improving the core value your product offers to customers.

For example, Duolingo’s NSM is the number of badges acquired by language learners, encouraging the company to focus on user retention and learning outcomes rather than just sign-ups.

Illustration explaining the benefits of NSM
Image source: Geeknack

A comparative overview: NSM vs KPIs vs OMTM

Businesses often have multiple metrics to track performance, but it’s important to distinguish between different types of measures, each serving unique roles:

Definition
  • NSM: The single most important metric that captures your product’s core value.
  • KPIs: A set of quantifiable measures used to track specific business goals or performance.
  • OMTM: A metric that the company focuses on for a specific period, often based on immediate priorities.

Scope
  • NSM: Broad and long-term, covering the entire business and its growth strategy.
  • KPIs: Department-specific or project-specific, often focusing on specific areas of performance.
  • OMTM: Narrow focus on a single, high-priority metric at a time, which may change based on shifting business needs.

Purpose
  • NSM: Aligns the whole organization towards delivering long-term value to customers.
  • KPIs: Monitors various aspects of business performance, helping track progress towards goals.
  • OMTM: Highlights the most important metric to focus on right now for tactical improvements.

Focus
  • NSM: Value delivered to customers that drives overall business growth.
  • KPIs: Tracking and improving performance in specific areas such as sales, marketing, customer support, etc.
  • OMTM: A temporary metric that may relate to a specific initiative, experiment, or stage of growth.

Organizational Impact
  • NSM: Aligns the entire organization around a single goal.
  • KPIs: Helps teams and departments measure and manage their performance.
  • OMTM: Focuses the entire company on solving one critical issue or seizing one key opportunity.

Examples (Slack)
  • NSM: Number of daily messages exchanged between team members.
  • KPIs: Daily Active Users (DAUs), average time spent in channels, channel engagement rate, user retention rate.
  • OMTM: Weekly active users who send at least five messages a day.

For effective integration, NSM, KPIs, and OMTM must work in harmony. The NSM sets the overarching goal, giving the organization a clear direction. 

KPIs then break this down into actionable, department-specific metrics that contribute to the NSM, ensuring every team’s performance is aligned with the long-term objective. 

The OMTM provides immediate focus on short-term priorities, acting as a tactical lever to accelerate progress on key KPIs, which, in turn, push the NSM forward.

Common NSMs across industries

Below are some common NSMs across different industries:

1. eCommerce: Gross Merchandise Value (GMV)

Why it works: Reflects the total value of goods sold and the platform’s ability to generate revenue from transactions.

2. SaaS: Monthly Recurring Revenue (MRR)

Why it works: Shows the consistency and growth of subscription-based revenue, a crucial indicator of business health.

3. Media: Daily Active Users (DAUs) 

Why it works: Measures engagement, which drives ad revenue and content consumption.

For instance, Spotify’s north star metric is time spent listening per user, a critical indicator of engagement. 

Spotify's NSM
Image source: Grow with Ward

4. Fintech: Total Payment Volume (TPV)

Why it works: Indicates the overall volume of financial transactions handled, a direct measure of growth and customer trust.

5. Healthcare Tech: Number of appointments booked

Why it works: Tracks how well the platform connects users with healthcare providers, a core value proposition.

6. Education Tech (EdTech): Course completion rate

Why it works: Reflects the effectiveness of the platform in engaging users and delivering educational outcomes.

7. Travel: Bookings made or trips taken

Why it works: Captures the platform’s core activity of travel bookings by user, indicating customer value and business growth.

NSM Examples
Image source: LinkedIn

How NSM guides A/B testing and experimentation

The NSM plays a crucial role by providing direction and focus. By linking experimentation efforts to the north star metric, businesses can prioritize tests that have the most potential to move this metric, driving meaningful growth and customer value. 

1. Prioritizing high-impact experiments

When A/B tests are designed, there are often dozens or even hundreds of potential ideas to explore. By ensuring that experiments align with the NSM, businesses can focus on initiatives that directly contribute to their core value proposition.

For example, if a company’s NSM is monthly active users (MAUs), experiments that improve user engagement, reduce churn, or enhance the onboarding experience would take precedence over those that don’t have a direct impact on active user growth.

2. Maintaining focus across experimentation efforts

Experimentation can often become fragmented, with teams testing various elements of the product or user experience. When all experiments are aligned with the NSM, there’s a unified direction across departments. 

This avoids scattered efforts and ensures that every test has a clear objective: to improve the metric that drives the core value of the business.

For example, if the NSM for a SaaS company is product usage depth, resources can be focused on running A/B tests related to feature improvements, user flows, or interface design to drive this metric.

3. Measuring the long-term impact of experiments

While some A/B tests deliver immediate results in terms of short-term metrics, the north star metric ensures that experiments are evaluated based on their long-term impact. 

Instead of focusing solely on quick wins, businesses can use the NSM as a benchmark to measure whether experiments lead to sustainable growth and improvements in customer value over time.

Consider an eCommerce platform that might run an A/B test to improve checkout completion rates. While a test might boost short-term conversions, it’s important to also evaluate whether this change positively impacts the NSM, such as customer lifetime value (CLV), in the long run.

Explore A/B testing with a free 30-day trial!

Actionable next steps for your business

1. Assess your current metrics and their alignment with your core value proposition

Start by reviewing the metrics you’re currently tracking. Ensure they reflect the true value your product delivers to users and how closely they align with long-term business goals.

2. Engage stakeholders in discussions about potential north star metrics

Involve key stakeholders from all relevant departments (marketing, product, engineering, and customer success) in a collaborative discussion about which metrics best capture your product’s value.

3. Choose a metric and implement tracking systems

Select the most relevant and impactful metric. Set up the necessary tracking systems and ensure your analytics tools can measure and report on this metric effectively.

4. Align teams and processes around your new NSM

Communicate the importance of the NSM across teams. Ensure everyone understands how their work contributes to improving the NSM and ultimately, the company’s growth.

5. Regularly review and refine your approach

Continuously monitor how well the NSM reflects the company’s goals and customer value. Be prepared to adjust or evolve the NSM if business needs or customer behavior changes.

6. Ensure continuous feedback loops for improvement

Create feedback mechanisms within teams to identify opportunities for improvement. Use these insights to make data-driven decisions that enhance the customer experience and drive the NSM forward.

NSM Framework
Image source: LinkedIn

Ready to find your north star metric?

Now that you have a roadmap to choosing, implementing, and refining your north star metric, it’s time to take action. 

Start by assessing your product’s value proposition, involve your stakeholders, and set your business on a course to drive meaningful growth through your NSM. Make the commitment to focus on what truly matters and transform your approach to long-term success.

ROPE
https://vwo.com/glossary/rope/ | Thu, 29 Aug 2024

In experimentation, ROPE enables the early discontinuation of variations that are statistically unlikely to outperform the baseline, helping you avoid wasting time and resources on ineffective changes. If you want to test whether a variation is equivalent to or better (but not worse) than the baseline, ROPE also allows for early deployment, speeding up the optimization process.

The Region of Practical Equivalence (ROPE)

The Region of Practical Equivalence refers to a statistical concept that defines an area within which any observed differences are practically considered unimportant. Said another way, it’s like a buffer zone around a baseline value inside which changes are considered too small to matter in a real-world context, even when they attain statistical significance.

In the context of experimentation, ROPE helps determine when a difference between the control and the variation is so minor that it can be considered practically equivalent, even if it is statistically significant. This approach optimizes the testing process by quickly implementing effective changes and stopping variations that barely outperform the baseline. 

Comic Strip on ROPE definition
What is ROPE?

Understanding ROPE with an example

Imagine you own an eCommerce platform and your present conversion rate is 40%. You determine that for all practical purposes, any conversion rate within the range of 38% to 42% is considered equivalent to 40% for your business. (We have explained how to determine this in the following sections.)

This range—from 38% to 42%—is your ROPE.

Here’s how you do it:

Find the ends: State the lower and upper limits within which your ROPE lies. In this case, 38% and 42%.

Calculate the difference: Find the difference between these limits and the baseline. Here it is ±2%.

Normalize the difference: Divide the difference by the baseline conversion rate (40%):

±2% / 40% = ±0.05, i.e., ±5%

So a ±5% relative range around your baseline is your ROPE.
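The steps above reduce to simple arithmetic. A small sketch, for illustration only (not VWO's implementation):

```python
def relative_rope(baseline, lower, upper):
    """Express a practical-equivalence band as a relative range
    around the baseline, following the steps above."""
    diff_low = (lower - baseline) / baseline    # e.g. -0.05
    diff_high = (upper - baseline) / baseline   # e.g. +0.05
    return diff_low, diff_high

def within_rope(observed, lower, upper):
    """True if an observed rate is practically equivalent to the baseline."""
    return lower <= observed <= upper

lo, hi = relative_rope(0.40, 0.38, 0.42)
print(f"ROPE: {lo:+.0%} to {hi:+.0%}")   # -5% to +5%
print(within_rope(0.41, 0.38, 0.42))     # inside the band: practically equivalent
print(within_rope(0.45, 0.38, 0.42))     # outside the band: practically different
```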

Significance of ROPE

ROPE saves visitors from insignificant changes by stopping such tests early. This prevents valuable visitors from being exposed to changes that aren’t likely to yield meaningful improvements. 

As a tradeoff, you invest slightly more visitors in promising variations so that you can deploy them with greater accuracy. Overall, since winning ideas are rare and most ideas are insignificant, you save visitors significantly on average. 

The wider the ROPE region, the more visitors you save. A larger ROPE means more accurate winners (in exchange for extra visitors) and earlier stopping of variations that lack potential.

Minimize false positives

Random variations in your data may sometimes look like a trend of important changes. ROPE steps in to protect against these scenarios, making sure that actions will be taken only on meaningful improvements.

Suppose you’re running an ad campaign, and there is a spike in website traffic while the campaign is running. If proper statistical bounds are not established, this spike may be mistaken for a successful campaign when the real cause could be an external factor such as a holiday or a viral post on social media. ROPE differentiates real, actionable improvements from random fluctuations.

Factors to consider while setting ROPE

Defaults

A reasonable default to start with for ROPE is a conservative value, say ±1%, especially if you are new to the idea. This will reduce your false positives and still give you the benefits of early closing.

Later, as you start using ROPE, you can increase the ROPE value for faster closing of tests. However, doing so means you might miss out on detecting small but potentially valuable improvements within the ROPE region. 

Essentially, the trade-off is between more rapid test closures and the risk of overlooking minor improvements. 

Business context

Different businesses may have different thresholds for what they consider to be a meaningful change. 

For example, some businesses may require very small improvements to be considered meaningful, while others may only consider larger improvements to be significant. Your understanding of what qualifies as a meaningful change in your particular context will determine the appropriate value for your ROPE.

Let’s take an example of an eCommerce retail company:

A company selling low-margin products (like groceries) might consider even a small percentage increase in conversion rate as meaningful. A 1% lift in conversion rate could translate to a significant increase in overall revenue due to high sales volume. The ROPE would be set accordingly, perhaps from -0.5% to +0.5%. Any improvement outside this range would be considered significant. In this case, small changes can be meaningful due to high transaction volume. Hence, ROPE is narrower to detect these smaller, yet significant, improvements.

But if the same store sells luxury goods, it may require a larger percentage increase in average order value to be considered meaningful. Given the higher profit margins, a smaller absolute increase in revenue might still represent a substantial improvement. A luxury goods retailer might require a 5% increase in average order value to be meaningful. Hence, ROPE would be wider, maybe from -2% to +2%, to accommodate the fact that smaller percentage changes have a lesser impact in absolute terms.

Iterative refinement

When conducting continuous testing, you may start with a much wider ROPE and then refine it based on the outcomes of the experiments. This adaptive mechanism will make your ROPE closer to the real business effect.

For example, for unoptimized and new webpages, you can aim for larger ROPE values since they should target bigger uplifts.

Optimized webpages with many visitors can benefit from small improvements as well and hence should keep a smaller ROPE.

High-traffic pages of your website should have lower ROPE values since smaller uplifts can be valuable for them. Low-traffic pages should have higher ROPE values so that early stopping can help save visitors and time.

ROPE in VWO

The good news is that ROPE has been integrated into VWO’s Statistical Engine. ROPE enables quicker decision-making since the stats engine can now recommend disabling a variation when it is unlikely to outperform the baseline. This means you will enjoy all the benefits discussed in this article and can rely on smarter, more accurate results for every test you run. Take a free trial with VWO now.

CUPED
https://vwo.com/glossary/cuped/ | Thu, 08 Aug 2024

Controlled experiment using pre-experiment data (CUPED) is a variance reduction technique used in A/B testing.

Developed by Microsoft’s data science team in the early 2010s, CUPED was created to address the need for more efficient A/B testing on platforms like Bing and Microsoft Office. Since its inception, the technique has become a standard practice within the A/B testing and optimization communities due to its ability to reduce variance.

How does CUPED work?

Let’s try to understand how CUPED works through an example. Suppose you run an online store and want to test a new checkout process. You set up an A/B test where half of your visitors see the new checkout process (Group B) and the other half see the current one (Group A). The goal is to determine if the new checkout process leads to more completed purchases.

Before starting the test, you already have extensive data about your visitors’ behavior. For instance, you know how many purchases each visitor made in the month prior to the test. Here’s where CUPED comes into play. For each visitor in both Group A and Group B, CUPED gathers data on their purchase behavior from the previous month. As the test runs, it counts the purchases each group makes during the test period. However, instead of just comparing the raw numbers, CUPED adjusts these figures using each visitor’s pre-experiment behavior, correcting for differences that existed between the groups before the test began.

Without CUPED, if Group A (current checkout) averages 10 purchases and Group B (new checkout) averages 12 purchases after the test, you might conclude that the new checkout is slightly better. But with CUPED, you adjust these numbers using the pre-experiment data. Perhaps Group A’s visitors made an average of 4 purchases, and Group B’s visitors made an average of 2 purchases before the test. After adjusting for this pre-experiment data, you might find that Group B’s improvement is even more significant.

Thus, CUPED helps you make your A/B tests more accurate and reliable by factoring in what you already know about your visitors. 

Benefits of CUPED

Here are the benefits of using CUPED to make your A/B tests more accurate and reliable:

  • CUPED leverages pre-experiment data to control for natural variations in your visitors’ behavior. This means that if there’s a genuine difference between your test groups, CUPED makes it easier to spot. For instance, if your new checkout process is indeed better, CUPED will help you see that improvement more clearly.
  • Reaching statistical significance requires a large number of visitors. However, with CUPED, you can achieve meaningful conclusions with fewer visitors because it reduces the “noise” from natural variations. This makes your tests more efficient and less resource-intensive.

Limitations of using CUPED

While CUPED offers significant benefits, it’s important to understand its limitations. Here are two key points to keep in mind:

  • CUPED relies on pre-experiment data to reduce variance and improve the accuracy of your test results. This means it can only be used with visitors who have been to your site before. If you have a lot of new visitors, CUPED won’t be effective because there’s no past data to leverage.
  • It is not effective for binary metrics, like conversion rates, because it relies on continuous data (such as the number of purchases) to adjust for pre-experiment differences. This makes it less suitable for scenarios where you’re measuring simple yes/no outcomes.

Conclusion

In conclusion, CUPED is a powerful technique that leverages pre-experiment data to enhance the accuracy and efficiency of A/B testing. It helps control variance and enables you to draw meaningful conclusions with fewer participants. However, keep in mind that CUPED is only effective with returning visitors and may not be suitable for binary metrics.

The post CUPED appeared first on VWO Glossary.

]]>
Variance https://vwo.com/glossary/variance/ Wed, 03 Jul 2024 08:58:59 +0000 https://vwo.com/glossary/?p=3050 Variance measures the spread of a dataset by quantifying how much a set of values differs from the mean.

The post Variance appeared first on VWO Glossary.

]]>
What is variance? 

Variance measures the spread of a dataset by quantifying how much a set of values differs from the mean. A higher variance indicates a more spread-out dataset, while a lower variance indicates values are closer to the mean. 

Variance helps in understanding the consistency of the dataset and its spread, which helps in making better decisions. 

Here are simple images that explain dataset variance:

The left graph shows a high-variance dataset with more erratic values. The right graph illustrates a low-variance data set where the values are more consistent.

High Variance Data vs Low Variance Data

How do we calculate variance?

To calculate the variance of a sample, you start by subtracting the mean from each value and squaring the result. This is repeated for all values in the data set. Next, you sum all these squared differences. Finally, you divide this sum by the number of values in the data set minus one. The result is the variance of the sample; taking its square root gives the standard deviation.

Variance Formula

Where:

xi = Each value in the data set

x  = Mean of all values in the data set

N = Number of values in the data set

The variance of a population is calculated slightly differently: the denominator changes from N – 1 to N.
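The sample and population formulas translate directly to Python. This is a minimal sketch for illustration, written out step by step to mirror the formula above:

```python
def sample_variance(values):
    """Sum of squared deviations from the mean, divided by n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / (n - 1)

def population_variance(values):
    """Same numerator, but divided by n instead of n - 1."""
    n = len(values)
    mean = sum(values) / n
    return sum((x - mean) ** 2 for x in values) / n

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(sample_variance(data), population_variance(data))
```

The sample estimate is always slightly larger than the population estimate for the same data, since it divides by a smaller denominator.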

Variance can be calculated in software like Google Sheets using various functions. Here’s a quick guide to the different variance functions:

  1. VAR.P: Calculates the variance for an entire population, using only numerical data.
  2. VAR.S: Calculates the variance for a sample, using only numerical data.
  3. VARA: Calculates the variance for a sample, including numerical values, text strings (treated as 0), and logical values (TRUE as 1, FALSE as 0).
  4. VARPA: Calculates the variance for an entire population, including numerical values, text strings (treated as 0), and logical values (TRUE as 1, FALSE as 0).

Use these functions depending on whether you are working with a sample or the entire population and whether your dataset includes mixed data types.
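Outside of spreadsheets, Python's standard `statistics` module offers rough analogues of VAR.S and VAR.P (there is no direct stdlib counterpart to VARA/VARPA's text-and-boolean coercion):

```python
import statistics

data = [2, 4, 4, 4, 5, 5, 7, 9]
print(statistics.variance(data))   # sample variance, like VAR.S
print(statistics.pvariance(data))  # population variance, like VAR.P
```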

Variance in A/B testing

When conducting A/B testing, we compare the average of a metric (such as spending or conversion) between two distinct groups. We also use the standard error, which indicates how much the average conversion rate might vary if the experiment is repeated multiple times. It is calculated as the square root of the variance, so a higher variance results in a higher standard error. A higher standard error means more uncertainty in our estimate of the true conversion rate.

However, you can reduce the standard error using the following methods.

  1. One of the simplest ways to reduce standard error in A/B testing is by increasing the sample size, as a larger sample size tends to produce a distribution that closely resembles a normal distribution. However, practical constraints often prevent us from increasing the sample size.
  2. Splitting traffic evenly between the control and variation groups (50-50) in an A/B test can reduce the impact of variance in the dataset and help achieve statistical significance more quickly. With an unequal split, the smaller group contributes a larger standard error, increasing the overall uncertainty of the comparison.
  3. Normalizing outliers is another effective method to reduce standard error. For instance, when conducting an A/B test on a segment filtered by cost per head, you can improve the accuracy of your results by excluding customer data with exceptionally high or low costs per head.
  4. CUPED, or Controlled-experiment Using Pre-Existing Data, is another technique in A/B testing that uses data from before the experiment to account for natural variations in user behavior. This reduces the standard error in your results. By considering how users behave on your site beforehand (such as their usual spending habits), CUPED helps smooth out natural fluctuations in behavior, making it easier to see the true impact of your new layout.
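The relationship between variance, sample size, and standard error from point 1 above can be seen in a few lines of Python; the variance value here is arbitrary, chosen only for illustration:

```python
import math

def standard_error(variance, n):
    """Standard error of the mean: the square root of variance / n."""
    return math.sqrt(variance / n)

# Same variance, growing sample: quadrupling n halves the standard error.
for n in (100, 400, 1600):
    print(n, standard_error(25.0, n))
```

Because the sample size sits under a square root, halving the standard error requires four times the traffic, which is why the other variance-reduction methods above matter in practice.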

Conclusion

Variance measures a dataset’s spread and influences the accuracy of statistical analysis. In A/B testing, reducing standard error by increasing sample size, splitting traffic evenly, and normalizing outliers can lead to more reliable and meaningful results. By grasping these concepts and applying appropriate techniques, we can enhance the accuracy and reliability of our data analysis and decision-making.

The post Variance appeared first on VWO Glossary.

]]>
False Positive Rate https://vwo.com/glossary/false-positive-rate/ Wed, 27 Mar 2024 09:31:03 +0000 https://vwo.com/glossary/?p=2268 The false positive rate (FPR) is a critical metric that reveals how frequently a phenomenon is mistakenly identified as statistically significant when it's not. This measure is vital as it indicates the reliability of a test or outcome.

The post False Positive Rate appeared first on VWO Glossary.

]]>
A false positive happens when a test or experiment wrongly shows that a variant is a winner or a loser when actually there is no impact on the target metric. It’s like getting a wrong answer on a test, making you think you’re right when you’re actually wrong. In testing or experiments, false positives can lead to mistaken conclusions and decisions.

Please note: False positives show up as Type-1 errors in A/B testing.

What is a false positive rate?

The false positive rate (FPR) is a critical metric that reveals how frequently a phenomenon is mistakenly identified as statistically significant when it’s not. This measure is vital as it indicates the reliability of a test or outcome. A lower false positive rate signifies higher accuracy and trustworthiness of the test.

Formula for FPR

Where:

  • FP represents the number of false positives.
  • TN represents the number of true negatives: tests with no real improvement that were correctly not declared winners.

Example of false positive rates 

Imagine a newly developed diagnostic test aimed at detecting a rare genetic disorder. To gauge its accuracy, 1000 seemingly healthy individuals from diverse demographics and geographical areas undergo the test. Upon analysis, it’s discovered that out of these 1000 individuals, the test incorrectly identifies 20 as having the genetic disorder. This results in a false positive rate of 2%. Despite being healthy, these individuals are wrongly flagged by the test. Such simulated assessments offer vital insights into the efficacy of medical tests, aiding healthcare professionals in assessing their real-world reliability and effectiveness.
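The formula can be expressed as a one-line Python function; the numbers below reuse the hypothetical diagnostic-test example:

```python
def false_positive_rate(fp, tn):
    """FPR = FP / (FP + TN): the share of true negatives wrongly flagged."""
    return fp / (fp + tn)

# 20 healthy people flagged out of 1000 tested (980 true negatives).
print(false_positive_rate(20, 980))  # 0.02, i.e. a 2% false positive rate
```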

Why is evaluating the false positive rate important?

The accuracy of the statistical model is heavily reliant on the false positive rate, making it imperative to maintain a careful balance. 

In medical diagnostics, a high false positive rate can erroneously categorize healthy individuals as having a disease. 

Within finance, false positives manifest in fraud detection systems and credit scoring models. Elevated false positive rates can result in legitimate transactions being flagged as fraudulent.

Cybersecurity tools are susceptible to false positives, which can inundate security analysts with alerts, leading to alert fatigue. Excessive false alerts may cause analysts to overlook genuine threats.

False positives within quality control processes may lead to the rejection of acceptable products, escalating manufacturing costs and diminishing efficiency.

The ramifications of false positives vary across these domains, contingent upon the specific context and repercussions of inaccurate outcomes. Broadly, a heightened false positive rate can squander resources, impair efficiency, undermine trust in systems or models, and potentially yield adverse consequences for individuals or organizations.

False positive rate in A/B testing

The false positive rate poses a significant risk in A/B testing scenarios, where businesses compare different website or app versions to determine which performs better. When the false positive rate is high, businesses risk declaring winners that have no real impact and rolling out changes that don’t actually move the metric.

To bolster the reliability and effectiveness of A/B testing software while minimizing false positives, it’s prudent to lower the false positive rate threshold. Typically set at 5% in A/B testing, reducing it to 1% can enhance test accuracy and reduce false positives. Platforms like VWO use the Probability to Beat the Baseline (PTBB) to control the false positive rate: if the PTBB threshold is 99%, the FPR is 1%.

Conclusion

In conclusion, the false positive rate is a critical metric that impacts various domains, including medical diagnostics, finance, cybersecurity, and quality control processes. High false positive rates can lead to erroneous decisions, squander resources, and undermine trust in systems or models. 

Platforms like VWO leverage PTBB to mitigate the threat of false positive rates. If you want to know more about it, grab a 30-day free trial of the VWO platform to explore all its capabilities.

The post False Positive Rate appeared first on VWO Glossary.

]]>
Sequential Testing Correction https://vwo.com/glossary/sequential-testing-correction/ Thu, 14 Mar 2024 09:34:57 +0000 https://vwo.com/glossary/?p=2260 Sequential testing correction consists of techniques used in statistical analysis to mitigate the risk of increased false positives while continuously monitoring test statistics for significance. Sequential testing can increase the chance of declaring a variation to be a winner when it is actually equivalent to the baseline. That is where sequential testing correction helps control this risk by adjusting the level of confidence required before declaring something as significant.

The post Sequential Testing Correction appeared first on VWO Glossary.

]]>
Sequential testing is a statistical method used to analyze data as it is collected, ensuring decisions are made in a step-by-step way rather than waiting until all data is collected.

This can help to reduce the time and resources needed for experimentation, particularly in situations where the outcome becomes clear before all data is collected. 

Let’s say you’re a product manager for an eCommerce website, and you’re planning to roll out a new feature aimed at increasing conversion rates. However, your website has limited traffic, and acquiring additional traffic through advertising campaigns is costly. In such a case, sequential testing would be ideal. 

You could implement the new feature and use sequential testing to monitor its performance. If the feature shows significant positive results early on, you can conclude the test sooner and roll out the feature to all visitors, saving time and resources. On the other hand, if the feature doesn’t perform as expected, you can stop the test early, preventing further investment of resources in an ineffective feature. 

Sequential testing correction encompasses methods aimed at preventing the issues that arise from sequential testing, such as false conclusions when interpreting interim results. Sometimes, sequential testing may heighten the risk of erroneously concluding a variation to be better when it isn’t (a false positive). Sequential testing correction mitigates this risk by adjusting the threshold of confidence necessary before finalizing significance.

Fixed horizon tests vs Sequential tests

In contrast to sequential tests, fixed horizon tests have both sample sizes and experiment goals predetermined. Conclusions can only be drawn upon completion of the review period. This approach generally provides a higher level of statistical trustworthiness but at the cost of higher traffic being used for each experiment. 

Why are Sequential tests more suited to modern A/B testing? 

In recent years, sequential tests have become increasingly popular, enabling continuous data collection. Here are some reasons why it is more suited to modern A/B testing:

Efficiency

By implementing sequential testing, organizations can quickly identify potential disadvantageous ideas or content at an early stage of development before they are fully implemented or exposed to a large audience. Organizations can effectively allocate resources and minimize the overall costs associated with implementing such ideas. This helps businesses make informed decisions, such as releasing a big feature before a major event, in fast-paced digital environments. 

Flexibility

Modern businesses need experimentation to be visitor efficient so that A/B testing can be done on pages with low traffic as well. With sequential testing, sample sizes are not fixed, offering the option to stop the experiment early if significant results are observed or to continue until reaching a predetermined endpoint, accommodating varying traffic levels and experiment durations.

What are the problems caused by sequential testing?

Despite having benefits, sequential testing may also pose problems for businesses. 

It may seem counterintuitive, but whenever statistical results are calculated multiple times, there is a risk of increasing the false positive rate.

This is the main concern with continuously monitoring A/B test statistics. Therefore, several solutions have been proposed to sequentially correct test statistics and reduce the occurrence of false positives in sequential testing.

How do you correct errors from sequential testing?

There are a couple of ways of correcting errors in sequential testing. They are as follows:

Bonferroni corrections

False positive rates increase roughly linearly with the number of interim checks you make. The simplest solution is to divide your target false positive rate by the number of interim checks you are making.

So, if you need a 5% false positive rate and you are making 10 interim analyses, set the false positive rate of each check to 5% / 10 = 0.5%. This is the Bonferroni correction.
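The correction itself is a single division; a minimal sketch:

```python
def bonferroni_alpha(overall_alpha, num_checks):
    """Per-check significance level under the Bonferroni correction."""
    return overall_alpha / num_checks

# A 5% overall false positive rate spread across 10 interim analyses.
print(bonferroni_alpha(0.05, 10))  # 0.5% per check
```

The trade-off is conservatism: with many peeks, the per-check threshold becomes very strict, making real winners harder to detect.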

Always valid inference 

This method allows for continuous testing during data collection without determining in advance when to stop or how many interim analyses to conduct. This approach offers flexibility, as it doesn’t require prior knowledge of sample size and supports both streaming and batch data processing. 

Always Valid Inference isn’t popular because it’s complex to grasp and significantly compromises statistical power. This implies that detecting a winner will take significantly longer when one actually exists.

To simplify the testing process and let you focus on running tests and obtaining early results without worrying about skewed outcomes, VWO uses a derivative of the alpha-spending approach to sequential testing correction introduced by Lan and DeMets.

The alpha-spending approach involves distributing the type I error (alpha) across the duration of a sequential A/B test. With this approach, alpha can be allocated flexibly across the selected peek times, and it is only utilized when peeking occurs. If a peek is skipped, the unused alpha can be retained for future use. Additionally, there is no need to predetermine the number of tests or the timing of their execution during data collection.
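As an illustration of the general idea (not VWO’s actual implementation), here is an O’Brien-Fleming-type spending function, one common choice in the Lan-DeMets framework. It spends very little alpha at early peeks and releases the full budget only by the end of the test:

```python
from statistics import NormalDist

def alpha_spent(alpha, t):
    """Cumulative alpha spent by information fraction t (0 < t <= 1),
    using the O'Brien-Fleming-type spending function:
        alpha(t) = 2 * (1 - Phi(z_{alpha/2} / sqrt(t)))
    """
    nd = NormalDist()
    z = nd.inv_cdf(1 - alpha / 2)
    return 2 * (1 - nd.cdf(z / t ** 0.5))

# Early peeks spend almost nothing; the full 5% is spent by t = 1.
for t in (0.25, 0.5, 0.75, 1.0):
    print(t, round(alpha_spent(0.05, t), 5))
```

Here `t` is the fraction of the planned information (roughly, traffic) collected so far; the amount spent grows monotonically and equals the overall alpha at `t = 1`.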

By selecting Sequential Testing Correction in the SmartStats Configuration, decision probabilities will be adjusted to minimize errors while monitoring test results during data collection in the new test reports.

If you prioritize obtaining reliable test results and desire greater control over test statistics, consider using VWO, where our testing system is designed to meet your advanced needs.

The post Sequential Testing Correction appeared first on VWO Glossary.

]]>
Inverse Metrics  https://vwo.com/glossary/inverse-metrics/ Tue, 06 Feb 2024 11:42:21 +0000 https://vwo.com/glossary/?p=2245 Inverse metrics are considered better when their values decrease. A reduction in their value is seen as an indicator of improvement in the overall visitor experience on a website. For instance, a lower bounce rate suggests higher visitor engagement, and a lower form abandonment rate signifies smoother visitor interactions with webforms.

The post Inverse Metrics  appeared first on VWO Glossary.

]]>
Inverse metrics on a website are deemed more favorable when their values decrease.

For instance, if you notice an increase in the drop-off rate on your website’s cart page using analytics tools, and the heatmap analysis confirms the same, you might run a test to ‘reduce’ the drop-off. Ideally, you want the checkout rate to increase while the drop-off rate decreases.

In this example, the drop-off rate is the inverse metric you aim to decrease. A reduction in the drop-off rate can contribute to an increase in other crucial metrics, indicating that visitors are taking desired actions on your website and leading to an uplift in conversions for your business.

What are some inverse metrics?

Whether you want to improve conversions, introduce a new feature, or investigate navigation bottlenecks on your website, tracking inverse metrics is important to understand where visitors encounter problems and to find ways to reduce their values. Here are some inverse metrics you should watch out for: 

Page load time

The page load time is an inverse metric because the lower it is, the better the visitor experience on a website. Consequently, maintaining a low page load time helps control other inverse metrics, such as bounce rates.

Bounce rate

Bounce rate is the percentage of visitors leaving after viewing one page on a website. It is important to maintain a low bounce rate to encourage visitors to explore further and move down the conversion funnel on your website.

Refund rate

Refund rate represents the percentage of customers requesting refunds for products or services. A lower refund rate suggests customer satisfaction, good product quality, and effective marketing, all of which are positive indicators for a business.

Customer support tickets

A decrease in the number of customer support tickets indicates that visitors are experiencing fewer issues or challenges with the products or services offered by a business. This could indicate improved product quality, clearer instructions, intuitive features, or the proactive resolution of common customer pain points. 

Form abandonment rate

When visitors abandon web forms midway, it indicates that they found the form-filling process to be a hassle. You can monitor the field-level friction points through form analytics. A clear and intuitive form design encourages visitors to smoothly progress through the required fields.

Cart abandonment rate

A higher cart abandonment rate suggests that visitors are dropping off before completing their purchases, signaling friction in the conversion funnel. Do you want to learn effective methods for minimizing cart abandonment on your website? Download our eBook for valuable frameworks, tips, and real-world examples to guide you through the process.

Cost per acquisition

A lower Cost Per Acquisition (CPA) is desirable because it means a business is acquiring customers at a lower cost, improving profits and returns. Businesses can prioritize high-return channels to acquire new customers, nurture relationships with existing customers, and implement customer retention strategies to bring down CPA.

Businesses successfully reducing inverse metrics

Businesses actively strive to keep inverse metrics under check because a reduction in these values will indicate an improvement in visitor engagement and experience on their websites. Here are some brands that strategized to control inverse metrics and saw improvement in conversion metrics:

  • ReplaceDirect, a Dutch eCommerce site, revamped the second stage of the checkout process by adding an order overview showing the products, total costs, and delivery date. The layouts of the page and the form were changed for a cleaner look, and unnecessary fields were removed. It decreased the cart abandonment rate by 25% and increased sales by 12%.
  • MedaliaArt, an online art gallery, conducted a split URL test where they created two new versions of homepages with a holiday sale banner displayed at different locations – one at the top and another on the right. They wanted to track which variation could help reduce the bounce rate on the website. Variation 1, which showed the banner prominently at the top, was a winner, reducing the bounce rate by 21%.
  • POSist, an online restaurant management platform, wanted to increase the number of sign-ups for a demo of their platform. The team started with homepage improvements to figure out ways to reduce the drop-off on the website. They also reduced the loading time and enhanced the overall performance of their website to ensure faster loading on all devices and platforms. This optimization resulted in a 15.45% increase in visits to the contact page. Moreover, these changes addressed fundamental issues and laid the foundation for a couple of other tests that increased demo requests by 52%.

The lower the values of inverse metrics, the better the visitor experience. If you’re wondering where to start making changes to keep these metrics in check, VWO can help. With VWO, you can derive insights from visitor behavior, identify friction areas, run tests, and implement changes to control inverse metrics. 

In fact, VWO recently introduced two powerful metrics – time spent on page and bounce rate. These metrics reveal how visitors behave, enabling increased engagement and better conversions on a website. In experiments where bounce rate serves as a metric, VWO treats a lower bounce rate as a sign of improved performance. To explore all the features of VWO, sign up for a free trial.

The post Inverse Metrics  appeared first on VWO Glossary.

]]>
Guardrail Metric https://vwo.com/glossary/guardrail-metric/ Mon, 05 Feb 2024 12:49:25 +0000 https://vwo.com/glossary/?p=2234 Guardrail metrics are the business metrics that you don't want to see negatively impacted while conducting experiments like A/B tests.

The post Guardrail Metric appeared first on VWO Glossary.

]]>
What is a guardrail metric?

Guardrail metrics are the business metrics that you don’t want to see negatively impacted while conducting experiments like A/B tests. The guardrail metric setting acts as a safety net, ensuring that while you’re focusing on enhancing certain aspects of your business, you’re not inadvertently causing harm to another critical metric essential for overall success.

An organization can establish guardrail metrics common to all teams to prevent negative impacts during experiments. Additionally, different teams can publicly share their key metrics and request to set them as guardrails to avoid causing harm. For instance, the web performance team may share a key metric, such as a website speed threshold, which the marketing team can set as a guardrail metric when conducting an A/B test.

comic strip on guardrail metric
Comic strip on guardrail metric

Example of guardrail metric

Let’s imagine a scenario where a SaaS website is conducting an A/B test to improve scroll depth on its landing page. The original design of the landing page is as follows:

Example of a landing page

The A/B test involved testing a variation with a scroll-down feature for the “know feature” text in the first fold. To safeguard against unintended consequences, a guardrail metric was established to ensure the visibility and effectiveness of the “Book demo” call-to-action (CTA) in the first fold remained prominent and unaffected.

Throughout the test phase, the team analyzed user engagement metrics and conversion rates and gathered feedback. After a few weeks of experimentation, the data revealed a remarkable 20% boost in user scroll depth. Importantly, this increase was achieved without letting the visibility or effectiveness of the critical “Book demo” CTA fall below the set threshold. The successful outcome showcased a well-balanced approach: increased scroll depth with no negative impact on the guardrail metric.

Types of guardrail metrics

To secure a continuous enhancement of your website or digital touchpoint experience while safeguarding your ROI, it’s crucial to monitor different types of guardrail metrics. Here are the types of guardrail metrics you should keep an eye on:

  1. Financial metrics that have a direct impact on the revenue generated through your digital touchpoint, such as the checkout button click-through rate (CTR).
  2. Metrics that track user experience, including engagement rate, scroll depth, time on page, CTR, and website speed.
  3. Business-specific metrics that change at specific time intervals; for example, a quarterly aim might be to reduce churn, tracked through engagement metrics from existing customers.

Benefits of using guardrail metrics 

Setting a guardrail metric for an experimentation campaign offers key advantages:

a. Risk-averse approach

It maintains a risk-averse approach while enabling improvements, ensuring a balance in performance for your key business objectives.

b. Complex relationship insights

It facilitates the understanding of complex relationships between various parameters that may be overlooked during hypothesis creation.

c. Coordination between teams

An organization can ensure that individual teams working to improve respective key metrics don’t inadvertently harm other team metrics.

d. Ease for future hypotheses

The insights gained from tracking guardrail metrics aid in formulating hypotheses by providing clear guidelines on what to avoid for future hypotheses.

Setting and tracking guardrail metrics with VWO

Creating a guardrail metric with VWO is a straightforward process. Suppose you wish to set a guardrail for the form signup rate on your website. The image below shows the VWO interface with the required metric setup. 

Once you have successfully created the metric, applying it to your VWO campaigns is a straightforward process. In any experimentation feature, like VWO Testing, you can access the VWO dashboard where you manage your metrics and goals. Set the primary metric as the one intended for the test and select the secondary metric as the guardrail metric you created. 

VWO dashboard
VWO dashboard

By incorporating a guardrail metric into your VWO campaigns, you ensure a robust monitoring system that allows you to track and safeguard crucial business metrics during experimentation.

If you want to explore the VWO dashboard, discover how to set guardrail metrics, and utilize other experimentation features to enhance your CRO campaigns, we offer a comprehensive 30-day free trial. Give it a try and unlock the potential for optimizing your conversion rates!

Conclusion 

In conclusion, guardrail metrics are crucial for businesses looking to conduct experiments and improve their key metrics without causing harm to other critical metrics essential for overall success. By setting and tracking guardrail metrics, organizations can maintain a risk-averse approach, gain insights into complex relationships, and ensure coordination between teams. 

The post Guardrail Metric appeared first on VWO Glossary.

]]>