Future of E2E testing

In this article I try to anticipate how End-to-End (E2E) testing will evolve in the near future.

There is a lot of info about E2E testing, for example, this one.

The plan for the article:

1. What do we want from E2E testing?
2. E2E testing outcome
3. Current situation with E2E testing
4. Problems we have now with E2E testing
5. One ideal solution
6. The ideal solution outcome
7. Virtual users solution
8. Would the outcome be the same as in the ideal solution?
9. Some words about the design and implementation of the virtual users
10. Summary

What do we want from E2E testing?

By using E2E testing we want to make sure that, from the real user’s perspective, all flows in our application work and that users can successfully reach their goals using the application.

E2E testing outcome

As a result of testing we would have the following artifacts:

Using the above outcome we can decide whether the new version of our app is ready to be released to production. It should also give us enough info to understand which system has bugs and how to reproduce them.

Current situation with E2E testing

Currently, to implement E2E testing, we need to prepare or update the testing scenarios, prepare test data (test accounts, test cards, etc.), and then either ask manual testers to go through the scenarios or automate running them using various tools and methods.

Problems we have now with E2E testing

There are quite a few problems with E2E testing. We can list some:

One ideal solution

Then the question is what the ideal solution could look like. What would perfect E2E testing be?

What if we could have all our real users use our application, do whatever they need, and report to us whenever they find a bug or an inconvenience? They would not stop using the app, leave a bad rating in the stores, or call our support. Instead, they would just send us well-prepared reports with enough info to reproduce and fix the bugs.

This approach definitely satisfies the definition of E2E testing and doesn’t have the problems from the previous section. It actually sounds like perfect E2E testing, at least from my point of view.

The ideal solution outcome

In the case of the ideal solution we would have the same artifacts as from the usual E2E testing plus some more:

Sounds rather cool. Of course, it has its own difficulties around the use of special environments and the organization of the process, but it seems possible to solve them, as we’ve already done that for the usual E2E testing.

The next question is how we can make that dream come true without bothering our real users.

Virtual users

What if we could invent an AI-driven virtual user who would install our application or open our website and use it like a real user, with a purpose in mind (or, maybe more correctly, in a neural network)? Then we could summon as many such users as we need, and they would do exactly what we described in the ideal solution.

However, there are two requirements that make the task rather hard:

One way of meeting both of the above requirements is to set up different types of virtual users that match the different groups of real users. For example, if our app is a mega online store, we can probably define the following groups of users by their intentions:

I’m not an expert in user behavior; this is just what comes to mind first.

The main idea is that there are some common behavior patterns and they can be reproduced by our virtual users. However, the above patterns are not testing scenarios. Testing scenarios are hardcoded and should give the tester or the testing pipeline clear instructions: tap this, input that, check this, and so on. In contrast, behaviors or patterns are very general and sit at a higher level of abstraction: “search for black office chair under 100 pounds with an overall rating bigger than 4.5 and check all reviews so there are no injuries mentioned, then buy it using {card} and {address}”.
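To make the difference concrete, here is a minimal Python sketch. The step format, the `Behavior` class, and the test data values are all assumptions made for illustration; nothing here is a fixed design.

```python
from dataclasses import dataclass
from string import Formatter

# A classic scripted scenario: every step is spelled out for one specific UI.
# The action names and element names are made up for illustration.
SCRIPTED_SCENARIO = [
    ("tap", "search_button"),
    ("input", "search_field", "black office chair"),
    ("tap", "first_result"),
    ("check", "price_label"),
]


@dataclass
class Behavior:
    """A high-level goal with placeholders for test data (hypothetical)."""

    template: str

    def placeholders(self) -> list[str]:
        # Collect the {name} fields that must be filled in with test data.
        return [name for _, name, _, _ in Formatter().parse(self.template) if name]

    def render(self, **test_data: str) -> str:
        # Fill in the test data; raises KeyError if a placeholder is missing.
        return self.template.format(**test_data)


chair = Behavior(
    "search for black office chair under 100 pounds with an overall rating "
    "bigger than 4.5, then buy it using {card} and {address}"
)
```

The scripted scenario breaks as soon as an element is renamed or moved, while the behavior only states what the user wants and which test data to use; how to get there is left to the virtual user.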

A virtual user will then imitate the real users: trying to understand the UI, find the needed elements on the screen, and interact with them in the same way the real users do.

Using this approach we can summon a few virtual users per group and “ask” them to use our app in the same way the real users do.

Would the outcome from the virtual users solution be the same as in the ideal solution?

The outcome would be pretty much the same: we would have the same stats about users who participated and successfully reached their goals, the same reports about any bugs that prevent our virtual users from reaching their goals, reports about backend errors that didn’t impact the user experience, and so on.

What will be missing:

Some words about the design and implementation

How would we implement the virtual users?

Let’s see what we would need to make the virtual users do what we want.

There could be the following layers:

We can think of layer 3 as the core of the system, layer 4 as the input, and layer 2 as an adapter from the core to any UI (web, mobile, PC apps, or even something special like medical equipment with touch screens or industrial monitoring systems with hardware buttons).
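One way to picture this separation is as a core that talks only to an adapter interface. The sketch below is an assumption about how that could be expressed in Python; the names `UiAdapter`, `Core`, and `FakeWebAdapter` are hypothetical.

```python
from typing import Protocol


class UiAdapter(Protocol):
    """Layer 2: turns an abstract command into actions on a concrete UI."""

    def perform(self, command: str) -> bool:
        """Return True if the command could be carried out."""
        ...


class Core:
    """Layer 3: owns the sequence of commands and knows nothing about the UI."""

    def __init__(self, adapter: UiAdapter) -> None:
        self.adapter = adapter

    def run(self, commands: list[str]) -> list[str]:
        # Return the commands that the adapter could not carry out.
        return [c for c in commands if not self.adapter.perform(c)]


class FakeWebAdapter:
    """Stand-in adapter that only 'knows' a fixed set of commands."""

    known = {"Open search page"}

    def perform(self, command: str) -> bool:
        return command in self.known


failed = Core(FakeWebAdapter()).run(["Open search page", "Open cart"])
```

Because the core depends only on the interface, the same core could in principle drive a web adapter, a mobile adapter, or an adapter for a device with hardware buttons.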

High level design

How could the above example about buying a black office chair work here?

Layer 4

Layer 4 gets the text input, analyzes it, and extracts all the important features from it. It could look like this:
* Main goal: buy
* Subject: chair
* Subject properties: black, office
* Limitations: price (100 pounds), rating (better than 4.5), reviews (do not mention injuries).
* Account: …
* Card: …

I suppose the best way to do that is to use machine learning and train a model to extract the meaning from the text. There are probably already some models like that on the market.
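Whatever does the extraction, its output could be held in a small structure like the one below. This is only a sketch: the `Goal` class and its field names are assumptions that mirror the list above, and the account and card fields are left out because their values are not given here.

```python
from dataclasses import dataclass, field


@dataclass
class Goal:
    """Features extracted from the text input (hypothetical shape)."""

    main_goal: str
    subject: str
    subject_properties: list[str] = field(default_factory=list)
    limitations: dict[str, str] = field(default_factory=dict)


def missing_fields(goal: Goal) -> list[str]:
    """Names of the required fields that the extractor left empty."""
    return [name for name in ("main_goal", "subject") if not getattr(goal, name)]


chair_goal = Goal(
    main_goal="buy",
    subject="chair",
    subject_properties=["black", "office"],
    limitations={
        "price": "100 pounds",
        "rating": "better than 4.5",
        "reviews": "do not mention injuries",
    },
)
```

A check like `missing_fields` lets the pipeline reject an input the extractor could not fully understand before it is handed to the next layer.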

Layer 3

This one is a bit tricky. It should act as a problem solver that, for each task, provides an algorithm or a sequence of steps to solve it. In our example, the steps could look like this:

I can see two ways of implementing this:

I admit that I don’t have much knowledge of machine learning, and maybe there are many other ways to implement this layer. But I’m sure that the task is very interesting, challenging, and, more importantly, promising, as the results can be reused in many other fields like robotics, game AI, and many more.

Layer 2

This layer is very important as it gives us a real interface to any application. What I mean is that the UI is rather different in web, mobile, and PC apps, and the way we can interact with it is different too. Layer 2 abstracts that away and provides us with something more high-level.

In our example, Layer 2 will be constantly analyzing the page it sees (using screenshots, special tools like ‘adb’ for Android, or something else), extracting all UI elements, and “understanding” them. So when it receives the command “Open search page” from Layer 3, it looks for something that has the meaning of search (a button with the text “Search”, a magnifying glass icon, a menu item with the text “Search”, etc.) and then activates it by tapping on it (using special tools or just tapping on the screen at the element’s coordinates).
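As a rough illustration of the lookup step only, the sketch below matches already-extracted UI elements against a meaning using a hand-written hint table. The `UiElement` shape, the `MEANING_HINTS` table, and the substring matching are all simplifying assumptions; a real Layer 2 would need actual screen analysis and a far richer notion of meaning.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UiElement:
    kind: str   # e.g. "button", "icon", "menu"
    label: str  # visible text or a recognised icon name
    x: int      # screen coordinates of the element's centre
    y: int


# Hypothetical table mapping a meaning to labels that usually express it.
MEANING_HINTS = {"search": {"search", "magnifying glass", "find"}}


def find_by_meaning(elements: list[UiElement], meaning: str) -> Optional[UiElement]:
    """Return the first element whose label hints at the given meaning."""
    hints = MEANING_HINTS.get(meaning, {meaning})
    for element in elements:
        label = element.label.lower()
        if any(hint in label for hint in hints):
            return element
    return None


screen = [
    UiElement("button", "Cart", 300, 40),
    UiElement("icon", "Magnifying glass", 20, 40),
]
target = find_by_meaning(screen, "search")
```

Here `target` is the magnifying glass icon, and its coordinates are what the layer would pass on for tapping. Note that nothing in the lookup depends on element ids.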

This layer alone can be rather useful for UI testing as it gives the opportunity to interact with any UI universally and does not depend on special ‘ids’ of elements or special tooling for interacting with internal UI representation.

We have already made a PoC for this layer and it looks very promising. Hopefully, in the next articles, we can reveal more about it.

Layer 1

This is just the UI of our app and the implementation of a bidirectional way of interacting with it. We already mentioned some options in the previous section, like taking screenshots and tapping on the screen using the coordinates of the UI elements.
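For Android, both directions can be sketched with two standard `adb` commands: `adb exec-out screencap -p` to read the screen and `adb shell input tap` to act on it. The Python wrapper functions below are hypothetical names chosen for illustration, and the sketch assumes `adb` is on the PATH with a single connected device.

```python
import subprocess


def screenshot_command() -> list[str]:
    # Writes a PNG of the current screen to standard output.
    return ["adb", "exec-out", "screencap", "-p"]


def tap_command(x: int, y: int) -> list[str]:
    # Simulates a tap at the given screen coordinates.
    return ["adb", "shell", "input", "tap", str(x), str(y)]


def take_screenshot() -> bytes:
    """Read direction: return the current screen as PNG bytes."""
    result = subprocess.run(screenshot_command(), check=True, capture_output=True)
    return result.stdout


def tap(x: int, y: int) -> None:
    """Write direction: tap the screen at (x, y)."""
    subprocess.run(tap_command(x, y), check=True)
```

Keeping the command construction separate from the `subprocess` calls makes it possible to check the commands without a device attached.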


Summary

In this article, we discussed one possible way E2E testing could evolve and suggested some ways of getting there.

We think that the future of E2E testing is in the hands of virtual users (AI) who can test our apps in the same way our real users use them every day.

It’s definitely a challenge to make that AI work, but each step towards the end state can bring its own value and reveal new, unknown ways of testing, using, and even thinking about apps.

First of all, I’m super happy that somebody has read this far, and I’ll be even happier to discuss this topic in the comments to the article or in any other appropriate place.

Feel free to share your ideas, criticism, and comments.