More on AI-Powered Agents Transforming Browser Functional Testing with Selenium and Playwright
Introduction
The purpose of this article is to update you on some discoveries I made while testing with Claude Desktop in dialog with a Playwright MCP Server and a Selenium MCP Server. I began to understand why there are already multiple MCP Servers for the same products, such as Selenium and Playwright. One of the primary reasons is configuration, which I will cover in more detail later in this article. Another reason is design. The beauty of programming is design. Programmers and developers can be artists, creating MCP Servers that are unique in the services they offer. So users of these servers must be aware of the differences between the offerings to determine which server fits their needs.
I began my test comparison project with the two tools expecting that the results would be similar. Though the tools are unique in functionality, I still expected the experience to be similar. Well, both completed the objectives of the tests. But behavior and responses were very different.
My experience with the Playwright Server probably spoiled me. It did more than I was expecting. I made one prompt request to test an application. Claude and the Playwright MCP Server started a lengthy dialog, recovering from error conditions, intuitively performing verification and validation actions, and then wrapping up the test case with a summary of all that transpired. I was truly impressed.
I then expected that the Selenium Server setup and behavior would be similar. Not hardly. After much trial and error due to limited documentation, the setup had to be local rather than remote, as it was with the Playwright Server. Much time was lost getting ready for a test. Then I submitted the same test prompt as before. The server choked because it could not find a browser. I eventually had to modify the prompt to tell it which browser to locate. It then launched the browser but responded with a message saying I had reached the maximum number of requests and should start a new conversation. That was alarming. I had the wisdom to retry with simple prompts that each required a single response. That allowed me to enter the prompts necessary to complete the test case. It got me most of what I had seen with the Playwright Server, but shy of automatic updates, ongoing screenshots, and an ending summary report. Let me turn your attention to what this is all about.
Opening Remarks
It seems appropriate to draft a follow-on to my previous article that covers in more detail what became a deviation from the path I planned to take. Plans are just that: a roadmap if everything goes as planned. If one has to deviate, then one moves into uncharted territory, maybe because documentation didn’t meet the need. Pardon me if I take jabs at modern-day Agile diehards who think Agile means no documentation or as little as possible. Maybe AI will solve all our shortcomings. Enough digressing.
MCP Server Configurations
I understood that MCP Servers could run remotely or locally. But I did not know clearly why. I thought it was a choice you have as a user. I thought if it was available on the internet, as long as I did not need to modify the server code, the right choice was remote. It’s not that simple. And documentation does not make that clear. I hope this article helps with that.
So, let’s take a closer look at the MCP Server design options. Here I am using the term “configuration modes”. Think of each mode in terms of the approach the MCP Client uses to interact with the server.
Allow me to explain the visual diagram that illustrates the four main MCP server configuration modes. There can be other variations, but here are four main configurations. And let’s include how Claude Desktop interfaces with each.
Here is more detail to explain each mode.
1. Local Execution (STDIO)
- Transport: Standard Input/Output (STDIO)
- Setup: Claude Desktop or another AI agent launches the server as an OS subprocess.
- Use Case: Tools that need access to local resources (e.g., Selenium with local browser paths).
- Variation: Downloading the server source code is optional. Download it only if you need to modify its functions.
- Pros: Usage is simple and fast, with no network latency.
- Cons: The process is limited to the local environment.
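To make the STDIO mode concrete, here is a minimal Python sketch of a client that launches a server as a subprocess and exchanges one newline-delimited JSON-RPC message with it over stdin/stdout. The toy server, the `ping` method, and the `call_stdio_server` helper are hypothetical stand-ins for illustration, not code from any real Selenium or Playwright MCP server. A real client would also perform the MCP initialization handshake before calling tools.

```python
import json
import subprocess
import sys

# Toy server (hypothetical): reads one JSON-RPC request per line on stdin
# and writes one JSON-RPC response per line on stdout.
TOY_SERVER = r'''
import json, sys
for line in sys.stdin:
    request = json.loads(line)
    reply = {"jsonrpc": "2.0", "id": request["id"],
             "result": {"echo": request["method"]}}
    sys.stdout.write(json.dumps(reply) + "\n")
    sys.stdout.flush()
'''


def call_stdio_server(method: str) -> dict:
    """Launch the toy server as a subprocess and send it a single request."""
    proc = subprocess.Popen(
        [sys.executable, "-c", TOY_SERVER],
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        text=True,
    )
    request = {"jsonrpc": "2.0", "id": 1, "method": method}
    proc.stdin.write(json.dumps(request) + "\n")
    proc.stdin.flush()
    reply = json.loads(proc.stdout.readline())
    proc.stdin.close()      # end of input lets the server loop exit
    proc.wait(timeout=5)
    proc.stdout.close()
    return reply


if __name__ == "__main__":
    print(call_stdio_server("ping"))
```

Because the server is a child process on the same machine, it sees the same file system and browser installations as the client, which is why this mode suits tools that need local resources.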
2. Remote Execution (HTTP/SSE)
- Transport: HTTP or Server-Sent Events (SSE)
- Setup: Claude connects to a server hosted on a remote machine (e.g., Smithery.ai).
- Use Case: Tools like Playwright that are cloud-optimized.
- Variation: HTTP supports one response per request, while SSE supports the server sending updates back to Claude in real time.
- Pros: The server is scalable and centralized.
- Cons: The server may fail if the tool requires local resources (e.g., browser paths) because the server is probably running on a Linux machine.
3. Remote Execution with Port Forwarding
- Transport: SSH Tunnel (which appears local to Claude)
- Setup: You run the server remotely but forward the port to localhost.
- Use Case: When the tool must run on a remote Linux machine, but Claude is on Windows/macOS.
- Pros: The server combines remote power with local accessibility.
- Cons: The server requires SSH setup and port management.
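As a rough illustration of the port-forwarding setup, the Python sketch below builds the `ssh` command for a local forward. The host name, port numbers, and the `build_tunnel_command` helper are assumptions chosen for the example; only the `-N` (no remote command) and `-L` (local forward) options are standard OpenSSH flags.

```python
from typing import List


def build_tunnel_command(remote_host: str, remote_port: int, local_port: int) -> List[str]:
    """Return an ssh command that forwards localhost:local_port to the
    server listening on remote_port of the remote machine."""
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-N", "-L", forward, remote_host]


if __name__ == "__main__":
    # Hypothetical host and ports; substitute your own.
    print(" ".join(build_tunnel_command("tester@build-box", 8931, 8931)))
```

With the tunnel open (for example via `subprocess.run(build_tunnel_command(...))`), the client connects to `localhost` on the local port as if the server were running locally.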
4. Hybrid Configuration
- Transport: Adaptive (custom logic)
- Setup: Server detects client environment and adapts (e.g., OS, file paths).
- Use Case: Advanced tools that serve multiple clients with different setups.
- Pros: This server is flexible and intelligent.
- Cons: This server is complex to implement.
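One small piece of such adaptive logic could be locating a browser based on the operating system, the very step that tripped up my Selenium test. The following Python sketch is only an assumption about how a server might do this. The candidate executable names and the `find_browser` helper are hypothetical, and real installations may not place these names on the PATH.

```python
import platform
import shutil
from typing import Callable, Optional

# Hypothetical executable names per OS; adjust for the machines you support.
CANDIDATES = {
    "Windows": ["chrome.exe", "msedge.exe", "firefox.exe"],
    "Darwin": ["google-chrome", "firefox"],
    "Linux": ["google-chrome", "chromium", "firefox"],
}


def find_browser(
    os_name: Optional[str] = None,
    which: Callable[[str], Optional[str]] = shutil.which,
) -> Optional[str]:
    """Return the path of the first candidate browser found on the PATH,
    or None if the OS is unknown or no candidate is installed."""
    os_name = os_name or platform.system()
    for name in CANDIDATES.get(os_name, []):
        path = which(name)
        if path:
            return path
    return None


if __name__ == "__main__":
    print(find_browser() or "no browser found; ask the user for a path")
```

Returning `None` instead of raising lets the server reply with a helpful message, rather than choking when no browser is found.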
MCP Server Design Behavior
Let’s go a little deeper into the two transport options used in MCP, HTTP and Server-Sent Events (SSE), and highlight what distinguishes them in both behavior and design:
HTTP (Request/Response Model)
✅ Characteristics:
- Request-driven: The client sends a request, and the server sends back a single response.
- Stateless: Each request is independent; the server doesn’t retain a session state unless explicitly managed.
- Simple: Easy to implement and widely supported.
- Synchronous: The client waits for the server to respond before continuing.
🧰 Use Case in MCP:
- Suitable for simple tool calls where the client sends a query and expects a single response.
- Often used for non-interactive tools or one-shot executions.
⚠️ Limitations:
- Not ideal for streaming responses or long-running tasks.
- No built-in mechanism for the server to push updates to the client.
Server-Sent Events (SSE)
✅ Characteristics:
- Unidirectional Streaming: The server can continuously push updates to the client over a single HTTP connection.
- Persistent Connection: The connection stays open, allowing real-time updates.
- Lightweight: Uses standard HTTP and is simpler than WebSocket.
- Text-based: Data is sent as UTF-8 encoded text.
🧰 Use Case in MCP:
- Ideal for streaming tool outputs, progress updates, or LLM-generated content that unfolds over time.
- Used when Claude needs to receive incremental results from a tool (e.g., Playwright test logs, live data feeds).
⚠️ Limitations:
- One-way only: Server → Client. If the client needs to send data, it must use a separate HTTP request.
- Browser/client support: Generally good, but not universal in all environments.
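To show what "text-based" streaming looks like on the wire, here is a minimal Python sketch of an SSE parser. It follows the standard event-stream framing (`event:` and `data:` fields, a blank line ends an event, lines starting with `:` are comments). The `progress` event name and the sample payloads are made up for the example, and the parser ignores the optional `id:` and `retry:` fields.

```python
from typing import Iterable, Iterator, List, Tuple


def parse_sse(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (event_name, data) pairs from the lines of an SSE stream."""
    event = "message"          # default event name when none is given
    data: List[str] = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line dispatches the buffered event, if it has data.
            if data:
                yield event, "\n".join(data)
            event, data = "message", []
        elif line.startswith(":"):
            continue           # comment line, often used as a keep-alive
        else:
            field, _, value = line.partition(":")
            if value.startswith(" "):
                value = value[1:]
            if field == "event":
                event = value
            elif field == "data":
                data.append(value)


if __name__ == "__main__":
    stream = [
        "event: progress\n", "data: step 1 done\n", "\n",
        ": keep-alive\n",
        "data: line a\n", "data: line b\n", "\n",
    ]
    for name, payload in parse_sse(stream):
        print(name, repr(payload))
```

Each event arrives as soon as the server writes it, which is what lets a client show intermediate results while a long task is still running.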
Summary Comparison Table
It might be helpful for reference purposes to have something like the following chart. Here you can see by feature how HTTP compares to SSE.
| Feature | HTTP | Server-Sent Events (SSE) |
| --- | --- | --- |
| Direction | Client → Server → Client | Server → Client (only) |
| Connection | Short-lived | Long-lived |
| Streaming Support | ❌ | ✅ |
| Complexity | Low | Moderate |
| Use Case in MCP | One-shot tool calls | Streaming tool responses like test scenarios |
What I experienced with the Playwright and Selenium servers does align with the differences between streaming (SSE) and non-streaming (HTTP) transport modes, but there’s more to it than just the transport protocol.
What was Observed
My experience testing with the Playwright and Selenium servers would also be different if I chose different versions of the MCP servers. The idea behind MCP is that the client uses the same AI prompts across different tools or applications, and the server converts those requests into the language the tool or app understands.
But here is a bird’s-eye view of what I encountered. This is not bad, but it helps to have documentation that alerts the user to expectations.
| Feature | Playwright Server (Remote) | Selenium Server (Local) |
| --- | --- | --- |
| Single prompt → multiple actions | ✅ Yes | ❌ No |
| Automatic summary | ✅ Yes | ❌ No |
| Frequent screenshots | ✅ Yes | ❌ Rare |
| Required multiple prompts | ❌ No | ✅ Yes |
What Explains This Behavior?
1. Transport Protocol (SSE vs HTTP)
- Playwright Server likely uses SSE, which allows it to:
  - Stream updates back to Claude in real time.
  - Send intermediate results (e.g., screenshots, logs).
  - Push a final summary once the task is complete.
- Selenium Server may use basic HTTP, which:
  - Only allows one response per request.
  - Requires you to manually prompt for each step or result.
  - Cannot stream updates or summaries unless asked or explicitly coded.
2. Tool Server Design
- Playwright Server is probably designed to:
  - Interpret complex prompts.
  - Break them into subtasks.
  - Execute them sequentially.
  - Report progress and results automatically.
- Selenium Server may be more barebones, requiring:
  - Manual step-by-step instructions.
  - Separate prompts for screenshots or summaries.
3. Session & Context Handling
- Playwright might maintain session state across the task.
- Selenium might treat each prompt as a stateless request.
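The contrast between the two styles can be sketched in a few lines of Python. This is purely an illustration of the behavior I observed, not how either server is actually implemented. The `run_one` and `run_plan` helpers and the step names are hypothetical.

```python
from typing import Callable, Dict, Iterator, List

Action = Callable[[], str]


def run_one(actions: Dict[str, Action], step: str) -> str:
    """Step-by-step style: one request in, one response out."""
    return actions[step]()


def run_plan(actions: Dict[str, Action], steps: List[str]) -> Iterator[str]:
    """Plan style: run every step in order, report progress as it happens,
    recover from failures, and finish with a summary."""
    passed = 0
    for step in steps:
        try:
            result = actions[step]()
        except Exception as exc:
            yield f"{step}: failed ({exc})"
            continue
        passed += 1
        yield f"{step}: {result}"
    yield f"summary: {passed}/{len(steps)} steps passed"


if __name__ == "__main__":
    demo = {"open page": lambda: "ok", "log in": lambda: "ok"}
    print(run_one(demo, "open page"))
    for update in run_plan(demo, list(demo)):
        print(update)
```

With `run_one`, the user drives every step and must ask for a summary separately. With `run_plan`, a single request produces a stream of updates followed by the wrap-up.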
Analogy
Think of it like this:
- Playwright Server is like a concierge: you say, “Book me a flight, hotel, and car,” and the server handles it all and reports back the outcome.
- Selenium Server is like a receptionist: you have to ask for each thing one at a time: “Book a flight.” Then, “Now book a hotel.” Then, “Now get me a car.”