
Commit ea170e9

a little more detail
1 parent 3d84540 commit ea170e9


1 file changed: +14 -0 lines changed

content/blog/evals.md

Lines changed: 14 additions & 0 deletions
@@ -33,6 +33,20 @@ well..
{{< tweet user="gdb" id="1733553161884127435" >}}


+#### ok so.. evals for MCP?
+
+> what capabilities do clients have when connected to an MCP server?
+
+is not the same question as:
+
+> what is literally exposed by my MCP server?
+
+clients might have other capabilities, like a more general interface (e.g. a terminal, à la Claude Code), which is where the user might prefer more sensitive operations to happen
+
+so, if we want to evaluate that an MCP client can do a thing on behalf of a user, we just need to set up an initial condition and let the client loop with its tools/MCP servers until it achieves the desired outcome (perhaps also asserting that it happened in a particular way)
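a rough sketch of that loop, in case it helps. everything in it is a hypothetical stand-in: `run_eval`, `scripted_client` and the `create_issue` tool are made-up names, and no real MCP SDK or model is involved. the "client" is just a scripted function so the example runs on its own.

```python
from dataclasses import dataclass


@dataclass
class EvalResult:
    achieved: bool  # did the desired outcome hold at the end?
    calls: list     # tool names, in the order the client called them


def run_eval(pick_action, tools, state, goal, max_steps=10):
    """Let the client loop with its tools until goal(state) holds or steps run out."""
    calls = []
    observation = None
    for _ in range(max_steps):
        if goal(state):
            break
        action = pick_action(state, observation, sorted(tools))
        if action is None:  # the client gave up
            break
        name, args = action
        observation = tools[name](state, args)
        calls.append(name)
    return EvalResult(achieved=goal(state), calls=calls)


def create_issue(state, args):
    # hypothetical tool: mutates the world state and returns an observation
    state["issues"].append(args["title"])
    return "created"


def scripted_client(state, observation, available):
    # stand-in for an LLM-driven client: files the issue if it has a tool for it
    if "create_issue" in available:
        return ("create_issue", {"title": "login is broken"})
    return None


# initial condition: an empty issue tracker
state = {"issues": []}
result = run_eval(
    scripted_client,
    tools={"create_issue": create_issue},
    state=state,
    goal=lambda s: "login is broken" in s["issues"],
)
assert result.achieved                   # the outcome happened...
assert result.calls == ["create_issue"]  # ...and it happened a particular way
```

the goal is checked against the world state rather than against what the client says it did, so the outcome assertion and the "how it got there" assertion stay separate.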
+
+by extension, if you restrict the set of tools to only your MCP server, you can evaluate that your MCP server enables clients in general to have a particular capability on behalf of a user.
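one way to picture the restriction, as a sketch under assumptions: the tool catalog, the `my-mcp-server` label, `only_from` and the scripted client below are all hypothetical, not a real MCP client API. the scripted client deliberately prefers the terminal, to show why an unrestricted run may say nothing about your server.

```python
def shell_exec(state, args):
    # hypothetical general-purpose tool (the "terminal")
    state["files"].append(args["path"])


def create_note(state, args):
    # hypothetical tool exposed by the MCP server under test
    state["files"].append(args["path"])


# hypothetical catalog: (source, tool name) -> implementation
ALL_TOOLS = {
    ("terminal", "shell_exec"): shell_exec,
    ("my-mcp-server", "create_note"): create_note,
}


def only_from(source, tools):
    """Keep only the tools exposed by one source."""
    return {name: fn for (src, name), fn in tools.items() if src == source}


def scripted_client(tools, state):
    """Stand-in for a real client: prefers the terminal, else takes what it has."""
    for name in ("shell_exec", *sorted(tools)):
        if name in tools:
            tools[name](state, {"path": "todo.md"})
            return name
    return None  # nothing usable was offered


# unrestricted: the task succeeds, but via the terminal, not via our server
everything = {name: fn for (_src, name), fn in ALL_TOOLS.items()}
unrestricted_state = {"files": []}
used_unrestricted = scripted_client(everything, unrestricted_state)
assert used_unrestricted == "shell_exec"

# restricted: success here can only come from the MCP server's own tools
restricted_state = {"files": []}
used_restricted = scripted_client(only_from("my-mcp-server", ALL_TOOLS), restricted_state)
assert used_restricted == "create_note"
assert restricted_state["files"] == ["todo.md"]
```

both runs end in the same state, which is the point: only the restricted run tells you the capability came from the server.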
+

```python
@pytest.fixture
