Fix #1424: add convert_file and convert_directory MCP tools #1834

Open
kdjkdjkdj wants to merge 1 commit into microsoft:main from kdjkdjkdj:feature/mcp-convert-file-directory

@kdjkdjkdj

Summary

Adds two thin wrapper MCP tools — convert_file(file_path) and convert_directory(dir_path, recursive=True) — alongside the existing convert_to_markdown(uri). Directly addresses #1424: AI agents tend to skip the current URI-only tool when they already have a plain filesystem path in hand, and end up writing Python glue instead of calling MarkItDown.

Changes

One file: packages/markitdown-mcp/src/markitdown_mcp/__main__.py.

convert_file(file_path: str) -> str

Thin wrapper around MarkItDown.convert_local(). Expands ~ and resolves relative paths against the CWD; a missing path raises FileNotFoundError via Path.resolve(strict=True). No library-side change is needed.
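
The path handling described here can be sketched roughly as follows. This is an illustration rather than the code in this PR: `resolve_local_path` is a hypothetical helper name, and the `expanduser()` call is an assumption about how `~` gets expanded, since `Path.resolve()` on its own does not expand it.

```python
from pathlib import Path


def resolve_local_path(file_path: str) -> Path:
    """Expand ~ and resolve a path against the CWD (hypothetical helper).

    strict=True makes resolve() raise FileNotFoundError for missing paths.
    """
    return Path(file_path).expanduser().resolve(strict=True)
```

In the tool itself, the resolved path would then be handed to MarkItDown.convert_local().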

convert_directory(dir_path: str, recursive: bool = True) -> dict[str, str]

Walks a directory (recursively by default via Path.rglob, otherwise Path.iterdir) and converts every file it finds, returning a {relative_path: markdown} mapping. Files that fail individually are skipped with a try/except Exception: continue so a single bad file doesn't abort the batch — mirroring how bulk converters typically behave.
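
A minimal sketch of that walk-and-skip loop, under stated assumptions: the `convert` callable is a stand-in for the real MarkItDown call so the example runs without the library, and the function name, the sorted iteration order, and the POSIX-style relative keys are illustrative choices, not necessarily what this PR does.

```python
from pathlib import Path
from typing import Callable


def convert_directory_sketch(
    dir_path: str,
    convert: Callable[[Path], str],  # stand-in for the MarkItDown conversion
    recursive: bool = True,
) -> dict[str, str]:
    """Convert every file under dir_path, skipping files that fail."""
    root = Path(dir_path).expanduser().resolve(strict=True)
    # rglob("*") walks the whole tree; iterdir() lists only the top level.
    candidates = root.rglob("*") if recursive else root.iterdir()
    results: dict[str, str] = {}
    for path in sorted(candidates):
        if not path.is_file():
            continue  # directories themselves are not converted
        try:
            results[path.relative_to(root).as_posix()] = convert(path)
        except Exception:
            continue  # one bad file must not abort the batch
    return results
```

One consequence of this design is that the caller cannot tell a skipped file from a file that was never there; a variant could collect the failures in a second mapping if that distinction matters.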

Tool schemas are auto-generated by FastMCP from the Python type hints, so no manual JSON-schema surface is introduced.
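
To illustrate why the type hints are enough, here is a rough stdlib-only sketch that derives a JSON-schema-like description from a function signature. It is not FastMCP's implementation, which is more thorough; `input_schema` and `_JSON_TYPES` are hypothetical names, and the stub only mirrors the signature of the new tool.

```python
import inspect
from typing import get_type_hints

# Minimal mapping from Python types to JSON-schema type names (illustrative).
_JSON_TYPES = {str: "string", bool: "boolean", int: "integer", float: "number"}


def convert_directory(dir_path: str, recursive: bool = True) -> dict[str, str]:
    """Stub with the same signature as the new tool; the body is irrelevant here."""
    return {}


def input_schema(func) -> dict:
    """Derive a JSON-schema-like description of func's parameters."""
    hints = get_type_hints(func)
    properties, required = {}, []
    for name, param in inspect.signature(func).parameters.items():
        prop = {"type": _JSON_TYPES[hints[name]]}
        if param.default is inspect.Parameter.empty:
            required.append(name)  # no default means the caller must supply it
        else:
            prop["default"] = param.default
        properties[name] = prop
    return {"type": "object", "properties": properties, "required": required}
```

For the stub above, `dir_path` comes out as a required string and `recursive` as a boolean with default `True`.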

Backwards compatibility

  • Fully additive: convert_to_markdown(uri) is untouched.
  • No new dependencies. Only uses pathlib.Path from the stdlib and the already-imported MarkItDown class.
  • No new configuration options or environment variables, and no breaking changes.

Test plan

Built and installed locally (editable) on Windows with Python 3.13, then ran the new tools via an MCP client (Claude Code):

  • convert_file on a real .docx from a deeply-nested OneDrive/SharePoint path with German umlauts (`D:\OneDrive\OneDrive - KEGON AG\KEGON\Intern...\Erfassung-KI-Anwendungsfall-01-Intern-Verzeichnisse.docx`) — converted cleanly.
  • convert_directory on the repo's own `packages/markitdown-mcp/` (non-recursive): returned a mapping with the keys {'Dockerfile', 'pyproject.toml', 'README.md'}.
  • Agent tool selection: Claude chose convert_file autonomously given a plain Windows path. The original `convert_to_markdown` would require the agent to URL-escape the path and prefix it with file:// itself, which is the core complaint of #1424.
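
For context, this is roughly the glue an agent has to produce before it can call the URI-only tool with a Windows path. The helper name and the example path are hypothetical, and the sketch only covers absolute drive-letter paths.

```python
from pathlib import PureWindowsPath
from urllib.parse import quote


def windows_path_to_file_uri(path: str) -> str:
    """Build a file:// URI from an absolute Windows path (hypothetical helper)."""
    posix = PureWindowsPath(path).as_posix()  # backslashes -> forward slashes
    # Percent-encode everything except the path separators and the drive colon.
    return "file:///" + quote(posix, safe="/:")


print(windows_path_to_file_uri(r"C:\Users\me\Meine Dokumente\Übersicht.docx"))
# file:///C:/Users/me/Meine%20Dokumente/%C3%9Cbersicht.docx
```

With convert_file, the agent can pass the original path string unchanged and skip this step.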

No unit tests added — the tests/ directory is present but currently empty; happy to add pytest coverage for both new tools if that's desired before merge.

Notes

  • Co-authored with Claude Opus 4.7 (Claude Code).

Closes #1424.

The markitdown-mcp server currently exposes a single tool,
convert_to_markdown(uri), which covers http/https/file/data URIs.
In practice MCP clients often have plain filesystem paths on hand
rather than URIs, which leads to agents spending extra turns on the
URI construction (or skipping the tool entirely, see microsoft#1424).

This change adds two thin wrapper tools that delegate to the existing
MarkItDown.convert_local() API, so no library-side changes are needed:

- convert_file(file_path): convert a single local file to markdown.
  Paths are resolved relative to the current working directory and
  non-existent paths surface as FileNotFoundError.
- convert_directory(dir_path, recursive=True): convert every file in
  a directory and return a {relative_path: markdown} mapping. Files
  that fail individually are skipped so a single bad file doesn't abort
  the batch.

Tool-schema generation is handled by FastMCP via Python type hints,
so no manual JSON-schema surface is introduced.

Closes microsoft#1424.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@kdjkdjkdj

@microsoft-github-policy-service agree

Development

Successfully merging this pull request may close these issues.

markitdown-mcp: Enhance tool to improve AI agent discoverability for file conversion tasks
