docs: correct image converter description in README (EXIF + LLM, not OCR) #1837
Open
sjhddh wants to merge 1 commit into microsoft:main from
…OCR)
The supported-formats list claims "Images (EXIF metadata and OCR)", but the built-in `ImageConverter` does not perform OCR. Per the converter's own docstring, it extracts EXIF metadata (when exiftool is available) and generates a description via a multimodal LLM when an `llm_client` is configured.
OCR is only available through the separate Azure Document Intelligence converter (`[az-doc-intel]` optional dependency), which is documented elsewhere in the README.
This mislabeling has caused recurring user confusion, visible in issues microsoft#1601, microsoft#1344, microsoft#1170, and microsoft#255, where users expected OCR to work out of the box on images and scanned PDFs.
The one-word change brings the README in line with the actual behavior of `ImageConverter`.
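What
The supported-formats list in `README.md` claims "Images (EXIF metadata and OCR)", but the built-in `ImageConverter` does not perform OCR. Per the converter's own docstring, it extracts EXIF metadata (when exiftool is available) and generates a description via a multimodal LLM when an `llm_client` is configured. OCR is only available through the separate Azure Document Intelligence converter (`[az-doc-intel]` optional dependency), which is already documented in its own section of the README.
Why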
This one-word misstatement has caused recurring user confusion. Recent examples: #1601, #1344, #1170, and #255, where users expected OCR to work out of the box on images and scanned PDFs.
Users install `markitdown[all]`, feed it a JPEG, and expect OCR output, but what they get is either EXIF-only output (no LLM client configured) or an LLM-generated description (not OCR).
Change
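One line in `README.md` changes: the Images entry in the supported-formats list now describes EXIF metadata plus an LLM-generated description instead of OCR.
This matches `ImageConverter`'s own docstring and the pattern used elsewhere in the list (e.g. "Audio (EXIF metadata and speech transcription)", which is likewise conditional and documented).
Related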
Aware of #1608, which has overlapping intent (clarifying OCR availability), but that PR points users at a `markitdown-ocr` plugin that is not in this repo. This change is minimal and factual: it describes what the in-tree converter actually does. Happy to defer if maintainers prefer #1608's framing, or to fold this into a broader docs rewrite.
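For reviewers who want to see the behavior described under Why, here is a minimal sketch. The helper names `expected_image_sections` and `convert_image` are hypothetical and not part of markitdown; the first only models the two conditions named in the docstring (exiftool available, `llm_client` configured), and the second wraps the `MarkItDown(llm_client=..., llm_model=...)` usage shown in the README. It assumes that API is unchanged in the installed version.

```python
from typing import List, Optional


def expected_image_sections(exiftool_available: bool,
                            llm_client_configured: bool) -> List[str]:
    """Hypothetical helper: lists what an image conversion can contain.

    Mirrors the two conditions described in this PR; OCR is never one of them.
    """
    sections = []
    if exiftool_available:
        sections.append("EXIF metadata")
    if llm_client_configured:
        sections.append("LLM-generated description")
    return sections


def convert_image(path: str, llm_client=None,
                  llm_model: Optional[str] = None) -> str:
    """Hypothetical wrapper around markitdown's documented entry point."""
    # Imported lazily so this sketch loads even without markitdown installed.
    from markitdown import MarkItDown

    if llm_client is not None:
        # With a client configured, the image converter can request a description.
        md = MarkItDown(llm_client=llm_client, llm_model=llm_model)
    else:
        # Without a client, only EXIF metadata can appear (if exiftool is found).
        md = MarkItDown()
    return md.convert(path).text_content
```

With markitdown installed, calling `convert_image("photo.jpg")` (a placeholder path) on a JPEG containing text is a quick way to confirm that the text is not transcribed.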