fix(pptx): handle missing text frames #1811

Open
MukundaKatta wants to merge 1 commit into microsoft:main from
MukundaKatta:codex/markitdown-pptx-none-text

@MukundaKatta MukundaKatta commented Apr 21, 2026

Summary

  • treat shape.text and notes_frame.text as empty strings when python-pptx returns None
  • avoid failing the whole PPTX conversion on malformed or third-party-generated decks
  • add a regression test that patches a PPTX text frame getter to return None
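
The pattern described in the summary can be sketched as follows. This is a minimal illustration, not the code from this PR: `FakeShape`, `safe_text`, and `render_shapes` are hypothetical names, and `FakeShape` is a stand-in so the example runs without python-pptx installed. It assumes the fix amounts to coalescing a `None` text value to an empty string before concatenation, and it mimics the regression test by patching the text getter to return `None`.

```python
from unittest.mock import PropertyMock, patch


class FakeShape:
    """Hypothetical stand-in for a python-pptx shape; not the real class."""

    def __init__(self, text):
        self._text = text

    @property
    def text(self):
        return self._text


def safe_text(obj):
    """Return obj.text, or "" when the attribute is missing or None."""
    return getattr(obj, "text", None) or ""


def render_shapes(shapes):
    """Concatenate shape text line by line, as a converter might."""
    md = ""
    for shape in shapes:
        # Without the coalescing in safe_text, a None here would raise
        # TypeError: can only concatenate str (not "NoneType") to str.
        md += safe_text(shape) + "\n"
    return md


# Regression-style check: force the text getter to return None.
with patch.object(
    FakeShape, "text", new_callable=PropertyMock, return_value=None
):
    patched_output = render_shapes([FakeShape("ignored")])

# Mixed input: one shape with text, one whose text is None.
normal_output = render_shapes([FakeShape("Title"), FakeShape(None)])
```

Coalescing at the point of access keeps the rest of the conversion unchanged: a shape with no usable text contributes an empty line instead of aborting the whole deck.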

Verification

  • PYTHONPYCACHEPREFIX=/tmp/pycache-markitdown python3 -m py_compile packages/markitdown/src/markitdown/converters/_pptx_converter.py packages/markitdown/tests/test_module_misc.py
  • targeted pytest collection was blocked locally: this environment does not have onnxruntime, which magika imports during test startup

Closes #1808

Successfully merging this pull request may close these issues.

PptxConverter: TypeError "can only concatenate str (not NoneType) to str" when shape/notes text is None
