AppleScript that will rename PDF files based on words that FOLLOW a phrase in a PDF

I need to run an AppleScript to rename a batch of PDF files (134!).


I’m all set up in Automator.


What I need is an AppleScript that will do this: rename each file based on any words after the phrase “Receipt for”, i.e., what’s on the rest of that line (a person’s name), and only that line, in the PDF.


Example: Script finds "Receipt for Bob Ross" in the PDF and renames the file: "Bob Ross.pdf".


Can you help with that?


(I'm not clever enough to write a script that will do "words that follow" a searched phrase.)


Explanation: This is a stack of printed paper for which the original data file was lost.


Thank you all.



MacBook Pro 15″, macOS 11.7

Posted on Jun 18, 2025 07:23 PM

8 replies

Jun 19, 2025 08:22 AM in response to bigbunny

Several months ago, I wrote an AppleScript/Objective-C script that parsed scanned PDFs for a trigger string and then renamed the PDFs based on the text (invoice number) that followed what would be your "Receipt for " string. This can be altered to work with your PDFs. However…


When you scan paper to a PDF, the result is a PDF container around an image of the scanned text. One cannot search an image, so the PDF needs to be processed with Optical Character Recognition (OCR) to produce an exact text registration above the image text. Then that text registration layer can be successfully searched and the names captured for renaming the individual PDF.
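Once a PDF has a searchable text layer, the "words that follow" step is a short pattern match. The sketch below is only an illustration in Python, not the AppleScript/Objective-C script mentioned above; it assumes the page text has already been extracted into a string by some other tool, and the function name `name_after_phrase` is hypothetical.

```python
import re


def name_after_phrase(page_text, phrase="Receipt for"):
    """Return the rest of the line that follows `phrase`, or None if not found."""
    # Match the phrase, skip spaces or tabs, then capture up to the end of that line only.
    pattern = re.escape(phrase) + r"[ \t]*([^\r\n]*)"
    match = re.search(pattern, page_text)
    if match is None:
        return None
    name = match.group(1).strip()
    # Treat a phrase with nothing after it on the same line as "not found".
    return name or None
```

For text such as "Receipt for Bob Ross" followed by other lines, this would capture only "Bob Ross". OCR output is often imperfect, so a real run would likely need to tolerate misread characters in the trigger phrase.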


Did your scanning process also automatically OCR the PDFs so they are ready for processing? Are you still on macOS 11.* (Big Sur) or have you upgraded to a newer major version of macOS (e.g. Sonoma or Sequoia)?


I recommend that the script does not use spacing in the PDF renaming, but rather other punctuation as in "Bob_Ross.pdf."
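A minimal sketch of that renaming rule, written in Python for illustration rather than AppleScript. The helper names `safe_stem` and `rename_pdf` are hypothetical, and the numbered suffix for duplicate names is an added assumption (two receipts for the same person would otherwise collide).

```python
import re
from pathlib import Path


def safe_stem(person_name):
    """Turn 'Bob Ross' into 'Bob_Ross': no spaces or punctuation in the file name."""
    # Collapse every run of characters that are not letters, digits, underscores, or hyphens.
    stem = re.sub(r"[^\w-]+", "_", person_name).strip("_")
    if not stem:
        raise ValueError("no usable characters in name")
    return stem


def rename_pdf(pdf_path, person_name):
    """Rename the PDF in place and return its new path, never overwriting a file."""
    source = Path(pdf_path)
    stem = safe_stem(person_name)
    target = source.with_name(f"{stem}.pdf")
    counter = 2
    # If the name is taken by another file, try Bob_Ross_2.pdf, Bob_Ross_3.pdf, ...
    while target != source and target.exists():
        target = source.with_name(f"{stem}_{counter}.pdf")
        counter += 1
    if target != source:
        source.rename(target)
    return target
```

Trying this on copies of the PDFs first is a sensible precaution, since a rename is not easily undone across 134 files.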

Jun 19, 2025 08:53 AM in response to bigbunny

There are task-oriented websites where you can get bids for specific small programming tasks.


But given it’s all of 134 entries, and given the relative complexity of this task (searching an OCR of an image stored in a PDF container, seeking some amount of text immediately following some target text), somebody can probably do that task manually more cheaply and more quickly than an app can be written and debugged and tested and explained and supported. Maybe using Quick Look here, rather than launching Preview for each file.


If you want to outsource this, various task-oriented websites (including Amazon) offer these “Mechanical Turk” services, as well.


For programming, I’d usually use bash or zsh here and not AppleScript, but Shortcuts has access to Live Text (OCR), which is available in macOS 12 and later.

https://blog.greg.technology/2024/01/02/how-do-you-ocr-on-a-mac.html


Here’s some Swift code that uses Live Text (OCR), from an image (not a PDF): https://github.com/MatthiasWinkelmann/macocr/blob/main/Sources/macocr/Runner.swift


Adobe reportedly has tools:

https://www.adobe.com/acrobat/hub/use-ocr-to-read-text-from-image.html


But absent somebody who already has this PDF-receipt-scanning-text-retrieval app written, renaming the 134 files manually is going to be faster. And purchasing that hypothetically-already-existing app may still cost more than manually renaming 134 files.


PS: Spotlight may be able to pick up this text automatically on recent macOS too, so you may be able to ignore all of this renaming work, using Spotlight to search for the name of the party associated with the receipt when you need it. If you’re on macOS 12 or newer, try it with your particular image-containing PDF files.


PPS: yeah, if you’re just getting started with scripting, probably avoid using spaces in file names with scripts.

Jun 19, 2025 09:12 AM in response to MrHoffman

Last Fall, I wrote an ASOC script that searches for a predefined text string in a PDF, captures what follows (in that case an invoice number), and then renames the PDF to that invoice number. The changes to that script would be minimal and reasonable for the OP's scenario. I also advised how to freely OCR the existing PDFs with Ghostscript and Tesseract language library files since Ghostscript v10.04+ has an OCR writing device in it that works in concert with free Tesseract library files when a specialized environment variable is present.


I was hoping the OP had upgraded to Sonoma or Sequoia as Apple's Preview now has PDF OCR capability on export, just that Apple being Apple does not call it OCR. Otherwise, doing OCR on existing scanned PDFs would require extra cost software.
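For the Ghostscript route, a rough sketch of what the invocation might look like, wrapped in Python for illustration. The specifics here are assumptions to verify against your installed Ghostscript's documentation: the device name `pdfocr8`, the `-sOCRLanguage` option, and `TESSDATA_PREFIX` as the environment variable pointing at the Tesseract language files. The function names and paths are hypothetical.

```python
import os
import subprocess


def build_ocr_command(scanned_pdf, searchable_pdf, language="eng", dpi=300):
    """Build (without running) a Ghostscript command that writes an OCR'd PDF."""
    return [
        "gs",
        "-sDEVICE=pdfocr8",           # assumed OCR output device; confirm with `gs -h`
        f"-sOCRLanguage={language}",  # assumed option selecting the Tesseract language
        f"-r{dpi}",                   # rendering resolution used for recognition
        "-o", str(searchable_pdf),
        str(scanned_pdf),
    ]


def ocr_pdf(scanned_pdf, searchable_pdf, tessdata_dir):
    """Run Ghostscript with the language-file directory passed via the environment."""
    # Assumption: TESSDATA_PREFIX is the "specialized environment variable" in question.
    env = dict(os.environ, TESSDATA_PREFIX=str(tessdata_dir))
    subprocess.run(build_ocr_command(scanned_pdf, searchable_pdf), env=env, check=True)
```

Keeping command construction separate from execution makes it easy to print and inspect the command before running it on the whole batch.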

Jun 19, 2025 09:22 AM in response to VikingOSX



I still think 134 files is going to be faster to rename manually. Yeah, I risk my programmer card writing that. 😉
