XYMer's Home away from Home

When http://bbs.xlr8yourmac.com is down (i.e. always)

All times are UTC - 8 hours




 Post subject: A PDF Puzzler
Posted: Wed Nov 20, 2019 7:30 pm

Joined: Fri Dec 10, 2010 9:41 am
Posts: 873
Location: Halfway between New York City and Atlantic City
While I await a reply from Devon Technologies, the people behind the EasyFind app, I thought I'd see if anyone here has any ideas.

EasyFind has the ability to search the contents of many types of files. But it stumped me today, when I tried to locate a specific PDF by searching for a text string I knew to be in that file. It did not find it, even later, after I'd already found the file manually, and directed EasyFind to the folder the file was located in. However, EasyFind DOES locate other PDFs by searching content.

So, I'm wondering: what could it be that would make EasyFind locate some PDFs but not others?

Hmmm...

_________________
_____________________
MacMini 2.5 GHz Intel Core i5, 16 GB RAM, OS 10.12.6


 Post subject: Re: A PDF Puzzler
Posted: Wed Nov 20, 2019 7:54 pm
Benevolent Dictator

Joined: Mon Apr 21, 2008 2:03 am
Posts: 16738
Good question. I think I remember something about EasyFind only searching a certain number of bytes into any file, but the memory is so distant & vague...


 Post subject: Re: A PDF Puzzler
Posted: Wed Nov 20, 2019 7:58 pm

Joined: Thu May 15, 2008 8:07 pm
Posts: 2755
Location: Inside Flatus Maximus
While it could be what BDA suggested, the PDF wouldn't happen to be in a folder or volume that is set to private for Spotlight searches, would it? That would prevent anything from finding it, not just EasyFind.

_________________
Official Mac Tech Support Forum Cookie™ (Mint Chocolate Chip)
Guaranteed tasty; Potentially volatile when dipped in WWIII Forum Syrup®
Caution: This cookie bites back.


 Post subject: Re: A PDF Puzzler
Posted: Wed Nov 20, 2019 8:24 pm

Joined: Fri Dec 10, 2010 9:41 am
Posts: 873
Location: Halfway between New York City and Atlantic City
BD: Hmmm... do you mean that the file size could be preventing a search? The PDF in question is 13 MB, and the one EasyFind did locate is only 71 KB.

Tia: In this case, no; all these files are on my internal drive with no restrictions. Also, in this case, I have a Word version of the PDF file in the same folder and EasyFind "saw" the Word version but not the PDF.



 Post subject: Re: A PDF Puzzler
Posted: Thu Nov 21, 2019 2:03 am

Joined: Fri Feb 18, 2011 10:38 pm
Posts: 570
Are you sure the PDF's text is readable and not just an image?

I ran a few quick tests myself, and it found the text in my PDFs that weren't saved as images (the way PDFs made from scanned pages are).

EasyFind 4.9.3

_________________
Mac pro 1,1 - Mac pro 5,1 w/Areca Raid - Macbook pro 8,3 - Snow Leopard, Mountain lion and Sierra.
"We are the Messengers between Time and it's Keeper."


 Post subject: Re: A PDF Puzzler
Posted: Thu Nov 21, 2019 8:58 am
Benevolent Dictator

Joined: Mon Apr 21, 2008 2:03 am
Posts: 16738
As I recall, & I may very well be wrong, I think it searched the first 100K or something like that.

kjk555 brings up a good point though.
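
For what it's worth, here is a tiny Python sketch of why a byte cap would matter for a 13 MB file but not a 71 KB one. Everything in it is an assumption for illustration: the 100 KB cutoff is only the vague recollection above, the function names are made up, and nothing here is confirmed to be how EasyFind actually works.

```python
# Assumed cutoff, taken from the vague "first 100K" recollection; not a documented EasyFind value.
PREFIX_LIMIT = 100 * 1024


def found_in_prefix(data: bytes, needle: bytes, limit: int = PREFIX_LIMIT) -> bool:
    """Search only the first `limit` bytes, the way a capped content search would."""
    return needle in data[:limit]


def found_anywhere(data: bytes, needle: bytes) -> bool:
    """Search the whole file contents."""
    return needle in data


# Stand-ins for the two files in this thread: the search string sits at the very end of each.
small_file = b"x" * (71 * 1024 - 7) + b"invoice"      # about 71 KB in total
large_file = b"x" * (13 * 1024 * 1024) + b"invoice"   # a bit over 13 MB in total

print(found_in_prefix(small_file, b"invoice"))  # the string is inside the first 100 KB
print(found_in_prefix(large_file, b"invoice"))  # the string is far beyond the cutoff
print(found_anywhere(large_file, b"invoice"))   # an uncapped search still finds it
```

If a cap like this were the cause, the hit or miss would depend on where in the file the text lives, not just on the file's total size.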


 Post subject: Re: A PDF Puzzler
Posted: Thu Nov 21, 2019 10:16 am

Joined: Sat May 11, 2019 6:52 pm
Posts: 429
Location: New York City
MANY assume that all PDFs contain actual text. MANY do not, but are simply some form of image file. Be wary when any business says they'll send you a PDF... many times it's simply a graphic scan put INTO PDF format.


 Post subject: Re: A PDF Puzzler
Posted: Thu Nov 21, 2019 3:36 pm

Joined: Fri Feb 18, 2011 10:38 pm
Posts: 570
Easy enough to figure out if a PDF is searchable: just highlight some PDF text, pic, etc. and see if you can copy and paste it.



 Post subject: Re: A PDF Puzzler
Posted: Thu Nov 21, 2019 5:09 pm

Joined: Fri Dec 10, 2010 9:41 am
Posts: 873
Location: Halfway between New York City and Atlantic City
Yes, I'm aware that PDFs can be images only, but the ones I'm dealing with do have text that is highlight-able and copy-able.



 Post subject: Re: A PDF Puzzler
Posted: Thu Nov 28, 2019 11:21 am

Joined: Thu May 15, 2008 8:10 pm
Posts: 3253
Location: Spain
In the affected file, when you 'copy' a word, does it 'paste' the same word?

I've known of cases where copying a word would paste as a word with different characters in it. Searching for the word (as it pasted) would find the correct file.

I can't remember which search application I was using at the time, though.
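
One way this can happen, sketched in Python: a PDF may store text as codes into an embedded subset font rather than as ordinary characters, and if the file lacks a usable code-to-Unicode table, copy/paste and search tools fall back on the raw codes. The numbering scheme and all function names below are hypothetical, chosen only to make the effect visible. It is a toy model, not a real PDF parser.

```python
def build_subset_encoding(text: str) -> dict:
    """Give each distinct character a code in order of first use, starting at 0x41 ('A').

    This numbering is a made-up stand-in for a subset font's private glyph codes.
    """
    encoding = {}
    for ch in text:
        if ch not in encoding:
            encoding[ch] = 0x41 + len(encoding)
    return encoding


def encode(text: str, encoding: dict) -> bytes:
    """Return the bytes that would be stored in the file for `text`."""
    return bytes(encoding[ch] for ch in text)


def extract_without_map(codes: bytes) -> str:
    """Fallback when no code-to-Unicode table exists: read each code as a character."""
    return codes.decode("latin-1")


def extract_with_map(codes: bytes, encoding: dict) -> str:
    """Proper extraction: translate each stored code back through the table."""
    to_unicode = {code: ch for ch, code in encoding.items()}
    return "".join(to_unicode[code] for code in codes)


enc = build_subset_encoding("invoice")
stored = encode("invoice", enc)

print(extract_without_map(stored))   # what a copy/paste or naive search would see
print(extract_with_map(stored, enc)) # what a viewer with the table would see
```

In this toy case the page displays "invoice", but a tool without the table sees a different string, so searching for the pasted version finds the file while searching for the visible word does not.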

_________________
VILA: They missed us! Avon's gadget works!
BLAKE: [to Avon] Is something wrong?
AVON: It just occurred to me, that as the description of a highly sophisticated technological achievement, 'Avon's gadget works' seems to lack a certain style.


 Post subject: Re: A PDF Puzzler
Posted: Thu Nov 28, 2019 9:27 pm

Joined: Thu May 15, 2008 8:13 pm
Posts: 10982
Location: Caught between the moon and NYC
I have seen PDFs in which the text was encoded... strangely... so a simple index of the file contents wouldn't yield the text. I never did figure out what encoding they used.


 Post subject: Re: A PDF Puzzler
Posted: Mon Dec 02, 2019 3:18 pm

Joined: Fri Feb 18, 2011 10:38 pm
Posts: 570
Some PDFs might be encrypted (password protected).

https://labnol.blogspot.com/2007/01/how-to-open-password-protected-pdf.html



 Post subject: Re: A PDF Puzzler
Posted: Mon Dec 02, 2019 3:41 pm
Master

Joined: Sun Apr 20, 2008 5:24 am
Posts: 10102
Location: North of the State of Jefferson
Since (I think) EasyFind probably has to parse the PDF to examine the contents, and the PDF "standard" is a monstrosity that is only implemented in whole by Adobe, it's plausible that the PDF is slightly corrupt, or of too new a version, or otherwise just weird enough that EasyFind's parser dies/gives up before reaching your text. And at the same time it might be just not quite messed up enough to keep Preview from reading it.

If it's naively trying to pull strings out of the PDF without parsing it, then the text may be encoded in a way that's fully compatible with the PDF standard, but which isn't plain enough for a dumb parser to identify. I don't remember what kind of encoding rules PDF may use, but I suspect text objects can be gzip'd and separately encoded.
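
A minimal Python sketch of that second scenario, using only the standard library. PDF content streams are commonly compressed with the FlateDecode filter, which is zlib-style deflate, so the visible text never appears as plain bytes in the file. The toy object below is a simplified stand-in and not a complete, valid PDF; the helper names are made up for illustration, and whether EasyFind searches this way is unknown.

```python
import re
import zlib


def build_toy_pdf_object(text: str) -> bytes:
    """Build a PDF-like stream object whose content is Flate-compressed."""
    content = f"BT /F1 12 Tf ({text}) Tj ET".encode("latin-1")
    data = zlib.compress(content)
    header = b"<< /Length %d /Filter /FlateDecode >>\n" % len(data)
    return b"4 0 obj\n" + header + b"stream\n" + data + b"\nendstream\nendobj\n"


def naive_search(raw: bytes, needle: str) -> bool:
    """What a 'pull strings out of the file' search does: look at the raw bytes."""
    return needle.encode("latin-1") in raw


# Simplified stream finder for this toy object; real PDFs need a proper parser.
STREAM_RE = re.compile(rb"stream\n(.*?)\nendstream", re.DOTALL)


def search_after_inflating(raw: bytes, needle: str) -> bool:
    """Inflate each stream first, then search the decompressed bytes."""
    target = needle.encode("latin-1")
    for match in STREAM_RE.finditer(raw):
        try:
            inflated = zlib.decompress(match.group(1))
        except zlib.error:
            continue  # not a Flate stream (or damaged); skip it
        if target in inflated:
            return True
    return False


raw = build_toy_pdf_object("quarterly invoice 2019")
print(naive_search(raw, "invoice"))            # raw bytes are compressed
print(search_after_inflating(raw, "invoice"))  # text is visible after inflating
```

Older or simpler PDFs often keep their streams uncompressed, which would explain a naive search working on some files and failing on others.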

You could try opening the PDF in Preview, then printing it, but in the print dialog choose "Save as PDF..." to make a copy of it. This seems to rewrite the internal structure of the PDF and all of its various objects, so you might end up with something searchable at the end. It does, however, remove some PDF features, so if you have (for example) a fillable PDF, it might not be fillable at the end. The file size will also probably change, sometimes dramatically. But it would be an interesting test.

These sorts of PDF problems used to be much more common. Often broken PDFs wouldn't print, or would crash viewers, etc. It's better now, but not perfect.

- Anonymous


 Post subject: Re: A PDF Puzzler
Posted: Mon Dec 02, 2019 4:19 pm

Joined: Thu May 15, 2008 8:13 pm
Posts: 10982
Location: Caught between the moon and NYC
On a PC, PDFs can still refuse to print if you're using PCL or other vendor-specific drivers. But sometimes the reverse is true: it prints in PCL but not PS. Depends on the PDF. I got tired, so I gave the troublesome people two printers, one set up with PCL, one set up with PS, and told them if it doesn't print in one, try the other. And I set PCL to be the default because PS is "too slow" according to them (takes a couple seconds for the first page to start printing, the horror, the horror).

