C-Command Software Forum

Searching for text in MS Word files

I am hoping my issue is merely a new user’s confusion. I am trying to build a library containing a wide variety of filetypes: txt, doc, docx, xls, pdf, jpg, and others. I am trying to emulate a library built by a colleague, Dave Kindem, who had nothing but praise for EagleFiler. The library currently shows it has 10,164 records and OS X Finder says the entire library’s files occupy 54+GB. I have plenty of storage so these sizes are not a problem on my Mac.

My issue is EF is not finding text I know is in certain MS Word documents. That is, I put a word into the search field and Search Anywhere. The document that I know has it many times does not appear. However, when I manually select the file in the Records List, and use cmd+F to bring up a Find window, place my word into the Find field and use the Contains option, it finds all instances instantly. So it’s working at the record level but not the library level. The Word files were created in a variety of versions over many years. If it is relevant, I have Word for Mac 15.24 on my machine.

So far, I have tried opening the library with Option+Command to reindex the entire library. That seemed to improve searching in general but has not solved this puzzle. I have searched this forum but see no recent posts on this topic; I’ve noted many improvements have been made in how EF treats Word files over time. I have proved that Excel files are being indexed properly - no issues there. I have also proved that many PDF documents are indexed and word searches are successful for the records I’ve tested.

Context: I’m on a Mac desktop with 32GB RAM, approx. 15TB HD storage, running OS X 10.11.6, and EF 1.7. Everything’s running fine.

Am I exceeding the sizes supported by EF? There are many records I could place in separate libraries if necessary. I have also tried limiting the search scope to a folder containing only 207 records without success.

Will appreciate any thoughts.
Doug

No, there’s no limit that you’re running into. I recommend that you create a new test library and import one of the problem Word files into it. If the file is findable (when searching Records) in the test library, then that would show that there’s something wrong with the main library. If the file is not findable, then there is probably an issue with the particular Word file, in which case you could send it to me so that I can look into how the indexer is handling it.

Test successful

Good test. Created new library, imported the file that was not yielding search results. By itself in a new library, Search is working on the document.

Does this indicate I should build a new library and test Search as additional docs are added? When I built the one with problems, I simply captured or imported files wholesale in large groups.

Thank you,
Doug

Library reindexed on morning opening

Michael,
Am curious what you would make of this: This morning, after doing the experiment you suggested, I opened the library I had asked about. I happened to have the Activity panel open from yesterday. When I opened that library, EF started a process of indexing all 10K+ records over a period of about 30 minutes. At the end of that time, the search I reported as not working now works. It works from the Search field, and when I selected the Word doc I knew to contain the target, all instances were highlighted when it opened in the Record Viewer. I’ve tried several other searches and all have worked successfully.

May I conclude that EF somehow detected imperfect indexes and decided to rebuild on its own? Does the act of closing and then opening the library cause anything beyond checking for new files or newly edited files?

Would you have any remaining concerns about my indexes that I should investigate? For now it appears all is working OK.

Thank you again for helping.
Doug

Possibly. If EagleFiler detects that an index file is damaged, it will start a new one, redo the indexing, and move the damaged index to a side folder inside the .eflibrary. Or, it could simply be that the indexing didn’t finish after you imported the files, or perhaps there was a transient error. When you re-open the library, EagleFiler does a full scan of the files and updates any that are not current in the index.

It also checks for missing files, syncs tags with the Finder, and backs up the metadata from the database to XML.
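To illustrate the general idea of a scan that updates only out-of-date index entries and checks for missing files, here is a minimal sketch. It is not EagleFiler's code: the function names, the in-memory `index` dictionary, and the use of file modification times to decide what is "current" are all assumptions made for the example.

```python
import os


def find_stale_files(root, index):
    """Return files under root that are new or changed since they were indexed.

    index is a hypothetical dict mapping file path -> modification time
    recorded when the file was last indexed.
    """
    stale = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # A file is stale if it was never indexed or its mtime differs.
            if index.get(path) != os.path.getmtime(path):
                stale.append(path)
    return sorted(stale)


def find_missing_files(index):
    """Return indexed paths that no longer exist on disk."""
    return sorted(path for path in index if not os.path.exists(path))


def reindex(root, index):
    """Bring the index up to date and return the files that were reindexed."""
    stale = find_stale_files(root, index)
    for path in stale:
        # A real indexer would extract and store the file's text here;
        # this sketch only records the modification time.
        index[path] = os.path.getmtime(path)
    return stale
```

With an empty `index`, the first call to `reindex` touches every file, which would be consistent with a long indexing pass after an index is started over; later calls touch only files that have changed.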

No.

Great information

Thank you very much.