Deleting unwanted file types

This was new to me, so I thought I would share the idea. During crawling, I noticed that JavaScript files (.js) were being crawled. There are other file types like this that I do not find useful in my search results, so here is how I deleted them.

Index Administration > Index Deletion

Expression used:
.*.(js|json|xml|rss|css|ico|zip|tar|gz|bz2|rar|7z|iso|exe|dll|bin|apk|so|o|class|jar|woff|woff2|ttf|eot|otf|DS_Store|Thumbs.db)
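
For anyone who wants to check what an expression like this catches before pressing the button, here is a rough sketch in Python. It makes some assumptions: the search engine may use a different regex engine, I am assuming the expression has to match the whole URL, the extension list is shortened, and the example.com URLs are made up. It also compares the expression as posted with a hypothetical stricter variant (STRICT, my own name) that escapes the dot before the extension, because an unescaped dot matches any character.

```python
import re

# The expression as posted, with the extension list shortened for readability.
POSTED = r".*.(js|json|xml|css|ico|zip)"

# Hypothetical stricter variant: the dot before the extension is escaped,
# so it only matches a literal "." followed by one of the extensions.
STRICT = r".*\.(js|json|xml|css|ico|zip)"


def matches(pattern: str, url: str) -> bool:
    """Return True if the whole URL matches the pattern (assumed semantics)."""
    return re.fullmatch(pattern, url) is not None


# Made-up sample URLs, for illustration only.
samples = [
    "http://example.com/static/app.js",
    "http://example.com/feed.xml",
    "http://example.com/about-nodejs",
    "http://example.com/index.html",
]

for url in samples:
    print(url, matches(POSTED, url), matches(STRICT, url))
```

Under these assumptions, the posted expression also matches the "about-nodejs" page, since any character may stand before "js", while the stricter variant leaves it alone. Note that with the escaped dot, an entry such as Thumbs.db would only match a name like ".Thumbs.db", so that entry would need separate handling.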

I wanted all of those file types off the hard drive to make more room for human-readable content.

Next, I pressed “Engage Deletion”.

A few minutes later, the machine showed about 2,000 documents removed.


I’m happy to have found an easy way to delete file types.

I am now also using the same expression to block those file types from being crawled, so hopefully I will not need to delete them again.

If you want to do something similar, the field to use is: must-not-match
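
To illustrate the idea behind a must-not-match field, here is a minimal sketch in Python. It is not the crawler's actual code: should_crawl and the queue of example.com URLs are made up, the extension list is shortened, and I am assuming a URL is skipped only when the whole URL matches the expression.

```python
import re

# Expression from the post (shortened), reused here as an exclusion filter.
MUST_NOT_MATCH = re.compile(r".*.(js|css|ico|zip|woff|woff2)")


def should_crawl(url: str) -> bool:
    """Crawl a URL only when it does NOT fully match the exclusion pattern."""
    return MUST_NOT_MATCH.fullmatch(url) is None


# Hypothetical crawl queue, for illustration only.
queue = [
    "http://example.com/",
    "http://example.com/style.css",
    "http://example.com/article.html",
    "http://example.com/fonts/a.woff2",
]

to_crawl = [url for url in queue if should_crawl(url)]
print(to_crawl)
```

With these assumptions, the stylesheet and the font file are dropped from the queue before they are ever fetched, which is why the deletion step should not be needed again.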
