A powerful Python utility for searching keywords across multiple file types and directories.
File Name: FileCrawler.py
Description: Searches for keywords in various file types (text and PDF) within a directory structure, providing detailed occurrence logs.
Author: Ajay Singh
Version: 1.0
Date: 27-10-2024
-
Multi-format Search: Supports various file types including:
- Documents:
.txt,.doc,.docx,.pdf,.rtf - Code Files:
.py,.java,.cpp,.js,.html,.xml - Data Files:
.csv,.json,.yaml,.yml,.tsv - Configuration:
.ini,.config - Logs:
.log
- Documents:
-
Search Capabilities:
- Case-insensitive keyword matching
- Multiple keyword search support
- Recursive directory scanning
- Line/page number tracking
- Word position recording
-
Output Features:
- Detailed search results
- Optional search summary
- Colored error messages
- Both console and file output
-
Clone the Repository:
git clone https://github.com/yourusername/FileCrawler.git cd FileCrawler -
Install Dependencies:
pip install PyPDF2 colorama
/path/to/search/directory
keyword1--keyword2--keyword3
- Summary display can be configured by changing
DISPLAY_SUMMARYin the script:0: No summary1: Show complete summary
-
Basic Usage:
python FileCrawler.py
-
Interactive Prompts:
- Directory prompt: Enter search directory or press Enter to use existing
- Keywords prompt: Enter keywords separated by '--' or press Enter to use existing
- Continue prompt: 'y' to perform another search, any other key to exit
-
Output Location:
- Results are saved in
output.txt - Real-time results appear in console
- Results are saved in
$ python FileCrawler.py
Please enter the directory to search (leave blank to use existing): /home/documents
Please enter keywords to search (separated by '--', leave blank to use existing): error--warning
Searching for 'error' in directory '/home/documents'...
[Results appear here]
Searching for 'warning' in directory '/home/documents'...
[Results appear here]
Do you want to perform another search? (y/n):-
PDF Search Not Working
- Ensure PyPDF2 is installed:
pip install PyPDF2 - Check PDF file permissions
- Verify PDF is not encrypted
- Ensure PyPDF2 is installed:
-
No Color Output
- Install colorama:
pip install colorama - Check terminal color support
- Install colorama:
-
File Access Errors
- Verify directory permissions
- Check file path validity
- Ensure files aren't locked by other processes
- Fork the repository
- Create a feature branch
- Commit changes
- Push to the branch
- Open a Pull Request
This project is licensed under the MIT License - see the LICENSE file for details.