Job Description
We are seeking a skilled Data Crawling Specialist to join our team. The ideal candidate will be responsible for developing and maintaining web crawlers to collect data from various sources, ensuring high-quality data extraction and storage.
Key Responsibilities
- Crawl data from a variety of sources, including static web pages, dynamic (JavaScript-rendered) pages, and API endpoints.
- Handle anti-crawling countermeasures such as User-Agent spoofing, proxy pooling, CAPTCHA bypassing, cookie encryption, and request-body parameter encryption to improve crawl success rates.
- Parse web pages and extract information using techniques such as XPath, CSS selectors, and regular expressions.
- Store crawled data in databases such as MySQL, MongoDB, Redis, and SelectDB.
- Write data cleaning and deduplication code to improve data quality.
- Monitor crawler health, optimize crawling strategies, and ensure stable, reliable data collection.
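As a rough illustration of the extraction techniques listed above, here is a minimal sketch using only the Python standard library (`xml.etree.ElementTree` supports a limited subset of XPath, including attribute predicates). The markup snippet and field names are hypothetical; a production crawler would typically run lxml, parsel, or a headless browser against fetched pages:

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical product-listing snippet; in practice this would come from an HTTP fetch.
HTML = """
<div class="listing">
  <div class="item"><span class="name">Widget A</span><span class="price">$19.99</span></div>
  <div class="item"><span class="name">Widget B</span><span class="price">$4.50</span></div>
</div>
"""

def extract_items(markup: str) -> list[dict]:
    """Extract (name, price) pairs using XPath-style queries plus a regex."""
    root = ET.fromstring(markup)
    items = []
    # [@class='item'] is an XPath attribute predicate supported by ElementTree.
    for node in root.findall(".//div[@class='item']"):
        name = node.find("span[@class='name']").text
        price_text = node.find("span[@class='price']").text
        # A regular expression pulls the numeric amount out of the price string.
        amount = float(re.search(r"[\d.]+", price_text).group())
        items.append({"name": name, "price": amount})
    return items
```

Real-world HTML is rarely well-formed XML, which is why lenient parsers such as lxml's HTML mode are usually preferred; the structure of the extraction logic, however, stays the same.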
 
Job Requirements
- Proven experience in web scraping and data crawling techniques.
- Strong knowledge of anti-crawling mechanisms and strategies for handling them.
- Proficiency in data extraction techniques such as XPath, CSS selectors, and regular expressions.
- Experience with databases such as MySQL, MongoDB, Redis, or SelectDB.
- Ability to write efficient data cleaning and deduplication scripts.
- Strong problem-solving skills and attention to detail.
- Experience in monitoring and optimizing crawler performance is a plus.
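As a small sketch of the deduplication skill mentioned above, the snippet below fingerprints each record with a SHA-256 hash of its canonical JSON form, so field order does not affect duplicate detection. The record fields are hypothetical, and at crawl scale the seen-set would typically live in Redis rather than in process memory:

```python
import hashlib
import json

def record_fingerprint(record: dict) -> str:
    """Stable fingerprint: serialize with sorted keys so field order doesn't matter."""
    canonical = json.dumps(record, sort_keys=True, ensure_ascii=False)
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()

def deduplicate(records: list[dict]) -> list[dict]:
    """Keep the first occurrence of each distinct record; drop later duplicates."""
    seen: set[str] = set()
    unique = []
    for rec in records:
        fp = record_fingerprint(rec)
        if fp not in seen:
            seen.add(fp)
            unique.append(rec)
    return unique
```

Hashing a canonical serialization keeps memory per record constant regardless of record size, which matters when deduplicating millions of crawled rows.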