How to Select your Search Engines and Website Sources to Scrape
If you are not planning to scrape your own website list, you can choose the search engines, maps, business directories and social media sites to scrape. You can select multiple platforms to scrape at the same time. Unlike other scrapers that allow you to scrape only one website platform at a time, our website scraper can scrape multiple website sources at once, and the results will be formatted and collated into a single database. In practical terms, this means that you can scrape multiple search engines, Google Maps, business directories and social media sites in one run. However, we recommend that you do not select too many sources, because overlapping sources mostly return duplicate results.
Scraping Dictionary List: Yellow Pages, Yelp, Google Maps
If you are planning to scrape Google Maps, Yellow Pages and Yelp, we strongly recommend that you use our footprints tool, because these website sources use geolocation to present their results. For example, if you are searching for a vape shop, you will need to add different cities (and post codes) to the main keyword, because you will get different vape shops for different areas. If you are scraping Yellow Pages or Yelp, you can enter your keyword and select your country and the cities and states. Click on the plus icon to expand the selection. If you select all states/cities, the website scraper will run a separate search for your keyword + each city.
The same applies to Google Maps (UK and USA). However, please note that for the Google Maps source, the website scraper will scrape your keywords at city level for the UK and at state level for the USA. This may work well for vape shops, for example, but not so well for coffee shops. There are far more coffee shops than vape shops, so a single city-level or state-level search is unlikely to return all of them, and you would be better off using custom city + zip code footprints (coffee shop + city zip code). We will not go into much detail about footprints here. Please see the footprints section inside the guide.
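To illustrate the keyword + city idea described above, here is a minimal Python sketch that builds one search query per keyword and location. It is not part of the website scraper or its footprints tool. The function name build_footprints and the sample cities and post code are assumptions made purely for illustration.

```python
from itertools import product


def build_footprints(keywords, locations):
    """Return one search query per (keyword, location) pair.

    Hypothetical helper for illustration only; the scraper's own
    footprints tool may combine keywords and locations differently.
    """
    queries = []
    seen = set()
    for keyword, location in product(keywords, locations):
        query = f"{keyword.strip()} {location.strip()}"
        key = query.lower()
        # Skip queries that differ only by letter case.
        if key not in seen:
            seen.add(key)
            queries.append(query)
    return queries


# Sample inputs (assumed values, not taken from the scraper).
keywords = ["vape shop"]
locations = ["London", "Manchester", "Leeds LS1"]

queries = build_footprints(keywords, locations)
for query in queries:
    print(query)
```

Each printed line is a separate search, which mirrors how selecting several cities leads to one search per city. Adding post codes to the location list gives the finer coverage suggested for dense categories such as coffee shops.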
Scraping Search Engines
For search engines such as Ask, Yahoo, Bing, Ecosia, AOL, Google, So, DuckDuckGo and Yandex, you can use almost all types of keywords. We recommend that you pick a maximum of two search engines, so that one acts as a fallback option. Usually, Google and Bing work well together. Selecting too many search engines will simply lead to duplicate results and slow down your scraping.
If you are planning to do local scraping, you can select your local region by double clicking on the plus icon next to the search engine name. This will help you to get more targeted results. It is also advisable to use a local IP address or local proxies. For example, if you are scraping the USA, try to use a USA IP address or USA proxies. If you are scraping the UK, try to use UK proxies. If you are scraping globally, just select the international search engine and use any proxies.
Tip: it is advisable to scrape either search engines or dictionary lists (business directories, Google Maps, etc.) but not both together. This is because you will need to use different keywords for search engines and for business directories/Google Maps. If you would like to combine multiple databases, you can use our Excel CSV file merging tool. It will merge your databases and remove duplicates.
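If you are curious what merging databases and removing duplicates involves, the Python sketch below shows the general idea using only the standard library. It is not the Excel CSV file merging tool itself. The column names (website, email), the sample rows and the function name merge_rows are all assumptions, and your exported files may use different columns.

```python
import csv
import io


def merge_rows(sources, key="website"):
    """Merge rows from several CSV sources, keeping the first row per key.

    Hypothetical helper for illustration; "website" is an assumed
    column name, not necessarily one used by the scraper's exports.
    """
    merged = []
    seen = set()
    for source in sources:
        for row in csv.DictReader(source):
            value = (row.get(key) or "").strip().lower()
            # Drop rows whose key value was already seen (case-insensitive).
            if value and value in seen:
                continue
            if value:
                seen.add(value)
            merged.append(row)
    return merged


# In-memory stand-ins for two exported files (sample data, assumed).
search_engine_csv = io.StringIO(
    "website,email\n"
    "example.com,info@example.com\n"
    "shop.example,hi@shop.example\n"
)
directory_csv = io.StringIO(
    "website,email\n"
    "Example.com,sales@example.com\n"
    "other.example,contact@other.example\n"
)

merged = merge_rows([search_engine_csv, directory_csv])
for row in merged:
    print(row["website"], row["email"])
```

In this sketch the first occurrence of a website wins, so the order in which you pass the files decides which duplicate row is kept. To use real files, pass objects opened with open(path, newline="") instead of the in-memory samples.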
Once you have selected your website sources to scrape, do not forget to check "Crawl and Scrape E-Mails from Search Engines".