Creating a PHP crawler


2

I am a layman on the subject and would like to know where I can find more information about creating a crawler to download data and images from some websites. I have searched a lot, but so far I have found nothing very detailed.

Thanks in advance for any answers.

1 answer

1

Is PHP necessary?

First, we need to understand your requirements to see whether PHP is really needed in this case, since a crawler simply makes requests as a client. There is a crawler that works very well as a Google Chrome extension, called Web Scraper. You can find it here

I will use an automated process that periodically searches for images.

In that case, you do need PHP or some other server-side programming language. I recommend the PHPCrawl framework, which can be found here. It is simple to use but offers many options. If you are comfortable with PHP, reading the example should be enough to understand how to use it.
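
To give a rough idea of what a crawling framework does under the hood, here is a minimal hand-rolled sketch. It is not PHPCrawl's API. The function name `crawlImages`, the injected `$fetch` callable, and the `example.test` pages are all hypothetical, and the sketch assumes every link and image URL is already absolute.

```php
<?php
/**
 * Breadth-first crawl starting at $startUrl (illustrative sketch only).
 * $fetch is any callable that takes a URL and returns its HTML, or null on failure.
 * Returns the image URLs found, in discovery order, without duplicates.
 */
function crawlImages(string $startUrl, callable $fetch, int $maxPages = 10): array
{
    $queue = [$startUrl];
    $visited = [];
    $images = [];

    while ($queue && count($visited) < $maxPages) {
        $url = array_shift($queue);
        if (isset($visited[$url])) {
            continue; // already crawled this page
        }
        $visited[$url] = true;

        $html = $fetch($url);
        if ($html === null || $html === '') {
            continue; // nothing to parse
        }

        $doc = new DOMDocument();
        // Real pages are often malformed; keep parser warnings out of the output.
        $previous = libxml_use_internal_errors(true);
        $doc->loadHTML($html);
        libxml_clear_errors();
        libxml_use_internal_errors($previous);

        // Collect image URLs (array keys de-duplicate them).
        foreach ($doc->getElementsByTagName('img') as $img) {
            $images[$img->getAttribute('src')] = true;
        }
        // Queue links that have not been visited yet.
        foreach ($doc->getElementsByTagName('a') as $a) {
            $href = $a->getAttribute('href');
            if ($href !== '' && !isset($visited[$href])) {
                $queue[] = $href;
            }
        }
    }
    return array_keys($images);
}

// Stand-in for real HTTP requests: a fake two-page site keyed by absolute URL.
$site = [
    'http://example.test/'  => '<a href="http://example.test/b">b</a><img src="http://example.test/1.jpg">',
    'http://example.test/b' => '<a href="http://example.test/">home</a><img src="http://example.test/2.jpg">',
];
$fetch = fn(string $url) => $site[$url] ?? null;

$found = crawlImages('http://example.test/', $fetch);
print_r($found);
```

In a real setup, `$fetch` could wrap `file_get_contents` or cURL, relative URLs would need to be resolved against the page URL, and the script could be run periodically by a scheduler such as cron.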

What do I need to know to build a crawler?

What I strongly recommend is a good knowledge of regular expressions (regex), since you will need to recognize patterns in order to search accurately for the content you want on the site you are crawling.

  • Using regex to parse HTML pages is usually a bad idea and only leads to frustration. The recommended approach is to use XPath or CSS selectors to locate and extract data, and regex only to modify the extracted values.
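
As an illustration of the point in the comment above, the following sketch uses PHP's built-in `DOMDocument` and `DOMXPath` to locate images and then applies a regex only to tidy the extracted values. The sample HTML, the `gallery` class, and the helper name `extractImageSources` are assumptions made up for this example.

```php
<?php
// Hypothetical sample page; in a real crawler this HTML would come from an HTTP response.
$html = <<<HTML
<html><body>
  <div class="gallery">
    <img src="/images/cat.jpg?size=large" alt="Cat">
    <img src="/images/dog.png" alt="Dog">
  </div>
  <img src="/static/logo.gif" alt="Logo">
</body></html>
HTML;

/**
 * Returns the src attribute of every element matched by the XPath query.
 */
function extractImageSources(string $html, string $query): array
{
    $doc = new DOMDocument();
    // Real-world HTML is often malformed; collect parser warnings instead of printing them.
    $previous = libxml_use_internal_errors(true);
    $doc->loadHTML($html);
    libxml_clear_errors();
    libxml_use_internal_errors($previous);

    $xpath = new DOMXPath($doc);
    $sources = [];
    foreach ($xpath->query($query) as $img) {
        $sources[] = $img->getAttribute('src');
    }
    return $sources;
}

// XPath locates the nodes: only images inside the element with class "gallery".
$sources = extractImageSources($html, '//div[@class="gallery"]//img');

// Regex only modifies the extracted values: here it strips any query string.
$clean = array_map(fn($src) => preg_replace('/\?.*$/', '', $src), $sources);

print_r($clean);
```

The XPath query selects elements by structure, so it keeps working if attributes are reordered or whitespace changes, which is where a regex over raw HTML tends to break.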
