# How to avoid getting blocked

Scanning and fetching data directly from store websites has several advantages:

* A simple, clear process for importing products.
* You can work with stores that have no API or data feeds.
* Up-to-date and complete data.

But large-scale data extraction and product data parsing come with their own challenges. One of them is that some websites implement anti-bot mechanisms: your bot can be blocked if it sends too many requests per hour or day. Usually, the restriction is imposed on your hosting's IP address.

The most common reason for blocking bots is to prevent heavy automated traffic that could affect website performance. Keep in mind that fetching a product or updating a price each requires a separate HTTP request to the target website.

That's why the general rule for stable use of External Importer is to ***send as few requests to the target websites as possible***!

### Anti-blocking settings

External Importer was developed to be a good bot, not to cause inconveniences to other websites. For example, the plugin will follow the rules from *robots.txt* and has request limits. To find these settings, go to `External Importer > Settings > Extractor`.
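A well-behaved bot checks *robots.txt* before fetching a page. The plugin handles this internally; the following Python sketch only illustrates the general technique using the standard library, and the `MyBot` user agent and example rules are made up:

```python
from urllib.robotparser import RobotFileParser

def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Check whether the given robots.txt rules permit fetching the URL."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)

# Hypothetical rules: everything is allowed except the checkout pages.
rules = """\
User-agent: *
Disallow: /checkout/
"""

print(is_allowed(rules, "MyBot", "https://example.com/product/123"))
print(is_allowed(rules, "MyBot", "https://example.com/checkout/cart"))
```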

![](https://2204606725-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MJHhS3qgDA1lCM6b1Nw%2F-MJHz1DuNuexmGST3Nv7%2F-MJHzhzffekgFHdvkoVd%2Fexternal-importer-10.png?alt=media\&token=81b3db90-4d66-46a2-9cfb-9f5fcabc7169)

Please pay attention to the `Daily limit`. It's an important value. The plugin counts all requests to each domain over a 24-hour period and blocks automated queries that exceed this limit.
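Conceptually, a daily limit like this works as a sliding-window counter per domain. Here is a minimal Python sketch of the idea (the plugin's actual bookkeeping may differ):

```python
import time
from collections import defaultdict, deque

class DailyLimiter:
    """Allow at most `limit` requests per domain in a sliding 24-hour window.

    Illustrative sketch only, not the plugin's implementation.
    """
    def __init__(self, limit, window=24 * 3600):
        self.limit = limit
        self.window = window
        self.history = defaultdict(deque)  # domain -> request timestamps

    def allow(self, domain, now=None):
        now = time.time() if now is None else now
        q = self.history[domain]
        # Drop timestamps that have aged out of the window.
        while q and now - q[0] >= self.window:
            q.popleft()
        if len(q) >= self.limit:
            return False  # over the daily limit; block this request
        q.append(now)
        return True
```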

There's also an option to block new requests for 1 or 24 hours if several consecutive errors are received from the target website.
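This consecutive-error protection behaves like a simple circuit breaker. A hedged Python sketch of the pattern (class name and thresholds are invented for illustration):

```python
import time

class ErrorBlocker:
    """Pause requests to a domain after several consecutive errors.

    Illustrative sketch of the circuit-breaker idea, not the plugin's code.
    """
    def __init__(self, threshold=3, block_seconds=3600.0):
        self.threshold = threshold
        self.block_seconds = block_seconds
        self.errors = {}         # domain -> consecutive error count
        self.blocked_until = {}  # domain -> unblock timestamp

    def is_blocked(self, domain, now=None):
        now = time.time() if now is None else now
        return now < self.blocked_until.get(domain, 0.0)

    def record(self, domain, ok, now=None):
        now = time.time() if now is None else now
        if ok:
            self.errors[domain] = 0  # a success resets the error streak
            return
        self.errors[domain] = self.errors.get(domain, 0) + 1
        if self.errors[domain] >= self.threshold:
            self.blocked_until[domain] = now + self.block_seconds
            self.errors[domain] = 0
```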

{% hint style="info" %}
Only automated requests, price updates, or auto import will be blocked. These limits aren't considered if you extract the products by manually entering the URL on the Product Import page.
{% endhint %}

So please don't try to import too many products at a time from one source. It's better to spread big tasks over several days.

We also recommend setting as long a pause as possible between requests for each product:

![](https://2204606725-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MJHhS3qgDA1lCM6b1Nw%2F-MJHz1DuNuexmGST3Nv7%2F-MJHzuATgl_jb_qWy2hK%2Fexternal-importer-11.png?alt=media\&token=c00602c6-f40f-412b-a451-a00178a6e93f)
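Pausing between requests simply means sleeping for a while before each fetch. This sketch shows the idea in Python, with a little random jitter so requests don't arrive at perfectly regular intervals; the function name and default values are illustrative, not plugin settings:

```python
import random
import time

def polite_fetch(urls, fetch, pause_seconds=10.0, jitter=0.3):
    """Fetch URLs sequentially, sleeping between requests.

    `fetch` is any callable taking a URL. The pause length and jitter
    here are made-up examples.
    """
    results = []
    for i, url in enumerate(urls):
        if i:  # no pause before the very first request
            time.sleep(pause_seconds * (1 + random.uniform(-jitter, jitter)))
        results.append(fetch(url))
    return results
```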

What else can you do to avoid getting blocked?

{% hint style="warning" %}
Don't update prices too often.
{% endhint %}

![](https://2204606725-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MJHhS3qgDA1lCM6b1Nw%2F-MJHz1DuNuexmGST3Nv7%2F-MJI-ECmy49ObDfVa1uU%2Fexternal-importer-12.png?alt=media\&token=f90c78ad-b34f-49b1-a90f-13c898dd545f)

### What can you do if you're already blocked?

Your server's IP may be temporarily or permanently blocked on the target website's side for the following reasons:

* You're sending too many requests.
* You use a shared IP with a bad reputation or bad hosting neighbors.
* The website has low bot tolerance and blocks bots globally.
* The country where your server is located is blocked on the website.

Blocking can manifest as the following HTTP errors:

* 403 - Forbidden
* 503 - Service Unavailable
* 429 - Too Many Requests
* 408 - Request Timeout
* 400 - Bad Request

![](https://2204606725-files.gitbook.io/~/files/v0/b/gitbook-legacy-files/o/assets%2F-MJHhS3qgDA1lCM6b1Nw%2F-MJHz1DuNuexmGST3Nv7%2F-MJI-7Y0BA5SARE9Qs3Y%2Fexternal-importer-12-a.png?alt=media\&token=f2de2117-f9ab-4534-9141-e43453cd8308)
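When a scraper sees one of these statuses, the sensible reaction is to back off before retrying rather than hammering the site. A hedged Python sketch of that logic (the status set mirrors the list above; the delay formula is an assumption, not the plugin's behavior):

```python
# HTTP statuses from the list above that commonly indicate blocking.
BLOCK_STATUSES = {400, 403, 408, 429, 503}

def seconds_to_wait(status, retry_after=None, attempt=0):
    """Return a back-off delay for a block-like HTTP status, or None otherwise."""
    if status not in BLOCK_STATUSES:
        return None
    if retry_after and retry_after.isdigit():
        return float(retry_after)  # honor a Retry-After header if the site sent one
    return min(3600.0, 60.0 * 2 ** attempt)  # double each attempt, cap at 1 hour
```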

What you can do:

* Temporarily disable price updates and don't send new requests. Temporary blocks are usually lifted after a day.
* Try negotiating with the website owners to whitelist your IP. Some advertisers might do you a favor as you promote their products and generate traffic.
* Use a dedicated IP instead of a shared IP.
* Change your hosting. We don't recommend large cloud providers like Amazon Web Services or Google Cloud, as some websites block entire subnets of these services.
* Use proxies.
* Use built-in [crawling services](https://ei-docs.keywordrush.com/extracting-products/crawling-services).

{% content-ref url="crawling-services" %}
[crawling-services](https://ei-docs.keywordrush.com/extracting-products/crawling-services)
{% endcontent-ref %}
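If you go the proxy route from the list above, scrapers typically rotate through a pool of proxy addresses so no single IP carries all the traffic. A minimal Python sketch of such a rotator (the proxy URLs are placeholders, and the returned mapping follows the common `{"http": ..., "https": ...}` convention used by HTTP client libraries):

```python
import itertools

def make_proxy_rotator(proxy_urls):
    """Cycle through a pool of proxy URLs, returning a proxies mapping per call.

    Illustrative sketch; the addresses below are placeholders.
    """
    pool = itertools.cycle(proxy_urls)
    def next_proxies():
        url = next(pool)
        return {"http": url, "https": url}
    return next_proxies

rotate = make_proxy_rotator([
    "http://proxy-a.example:8080",
    "http://proxy-b.example:8080",
])
```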

