Set known crawler library rules

2022-02-22 05:44:58

WAF identifies and classifies bot programs and applies a targeted traffic management policy to each class. For example, it allows traffic from search engine crawlers and blocks malicious crawling of core data such as product information, flash-sale prices, and inventory information. This reduces the resource consumption and business data scraping caused by malicious bots, while ensuring that friendly bots (such as search engine and advertising crawlers) continue to work normally. This page describes how to use the known crawler library in the BOT management module.

Precondition

Background Information

WAF provides a library of known crawler types. It covers 11 known public BOT categories and more than 300 BOT sub-categories, including search engines, speed measurement tools, content aggregation, scanning, and web crawlers. You can set a protective action (observation, man-machine interaction, or interception) for each public BOT category as needed. WAF then handles BOT requests that match a public category according to the configured action.
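The relationship between categories and actions can be pictured as a simple lookup. The following is a minimal sketch, not the WAF's actual configuration format: the category names are taken from the examples above, while `category_actions`, `DEFAULT_ACTION`, and `action_for` are hypothetical names introduced for illustration. Observation is used as the fallback because the operation steps state that it is enabled by default for all BOT rules.

```python
from enum import Enum


class Action(Enum):
    """The three protective actions named in this page."""
    OBSERVATION = "observation"
    MAN_MACHINE_INTERACTION = "man-machine interaction"
    INTERCEPTION = "interception"


# Assumption: Observation is the default for every rule, as stated in the steps.
DEFAULT_ACTION = Action.OBSERVATION

# Hypothetical per-category choices; the real library defines 11 categories
# with its own identifiers, which are set in the console rather than in code.
category_actions = {
    "search engines": Action.OBSERVATION,
    "scanning": Action.INTERCEPTION,
    "web crawler": Action.MAN_MACHINE_INTERACTION,
}


def action_for(category: str) -> Action:
    """Return the configured action for a category, or the default."""
    return category_actions.get(category, DEFAULT_ACTION)
```

In this sketch, a category that has no explicit entry, such as content aggregation, falls back to Observation.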

Note: Outside the BOT management module, you can also use access control rules to whitelist requests from the IP addresses or User-Agents (UAs) of friendly crawlers.
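To illustrate what an IP or UA whitelist check amounts to, here is a minimal sketch. It does not reflect how WAF access control rules are implemented or configured. The network `203.0.113.0/24` is a reserved documentation range, and `ExampleFriendlyBot`, `ALLOWED_NETWORKS`, `ALLOWED_UA_SUBSTRINGS`, and `is_allowlisted` are hypothetical names.

```python
import ipaddress

# Hypothetical allowlist entries, for illustration only.
ALLOWED_NETWORKS = [ipaddress.ip_network("203.0.113.0/24")]
ALLOWED_UA_SUBSTRINGS = ["ExampleFriendlyBot"]


def is_allowlisted(client_ip: str, user_agent: str) -> bool:
    """Return True if the client IP or the User-Agent matches the allowlist."""
    ip = ipaddress.ip_address(client_ip)
    if any(ip in network for network in ALLOWED_NETWORKS):
        return True
    # Case-insensitive substring match on the User-Agent header.
    ua = user_agent.lower()
    return any(token.lower() in ua for token in ALLOWED_UA_SUBSTRINGS)
```

A User-Agent header is easy to forge, so an IP-based entry is generally the stronger of the two checks.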

Operation Steps

  1. Log in to the Web Application Firewall Console.

  2. In the left navigation bar, click Website Configuration.

  3. On the Website Configuration page, locate the domain name to be protected and click Protection Configuration in the operation bar.

  4. On the protection configuration page, click the BOT Management tab, go to the Known BOT Type Settings module, and turn on the Status switch.

  5. In the Known BOT Type rule list, turn each rule on or off according to its BOT Description. Click View Type to see the details of a BOT type. To allow legitimate crawlers, turn on the Observation switch in the operation bar. Observation mode is enabled by default for all BOT rules. It is recommended that you allow crawler requests from search engines (Google, Bing, Baidu, Sogou, and 360) by selecting Observation mode.

Operation                  Note
Observation                The request is released and only a log entry is recorded.
Man-Machine Interaction    The request receives a man-machine interaction challenge. It is released if verification succeeds and intercepted if verification fails.
Interception               The request is intercepted and the 493 page is returned.
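The behavior of the three operations can be summarized as a small decision function. This is a sketch under stated assumptions rather than the WAF's implementation: `Decision` and `decide` are hypothetical names, and returning 493 for a failed challenge is an assumption, since the table only says that such a request is intercepted.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Decision:
    released: bool                      # True if the request reaches the origin
    block_status: Optional[int] = None  # status of the block page, if blocked


def decide(action: str, challenge_passed: bool = False) -> Decision:
    """Model how each operation in the table treats a matching request."""
    if action == "observation":
        # Released; the WAF only records a log entry.
        return Decision(released=True)
    if action == "man-machine interaction":
        if challenge_passed:
            return Decision(released=True)
        # Assumption: a failed challenge is blocked like Interception.
        return Decision(released=False, block_status=493)
    if action == "interception":
        return Decision(released=False, block_status=493)
    raise ValueError(f"unknown action: {action!r}")
```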