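A few days ago I explained how to block the AhrefsBot spider, the crawler of an overseas internet marketing site. There are many crawlers like it, as you can see by checking your website logs. Today let's look at the crawler of another internet company: MJ12bot.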

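What kind of spider is MJ12bot?
MJ12bot is the search engine spider of Majestic, a UK-based internet marketing company. The company's search engine is mainly used to map the internet, and that map data is then used to provide internet marketing data services to businesses. The company currently offers its website in 13 languages.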

About MJ12Bot
Bot Type: Good crawler (always identifies itself)
IP Range: Distributed, Worldwide
Obeys Robots.txt: Yes
Obeys Crawl Delay: Yes
Data served at: Majestic.com
Majestic is a UK based specialist search engine used by hundreds of thousands of businesses in 13 languages and over 60 countries to paint a map of the Internet independent of the consumer based search engines. Majestic also powers other legitimate technologies that help to understand the continually changing fabric of the web.

Web site owners can see data about their own websites on majestic.com.

MJ12Bot does not currently cache web content or personal data. Instead it maps the link relationships between websites to build a search engine. This data is available to technologies and the public, either by searching for a keyword or a website at Majestic. Details about the community project behind the crawlers are at Majestic12.co.uk.

 

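Website owners can see some data about their own sites on Majestic.

In general, if your website's business is mainly within China, the MJ12bot spider is of little use to you.

What impact does the MJ12bot spider have on a website?
Like other crawlers, this spider slightly increases the load on the server. When its crawl volume is relatively high the impact can be more noticeable, but under normal circumstances it is minor. If you do not want to see it in your logs, you can block it in robots.txt, since this spider follows robots rules.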
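
Before deciding whether to block it, it can help to see how often the spider actually visits. The following is a minimal Python sketch, assuming your server writes an access log in the common or combined format with the client IP as the first field. The function name `count_bot_hits` and the sample lines are made up for illustration and are not real traffic.

```python
from collections import Counter


def count_bot_hits(log_lines, bot_token="MJ12bot"):
    """Count requests per client IP for log lines that mention bot_token."""
    token = bot_token.lower()
    hits = Counter()
    for line in log_lines:
        # Case-insensitive substring match on the whole line (covers the user agent).
        if token in line.lower():
            # Assumption: the client IP is the first whitespace-separated field.
            parts = line.split(maxsplit=1)
            if parts:
                hits[parts[0]] += 1
    return hits


# Illustrative sample lines only; IPs come from documentation address ranges.
SAMPLE_LOG = [
    '203.0.113.5 - - [01/Jan/2024:10:00:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"',
    '203.0.113.5 - - [01/Jan/2024:10:00:05 +0000] "GET /about HTTP/1.1" 200 734 "-" '
    '"Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"',
    '198.51.100.7 - - [01/Jan/2024:10:01:00 +0000] "GET / HTTP/1.1" 200 512 "-" '
    '"Mozilla/5.0 (compatible; ExampleBrowser/1.0)"',
    '198.51.100.9 - - [01/Jan/2024:10:02:00 +0000] "GET /robots.txt HTTP/1.1" 200 64 "-" '
    '"Mozilla/5.0 (compatible; MJ12bot/v1.4.8; http://mj12bot.com/)"',
]

hits = count_bot_hits(SAMPLE_LOG)
for ip, count in hits.most_common():
    print(ip, count)
```

To run this against a real log, you could pass an open file object instead of `SAMPLE_LOG`, since the function only iterates over lines.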

How do you block the MJ12bot spider?
Because the MJ12bot spider follows the robots protocol, we can block it directly in robots.txt. The code is as follows:

User-agent: MJ12bot
Disallow: /
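
To check that a rule like the one above does what you expect before deploying it, one option is Python's standard `urllib.robotparser`. This is only a sketch: it parses the rule text in memory, the `example.com` URL is a placeholder, and the `Crawl-delay: 10` variant is an assumed alternative (the value 10 is arbitrary) based on the statement that MJ12bot obeys crawl delay. It shows how Python's parser reads the rules, not how MJ12bot itself behaves.

```python
from urllib.robotparser import RobotFileParser


def parse_robots(text):
    """Parse robots.txt text held in memory (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


# The blocking rule from the article.
BLOCK_RULE = """\
User-agent: MJ12bot
Disallow: /
"""

# Assumed alternative: slow the crawler down instead of blocking it.
THROTTLE_RULE = """\
User-agent: MJ12bot
Crawl-delay: 10
"""

blocked = parse_robots(BLOCK_RULE)
throttled = parse_robots(THROTTLE_RULE)

# MJ12bot is denied everywhere; other crawlers are unaffected by this rule.
print(blocked.can_fetch("MJ12bot", "https://example.com/page"))    # False
print(blocked.can_fetch("Googlebot", "https://example.com/page"))  # True

# With the throttle variant, the parser reports the requested delay in seconds.
print(throttled.crawl_delay("MJ12bot"))  # 10
```

Throttling rather than blocking may be worth considering if you still want your site to appear in Majestic's data but find the crawl volume too high.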
