PaperLiBot is the generic name of Paper.li's web crawler.
Why did PaperLiBot hit my website?
PaperLiBot may start crawling your website for the following reasons:
- Your website has been shared or engaged with on social media sites.
- Your website is advertising RSS feeds.
- A Paper.li user is adding some of your website content to their paper.
How is my website content used?
Your content may be included within one or more of Paper.li's papers. This content may be visible on paper webpages and be included within email newsletters and social promotions (Twitter, Facebook and LinkedIn).
When your content is included, it takes the following format:
- A link to the original content URL (your website).
- The title of the page (at the time the content was crawled).
- A thumbnail of the image found at the original content URL.
- A short extract of the content (no more than 200 characters).
- The source which led to the discovery of the content (post on social media, RSS feed entry, contributor).
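The format above can be pictured as a simple record. The following is only an illustrative sketch: the class name `IncludedItem`, its field names and the example values are hypothetical and do not describe Paper.li's internal data model. The only detail taken from the list above is the 200-character cap on the extract.

```python
from dataclasses import dataclass

MAX_EXTRACT_CHARS = 200  # cap on the extract, as stated in the list above


@dataclass
class IncludedItem:
    """Hypothetical record mirroring the five listed elements."""
    url: str            # link to the original content URL
    title: str          # page title at the time of the crawl
    thumbnail_url: str  # thumbnail of the image found at the URL
    extract: str        # short extract of the content
    source: str         # how the content was discovered

    def __post_init__(self):
        # Keep the extract within the 200-character limit.
        self.extract = self.extract[:MAX_EXTRACT_CHARS]


item = IncludedItem(
    url="https://example.com/post",
    title="Example post",
    thumbnail_url="https://example.com/image.jpg",
    extract="Lorem ipsum " * 50,  # 600 characters before truncation
    source="RSS feed entry",
)
print(len(item.extract))
```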
Technical details about the crawler
When a PaperLiBot crawler visits your website, it will send a valid User-Agent header and connect from Paper.li's infrastructure.
PaperLiBot's user agent is:
Mozilla/5.0 (compatible; PaperLiBot/2.1; https://support.paper.li/entries/20023257-what-is-paper-li)
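If you want to spot PaperLiBot visits in your own logs, you can match on the product token in the user agent. The sketch below is a minimal example, assuming the `PaperLiBot/<version>` token stays in place even if the version number changes; the function name `paperlibot_version` is hypothetical.

```python
import re
from typing import Optional

# Assumption: the "PaperLiBot/<version>" token is stable across releases.
PAPERLIBOT_RE = re.compile(r"\bPaperLiBot/(\d+(?:\.\d+)*)")


def paperlibot_version(user_agent: str) -> Optional[str]:
    """Return the PaperLiBot version found in a User-Agent string, or None."""
    match = PAPERLIBOT_RE.search(user_agent)
    return match.group(1) if match else None


ua = ("Mozilla/5.0 (compatible; PaperLiBot/2.1; "
      "https://support.paper.li/entries/20023257-what-is-paper-li)")
print(paperlibot_version(ua))
```

A User-Agent header can be set by any client, so a match only tells you that the request claims to come from PaperLiBot.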
PaperLiBot is designed to run simultaneously on multiple machines to improve performance and scale. Therefore, your logs may show visits from several machines within our infrastructure, which is hosted on OVHcloud (Europe) and Amazon Web Services (USA).
PaperLiBot will usually access your site no more than once every few seconds, on average. However, due to the viral nature of social media content, the rate may appear slightly higher over short periods.
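To compare this with what you see on your own server, you can compute the average gap between PaperLiBot requests from your access log timestamps. This is a minimal sketch: the function name `average_interval_seconds` and the sample timestamps are made up for illustration, and parsing the timestamps out of your log format is left to you.

```python
from datetime import datetime


def average_interval_seconds(timestamps):
    """Mean gap in seconds between consecutive hits, or None for fewer than two hits."""
    ordered = sorted(timestamps)
    if len(ordered) < 2:
        return None
    span = (ordered[-1] - ordered[0]).total_seconds()
    return span / (len(ordered) - 1)


# Made-up sample: three hits spread over ten seconds.
hits = [
    datetime(2024, 1, 1, 12, 0, 0),
    datetime(2024, 1, 1, 12, 0, 4),
    datetime(2024, 1, 1, 12, 0, 10),
]
print(average_interval_seconds(hits))
```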
Blocking PaperLiBot from visiting your site
If you want to prevent PaperLiBot from crawling content on your site, please get in touch with us so we can add your site to our blocklist and prevent future content from being included in the Paper.li service.