The Webpage Text API is a cloud service that lets you easily retrieve the HTML for the content of a webpage without the junk (chrome, navigation, ads, and scripts) that tends to clutter modern webpages.
The Webpage Text API is perfect for RSS readers, read-later services, browser extensions, newsbots, and other applications where the user wants the content of the webpage without the cruft.
Check out the demo that lets you see generated content for any given URL.
The Webpage Text API lets you retrieve webpage text for up to 100 webpage URLs with one HTTPS request.
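As a rough sketch of how a client might work within the 100-URL limit, a longer list of URLs could be split into batches before each batch is sent as one request. The helper name `batch_urls` and the constant below are hypothetical and not part of the API. The request itself is left out, because the endpoint and payload format are not described on this page.

```python
from typing import Iterator, List

# The per-request limit stated above; the constant name is an assumption.
MAX_URLS_PER_REQUEST = 100


def batch_urls(
    urls: List[str], batch_size: int = MAX_URLS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        # Slicing past the end of the list is safe; the last batch may be shorter.
        yield urls[start:start + batch_size]
```

With this helper, a list of 250 URLs would be split into three batches of 100, 100, and 50 URLs, each small enough for a single request.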
The Webpage Text API generates webpage text for a given URL very quickly. Subsequent requests for the same URL are almost instantaneous.
Ease of Use
The Webpage Text API is well documented and adheres to REST conventions. In addition, I provide great customer support.
The Story Behind the Webpage Text API
In 2018 I began work on next-generation webpage text capabilities for Unread 2. Unread had built-in webpage text retrieval capabilities that were powered by Readability.js. That worked well, but I needed the ability to cache webpage text and associated images ahead of time. It was impractical to generate webpage text for thousands of articles at a time on-device, so I researched server-based options.
At that time, Mercury Reader provided an API and generously made it available for free. However, their terms of service would not allow Unread to aggressively cache webpage text for articles ahead of time. The Mercury Parser source code had not yet been made public.
I looked into commercial options, but none fit my needs. So I started writing my own server-based system. I started by incorporating the heuristics used by Readability.js. I then added test cases from hundreds of different websites to improve the webpage text quality.
After Mercury Parser went open source, I evaluated whether it would be more suitable for generating webpage text for Unread. I found that my own Webpage Text API gave higher quality results than Mercury Parser did. This inspired me to continue improving the Webpage Text API, and later to offer it as a commercial product.
- 10,000 requests per month
- 50,000 unique URLs per month
- 100,000 requests per month
- 500,000 unique URLs per month
- 200,000 requests per month
- 1,000,000 unique URLs per month
Higher limit plans are available upon request.
Just write to firstname.lastname@example.org. I will set you up right away.