Webpage Text API

We are not currently accepting Webpage Text API customers.


The Webpage Text API is a cloud service that lets you easily retrieve the HTML for the content of a webpage without the junk (chrome, navigation, ads, and scripts) that tends to clutter modern webpages.

The Webpage Text API is perfect for RSS readers, read-later services, browser extensions, news bots, and other applications where the user wants the content of the webpage without the cruft.


Features

Bulk Retrieval

The Webpage Text API lets you retrieve webpage text for up to 100 webpage URLs with one HTTPS request.
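As a rough sketch of what the 100-URL limit means for client code, the helper below splits a longer list of URLs into batches that each fit in one request. It is only an illustration: the name batch_urls and the constant MAX_URLS_PER_REQUEST are hypothetical, and the actual endpoint, authentication, and request format are not shown here because they are not described on this page.

```python
from typing import Iterator, List, Sequence

# Per-request limit stated above; the constant name is hypothetical.
MAX_URLS_PER_REQUEST = 100


def batch_urls(
    urls: Sequence[str], batch_size: int = MAX_URLS_PER_REQUEST
) -> Iterator[List[str]]:
    """Yield consecutive slices of `urls`, each holding at most `batch_size` items."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(urls), batch_size):
        yield list(urls[start:start + batch_size])


if __name__ == "__main__":
    # 250 placeholder URLs split into batches of 100, 100, and 50.
    example_urls = [f"https://example.com/article/{i}" for i in range(250)]
    for batch in batch_urls(example_urls):
        # Each batch would be sent as one HTTPS request (request code not shown).
        print(len(batch))
```

Each batch corresponds to one HTTPS request, so a list of 250 URLs would take three requests under this scheme.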

Speed

The Webpage Text API generates webpage text for a given URL very quickly. Subsequent requests for the same URL are almost instantaneous.

Ease of Use

The Webpage Text API is well-documented and adheres to REST conventions. In addition, I provide great customer support.


The Story Behind the Webpage Text API

In 2018 I began work on next-generation webpage text capabilities for Unread. Unread had built-in webpage text retrieval capabilities that were powered by Readability.js. That worked well, but I needed the ability to cache webpage text and associated images ahead of time. It was impractical to generate webpage text for thousands of articles at a time on-device, so I researched server-based options.

At that time, Mercury Reader provided an API and generously made it available for free. However, their terms of service would not allow Unread to aggressively cache webpage text for articles ahead of time. The Mercury Parser source code had not yet been made public.

I looked into commercial options, but none fit my needs, so I began writing my own server-based system. I started by incorporating the heuristics used by Readability.js, then added test cases from hundreds of different websites to improve the webpage text quality.

After Mercury Parser went open source, I evaluated whether it would be more suitable for generating webpage text for Unread. I discovered that my own Webpage Text API produced higher-quality results than Mercury Parser. This inspired me to continue improving the Webpage Text API, and later to offer it as a commercial product.


Pricing

Starter Plan

$99 (USD)/month

  • 100,000 requests per month
  • 1,000,000 unique URLs per month

Professional Plan

$299 (USD)/month

  • 300,000 requests per month
  • 5,000,000 unique URLs per month

Premium Plan

$499 (USD)/month

  • 1,000,000 requests per month
  • 20,000,000 unique URLs per month

Higher-limit plans are available upon request.
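To make the plan limits easier to compare against expected usage, here is a minimal sketch that picks the lowest-priced listed plan covering a given monthly volume. It assumes that both limits (requests and unique URLs) must be satisfied and that usage exactly at a limit is allowed; neither assumption is confirmed on this page. The names Plan, PLANS, and smallest_plan are hypothetical.

```python
from typing import NamedTuple, Optional


class Plan(NamedTuple):
    name: str
    usd_per_month: int
    requests_per_month: int
    unique_urls_per_month: int


# Figures copied from the plans listed above, ordered from lowest to highest price.
PLANS = [
    Plan("Starter", 99, 100_000, 1_000_000),
    Plan("Professional", 299, 300_000, 5_000_000),
    Plan("Premium", 499, 1_000_000, 20_000_000),
]


def smallest_plan(requests: int, unique_urls: int) -> Optional[Plan]:
    """Return the lowest-priced listed plan whose limits cover the given usage.

    Assumption: both limits apply, and usage equal to a limit still fits.
    Returns None when no listed plan is large enough.
    """
    for plan in PLANS:
        if (requests <= plan.requests_per_month
                and unique_urls <= plan.unique_urls_per_month):
            return plan
    return None
```

For example, 50,000 requests covering 2,000,000 unique URLs per month exceeds the Starter plan's URL limit, so the function returns the Professional plan. A result of None corresponds to usage that would need a higher-limit plan.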


Quick Setup

Just write to sales@goldenhillsoftware.com. I will set you up right away.