Using REST API to search for news

With the Google News API gone, here is how you can use our REST API to search for news content and explore the various available filters.

To access our News API, we provide Python and Node.js SDKs, which significantly simplify making searches and iterating through the results. In some use cases, however, direct REST API access is needed. In this blog post, we describe the basics of using our API by making REST calls.

Endpoints

Each type of action that you can perform using our API has a corresponding endpoint that you need to call. If you'd like to search for articles, for example, you'll need to call the endpoint

http://eventregistry.org/api/v1/article/getArticles

whereas to search for events you would call

http://eventregistry.org/api/v1/event/getEvents

To get the full list of available endpoints and the supported parameters, you can view the documentation page.

To GET or to POST, that is the question

One thing to mention is the HTTP method that you can, or even should, use when making the requests. Semantically, the GET method would be the most appropriate for the provided endpoints, since you are always just retrieving information. With the GET method, however, all parameters must fit in the URL, and URLs are in practice limited to about 2,048 characters. This can become a limitation if you have a complex query with a long list of parameters.
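To get a feel for how quickly a query can outgrow a URL, the following minimal sketch encodes parameters into a GET query string and compares the result with the 2,048-character figure quoted above. The helper names are made up for illustration, the long query is a placeholder, and encoding a list as repeated key=value pairs is an assumption rather than documented API behavior.

```python
from urllib.parse import urlencode

ENDPOINT = "http://eventregistry.org/api/v1/article/getArticles"
MAX_URL_LENGTH = 2048  # the limit quoted in the text above


def build_get_url(params):
    """Encode the parameters into the query string of a GET URL."""
    # doseq=True expands list values into repeated key=value pairs
    # (assumed encoding, used here only to measure the URL length)
    return ENDPOINT + "?" + urlencode(params, doseq=True)


def fits_in_get(params, limit=MAX_URL_LENGTH):
    """Return True if the encoded URL stays within the length limit."""
    return len(build_get_url(params)) <= limit


short_query = {"keyword": "Elon Musk", "apiKey": "YOUR_API_KEY"}

# Hypothetical long query: 200 placeholder phrases, only to show URL growth
long_query = {
    "keyword": [f"placeholder phrase {i}" for i in range(200)],
    "apiKey": "YOUR_API_KEY",
}

print(fits_in_get(short_query))  # the short query fits comfortably
print(fits_in_get(long_query))   # the long one does not
```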

To avoid the URL length limitation we suggest that you simply use POST requests and encode the parameters as a JSON object in the body of the request. This approach is also what we will use here.

You also need to set the "Content-Type" header to "application/json".

Searching for Elon Musk

As an example, let's assume that you want to find news articles mentioning Elon Musk. In that case, you need to call the endpoint

http://eventregistry.org/api/v1/article/getArticles

with request body:

{
    "keyword": "Elon Musk",
    "apiKey": "YOUR_API_KEY"
}
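Putting the endpoint, the JSON body, and the "Content-Type" header together, a request could be built roughly as follows. This is a sketch using only Python's standard library; the function names build_post_request and send are our own for this post and are not part of any SDK. The request is only constructed here, so nothing is sent until you call send with a real API key.

```python
import json
import urllib.request

ENDPOINT = "http://eventregistry.org/api/v1/article/getArticles"


def build_post_request(params, url=ENDPOINT):
    """Build (but do not send) a POST request with a JSON body."""
    body = json.dumps(params).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request, timeout=30):
    """Send the request and decode the JSON response."""
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


request = build_post_request({"keyword": "Elon Musk", "apiKey": "YOUR_API_KEY"})
print(request.get_method(), request.full_url)

# To actually run the search, replace YOUR_API_KEY and call:
# result = send(request)
```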

If you don't specify any additional parameters, the output would today look something like this:

{
    "articles": {
        "results": [
            {
                "uri": "6382884774",
                "lang": "eng",
                "isDuplicate": false,
                "date": "2021-01-11",
                "time": "09:11:00",
                "dateTime": "2021-01-11T09:11:00Z",
                "dateTimePub": "2021-01-11T09:04:00Z",
                "dataType": "news",
                "sim": 0.6745098233222961,
                "url": "https://www.techradar.com/news/is-it-time-to-try-signal-or-telegram",
                "title": "Is it time to try Signal or Telegram? ",
                "body": "... Signal got an especially good boost from the likes of tech celebrities such as Edward Snowden and Elon Musk who tweeted ...",
                "source": {
                    "uri": "techradar.com",
                    "dataType": "news",
                    "title": "TechRadar"
                },
                "authors": [
                    {
                        "uri": "leila_stein@techradar.com",
                        "name": "Leila Stein",
                        "type": "author",
                        "isAgency": false
                    }
                ],
                "image": "https://cdn.mos.cms.futurecdn.net/EP3nfZFSctwY45wJvooRvb-1200-80.jpg",
                "eventUri": "eng-6455774",
                "sentiment": 0.2392156862745098,
                "wgt": 348052260,
                "relevance": 1
            },
            ...
        ],
        "totalResults": 20593,
        "page": 1,
        "count": 100,
        "pages": 206
    }
}
abbreviated sample response

The above example response is significantly abbreviated and shows just a single returned article (instead of 100) and has an abbreviated body (which is otherwise returned in full).
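Once the response is decoded, reading it is a matter of walking the nested structure shown above. The sketch below works on a trimmed copy of the sample response (one article, fewer fields) rather than a live call; summarize is a helper invented for this example.

```python
import json

# Trimmed copy of the sample response above (one article, fewer fields)
sample_response = json.loads("""
{
    "articles": {
        "results": [
            {
                "uri": "6382884774",
                "lang": "eng",
                "isDuplicate": false,
                "dateTime": "2021-01-11T09:11:00Z",
                "url": "https://www.techradar.com/news/is-it-time-to-try-signal-or-telegram",
                "title": "Is it time to try Signal or Telegram? ",
                "source": {"uri": "techradar.com", "title": "TechRadar"},
                "sentiment": 0.2392156862745098
            }
        ],
        "totalResults": 20593,
        "page": 1,
        "count": 100,
        "pages": 206
    }
}
""")


def summarize(response):
    """Return paging info and one short summary line per article."""
    articles = response["articles"]
    lines = [
        f'{a["dateTime"]} [{a["source"]["title"]}] {a["title"].strip()}'
        for a in articles["results"]
    ]
    paging = {key: articles[key] for key in ("totalResults", "page", "pages")}
    return paging, lines


paging, lines = summarize(sample_response)
print(paging)
for line in lines:
    print(line)
```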

Properties to retrieve in the output JSON

In the response, you can see information such as the title, body, date, language, source information, authors, and the URL. In addition, there are several properties that we compute for each article, which can be included or omitted depending on what you need. Such properties include sentiment, event URI, whether the article is a duplicate, the number of article shares on social media, a list of concepts, and a list of categories.

To retrieve additional article metadata, specify in the query the properties that you'd like to see added to the output. The full list of properties that can be requested can be found on the documentation page. For example, to retrieve the list of concepts and categories, specify these parameters:

{
    "keyword": "Elon Musk",
    "includeArticleConcepts": true,
    "includeArticleCategories": true,
    "apiKey": "YOUR_API_KEY"
}
example query that additionally returns the list of concepts and categories for each article

The response also shows the total number of results matching the query (totalResults; in the sample response above, the query was limited to the last 30 days of content), which page of results was returned (page), and the total number of pages that you can retrieve (pages).

Retrieving more results

Since there are more than 100 matching results, you can retrieve more results by repeating the query, but this time also specifying the articlesPage parameter and setting it to value 2, 3, 4, etc. with each call.

{
    "keyword": "Elon Musk",
    "articlesPage": 2,
    "apiKey": "YOUR_API_KEY"
}
example query to retrieve the second page of results

With each of these calls, you'll receive up to 100 additional articles.
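The paging loop described above can be sketched as a generator that keeps increasing articlesPage until the pages value from the response is reached. The function iter_articles and its fetch_page argument are hypothetical names; fetch_page stands for any callable that sends the query and returns the decoded JSON. A fake stand-in is used here so that the example runs without an API key.

```python
def iter_articles(fetch_page, query, max_pages=None):
    """Yield articles page by page until the last page is reached.

    fetch_page is any callable that takes a query dict and returns the
    decoded JSON response (for example, a function that POSTs the query).
    """
    page = 1
    while True:
        response = fetch_page({**query, "articlesPage": page})
        articles = response["articles"]
        yield from articles["results"]
        last_page = articles["pages"]
        if page >= last_page or (max_pages is not None and page >= max_pages):
            break
        page += 1


# Stand-in for the real API: three pages with two fake articles each
def fake_fetch(query):
    page = query["articlesPage"]
    results = [{"uri": f"{page}-{i}"} for i in range(2)]
    return {"articles": {"results": results, "page": page, "pages": 3}}


uris = [a["uri"] for a in iter_articles(fake_fetch, {"keyword": "Elon Musk"})]
print(uris)
```

Passing max_pages is a simple way to cap how many calls a single search can make, since each page is a separate request.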

Additional filters

The example that we used contained a single filter: the results should contain the phrase "Elon Musk". There are several other filtering options that you can use to retrieve just the right set of results. These filters include:

  • concepts - entities and things that we are able to recognize and disambiguate in the articles,
  • categories - article topics, such as sports, investment, natural disasters, etc.,
  • sources - the source that published the news article,
  • source locations - the location of the news source (e.g. get articles published by sources located in NY or Germany),
  • authors - the person who wrote the article,
  • language - get only content published in a particular language,
  • date - find only articles published in a particular date range,
  • sentiment - get only articles with a particular sentiment.

If you'd like, for example, to retrieve only business-related news published by US news sources, you can create a query like:

{
    "keyword": "Elon Musk",
    "categoryUri": "news/Business",
    "sourceLocationUri": "http://en.wikipedia.org/wiki/United_States",
    "apiKey": "YOUR_API_KEY"
}
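If you build such queries in code, one possible approach is a small helper that takes the filters as keyword arguments and leaves out any that are not set. The helper build_article_query is invented for this example; the parameter names passed to it are only the ones that appear in this post.

```python
import json


def build_article_query(api_key, **filters):
    """Return the request body, skipping filters that are set to None."""
    query = {name: value for name, value in filters.items() if value is not None}
    query["apiKey"] = api_key
    return query


query = build_article_query(
    "YOUR_API_KEY",
    keyword="Elon Musk",
    categoryUri="news/Business",
    sourceLocationUri="http://en.wikipedia.org/wiki/United_States",
    articlesPage=None,  # not set, so it is left out of the body
)
print(json.dumps(query, indent=4))
```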

Please view the documentation page to get the full list of available filters and their descriptions.