Let’s build a Telegram bot that calls the HackerNews API every 60 minutes, grabs the top 10 stories and notifies subscribed users. Everything runs on Google Cloud Functions, Cloud Firestore and Cloud Scheduler.

Google Cloud Functions allows writing code that runs in the cloud without worrying about servers, scaling and availability. Cloud functions can be triggered by cloud events like incoming HTTP requests, Google Pub/Sub notifications, Cloud Storage events and many others…

What’s behind HNBot?

So, the user starts chatting with the bot by sending the /start command.
A Telegram bot has two ways of getting incoming messages:

  • it can poll for updates periodically,
  • or it can be notified through an HTTP POST request sent to a particular URL, called WebHook, set by the developer.

We can define a Cloud function to act as a WebHook that accepts updates from Telegram. When the start function is triggered (step 2) the user id is stored in the Firestore database (step 3) and a welcome message is sent back (step 4 and 5). The subscription is made.

How about the dispatch of HackerNews updates? Unlike the previous case, a function reachable from the internet is not a great idea. Instead, we can define a second function that is triggered by Google Pub/Sub notifications. The notification is published periodically by Google Cloud Scheduler (step a), a sort of in-cloud cron job scheduler. The function selects subscribed users (step b and c) and makes a few calls to the HackerNews API (step d and e). Finally, the result is sent to the users (step f).

Bot’s lifecycle

Zooming in a little bit we can identify a few services we need:

  • Telegram Bot
  • Firebase Firestore is a cloud-hosted NoSQL database. It will store the identifiers of the users that started chatting with the bot. Firestore has a generous free quota and an interesting pricing model mainly based on the number of read and write operations carried out.
  • Google Cloud Functions is the core of the bot code. Unfortunately, just a few programming languages are available: NodeJS, Python 3.7 and Go. The free quota is incredibly generous: cloud functions can be invoked without charges 2 million times a month and consume up to 400000 GB*seconds and 200000 GHz*seconds.
  • Google Pub/Sub is a real-time messaging service that allows exchanging data and triggering events between different applications. It allows triggering a Cloud Function without exposing it to the internet.
  • Google Cloud Scheduler is a solution to schedule cron jobs in the cloud. It will publish a message to Pub/Sub once every 60 minutes to trigger the dispatch function.

Telegram Bot Setup

Quoting Telegram docs:

There’s a… bot for that. Just talk to BotFather and follow a few simple steps. Once you’ve created a bot and received your authorization token, head down to the Bot API manual to see what you can teach your bot to do.

Take note of the API token.

Google Cloud Setup

Head to Google Cloud Console and create a new project. Even though the bot uses only resources from the free tier, you must enable billing.

On the Firebase Console create a new project linked to the one just created on the Google Cloud Console.

Cloud Firestore database

Now we can set up Cloud Firestore and define how the user ids will be stored. Cloud Firestore is a NoSQL database: only documents (along with their properties) and collections of those documents can be stored. The database can be thought of as a tree, although strictly speaking it is not one. At the root we can only have collections of documents. Each document can include other collections and so on.

We try to keep the number of reads and writes as low as possible. This application is pretty simple, so I chose to define a collection named default that contains a single document named ids. The ids document has a values array property storing identifiers and usernames: this way reads and writes involve just one document. Inside the Firebase Console activate the Firestore database and define a structure like the following:

Firestore database structure
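
The same structure can be written down as data. Here is a sketch of the ids document as a plain JavaScript object; the chat ids and usernames are invented for illustration, only the values property and the id/username pair come from the code used later.

```javascript
// Hypothetical content of the document default/ids.
// The ids and usernames below are invented sample values.
const idsDocument = {
    values: [
        { id: 111111111, username: "alice" },
        { id: 222222222, username: "bob" }
    ]
};

// One read of this single document yields every subscriber.
const chatIds = idsDocument.values.map((entry) => entry.id);
console.log(chatIds); // [ 111111111, 222222222 ]
```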

Cloud functions

Once the Cloud platform is set up and the database is configured, it is time to write the two NodeJS cloud functions that will power the bot.

[me@home ~]$ mkdir hackernews-bot
[me@home ~]$ cd hackernews-bot
[me@home hackernews-bot]$ mkdir start
[me@home hackernews-bot]$ mkdir dispatch
[me@home hackernews-bot]$ cd start

First of all, we need some dependencies in the package.json file:

{
    "dependencies": {
        "@google-cloud/firestore": "2.2.6",
        "axios": "^0.19.0",
        "firebase-admin": "8.3.0",
        "telegraf": "^3.32.0"
    }
}

One of these is Telegraf, a library that simplifies the creation of Telegram bots.

In the index.js file:

const Telegraf = require("telegraf");
const Firestore = require("@google-cloud/firestore");
const admin = require("firebase-admin");

const PROJECT_ID = process.env.PROJECT_ID;
const COLLECTION_NAME = "default";
const DOCUMENT_NAME = "ids";

const firestore = new Firestore({
    projectId: PROJECT_ID
});

const bot = new Telegraf(process.env.TELEGRAM_API_TOKEN);

Two environment variables are present and will be defined later at deployment time:

  • PROJECT_ID: id of the Google Cloud project,
  • TELEGRAM_API_TOKEN: API token given by the BotFather.

Next, let’s define the function that handles the updates from Telegram. The callback passed to the start method receives a ctx object from which we can extract the chat id and the username. This pair of values is then put into the values array of the ids document.

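const REPLY = "Welcome. You'll receive updates from HackerNews soon! 👍";
bot.start(async (ctx) => {
    const chat = await ctx.getChat();

    // Wait for the write to finish: a Cloud Function may be frozen
    // as soon as the HTTP response has been sent.
    await firestore
        .collection(COLLECTION_NAME)
        .doc(DOCUMENT_NAME)
        .update({
            values: admin.firestore.FieldValue.arrayUnion({
                id: chat.id,
                // not every Telegram user has a username, and Firestore rejects undefined
                username: chat.username || null
            })
        });

    await ctx.reply(REPLY);
});

// Note: no bot.launch() here. Updates arrive through the WebHook,
// while launch() without options would switch the bot to long polling.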
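
arrayUnion only appends elements that are not already in the array, so a user who sends /start twice is stored once. The sketch below imitates that behaviour locally in plain JavaScript; arrayUnionLocal is a hypothetical helper, not part of the Firestore SDK, and it compares entries by their id and username fields only.

```javascript
// Local imitation of arrayUnion for { id, username } entries.
// Hypothetical helper for illustration; the real merge happens server-side.
function arrayUnionLocal(values, entry) {
    const alreadyThere = values.some(
        (v) => v.id === entry.id && v.username === entry.username
    );
    return alreadyThere ? values : [...values, entry];
}

let values = [];
values = arrayUnionLocal(values, { id: 42, username: "alice" });
values = arrayUnionLocal(values, { id: 42, username: "alice" });     // second /start
values = arrayUnionLocal(values, { id: 42, username: "alice_new" }); // renamed user
console.log(values.length); // 2
```

Note the last call: since the whole object is compared, a user who changes username and sends /start again would end up with two entries. Storing only the chat id would avoid that.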

The only snippet missing here is the entry point that Google Cloud Functions will call. Since the function is triggered by HTTP requests, we must export a function start that takes two parameters, req and res, representing the incoming request and the response. In particular, the body of the request carries the update from Telegram:

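exports.start = async (req, res) => {
    // Wait for Telegraf to finish handling the update before closing the request.
    await bot.handleUpdate(req.body, res);
    // Telegraf may already have answered through the webhook response.
    if (!res.headersSent) res.status(200).send();
};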
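
For reference, this is roughly what the request body looks like when a user sends /start. The field names follow the Update object of the Telegram Bot API; the numeric values and the username are made up, and chatFromUpdate is a hypothetical helper that only shows where the chat id lives.

```javascript
// Trimmed example of an update posted by Telegram to the WebHook.
// All concrete values are invented for illustration.
const sampleUpdate = {
    update_id: 10000,
    message: {
        message_id: 1,
        date: 1565000000,
        from: { id: 111111111, is_bot: false, first_name: "Alice", username: "alice" },
        chat: { id: 111111111, type: "private", first_name: "Alice", username: "alice" },
        text: "/start",
        entities: [{ type: "bot_command", offset: 0, length: 6 }]
    }
};

// Hypothetical helper: pull the chat out of a message update, if any.
function chatFromUpdate(update) {
    return update.message ? update.message.chat : undefined;
}

console.log(chatFromUpdate(sampleUpdate).id); // 111111111
```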

The start function is ready to be deployed. Use gcloud init to configure the Google Cloud CLI with your current project and then deploy the function. Substitute the Telegram token and Google Cloud project id in the following command:

[me@home start]$ gcloud functions deploy start \
--trigger-http \
--runtime nodejs10 \
--set-env-vars TELEGRAM_API_TOKEN=<YOUR_TOKEN_KEY_HERE>,PROJECT_ID=<YOUR_PROJECT_ID_HERE> \
--timeout=5

After a few minutes, a positive result should appear showing the address at which the Cloud Function can be reached. This URL is the WebHook Telegram must call to notify updates. To set the WebHook you need to make an HTTP GET call to the following address:

[me@home start]$ curl "https://api.telegram.org/bot<BOT_TOKEN>/setWebhook?url=<WEBHOOK_URL>"
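
Since the WebHook address is itself a URL, it is safer to URL-encode it before putting it in the query string. A small sketch that builds the setWebhook address; the token and the function URL below are placeholders, and buildSetWebhookUrl is a hypothetical helper.

```javascript
// Build the setWebhook address with a properly encoded url parameter.
function buildSetWebhookUrl(botToken, webhookUrl) {
    const encoded = encodeURIComponent(webhookUrl);
    return `https://api.telegram.org/bot${botToken}/setWebhook?url=${encoded}`;
}

// Placeholder values: replace them with your own token and function URL.
const setWebhookUrl = buildSetWebhookUrl(
    "123456:ABC-DEF",
    "https://example.com/start"
);
console.log(setWebhookUrl);
```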

The bot is now ready to accept new users. If you try to start a new chat, you should see a welcome message and your chat id appearing in the Firebase console.

First reply of the HackerNews bot

The user appeared in the Firestore console


The second function is slightly more involved. Move to the dispatch folder created before. The package.json file is the same as before, so we can copy it into the second folder. Let’s start with the definition of a function that calls the HackerNews API and returns an array of stories. For each story, we need its title and URL. Details can be found in the HackerNews API documentation.

const axios = require("axios");

async function getTopTen() {
    const url = "https://hacker-news.firebaseio.com/v0/topstories.json";
    const response = await axios.get(url);
    const values = response.data.slice(0, 10);

    const ITEM_URL = "https://hacker-news.firebaseio.com/v0/item/";
    const ls = values.map(async (id) => {
        const url = `${ITEM_URL}${id}.json`;
        const response = await axios.get(url);
        return {
            title: response.data.title,
            // Ask HN and similar posts have no url: link to the discussion instead
            url: response.data.url || `https://news.ycombinator.com/item?id=${id}`
        };
    });

    // the ten item requests run concurrently
    return Promise.all(ls);
}

The code is straightforward. https://hacker-news.firebaseio.com/v0/topstories.json returns the ids of up to 500 top stories, so we can slice the first 10. Since the list contains only ids, we must call the API again for each one to get its title and URL.

Then we must initialize Firestore and Telegraf as before.

const Telegraf = require('telegraf');
const Firestore = require('@google-cloud/firestore');

const PROJECT_ID = process.env.PROJECT_ID;
const COLLECTION_NAME = "default";
const DOCUMENT_NAME = "ids";

const firestore = new Firestore({
    projectId: PROJECT_ID
});

const bot = new Telegraf(process.env.TELEGRAM_API_TOKEN);

Finally, we can export the entry point of the function. This time the trigger must be a Pub/Sub notification so the callback arguments are different:

exports.dispatch = async (pubSubEvent, context) => {
    const posts = await getTopTen();

    // message composition
    const message = posts.map((p) => `${p.title} - ${p.url}`).join("\n\n");

    // get the document containing the user list
    const doc = await firestore
        .collection(COLLECTION_NAME)
        .doc(DOCUMENT_NAME)
        .get();

    if (!doc.exists) return; // nothing to do without the user list

    // Send the notification to every user and wait for all the calls:
    // the function may be stopped as soon as the returned promise settles.
    await Promise.all(doc.data().values.map((entry) =>
        bot.telegram.sendMessage(entry.id, message)
            // a failure for one user (e.g. the bot was blocked) must not stop the others
            .catch((err) => console.error(`Sending to ${entry.id} failed`, err))
    ));
};
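
The dispatch function ignores its arguments, but it may help to know what they carry. For Pub/Sub-triggered Node.js background functions, the message payload arrives base64-encoded in the data property of the first argument. A minimal sketch of decoding it; decodePayload is a hypothetical helper, and the fake event and its payload text are assumptions for illustration.

```javascript
// Decode the payload of a Pub/Sub event (empty string if there is none).
function decodePayload(pubSubEvent) {
    return pubSubEvent.data
        ? Buffer.from(pubSubEvent.data, "base64").toString("utf8")
        : "";
}

// Fake event mimicking what Cloud Scheduler could publish.
const fakeEvent = { data: Buffer.from("tick").toString("base64") };
console.log(decodePayload(fakeEvent)); // tick
```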

Before deploying the function, we must create a new Pub/Sub topic from the console:

Pub/Sub topic creation

The last step is to schedule a job that will publish a message to the Pub/Sub topic.

Google Cloud scheduler job creation

Cloud Scheduler uses standard cron format. Hence 0 */1 * * * defines a task that will be executed once every hour at minute 0.
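
To make the schedule concrete, the sketch below computes when a job with minute field 0 and hour field */1 fires next. It is a hand-written illustration of this one expression, evaluated in UTC, not a general cron parser; nextHourlyRun is a hypothetical helper.

```javascript
// Next firing time of "0 */1 * * *": the next minute 0 strictly after `from`.
function nextHourlyRun(from) {
    const next = new Date(from.getTime());
    next.setUTCMinutes(0, 0, 0);              // back to the start of the current hour
    next.setUTCHours(next.getUTCHours() + 1); // jump to the following hour
    return next;
}

const nextRun = nextHourlyRun(new Date("2019-08-01T10:25:00Z"));
console.log(nextRun.toISOString()); // 2019-08-01T11:00:00.000Z
```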

We are ready to go. Let’s deploy the function:

[me@home dispatch]$ gcloud functions deploy dispatch --runtime nodejs10 \
--set-env-vars TELEGRAM_API_TOKEN=<YOUR_TOKEN_KEY_HERE>,PROJECT_ID=<YOUR_PROJECT_ID_HERE> \
--timeout=60 \
--trigger-topic hackernews_bot_schedule

Great! After some time the first update is delivered. If you don’t want to wait, hit the Run now button in the Google Cloud Scheduler console.

News from HackerNews

Remember to delete the project from the Google Cloud Console to avoid unexpected costs.

You can find all the code on Github.