Episode 9: Using the Cron Service to run scheduled tasks

Welcome to Episode 9. In this episode, we shall look at how you can run background tasks in your GAEJ application. By background task, I mean any piece of code that you would like to run at a scheduled time, independent of any user interaction. Typical examples of such tasks include:

  • Hourly/Daily/Weekly/Monthly backup of data
  • End of the day report generation to report any errors, transactions, etc.
  • Sending an email at the end of the day (or once a day) with some information to subscribers, as news sites typically do.

If you have written a few web applications, you would definitely have come across more scenarios like that.

In this episode, we shall cover the following:

  1. What is a Cron Job?
  2. How to schedule a Cron Job?
  3. Write a simple Cron Job that prints a single statement
  4. Configure, execute and monitor the Cron Job execution

Let’s go!

What is a Cron Job? When would you need one?

I will use information liberally from Wikipedia over here to explain some of the core concepts. You can refer to the Cron page at Wikipedia if you want.

The name ‘cron’ comes from chronos, the Greek word for time. Cron is a time-based job scheduler. It enables our application to schedule a job to run automatically at a certain time or date. A Job (also known as a Task) is any module that you wish to run. Such a module typically performs system maintenance or administration, though its general-purpose nature means that it can be used for other purposes, such as connecting to the Internet and downloading email.

Examples include:

  • Taking a daily backup of data via a scheduled task and moving the file to another server. (Runs once daily)
  • Sending an email every week to your subscribers. (Runs once weekly)
  • Clearing the log files at the end of every day (Runs once daily)
  • Remind yourself of a wonderful weekend coming up, every Friday at 5:00 PM (Runs once a week on a Friday at 5:00 PM)

The Google App Engine provides a service called the Cron Service that helps us do two fundamental things:

  1. Allows your application to schedule these tasks.
  2. Execute these tasks based on their schedule.

What does a Cron Job look like? And how do I schedule one?

A Cron Job is nothing but a URL that is invoked by the Google App Engine infrastructure at its scheduled execution time. To write a Cron Job, you need to do the following:

1. Write a Java Servlet and configure it in the web.xml. Note down the URL where the servlet can be invoked. This URL is the <url-pattern> given in the <servlet-mapping> for your Servlet in web.xml, as in the web.xml segment shown below:

 

<servlet>
    <servlet-name>GAEJCronServlet</servlet-name>
    <servlet-class>com.gaejexperiments.cron.GAEJCronServlet</servlet-class>
</servlet>

<servlet-mapping>
    <servlet-name>GAEJCronServlet</servlet-name>
    <url-pattern>/cron/mycronjob</url-pattern>
</servlet-mapping>

 

2. Create a cron.xml file that specifies one or more Cron Jobs (Scheduled Tasks) that you want to execute. A sample for the above Cron Job is shown below:

<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
    <cron>
        <url>/cron/mycronjob</url>
        <description>Put your Cron Job description here</description>
        <schedule>Put Cron Job Schedule here</schedule>
    </cron>
</cronentries>

The cron.xml file tells Google App Engine about the Cron Jobs that are scheduled by your application. This file resides in the WEB-INF directory of your application and is copied to the App Engine cloud when you deploy the application. The following points are important about the cron.xml file:

  1. Each Cron Job configured in your application is defined in a <cron/> element. So there can be one or more <cron/> elements.
  2. The <cron/> element above has the following 3 child elements that define the Job.
    • <url/> specifies where the Google App Engine can invoke your Cron Job. This is nothing but the Servlet URL that you defined in the web.xml file that we saw earlier. The Servlet URL points to your Servlet, which contains the Cron Job implementation.
    • <description/> is a simple text-based description of what your Cron Job does. It does not influence any aspect of the execution and is used for display purposes when you look at your application configuration via the App Console.
    • <schedule/> is the time when your Job has to be executed. This is where you specify whether your job is to be run daily, once every hour, on Friday at 5:00 PM, etc. It depends entirely on when you wish to execute the job. However, you must follow some rules, and they are specified in the documentation on Scheduling Format. I strongly recommend that you read it to understand the various ways of specifying the schedule. Some examples are: “every 1 minute”, “every 12 hours”, “every friday 17:00” and so on.
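To get a feel for what a schedule string expresses, here is a minimal sketch that turns the simplest interval form (such as “every 2 minutes” or “every 12 hours”) into a java.time.Duration. The class SimpleIntervalSchedule is hypothetical and not part of the App Engine SDK. App Engine parses the schedule itself, so your application never needs this code. The set of unit spellings accepted here is an assumption; the Scheduling Format documentation is the authority, and richer forms such as “every friday 17:00” are deliberately rejected by this sketch.

```java
import java.time.Duration;
import java.util.Locale;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Hypothetical helper for illustration only; not part of the App Engine SDK. */
final class SimpleIntervalSchedule {

    // Matches only the simple interval form, e.g. "every 2 minutes" or "every 12 hours".
    // The accepted unit spellings are an assumption made for this sketch.
    private static final Pattern INTERVAL =
            Pattern.compile("every\\s+(\\d+)\\s+(minutes?|mins|hours?)");

    private SimpleIntervalSchedule() { }

    /** Returns the interval described by the schedule, or throws if it is not a simple interval. */
    static Duration parse(String schedule) {
        Matcher m = INTERVAL.matcher(schedule.trim().toLowerCase(Locale.ROOT));
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a simple interval schedule: " + schedule);
        }
        long amount = Long.parseLong(m.group(1));
        return m.group(2).startsWith("hour")
                ? Duration.ofHours(amount)
                : Duration.ofMinutes(amount);
    }

    public static void main(String[] args) {
        System.out.println(parse("every 2 minutes")); // prints PT2M
        System.out.println(parse("every 12 hours"));  // prints PT12H
    }
}
```

A schedule with a day and a time of day, such as “every friday 17:00”, does not describe a fixed interval, which is why the sketch refuses it rather than guessing.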

Develop a simple Cron Job

The first thing to do is to create a New Google Web Application Project. Follow these steps:

1. Either click on File –> New –> Other or press Ctrl-N to create a new project. Select Google and then Web Application project. Alternatively, you could click on the New Web Application Project Toolbar icon that comes with the Google Eclipse plugin.

2. In the New Web Application Project dialog, deselect the Use Google Web Toolkit option and give a name to your project. I have named mine GAEJExperiments. I suggest you go with the same name so that things are consistent with the rest of the article, but I leave that to you. If you are following the series, you can simply reuse the same project, skip these steps altogether and go straight to the next part, i.e. the Servlet code.

3. Click on Finish. This will generate the project and also create a sample Hello World Servlet for you. But we will be writing our own Servlet.

GAEJCronServlet.java

Our Cron Job is going to be very simple. It simply prints a statement in the log file saying that it is being executed. The Cron Service of Google App Engine will automatically invoke this Servlet when its scheduled execution time arrives. So all we need to do is code our Servlet. The code is shown below:

 

package com.gaejexperiments.cron;

import java.io.IOException;
import java.util.logging.Level;
import java.util.logging.Logger;

import javax.servlet.ServletException;
import javax.servlet.http.*;

@SuppressWarnings("serial")
public class GAEJCronServlet extends HttpServlet {
    private static final Logger _logger = Logger.getLogger(GAEJCronServlet.class.getName());

    // The Cron Service invokes the job with an HTTP GET on the configured URL
    @Override
    public void doGet(HttpServletRequest req, HttpServletResponse resp)
            throws IOException {

        try {
            _logger.info("Cron Job has been executed");

            //Put your logic here
            //BEGIN
            //END
        }
        catch (Exception ex) {
            //Log any exceptions in your Cron Job
            _logger.log(Level.SEVERE, "Cron Job failed", ex);
        }
    }

    // Delegate POST to doGet so the job can also be triggered with a POST
    @Override
    public void doPost(HttpServletRequest req, HttpServletResponse resp)
            throws ServletException, IOException {
        doGet(req, resp);
    }
}

 

The code is straightforward to understand. It has doGet() and doPost() methods. In the doGet() method, we simply log at the INFO level that the Cron Job has been executed. Your actual Job implementation should go in here, as indicated by the comments. So whether you are invoking a backend database, sending a consolidated email report, or doing something else, that logic belongs here.
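Since a Cron Job is just a URL, anyone who knows the URL could request it as well. The App Engine documentation describes an X-AppEngine-Cron request header that is set to true on requests issued by the Cron Service. The sketch below shows one possible way to use it; the class CronRequestCheck and its method name are hypothetical, and you should confirm the header behaviour in the documentation for your SDK version before relying on it.

```java
/** Hypothetical helper for illustration only; the class and method names are not from the SDK. */
final class CronRequestCheck {

    // Header that the App Engine documentation describes for Cron Service requests.
    static final String CRON_HEADER = "X-AppEngine-Cron";

    private CronRequestCheck() { }

    /** Returns true only when the header value marks the request as a cron invocation. */
    static boolean isFromCronService(String headerValue) {
        return "true".equalsIgnoreCase(headerValue);
    }

    // Possible use inside GAEJCronServlet.doGet(), before running the job:
    //
    //   if (!CronRequestCheck.isFromCronService(req.getHeader(CronRequestCheck.CRON_HEADER))) {
    //       resp.sendError(HttpServletResponse.SC_FORBIDDEN);
    //       return;
    //   }
}
```

The check is kept in a plain class with a String parameter so that it can be tried out without a servlet container.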

All that remains is to now tell the App Engine via configuration about your Servlet (via web.xml) and create the cron.xml file in which you will mention your Cron Job.

Configure the Cron Job

As mentioned, we need to configure the Servlet in the web.xml and also specify it in the cron.xml file. Let us look at that now:

Configuring the Servlet

We need to add the <servlet/> and <servlet-mapping/> entries to the web.xml file. This file is present in the WEB-INF folder of the project. The fragment to be added to your web.xml file is shown below. Please note that you can use your own package name and servlet class. Just modify the fragment accordingly if you do so.

 

<servlet>
    <servlet-name>GAEJCronServlet</servlet-name>
    <servlet-class>com.gaejexperiments.cron.GAEJCronServlet</servlet-class>
</servlet>

<servlet-mapping>
    <servlet-name>GAEJCronServlet</servlet-name>
    <url-pattern>/cron/gaejcronjob</url-pattern>
</servlet-mapping>

 

Specifying the Cron Job (cron.xml)

The cron.xml for our application will contain only one Cron Job. Here we specify the Servlet URL along with the schedule. Notice that I have chosen to execute this Cron Job every 2 minutes, but you are free to experiment with different Schedule Formats. This file needs to be created in the WEB-INF folder of your project.

<?xml version="1.0" encoding="UTF-8"?>
<cronentries>
    <cron>
        <url>/cron/gaejcronjob</url>
        <description>GAEJExperiments Cron Job that simply announces that it got invoked.</description>
        <schedule>every 2 minutes</schedule>
    </cron>
</cronentries>

Deploy the Application

To deploy the application, follow these steps. They should be familiar to you by now, and I am assuming that you already have your Application ID with you:

  1. Click on the Deploy Icon in the Toolbar.
  2. In the Deploy dialog, provide your Email and Password. Do not click on the Deploy button yet.
  3. Click on the App Engine Project settings link. This will lead you to a dialog where you need to enter your Application ID (for example, my Application ID is gaejexperiments).
  4. Click on OK. You will be led back to the previous screen, where you can click on the Deploy button. This will start deploying your application to the GAEJ cloud. You should see several messages in the Console window as the application is being deployed.
  5. Finally, you should see the message “Deployment completed successfully”.

We can now check if the Google App Engine got our Cron Job correctly configured and verify if it is getting executed at the schedule that we have configured it to.

Monitoring the Cron Job

You can use the App Engine console to verify if your Cron Job is executing well or not. To do that, perform the following steps:

  1. Go to http://appengine.google.com and log in with your account.
  2. You will see a list of applications registered. Click on the application that you just deployed. In my case, it is gaejexperiments.
  3. When you click on a particular application, you will be taken to the Dashboard for that application, which contains a wealth of information around the requests, quotas, logs, versions, etc.
  4. Verify that the Cron Jobs that you specified in the cron.xml have been configured successfully for the application by clicking Cron Jobs, visible under Main. For the application that we deployed, here is the screen shot from the App Engine console:

ep9-1

As the console indicates, the Cron Job has not yet run. Every time the job is executed, this column is updated with the date and time of the last execution along with its status. Since we have configured our Job to run every 2 minutes, I waited for 2 minutes, the job executed, and when I refreshed the Cron Jobs page, the status was updated as shown below:

ep9-2

You can also click on the Logs link. This will display the application log. All the log statements that you code using the Logger class are visible here. By default, the severity level is set at ERROR; change that to INFO and you should be able to see the log statements that were logged at the INFO level. This was the log level at which we logged the statement in our Java Servlet (Cron Job). Shown below is a screen shot of the log when the Cron Job was fired once.

ep9-3
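The severity filter in the console only chooses among entries that were actually written. Whether an INFO statement is written at all depends on the level configured for the logger in your application. The following is a minimal sketch of that behaviour using plain java.util.logging, outside App Engine; the class LogLevelDemo and the logger name are made up for this illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.Level;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

/** Hypothetical demo class; shows which messages pass a given logger level. */
final class LogLevelDemo {

    private LogLevelDemo() { }

    /** Logs three messages at different levels and returns those that passed the threshold. */
    static List<String> messagesLoggedAt(Level threshold) {
        final List<String> seen = new ArrayList<>();
        Logger logger = Logger.getLogger("com.gaejexperiments.cron.demo");
        logger.setUseParentHandlers(false); // keep the demo output off the console
        logger.setLevel(threshold);

        // A handler that simply remembers every record it receives.
        Handler collector = new Handler() {
            @Override public void publish(LogRecord record) {
                seen.add(record.getLevel() + ": " + record.getMessage());
            }
            @Override public void flush() { }
            @Override public void close() { }
        };
        logger.addHandler(collector);
        try {
            logger.fine("Debug detail");
            logger.info("Cron Job has been executed");
            logger.severe("Cron Job failed");
        } finally {
            logger.removeHandler(collector);
        }
        return seen;
    }

    public static void main(String[] args) {
        // prints [INFO: Cron Job has been executed, SEVERE: Cron Job failed]
        System.out.println(messagesLoggedAt(Level.INFO));
        // prints [SEVERE: Cron Job failed]
        System.out.println(messagesLoggedAt(Level.SEVERE));
    }
}
```

If the INFO statement from the Cron Job never shows up in the Logs page, the logger level configured for your application is therefore one of the first things to check.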

Conclusion

This concludes Episode 9 of this series, in which you learned how to schedule tasks in your Google App Engine applications. Background tasks that are scheduled for a certain time and executed by the Cron Service are an indispensable part of many web applications deployed today. If you ever need to run repeated tasks in your application without any user intervention, such as sending emails, crawling web sites or taking database backups, then writing a Cron Job and scheduling it for execution is a key feature that you can use when deploying your application to Google App Engine.

There is a lot more to Cron Jobs, and I suggest that you read the documentation.

Till the next episode, Happy Scheduling!


 

15 thoughts on “Episode 9: Using the Cron Service to run scheduled tasks”

    1. Thanks for the feedback. The Experimental Task Queue support Episode is planned in the coming week.

      Do let me know if there are specific episodes that you would like to see covered. I have several lined up but if it is a different topic but related in some way to GAEJ, I would learn and cover it for sure.

      Romin.

  1. Thanks Romin,
    Could you please confirm if 30 seconds time limit applies to cron job as well?
    We were evaluating GAE for our business App, and disappointed with this limit, as we have some long running algorithms.

    1. Yes. The 30-second limit applies to all aspects (Jobs, Requests, Task Queues) as far as I know. Refer to http://code.google.com/appengine/docs/java/taskqueue/overview.html#Task_Execution for the 30-second limit. You can look at a recent article written by Nick Johnson: http://blog.notdot.net/2010/03/Handling-downtime-The-capabilities-API-and-testing in which some techniques to overcome these limitations are discussed. But it is still something that you as a developer will need to handle.

      A general pattern for long running tasks would be :
      1. Start a task
      2. If the limit is reached ..catch that exception, save your work and
      Launch a task again that takes off from where you were last time, and so on .

      Thanks
      Romin

  2. Hello Sir,

    Thank you for your seminar at GDG Ahemadabad. I learned so many new things there. I am now working on your example 6, the cron job for the latest Twitter tweet.

    With it, I am getting the same tweet again and again. I am not getting the latest tweets. I have already checked all the admin configuration as shown on this page. It is the same as you have shown in the images and mentioned in the text.

    I also add your id (romin.k.irani@gmail.com) so you can get my problem easily.

    My application on “bonlinesolution@appspot.com”. I already tried for recreating the application & also tried by deployed it by different name.

    1. It is a bit difficult for me to address your issue without looking at your code. Could you please check the code in the Cron Servlet where you search for the Tweet. If you can even send me the Java file for that, I could help debug the issue.

  3. Hi, this is a really nice one. I built this application by following the steps and deployed it on App Engine. In the Cron Jobs tab, the job shows a success time every 2 minutes, but I am facing a problem in the log: “Cron Job has been executed” is not displayed. Instead, the log displays:

    2015-04-04 16:31:32.032 /cron/gaejcronjob 200 2200ms 0kb AppEngine-Google; (+http://code.google.com/appengine) module=default version=1
    0.1.0.1 – – [04/Apr/2015:04:01:32 -0700] “GET /cron/gaejcronjob HTTP/1.1” 200 55 – “AppEngine-Google; (+http://code.google.com/appengine)” “evacronjob.appspot.com” ms=2200 cpu_ms=2318 cpm_usd=0.000006 queue_name=__cron task_name=6ddf816ea374be943f2f1ae7cf29d2d4 loading_request=1 app_engine_release=1.9.18 instance=00c61b117c560efc3d12aed621dc1ec80059a5
    I 2015-04-04 16:31:32.032
    This request caused a new process to be started for your application, and thus caused your application code to be loaded for the first time. This request may thus take longer and use more CPU than a typical request for your application.

    Could you please help me understand why this information appears and why “Cron Job has been executed” is not displayed in the log? Why is this application not working properly? Please reply, I am waiting.

    thanks in advance.

    1. A couple of things that you can try out:
      1) Check your logging properties file and see if the log level is set to a level that is lower or equal to the one that you are using when logging the statement.
      2) Put in a try / catch block and log the error statement. Maybe something is going wrong in the code.
