Have you ever needed to schedule a particular database query to be executed once per day? Trivial enough, depending on the environment, right? What about retrieving XML data from a particular web service, on a scheduled basis? Again, depending on your environment this can be a very simple procedure. Now, what about executing several complex database queries per day that somehow require an ever-changing list of XML data to be retrieved from multiple web services, and then have this XML data loaded back into your database? Now do a hundred minor variations on this, without ever writing a custom application. Interested yet? That's the type of task automation that the Calico system is designed to accommodate. Don't get me wrong, it can handle the earlier examples just fine, too - but so can a dozen other solutions. But when you really need to do some serious automation and scheduling, you may find yourself feeling somewhat limited by cron, or DTS (a.k.a. Integration Services in Microsoft SQL Server 2005) [particularly the latter if you don't use Microsoft products], or other customized applications. You won't be limited by the Calico framework.
A few definitions before we begin.
System:
An organizational unit for tracking tasks, jobs and users. The only information needed for a system is a name, and an email address/alias to use when sending notification emails pertaining to this system.
System Variable:
System variables can be thought of as constants (e.g. database connection strings, file paths, email addresses) that can be defined on a per-system basis. These variables can then be referenced by name in task resources or jobs (e.g. you could reference "CALICO_DB_CONNECTION_STRING" instead of "mysql://some_user:somepassword@somehost.thecalico.com/calico").
Task Class:
Any object/class (derived from a base "Task" class) that performs a particular task. The purpose of a task class may range from making a request of a web page, to moving files around a file system, to executing database queries, and so on. In some sense, a task class represents a basic implementation of a design pattern that has some relevance to you. The task classes that are included in the base distribution are:
Task Instance (a.k.a. Task):
A task instance is a set of configuration information that matches a task class (e.g. "WebRequest") with a particular system to execute in, a name to reference this task instance as (e.g. "Utilities - Simple Web Request"), and optionally a set of resources to use when executing this particular instance of the task. Tasks can also contain information about which host servers may execute them.
Task Resource:
A task resource is information that a task instance can use at run-time, without requiring that the information be passed through as a job detail. Sounds confusing, I know. In practice, resources are generally used to define database queries that you'd rather not have hard-coded into your task classes. This allows for the creation of a single class (i.e. one PHP file) for executing database queries that can be used over and over again for many different queries - each instance of the task simply specifies different query resources. [A few years ago I wrote a short article describing the basic syntax used to describe queries in this manner, albeit in C#. You can view that article here.] Another possible use of resources is to override incoming parameters from a job (say, to "hard-code" a task to only operate in a particular directory) - this is a bit of an advanced topic, though.
Host Server:
A host server is an instance of the server application that actually executes jobs. Generally this corresponds to one single machine, but it is quite possible to run several instances of the server application on a single host. Each host can be configured with information about its name, what groups it should be included in (for controlling execution of tasks), and control bits (e.g. pause, stop, restart).
Job:
A job is a set of information including a task to execute, a system to execute in, parameters/details to control the task execution, and scheduling information (i.e. when to execute the task, what to do if it succeeds or fails, optional assignment to a particular host server, etc.). As a job is executed, additional information is added such as its host server, start/finish times, and logged messages.
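To make the idea of system variables concrete, here is a minimal sketch of how a name such as "CALICO_DB_CONNECTION_STRING" might be swapped for its value before a task resource or job is used. The `{{NAME}}` placeholder syntax, the `resolve_variables` function and the `REPORT_DIR` variable are assumptions made purely for illustration; the actual reference syntax used by Calico is not described here.

```python
import re

# Hypothetical system variables for one system (names and values are illustrative).
SYSTEM_VARIABLES = {
    "CALICO_DB_CONNECTION_STRING": "mysql://some_user:somepassword@somehost.thecalico.com/calico",
    "REPORT_DIR": "/var/reports",
}

# Assumed placeholder syntax: {{NAME}}. This is not Calico's documented syntax.
_PLACEHOLDER = re.compile(r"\{\{([A-Z0-9_]+)\}\}")


def resolve_variables(text, variables):
    """Replace each {{NAME}} placeholder with its value; fail loudly on unknown names."""

    def substitute(match):
        name = match.group(1)
        if name not in variables:
            raise KeyError(f"undefined system variable: {name}")
        return variables[name]

    return _PLACEHOLDER.sub(substitute, text)


resolved = resolve_variables("connect={{CALICO_DB_CONNECTION_STRING}}", SYSTEM_VARIABLES)
```

Failing on an unknown name, rather than leaving the placeholder in place, keeps a misspelled variable from silently reaching a database or file system call.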
The basic operation of the Calico system involves a central scheduling/controller database and one or more host servers. Jobs are scheduled for execution in any variety of ways: manually through the scheduling database or a user interface to the database, programmatically from other applications, or as conditional branches from other jobs.
Scheduling a job requires the insertion of only one record into a single table, rather than adherence to a complex API. Because of this, any application (no matter the programming language, architecture, deployment platform, etc.) can easily take advantage of job scheduling. Likewise, it's exceedingly simple to encapsulate this scheduling functionality in simple-to-use wrapper classes.
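As a rough illustration of the "one record in one table" idea, the sketch below schedules a job with a single INSERT. It uses an in-memory SQLite database so that it runs anywhere; the `jobs` table, its column names and the `schedule_job` wrapper are all hypothetical and are not the actual Calico schema.

```python
import sqlite3
from datetime import datetime, timezone


def create_schema(conn):
    # Hypothetical, simplified jobs table (not the real Calico schema).
    conn.execute(
        """
        CREATE TABLE jobs (
            job_id      INTEGER PRIMARY KEY AUTOINCREMENT,
            system_name TEXT NOT NULL,
            task_name   TEXT NOT NULL,
            details     TEXT NOT NULL DEFAULT '',
            due_at      TEXT NOT NULL,
            status      TEXT NOT NULL DEFAULT 'pending'
        )
        """
    )


def schedule_job(conn, system_name, task_name, details, due_at):
    """Schedule a job by inserting exactly one row; return the new job id."""
    cur = conn.execute(
        "INSERT INTO jobs (system_name, task_name, details, due_at) VALUES (?, ?, ?, ?)",
        (system_name, task_name, details, due_at.isoformat()),
    )
    conn.commit()
    return cur.lastrowid


conn = sqlite3.connect(":memory:")
create_schema(conn)
job_id = schedule_job(
    conn,
    "Utilities",
    "Utilities - Simple Web Request",
    "<strEmailNotifyFailure>someone@thecalico.com</strEmailNotifyFailure>",
    datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc),
)
```

Any language with a database driver could issue the same INSERT, which is the point of the design: the table is the interface.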
Host servers then periodically poll the central database for jobs that are due for execution (and that meet all their dependency requirements and are marked as allowed to execute on the given host). When a suitable job is located, a host server will "claim" the job and carry out its execution as requested.
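The poll-and-claim step can be sketched as follows, again against a hypothetical `jobs` table in SQLite. The claim is a conditional UPDATE that only succeeds while the job is still pending, so two hosts polling at the same moment cannot both take the same job. Dependency checks and per-host restrictions are left out to keep the sketch short, and none of the names below come from Calico itself.

```python
import sqlite3


def claim_next_job(conn, host_name, now):
    """Claim one due, unclaimed job for host_name; return its id, or None."""
    row = conn.execute(
        "SELECT job_id FROM jobs WHERE status = 'pending' AND due_at <= ? "
        "ORDER BY due_at, job_id LIMIT 1",
        (now,),
    ).fetchone()
    if row is None:
        return None  # nothing is due yet
    # Conditional update: affects zero rows if another host claimed the job first.
    cur = conn.execute(
        "UPDATE jobs SET status = 'running', host = ? "
        "WHERE job_id = ? AND status = 'pending'",
        (host_name, row[0]),
    )
    conn.commit()
    return row[0] if cur.rowcount == 1 else None


# Hypothetical table and sample rows; timestamps are ISO strings so they sort as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE jobs (job_id INTEGER PRIMARY KEY, due_at TEXT, "
    "status TEXT DEFAULT 'pending', host TEXT)"
)
conn.executemany(
    "INSERT INTO jobs (job_id, due_at) VALUES (?, ?)",
    [(1, "2024-01-01T06:00:00"), (2, "2024-01-02T06:00:00")],
)
conn.commit()
```

A host server would call `claim_next_job` on a timer and execute whatever it gets back; a return value of `None` simply means "sleep and poll again".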
As an example of some of the benefits to using the Calico task scheduling and automation system, consider the following:
Central Management
All messages that are logged during the execution of a job are also stored with that job record in the central scheduling database. This means that you never need to go sifting through log files or event logs on the different host servers to find errors. You can find the status and messages for any job in one place - the central scheduling database.
Historical Tracking
Every job that gets executed is stamped with a start and finish time, as well as information about what host server the job was executed on. Immediately then, you gain the benefit of being able to track historical job volume and performance.
Self Documenting
All task classes and tasks (task instances) include fields in the database for entering descriptions and comments. Once entered, these comments can be parsed and used in other applications (such as to help prompt for job parameters/details when manually scheduling a job, as in the demonstration). Coupled with the transparent nature of the task resources (which, again, will generally be used to store query definitions), this helps make the entire system self documenting. To demonstrate this, suppose you were planning a substantial schema change to one of your main databases and wanted to know which tasks would be affected by this change: simply query the task resources for any query definitions that reference the schema objects in question. No guesswork required.
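The schema-change question from the self-documenting example could be answered with a query along these lines. The `task_resources` table, its columns and the sample rows are invented for illustration, and a plain substring match like this one can over-match (searching for "orders" would also hit "reorders"), so treat the result as a list of candidates to review.

```python
import sqlite3

# Hypothetical task_resources table holding query definitions as text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE task_resources (task_name TEXT, resource_name TEXT, resource_value TEXT)"
)
conn.executemany(
    "INSERT INTO task_resources VALUES (?, ?, ?)",
    [
        ("Reports - Daily Orders", "query", "SELECT id, total FROM orders WHERE created >= :since"),
        ("Cleanup - Old Sessions", "query", "DELETE FROM sessions WHERE expires < :now"),
    ],
)
conn.commit()


def tasks_referencing(conn, schema_object):
    """Return the names of tasks whose resources mention schema_object (substring match)."""
    rows = conn.execute(
        "SELECT DISTINCT task_name FROM task_resources "
        "WHERE resource_value LIKE ? ORDER BY task_name",
        (f"%{schema_object}%",),
    )
    return [row[0] for row in rows]


affected = tasks_referencing(conn, "orders")
```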
As you start to build a repertoire of often-used tasks you'll likely find that there is a need for some sensitive information to be contained in the central scheduling database (often in the form of system variables [remember, these are like constants in a software application]). Additionally, if you use a web-based user interface to administer the job scheduling database (or said database is in any way accessible to the public) you may be concerned with preventing malicious users from viewing some of this sensitive information, and/or taking control of your host servers. To that end it is possible to fully encrypt all system variables and jobs (including the job details/parameters as well as any logged messages during the execution of the job). The integrated security system prevents improper tampering with encrypted jobs and restricts the availability of sensitive information to just those users trusted with the proper encryption/decryption keys.
The simplest installation of a Calico automation system would include one machine: that machine would include the scheduling database, run an instance of the host server application, and optionally provide for a user interface to the scheduling/control system. Obviously such an install would not be acceptable for a large project where single points of failure are unacceptable. Luckily, a given deployment need not have these limitations. Any number of host servers can be configured to monitor a deployment, and they can be configured to allow or restrict execution of various tasks as necessary in your environment. Fail-over redundancy can be as simple as running a second instance of the host server application on the same machine, or launching instance(s) on other machine(s). And while it's true that only a single instance of a job scheduling/control database is supported, database redundancy/fail-over is one of the most widely documented- and provided-for scenarios that exist for servers today. Any combination of replication and database clusters can be employed to reduce down-time for a database - I'll leave those solutions to the subject matter experts in the area.
The basic premise of scheduling is the "due date". A job will not be executed until its scheduled/requested date/time of execution has arrived. Simple enough. But for complex procedures this is by no means enough control for proper functioning. So additionally, jobs may be grouped in "batches". The jobs in a batch have some dependency on each other. A dependency is generally of the nature "Don't execute job Y until X has finished successfully" or "Don't execute job Y while X is running".
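The two kinds of dependency, combined with the due date, amount to a small eligibility check. The sketch below models that check in memory; the `Job` class, its field names and the status values are assumptions made for the example rather than anything taken from the Calico schema.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Job:
    name: str
    due_at: datetime
    status: str = "pending"  # assumed states: pending, running, succeeded, failed
    run_after_success: list = field(default_factory=list)  # jobs that must have succeeded
    not_while_running: list = field(default_factory=list)  # jobs that must not be running


def is_eligible(job, jobs_by_name, now):
    """True if the job is due and all of its batch dependencies are satisfied."""
    if job.status != "pending" or now < job.due_at:
        return False
    # "Don't execute job Y until X has finished successfully."
    if any(jobs_by_name[n].status != "succeeded" for n in job.run_after_success):
        return False
    # "Don't execute job Y while X is running."
    if any(jobs_by_name[n].status == "running" for n in job.not_while_running):
        return False
    return True


due = datetime(2024, 1, 1, 6, 0)
x = Job("X", due, status="running")
y = Job("Y", due, run_after_success=["X"])
batch = {"X": x, "Y": y}
```

With X still running, `is_eligible(y, batch, due)` is false; once X's status becomes "succeeded", the same call returns true.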
While job batching is suitable in a number of situations, particularly well-defined and rigid procedures, even this is not control enough for very complex processes. Sometimes decisions need to be made at run-time, decisions that cannot possibly be planned out fully in advance. For this, the Calico system employs a concept known as "bootstrapping". A bootstrap, put simply, is any process that starts some other process. Jobs in the Calico system can bootstrap (i.e. schedule for execution) other jobs when they finish successfully (or even when they fail). Flexible logic, in the form of XSL files that are dynamically applied to the job details and logged messages, allows for sophisticated decision making at run-time. For example: you schedule a job to retrieve a particular XML file from an FTP server every day - bootstrapping logic for your job states that if a file is located, then another job will be scheduled to do an XSL transformation on that XML to turn it into a CSV file; if no file is downloaded, no subsequent transformation job is ever scheduled.
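The FTP example can be mimicked in a few lines. Note the substitution: Calico expresses this logic as XSL files, whereas the sketch below uses an ordinary Python function as a stand-in, because the Python standard library has no XSLT processor. The XML element names (`status`, `downloadedFile`, `strSourceFile`) and the follow-up task name are hypothetical.

```python
import xml.etree.ElementTree as ET


def bootstrap_jobs(result_xml):
    """Decide which follow-up jobs to schedule, based on a finished job's outcome."""
    root = ET.fromstring(result_xml)
    follow_ups = []
    if root.findtext("status") == "succeeded":
        path = root.findtext("downloadedFile")
        if path:  # only bootstrap the transformation when a file was actually retrieved
            follow_ups.append(
                {
                    "task": "XSL Transform to CSV",
                    "details": f"<strSourceFile>{path}</strSourceFile>",
                }
            )
    return follow_ups


with_file = bootstrap_jobs(
    "<job><status>succeeded</status><downloadedFile>/data/feed.xml</downloadedFile></job>"
)
without_file = bootstrap_jobs("<job><status>succeeded</status></job>")
```

Each returned entry would then be scheduled like any other job, so the chain of work is decided only after the first job has run.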
The job parameters/details are a text string using a simple XML syntax. For example, a detail string might consist of "<strEmailNotifyFailure>someone@thecalico.com</strEmailNotifyFailure>", which would indicate that if a failure email message is generated during the execution of the job, it should be addressed to "someone@thecalico.com". This allows for a completely flexible, yet uniform, method of passing information from jobs into the tasks that execute them. The task framework, in turn, makes it very simple to extract this information and use it properly within actual task classes.
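Extracting values from such a detail string takes only a few lines. This is a sketch of the idea rather than the task framework's own code: it assumes the details are a flat sequence of elements, wraps them in a made-up `<details>` root so that a standard XML parser will accept them, and uses a hypothetical `parse_details` helper name. If a tag appears more than once, the last value wins.

```python
import xml.etree.ElementTree as ET


def parse_details(details):
    """Turn a flat job detail string into a dict mapping tag name to text."""
    # The wrapper element is an assumption: a bare run of sibling elements is not
    # a well-formed XML document on its own.
    root = ET.fromstring(f"<details>{details}</details>")
    return {child.tag: (child.text or "") for child in root}


params = parse_details(
    "<strEmailNotifyFailure>someone@thecalico.com</strEmailNotifyFailure>"
)
```

A task class could then read `params["strEmailNotifyFailure"]` to decide where a failure notification should go.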