Monday, 14 September 2015

Apache Falcon – Scheduling First Process

In my last post I described how to define our first Apache Falcon process. In this post I will describe how to schedule (execute) that process.

As a first step towards scheduling this process, we need to submit our cluster to Falcon using the command below:

$ falcon entity -type cluster -submit -file test-primary-cluster.xml
falcon/default/Submit successful (cluster) test-primary-cluster

We can list all the clusters registered with Falcon using the command below:

$ falcon entity -type cluster -list
(CLUSTER) test-primary-cluster

After the cluster is submitted, we need to submit our feed and then our process:

$ falcon entity -type feed -submit -file feed-01-trigger.xml
falcon/default/Submit successful (feed) feed-01-trigger

$ falcon entity -type process -submit -file process-01.xml
falcon/default/Submit successful (process) process-01

Now we have to upload the Oozie workflow that the Falcon process refers to into HDFS:

$ hadoop fs -mkdir -p /tmp/oozie_workflow
$ hadoop fs -put workflow.xml /tmp/oozie_workflow/
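
The submission and upload steps so far can be collected into a small script. This is only a sketch: the function names `submit_all` and `upload_workflow` are made up for illustration, while the commands, file names and HDFS path are the ones used in this post. Each step stops at the first failure, because the feed refers to the cluster and the process refers to the feed.

```shell
#!/bin/sh
# Sketch only: submit_all and upload_workflow are hypothetical helper names.
# The commands, file names and HDFS path are the ones used in this post.

submit_all() {
  # Submit in dependency order: cluster, then feed, then process.
  # Return at the first failure so later entities are not submitted.
  falcon entity -type cluster -submit -file test-primary-cluster.xml || return 1
  falcon entity -type feed -submit -file feed-01-trigger.xml || return 1
  falcon entity -type process -submit -file process-01.xml || return 1
}

upload_workflow() {
  # Create the target directory, then copy the Oozie workflow into it.
  hadoop fs -mkdir -p /tmp/oozie_workflow || return 1
  hadoop fs -put workflow.xml /tmp/oozie_workflow/ || return 1
}
```

Calling `submit_all && upload_workflow` from the directory holding the XML files would run both steps in sequence.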

After submitting all the Falcon entities, we need to schedule our Falcon feed and process:

$ falcon entity -type feed -name feed-01-trigger -schedule
$ falcon entity -type process -name process-01 -schedule

Once the Falcon process is scheduled, its instances will be in the waiting state because the feed file is not yet present on HDFS, as shown in the screenshot below.

Let us create one instance of this feed file:

$ hadoop fs -mkdir -p /tmp/feed-01/2015-09-07
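
If you need more than one feed instance, the command above can be wrapped in a helper that takes the date as an argument. This is a sketch with hypothetical function names. It assumes the feed uses one directory per day under /tmp/feed-01, named yyyy-MM-dd, which is inferred from the path above rather than from the feed definition itself.

```shell
# Sketch only: both function names are hypothetical.
# Assumes one dated directory per feed instance, as in the command above.

feed_instance_path() {
  # $1: instance date in yyyy-MM-dd form, for example 2015-09-07
  printf '/tmp/feed-01/%s\n' "$1"
}

create_feed_instance() {
  # Create the HDFS directory that stands for the feed instance of date $1.
  hadoop fs -mkdir -p "$(feed_instance_path "$1")"
}
```

For example, `create_feed_instance 2015-09-08` would create the directory for the following day. Whether that triggers a process instance depends on the validity window and frequency set in the process definition.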

On refreshing the Falcon UI for this process, we can see that the first instance has been triggered and is in the running state. After some time its status changes to successful, as shown in the screenshot below.
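
The instance status can also be queried from the command line instead of the UI, using the `falcon instance` command with its `-status` option. The sketch below wraps it in a hypothetical helper. The start and end times are supplied by the caller and should fall inside the validity window of your process; no particular window is assumed here.

```shell
# Sketch only: process_instance_status is a hypothetical helper name.

process_instance_status() {
  # $1: window start, $2: window end, both in yyyy-MM-dd'T'HH:mm'Z' form,
  # for example 2015-09-07T00:00Z
  falcon instance -type process -name process-01 -status -start "$1" -end "$2"
}
```

For example, `process_instance_status 2015-09-07T00:00Z 2015-09-08T00:00Z` would report on the instances that fall within that day, assuming the process is valid for it.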

We can also check the /tmp/demo.out file to confirm that our script executed successfully.

If you want to delete the Falcon process and feed entities, execute the commands below:

$ falcon entity -type process -name process-01 -delete
$ falcon entity -type feed -name feed-01-trigger -delete
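
For repeated test runs, the two delete commands can be kept together in one helper. This is a sketch with a hypothetical function name. It keeps the same order as above, process first and feed second, on the assumption that Falcon will not delete a feed while a process still refers to it.

```shell
# Sketch only: delete_demo_entities is a hypothetical helper name.

delete_demo_entities() {
  # Delete the process before the feed it refers to; stop at the first failure.
  falcon entity -type process -name process-01 -delete || return 1
  falcon entity -type feed -name feed-01-trigger -delete || return 1
}
```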
