Automating the ingestion and processing of data
Overview
Graft lets you iterate on your workflow by importing data and by training and applying models. Once you are satisfied with the results, you will want to automate the workflow so that you get updated results on a regular basis and can continue to improve your model(s). A scheduled workflow runs automatically on a cadence that you specify.
In the Entity Dashboard, Graft displays workflow options as tabs. The workflows under the Run Workflows tab are generated from the configuration and processing status of the data sources and enrichments, and each can be triggered manually by clicking its Play icon. To view or create a scheduled workflow, click the Schedule Workflows tab, which displays any currently configured workflows.
Create a Schedule
- Click on SCHEDULE WORKFLOWS to bring up the workflow configuration window.
By default, no recurring workflow is selected.
- Select your workflow option:
  - Ingest data from the data sources associated with the current Entity
  - Ingest and apply one or more trunk models to the new data
  - Ingest, apply model(s), and also apply one or more enrichments
- Specify the Data Ingestion Frequency using the number and units drop-downs. For example, you could ingest data every 2 days.
- Click OK to save the workflow, or CANCEL to return to the Workflow tab.
The workflow will show each job and its most recent status, which in this case was Completed.
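To make the cadence setting concrete, here is a minimal sketch of how a number-and-units frequency such as "every 2 days" translates into run times. It is an illustration only, not Graft's scheduler: the function `next_runs`, the unit names, and the start time are all assumptions made for this example.

```python
from datetime import datetime, timedelta

# Hypothetical unit names; the units actually offered in the drop-down may differ.
UNIT_TO_KWARG = {"hours": "hours", "days": "days", "weeks": "weeks"}


def next_runs(start, number, unit, count=3):
    """Return the next `count` run times for a cadence of `number` `unit`s."""
    if number <= 0:
        raise ValueError("number must be positive")
    step = timedelta(**{UNIT_TO_KWARG[unit]: number})
    return [start + step * i for i in range(1, count + 1)]


# "Ingest data every 2 days", starting from an assumed first run time.
runs = next_runs(datetime(2024, 1, 1, 9, 0), 2, "days")
for run in runs:
    print(run.isoformat())
# 2024-01-03T09:00:00
# 2024-01-05T09:00:00
# 2024-01-07T09:00:00
```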
AVAILABLE WORKFLOWS
Only workflow sequences that are currently possible can be selected; any workflow that is not available is greyed out.
INCOMPLETE WORKFLOWS
If a workflow is still processing when it is scheduled to run again, the new run starts only those jobs that completed in the previous run. The "In process" job is not run again. On the subsequent run, all jobs are run, provided none are still running, which catches up on all outstanding processing. If you would rather not wait until the next scheduled run, you can run the job manually from the Jobs tab or from the Entity Dashboard. For example, if an hourly workflow that ingests, applies a trunk model, and enriches is still running (the enrichment has not completed), the next run will only ingest and apply the trunk model to the new data.
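The skip-and-catch-up behavior described above can be sketched in a few lines. This is a simplified model written for illustration, not Graft's implementation: the job names and the `jobs_to_run` helper are hypothetical.

```python
# Ordered job names are illustrative, not Graft identifiers.
WORKFLOW = ["ingest", "apply_trunk_model", "enrich"]


def jobs_to_run(workflow, in_process):
    """Return the jobs a scheduled run starts, skipping any still in process."""
    return [job for job in workflow if job not in in_process]


# Hourly run N: the enrichment from run N-1 has not finished yet.
overlapping_run = jobs_to_run(WORKFLOW, in_process={"enrich"})
print(overlapping_run)  # ['ingest', 'apply_trunk_model']

# Hourly run N+1: nothing is running, so every job is started and catches up.
catch_up_run = jobs_to_run(WORKFLOW, in_process=set())
print(catch_up_run)  # ['ingest', 'apply_trunk_model', 'enrich']
```

In this model, the skipped enrichment is simply picked up on the next run in which no job is still in process, which mirrors the hourly example above.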