Parameterizable jobs

To make part of a job parameterizable, you can define job variables. Currently this is only supported by editing the .analysis.xml file by hand, since the DataCleaner graphical user interface does not store job variables when saving jobs.

In the source section of your job, you can add variables: key/value pairs that can be referenced throughout the job. Each variable can have a default value, which is used when no value is specified at execution time. Here's a simple example:

			...

			<source>
			  <data-context ref="my_datastore" />
			  <columns>
			    <column path="column1" id="col_1" />
			    <column path="column2" id="col_2" />
			  </columns>
			  <variables>
			    <variable id="filename" value="/output/dc_output.csv" />
			    <variable id="separator" value="," />
			  </variables>
			</source>

			...
		

In the example we've defined two variables: filename and separator. We can refer to these for specific property values further down in our job:

			...

			<analyzer>
			  <descriptor ref="Write to CSV file"/>
			  <properties>
			    <property name="File" ref="filename" />
			    <property name="Quote char" value="&quot;" />
			    <property name="Separator char" ref="separator" />
			  </properties>
			  <input ref="col_1" />
			  <input ref="col_2" />
			</analyzer>

			...
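
Because variables are edited by hand, a misspelled ref is easy to introduce. The following is a minimal Python sketch, not part of DataCleaner, that lists the declared variables with their defaults and reports any property whose ref matches no variable id. It assumes that a ref attribute on a property always names a variable. The function names, the bare job root element and the "Hypothetical property" entry are illustrative assumptions; a real job file has more structure and may declare an XML namespace, which the sketch ignores by comparing local element names only.

```python
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip any '{namespace}' prefix so the lookup works with or without xmlns."""
    return tag.rsplit("}", 1)[-1]


def check_variable_refs(xml_text):
    """Return (defaults, unresolved) for a job document given as a string.

    defaults   maps each declared variable id to its default value (or None).
    unresolved lists (property name, ref) pairs whose ref matches no variable id.
    """
    root = ET.fromstring(xml_text)
    defaults = {}
    refs = []
    for elem in root.iter():
        name = local_name(elem.tag)
        if name == "variable":
            defaults[elem.get("id")] = elem.get("value")
        elif name == "property" and elem.get("ref") is not None:
            refs.append((elem.get("name"), elem.get("ref")))
    unresolved = [(prop, ref) for prop, ref in refs if ref not in defaults]
    return defaults, unresolved


# Hypothetical, trimmed-down job used only for illustration.
SAMPLE_JOB = """
<job>
  <source>
    <variables>
      <variable id="filename" value="/output/dc_output.csv" />
      <variable id="separator" value="," />
    </variables>
  </source>
  <analyzer>
    <properties>
      <property name="File" ref="filename" />
      <property name="Quote char" value="&quot;" />
      <property name="Separator char" ref="separator" />
      <property name="Hypothetical property" ref="missing_variable" />
    </properties>
  </analyzer>
</job>
"""

if __name__ == "__main__":
    defaults, unresolved = check_variable_refs(SAMPLE_JOB)
    print("declared variables:", defaults)
    print("unresolved refs:", unresolved)
```

Run against the sample, the check reports only the deliberately broken reference; the File and Separator char properties resolve to the two declared variables.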
		

Now the File and Separator char properties of the Write to CSV file analyzer have been made parameterizable. To execute the job with new variable values, pass -var parameters on the command line, like this:

			DataCleaner-console.exe -job my_job.analysis.xml -var filename=/output/my_file.csv -var separator=;
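
When the job is launched from a script rather than typed by hand, the same command can be assembled programmatically. The sketch below is a hedged Python illustration that uses only the -job and -var options shown above; the helper names build_command and run_job are assumptions made for this example, and the executable is assumed to be on the PATH.

```python
import subprocess


def build_command(job_file, variables, executable="DataCleaner-console.exe"):
    """Build the argument list: -job <file>, then one -var key=value per variable."""
    command = [executable, "-job", job_file]
    for key, value in variables.items():
        command += ["-var", f"{key}={value}"]
    return command


def run_job(job_file, variables):
    """Run the job; raises CalledProcessError on a non-zero exit code.

    Passing a list (without shell=True) hands each argument to the program
    as-is, so a value such as ';' is not interpreted by a shell.
    """
    return subprocess.run(build_command(job_file, variables), check=True)


if __name__ == "__main__":
    # Only prints the command; call run_job(...) where DataCleaner is installed.
    print(build_command(
        "my_job.analysis.xml",
        {"filename": "/output/my_file.csv", "separator": ";"},
    ))
```

Building the arguments as a list sidesteps quoting problems with values like ; that many shells treat as special characters.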