Apache Tajo – Custom Functions

Apache Tajo supports the custom / user defined functions (UDFs). The custom functions can be created in python.

The custom functions are just plain python functions with decorator “@output_type(<tajo sql datatype>)” as follows −

@ouput_type(“integer”) 
def sum_py(a, b): 
   return a + b; 

The python scripts with UDFs can be registered by adding the below configuration in “tajosite.xml”.

<property> 
   <name>tajo.function.python.code-dir</name> 
   <value>file:///path/to/script1.py,file:///path/to/script2.py</value> 
</property>

Once the scripts are registered, restart the cluster and the UDFs will be available right in the SQL query as follows −

select sum_py(10, 10) as pyfn; 

Apache Tajo supports user defined aggregate functions as well but does not support user defined window functions.

Leave a Reply