Julia for Data Science

Facilitating language interoperability

Although Julia can be used to write most kinds of code, there are mature libraries for numerical and scientific computing that we would like to exploit. These libraries may be written in C, Fortran, or Python. Julia makes it easy to reuse this existing code: functions written in C, Fortran, or Python can be called from Julia simply and efficiently.

The C/Fortran libraries must be available to Julia as shared libraries. Julia then calls into this code with an ordinary ccall. Julia's JIT generates the same machine instructions as a native C call would, so the overhead is essentially the same as calling the library function from C code.
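As a minimal sketch of the ccall mechanism described above, the following calls strlen from the C standard library. The wrapper name c_strlen is hypothetical, and the example assumes the C library is already loaded into the Julia process (as it is on typical Unix-like systems), so the bare symbol :strlen resolves without naming a library. A function from another shared library would instead be named with a (symbol, library) tuple.

```julia
# Hypothetical wrapper around the C function: size_t strlen(const char *s)
# Assumption: libc is already loaded, so the bare symbol :strlen resolves.
function c_strlen(s::AbstractString)
    # ccall(function, return type, tuple of argument types, arguments...)
    n = ccall(:strlen, Csize_t, (Cstring,), s)
    return Int(n)  # convert the C size_t into a Julia Int
end

len = c_strlen("Julia")
println(len)
```

The argument types are given as a tuple, which is why a single argument is written as (Cstring,) with a trailing comma.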

Importing Python code can be beneficial, and is sometimes necessary, especially for data science, because Python already has an extensive collection of libraries implementing machine learning and statistical functions, for example, scikit-learn and pandas. To use Python from Julia, we require PyCall.jl. To add PyCall.jl, do the following:

using Pkg  # needed on Julia 0.7 and later, where Pkg is a standard library
Pkg.add("PyCall")

PyCall contains a macro @pyimport that facilitates importing Python packages and provides Julia wrappers for all of the functions and constants therein, including automatic conversion of types between Julia and Python.
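A short sketch of the macro in use, assuming PyCall is installed and built against a working Python. It imports Python's standard math module, so no third-party Python package is needed. In newer PyCall releases the function form pyimport is generally preferred over the macro, so treat this as an illustration of the interface described here rather than the recommended style.

```julia
using PyCall

# Wrap Python's standard `math` module so it can be used like a Julia module.
@pyimport math

# Python floats are converted to Julia Float64 values automatically.
py_value = math.sin(math.pi / 4)
jl_value = sin(pi / 4)

println(py_value - jl_value)
```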

PyCall also provides facilities for lower-level manipulation of Python objects, including a PyObject type for opaque Python objects. It also has a pycall function (similar to Julia's ccall function), which can be used in Julia to call Python functions with type conversions. PyCall does not use the python program itself but links directly to the libpython library. During Pkg.build, it finds the location of libpython by running python.
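To see which Python the build step picked up, a sketch along the following lines may help. It assumes a PyCall release that exposes the PyCall.libpython and PyCall.pyversion constants recorded at build time; if your version does not define them, this example does not apply.

```julia
using PyCall

# Assumption: these constants are written by PyCall's build step.
println("libpython:      ", PyCall.libpython)   # library PyCall links against
println("Python version: ", PyCall.pyversion)   # version of that Python

# To point PyCall at a different Python, the documented route is to set
# ENV["PYTHON"] to the desired executable and re-run Pkg.build("PyCall").
```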

Calling Python code in Julia

The @pyimport macro automatically makes the appropriate type conversions to Julia types in most scenarios, based on a runtime inspection of the Python objects. Finer control over these type conversions is available through lower-level functions. Using pycall in scenarios where the return type is known can help improve performance, both by eliminating the overhead of runtime type inference and by providing more type information to the Julia compiler:

  • pycall(function::PyObject, returntype::Type, args...): This calls the given Python function (typically looked up from a module) with the given args... (of standard Julia types which are converted automatically to the corresponding Python types if possible), converting the return value to returntype (use a returntype of PyObject to return the unconverted Python object reference, or PyAny to request an automated conversion).
  • pyimport(s): This imports the Python module s (a string or symbol) and returns a pointer to it (a PyObject). Functions or other symbols in the module may then be looked up by s[name], where name is a string (for the raw PyObject) or a symbol (for automatic type conversion). Unlike the @pyimport macro, this does not define a Julia module, and in the PyCall version described here members cannot be accessed as s.name (newer PyCall releases do accept s.name on the returned PyObject).
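The two functions above can be combined as in the following sketch, which again uses Python's standard math module. It assumes a recent PyCall, where a module member can be looked up as math.sqrt; with older versions, the s[name] lookup described above would be used in its place.

```julia
using PyCall

# pyimport returns a PyObject that refers to the Python module.
math = pyimport("math")

# Assumption: recent PyCall, where `math.sqrt` returns the Python function.
py_sqrt = math.sqrt

# Known return type: convert the result directly to Float64.
root = pycall(py_sqrt, Float64, 16.0)

# PyObject as the return type keeps the unconverted Python reference.
raw = pycall(py_sqrt, PyObject, 16.0)

# PyAny requests automated conversion based on runtime inspection.
auto = pycall(py_sqrt, PyAny, 16.0)

println(root)
```

Passing Float64 tells the Julia compiler the type of root up front, whereas the PyAny form has to inspect the Python object at run time before deciding what to return.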