When you create a cluster, Databricks launches one executor instance per worker node, and the executor uses all of the cores on the node. In certain situations, such as if you want to run non-thread-safe JNI libraries, you might need an executor that has only one core or task slot, and does not attempt to run concurrent tasks. In this case, multiple executor instances run on a single worker node, and each executor has only one core.
If you run multiple executors, you increase the JVM overhead and decrease the overall memory available for processing.
To start single-core executors on a worker node, configure two properties:
spark.executor.cores specifies the number of cores per executor. Set this property to 1, so that each executor runs only one task at a time.
spark.executor.memory specifies the amount of memory to allot to each executor. To determine this amount, check the total amount of memory that is available on the worker node.
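Under these assumptions, the two properties might be entered in the cluster's Spark config as plain key-value pairs; the memory value shown here is illustrative and depends on your instance type:

```
spark.executor.cores 1
spark.executor.memory 6g
```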
For example, an i3.xlarge node has 30.5 GB of memory, of which 24.9 GB is shown as available. Choose a value that fits within the available memory when multiplied by the number of executors, allowing some room for overhead. For example, set spark.executor.memory to 6g: the i3.xlarge instance type has 4 cores, so 4 executors are created on the node, each with 6 GB of memory (4 × 6 GB = 24 GB, which fits within the 24.9 GB available).
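The sizing arithmetic above can be sketched in a few lines of Python. The 24.9 GB available-memory figure and the 4-core count are the i3.xlarge values from the example; the helper name is hypothetical:

```python
import math

def executor_memory_gb(available_gb: float, num_cores: int) -> int:
    """Per-executor memory for single-core executors: divide the node's
    available memory evenly across one executor per core and round down,
    leaving the fractional remainder as overhead headroom."""
    return math.floor(available_gb / num_cores)

# i3.xlarge example: 24.9 GB available, 4 cores -> 4 single-core executors
mem = executor_memory_gb(24.9, 4)
print(f"spark.executor.memory {mem}g")  # prints "spark.executor.memory 6g"
```

Rounding down rather than up is the safer choice here: oversubscribing memory would prevent some executors from launching on the node.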