Clusters
These articles can help you manage your Apache Spark clusters.
- Enable OpenJSSE and TLS 1.3
- How to calculate the number of cores in a cluster
- Install a private PyPI repo
- IP access list update returns `INVALID_STATE`
- Launch fails with `Client.InternalError`
- Cannot apply updated cluster policy
- Cluster Apache Spark configuration not applied
- Cluster failed to launch
- Custom Docker image requires root
- Custom garbage collection prevents cluster launch
- Job fails due to cluster manager core instance request limit
- Admin user cannot restart cluster to run job
- Cluster fails to start with `dummy does not exist` error
- Cluster slowdown due to Ganglia metrics filling root partition
- Failed to create cluster with invalid tag value
- Failed to expand the EBS volume
- EBS leaked volumes
- Log delivery fails with `AssumeRole`
- Multi-part upload failure
- Persist Apache Spark CSV metrics to a DBFS location
- Replay Apache Spark events in a cluster
- S3 connection fails with `No role specified and no roles available`
- Set Apache Hadoop `core-site.xml` properties
- Set executor log level
- Set `instance_profile_arn` as optional with a cluster policy
- Apache Spark job doesn’t start
- Auto termination is disabled when starting a job cluster
- Unexpected cluster termination
- How to configure single-core executors to run JNI libraries
- How to overwrite `log4j` configurations on Databricks clusters
- Adding a configuration setting overwrites all default `spark.executor.extraJavaOptions` settings
- Apache Spark executor memory allocation
- Apache Spark UI shows less than total node memory
- Configure a cluster to use a custom NTP server
- Enable GCM cipher suites
- Enable retries in init script
- Validate environment variable behavior