Install PyGraphViz
PyGraphViz Python libraries are used to plot causal inference networks. If you try to install PyGraphViz as a standard library, it fails due to dependency errors. PyGraphViz has the following dependencies: python3-dev, graphviz, libgraphviz-dev, pkg-config. Install via notebook: Install the dependencies with apt-get. %sh sudo apt-get install -y python3-de...
0 min reading time
OpenSSL SSL_connect: SSL_ERROR_SYSCALL error
Problem: You are trying to install third-party libraries via an init script. The init script attempts to download the libraries using curl or wget, but the download fails with an SSL error message: "curl: (35) OpenSSL SSL_connect: SSL_ERROR_SYSCALL in connection to <hostname>:443". Cause: The OpenSSL SSL_connect: SSL_ERROR_SYSCALL error means that ...
1 min reading time
Enable s3cmd for notebooks
s3cmd is a client library that allows you to perform all AWS S3 operations from any machine. s3cmd is not installed on Databricks clusters by default. You must install it via a cluster-scoped init script before it can be used. Info: The sample init script stores the path to a secret in an environment variable. You should store secrets in this fashion...
0 min reading time
Revoke all user privileges
When user permissions are explicitly granted for individual tables and views, the selected user can access those tables and views even if they don’t have permission to access the underlying database. If you want to revoke a user’s access, you can do so with the REVOKE command. However, the REVOKE command is explicit, and is strictly scoped to the ob...
1 min reading time
Use tcpdump to create pcap files
If you want to analyze the network traffic between nodes on a specific cluster, you can install tcpdump on the cluster and use it to dump the network packet details to pcap files. The pcap files can then be downloaded to a local machine for analysis. Create the tcpdump init script: Run this sample script in a notebook on the cluster to create the ini...
0 min reading time
S3 connection fails with "No role specified and no roles available"
Problem: You are using Databricks Utilities (dbutils) to access an S3 bucket, but it fails with a "No role specified and no roles available" error. You have confirmed that the instance profile associated with the cluster has the permissions needed to access the S3 bucket. Unable to load AWS credentials from any provider in the chain: [com.databricks.bac...
0 min reading time