Problem: Python Command Execution Fails with AttributeError

This article explains how to resolve two cases in which Python command execution fails with an AttributeError.

Problem: 'tuple' object has no attribute 'type'

When you run a notebook, Python command execution fails with the following error and stack trace:

AttributeError: 'tuple' object has no attribute 'type'
Traceback (most recent call last):
File "/local_disk0/tmp/1547561952809-0/PythonShell.py", line 23, in <module>
  import matplotlib as mpl
File "/databricks/python/local/lib/python2.7/site-packages/matplotlib/__init__.py", line 122, in <module>
  from matplotlib.cbook import is_string_like, mplDeprecation, dedent, get_label
File "/databricks/python/local/lib/python2.7/site-packages/matplotlib/cbook.py", line 33, in <module>
  import numpy as np
File "/databricks/python/local/lib/python2.7/site-packages/numpy/__init__.py", line 142, in <module>
  from . import core
File "/databricks/python/local/lib/python2.7/site-packages/numpy/core/__init__.py", line 57, in <module>
  from . import numerictypes as nt
File "/databricks/python/local/lib/python2.7/site-packages/numpy/core/numerictypes.py", line 111, in <module>
  from ._type_aliases import (
File "/databricks/python/local/lib/python2.7/site-packages/numpy/core/_type_aliases.py", line 63, in <module>
  _concrete_types = {v.type for k, v in _concrete_typeinfo.items()}
File "/databricks/python/local/lib/python2.7/site-packages/numpy/core/_type_aliases.py", line 63, in <setcomp>
  _concrete_types = {v.type for k, v in _concrete_typeinfo.items()}
AttributeError: 'tuple' object has no attribute 'type'


19/01/15 11:29:26 WARN PythonDriverWrapper: setupRepl:ReplId-7d8d1-8cc01-2d329-9: at the end, the status is
Error(ReplId-7d8d1-8cc01-2d329-,com.databricks.backend.daemon.driver.PythonDriverLocal$PythonException: Python shell failed to start in 30 seconds)

Cause

A newer version of numpy (1.16.1), which some PyPI clients install by default, is incompatible with other libraries on the cluster.
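
The last frames of the traceback point at a set comprehension that reads `.type` from every value of a dictionary. As a minimal sketch of how that message arises (the dictionary below is a hypothetical stand-in, not numpy's real data), the same AttributeError appears as soon as one of those values is a plain tuple:

```python
# Hypothetical stand-in for the dictionary used in numpy/core/_type_aliases.py.
# The values here are plain tuples, which have no `.type` attribute.
concrete_typeinfo = {"int8": ("b", 1, 8), "float64": ("d", 13, 64)}


def collect_types(typeinfo):
    """Mirror the failing line: read `.type` from every value."""
    return {v.type for k, v in typeinfo.items()}


try:
    collect_types(concrete_typeinfo)
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

This only reproduces the message. It does not show which installed files are mismatched on a given cluster.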

Solution

Follow the steps below to create a cluster-scoped init script that removes the current version and installs version 1.15.0 of numpy.

  1. If the init script does not already exist, create a base directory to store it:

    dbutils.fs.mkdirs("dbfs:/databricks/<directory>/")
    
  2. Create the following script:

    • If the cluster is running Python 2, use this init script:

      dbutils.fs.put("dbfs:/databricks/<directory>/numpy.sh","""
      #!/bin/bash
      pip uninstall --yes numpy
      rm -rf /home/ubuntu/databricks/python/lib/python2.7/site-packages/numpy*
      rm -rf /databricks/python/lib/python2.7/site-packages/numpy*
      /usr/bin/yes | /home/ubuntu/databricks/python/bin/pip install numpy==1.15.0
      """,True)
      
    • If the cluster is running Python 3, use this init script:

      dbutils.fs.put("dbfs:/databricks/<directory>/numpy.sh","""
      #!/bin/bash
      pip uninstall --yes numpy
      rm -rf /home/ubuntu/databricks/python/lib/python3.5/site-packages/numpy*
      rm -rf /databricks/python/lib/python3.5/site-packages/numpy*
      /usr/bin/yes | /home/ubuntu/databricks/python/bin/pip install numpy==1.15.0
      """,True)
      
  3. Confirm that the script exists:

    display(dbutils.fs.ls("dbfs:/databricks/<directory>/numpy.sh"))
    
  4. Go to the cluster configuration page and click the Advanced Options toggle.

  5. At the bottom of the page, click the Init Scripts tab.

  6. In the Destination drop-down, select DBFS, provide the file path to the script, and click Add.

  7. Restart the cluster.

  8. In your PyPI client, pin the numpy installation to version 1.15.1, the latest working version, so that a later install does not pull in 1.16.1 again.
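
The two scripts in step 2 differ only in the Python version that appears in the site-packages paths. As a rough sketch (the function name `build_numpy_init_script` and its parameters are illustrative, not part of any Databricks API), the script text could be generated from that one value and then passed to `dbutils.fs.put` as in step 2:

```python
def build_numpy_init_script(python_version, numpy_version="1.15.0"):
    """Return the init script text for a Python version such as "2.7" or "3.5".

    Hypothetical helper: it only builds a string and touches no files.
    """
    site_packages = "lib/python{}/site-packages".format(python_version)
    lines = [
        "#!/bin/bash",
        "pip uninstall --yes numpy",
        "rm -rf /home/ubuntu/databricks/python/{}/numpy*".format(site_packages),
        "rm -rf /databricks/python/{}/numpy*".format(site_packages),
        "/usr/bin/yes | /home/ubuntu/databricks/python/bin/pip install numpy=={}".format(
            numpy_version
        ),
    ]
    return "\n".join(lines) + "\n"


script_text = build_numpy_init_script("3.5")
print(script_text)

# In a notebook, the text could then be written as in step 2:
# dbutils.fs.put("dbfs:/databricks/<directory>/numpy.sh", script_text, True)
```

Building the text from one function keeps the Python 2 and Python 3 variants from drifting apart. The commands themselves are the ones shown in step 2.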

Problem: module 'lib' has no attribute 'SSL_ST_INIT'

When you run a notebook, library installation fails and all Python commands executed in the notebook are cancelled with the following error and stack trace:

AttributeError: module 'lib' has no attribute 'SSL_ST_INIT'
Traceback (most recent call last):
File "/databricks/python3/bin/pip", line 7, in <module>
 from pip._internal import main
File "/databricks/python3/lib/python3.5/site-packages/pip/_internal/__init__.py", line 40, in <module>
 from pip._internal.cli.autocompletion import autocomplete
File "/databricks/python3/lib/python3.5/site-packages/pip/_internal/cli/autocompletion.py", line 8, in <module>
 from pip._internal.cli.main_parser import create_main_parser
File "/databricks/python3/lib/python3.5/site-packages/pip/_internal/cli/main_parser.py", line 12, in <module>
 from pip._internal.commands import (
File "/databricks/python3/lib/python3.5/site-packages/pip/_internal/commands/__init__.py", line 6, in <module>
 from pip._internal.commands.completion import CompletionCommand
File "/databricks/python3/lib/python3.5/site-packages/pip/_internal/commands/completion.py", line 6, in <module>
 from pip._internal.cli.base_command import Command
File "/databricks/python3/lib/python3.5/site-packages/pip/_internal/cli/base_command.py", line 20, in <module>
 from pip._internal.download import PipSession
File "/databricks/python3/lib/python3.5/site-packages/pip/_internal/download.py", line 15, in <module>
 from pip._vendor import requests, six, urllib3
File "/databricks/python3/lib/python3.5/site-packages/pip/_vendor/requests/__init__.py", line 97, in <module>
 from pip._vendor.urllib3.contrib import pyopenssl
File "/databricks/python3/lib/python3.5/site-packages/pip/_vendor/urllib3/contrib/pyopenssl.py", line 46, in <module>
 import OpenSSL.SSL
File "/databricks/python3/lib/python3.5/site-packages/OpenSSL/__init__.py", line 8, in <module>
 from OpenSSL import rand, crypto, SSL
File "/databricks/python3/lib/python3.5/site-packages/OpenSSL/SSL.py", line 124, in <module>
 SSL_ST_INIT = _lib.SSL_ST_INIT
AttributeError: module 'lib' has no attribute 'SSL_ST_INIT'

Cause

A newer version of the cryptography package (in this case, 2.7) was installed by default along with another PyPI library. This cryptography version is incompatible with the version of pyOpenSSL included in Databricks Runtime.
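
The last frame of the traceback shows `OpenSSL/SSL.py` reading a constant from a compiled `lib` module at import time. A minimal sketch of that failure mode, using an empty stand-in module rather than the real cryptography bindings, produces the same message when the attribute is missing:

```python
import types

# Stand-in for the compiled bindings module that pyOpenSSL reads from.
# This empty module simply lacks the constant; it is not the real binding.
lib = types.ModuleType("lib")

try:
    SSL_ST_INIT = lib.SSL_ST_INIT
    error_message = None
except AttributeError as exc:
    error_message = str(exc)

print(error_message)
```

Because the lookup happens while `OpenSSL.SSL` is being imported, anything that imports it fails too, including pip in the traceback above.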

Solution

To resolve and prevent this issue, upgrade pyOpenSSL before you install any other library. Use a cluster-scoped init script to install a newer version of pyOpenSSL (the script below pins version 19.0.0):

  1. Create a base directory to store the init script:

    dbutils.fs.mkdirs("dbfs:/databricks/<directory>/")
    
  2. Create the following script:

    dbutils.fs.put("dbfs:/databricks/<directory>/openssl_fix.sh","""
    #!/bin/bash
    /databricks/python/bin/pip uninstall pyOpenSSL -y
    /databricks/python3/bin/pip3 install pyOpenSSL==19.0.0
    """, True)
    
  3. Confirm that the script exists:

    display(dbutils.fs.ls("dbfs:/databricks/<directory>/openssl_fix.sh"))
    
  4. Go to the cluster configuration page and click the Advanced Options toggle.

  5. At the bottom of the page, click the Init Scripts tab.

  6. In the Destination drop-down, select DBFS, provide the file path to the script, and click Add.

  7. Restart the cluster.
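
After the cluster restarts, it can be worth confirming that the pinned version is the one actually installed. The sketch below is one possible check, assuming it runs in a notebook cell on the restarted cluster. `installed_version` and `matches_pin` are hypothetical helpers that read package metadata instead of importing the package, so the check does not depend on the import itself working:

```python
def installed_version(package_name):
    """Return the installed version string of a package, or None if it is absent."""
    try:
        # Available in the standard library from Python 3.8 onward.
        from importlib import metadata
    except ImportError:
        metadata = None

    if metadata is not None:
        try:
            return metadata.version(package_name)
        except metadata.PackageNotFoundError:
            return None

    # Fallback for older interpreters, assuming setuptools is installed.
    import pkg_resources

    try:
        return pkg_resources.get_distribution(package_name).version
    except pkg_resources.DistributionNotFound:
        return None


def matches_pin(package_name, expected_version):
    """True only if the package is installed at exactly the expected version."""
    return installed_version(package_name) == expected_version


print(matches_pin("pyOpenSSL", "19.0.0"))
```

The same call with `"numpy"` and `"1.15.0"` checks the pin from the first solution.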