How to specify the DBFS path
When working with Databricks, you will sometimes have to access the Databricks File System (DBFS).
Files on DBFS are accessed with standard filesystem commands, but the syntax varies depending on the language or tool used.
For example, take the following DBFS path:
dbfs:/mnt/test_folder/test_folder1/
Apache Spark
Under Spark, you should specify the full path inside the Spark read command.
spark.read.parquet("dbfs:/mnt/test_folder/test_folder1/file.parquet")
DBUtils
When you use DBUtils, specify the full DBFS path, just as you do in Spark commands. Only the language-specific formatting around the path differs, for example the %fs magic command versus a quoted string argument passed to dbutils.fs.ls.
%fs
ls dbfs:/mnt/test_folder/test_folder1/
dbutils.fs.ls('dbfs:/mnt/test_folder/test_folder1/')
dbutils.fs.ls("dbfs:/mnt/test_folder/test_folder1/")
Note
Specifying dbfs: is not required when using DBUtils or Spark commands. The path dbfs:/mnt/test_folder/test_folder1/ is equivalent to /mnt/test_folder/test_folder1/.
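The equivalence described in the note can be illustrated with a small, pure-Python sketch. The helper names below (to_dbfs_uri, strip_dbfs_scheme, same_dbfs_location) are hypothetical and are not part of DBUtils or Spark. They only manipulate strings, and they assume that paths are absolute and use the single-slash dbfs:/ form shown above.

```python
# Hypothetical helpers (not part of DBUtils or Spark): they only show that
# "dbfs:/mnt/..." and "/mnt/..." name the same DBFS location in this context.

DBFS_SCHEME = "dbfs:"


def to_dbfs_uri(path: str) -> str:
    """Return the path with an explicit dbfs: prefix."""
    if path.startswith(DBFS_SCHEME):
        return path
    if not path.startswith("/"):
        # Assumption: only absolute DBFS paths are handled in this sketch.
        raise ValueError(f"expected an absolute DBFS path, got {path!r}")
    return DBFS_SCHEME + path


def strip_dbfs_scheme(path: str) -> str:
    """Return the path without the dbfs: prefix, if one is present."""
    if path.startswith(DBFS_SCHEME):
        return path[len(DBFS_SCHEME):]
    return path


def same_dbfs_location(a: str, b: str) -> bool:
    """Compare two paths after giving both an explicit dbfs: prefix."""
    return to_dbfs_uri(a) == to_dbfs_uri(b)


if __name__ == "__main__":
    print(to_dbfs_uri("/mnt/test_folder/test_folder1/"))
    print(strip_dbfs_scheme("dbfs:/mnt/test_folder/test_folder1/"))
    print(same_dbfs_location("dbfs:/mnt/test_folder/test_folder1/",
                             "/mnt/test_folder/test_folder1/"))
```

Either form of the path could then be passed to spark.read.parquet or dbutils.fs.ls in a notebook. Normalizing to one form is merely a convenience, for instance when comparing or logging paths.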