Problem
When you copy a large file from the local file system to DBFS on S3, the following exception can occur:
Amazon.S3.AmazonS3Exception: Part number must be an integer between 1 and 10000, inclusive
Cause
This is caused by an Amazon S3 limit on multipart uploads: a single upload can consist of at most 10000 parts, numbered 1 to 10000, inclusive. When the file is larger than 10000 times the configured part size, the copy fails with this error.
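The number of parts is the file size divided by the multipart part size, rounded up. The following minimal Python sketch illustrates the arithmetic with hypothetical numbers: an 800 GiB file and a 64 MiB part size (the actual part size on your cluster is whatever fs.s3a.multipart.size is set to):

import math

# Hypothetical values for illustration only.
file_size_bytes = 800 * 1024**3    # 800 GiB file
part_size_bytes = 64 * 1024**2     # 64 MiB parts (assumed part size)

parts = math.ceil(file_size_bytes / part_size_bytes)
print(parts)                       # 12800 parts, which exceeds the 10000-part limit

# Smallest part size that keeps this file within 10000 parts:
print(math.ceil(file_size_bytes / 10000))   # 85899346 bytes, about 82 MiB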
Solution
To prevent this exception, increase the multipart upload part size so that even the largest file you copy is split into 10000 parts or fewer.
- Set the following property in your cluster's Spark configuration. The value is in bytes; 104857600 bytes is 100 MB:
spark.hadoop.fs.s3a.multipart.size 104857600
- Restart the cluster.
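After the restart, you can confirm that the new value reached the Hadoop configuration. The sketch below is for a Databricks Python notebook and assumes the usual sc SparkContext variable; sc._jsc is an internal accessor, so treat this as a convenience check rather than a supported API:

# Read the effective S3A part size from the cluster's Hadoop configuration.
hadoop_conf = sc._jsc.hadoopConfiguration()
print(hadoop_conf.get("fs.s3a.multipart.size"))   # expect 104857600 (100 MB)

# With 100 MB parts, the largest upload that fits within 10000 parts:
print(10000 * 104857600)                          # 1048576000000 bytes, about 1 TB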