S3 part number must be between 1 and 10000 inclusive

Learn how to resolve the "S3 part number must be between 1 and 10000 inclusive" error.

Written by Adam Pavlacka

Last published at: February 25th, 2022

Problem

When you copy a large file from the local file system to DBFS on S3, the following exception can occur:

Amazon.S3.AmazonS3Exception: Part number must be an integer between 1 and 10000, inclusive

Cause

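This is an S3 limit on the number of parts in a multipart upload. Parts can only be numbered from 1 to 10000, inclusive, so a file that needs more than 10000 parts at the configured part size cannot be uploaded.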
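
To illustrate the limit, here is a minimal sketch that counts how many parts an upload needs, assuming the file is split into fixed-size parts with a possibly smaller final part. The function names and the example sizes (a 1 TiB file, a 64 MiB part size) are hypothetical and chosen only for the arithmetic; the 10000 limit comes from the error message above.

```python
MAX_PARTS = 10_000  # highest part number allowed, per the error message
MIB = 1024 * 1024
TIB = 1024 ** 4


def parts_needed(file_size_bytes: int, part_size_bytes: int) -> int:
    """Number of fixed-size parts needed to cover the file (ceiling division)."""
    return (file_size_bytes + part_size_bytes - 1) // part_size_bytes


def exceeds_part_limit(file_size_bytes: int, part_size_bytes: int) -> bool:
    """True if the upload would need a part number above MAX_PARTS."""
    return parts_needed(file_size_bytes, part_size_bytes) > MAX_PARTS


# Hypothetical example: a 1 TiB file uploaded in 64 MiB parts.
print(parts_needed(1 * TIB, 64 * MIB))        # 16384
print(exceeds_part_limit(1 * TIB, 64 * MIB))  # True
```

In this example the upload needs 16384 parts, which is over the limit, so the copy fails with the exception shown in the Problem section.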

Solution

To prevent this exception, increase the size of each part so that the file is split into 10000 parts or fewer. Set the following property at the cluster level or the notebook level. The value is in bytes; 104857600 is 100 MiB.

  • Cluster Level (Spark config): you must restart the cluster after setting this property.
    spark.hadoop.fs.s3a.multipart.size 104857600
  • Notebook Level (Python):
    spark.conf.set("spark.hadoop.fs.s3a.multipart.size", "104857600")


Note

If the error still occurs, increase the multipart size further. The multipart size multiplied by 10000 must be at least the size of the largest file you copy.
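
As a rough guide for choosing a larger value, the sketch below computes the smallest part size that keeps a file within 10000 parts and rounds it up to a whole number of MiB. The helper names and the 2 TiB example file are hypothetical; the calculation assumes fixed-size parts and is not taken from the S3A connector itself.

```python
MAX_PARTS = 10_000  # highest part number allowed, per the error message
MIB = 1024 * 1024
TIB = 1024 ** 4


def min_part_size(file_size_bytes: int, max_parts: int = MAX_PARTS) -> int:
    """Smallest part size in bytes that covers the file in at most max_parts parts."""
    return (file_size_bytes + max_parts - 1) // max_parts


def rounded_part_size(file_size_bytes: int, step: int = MIB) -> int:
    """min_part_size rounded up to a multiple of step (whole MiB by default)."""
    smallest = min_part_size(file_size_bytes)
    return ((smallest + step - 1) // step) * step


# Largest file the 104857600-byte setting from the Solution section can cover.
print(104857600 * MAX_PARTS)  # 1048576000000 bytes, a little under 1 TiB

# Hypothetical example: a 2 TiB file needs parts of at least 210 MiB.
size = rounded_part_size(2 * TIB)
print(size, size // MIB)  # 220200960 210

# The result can then be passed as a string to the property from the Solution section:
# spark.conf.set("spark.hadoop.fs.s3a.multipart.size", str(size))
```

Rounding up to a whole MiB is only a convenience for readable values; any part size at or above the computed minimum keeps the part count within the limit.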