Problem
Now that Mosaic AI Model Serving supports Foundation Model APIs, users within the workspace can access and query advanced open models by default. However, you may need to restrict access to certain serving endpoints to prevent unauthorized usage.
Cause
The default configuration grants all users within the workspace access to these models, which can lead to misuse or unintended access if not properly controlled.
Solution
To prevent users from accessing or using a specific endpoint, administrators can effectively "disable" the endpoint by setting its rate limits to zero. Navigate to the endpoint's detailed settings page and adjust both the request and response limits to 0.
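If you prefer to script the change rather than use the UI, the following is a minimal sketch of the same idea. It assumes the serving endpoints REST API route `PUT /api/2.0/serving-endpoints/{name}/rate-limits` with a `rate_limits` list of `key`, `calls`, and `renewal_period` entries. The mapping of the UI's two limits onto the `user` and `endpoint` keys is an assumption, and the host, token, and endpoint name are hypothetical placeholders. Verify the payload against the API reference for your workspace before sending it.

```python
import json
import urllib.request


def build_disable_request(host: str, token: str, endpoint_name: str) -> urllib.request.Request:
    """Build (but do not send) a request that sets an endpoint's rate limits to 0."""
    # Assumed payload shape: one per-user limit and one limit for the whole endpoint.
    payload = {
        "rate_limits": [
            {"key": "user", "calls": 0, "renewal_period": "minute"},
            {"key": "endpoint", "calls": 0, "renewal_period": "minute"},
        ]
    }
    # Assumed route; the workspace URL and endpoint name are supplied by the caller.
    url = f"{host.rstrip('/')}/api/2.0/serving-endpoints/{endpoint_name}/rate-limits"
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="PUT",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


# Hypothetical values, for illustration only.
req = build_disable_request(
    "https://example.cloud.databricks.com", "<admin-token>", "my-endpoint"
)
# To apply the change, send it with: urllib.request.urlopen(req)
```

The request is built separately from sending so that you can inspect the URL and payload first. The token must belong to a workspace administrator.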
Important
Only a workspace administrator can modify these settings. Once the limits are set to zero, any attempt to query the endpoint fails with a 403 permission denied error.
A user who queries the disabled endpoint receives the following error message.
{"error_code":"PERMISSION_DENIED","message":"PERMISSION_DENIED: The endpoint is disabled due to a rate limit set to 0."}
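Client code that calls the endpoint may want to tell this deliberate block apart from other permission errors. The sketch below checks a response for the status code and message shown above. The function name `is_endpoint_disabled` is hypothetical, and matching on the message text is an assumption that could break if the wording changes.

```python
import json


def is_endpoint_disabled(status_code: int, body: str) -> bool:
    """Return True if the response looks like the 'rate limit set to 0' denial."""
    if status_code != 403:
        return False
    try:
        payload = json.loads(body)
    except json.JSONDecodeError:
        # Not JSON, so it cannot be the structured error shown above.
        return False
    if not isinstance(payload, dict):
        return False
    return (
        payload.get("error_code") == "PERMISSION_DENIED"
        and "rate limit set to 0" in str(payload.get("message", ""))
    )


sample = (
    '{"error_code":"PERMISSION_DENIED","message":"PERMISSION_DENIED: '
    'The endpoint is disabled due to a rate limit set to 0."}'
)
print(is_endpoint_disabled(403, sample))  # prints True
```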
For more information about model serving endpoints, refer to the Model serving with Databricks (AWS | Azure) documentation.