Clusters CLI (legacy)
This documentation has been retired and might not be updated.
This information applies to legacy Databricks CLI versions 0.18 and below. Databricks recommends that you use Databricks CLI version 0.205 or above instead. See What is the Databricks CLI? To find your version of the Databricks CLI, run databricks -v.
To migrate from Databricks CLI version 0.18 or below to Databricks CLI version 0.205 or above, see Databricks CLI migration.
You run Databricks clusters CLI subcommands by appending them to databricks clusters. These subcommands call the Clusters API.
databricks clusters -h
Usage: databricks clusters [OPTIONS] COMMAND [ARGS]...
Utility to interact with Databricks clusters.
Options:
-v, --version [VERSION]
-h, --help Show this message and exit.
Commands:
create Creates a Databricks cluster.
Options:
--json-file PATH File containing JSON request to POST to /api/2.0/clusters/create.
--json JSON JSON string to POST to /api/2.0/clusters/create.
delete Removes a Databricks cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
edit Edits a Databricks cluster.
Options:
--json-file PATH File containing JSON request to POST to /api/2.0/clusters/edit.
--json JSON JSON string to POST to /api/2.0/clusters/edit.
events Gets events for a Spark cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/#/setting/clusters/$CLUSTER_ID/configuration. [required]
--start-time TEXT The start time in epoch milliseconds. If
unprovided, returns events starting from the
beginning of time.
--end-time TEXT The end time in epoch milliseconds. If unprovided,
returns events up to the current time
--order TEXT The order to list events in; either ASC or DESC.
Defaults to DESC (most recent first).
--event-type TEXT An event types to filter on (specify multiple event
types by passing the --event-type option multiple
times). If empty, all event types are returned.
--offset TEXT The offset in the result set. Defaults to 0 (no
offset). When an offset is specified and the
results are requested in descending order, the
end_time field is required.
--limit TEXT The maximum number of events to include in a page
of events. Defaults to 50, and maximum allowed
value is 500.
--output FORMAT can be "JSON" or "TABLE". Set to TABLE by default.
get Retrieves metadata about a cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
list Lists active and recently terminated clusters.
Options:
--output FORMAT JSON or TABLE. Set to TABLE by default.
list-node-types Lists node types for a cluster.
list-zones Lists zones where clusters can be created.
permanent-delete Permanently deletes a cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
resize Resizes a Databricks cluster given its ID.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
--num-workers INTEGER Number of workers. [required]
restart Restarts a Databricks cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
spark-versions Lists possible Databricks Runtime versions.
start Starts a terminated Databricks cluster.
Options:
--cluster-id CLUSTER_ID Can be found in the URL at https://<databricks-instance>/?o=<16-digit-number>#/setting/clusters/$CLUSTER_ID/configuration.
Create a cluster
To display usage documentation, run databricks clusters create --help.
databricks clusters create --json-file create-cluster.json
create-cluster.json:
{
"cluster_name": "my-cluster",
"spark_version": "7.3.x-scala2.12",
"node_type_id": "i3.xlarge",
"spark_conf": {
"spark.speculation": true
},
"aws_attributes": {
"availability": "SPOT",
"zone_id": "us-west-2a"
},
"num_workers": 25
}
{
"cluster_id": "1234-567890-batch123"
}
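The response contains the ID of the new cluster, which the other subcommands take through --cluster-id. As a minimal sketch (assuming jq is installed), you can capture it in a shell variable for later commands:
# Keep only the cluster_id field from the JSON response.
CLUSTER_ID=$(databricks clusters create --json-file create-cluster.json | jq -r .cluster_id)
echo "$CLUSTER_ID"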
Delete a cluster
To display usage documentation, run databricks clusters delete --help.
databricks clusters delete --cluster-id 1234-567890-batch123
If successful, no output is displayed.
Change a cluster configuration
To display usage documentation, run databricks clusters edit --help.
databricks clusters edit --json-file edit-cluster.json
edit-cluster.json:
{
"cluster_id": "1234-567890-batch123",
"num_workers": 10,
"spark_version": "7.3.x-scala2.12",
"node_type_id": "i3.xlarge"
}
If successful, no output is displayed.
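As an alternative to --json-file, the same request can be passed inline with the --json option. A sketch equivalent to the example above:
databricks clusters edit --json '{
  "cluster_id": "1234-567890-batch123",
  "num_workers": 10,
  "spark_version": "7.3.x-scala2.12",
  "node_type_id": "i3.xlarge"
}'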
List events for a cluster
To display usage documentation, run databricks clusters events --help.
databricks clusters events \
--cluster-id 1234-567890-batch123 \
--start-time 1617238800000 \
--end-time 1619485200000 \
--order DESC \
--limit 5 \
--event-type RUNNING \
--output JSON \
| jq .
{
"events": [
{
"cluster_id": "1234-567890-batch123",
"timestamp": 1619214150232,
"type": "RUNNING",
"details": {
"current_num_workers": 2,
"target_num_workers": 2
}
},
...
{
"cluster_id": "1234-567890-batch123",
"timestamp": 1617895221986,
"type": "RUNNING",
"details": {
"current_num_workers": 2,
"target_num_workers": 2
}
}
],
"next_page": {
"cluster_id": "1234-567890-batch123",
"start_time": 1617238800000,
"end_time": 1619485200000,
"order": "DESC",
"event_types": [
"RUNNING"
],
"offset": 5,
"limit": 5
},
"total_count": 11
}
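The next_page object reports the parameters for the following page of results. A sketch of requesting that page by passing the reported offset back with --offset (note that end_time is required when an offset is combined with descending order):
databricks clusters events \
--cluster-id 1234-567890-batch123 \
--start-time 1617238800000 \
--end-time 1619485200000 \
--order DESC \
--limit 5 \
--offset 5 \
--event-type RUNNING \
--output JSON \
| jq .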
Get information about a cluster
To display usage documentation, run databricks clusters get --help.
databricks clusters get --cluster-id 1234-567890-batch123
Or:
databricks clusters get --cluster-name my-cluster
{
"cluster_id": "1234-567890-batch123",
"spark_context_id": 8232037838300762810,
"cluster_name": "my-cluster",
"spark_version": "8.1.x-scala2.12",
"aws_attributes": {
"zone_id": "us-west-2c",
"first_on_demand": 1,
"availability": "SPOT_WITH_FALLBACK",
"spot_bid_price_percent": 100,
"ebs_volume_count": 0
},
"node_type_id": "i3.xlarge",
"driver_node_type_id": "i3.xlarge",
"autotermination_minutes": 120,
"enable_elastic_disk": false,
"disk_spec": {
"disk_count": 0
},
"cluster_source": "UI",
"enable_local_disk_encryption": false,
"instance_source": {
"node_type_id": "i3.xlarge"
},
"driver_instance_source": {
"node_type_id": "i3.xlarge"
},
"state": "TERMINATED",
"state_message": "Inactive cluster terminated (inactive for 120 minutes).",
"start_time": 1616773202562,
"terminated_time": 1619228528317,
"last_state_loss_time": 1619214150116,
"autoscale": {
"min_workers": 2,
"max_workers": 8
},
"default_tags": {
"Vendor": "Databricks",
"Creator": "someone@example.com",
"ClusterName": "my-cluster",
"ClusterId": "1234-567890-batch123"
},
"creator_user_name": "somone@example.com",
"termination_reason": {
"code": "INACTIVITY",
"parameters": {
"inactivity_duration_min": "120"
},
"type": "SUCCESS"
},
"init_scripts_safe_mode": false
}
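Because the response is JSON, individual fields can be extracted with jq. A minimal sketch (assuming jq is installed) that prints only the cluster state:
databricks clusters get --cluster-id 1234-567890-batch123 | jq -r .state
TERMINATED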
List information about all available clusters
To display usage documentation, run databricks clusters list --help.
databricks clusters list --output JSON | jq .
{
"clusters": [
{
"cluster_id": "1234-567890-batch123",
"spark_context_id": 8232037838300762810,
"cluster_name": "my-cluster",
"spark_version": "8.1.x-scala2.12",
"aws_attributes": {
"zone_id": "us-west-2c",
"first_on_demand": 1,
"availability": "SPOT_WITH_FALLBACK",
"spot_bid_price_percent": 100,
"ebs_volume_count": 0
},
"node_type_id": "i3.xlarge",
"driver_node_type_id": "i3.xlarge",
"autotermination_minutes": 120,
"enable_elastic_disk": false,
"disk_spec": {
"disk_count": 0
},
"cluster_source": "UI",
"enable_local_disk_encryption": false,
"instance_source": {
"node_type_id": "i3.xlarge"
},
"driver_instance_source": {
"node_type_id": "i3.xlarge"
},
"state": "TERMINATED",
"state_message": "Inactive cluster terminated (inactive for 120 minutes).",
"start_time": 1616773202562,
"terminated_time": 1619228528317,
"last_state_loss_time": 1619214150116,
"autoscale": {
"min_workers": 2,
"max_workers": 8
},
"default_tags": {
"Vendor": "Databricks",
"Creator": "someone@example.com",
"ClusterName": "my-cluster",
"ClusterId": "1234-567890-batch123"
},
"creator_user_name": "somone@example.com",
"termination_reason": {
"code": "INACTIVITY",
"parameters": {
"inactivity_duration_min": "120"
},
"type": "SUCCESS"
},
"init_scripts_safe_mode": false
},
...
]
}
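A sketch (assuming jq is installed) that narrows the listing to the IDs and names of clusters currently in the RUNNING state:
databricks clusters list --output JSON | jq -r '.clusters[] | select(.state == "RUNNING") | "\(.cluster_id)  \(.cluster_name)"'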
List available cluster node types
To display usage documentation, run databricks clusters list-node-types --help.
databricks clusters list-node-types
{
"node_types": [
{
"node_type_id": "z1d.12xlarge",
"memory_mb": 393216,
"num_cores": 48.0,
"description": "z1d.12xlarge",
"instance_type_id": "z1d.12xlarge",
"is_deprecated": false,
"category": "Memory Optimized",
"support_ebs_volumes": true,
"support_cluster_tags": true,
"num_gpus": 0,
"node_instance_type": {
"instance_type_id": "z1d.12xlarge",
"local_disks": 2,
"local_disk_size_gb": 900,
"instance_family": "EC2 z1d Family vCPUs",
"swap_size": "10g"
},
"is_hidden": false,
"support_port_forwarding": true,
"display_order": 0,
"is_io_cache_enabled": false
},
...
]
}
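A sketch (assuming jq is installed) that keeps only the IDs of node types that are not deprecated:
databricks clusters list-node-types | jq -r '.node_types[] | select(.is_deprecated == false) | .node_type_id'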
List available zones for creating clusters
To display usage documentation, run databricks clusters list-zones --help.
databricks clusters list-zones
{
"zones": [
"us-west-2c",
"us-west-2a",
"us-west-2b"
],
"default_zone": "us-west-2c"
}
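A sketch (assuming jq is installed) that prints only the default zone, for example to fill in aws_attributes.zone_id in a create request:
databricks clusters list-zones | jq -r .default_zone
us-west-2c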
Permanently delete a cluster
To display usage documentation, run databricks clusters permanent-delete --help.
databricks clusters permanent-delete --cluster-id 1234-567890-batch123
If successful, no output is displayed.
Resize a cluster
To display usage documentation, run databricks clusters resize --help.
databricks clusters resize --cluster-id 1234-567890-batch123 --num-workers 10
If successful, no output is displayed.
Restart a cluster
To display usage documentation, run databricks clusters restart --help.
databricks clusters restart --cluster-id 1234-567890-batch123
If successful, no output is displayed.
List available Spark runtime versions
To display usage documentation, run databricks clusters spark-versions --help.
databricks clusters spark-versions
{
"versions": [
{
"key": "8.2.x-scala2.12",
"name": "8.2 (includes Apache Spark 3.1.1, Scala 2.12)"
},
...
]
}
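The key values are what the spark_version field expects in create and edit requests. A sketch (assuming jq is installed) that prints just the keys:
databricks clusters spark-versions | jq -r '.versions[].key'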
Start a cluster
To display usage documentation, run databricks clusters start --help.
databricks clusters start --cluster-id 1234-567890-batch123
If successful, no output is displayed.
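The command returns once the start request is accepted, not once the cluster is ready. A minimal sketch (assuming jq is installed, the cluster ID from the examples above, and an arbitrary 30-second polling interval) that waits until the cluster reports the RUNNING state:
# Poll the cluster state until it becomes RUNNING.
until [ "$(databricks clusters get --cluster-id 1234-567890-batch123 | jq -r .state)" = "RUNNING" ]; do
  sleep 30
done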