Log directory: /dbfs/ml/pytorch/1733957302.7722206
/root/.ipykernel/4182/command-2280852430858390-1170614654:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(x)
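The deprecation warning above is raised because the model's forward method calls F.log_softmax without an explicit dimension. Passing dim=1 (the class axis of a [batch, classes] logit tensor) silences it; a minimal sketch, assuming that layout:

import torch
import torch.nn.functional as F

logits = torch.randn(4, 10)               # stand-in for [batch_size, num_classes] model output
log_probs = F.log_softmax(logits, dim=1)   # explicit dim=1 avoids the deprecation warning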
Train Epoch: 1 [0/60000 (0%)] Loss: 2.324770
Train Epoch: 1 [10000/60000 (17%)] Loss: 2.297613
Train Epoch: 1 [20000/60000 (33%)] Loss: 2.277661
Train Epoch: 1 [30000/60000 (50%)] Loss: 2.290999
Train Epoch: 1 [40000/60000 (67%)] Loss: 2.275110
Train Epoch: 1 [50000/60000 (83%)] Loss: 2.220335
Train Epoch: 2 [0/60000 (0%)] Loss: 2.224195
Train Epoch: 2 [10000/60000 (17%)] Loss: 2.176879
Train Epoch: 2 [20000/60000 (33%)] Loss: 2.169643
Train Epoch: 2 [30000/60000 (50%)] Loss: 2.138331
Train Epoch: 2 [40000/60000 (67%)] Loss: 2.017443
Train Epoch: 2 [50000/60000 (83%)] Loss: 1.934057
Train Epoch: 3 [0/60000 (0%)] Loss: 1.731840
Train Epoch: 3 [10000/60000 (17%)] Loss: 1.870500
Train Epoch: 3 [20000/60000 (33%)] Loss: 1.674558
Train Epoch: 3 [30000/60000 (50%)] Loss: 1.598748
Train Epoch: 3 [40000/60000 (67%)] Loss: 1.360065
Train Epoch: 3 [50000/60000 (83%)] Loss: 1.438313
Average test loss: 0.895073413848877
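The per-batch lines above follow the print format of the classic PyTorch MNIST example; a hedged sketch of a train loop that would emit them (model, loader, and optimizer are assumed to exist as in the notebook):

import torch.nn.functional as F

def train_one_epoch(model, loader, optimizer, epoch, log_interval=100):
    model.train()
    for batch_idx, (data, target) in enumerate(loader):
        optimizer.zero_grad()
        loss = F.nll_loss(model(data), target)   # pairs with log_softmax model outputs
        loss.backward()
        optimizer.step()
        if batch_idx % log_interval == 0:
            print("Train Epoch: {} [{}/{} ({:.0f}%)]\tLoss: {:.6f}".format(
                epoch, batch_idx * len(data), len(loader.dataset),
                100.0 * batch_idx / len(loader), loss.item()))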
/databricks/python/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2024/12/11 22:45:11 WARNING mlflow.models.model: Model logged without a signature. Signatures will be required for upcoming model registry features as they validate model inputs and denote the expected schema of model outputs. Please visit https://www.mlflow.org/docs/2.15.1/models.html#set-signature-on-logged-model for instructions on setting a model signature on your logged model.
2024/12/11 22:45:12 WARNING mlflow.models.model: Input example should be provided to infer model signature if the model signature is not provided when logging the model.
2024/12/11 22:45:12 INFO mlflow.tracking._tracking_service.client: 🏃 View run popular-koi-949 at: e2-dogfood.staging.cloud.databricks.com/ml/experiments/2280852430858633/runs/ff899ade65ad40fc8b17fd7e351015a9.
2024/12/11 22:45:12 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: e2-dogfood.staging.cloud.databricks.com/ml/experiments/2280852430858633.
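The two signature warnings in this run can be addressed by logging the model with an inferred signature and an input example; a minimal sketch, where the tiny Sequential model is only a stand-in for the trained MNIST network:

import mlflow
import torch
from mlflow.models import infer_signature

model = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(28 * 28, 10))  # stand-in model
example_batch = torch.randn(1, 1, 28, 28)
with torch.no_grad():
    preds = model(example_batch)
signature = infer_signature(example_batch.numpy(), preds.numpy())
mlflow.pytorch.log_model(model, "model",
                         signature=signature,
                         input_example=example_batch.numpy())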
Data is located at: /dbfs/ml/pytorch/1733957479.3564515
Running distributed training
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(x)
Train Epoch: 1 [0/60000 (0%)] Loss: 2.284188
Train Epoch: 1 [10000/60000 (17%)] Loss: 2.295907
Train Epoch: 1 [20000/60000 (33%)] Loss: 2.273554
Train Epoch: 1 [30000/60000 (50%)] Loss: 2.260596
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(x)
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(x)
Train Epoch: 1 [40000/60000 (67%)] Loss: 2.248712
Train Epoch: 1 [50000/60000 (83%)] Loss: 2.233172
Train Epoch: 2 [0/60000 (0%)] Loss: 2.249660
Train Epoch: 2 [10000/60000 (17%)] Loss: 2.155293
Train Epoch: 2 [20000/60000 (33%)] Loss: 2.051280
Train Epoch: 2 [30000/60000 (50%)] Loss: 1.962492
Train Epoch: 2 [40000/60000 (67%)] Loss: 1.912481
Train Epoch: 2 [50000/60000 (83%)] Loss: 1.886552
Train Epoch: 3 [0/60000 (0%)] Loss: 1.862257
Train Epoch: 3 [10000/60000 (17%)] Loss: 1.675992
Train Epoch: 3 [20000/60000 (33%)] Loss: 1.436601
Train Epoch: 3 [30000/60000 (50%)] Loss: 1.384962
Train Epoch: 3 [40000/60000 (67%)] Loss: 1.489862
Train Epoch: 3 [50000/60000 (83%)] Loss: 1.324469
2024/12/11 22:52:02 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /local_disk0/repl_tmp_data/ReplId-193b7-d9d6b-5/tmpe5nilt11/model/data, flavor: pytorch). Fall back to return ['torch==2.3.1', 'cloudpickle==2.2.1']. Set logging level to DEBUG to see the full traceback.
2024/12/11 22:52:02 WARNING mlflow.models.model: Model logged without a signature. Signatures will be required for upcoming model registry features as they validate model inputs and denote the expected schema of model outputs. Please visit https://www.mlflow.org/docs/2.15.1/models.html#set-signature-on-logged-model for instructions on setting a model signature on your logged model.
2024/12/11 22:52:03 WARNING mlflow.models.model: Input example should be provided to infer model signature if the model signature is not provided when logging the model.
Average test loss: 0.761795163154602
2024/12/11 22:52:14 INFO mlflow.tracking._tracking_service.client: 🏃 View run righteous-shrimp-206 at: e2-dogfood.staging.cloud.databricks.com/ml/experiments/2280852430858633/runs/434989bb483d4d48b1649f5050a5efee.
2024/12/11 22:52:14 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: e2-dogfood.staging.cloud.databricks.com/ml/experiments/2280852430858633.
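The pip-inference warning in this run shows MLflow falling back to a default requirements list; pinning requirements explicitly avoids the fallback. A sketch, assuming the same trained model object as above (the version pins mirror the fallback shown in the log):

import mlflow

# `model` is the trained torch.nn.Module from the run above.
mlflow.pytorch.log_model(model, "model",
                         pip_requirements=["torch==2.3.1", "cloudpickle==2.2.1"])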
Data is located at: /dbfs/ml/pytorch/1733957534.7987056
INFO:TorchDistributor:Started local training with 2 processes
INFO:TorchDistributor:Finished local training with 2 processes
WARNING:__main__:
*****************************************
Setting OMP_NUM_THREADS environment variable for each process to be 1 in default, to avoid your system being overloaded, please further tune the variable for optimal performance in your application as needed.
*****************************************
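As the banner says, torch.distributed defaults OMP_NUM_THREADS to 1 per process; when each worker has spare CPU cores, it can be raised before the workers start. A hedged sketch (the value 4 is only an example):

import os

os.environ["OMP_NUM_THREADS"] = "4"  # must be set before the worker processes launch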
Wed Dec 11 22:52:22 2024 Connection to spark from PID 7235
Wed Dec 11 22:52:22 2024 Initialized gateway on port 46337
Wed Dec 11 22:52:22 2024 Connection to spark from PID 7236
Wed Dec 11 22:52:22 2024 Initialized gateway on port 36391
Wed Dec 11 22:52:22 2024 Connected to spark.
Wed Dec 11 22:52:22 2024 Connected to spark.
Running distributed training
Running distributed training
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
Train Epoch: 1 [0/30000 (0%)] Loss: 2.344287
Train Epoch: 1 [0/30000 (0%)] Loss: 2.319191
Train Epoch: 1 [10000/30000 (33%)] Loss: 2.286878
Train Epoch: 1 [10000/30000 (33%)] Loss: 2.311805
Train Epoch: 1 [20000/30000 (67%)] Loss: 2.266556
Train Epoch: 1 [20000/30000 (67%)] Loss: 2.279220
Train Epoch: 2 [0/30000 (0%)] Loss: 2.276896
Train Epoch: 2 [0/30000 (0%)] Loss: 2.257281
Train Epoch: 2 [10000/30000 (33%)] Loss: 2.268584
Train Epoch: 2 [10000/30000 (33%)] Loss: 2.232745
Train Epoch: 2 [20000/30000 (67%)] Loss: 2.242372
Train Epoch: 2 [20000/30000 (67%)] Loss: 2.192771
Train Epoch: 3 [0/30000 (0%)] Loss: 2.207398
Train Epoch: 3 [0/30000 (0%)] Loss: 2.233767
Train Epoch: 3 [10000/30000 (33%)] Loss: 2.163989
Train Epoch: 3 [10000/30000 (33%)] Loss: 2.120047
Train Epoch: 3 [20000/30000 (67%)] Loss: 2.066106
Train Epoch: 3 [20000/30000 (67%)] Loss: 2.087892
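Note that each of the two workers above reports 30000/30000 rather than 60000: presumably a DistributedSampler shards the dataset across ranks, so every rank trains on half the examples. A runnable sketch of the split, with random tensors standing in for MNIST:

import torch
from torch.utils.data import DataLoader, TensorDataset
from torch.utils.data.distributed import DistributedSampler

dataset = TensorDataset(torch.randn(60000, 1, 28, 28), torch.randint(0, 10, (60000,)))
sampler = DistributedSampler(dataset, num_replicas=2, rank=0)  # rank 1 gets the other half
loader = DataLoader(dataset, batch_size=100, sampler=sampler)
print(len(sampler))  # 30000 samples for this rank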
2024/12/11 22:52:44 INFO mlflow.utils.databricks_utils: No workspace ID specified; if your Databricks workspaces share the same host URL, you may want to specify the workspace ID (along with the host information in the secret manager) for run lineage tracking. For more details on how to specify this information in the secret manager, please refer to the Databricks MLflow documentation.
2024/12/11 22:52:48 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /tmp/repl_tmp_data/ReplId-193b7-d9d6b-5/tmpuivzyo93/model/data, flavor: pytorch). Fall back to return ['torch==2.3.1', 'cloudpickle==2.2.1']. Set logging level to DEBUG to see the full traceback.
/databricks/python/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2024/12/11 22:52:48 WARNING mlflow.models.model: Model logged without a signature. Signatures will be required for upcoming model registry features as they validate model inputs and denote the expected schema of model outputs. Please visit https://www.mlflow.org/docs/2.15.1/models.html#set-signature-on-logged-model for instructions on setting a model signature on your logged model.
Uploading artifacts: 100%|██████████| 10/10 [00:00<00:00, 10.35it/s]
2024/12/11 22:52:49 WARNING mlflow.models.model: Input example should be provided to infer model signature if the model signature is not provided when logging the model.
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
Average test loss: 1.9049434661865234
2024/12/11 22:52:59 INFO mlflow.tracking._tracking_service.client: 🏃 View run amusing-shrike-121 at: https://e2-dogfood.staging.cloud.databricks.com/ml/experiments/2280852430858633/runs/78a38eed946942d89de4ac5ba935ab22.
2024/12/11 22:52:59 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: https://e2-dogfood.staging.cloud.databricks.com/ml/experiments/2280852430858633.
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(x)
Average test loss: 1.9049434661865234
2024/12/11 22:53:16 WARNING mlflow.models.model: Model logged without a signature. Signatures will be required for upcoming model registry features as they validate model inputs and denote the expected schema of model outputs. Please visit https://www.mlflow.org/docs/2.15.1/models.html#set-signature-on-logged-model for instructions on setting a model signature on your logged model.
2024/12/11 22:53:17 WARNING mlflow.models.model: Input example should be provided to infer model signature if the model signature is not provided when logging the model.
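This run ("Started local training with 2 processes" above) corresponds to TorchDistributor in local mode; a sketch of the call, where train_fn is a hypothetical name for the notebook's training function:

from pyspark.ml.torch.distributor import TorchDistributor

distributor = TorchDistributor(num_processes=2, local_mode=True, use_gpu=False)
model = distributor.run(train_fn)  # runs train_fn in 2 processes on the driver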
Data is located at: /dbfs/ml/pytorch/1733957597.6578755
INFO:TorchDistributor:Started distributed training with 2 executor processes
Running distributed training
Running distributed training
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
Train Epoch: 1 [0/30000 (0%)] Loss: 2.339231
Train Epoch: 1 [0/30000 (0%)] Loss: 2.292328
Train Epoch: 1 [10000/30000 (33%)] Loss: 2.310895
Train Epoch: 1 [10000/30000 (33%)] Loss: 2.310132
Train Epoch: 1 [20000/30000 (67%)] Loss: 2.292369
Train Epoch: 1 [20000/30000 (67%)] Loss: 2.288747
Train Epoch: 2 [0/30000 (0%)] Loss: 2.267363
Train Epoch: 2 [0/30000 (0%)] Loss: 2.250873
Train Epoch: 2 [10000/30000 (33%)] Loss: 2.252213
Train Epoch: 2 [10000/30000 (33%)] Loss: 2.242889
Train Epoch: 2 [20000/30000 (67%)] Loss: 2.257112
Train Epoch: 2 [20000/30000 (67%)] Loss: 2.214770
Train Epoch: 3 [0/30000 (0%)] Loss: 2.203631
Train Epoch: 3 [0/30000 (0%)] Loss: 2.226314
Train Epoch: 3 [10000/30000 (33%)] Loss: 2.151444
Train Epoch: 3 [10000/30000 (33%)] Loss: 2.153582
Train Epoch: 3 [20000/30000 (67%)] Loss: 2.086527
Train Epoch: 3 [20000/30000 (67%)] Loss: 2.095566
Spark Command: /usr/lib/jvm/zulu8-ca-amd64/jre/bin/java -cp /databricks/spark/conf/:/databricks/spark/assembly/target/scala-2.12/jars/*:/databricks/spark/dbconf/log4j/master-worker/:/databricks/jars/* -Xmx1g -XX:+IgnoreUnrecognizedVMOptions --add-opens=java.base/java.lang=ALL-UNNAMED --add-opens=java.base/java.lang.invoke=ALL-UNNAMED --add-opens=java.base/java.lang.reflect=ALL-UNNAMED --add-opens=java.base/java.io=ALL-UNNAMED --add-opens=java.base/java.net=ALL-UNNAMED --add-opens=java.base/java.nio=ALL-UNNAMED --add-opens=java.base/java.util=ALL-UNNAMED --add-opens=java.base/java.util.concurrent=ALL-UNNAMED --add-opens=java.base/java.util.concurrent.atomic=ALL-UNNAMED --add-opens=java.base/jdk.internal.ref=ALL-UNNAMED --add-opens=java.base/sun.nio.ch=ALL-UNNAMED --add-opens=java.base/sun.nio.cs=ALL-UNNAMED --add-opens=java.base/sun.security.action=ALL-UNNAMED --add-opens=java.base/sun.util.calendar=ALL-UNNAMED --add-opens=java.security.jgss/sun.security.krb5=ALL-UNNAMED --add-opens=jdk.jfr/jdk.jfr.internal.consumer=ALL-UNNAMED --add-opens=jdk.jfr/jdk.jfr.internal=ALL-UNNAMED --add-opens=java.management/sun.management=ALL-UNNAMED --add-opens=java.base/jdk.internal.loader=ALL-UNNAMED -Djdk.reflect.useDirectMethodHandle=false -Dderby.connection.requireAuthentication=false org.apache.spark.deploy.SparkSubmit pyspark-shell
========================================
WARN StatusConsoleListener The use of package scanning to locate plugins is deprecated and will be removed in a future release
WARN StatusConsoleListener The use of package scanning to locate plugins is deprecated and will be removed in a future release
WARN StatusConsoleListener The use of package scanning to locate plugins is deprecated and will be removed in a future release
WARN StatusConsoleListener The use of package scanning to locate plugins is deprecated and will be removed in a future release
WARN StatusConsoleListener RollingFileAppender 'com.databricks.logging.structured.PrometheusMetricsSnapshot.appender': The bufferSize is set to 8192 but bufferedIO is not true
24/12/11 22:54:16 INFO DatabricksEdgeConfigs: serverlessEnabled : false
24/12/11 22:54:17 INFO DatabricksEdgeConfigs: perfPackEnabled : false
24/12/11 22:54:17 INFO DatabricksEdgeConfigs: classicSqlEnabled : false
24/12/11 22:54:18 INFO RawConfigSingleton$: Successfully loaded DB_CONF into RawConfigSingleton.
24/12/11 22:54:19 INFO SecurityManager: Changing view acls to: root
24/12/11 22:54:19 INFO SecurityManager: Changing modify acls to: root
24/12/11 22:54:19 INFO SecurityManager: Changing view acls groups to:
24/12/11 22:54:19 INFO SecurityManager: Changing modify acls groups to:
24/12/11 22:54:19 INFO SecurityManager: SecurityManager: authentication is enabled: false; ui acls disabled; users with view permissions: root groups with view permissions: EMPTY; users with modify permissions: root; groups with modify permissions: EMPTY; RPC SSL disabled
Setting default log level to "WARN".
To adjust logging level use sc.setLogLevel(newLevel). For SparkR, use setLogLevel(newLevel).
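Per the banner above, the INFO-level Spark output that follows can be trimmed from the notebook; a sketch, assuming the Databricks-provided Spark session:

spark.sparkContext.setLogLevel("WARN")  # `spark` is predefined in Databricks notebooks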
24/12/11 22:54:19 INFO SparkContext: Running Spark version 3.5.0
24/12/11 22:54:19 INFO SparkContext: OS info Linux, 5.15.0-1072-aws, amd64
24/12/11 22:54:19 INFO SparkContext: Java version 1.8.0_412
24/12/11 22:54:19 INFO ResourceUtils: ==============================================================
24/12/11 22:54:19 INFO ResourceUtils: No custom resources configured for spark.driver.
24/12/11 22:54:19 INFO ResourceUtils: ==============================================================
24/12/11 22:54:19 INFO SparkContext: Submitted application: pyspark-shell
24/12/11 22:54:19 INFO ResourceProfile: Default ResourceProfile created, executor resources: Map(cores -> name: cores, amount: 1, script: , vendor: , memory -> name: memory, amount: 1024, script: , vendor: , offHeap -> name: offHeap, amount: 0, script: , vendor: ), task resources: Map(cpus -> name: cpus, amount: 1.0)
24/12/11 22:54:19 INFO ResourceProfile: Limiting resource is cpu
24/12/11 22:54:19 INFO ResourceProfileManager: Added ResourceProfile id: 0
24/12/11 22:54:19 INFO SecurityManager: Changing view acls to: root
24/12/11 22:54:19 INFO SecurityManager: Changing modify acls to: root
24/12/11 22:54:19 INFO SecurityManager: Changing view acls groups to:
24/12/11 22:54:19 INFO SecurityManager: Changing modify acls groups to:
24/12/11 22:54:19 INFO SecurityManager: SecurityManager: authentication is enabled: false; ui acls disabled; users with view permissions: root groups with view permissions: EMPTY; users with modify permissions: root; groups with modify permissions: EMPTY; RPC SSL disabled
24/12/11 22:54:20 INFO Utils: Successfully started service 'sparkDriver' on port 44335.
24/12/11 22:54:20 INFO SparkEnv: Registering MapOutputTracker
24/12/11 22:54:20 INFO SparkEnv: Registering BlockManagerMaster
24/12/11 22:54:20 INFO BlockManagerMasterEndpoint: Using org.apache.spark.storage.DefaultTopologyMapper for getting topology information
24/12/11 22:54:20 INFO BlockManagerMasterEndpoint: BlockManagerMasterEndpoint up
24/12/11 22:54:20 INFO SparkEnv: Registering BlockManagerMasterHeartbeat
24/12/11 22:54:20 INFO DiskBlockManager: Created local directory at /local_disk0/spark-6084ceb8-465b-493b-a78e-1726579dc598/executor-a3d9fc8f-9b96-47b3-84b0-dc6e33edd3e0/blockmgr-7fca0165-4072-43ae-b5c3-24df05afd7a9
24/12/11 22:54:20 INFO SparkEnv: Registering OutputCommitCoordinator
24/12/11 22:54:20 WARN JfrStreamingManager: JFR streaming is only available in JDK 17+
24/12/11 22:54:20 WARN MetricsSystem: Using default name SparkStatusTracker for source because neither spark.metrics.namespace nor spark.app.id is set.
24/12/11 22:54:20 INFO log: Logging initialized @8449ms to org.eclipse.jetty.util.log.Slf4jLog
24/12/11 22:54:21 INFO JettyUtils: Start Jetty 10.68.143.139:40001 for SparkUI
24/12/11 22:54:21 INFO Server: jetty-9.4.52.v20230823; built: 2023-08-23T19:29:37.669Z; git: abdcda73818a1a2c705da276edb0bf6581e7997e; jvm 1.8.0_412-b08
24/12/11 22:54:21 INFO Server: Started @8723ms
24/12/11 22:54:21 INFO AbstractConnector: Started ServerConnector@17664041{HTTP/1.1, (http/1.1)}{10.68.143.139:40001}
24/12/11 22:54:21 INFO Utils: Successfully started service 'SparkUI' on port 40001.
24/12/11 22:54:21 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@3a8290a{/,null,AVAILABLE,@Spark}
24/12/11 22:54:22 INFO DriverPluginContainer: Initialized driver component for plugin org.apache.spark.sql.connect.SparkConnectPlugin.
24/12/11 22:54:22 INFO DriverPluginContainer: Initialized driver component for plugin com.databricks.spark.connect.LocalSparkConnectPlugin.
24/12/11 22:54:22 INFO DLTDebugger: Registered DLTDebuggerEndpoint at endpoint dlt-debugger
24/12/11 22:54:22 INFO DriverPluginContainer: Initialized driver component for plugin org.apache.spark.debugger.DLTDebuggerSparkPlugin.
24/12/11 22:54:22 INFO SecurityManager: Changing view acls to: root
24/12/11 22:54:22 INFO SecurityManager: Changing modify acls to: root
24/12/11 22:54:22 INFO SecurityManager: Changing view acls groups to:
24/12/11 22:54:22 INFO SecurityManager: Changing modify acls groups to:
24/12/11 22:54:22 INFO SecurityManager: SecurityManager: authentication is enabled: false; ui acls disabled; users with view permissions: root groups with view permissions: EMPTY; users with modify permissions: root; groups with modify permissions: EMPTY; RPC SSL disabled
24/12/11 22:54:22 INFO Executor: Starting executor ID driver on host ip-10-68-143-139.us-west-2.compute.internal
24/12/11 22:54:22 INFO Executor: OS info Linux, 5.15.0-1072-aws, amd64
24/12/11 22:54:22 INFO Executor: Java version 1.8.0_412
24/12/11 22:54:22 INFO Executor: Starting executor with user classpath (userClassPathFirst = false): ''
24/12/11 22:54:22 INFO Executor: Created or updated repl class loader org.apache.spark.util.MutableURLClassLoader@38bce0b6 for default.
24/12/11 22:54:22 INFO ExecutorPluginContainer: Initialized executor component for plugin org.apache.spark.debugger.DLTDebuggerSparkPlugin.
24/12/11 22:54:22 INFO Utils: resolved command to be run: ArraySeq(getconf, PAGESIZE)
24/12/11 22:54:23 WARN NativeMemoryWatchdog: Native memory watchdog is disabled by conf.
24/12/11 22:54:23 INFO TaskSchedulerImpl: Preemption disabled in FIFO scheduling mode.
24/12/11 22:54:23 INFO Utils: Successfully started service 'org.apache.spark.network.netty.NettyBlockTransferService' on port 35711.
24/12/11 22:54:23 INFO NettyBlockTransferService: Server created on ip-10-68-143-139.us-west-2.compute.internal:35711
24/12/11 22:54:23 INFO BlockManager: Using org.apache.spark.storage.RandomBlockReplicationPolicy for block replication policy
24/12/11 22:54:23 INFO BlockManagerMaster: Registering BlockManager BlockManagerId(driver, ip-10-68-143-139.us-west-2.compute.internal, 35711, None)
24/12/11 22:54:23 INFO BlockManagerMasterEndpoint: Registering block manager ip-10-68-143-139.us-west-2.compute.internal:35711 with 366.3 MiB RAM, BlockManagerId(driver, ip-10-68-143-139.us-west-2.compute.internal, 35711, None)
24/12/11 22:54:23 INFO BlockManagerMaster: Registered BlockManager BlockManagerId(driver, ip-10-68-143-139.us-west-2.compute.internal, 35711, None)
24/12/11 22:54:23 INFO BlockManager: Initialized BlockManager: BlockManagerId(driver, ip-10-68-143-139.us-west-2.compute.internal, 35711, None)
24/12/11 22:54:25 INFO ContextHandler: Stopped o.e.j.s.ServletContextHandler@3a8290a{/,null,STOPPED,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@49bd8292{/jobs,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@5a3edc6{/jobs/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@2c09c453{/jobs/job,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@44a52971{/jobs/job/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@147b9e4d{/stages,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@396810cf{/stages/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@67183f23{/stages/stage,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@24bffb66{/stages/stage/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@6146fed2{/stages/pool,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@5b09f7cd{/stages/pool/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@43a1201e{/stages/taskThreadDump,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@6545830a{/stages/taskThreadDump/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@523fcb3{/storage,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@8be321a{/storage/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@1985d911{/storage/rdd,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@74cb9e88{/storage/rdd/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@b7d8cef{/environment,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@4c07b9dc{/environment/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@2bc48600{/executors,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@3caa2ec3{/executors/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@34e3783e{/executors/threadDump,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@1a733a6{/executors/threadDump/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@55a7a5ff{/executors/heapHistogram,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@37cf6b1a{/executors/heapHistogram/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@752f7104{/executors/heapHistogram,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@5cc665a6{/executors/heapHistogram/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@32f83721{/static,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@14c9fab7{/,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@7aae5f3f{/api,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@e62d909{/metrics,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@19f49448{/jobs/job/kill,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@dfa539e{/stages/stage/kill,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@1dfcd39e{/connect,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@7650fd03{/connect/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@17eec73f{/connect/session,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@6c632e3{/connect/session/json,null,AVAILABLE,@Spark}
24/12/11 22:54:25 INFO ContextHandler: Started o.e.j.s.ServletContextHandler@2f3dffd{/metrics/json,null,AVAILABLE,@Spark}
2024/12/11 22:54:26 INFO mlflow.utils.databricks_utils: No workspace ID specified; if your Databricks workspaces share the same host URL, you may want to specify the workspace ID (along with the host information in the secret manager) for run lineage tracking. For more details on how to specify this information in the secret manager, please refer to the Databricks MLflow documentation.
2024/12/11 22:54:30 WARNING mlflow.utils.environment: Encountered an unexpected error while inferring pip requirements (model URI: /tmp/tmpqi4e38ph/model/data, flavor: pytorch). Fall back to return ['torch==2.3.1', 'cloudpickle==2.2.1']. Set logging level to DEBUG to see the full traceback.
/databricks/python/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
2024/12/11 22:54:30 WARNING mlflow.models.model: Model logged without a signature. Signatures will be required for upcoming model registry features as they validate model inputs and denote the expected schema of model outputs. Please visit https://www.mlflow.org/docs/2.15.1/models.html#set-signature-on-logged-model for instructions on setting a model signature on your logged model.
Uploading artifacts: 100%|██████████| 10/10 [00:00<00:00, 11.79it/s]
2024/12/11 22:54:31 WARNING mlflow.models.model: Input example should be provided to infer model signature if the model signature is not provided when logging the model.
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
Average test loss: 1.9135884046554565
2024/12/11 22:54:42 INFO mlflow.tracking._tracking_service.client: 🏃 View run clean-fawn-311 at: https://oregon.staging.cloud.databricks.com/ml/experiments/2280852430858633/runs/10ccc2655b19485d992d6ce93a76676c.
2024/12/11 22:54:42 INFO mlflow.tracking._tracking_service.client: 🧪 View experiment at: https://oregon.staging.cloud.databricks.com/ml/experiments/2280852430858633.
24/12/11 22:54:43 INFO SparkContext: Invoking stop() from shutdown hook
24/12/11 22:54:43 INFO SparkContext: SparkContext is stopping with exitCode 0 from run at Executors.java:511.
24/12/11 22:54:43 WARN SparkContext: Requesting executors is not supported by current scheduler.
24/12/11 22:54:43 INFO AbstractConnector: Stopped Spark@17664041{HTTP/1.1, (http/1.1)}{10.68.143.139:40001}
24/12/11 22:54:43 INFO SparkUI: Stopped Spark web UI at http://10.68.143.139:40001
24/12/11 22:54:43 INFO DeadlockDetector: Trigger deadlock detection immediately.
24/12/11 22:54:43 INFO ContextHandler: Stopped o.e.j.s.ServletContextHandler@1dfcd39e{/connect,null,STOPPED,@Spark}
24/12/11 22:54:43 INFO ContextHandler: Stopped o.e.j.s.ServletContextHandler@7650fd03{/connect/json,null,STOPPED,@Spark}
24/12/11 22:54:43 INFO ContextHandler: Stopped o.e.j.s.ServletContextHandler@17eec73f{/connect/session,null,STOPPED,@Spark}
24/12/11 22:54:43 INFO ContextHandler: Stopped o.e.j.s.ServletContextHandler@6c632e3{/connect/session/json,null,STOPPED,@Spark}
24/12/11 22:55:13 WARN ShutdownHookManager: ShutdownHook '$anon$2' timeout, java.util.concurrent.TimeoutException
java.util.concurrent.TimeoutException
at java.util.concurrent.FutureTask.get(FutureTask.java:205)
at org.apache.hadoop.util.ShutdownHookManager.executeShutdown(ShutdownHookManager.java:124)
at org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:95)
24/12/11 22:55:13 INFO DriverPluginContainer: Exception while shutting down plugin com.databricks.spark.connect.LocalSparkConnectPlugin.
java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Thread.join(Thread.java:1265)
at com.databricks.spark.connect.LocalSparkConnectPlugin$$anon$1.shutdown(LocalSparkConnectPlugin.scala:81)
at org.apache.spark.internal.plugin.DriverPluginContainer.$anonfun$shutdown$1(PluginContainer.scala:85)
at org.apache.spark.internal.plugin.DriverPluginContainer.$anonfun$shutdown$1$adapted(PluginContainer.scala:82)
at scala.collection.Iterator.foreach(Iterator.scala:943)
at scala.collection.Iterator.foreach$(Iterator.scala:943)
at scala.collection.AbstractIterator.foreach(Iterator.scala:1431)
at scala.collection.IterableLike.foreach(IterableLike.scala:74)
at scala.collection.IterableLike.foreach$(IterableLike.scala:73)
at scala.collection.AbstractIterable.foreach(Iterable.scala:56)
at org.apache.spark.internal.plugin.DriverPluginContainer.shutdown(PluginContainer.scala:82)
at org.apache.spark.SparkContext.$anonfun$stop$19(SparkContext.scala:2976)
at org.apache.spark.SparkContext.$anonfun$stop$19$adapted(SparkContext.scala:2976)
at scala.Option.foreach(Option.scala:407)
at org.apache.spark.SparkContext.$anonfun$stop$18(SparkContext.scala:2976)
at org.apache.spark.util.Utils$.tryLogNonFatalError(Utils.scala:1537)
at org.apache.spark.SparkContext.stop(SparkContext.scala:2976)
at org.apache.spark.SparkContext.stop(SparkContext.scala:2902)
at org.apache.spark.SparkContext.$anonfun$new$53(SparkContext.scala:1012)
at org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:237)
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:211)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2192)
at org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:211)
at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
at scala.util.Try$.apply(Try.scala:213)
at org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:211)
at org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:190)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO MapOutputTrackerMasterEndpoint: MapOutputTrackerMasterEndpoint stopped!
24/12/11 22:55:13 INFO MemoryStore: MemoryStore started with capacity 366.3 MiB
24/12/11 22:55:13 INFO MemoryStore: MemoryStore cleared
24/12/11 22:55:13 INFO BlockManager: BlockManager stopped
24/12/11 22:55:13 INFO BlockManagerMaster: BlockManagerMaster stopped
24/12/11 22:55:13 INFO MetricsSystem: Stopping driver MetricsSystem
24/12/11 22:55:13 INFO DeadlockDetectorManager: Stopping all deadlock detection tasks.
24/12/11 22:55:13 INFO OutputCommitCoordinator$OutputCommitCoordinatorEndpoint: OutputCommitCoordinator stopped!
24/12/11 22:55:13 INFO SparkContext: Successfully stopped SparkContext
24/12/11 22:55:13 INFO privateLog: "disk_health_monitor" #51 TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@7c7b9519
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2083)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "disk_health_monitor" #60 TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2c6646f0
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2083)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "Finalizer" #3 WAITING
java.lang.Object.wait(Native Method)
- waiting on java.lang.ref.ReferenceQueue$Lock@6aa9bab
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:144)
java.lang.ref.ReferenceQueue.remove(ReferenceQueue.java:165)
java.lang.ref.Finalizer$FinalizerThread.run(Finalizer.java:188)
24/12/11 22:55:13 INFO privateLog: "main" #1 WAITING holding [java.lang.Class@798db0e4]
java.lang.Object.wait(Native Method)
- waiting on org.apache.hadoop.util.ShutdownHookManager$1@691541bc
java.lang.Thread.join(Thread.java:1257)
java.lang.Thread.join(Thread.java:1331)
java.lang.ApplicationShutdownHooks.runHooks(ApplicationShutdownHooks.java:107)
java.lang.ApplicationShutdownHooks$1.run(ApplicationShutdownHooks.java:46)
java.lang.Shutdown.runHooks(Shutdown.java:130)
java.lang.Shutdown.exit(Shutdown.java:178)
- locked java.lang.Class@798db0e4
java.lang.Runtime.exit(Runtime.java:104)
java.lang.System.exit(System.java:987)
org.apache.spark.api.python.PythonGatewayServer$.main(PythonGatewayServer.scala:75)
org.apache.spark.api.python.PythonGatewayServer.main(PythonGatewayServer.scala)
sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
java.lang.reflect.Method.invoke(Method.java:498)
org.apache.spark.deploy.JavaMainApplication.start(SparkApplication.scala:52)
org.apache.spark.deploy.SparkSubmit.org$apache$spark$deploy$SparkSubmit$$runMain(SparkSubmit.scala:1017)
org.apache.spark.deploy.SparkSubmit.doRunMain$1(SparkSubmit.scala:187)
org.apache.spark.deploy.SparkSubmit.submit(SparkSubmit.scala:210)
org.apache.spark.deploy.SparkSubmit.doSubmit(SparkSubmit.scala:82)
org.apache.spark.deploy.SparkSubmit$$anon$2.doSubmit(SparkSubmit.scala:1107)
org.apache.spark.deploy.SparkSubmit$.main(SparkSubmit.scala:1116)
org.apache.spark.deploy.SparkSubmit.main(SparkSubmit.scala)
24/12/11 22:55:13 INFO privateLog: "node_status_monitor" #52 TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@1b55d368
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2083)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "node_status_monitor" #61 TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@63964445
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2083)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:1093)
java.util.concurrent.ScheduledThreadPoolExecutor$DelayedWorkQueue.take(ScheduledThreadPoolExecutor.java:809)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1074)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "process reaper" #9 TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.SynchronousQueue$TransferStack@33d1c02a
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.SynchronousQueue$TransferStack.awaitFulfill(SynchronousQueue.java:460)
java.util.concurrent.SynchronousQueue$TransferStack.transfer(SynchronousQueue.java:362)
java.util.concurrent.SynchronousQueue.poll(SynchronousQueue.java:941)
java.util.concurrent.ThreadPoolExecutor.getTask(ThreadPoolExecutor.java:1073)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1134)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "Reference Handler" #2 WAITING
java.lang.Object.wait(Native Method)
- waiting on java.lang.ref.Reference$Lock@26d9053a
java.lang.Object.wait(Object.java:502)
java.lang.ref.Reference.tryHandlePending(Reference.java:191)
java.lang.ref.Reference$ReferenceHandler.run(Reference.java:153)
24/12/11 22:55:13 INFO privateLog: "rpc-boss-3-1" #15 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "rpc-client-1-1" #98 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "rpc-client-1-2" #99 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "rpc-server-4-1" #96 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "rpc-server-4-2" #97 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "shuffle-boss-6-1" #66 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "shuffle-client-5-1" #94 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "shuffle-client-5-2" #95 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "shuffle-server-7-1" #92 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "shuffle-server-7-2" #93 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
io.netty.util.concurrent.SingleThreadEventExecutor.confirmShutdown(SingleThreadEventExecutor.java:790)
io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:596)
io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "shutdown-hook-0" #79 RUNNABLE holding [java.util.concurrent.ThreadPoolExecutor$Worker@1e63ba59]
sun.management.ThreadImpl.dumpThreads0(Native Method)
sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:496)
sun.management.ThreadImpl.dumpAllThreads(ThreadImpl.java:484)
org.apache.spark.util.Utils$.getThreadDump(Utils.scala:2407)
org.apache.spark.util.Utils$.logThreadDump(Utils.scala:2490)
org.apache.spark.util.Utils$.$anonfun$addThreadDumpShutdownHook$1(Utils.scala:2483)
org.apache.spark.util.Utils$$$Lambda$871/1159658512.apply$mcV$sp(Unknown Source)
org.apache.spark.util.SparkShutdownHook.run(ShutdownHookManager.scala:237)
org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$2(ShutdownHookManager.scala:211)
org.apache.spark.util.SparkShutdownHookManager$$Lambda$2144/686498783.apply$mcV$sp(Unknown Source)
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
org.apache.spark.util.Utils$.logUncaughtExceptions(Utils.scala:2192)
org.apache.spark.util.SparkShutdownHookManager.$anonfun$runAll$1(ShutdownHookManager.scala:211)
org.apache.spark.util.SparkShutdownHookManager$$Lambda$2143/640473450.apply$mcV$sp(Unknown Source)
scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
scala.util.Try$.apply(Try.scala:213)
org.apache.spark.util.SparkShutdownHookManager.runAll(ShutdownHookManager.scala:211)
org.apache.spark.util.SparkShutdownHookManager$$anon$2.run(ShutdownHookManager.scala:190)
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
java.util.concurrent.FutureTask.run(FutureTask.java:266)
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "Signal Dispatcher" #4 RUNNABLE
24/12/11 22:55:13 INFO privateLog: "Thread-1" #10 TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2ae9b417
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2083)
java.util.concurrent.ThreadPoolExecutor.awaitTermination(ThreadPoolExecutor.java:1475)
java.util.concurrent.Executors$DelegatedExecutorService.awaitTermination(Executors.java:675)
org.apache.hadoop.util.ShutdownHookManager.shutdownExecutor(ShutdownHookManager.java:146)
org.apache.hadoop.util.ShutdownHookManager.access$300(ShutdownHookManager.java:65)
org.apache.hadoop.util.ShutdownHookManager$1.run(ShutdownHookManager.java:102)
24/12/11 22:55:13 INFO privateLog: "Thread-16" #49 TIMED_WAITING
sun.misc.Unsafe.park(Native Method)
- waiting on java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject@2d1e42bf
java.util.concurrent.locks.LockSupport.parkNanos(LockSupport.java:215)
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.awaitNanos(AbstractQueuedSynchronizer.java:2083)
java.util.concurrent.LinkedBlockingDeque.pollFirst(LinkedBlockingDeque.java:522)
java.util.concurrent.LinkedBlockingDeque.poll(LinkedBlockingDeque.java:684)
sun.nio.fs.AbstractWatchService.poll(AbstractWatchService.java:108)
com.databricks.spark.connect.service.LocalSparkConnectService$.waitUntilFileExists(LocalSparkConnectService.scala:70)
com.databricks.spark.connect.service.LocalSparkConnectService$.startGRPCService(LocalSparkConnectService.scala:123)
com.databricks.spark.connect.service.LocalSparkConnectService$.start(LocalSparkConnectService.scala:155)
com.databricks.spark.connect.LocalSparkConnectPlugin$$anon$1$$anon$2.run(LocalSparkConnectPlugin.scala:64)
24/12/11 22:55:13 INFO privateLog: "Thread-17" #50 RUNNABLE
sun.nio.fs.LinuxWatchService.poll(Native Method)
sun.nio.fs.LinuxWatchService.access$600(LinuxWatchService.java:47)
sun.nio.fs.LinuxWatchService$Poller.run(LinuxWatchService.java:314)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "Thread-18" #57 TIMED_WAITING
java.lang.Thread.sleep(Native Method)
com.databricks.spark.util.ExternalLogAggregator.run(ExternalLogAggregator.scala:106)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO privateLog: "Thread-3" #12 RUNNABLE
java.net.PlainSocketImpl.socketAccept(Native Method)
java.net.AbstractPlainSocketImpl.accept(AbstractPlainSocketImpl.java:409)
java.net.ServerSocket.implAccept(ServerSocket.java:571)
java.net.ServerSocket.accept(ServerSocket.java:534)
py4j.GatewayServer.run(GatewayServer.java:705)
java.lang.Thread.run(Thread.java:750)
24/12/11 22:55:13 INFO ShutdownHookManager: Shutdown hook called
24/12/11 22:55:13 INFO ShutdownHookManager: Deleting directory /local_disk0/spark-6084ceb8-465b-493b-a78e-1726579dc598/executor-a3d9fc8f-9b96-47b3-84b0-dc6e33edd3e0/spark-0e8b90d6-6341-40d9-aa22-eb475660167b
24/12/11 22:55:13 INFO ShutdownHookManager: Deleting directory /tmp/spark-0cbec9c4-6e22-4fbd-8952-7a57a6044c53
24/12/11 22:55:13 INFO ShutdownHookManager: Deleting directory /local_disk0/spark-6084ceb8-465b-493b-a78e-1726579dc598/executor-a3d9fc8f-9b96-47b3-84b0-dc6e33edd3e0/spark-0e8b90d6-6341-40d9-aa22-eb475660167b/pyspark-57a5edc4-a20a-4865-83f3-218d05fe34d4
INFO:TorchDistributor:Finished distributed training with 2 executor processes
/root/.ipykernel/4182/command-2280852430858390-193987937:30: UserWarning: Implicit dimension choice for log_softmax has been deprecated. Change the call to include dim=X as an argument.
return F.log_softmax(x)
Average test loss: 1.9135884046554565
2024/12/11 22:55:26 WARNING mlflow.models.model: Model logged without a signature. Signatures will be required for upcoming model registry features as they validate model inputs and denote the expected schema of model outputs. Please visit https://www.mlflow.org/docs/2.15.1/models.html#set-signature-on-logged-model for instructions on setting a model signature on your logged model.
2024/12/11 22:55:27 WARNING mlflow.models.model: Input example should be provided to infer model signature if the model signature is not provided when logging the model.
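The final run ("Started distributed training with 2 executor processes") is the same distributor in non-local mode, spreading the worker processes across Spark executors; a sketch, again with train_fn as a hypothetical stand-in for the training function:

from pyspark.ml.torch.distributor import TorchDistributor

distributor = TorchDistributor(num_processes=2, local_mode=False, use_gpu=False)
model = distributor.run(train_fn)  # runs train_fn on the executor processes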