Census income dataset
This dataset contains census data from the 1994 census database. Each row represents a group of individuals. The goal is to determine whether a group has an income of over 50K a year. This classification is represented as a string in the income column, with the value <=50K or >50K.
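A minimal sketch of how the string label could be turned into a binary target. It assumes the only label values are <=50K and >50K, apart from stray whitespace and letter case. The helper name is_high_income is hypothetical and is not part of the notebook.

```python
# Assumption: these are the only two label values in the income column.
VALID_LABELS = {"<=50K", ">50K"}


def is_high_income(label: str) -> bool:
    """Return True when the income label means more than 50K a year."""
    # Tolerate surrounding whitespace and a lowercase "k".
    normalized = label.strip().upper()
    if normalized not in VALID_LABELS:
        raise ValueError(f"unexpected income label: {label!r}")
    return normalized == ">50K"


# Illustrative values only, not rows taken from the dataset.
rows = [" <=50K", ">50K", ">50k"]
targets = [is_high_income(r) for r in rows]
```

Rejecting unknown labels rather than mapping them to False keeps a malformed row from silently becoming a negative example.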
Next steps
- Explore the notebooks and experiments linked above.
- If the metrics for the best trial notebook look good, skip directly to the inference section.
- If you want to improve on the model generated by the best trial:
  - Go to the notebook with the best trial and clone it.
  - Edit the notebook as necessary to improve the model. For example, you might try different hyperparameters.
  - When you are satisfied with the model, note the URI where the artifact for the trained model is logged. Assign this URI to the model_uri variable in Cmd 12.
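The last step can be sketched as follows. This is an illustration under stated assumptions, not the notebook's own code: it assumes the model was logged to an MLflow run under the artifact path model, so that its URI takes the runs:/ form. The helper names best_trial_model_uri and predict_income and the run ID shown are hypothetical placeholders.

```python
def best_trial_model_uri(run_id: str, artifact_path: str = "model") -> str:
    """Build an MLflow runs:/ URI for a model logged to the given run.

    Assumption: the model was logged under the artifact path "model".
    """
    if not run_id:
        raise ValueError("run_id must be a non-empty string")
    return f"runs:/{run_id}/{artifact_path}"


def predict_income(model_uri: str, features):
    """Load the logged model as a generic Python function and score features."""
    # Imported lazily so the URI helper can be used without MLflow installed.
    import mlflow.pyfunc

    model = mlflow.pyfunc.load_model(model_uri)
    return model.predict(features)


# Placeholder run ID; replace it with the ID of your own trial run.
model_uri = best_trial_model_uri("0123456789abcdef")
```

With model_uri set this way, predict_income(model_uri, df) would return predictions for a DataFrame df that has the same feature columns as the training data.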
AutoML classification example
Requirements
Databricks Runtime for Machine Learning.