Title: | 'DataRobot' Predictive Modeling API |
---|---|
Description: | For working with the 'DataRobot' predictive modeling platform's API <https://www.datarobot.com/>. |
Authors: | Ron Pearson [aut], Zachary Deane-Mayer [aut], David Chudzicki [aut], Dallin Akagi [aut], Sergey Yurgenson [aut], Thakur Raj Anand [aut], Peter Hurford [aut], Chester Ismay [aut], AJ Alon [aut], Andrew Watson [aut], Gregory Williams [aut], Anastasiia Tamazlykar [ctb], Mykhailo Poliakov [ctb], DataRobot, Inc. [cph] |
Maintainer: | AJ Alon <[email protected]> |
License: | MIT + file LICENSE |
Version: | 2.18.6 |
Built: | 2024-11-09 06:16:30 UTC |
Source: | CRAN |
Each Eureqa model contains multiple possible solutions (see GetParetoFront). However, only the best model is included in the leaderboard by default. To include other models, you can get them via GetParetoFront and then add them.
AddEureqaSolution(project, eureqaSolutionId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
eureqaSolutionId |
character. The solution ID of the Eureqa model to add. |
## Not run:
projectId <- "5b2827556523cd05bd1507a5"
modelId <- "5b29406c6523cd0665685a8d"
eureqaModel <- GetModel(projectId, modelId)
paretoFront <- GetParetoFront(eureqaModel)
## End(Not run)
Apply a schema to DataRobot objects (lists, frames)
ApplySchema(inList, schema, mask = NULL)
inList |
object. The DataRobot object to apply the schema to. |
schema |
list. The schema to apply. |
mask |
list. Search the schema and only apply values that match this with grep. Defaults to NULL, or no masking. |
These functions extend R's generic as.data.frame function to the DataRobot S3 object classes listOfBlueprints, listOfFeaturelists, listOfModels, and projectSummaryList.
For projectSummaryList objects: if simple = TRUE (the default), this method returns a dataframe with one row for each project and the following columns: projectName, projectId, created, fileName, target, targetType, positiveClass, metric, autopilotMode, stage, maxTrainPct, and holdoutUnlocked. If simple = FALSE, a dataframe is constructed from all elements of projectSummaryList.
## S3 method for class 'listOfBlueprints'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
## S3 method for class 'listOfFeaturelists'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
## S3 method for class 'listOfModels'
as.data.frame(x, row.names = NULL, optional = FALSE, simple = TRUE, ...)
## S3 method for class 'projectSummaryList'
as.data.frame(x, row.names = NULL, optional = FALSE, simple = TRUE, ...)
## S3 method for class 'listOfDataRobotPredictionDatasets'
as.data.frame(x, row.names = NULL, optional = FALSE, ...)
x |
S3 object to be converted into a dataframe. |
row.names |
character. Optional. Row names for the dataframe returned by the method. |
optional |
logical. Optional. If TRUE, setting row
names and converting column names to syntactic names: see help for
|
... |
list. Additional optional parameters to be passed to the generic as.data.frame function (not used at present). |
simple |
logical. Optional. If TRUE (the default), a simplified dataframe is returned for objects of class listOfModels or projectSummaryList. |
All of the DataRobot S3 ‘listOf’ class objects have relatively complex structures and are often easier to work with as dataframes. The methods described here extend R's generic as.data.frame function to convert objects of these classes to convenient dataframes. For objects of class listOfBlueprints and listOfFeaturelists or objects of class listOfModels and projectSummaryList with simple = FALSE, the dataframes contain all information from the original S3 object. The default value simple = TRUE provides simpler dataframes for objects of class listOfModels and projectSummaryList.
For listOfModels objects: if simple = TRUE (the default), this method returns a dataframe with one row for each model and the following columns: modelType, expandedModel (constructed from modelType and processes from the listOfModels elements), modelId, blueprintId, featurelistName, featurelistId, samplePct, and the validation value of the metric given by projectMetric. If simple = FALSE, the method returns a complete dataframe with one row for each model and columns constructed from all fields in the original listOfModels object.
A dataframe containing some or all of the data from the original S3 object; see Details.
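To illustrate why the dataframe form is convenient, the following base R sketch flattens a list of model-like records into one row per element. The `models` list, its values, and the `listToFrame` helper are hypothetical stand-ins made up for this illustration; this is not the package's as.data.frame method, which may handle nested or missing fields differently.

```r
# Hypothetical stand-in for a listOfModels object: a list of named lists.
models <- list(
  list(modelType = "Example model A", modelId = "m1", samplePct = 64),
  list(modelType = "Example model B", modelId = "m2", samplePct = 32)
)

# Hypothetical helper: keep only the requested fields of each element,
# turn each element into a one-row data frame, and stack the rows.
listToFrame <- function(x, fields) {
  rows <- lapply(x, function(el) {
    as.data.frame(el[fields], stringsAsFactors = FALSE)
  })
  do.call(rbind, rows)
}

modelFrame <- listToFrame(models, c("modelType", "modelId", "samplePct"))
print(modelFrame)
```

Selecting a fixed set of fields mirrors the idea behind simple = TRUE: a small, predictable set of columns is easier to filter and sort than the full nested object.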
Information on a data feature.
as.dataRobotFeatureInfo(inList)
inList |
list. See return value below for expected elements. |
A named list which contains:
id numeric. Feature ID. Note that throughout the API, features are specified using their names, not this ID.
name character. The name of the feature.
featureType character. Feature type: 'Numeric', 'Categorical', etc.
importance numeric. Numeric measure of the strength of the relationship between the feature and target (independent of any model or other features).
lowInformation logical. Whether the feature has too few values to be informative.
uniqueCount numeric. The number of unique values in the feature.
naCount numeric. The number of missing values in the feature.
dateFormat character. The format of the feature if it is a date-time feature.
projectId character. Character id of the project the feature belongs to.
max. The maximum value in the dataset, formatted in the same format as the data.
min. The minimum value in the dataset, formatted in the same format as the data.
mean. The arithmetic mean of the dataset, formatted in the same format as the data.
median. The median of the dataset, formatted in the same format as the data.
stdDev. The standard deviation of the dataset, formatted in the same format as the data.
timeSeriesEligible logical. Whether this feature can be used as the datetime partition column in a time series project.
timeSeriesEligibilityReason character. Why the feature is ineligible for the datetime partition column in a time series project, or "suitable" when it is eligible.
crossSeriesEligible logical. Whether the cross series group by column is eligible for cross-series modeling. Will be NULL if no cross series group by column is used.
crossSeriesEligibilityReason character. The type of cross series eligibility (or ineligibility).
timeStep numeric. For time-series eligible features, a positive integer determining the interval at which windows can be specified. If used as the datetime partition column on a time series project, the feature derivation and forecast windows must start and end at an integer multiple of this value. NULL for features that are not time series eligible.
timeUnit character. For time series eligible features, the time unit covered by a single time step, e.g. "HOUR", or NULL for features that are not time series eligible.
targetLeakage character. Whether a feature is considered to have target leakage or not. A value of "SKIPPED_DETECTION" indicates that target leakage detection was not run on the feature.
keySummary data.frame. Optional. Descriptive statistics for this feature, iff it is a summarized categorical feature. This data.frame contains:
key. The name of the key.
summary. Descriptive statistics for this key, including:
max. The maximum value in the dataset.
min. The minimum value in the dataset.
mean. The arithmetic mean of the dataset.
median. The median of the dataset.
stdDev. The standard deviation of the dataset.
pctRows. The percentage of rows (from the EDA sample) in which this key occurs.
Other feature functions: GetFeatureInfo(), ListFeatureInfo(), ListModelFeatures()
Return value for GetMultiSeriesProperties() and others
as.dataRobotMultiSeriesProperties(inList)
inList |
list. See return value below for expected elements. |
A named list which contains:
timeSeriesEligible logical. Whether or not the series is eligible to be used for time series.
crossSeriesEligible logical. Whether or not the cross series group by column is eligible for cross-series modeling. Will be NULL if no cross series group by column is used.
crossSeriesEligibilityReason character. The type of cross series eligibility (or ineligibility).
timeUnit character. For time series eligible features, the time unit covered by a single time step, e.g. "HOUR", or NULL for features that are not time series eligible.
timeStep integer. Expected difference in time units between rows in the data. Will be NULL for features that are not time series eligible.
Other MultiSeriesProject functions: GetMultiSeriesProperties(), RequestCrossSeriesDetection(), RequestMultiSeriesDetection()
Return value for SetupProject() and others
as.dataRobotProjectShort(inList)
inList |
list. See return value below for expected elements. |
A named list that contains:
character. The name assigned to the DataRobot project
character. The unique alphanumeric project identifier for this DataRobot project
character. The name of the CSV modeling file uploaded for this project
character. The time and date of project creation
This is a list that contains the valid values for autopilot mode. If you wish, you can specify autopilot modes using the list values, e.g. AutopilotMode$FullAuto instead of typing the string "auto". This way you can benefit from autocomplete and not have to remember the valid options.
AutopilotMode
An object of class list of length 4.
FullAuto represents running the entire autopilot. Quick runs a quicker, abridged version of the autopilot that focuses on the most important models. Manual does not run the autopilot and instead leaves it to the user to select the algorithms to be run. Comprehensive runs all blueprints in the repository, and may be extremely slow.
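The benefit of a named constants list over raw strings can be sketched in base R as follows. The `autopilotModes` list and `checkMode` helper are hypothetical re-creations for illustration; only the value "auto" for FullAuto is stated above, and the other three string values are assumptions rather than confirmed package values.

```r
# Hypothetical re-creation of a constants list. Only FullAuto = "auto" is
# stated in the documentation; the other three values are assumed.
autopilotModes <- list(
  FullAuto = "auto",
  Quick = "quick",                  # assumed value
  Manual = "manual",                # assumed value
  Comprehensive = "comprehensive"   # assumed value
)

# Hypothetical validator: reject strings that are not in the list, so a typo
# fails early instead of being sent to the server.
checkMode <- function(mode) {
  if (!(mode %in% unlist(autopilotModes))) {
    stop("Unknown autopilot mode: ", mode)
  }
  mode
}

checkMode(autopilotModes$FullAuto)
```

Referring to `autopilotModes$FullAuto` rather than typing "auto" lets the editor autocomplete the name, and a misspelled element name yields NULL rather than a silently wrong string.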
Supports feature transformations, including:
text to categorical
text to numeric
categorical to text
categorical to numeric
numeric to categorical
BatchFeaturesTypeTransform( project, parentNames, variableType, prefix = NULL, suffix = NULL, maxWait = 600 )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
parentNames |
character. Character vector of variable names to be transformed. |
variableType |
character. The new type that the columns should be converted to.
See |
prefix |
character. Optional. The string to preface all the transformed features.
Either |
suffix |
character. Optional. The string that will be appended at the end to all
the transformed features. Either |
maxWait |
integer. Optional. The maximum amount of time (in seconds) to wait for DataRobot to finish processing the new column before providing a timeout error. |
A list of all the features, after transformation. See GetFeaturelist for details.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
BatchFeaturesTypeTransform(projectId,
                           parentNames = c("var1", "var2"),
                           variableType = VariableTransformTypes$Categorical,
                           suffix = "_transformed")
## End(Not run)
This is a list that contains the valid values for Blend methods
BlendMethods
An object of class list of length 13.
Convert a blueprint chart into graphviz DOT format
BlueprintChartToGraphviz(blueprintChart)
blueprintChart |
list. The list returned by |
Character string representation of chart in graphviz DOT language.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
blueprintId <- model$blueprintId
blueprintChart <- GetBlueprintChart(projectId, blueprintId)
BlueprintChartToGraphviz(blueprintChart)
## End(Not run)
Make sure the path is a reasonable URL
CheckUrl(url)
url |
character. The URL to check. |
Added in DataRobot API 2.18.
ClassificationDeploymentAccuracyMetric
An object of class list of length 14.
Reformat paginated data returned from the server.
CleanServerData(serverData)
serverData |
list. Raw JSON parsed list returned from the server. |
This function clones a project, creating a fresh (post-EDA1) copy that will need a target and modeling options set.
CloneProject(project, newProjectName = NULL, maxWait = 600)
project |
dataRobotProject, or a character representing that project's ID. |
newProjectName |
character. The name of the newly cloned project. If no
name is given, the API will default to 'Copy of |
maxWait |
integer. The maximum time to wait for each of two steps: (1) The initial project creation request, and (2) data processing that occurs after receiving the response to this initial request. |
A named list that contains:
character. The name assigned to the DataRobot project
character. The unique alphanumeric project identifier for this DataRobot project
character. The name of the CSV modeling file uploaded for this project
character. The time and date of project creation
## Not run:
project <- GetProject("5c1303269300d900016b41a7")
CloneProject(project, newProjectName = "Project Restart")
## End(Not run)
Compute datetime trend plots for a datetime partitioned model. This includes Accuracy over Time, Forecast vs Actual, and Anomaly over Time plots.
ComputeDatetimeTrendPlots( model, backtest = 0, source = SourceType$Validation, forecastDistanceStart = NULL, forecastDistanceEnd = NULL )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
backtest |
integer or character. Optional. Compute plots for a specific backtest.
Use the backtest index starting from zero.
To compute plots for holdout, use |
source |
character. Optional. The source of the data for the backtest/holdout.
Must be one of |
forecastDistanceStart |
integer. Optional. The start of forecast distance range (forecast window) to compute. If not specified, the first forecast distance for this project will be used. Only for time series supervised models. |
forecastDistanceEnd |
integer. Optional. The end of forecast distance range (forecast window) to compute. If not specified, the last forecast distance for this project will be used. Only for time series supervised models. |
Forecast distance specifies the number of time steps between the predicted point and the origin point.
For multiseries models, only the first 1000 series in alphabetical order, and an average plot for them, will be computed.
A maximum of 100 forecast distances can be requested for calculation in time series supervised projects.
An integer value that can be used as the jobId parameter in a subsequent call to WaitForJobToComplete.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
jobId <- ComputeDatetimeTrendPlots(model)
WaitForJobToComplete(projectId, jobId) # optional step
## End(Not run)
This function initializes a DataRobot session. To use DataRobot, you must connect to your account. This can be done in three ways:
by passing an endpoint and token directly to ConnectToDataRobot
by having a YAML config file in $HOME/.config/datarobot/drconfig.yaml
by setting DATAROBOT_API_ENDPOINT and DATAROBOT_API_TOKEN environment variables
The three methods of authentication are given priority in that order (explicitly passing parameters to the function will trump a YAML config file, which will trump the environment variables).
If you have a YAML config file or environment variables set, you will not need to pass any parameters to ConnectToDataRobot in order to connect.
ConnectToDataRobot( endpoint = NULL, token = NULL, username = NULL, password = NULL, userAgentSuffix = NULL, sslVerify = TRUE, configPath = NULL )
endpoint |
character. URL specifying the DataRobot server to be used. It depends on the DataRobot modeling engine implementation (cloud-based, on-prem, etc.) you are using. Contact your DataRobot admin for the endpoint to use and to turn on API access for your account. The endpoint for DataRobot cloud accounts is https://app.datarobot.com/api/v2 |
token |
character. DataRobot API access token. It is unique to each DataRobot modeling engine account and can be found in the Account profile section of the DataRobot web app. |
username |
character. No longer supported. |
password |
character. No longer supported. |
userAgentSuffix |
character. Additional text that is appended to the User-Agent HTTP header when communicating with the DataRobot REST API. This can be useful for identifying different applications that are built on top of the DataRobot R client, which can aid debugging and help track usage. |
sslVerify |
logical. Whether to check the SSL certificate. Either TRUE to check (default), FALSE to not check. |
configPath |
character. Path to YAML config file specifying configuration (token and endpoint). |
## Not run:
ConnectToDataRobot("https://app.datarobot.com/api/v2", "thisismyfaketoken")
ConnectToDataRobot(configPath = "~/.config/datarobot/drconfig.yaml")
## End(Not run)
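The precedence among the three authentication methods described for ConnectToDataRobot can be sketched in base R. The `resolveCredentials` helper is hypothetical and is not the package's implementation; the `config` argument stands in for the already-parsed contents of a YAML config file, and its `endpoint` and `token` element names are assumptions.

```r
# Hypothetical helper illustrating the documented precedence; not the
# package's code. `config` stands in for a parsed YAML config file.
resolveCredentials <- function(endpoint = NULL, token = NULL, config = NULL) {
  # 1. Explicitly passed arguments win.
  if (!is.null(endpoint) && !is.null(token)) {
    return(list(endpoint = endpoint, token = token, source = "arguments"))
  }
  # 2. Next, a config file, if one was found and parsed.
  if (!is.null(config$endpoint) && !is.null(config$token)) {
    return(list(endpoint = config$endpoint, token = config$token,
                source = "config"))
  }
  # 3. Finally, the environment variables.
  envEndpoint <- Sys.getenv("DATAROBOT_API_ENDPOINT", unset = NA)
  envToken <- Sys.getenv("DATAROBOT_API_TOKEN", unset = NA)
  if (!is.na(envEndpoint) && !is.na(envToken)) {
    return(list(endpoint = envEndpoint, token = envToken,
                source = "environment"))
  }
  stop("No DataRobot credentials found")
}

Sys.setenv(DATAROBOT_API_ENDPOINT = "https://app.datarobot.com/api/v2",
           DATAROBOT_API_TOKEN = "thisismyfaketoken")
fromEnv <- resolveCredentials()
fromArgs <- resolveCredentials("https://example.invalid/api/v2",
                               "anotherfaketoken")
```

With the environment variables set, the first call falls through to step 3, while the second call returns the explicit arguments even though the environment variables are still present.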
A duration of six months, 3 days, and 12 hours could be represented as P6M3DT12H.
ConstructDurationString( years = 0, months = 0, days = 0, hours = 0, minutes = 0, seconds = 0 )
years |
integer. The number of years in the duration. |
months |
integer. The number of months in the duration. |
days |
integer. The number of days in the duration. |
hours |
integer. The number of hours in the duration. |
minutes |
integer. The number of minutes in the duration. |
seconds |
integer. The number of seconds in the duration. |
The duration string, specified compatibly with ISO8601.
ConstructDurationString()
ConstructDurationString(days = 100)
ConstructDurationString(years = 10, months = 2, days = 5, seconds = 12)
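To make the ISO 8601 duration format concrete, here is a minimal base R sketch that reproduces the P6M3DT12H example from the description. The `isoDuration` helper is hypothetical and is not ConstructDurationString; in particular, it drops zero-valued components, which may differ from the exact string the package function emits.

```r
# Hypothetical helper, not the package's ConstructDurationString: builds an
# ISO 8601 duration string and omits components that are zero.
isoDuration <- function(years = 0, months = 0, days = 0,
                        hours = 0, minutes = 0, seconds = 0) {
  part <- function(value, unit) if (value != 0) paste0(value, unit) else ""
  datePart <- paste0(part(years, "Y"), part(months, "M"), part(days, "D"))
  timePart <- paste0(part(hours, "H"), part(minutes, "M"), part(seconds, "S"))
  if (datePart == "" && timePart == "") {
    return("PT0S")  # a zero-length duration still needs one component
  }
  # "T" separates the date components from the time components, which is
  # what distinguishes months ("M" before "T") from minutes ("M" after "T").
  paste0("P", datePart, if (timePart != "") paste0("T", timePart) else "")
}

isoDuration(months = 6, days = 3, hours = 12)  # "P6M3DT12H"
```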
Uniquely defines a Backtest used in a DatetimePartitioning
CreateBacktestSpecification( index, gapDuration, validationStartDate, validationDuration )
index |
integer. The index of the backtest |
gapDuration |
character. The desired duration of the gap between training and validation data for the backtest in duration format (ISO8601). |
validationStartDate |
character. The desired start date of the validation data for this backtest (RFC 3339 format). |
validationDuration |
character. The desired duration of the validation data for this backtest in duration format (ISO8601). |
Includes only the attributes of a backtest directly controllable by users. The other attributes are assigned by the DataRobot application based on the project dataset and the user-controlled settings. All durations should be specified with a duration string such as those returned by the ConstructDurationString helper function.
A list with backtest parameters.
zeroDayDuration <- ConstructDurationString()
hundredDayDuration <- ConstructDurationString(days = 100)
CreateBacktestSpecification(index = 0,
                            gapDuration = zeroDayDuration,
                            validationStartDate = "1989-12-01",
                            validationDuration = hundredDayDuration)
Create a calendar from an uploaded CSV.
CreateCalendar( dataSource, name = NULL, multiSeriesIdColumn = NULL, maxWait = 600 )
dataSource |
object. Either (a) the name of a CSV file, or (b) a dataframe. This parameter identifies the source of the calendar data. |
name |
character. Optional. The name of the calendar. |
multiSeriesIdColumn |
character. Optional. Added in 2.19. The column in the calendar that defines which series an event belongs to. Only one column is supported. |
maxWait |
integer. The maximum time (in seconds) to wait for the retrieve to complete. |
An S3 object of class "dataRobotCalendar"
## Not run:
CreateCalendar("inst/extdata/calendar.csv", name = "intlHolidayCalendar")
## End(Not run)
## Not run:
holidayCalendarDF <- as.data.frame(myCalendar)
CreateCalendar(holidayCalendarDF, name = "intlHolidayCalendar")
## End(Not run)
## Not run:
CreateCalendar("inst/extdata/calendar.csv",
               name = "intlHolidayCalendar",
               multiSeriesIdColumn = "Country")
## End(Not run)
Note that if you're looking to download compliance documentation to a DOCX file, you can call DownloadComplianceDocumentation directly without using this function.
CreateComplianceDocumentation(model, templateId = NULL)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
templateId |
character. Optional. The ID of the template to use in generating custom model documentation. |
An integer value that can be used as the jobId parameter in a subsequent call to WaitForJobToComplete.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
jobId <- CreateComplianceDocumentation(model) # optional step
WaitForJobToComplete(projectId, jobId) # optional step
DownloadComplianceDocumentation(model)
## End(Not run)
Create a data source.
CreateDataSource( type, canonicalName, dataStoreId, query = NULL, table = NULL, schema = NULL, partitionColumn = NULL, fetchSize = NULL )
type |
character. The type of data source. |
canonicalName |
character. The user-friendly name of the data source. |
dataStoreId |
character. The ID of the data store to connect to. |
query |
character. A query to execute on the data store to get the data. Optional. |
table |
character. The specified database table. Optional. |
schema |
character. The specified database schema. Optional. |
partitionColumn |
character. The name of the partition column. Optional. |
fetchSize |
integer. A user-specified fetch size in the range [1, 20000]. Optional. By default, a fetchSize will be assigned to balance throughput and memory usage. |
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
CreateDataSource(type = "jdbc",
                 canonicalName = "Airline stats after 1995",
                 dataStoreId = dataStoreId,
                 query = 'SELECT * FROM airlines10mb WHERE "Year" >= 1995;')
## End(Not run)
Create a data store.
CreateDataStore(type, canonicalName, driverId, jdbcUrl)
type |
character. The type of data store. |
canonicalName |
character. The user-friendly name of the data store. |
driverId |
character. The ID of the driver to use. |
jdbcUrl |
character. The full JDBC URL. |
## Not run:
CreateDataStore(type = "jdbc",
                canonicalName = "Demo DB",
                driverId = "57a7c978c808916f4a630f89",
                jdbcUrl = "jdbc:postgresql://my.db.address.org:5432/my_db")
## End(Not run)
Uniquely defines a DatetimePartitioning for some project
CreateDatetimePartitionSpecification( datetimePartitionColumn, autopilotDataSelectionMethod = NULL, validationDuration = NULL, holdoutStartDate = NULL, holdoutDuration = NULL, disableHoldout = NULL, gapDuration = NULL, numberOfBacktests = NULL, backtests = NULL, useTimeSeries = FALSE, defaultToKnownInAdvance = FALSE, featureDerivationWindowStart = NULL, featureDerivationWindowEnd = NULL, featureSettings = NULL, treatAsExponential = NULL, differencingMethod = NULL, windowsBasisUnit = NULL, periodicities = NULL, forecastWindowStart = NULL, forecastWindowEnd = NULL, multiseriesIdColumns = NULL, useCrossSeries = NULL, aggregationType = NULL, crossSeriesGroupByColumns = NULL, calendar = NULL )
datetimePartitionColumn |
character. The name of the column whose values as dates are used to assign a row to a particular partition |
autopilotDataSelectionMethod |
character. Optional. Whether models created by the autopilot should use "rowCount" or "duration" as their dataSelectionMethod |
validationDuration |
character. Optional. The default validationDuration for the backtests |
holdoutStartDate |
character. The start date of holdout scoring data (RFC 3339 format). If holdoutStartDate is specified, holdoutDuration must also be specified. |
holdoutDuration |
character. Optional. The duration of the holdout scoring data. If holdoutDuration is specified, holdoutStartDate must also be specified. |
disableHoldout |
logical. Optional. Whether to suppress allocating the holdout fold. If set to TRUE, holdoutStartDate and holdoutDuration must not be specified. |
gapDuration |
character. Optional. The duration of the gap between training and holdout scoring data. |
numberOfBacktests |
integer. The number of backtests to use. |
backtests |
list. A list of BacktestSpecification objects giving the exact specification of the backtests to use. The indexes of the specified backtests should range from 0 to numberOfBacktests - 1. If any backtest is left unspecified, a default configuration will be chosen. |
useTimeSeries |
logical. Whether to create a time series project (if TRUE) or an OTV project which uses datetime partitioning (if FALSE). The default behavior is to create an OTV project. |
defaultToKnownInAdvance |
logical. Whether to default to treating features as known in advance. Defaults to FALSE. Only used for time series project. Known in advance features are expected to be known for dates in the future when making predictions (e.g., "is this a holiday"). |
featureDerivationWindowStart |
integer. Optional. Offset into the past to define how far
back relative to the forecast point the feature derivation window should start. Only used for
time series projects. Expressed in terms of the |
featureDerivationWindowEnd |
integer. Optional. Offset into the past to define how far
back relative to the forecast point the feature derivation window should end. Only used for
time series projects. Expressed in terms of the |
featureSettings |
list. Optional. A list specifying settings for each feature. For each feature you would like to set feature settings for, pass the following in a list:
|
treatAsExponential |
character. Optional. Defaults to "auto". Used to specify whether to
treat data as exponential trend and apply transformations like log-transform. Use values
from |
differencingMethod |
character. Optional. Defaults to "auto". Used to specify differencing
method to apply if data is stationary. Use values from |
windowsBasisUnit |
character. Optional. Indicates which unit is the basis for the feature
derivation window and forecast window. Valid options are a time unit (see |
periodicities |
list. Optional. A list of periodicities for different times. Must be
specified as a list of lists, where each list item specifies the 'timeSteps' for a
particular 'timeUnit'. Should be "ROW" if |
forecastWindowStart |
integer. Optional. Offset into the future to define how far forward
relative to the forecast point the forecast window should start. Only used for time series
projects. Expressed in terms of the |
forecastWindowEnd |
integer. Optional. Offset into the future to define how far forward
relative to the forecast point the forecast window should end. Only used for time series
projects. Expressed in terms of the |
multiseriesIdColumns |
list. A list of the names of multiseries id columns to define series |
useCrossSeries |
logical. If |
aggregationType |
character. Optional. The aggregation type to apply when creating cross
series features. Must be either "total" or "average". See |
crossSeriesGroupByColumns |
character. Optional. Column to split a cross series into
further groups. For example, if every series is sales of an individual product, the cross
series group could be the product category with values like "men's clothing", "sports
equipment", etc. Requires multiseries with |
calendar |
character. Optional. Either the calendar object or calendar id to use for this project. |
Includes only the attributes of DatetimePartitioning that are directly controllable by users, not those determined by the DataRobot application based on the project dataset and the user-controlled settings. This is the specification that should be passed to SetTarget via the partition parameter. To see the full partitioning based on the project dataset, use GenerateDatetimePartition. All durations should be specified with a duration string such as those returned by the ConstructDurationString helper function.
An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a datetime partitioning of the modeling dataset.
CreateDatetimePartitionSpecification("date_col")
CreateDatetimePartitionSpecification("date",
  featureSettings = list(
    list("featureName" = "Product_offers", "defaultToKnownInAdvance" = TRUE)))
partition <- CreateDatetimePartitionSpecification("dateColumn",
  treatAsExponential = TreatAsExponential$Always,
  differencingMethod = DifferencingMethod$Seasonal,
  periodicities = list(list("timeSteps" = 10, "timeUnit" = "HOUR"),
                       list("timeSteps" = 600, "timeUnit" = "MINUTE"),
                       list("timeSteps" = 7, "timeUnit" = "DAY")))
Create a deployment.
CreateDeployment( model, label = "", description = "", defaultPredictionServerId = NULL )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
label |
character. The name of the deployment. |
description |
character. Optional. A longer description of the deployment. |
defaultPredictionServerId |
character. The ID of the prediction server to connect to. Can also be a prediction server object. |
A DataRobotDeployment object containing:
id character. The ID of the deployment.
label character. The label of the deployment.
description character. The description of the deployment.
defaultPredictionServer list. Information on the default prediction server connected with the deployment. See ListPredictionServers for details.
model dataRobotModel. The model associated with the deployment. See GetModel for details.
capabilities list. Information on the capabilities of the deployment.
predictionUsage list. Information on the prediction usage of the deployment.
permissions list. User's permissions on the deployment.
serviceHealth list. Information on the service health of the deployment.
modelHealth list. Information on the model health of the deployment.
accuracyHealth list. Information on the accuracy health of the deployment.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
predictionServer <- ListPredictionServers()[[1]]
CreateDeployment(model,
                 label = "myDeployment",
                 description = "this is my deployment",
                 defaultPredictionServerId = predictionServer)
## End(Not run)
These functions request that new features be created as transformations of existing features and wait for the new feature to be created.
CreateDerivedFeatureAsCategorical( project, parentName, name = NULL, dateExtraction = NULL, replacement = NULL, maxWait = 600 )
CreateDerivedFeatureAsText( project, parentName, name = NULL, dateExtraction = NULL, replacement = NULL, maxWait = 600 )
CreateDerivedFeatureAsNumeric( project, parentName, name = NULL, dateExtraction = NULL, replacement = NULL, maxWait = 600 )
CreateDerivedFeatureIntAsCategorical( project, parentName, name = NULL, dateExtraction = NULL, replacement = NULL, maxWait = 600 )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
parentName |
The name of the parent feature. |
name |
The name of the new feature. |
dateExtraction |
The value to extract from the date column: 'year', 'yearDay', 'month', 'monthDay', 'week', or 'weekDay'. Required for transformation of a date column. Otherwise must not be provided. |
replacement |
The replacement in case of a failed transformation. Optional. |
maxWait |
The maximum time (in seconds) to wait for feature creation. |
Details for the created feature; same schema as the object returned from GetFeatureInfo.
This function allows the user to create a new featurelist in a project by specifying its name and a list of variables to be included
CreateFeaturelist(project, listName, featureNames)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
listName |
character. String identifying the new featurelist to be created. |
featureNames |
character. Vector listing the names of the variables to be included in the featurelist. |
DataRobot featurelists define the variables from the modeling dataset used in fitting each project model. Some functions (SetTarget, StartNewAutopilot) optionally accept a featurelist (and use a default featurelist if none is specified).
A list with the following four elements describing the featurelist created:
featurelistId. Character string giving the unique alphanumeric identifier for the new featurelist.
projectId. Character string giving the projectId identifying the project to which the featurelist was added.
features. Character vector with the names of the variables included in the new featurelist.
name. Character string giving the name of the new featurelist.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
CreateFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2", "otherFeature"))
## End(Not run)
Group partitioning constructs data partitions such that all records with each level in the column specified by the parameter partitionKeyCols occur together in the same partition.
CreateGroupPartition( validationType, holdoutPct, partitionKeyCols, reps = NULL, validationPct = NULL )
validationType |
character. String specifying the type of partition generated, either "TVH" or "CV". |
holdoutPct |
integer. The percentage of data to be used as the holdout subset. |
partitionKeyCols |
list. List containing a single string specifying the name of the variable used in defining the group partition. |
reps |
integer. The number of cross-validation folds to generate; only applicable when validationType = "CV". |
validationPct |
integer. The percentage of data to be used as the validation subset. |
This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateRandomPartition, CreateStratifiedPartition, and CreateUserPartition.
An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a group-based partitioning of the modeling dataset.
CreateRandomPartition, CreateStratifiedPartition, CreateUserPartition.
CreateGroupPartition(validationType = "CV", holdoutPct = 20, partitionKeyCols = list("groupId"), reps = 5)
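The grouping guarantee described for CreateGroupPartition can be illustrated with a small base R sketch that does not call DataRobot. The helper name assignGroupFolds and the toy data are hypothetical, and the way DataRobot actually assigns groups to partitions may differ; the sketch only shows what "all records with each level occur together" means.

```r
# Hypothetical helper (not part of the datarobot package): assign every level
# of a grouping column to exactly one fold, so that rows sharing a level are
# never split across folds.
assignGroupFolds <- function(groups, reps) {
  groupLevels <- unique(groups)
  # Cycle through fold numbers 1..reps across the distinct group levels.
  foldOfLevel <- setNames(rep_len(seq_len(reps), length(groupLevels)), groupLevels)
  unname(foldOfLevel[as.character(groups)])
}

groupId <- c("a", "a", "b", "c", "b", "d", "c", "a")
folds <- assignGroupFolds(groupId, reps = 2)

# Count how many distinct folds each group level appears in.
foldsPerGroup <- tapply(folds, groupId, function(f) length(unique(f)))
```

Every entry of foldsPerGroup is 1 here, which is the property group partitioning is meant to provide.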
In time series projects, a new set of modeling features is created after setting the partitioning options. These features are automatically derived from those in the project's dataset and are the features used for modeling. Modeling features are only accessible once the target and partitioning options have been set. In projects that don't use time series modeling, once the target has been set, ModelingFeaturelists and Featurelists will behave the same.
CreateModelingFeaturelist(project, listName, featureNames)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
listName |
character. String identifying the new featurelist to be created. |
featureNames |
character. Vector listing the names of the variables to be included in the featurelist. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
CreateModelingFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2"))
## End(Not run)
Create and validate the downloadable code for the ruleset associated with this model
CreatePrimeCode(project, primeModelId, language)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
primeModelId |
character. Id returned by GetPrimeModel(s) functions. |
language |
character. Programming language to use for downloadable code (see PrimeLanguage). |
The job ID.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
CreatePrimeCode(projectId, modelId, "Python")
## End(Not run)
Random partitioning is supported for either Training/Validation/Holdout ("TVH") or cross-validation ("CV") splits. In either case, the holdout percentage (holdoutPct) must be specified; for the "CV" method, the number of cross-validation folds (reps) must also be specified, while for the "TVH" method, the validation subset percentage (validationPct) must be specified.
CreateRandomPartition( validationType, holdoutPct, reps = NULL, validationPct = NULL )
validationType |
character. String specifying the type of partition generated, either "TVH" or "CV". |
holdoutPct |
integer. The percentage of data to be used as the holdout subset. |
reps |
integer. The number of cross-validation folds to generate; only applicable when validationType = "CV". |
validationPct |
integer. The percentage of data to be used as the validation subset. |
This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateGroupPartition, CreateStratifiedPartition, and CreateUserPartition.
An S3 object of class partition including the parameters required by SetTarget to generate a random partitioning of the modeling dataset.
CreateStratifiedPartition, CreateGroupPartition, CreateUserPartition.
CreateRandomPartition(validationType = "CV", holdoutPct = 20, reps = 5)
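The argument rules stated for CreateRandomPartition (holdoutPct always, reps for "CV", validationPct for "TVH") can be written as a small pre-check. This is a sketch in base R under the assumption that you want to validate arguments before calling the package; checkPartitionArgs is a hypothetical helper, not a datarobot function, and the package performs its own checks.

```r
# Hypothetical checker for the documented argument rules; not part of the package.
checkPartitionArgs <- function(validationType, holdoutPct,
                               reps = NULL, validationPct = NULL) {
  if (missing(holdoutPct)) {
    stop("holdoutPct must be specified")
  }
  if (!validationType %in% c("TVH", "CV")) {
    stop("validationType must be \"TVH\" or \"CV\"")
  }
  if (validationType == "CV" && is.null(reps)) {
    stop("reps must be specified when validationType = \"CV\"")
  }
  if (validationType == "TVH" && is.null(validationPct)) {
    stop("validationPct must be specified when validationType = \"TVH\"")
  }
  invisible(TRUE)
}

cvOk <- checkPartitionArgs("CV", holdoutPct = 20, reps = 5)
tvhOk <- checkPartitionArgs("TVH", holdoutPct = 20, validationPct = 16)

# A "TVH" request without validationPct is rejected.
tvhMissing <- tryCatch(
  checkPartitionArgs("TVH", holdoutPct = 20),
  error = function(e) "rejected"
)
```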
Creates and validates a new rating table from an uploaded CSV.
CreateRatingTable( project, parentModelId, dataSource, ratingTableName = "Uploaded Rating Table" )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
parentModelId |
integer. The id of the model to validate the rating table against. |
dataSource |
object. Either (a) the name of a CSV file, or (b) a dataframe. This parameter identifies the source of the rating table. |
ratingTableName |
character. Optional. The name of the rating table. |
An integer value that can be used as the JobId parameter in subsequent calls representing this job.
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
modelId <- "5984b4d7100d2b31c1166529"
CreateRatingTable(projectId, modelId, dataSource = "myRatingTable.csv")
## End(Not run)
Stratified partitioning is supported for binary classification problems and it randomly partitions the modeling data, keeping the percentage of positive class observations in each partition the same as in the original dataset. Stratified partitioning is supported for either Training/Validation/Holdout ("TVH") or cross-validation ("CV") splits. In either case, the holdout percentage (holdoutPct) must be specified; for the "CV" method, the number of cross-validation folds (reps) must also be specified, while for the "TVH" method, the validation subset percentage (validationPct) must be specified.
CreateStratifiedPartition( validationType, holdoutPct, reps = NULL, validationPct = NULL )
validationType |
character. String specifying the type of partition generated, either "TVH" or "CV". |
holdoutPct |
integer. The percentage of data to be used as the holdout subset. |
reps |
integer. The number of cross-validation folds to generate; only applicable when validationType = "CV". |
validationPct |
integer. The percentage of data to be used as the validation subset. |
This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateGroupPartition, CreateRandomPartition, and CreateUserPartition.
An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a stratified partitioning of the modeling dataset.
CreateGroupPartition, CreateRandomPartition, CreateUserPartition.
CreateStratifiedPartition(validationType = "CV", holdoutPct = 20, reps = 5)
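To make the stratification property concrete, the following base R sketch draws a holdout separately within each class, so the share of positive observations in the holdout matches the full dataset. It does not call DataRobot; stratifiedHoldout and the toy target are hypothetical, and DataRobot's own sampling procedure may differ.

```r
set.seed(42)
target <- c(rep(1, 20), rep(0, 80))  # toy binary target with 20% positives

# Hypothetical helper: sample holdoutPct percent of the rows within each class.
stratifiedHoldout <- function(target, holdoutPct) {
  idxByClass <- split(seq_along(target), target)
  unlist(lapply(idxByClass, function(idx) {
    n <- round(length(idx) * holdoutPct / 100)
    idx[sample.int(length(idx), size = n)]
  }), use.names = FALSE)
}

holdoutIdx <- stratifiedHoldout(target, holdoutPct = 20)

holdoutRate <- mean(target[holdoutIdx])  # positive rate in the holdout
overallRate <- mean(target)              # positive rate in the full data
```

With these toy sizes the two rates coincide exactly; with class counts that do not divide evenly they would agree only approximately.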
Creates a list object used by the SetTarget function to specify either Training/Validation/Holdout (validationType = "TVH") or cross-validation (validationType = "CV") partitions of the modeling dataset based on the values included in a column from the dataset. In either case, the name of this data column must be specified (as userPartitionCol).
CreateUserPartition( validationType, userPartitionCol, cvHoldoutLevel = NULL, trainingLevel = NULL, holdoutLevel = NULL, validationLevel = NULL )
validationType |
character. String specifying the type of partition generated, either "TVH" or "CV". |
userPartitionCol |
character. String naming the data column from the modeling dataset containing the subset designations. |
cvHoldoutLevel |
character. Data value from userPartitionCol that identifies the holdout subset under the "CV" option. |
trainingLevel |
character. Data value from userPartitionCol that identifies the training subset under the "TVH" option. |
holdoutLevel |
character. Data value from userPartitionCol that identifies the holdout subset under both "TVH" and "CV" options. To specify that the project should not use a holdout you can omit this parameter or pass NA directly. |
validationLevel |
character. Data value from userPartitionCol that identifies the validation subset under the "TVH" option. |
For the "TVH" option of validationType, no cross-validation is used. Users must specify the trainingLevel and validationLevel; use of a holdoutLevel is always recommended but not required. If no holdoutLevel is used, then the column must contain exactly 2 unique values. If a holdoutLevel is used, the column must contain exactly 3 unique values.
For the "CV" option, each value in the column will be used to separate rows into cross-validation folds. Use of a holdoutLevel is optional; if not specified, then no holdout is used.
This function is one of several convenience functions provided to simplify the task of starting modeling projects with custom partitioning options. The other functions are CreateGroupPartition, CreateRandomPartition, and CreateStratifiedPartition.
An S3 object of class 'partition' including the parameters required by the SetTarget function to generate a user-specified partitioning of the modeling dataset.
CreateGroupPartition, CreateRandomPartition, CreateStratifiedPartition.
CreateUserPartition(validationType = "CV", userPartitionCol = "TVHflag", cvHoldoutLevel = NA)
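The unique-value requirement for the "TVH" option (exactly 2 values without a holdout, exactly 3 with one) can be checked on the partition column before the partition is created. The sketch below is plain base R; checkTvhColumn is a hypothetical helper and is not part of the datarobot package.

```r
# Hypothetical pre-check of a user partition column for the "TVH" option.
checkTvhColumn <- function(values, trainingLevel, validationLevel,
                           holdoutLevel = NA) {
  expected <- c(trainingLevel, validationLevel)
  if (!is.na(holdoutLevel)) {
    expected <- c(expected, holdoutLevel)
  }
  observed <- unique(values)
  # The column must contain exactly the expected levels and nothing else.
  length(observed) == length(expected) && all(observed %in% expected)
}

flag <- c("T", "T", "V", "H", "T", "V")
withHoldout <- checkTvhColumn(flag, "T", "V", "H")  # three levels, holdout used
noHoldout <- checkTvhColumn(flag, "T", "V")         # FALSE: "H" is present but unused
```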
Note that this runs cross validation on a model as-is. If you would like to run cross-validation on a model with new parameters, use RequestNewModel instead.
CrossValidateModel(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
Note that this is not implemented for prime models or datetime models.
Job ID of the cross validation job.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
CrossValidateModel(model)
## End(Not run)
This is a list that contains the valid values for CV methods
cvMethods
An object of class list of length 5.
This is a list that contains the valid values for data partitions
DataPartition
An object of class list of length 3.
Verifies that new data is either an existing datafile or a dataframe. If a dataframe, saves it as a CSV file. If neither an existing datafile nor a dataframe, halts with an error.
DataPathFromDataArg(dataSource, saveFile = NULL)
dataSource |
object. The dataframe or path to CSV to get data for. |
saveFile |
character. Optional. A file name to write an autosaved dataframe to. |
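The three documented cases (dataframe, existing file, anything else) can be restated as a short base R sketch. dataPathSketch is a hypothetical stand-in written only from the description above; the package's own implementation may differ in details such as how the autosaved file is named.

```r
# Hypothetical re-statement of the documented behaviour of DataPathFromDataArg.
dataPathSketch <- function(dataSource, saveFile = NULL) {
  if (is.data.frame(dataSource)) {
    # Assumption: fall back to a temporary CSV when no file name is given.
    if (is.null(saveFile)) {
      saveFile <- tempfile(fileext = ".csv")
    }
    write.csv(dataSource, saveFile, row.names = FALSE)
    return(saveFile)
  }
  if (is.character(dataSource) && length(dataSource) == 1 &&
      file.exists(dataSource)) {
    return(dataSource)
  }
  stop("dataSource must be an existing file or a dataframe")
}

# A dataframe is written out and the CSV path is returned.
path <- dataPathSketch(data.frame(x = 1:3, y = c("a", "b", "c")))
# An existing file path is returned unchanged.
samePath <- dataPathSketch(path)
```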
This is a list that contains the valid values for the dataSubset parameter found in RequestTrainingPredictions. If you wish, you can specify dataSubset using the list values here.
DataSubset
An object of class list of length 4.
For All, all available data is used.
For ValidationAndHoldout, only data outside the training set is used.
For Holdout, only holdout data is used.
For AllBacktests, data is used from all backtest validation folds. This requires the model to have successfully scored all backtests. Backtests are available on datetime partitioned projects only.
Datetime trend plots resolutions
DatetimeTrendPlotsResolutions
An object of class list of length 9.
Datetime trend plots statuses
DatetimeTrendPlotsStatuses
An object of class list of length 6.
Deletes an anomaly assessment record together with its predictions preview and explanations.
DeleteAnomalyAssessmentRecord(projectId, recordId)
projectId |
character. The ID of the project. |
recordId |
character. The ID of the anomaly assessment record. |
Other Anomaly Assessment functions: GetAnomalyAssessmentExplanations(), GetAnomalyAssessmentPredictionsPreview(), InitializeAnomalyAssessment(), ListAnomalyAssessmentRecords()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
recordId <- "59a5af20c80891534e3c2bdb"
DeleteAnomalyAssessmentRecord(projectId, recordId)
## End(Not run)
Delete a calendar
DeleteCalendar(calendarId)
calendarId |
character. The ID of the calendar to delete. |
## Not run:
calendarId <- "5da75da31fb4a45b8a815a53"
DeleteCalendar(calendarId)
## End(Not run)
Note that default templates cannot be deleted.
DeleteComplianceDocTemplate(templateId)
templateId |
character. The ID of the template to delete. |
Nothing returned, but deletes the compliance doc template.
## Not run:
templateId <- "5cf85080d9436e5c310c796d"
DeleteComplianceDocTemplate(templateId)
## End(Not run)
Delete a data source.
DeleteDataSource(dataSourceId)
dataSourceId |
character. The ID of the data source to delete. |
## Not run:
dataSourceId <- "5c1303269300d900016b41a7"
DeleteDataSource(dataSourceId)
## End(Not run)
Delete a data store.
DeleteDataStore(dataStoreId)
dataStoreId |
character. The ID of the data store to delete. |
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
DeleteDataStore(dataStoreId)
## End(Not run)
Delete a deployment.
DeleteDeployment(deploymentId)
deploymentId |
character. The ID of the deployment. |
## Not run:
deploymentId <- "5e319d2e422fbd6b58a5edad"
DeleteDeployment(deploymentId)
## End(Not run)
Delete a featurelist
DeleteFeaturelist(featurelist)
featurelist |
list. The featurelist to delete. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
featureList <- CreateFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2"))
DeleteFeaturelist(featureList)
## End(Not run)
Cancel a running job
DeleteJob(job)
job |
object. The job you want to cancel (one of the items in the list returned from
|
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- ListModelJobs(projectId)
job <- initialJobs[[1]]
DeleteJob(job)
## End(Not run)
This function removes the model specified by the parameter model from its associated project.
DeleteModel(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
DeleteModel(model)
## End(Not run)
Delete a modeling featurelist
DeleteModelingFeaturelist(featurelist)
featurelist |
list. The modeling featurelist to delete. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
featureList <- CreateModelingFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2"))
featurelistId <- featureList$featurelistId
GetModelingFeaturelist(projectId, featurelistId)
# DeleteModelingFeaturelist takes the featurelist object itself, per its usage above.
DeleteModelingFeaturelist(featureList)
## End(Not run)
This function deletes the modeling job specified by modelJobId from the DataRobot modeling queue.
DeleteModelJob(project, modelJobId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelJobId |
integer. Identifier for the modeling job to be deleted; can be obtained from the results returned by the function ListModelJobs. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- ListModelJobs(projectId)
job <- initialJobs[[1]]
modelJobId <- job$modelJobId
DeleteModelJob(projectId, modelJobId)
## End(Not run)
This function removes a prediction dataset
DeletePredictionDataset(project, datasetId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
datasetId |
The id of the dataset to delete |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
datasetId <- dataset$id
DeletePredictionDataset(projectId, datasetId)
## End(Not run)
This function deletes prediction explanations specified by project and predictionExplanationId.
DeletePredictionExplanations(project, predictionExplanationId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictionExplanationId |
character. Id of the prediction explanations. |
Logical TRUE and displays a message to the user if the delete request was successful; otherwise an error message is displayed.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
datasetId <- dataset$id
model <- GetModel(projectId, modelId)
jobId <- RequestPredictionExplanations(model, datasetId)
predictionExplanationId <- GetPredictionExplanationsMetadataFromJobId(projectId, jobId)$id
DeletePredictionExplanations(projectId, predictionExplanationId)
## End(Not run)
Delete the prediction explanations initialization for a model.
DeletePredictionExplanationsInitialization(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
Logical TRUE and displays a message to the user if the delete request was successful; otherwise an error message is displayed.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
DeletePredictionExplanationsInitialization(model)
## End(Not run)
This function deletes the predict job specified by predictJobId from the DataRobot queue.
DeletePredictJob(project, predictJobId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictJobId |
integer. The integer ID |
Logical TRUE and displays a message to the user if the delete request was successful; otherwise, execution halts and an error message is displayed.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- GetPredictJobs(projectId)
job <- initialJobs[[1]]
predictJobId <- job$predictJobId
DeletePredictJob(projectId, predictJobId)
## End(Not run)
This function deletes the project defined by project, described under Arguments. This parameter may be obtained in several ways, including: (1), as one of the projectId elements of the list returned by ListProjects; (2), as the S3 object returned by the GetProject function; or (3), as the list returned by the SetupProject function.
DeleteProject(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
DeleteProject(projectId)
## End(Not run)
Delete this imported model.
DeleteTransferableModel(importId)
importId |
character. Id of the import. |
Other Transferable Model functions: DownloadTransferableModel(), GetTransferableModel(), ListTransferableModels(), RequestTransferableModel(), UpdateTransferableModel(), UploadTransferableModel()
## Not run:
id <- UploadTransferableModel("model.drmodel")
DeleteTransferableModel(id)
## End(Not run)
All possible deployment accuracy metrics. Added in DataRobot API 2.18.
DeploymentAccuracyMetric
An object of class list of length 27.
For usage, see DeploymentAccuracy and DeploymentAccuracyOverTime.
Added in DataRobot API 2.18.
DeploymentServiceHealthMetric
An object of class list of length 11.
For usage, see GetDeploymentServiceStats.
Differencing method
DifferencingMethod
An object of class list of length 4.
Download a compliance doc template (in JSON format).
DownloadComplianceDocTemplate( filename = "template.json", templateId = NULL, type = NULL )
filename |
character. Filename of file to save the compliance doc template to. |
templateId |
character. Optional. The ID of the template to use in generating custom model documentation. |
type |
character. Optional. The type of compliance doc to get. Can be "normal" to retrieve the default template or "timeSeries" to get the default time series template. |
Nothing returned, but downloads the file to the stated filename.
## Not run:
# Download the default template
DownloadComplianceDocTemplate("template.json")
# Download the default template, with the type stated explicitly
DownloadComplianceDocTemplate("template.json", type = "normal")
# Download the default time series template
DownloadComplianceDocTemplate("template.json", type = "timeSeries")
# Download a custom template for a specific ID
templateId <- "5cf85080d9436e5c310c796d"
DownloadComplianceDocTemplate("template.json", templateId = templateId)
## End(Not run)
This function will create the compliance documentation first if it has not already been created. To create compliance documentation without downloading it, use CreateComplianceDocumentation. You can then skip the create step in this function by using 'create = FALSE'.
DownloadComplianceDocumentation( model, filename, templateId = NULL, create = TRUE, maxWait = 600 )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
filename |
character. Filename of file to save the compliance documentation to. |
templateId |
character. Optional. The ID of the template to use in generating custom model documentation. |
create |
logical. Should we create the compliance documentation prior to downloading? |
maxWait |
integer. How long to wait (in seconds) for compliance documentation creation before raising a timeout error? Default 600. |
Nothing returned, but downloads the file to the stated filename.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) file <- file.path(tempdir(), "complianceDocumentation.docx") DownloadComplianceDocumentation(model, file) ## End(Not run)
Download prediction explanations rows and save them as a CSV file.
DownloadPredictionExplanations( project, predictionExplanationId, filename, encoding = "UTF-8", excludeAdjustedPredictions = TRUE )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictionExplanationId |
character. Id of the prediction explanations. |
filename |
character. Filename of file to save prediction explanations rows |
encoding |
character. Optional. A string representing the encoding to use in the output file. Defaults to 'UTF-8'. |
excludeAdjustedPredictions |
logical. Optional. Set to FALSE to include adjusted predictions, which are predictions adjusted by an exposure column. This is only relevant for projects that use an exposure column. |
Logical TRUE and displays a message to the user if the download request was successful; otherwise an error message is displayed.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" datasets <- ListPredictionDatasets(projectId) dataset <- datasets[[1]] datasetId <- dataset$id model <- GetModel(projectId, modelId) jobId <- RequestPredictionExplanations(model, datasetId) predictionExplanationId <- GetPredictionExplanationsMetadataFromJobId(projectId, jobId)$id file <- file.path(tempdir(), "testPredictionExplanation.csv") DownloadPredictionExplanations(projectId, predictionExplanationId, file) ## End(Not run)
Training a model using a ruleset is a necessary prerequisite for being able to download the code for a ruleset.
DownloadPrimeCode(project, primeFileId, filepath)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
primeFileId |
numeric. Prime file Id (can be acquired using ListPrimeFiles function) |
filepath |
character. The location to save the file to. |
## Not run: projectId <- "59a5af20c80891534e3c2bde" primeFiles <- ListPrimeFiles(projectId) primeFile <- primeFiles[[1]] primeFileId <- primeFile$id file <- file.path(tempdir(), "primeCode.py") DownloadPrimeCode(projectId, primeFileId, file) ## End(Not run)
Download a rating table to a CSV.
DownloadRatingTable(project, ratingTableId, filename)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
ratingTableId |
character. The ID of the rating table. |
filename |
character. Filename of file to save the rating table to. |
Nothing returned, but downloads the file to the stated filename.
## Not run: projectId <- "5984b4d7100d2b31c1166529" ratingTableId <- "5984b4d7100d2b31c1166529" file <- file.path(tempdir(), "ratingTable.csv") DownloadRatingTable(projectId, ratingTableId, file) ## End(Not run)
Download scoring code JAR
DownloadScoringCode(project, modelId, fileName, sourceCode = FALSE)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
fileName |
character. File path where scoring code will be saved. |
sourceCode |
logical. Optional. Set to TRUE to download source code archive. It will not be executable. |
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" file <- file.path(tempdir(), "scoringCode.jar") DownloadScoringCode(projectId, modelId, file) ## End(Not run)
Download the series accuracy for a model, computing it if not already computed.
DownloadSeriesAccuracy(model, filename, encoding = "UTF-8")
model |
character. The model for which you want to compute Feature Impact, e.g.
from the list of models returned by |
filename |
character. Filename of file to save the series accuracy to.
encoding |
character. Optional. A string representing the encoding to use in the output file. Defaults to 'UTF-8'. |
Nothing returned, but downloads the file to the stated filename.
## Not run: projectId <- "5984b4d7100d2b31c1166529" modelId <- "5984b4d7100d2b31c1166529" model <- GetModel(projectId, modelId) DownloadSeriesAccuracy(model, "seriesAccuracy.csv") ## End(Not run)
Download the time series feature derivation log as a text file.
DownloadTimeSeriesFeatureDerivationLog(project, file)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
file |
character. The name or path of the file to download to. |
Nothing, but writes the output to the desired file.
GetTimeSeriesFeatureDerivationLog
## Not run: projectId <- "5984b4d7100d2b31c1166529" DownloadTimeSeriesFeatureDerivationLog(projectId, "featureLog.txt") ## End(Not run)
Download training predictions on a specified data set.
DownloadTrainingPredictions( project, predictionId, filename, encoding = "UTF-8" )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictionId |
character. ID of the prediction to retrieve training predictions for. |
filename |
character. Filename of file to save the training predictions to.
encoding |
character. Optional. A string representing the encoding to use in the output file. Defaults to 'UTF-8'. |
NULL, but writes a CSV file containing the out-of-fold predictions for the training data.
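To make the roles of filename and encoding concrete, here is a minimal base R sketch of writing a data frame to a CSV file with a chosen encoding and reading it back. It is not the package's implementation, and the mock data frame and its column names are assumptions made for illustration.

```r
# Mock training predictions; the column names are assumptions for illustration.
predictions <- data.frame(rowId = 0:2, prediction = c(0.12, 0.87, 0.45))

file <- file.path(tempdir(), "myTrainingPredictions.csv")
encoding <- "UTF-8"

# Write with an explicit encoding, then read the file back the same way.
write.csv(predictions, file, row.names = FALSE, fileEncoding = encoding)
roundTrip <- read.csv(file, fileEncoding = encoding)
```

Reading the file with the same encoding that was used to write it avoids garbled non-ASCII values in character columns.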
## Not run: projectId <- "59a5af20c80891534e3c2bde" predictions <- ListTrainingPredictions(projectId) predictionId <- predictions[[1]]$predictionId file <- file.path(tempdir(), "myTrainingPredictions.csv") DownloadTrainingPredictions(projectId, predictionId, file) ## End(Not run)
This function can only be used if model export is enabled, and will only be useful if you have an on-premise environment in which to import it.
DownloadTransferableModel(project, modelId, modelFile)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
modelFile |
character. File name to be used for the transferable model.
Other Transferable Model functions: DeleteTransferableModel(), GetTransferableModel(), ListTransferableModels(), RequestTransferableModel(), UpdateTransferableModel(), UploadTransferableModel()
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" file <- file.path(tempdir(), "model.drmodel") DownloadTransferableModel(projectId, modelId, file) ## End(Not run)
Make sure that the object has all of the keys specified. Also tests that there are no additional keys if allowAdditional is FALSE (default).
ExpectHasKeys(obj, keys, allowAdditional = FALSE)
obj |
object. A list, vector, or data.frame to check names. |
keys |
character. A vector of names of keys to check. |
allowAdditional |
logical. Should we allow there to be more keys than specified? |
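The check described above can be sketched in base R as follows. hasKeys is a hypothetical stand-in, not the package's function: it returns TRUE or FALSE rather than raising a test expectation, and the mock job list is invented for illustration.

```r
# Hypothetical stand-in for the documented check; returns TRUE/FALSE.
hasKeys <- function(obj, keys, allowAdditional = FALSE) {
  missingKeys <- setdiff(keys, names(obj))  # required keys that are absent
  extraKeys <- setdiff(names(obj), keys)    # keys present but not listed
  length(missingKeys) == 0 && (allowAdditional || length(extraKeys) == 0)
}

job <- list(id = "1", status = "queue", url = "https://example.invalid/job/1")
exact <- hasKeys(job, c("id", "status", "url"))
tooMany <- hasKeys(job, c("id", "status"))
allowed <- hasKeys(job, c("id", "status"), allowAdditional = TRUE)
```

Here exact and allowed are TRUE, while tooMany is FALSE because the object carries a url key that was not listed.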
If feature creation times out, the error message includes a URL corresponding to the creation task. That URL can be passed to this function (which will return the feature details when finished) to resume waiting for feature creation.
FeatureFromAsyncUrl(asyncUrl, maxWait = 600)
asyncUrl |
character. The temporary status URL. |
maxWait |
integer. Optional. The maximum time to wait (in seconds) for feature creation before aborting. |
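The wait-then-time-out behaviour governed by maxWait can be pictured as a polling loop. The sketch below is an assumption about the general pattern, not the package's code: waitUntilDone and mockCheck are hypothetical names, and no real status URL is contacted.

```r
# Hypothetical polling loop; checkDone is any function that returns NULL while
# the task is still running and the finished result otherwise.
waitUntilDone <- function(checkDone, maxWait = 600, interval = 1) {
  deadline <- Sys.time() + maxWait
  repeat {
    result <- checkDone()
    if (!is.null(result)) {
      return(result)
    }
    if (Sys.time() >= deadline) {
      stop("Timed out after ", maxWait, " seconds")
    }
    Sys.sleep(interval)
  }
}

# Mock task that finishes on the third status check.
calls <- 0
mockCheck <- function() {
  calls <<- calls + 1
  if (calls < 3) NULL else list(name = "newFeature")
}
feature <- waitUntilDone(mockCheck, maxWait = 5, interval = 0)
```

A larger maxWait simply moves the deadline further out; resuming after a timeout amounts to starting the same loop again against the same status source.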
The DataRobot APIs expect dates formatted as RFC 3339 strings. RFC 3339 is a profile of ISO 8601. To be safe, use UTC as the timezone (and format it with a 'Z' suffix), and use 'T' as the date/time separator.
formatRFC3339Timestamp(date)
date |
POSIXt or date. The date(s) to be formatted. |
Other API datetime functions: RFC3339DateTimeFormat, parseRFC3339Timestamp(), transformRFC3339Period(), validateReportingPeriodTime()
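The recommended shape (UTC, 'Z' suffix, 'T' separator) can be produced with base R alone. The helper below is a hypothetical sketch named toRfc3339Sketch; it is not the package's formatRFC3339Timestamp and may differ from it in detail, for example in how fractional seconds are handled.

```r
# Hypothetical helper: render a POSIXct (or Date) value as an RFC 3339 string
# in UTC, using 'T' as the separator and a literal 'Z' suffix.
toRfc3339Sketch <- function(x) {
  format(as.POSIXct(x, tz = "UTC"), "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")
}

ts <- as.POSIXct("2020-03-01 12:30:00", tz = "UTC")
stamp <- toRfc3339Sketch(ts)
```

For the input above, stamp is "2020-03-01T12:30:00Z".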
Based on the project dataset and the partitioning specification, inspect the full partitioning that would be used if the same specification were passed into SetTarget. This is not intended to be passed to SetTarget.
GenerateDatetimePartition(project, spec)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
spec |
list. Datetime partition specification returned by
|
A list describing the datetime partition, with the following components:
cvMethod. The type of validation scheme used for the project.
projectId character. The id of the project this partitioning applies to.
datetimePartitionColumn character. The name of the column whose values as dates are used to assign a row to a particular partition.
dateFormat character. The format by which the partition column was interpreted (compatible with strftime [https://docs.python.org/2/library/time.html#time.strftime]).
autopilotDataSelectionMethod character. Whether models created by the autopilot use "rowCount" or "duration" as their dataSelectionMethod.
validationDuration character. The validation duration specified when initializing the partitioning - not directly significant if the backtests have been modified, but used as the default validationDuration for the backtests.
availableTrainingStartDate character. The start date of the available training data for scoring the holdout.
availableTrainingDuration character. The duration of the available training data for scoring the holdout.
availableTrainingRowCount integer. The number of rows in the available training data for scoring the holdout. Only available when retrieving the partitioning after setting the target.
availableTrainingEndDate character. The end date of the available training data for scoring the holdout.
primaryTrainingStartDate character. The start date of primary training data for scoring the holdout.
primaryTrainingDuration character. The duration of the primary training data for scoring the holdout.
primaryTrainingRowCount integer. The number of rows in the primary training data for scoring the holdout. Only available when retrieving the partitioning after setting the target.
primaryTrainingEndDate character. The end date of the primary training data for scoring the holdout.
gapStartDate character. The start date of the gap between training and holdout scoring data.
gapDuration character. The duration of the gap between training and holdout scoring data.
gapRowCount integer. The number of rows in the gap between training and holdout scoring data. Only available when retrieving the partitioning after setting the target.
gapEndDate character. The end date of the gap between training and holdout scoring data.
holdoutStartDate character. The start date of holdout scoring data.
holdoutDuration character. The duration of the holdout scoring data.
holdoutRowCount integer. The number of rows in the holdout scoring data. Only available when retrieving the partitioning after setting the target.
holdoutEndDate character. The end date of the holdout scoring data.
numberOfBacktests integer. The number of backtests used.
backtests data.frame. A data frame of partition backtests. Each row represents one backtest and has the following components: index, availableTrainingStartDate, availableTrainingDuration, availableTrainingRowCount, availableTrainingEndDate, primaryTrainingStartDate, primaryTrainingDuration, primaryTrainingRowCount, primaryTrainingEndDate, gapStartDate, gapDuration, gapRowCount, gapEndDate, validationStartDate, validationDuration, validationRowCount, validationEndDate, totalRowCount.
useTimeSeries logical. Whether the project is a time series project (if TRUE) or an OTV project which uses datetime partitioning (if FALSE).
defaultToKnownInAdvance logical. Whether the project defaults to treating features as known in advance. Known in advance features are time series features that are expected to be known for dates in the future when making predictions (e.g., "is this a holiday").
featureDerivationWindowStart integer. Offset into the past to define how far back relative to the forecast point the feature derivation window should start. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
featureDerivationWindowEnd integer. Offset into the past to define how far back relative to the forecast point the feature derivation window should end. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
forecastWindowStart integer. Offset into the future to define how far forward relative to the forecast point the forecast window should start. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
forecastWindowEnd integer. Offset into the future to define how far forward relative to the forecast point the forecast window should end. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
featureSettings list. A list of lists specifying settings for each feature. For each feature you would like to set feature settings for, pass the following in a list:
featureName character. The name of the feature to set feature settings.
knownInAdvance logical. Optional. Whether or not the feature is known in advance. Used for time series only. Defaults to FALSE.
doNotDerive logical. Optional. If TRUE, no time series derived features (e.g., lags) will be automatically engineered from this feature. Used for time series only. Defaults to FALSE.
treatAsExponential character. Specifies whether to treat data as having an exponential trend and apply transformations like log-transform. Uses values from TreatAsExponential.
differencingMethod character. Used to specify differencing method to apply if data is stationary. Uses values from DifferencingMethod.
windowsBasisUnit character. Indicates which unit is the basis for the feature derivation window and forecast window. Uses values from TimeUnit and the value "ROW".
periodicities list. A list of periodicities for different times, specified as a list of lists, where each list item specifies the 'timeSteps' for a particular 'timeUnit'. Will be "ROW" if windowsBasisUnit is "ROW".
totalRowCount integer. The number of rows in the project dataset. Only available when retrieving the partitioning after setting the target. Thus it will be NULL for GenerateDatetimePartition and populated for GetDatetimePartition.
validationRowCount integer. The number of rows in the validation set.
multiseriesIdColumns list. A list of the names of multiseries id columns to define series.
numberOfKnownInAdvanceFeatures integer. The number of known in advance features.
useCrossSeriesFeatures logical. Whether or not cross series features are included.
aggregationType character. The aggregation type to apply when creating cross series features. See SeriesAggregationType.
calendarId character. The ID of the calendar used for this project, if any.
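The four window offsets are easier to read once turned into dates. The sketch below uses invented values and assumes a timeUnit of days and that offsets into the past are written as negative or zero integers; neither assumption is stated in the component descriptions above.

```r
# Assumed example values; past offsets are taken to be negative or zero.
forecastPoint <- as.Date("2020-06-01")
featureDerivationWindowStart <- -28
featureDerivationWindowEnd <- 0
forecastWindowStart <- 1
forecastWindowEnd <- 7

# With a timeUnit of days, offsets translate directly into dates.
derivationWindow <- forecastPoint +
  c(featureDerivationWindowStart, featureDerivationWindowEnd)
forecastWindow <- forecastPoint + c(forecastWindowStart, forecastWindowEnd)
```

Under these assumptions, features are derived from 2020-05-04 through 2020-06-01 and forecasts cover 2020-06-02 through 2020-06-08.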
## Not run: projectId <- "59a5af20c80891534e3c2bde" partitionSpec <- CreateDatetimePartitionSpecification("date_col") GenerateDatetimePartition(projectId, partitionSpec) ## End(Not run)
Retrieve Accuracy over Time plot for a model.
GetAccuracyOverTimePlot( model, backtest = 0, source = SourceType$Validation, seriesId = NULL, forecastDistance = NULL, maxBinSize = NULL, resolution = NULL, startDate = NULL, endDate = NULL, maxWait = 600 )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
backtest |
integer or character. Optional. Retrieve plots for a specific backtest.
Use the backtest index starting from zero.
To retrieve plots for holdout, use |
source |
character. Optional. The source of the data for the backtest/holdout.
Must be one of |
seriesId |
character. Optional. The name of the series to retrieve for multiseries projects. If not provided an average plot for the first 1000 series will be retrieved. |
forecastDistance |
integer. Optional. Forecast distance to retrieve the chartdata for. If not specified, the first forecast distance for this project will be used. Only available for time series projects. |
maxBinSize |
integer. Optional. An int between 1 and 1000, which specifies the maximum number of bins for the retrieval. Default is 500. |
resolution |
character. Optional. Specifying at which resolution the data should be binned.
If not provided an optimal resolution will be used to build chart data
with number of |
startDate |
POSIXct. Optional. The start of the date range to return. If not specified, start date for requested plot will be used. |
endDate |
POSIXct. Optional. The end of the date range to return. If not specified, end date for requested plot will be used. |
maxWait |
integer. Optional. The maximum time to wait for a compute job to complete before retrieving the plots. Default is 600. If 0, the plots would be retrieved without attempting the computation. |
list with the following components:
resolution. character: The resolution that is used for binning. One of DatetimeTrendPlotsResolutions.
startDate. POSIXct: The datetime of the start of the chartdata (inclusive).
endDate. POSIXct: The datetime of the end of the chartdata (exclusive).
bins. data.frame: Each row represents a bin in the plot. Dataframe has following columns:
startDate. POSIXct: The datetime of the start of the bin (inclusive).
endDate. POSIXct: The datetime of the end of the bin (exclusive).
actual. numeric: Average actual value of the target in the bin. NA if there are no entries in the bin.
predicted. numeric: Average prediction of the model in the bin. NA if there are no entries in the bin.
frequency. integer: Indicates number of values averaged in bin.
statistics. list: Contains statistical properties for the plot.
durbinWatson. numeric: The Durbin-Watson statistic for the chart data. Value is between 0 and 4. Durbin-Watson statistic is a test statistic used to detect the presence of autocorrelation at lag 1 in the residuals (prediction errors) from a regression analysis.
calendarEvents. data.frame: Each row represents a calendar event in the plot. Dataframe has following columns:
date. POSIXct: The date of the calendar event.
seriesId. character: The series ID for the event. If this event does not specify a series ID, then this will be NA, indicating that the event applies to all series.
name. character: The name of the calendar event.
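For readers unfamiliar with the durbinWatson component, the statistic is the sum of squared successive differences of the residuals divided by the sum of squared residuals. The sketch below computes it on mock binned averages purely for illustration; whether the service computes it from binned or unbinned residuals is not stated here.

```r
# Mock bins standing in for the bins data frame described above.
bins <- data.frame(
  actual = c(10, 12, 11, 13),
  predicted = c(9, 13, 10, 14)
)

# Prediction errors per bin: 1, -1, 1, -1
errors <- bins$actual - bins$predicted

# Durbin-Watson: squared successive differences over squared errors.
durbinWatson <- sum(diff(errors)^2) / sum(errors^2)
```

The alternating errors in this mock data give a value of 3, toward the upper end of the 0 to 4 range, which indicates negative lag-1 autocorrelation; values near 2 indicate little autocorrelation.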
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) GetAccuracyOverTimePlot(model) plot <- GetAccuracyOverTimePlot(model) png("accuracy_over_time.png", width = 1200, height = 600, units = "px") par(mar = c(10, 5, 5, 5)) plot(plot$bins$startDate, plot$bins$actual, type = "l", ylab = "Target", xaxt = "n", xlab = "") lines(plot$bins$startDate, plot$bins$predicted, col = "red") axis(1, plot$bins$startDate, format(plot$bins$startDate, "%Y-%m-%d"), las = 3) title(xlab = "Date", mgp = c(7, 1, 0)) legend("topright", legend = c("Actual", "Predicted"), col = c("black", "red"), lty = 1:1) dev.off() ## End(Not run)
Retrieve Accuracy over Time preview plot for a model.
GetAccuracyOverTimePlotPreview( model, backtest = 0, source = SourceType$Validation, seriesId = NULL, forecastDistance = NULL, maxWait = 600 )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
backtest |
integer or character. Optional. Retrieve plots for a specific backtest.
Use the backtest index starting from zero.
To retrieve plots for holdout, use |
source |
character. Optional. The source of the data for the backtest/holdout.
Must be one of |
seriesId |
character. Optional. The name of the series to retrieve for multiseries projects. If not provided an average plot for the first 1000 series will be retrieved. |
forecastDistance |
integer. Optional. Forecast distance to retrieve the chartdata for. If not specified, the first forecast distance for this project will be used. Only available for time series projects. |
maxWait |
integer. Optional. The maximum time to wait for a compute job to complete before retrieving the plots. Default is 600. If 0, the plots would be retrieved without attempting the computation. |
list with the following components:
startDate. POSIXct: The datetime of the start of the chartdata (inclusive).
endDate. POSIXct: The datetime of the end of the chartdata (exclusive).
bins. data.frame: Each row represents a bin in the plot. Dataframe has following columns:
startDate. POSIXct: The datetime of the start of the bin (inclusive).
endDate. POSIXct: The datetime of the end of the bin (exclusive).
actual. numeric: Average actual value of the target in the bin. NA if there are no entries in the bin.
predicted. numeric: Average prediction of the model in the bin. NA if there are no entries in the bin.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) plot <- GetAccuracyOverTimePlotPreview(model) png("accuracy_over_time_preview.png", width = 1200, height = 600, units = "px") par(mar = c(10, 5, 5, 5)) plot(plot$bins$startDate, plot$bins$actual, type = "l", ylab = "Target", xaxt = "n", xlab = "") lines(plot$bins$startDate, plot$bins$predicted, col = "red") axis(1, plot$bins$startDate, format(plot$bins$startDate, "%Y-%m-%d"), las = 3) title(xlab = "Date", mgp = c(7, 1, 0)) legend("topright", legend = c("Actual", "Predicted"), col = c("black", "red"), lty = 1:1) dev.off() ## End(Not run)
Retrieve Accuracy over Time plots metadata for a model.
GetAccuracyOverTimePlotsMetadata(model, forecastDistance = NULL)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
forecastDistance |
integer. Optional. Forecast distance to retrieve the metadata for. If not specified, the first forecast distance for this project will be used. Only available for time series projects. |
list with the following components:
forecastDistance. integer or NULL: The forecast distance for which the metadata was retrieved. NULL for OTV projects.
resolutions. list: A list of DatetimeTrendPlotsResolutions, which represents available time resolutions for which plots can be retrieved.
backtestStatuses. data.frame: Each row represents a status for the backtest SourceType. The row index corresponds to the backtest index via the relation rowIndex <- backtestIndex + 1. Status should be one of DatetimeTrendPlotsStatuses.
backtestMetadata. data.frame: Each row contains the start and end date metadata for the backtest SourceType. The row index corresponds to the backtest index via the relation rowIndex <- backtestIndex + 1. Each cell contains a POSIXct timestamp for start date (inclusive) and end date (exclusive) if the corresponding source type for the backtest is computed, and NA otherwise.
holdoutStatuses. list: Contains statuses for holdout.
training. character: Status, one of DatetimeTrendPlotsStatuses
validation. character: Status, one of DatetimeTrendPlotsStatuses
holdoutMetadata. list. Contains metadata for holdout.
training. list. Contains start and end date for holdout training.
validation. list. Contains start and end date for holdout validation.
startDate. POSIXct or NA: The datetime of the start of the holdout training/validation (inclusive). NA if the data is not computed.
endDate. POSIXct or NA: The datetime of the end of the holdout training/validation (exclusive). NA if the data is not computed.
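Because backtest indexes start at zero while R rows start at one, looking up a backtest in backtestStatuses needs the offset given above. The sketch below uses a mock data frame; its column names and status strings are assumptions, not values confirmed by this page.

```r
# Mock backtestStatuses; the column names and status strings are assumptions.
backtestStatuses <- data.frame(
  training = c("completed", "notCompleted"),
  validation = c("completed", "completed"),
  stringsAsFactors = FALSE
)

backtestIndex <- 1             # zero-based, as used for the backtest argument
rowIndex <- backtestIndex + 1  # one-based, as used for R data frame rows
statusForBacktest <- backtestStatuses[rowIndex, ]
```

With this mock data, statusForBacktest describes the second backtest, whose training status is "notCompleted".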
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
GetAccuracyOverTimePlotsMetadata(model)
## End(Not run)
Explanations contain predictions along with SHAP explanations for the most anomalous records in the specified date range or for a defined number of points. Two of the three parameters startDate, endDate, and pointsCount must be specified.
GetAnomalyAssessmentExplanations( projectId, recordId, startDate = NULL, endDate = NULL, pointsCount = NULL )
projectId |
character. The ID of the project. |
recordId |
character. The ID of the anomaly assessment record. |
startDate |
POSIXct. Optional. The start of the date range to get explanations in. |
endDate |
POSIXct. Optional. The end of the date range to get explanations in. |
pointsCount |
integer. Optional. The number of rows to return. |
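The requirement that two of startDate, endDate, and pointsCount be supplied can be checked before making a request. The sketch below assumes the rule means exactly two of the three; the helper hasValidRangeSelection is hypothetical and not part of the package.

```r
# Hypothetical pre-flight check, assuming "two out of three" means exactly two.
hasValidRangeSelection <- function(startDate = NULL, endDate = NULL,
                                   pointsCount = NULL) {
  # list() keeps NULL elements, so every argument is inspected.
  supplied <- !vapply(list(startDate, endDate, pointsCount), is.null, logical(1))
  sum(supplied) == 2
}

start <- as.POSIXct("2021-01-01", tz = "UTC")
end   <- as.POSIXct("2021-02-01", tz = "UTC")

twoGiven   <- hasValidRangeSelection(startDate = start, pointsCount = 100L)
oneGiven   <- hasValidRangeSelection(startDate = start)
threeGiven <- hasValidRangeSelection(start, end, 100L)
```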
The anomaly assessment explanations:
recordId. character. The ID of the record.
projectId. character. The project ID of the record.
modelId. character. The model ID of the record.
backtest. character. The backtest of the record.
source. character. The source of the record.
seriesId. character. The series ID of the record.
startDate. POSIXct. First timestamp in the response. Will be NULL if there is no data in the specified range.
endDate. POSIXct. Last timestamp in the response. Will be NULL if there is no data in the specified range.
shapBaseValue. numeric. Shap base value.
count. integer. The number of points in the data.
data. list. A list of DataPoint objects in the specified date range containing:
shapExplanation. NULL or an array of up to 10 ShapleyFeatureContribution objects. Only rows with the highest anomaly scores have Shapley explanations calculated.
timestamp POSIXct. Timestamp for the row.
prediction numeric. The output of the model for this row.
Each ShapleyFeatureContribution contains:
featureValue. character. The feature value for this row. First 50 characters are returned.
strength numeric. The shap value for this feature and row.
feature character. The feature name.
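As an illustration of how the nested return value might be consumed, the sketch below finds the feature with the largest absolute SHAP strength for each row. It operates on a mocked list built from the field names documented above; the feature names, values, and the helper topFeature are invented for the example.

```r
# Mocked response; only the fields used below are included.
explanations <- list(
  count = 2L,
  data = list(
    list(
      timestamp = as.POSIXct("2021-01-01", tz = "UTC"),
      prediction = 0.12,
      shapExplanation = NULL  # low anomaly score: no explanation computed
    ),
    list(
      timestamp = as.POSIXct("2021-01-02", tz = "UTC"),
      prediction = 0.91,
      shapExplanation = list(
        list(feature = "temperature", featureValue = "41.5", strength = 0.40),
        list(feature = "pressure", featureValue = "3.2", strength = -0.65)
      )
    )
  )
)

# Hypothetical helper: name of the strongest contribution, or NA if none.
topFeature <- function(point) {
  contributions <- point$shapExplanation
  if (is.null(contributions)) return(NA_character_)
  strengths <- vapply(contributions, function(x) x$strength, numeric(1))
  contributions[[which.max(abs(strengths))]]$feature
}

topFeatures <- vapply(explanations$data, topFeature, character(1))
```

Rows without Shapley explanations yield NA, so they can be filtered out afterwards with is.na.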
Other Anomaly Assessment functions: DeleteAnomalyAssessmentRecord(), GetAnomalyAssessmentPredictionsPreview(), InitializeAnomalyAssessment(), ListAnomalyAssessmentRecords()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
recordId <- "59a5af20c80891534e3c2bdb"
explanations <- GetAnomalyAssessmentExplanations(projectId, recordId, pointsCount=100, startDate=as.Date("2021-01-01"))
## End(Not run)
Aggregated predictions over time for the corresponding anomaly assessment record. Intended to find the bins with highest anomaly scores.
GetAnomalyAssessmentPredictionsPreview(projectId, recordId)
projectId |
character. The ID of the project. |
recordId |
character. The ID of the anomaly assessment record. |
The anomaly assessment predictions preview:
recordId. character. The ID of the record.
projectId. character. The project ID of the record.
modelId. character. The model ID of the record.
backtest. character. The backtest of the record.
source. character. The source of the record.
seriesId. character. The series ID of the record.
startDate. POSIXct. Timestamp of the first prediction in the subset.
endDate. POSIXct. Timestamp of the last prediction in the subset.
previewBins. list. A list of PreviewBin objects in the specified date range. The aggregated predictions for the subset. Bins boundaries may differ from actual start/end dates because this is an aggregation. Each PreviewBin contains:
startDate. POSIXct. Datetime of the start of the bin.
endDate. POSIXct. Datetime of the end of the bin.
avgPredicted numeric. The average prediction of the model in the bin. NA if there are no entries in the bin.
maxPredicted numeric. The maximum prediction of the model in the bin. NA if there are no entries in the bin.
frequency integer. The number of rows in the bin.
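Since the preview is intended for finding the bins with the highest anomaly scores, a short sketch of that lookup may help. It uses a mocked previewBins list with invented values; only the field names come from the description above.

```r
# Mocked preview; dates and scores are invented for illustration.
preview <- list(
  previewBins = list(
    list(startDate = as.POSIXct("2021-01-01", tz = "UTC"),
         endDate   = as.POSIXct("2021-01-02", tz = "UTC"),
         avgPredicted = 0.10, maxPredicted = 0.30, frequency = 24L),
    list(startDate = as.POSIXct("2021-01-02", tz = "UTC"),
         endDate   = as.POSIXct("2021-01-03", tz = "UTC"),
         avgPredicted = NA_real_, maxPredicted = NA_real_, frequency = 0L),
    list(startDate = as.POSIXct("2021-01-03", tz = "UTC"),
         endDate   = as.POSIXct("2021-01-04", tz = "UTC"),
         avgPredicted = 0.55, maxPredicted = 0.97, frequency = 24L)
  )
)

# Collect the per-bin maxima; empty bins carry NA, which which.max() skips.
maxPredicted <- vapply(preview$previewBins,
                       function(bin) bin$maxPredicted, numeric(1))
mostAnomalousIndex <- which.max(maxPredicted)
mostAnomalousBin <- preview$previewBins[[mostAnomalousIndex]]
```

The startDate and endDate of the selected bin could then be passed to GetAnomalyAssessmentExplanations to inspect that range in detail.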
Other Anomaly Assessment functions: DeleteAnomalyAssessmentRecord(), GetAnomalyAssessmentExplanations(), InitializeAnomalyAssessment(), ListAnomalyAssessmentRecords()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
recordId <- "59a5af20c80891534e3c2bdb"
explanations <- GetAnomalyAssessmentPredictionsPreview(projectId, recordId)
## End(Not run)
This function returns a DataRobot S3 object of class dataRobotModel for the model defined by project and modelId.
GetBlenderModel(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the blender model of interest. |
An S3 object of class ‘dataRobotBlenderModel’ summarizing all available information about the model. It is a list with the following components:
modelId. character. The unique alphanumeric blender model identifier.
modelNumber. integer. The assigned model number.
modelType. character. The type of model, e.g. 'AVG Blender'.
modelIds. character. List of unique identifiers for the blended models.
blenderMethod. character. The blender method used to create this model.
featurelistId. character. Unique alphanumeric identifier for the featurelist on which the model is based.
processes. character. Components describing preprocessing; may include modelType.
featurelistName. character. Name of the featurelist on which the model is based.
blueprintId. character. The unique blueprint identifier on which the model is based.
samplePct. numeric. The percentage of the dataset used in training the model. For projects that use datetime partitioning, this will be NA. See trainingRowCount instead.
trainingRowCount. integer. Number of rows of the dataset used in training the model. For projects that use datetime partitioning, if specified, this defines the number of rows used to train the model and evaluate backtest scores; if unspecified, either trainingDuration or trainingStartDate and trainingEndDate was used instead.
isFrozen. logical. Was the model created with frozen tuning parameters?
metrics. list. The metrics associated with this model. Each element is a list with elements for each possible evaluation type (holdout, validation, and crossValidation).
modelCategory. character. The category of model (e.g., blend, model, prime).
projectId. character. Unique alphanumeric identifier for the project.
projectName. character. Name of the project.
projectTarget. character. The target variable predicted by all models in the project.
projectMetric. character. The fitting metric optimized by all project models.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetBlenderModel(projectId, modelId)
## End(Not run)
The function RequestBlender initiates the creation of new blender models in a DataRobot project.
GetBlenderModelFromJobId(project, modelJobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelJobId |
integer. The integer returned by RequestBlender. |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
RequestBlender submits a request to the DataRobot modeling engine and returns an integer-valued modelJobId. The GetBlenderModelFromJobId function polls the modeling engine until the model has been built or a specified time limit is exceeded, returning an S3 object of class 'dataRobotBlenderModel' when the model is available.
Motivation for this function is the fact that some models - e.g., very complex machine learning models fit to large datasets - may take a long time to complete. Splitting the model creation request from model retrieval in these cases allows the user to perform other interactive R session tasks between the time the model creation/update request is made and the time the final model is available.
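The request-then-poll pattern described above can be sketched generically. This is not the package's implementation: pollUntilReady, checkFn, and interval are hypothetical names, and the job is mocked so the example runs without a DataRobot connection.

```r
# Hypothetical polling loop: call checkFn() until it returns a non-NULL
# result or maxWait seconds have elapsed.
pollUntilReady <- function(checkFn, maxWait = 600, interval = 1) {
  deadline <- Sys.time() + maxWait
  repeat {
    result <- checkFn()
    if (!is.null(result)) return(result)
    if (Sys.time() >= deadline) {
      stop("Timed out waiting for the job to complete")
    }
    Sys.sleep(interval)
  }
}

# Mock job that becomes available on the third check.
attempts <- 0
mockCheck <- function() {
  attempts <<- attempts + 1
  if (attempts >= 3) list(modelId = "mock-model") else NULL
}

model <- pollUntilReady(mockCheck, maxWait = 5, interval = 0)
```

Because the request and the retrieval are separate calls, other work can be done in the R session between submitting the job and starting to poll for its result.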
An S3 object of class ‘dataRobotBlenderModel’ summarizing all available information about the model. It is a list with the following components:
modelId. character. The unique alphanumeric blender model identifier.
modelNumber. integer. The assigned model number.
modelType. character. The type of model, e.g. 'AVG Blender'.
modelIds. character. List of unique identifiers for the blended models.
blenderMethod. character. The blender method used to create this model.
featurelistId. character. Unique alphanumeric identifier for the featurelist on which the model is based.
processes. character. Components describing preprocessing; may include modelType.
featurelistName. character. Name of the featurelist on which the model is based.
blueprintId. character. The unique blueprint identifier on which the model is based.
samplePct. numeric. The percentage of the dataset used in training the model. For projects that use datetime partitioning, this will be NA. See trainingRowCount instead.
trainingRowCount. integer. Number of rows of the dataset used in training the model. For projects that use datetime partitioning, if specified, this defines the number of rows used to train the model and evaluate backtest scores; if unspecified, either trainingDuration or trainingStartDate and trainingEndDate was used instead.
isFrozen. logical. Was the model created with frozen tuning parameters?
metrics. list. The metrics associated with this model. Each element is a list with elements for each possible evaluation type (holdout, validation, and crossValidation).
modelCategory. character. The category of model (e.g., blend, model, prime).
projectId. character. Unique alphanumeric identifier for the project.
projectName. character. Name of the project.
projectTarget. character. The target variable predicted by all models in the project.
projectMetric. character. The fitting metric optimized by all project models.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelsToBlend <- c("5996f820af07fc605e81ead4", "59a5ce3301e9f0296721c64c")
blendJobId <- RequestBlender(projectId, modelsToBlend, "GLM")
GetBlenderModelFromJobId(projectId, blendJobId)
## End(Not run)
Retrieve a blueprint
GetBlueprint(project, blueprintId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
blueprintId |
character. Id of blueprint to retrieve. |
List with the following five components:
Character string giving the unique DataRobot project identifier
List of character strings, identifying any preprocessing steps included in the blueprint
Character string giving the unique DataRobot blueprint identifier
Character string, specifying the type of model the blueprint builds
Character string. Describes the category of the blueprint and the kind of model it produces.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
blueprintId <- model$blueprintId
GetBlueprint(projectId, blueprintId)
## End(Not run)
A blueprint chart can be used to understand the data flow in a blueprint.
GetBlueprintChart(project, blueprintId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
blueprintId |
character. Id of blueprint to retrieve. |
List with the following two components:
nodes. list. Each element contains information about one node of the blueprint: id and label.
edges. Two-column matrix identifying the connections between blueprint nodes.
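To show how nodes and edges fit together, the sketch below turns a mocked chart into readable "from -> to" strings. The node labels, the use of character ids, and the assumption that the first edge column is the source node are illustrative only.

```r
# Mocked blueprint chart; ids and labels are invented for illustration.
chart <- list(
  nodes = list(
    list(id = "0", label = "Data"),
    list(id = "1", label = "Missing Values Imputed"),
    list(id = "2", label = "Prediction")
  ),
  # Assumption: column 1 is the source node id, column 2 the target node id.
  edges = matrix(c("0", "1",
                   "1", "2"), ncol = 2, byrow = TRUE)
)

# Build a lookup table from node id to label.
labels <- vapply(chart$nodes, function(node) node$label, character(1))
names(labels) <- vapply(chart$nodes, function(node) node$id, character(1))

# Describe each edge using the labels of the nodes it connects.
flow <- paste(unname(labels[chart$edges[, 1]]), "->",
              unname(labels[chart$edges[, 2]]))
```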
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
blueprintId <- model$blueprintId
GetBlueprintChart(projectId, blueprintId)
## End(Not run)
Get documentation for tasks used in the blueprint
GetBlueprintDocumentation(project, blueprintId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
blueprintId |
character. Id of blueprint to retrieve. |
list with the following components:
Character string. Name of the task described in the document.
Character string. Description of the task.
Character string. Title of the document.
List of parameters that the task can receive, in human-readable format, with the following components: name, type, description.
List of external links used in the document, with the following components: name, url.
List of references used in the document, with the following components: name, url.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
blueprintId <- model$blueprintId
GetBlueprintDocumentation(projectId, blueprintId)
## End(Not run)
Retrieve a calendar
GetCalendar(calendarId)
calendarId |
character. The ID of the calendar to retrieve. |
An S3 object of class "dataRobotCalendar"
## Not run:
calendarId <- "5da75da31fb4a45b8a815a53"
GetCalendar(calendarId)
## End(Not run)
Retrieve the calendar for a particular project.
GetCalendarFromProject(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
An S3 object of class "dataRobotCalendar"
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
GetCalendarFromProject(projectId)
## End(Not run)
A custom compliance doc template can be retrieved using templateId. Default compliance doc templates that are built-in to DataRobot can be retrieved by using the type parameter. A type of NULL or "normal" will retrieve the default template. A type of "timeSeries" can be used to retrieve the default time series template.
GetComplianceDocTemplate(templateId = NULL, type = NULL)
templateId |
character. Optional. The ID of the template to use in generating custom model documentation. |
type |
character. Optional. The type of compliance doc to get. Can be "normal" to retrieve the default template or "timeSeries" to get the default time series template. |
An S3 object of class 'dataRobotComplianceDocTemplate' that contains:
name character. The name of the compliance doc template.
creatorUsername character. The name of the user who created the compliance doc template.
orgId character. The ID of the organization of the creator user.
creatorId character. The ID of the creator user.
sections list. The list of sections that define the template.
id character. The ID of the template.
## Not run:
GetComplianceDocTemplate() # get the default template
GetComplianceDocTemplate(type = "normal") # get the default template
GetComplianceDocTemplate(type = "timeSeries") # get the default time series template
templateId <- "5cf85080d9436e5c310c796d"
GetComplianceDocTemplate(templateId) # Get a custom template for a specific ID.
## End(Not run)
Retrieve a model's confusion chart for a specified source.
GetConfusionChart( model, source = DataPartition$VALIDATION, fallbackToParentInsights = FALSE )
model |
dataRobotModel. A DataRobot model object like that returned by |
source |
character. The data partition for which data would be returned. Default is DataPartition$VALIDATION. |
fallbackToParentInsights |
logical. If TRUE, this will return the confusion chart data for the model's parent if the confusion chart is not available for the model and the model has a parent model. |
data.frame with the following components:
source character. The name of the source of the confusion chart. Will be a member of DataPartition.
data list. The data for the confusion chart, containing:
classes character. A vector containing the names of all the classes.
confusionMatrix matrix. A matrix showing the actual versus the predicted class values.
classMetrics list. A list detailing further metrics for each class:
wasActualPercentages data.frame. A dataframe detailing the actual percentage distribution of the classes.
wasPredictedPercentages data.frame. A dataframe detailing the predicted distribution of the classes.
f1 numeric. The F1 score for the predictions of the class.
recall numeric. The recall score for the predictions of the class.
precision numeric. The precision score for the predictions of the class.
actualCount integer. The actual count of values for the class.
predictedCount integer. The predicted count of values for the class.
className character. A vector containing the name of the class.
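The per-class metrics listed above relate to the confusion matrix in the usual way. The sketch below recomputes recall, precision, and F1 from a mocked matrix; it assumes rows hold actual classes and columns hold predicted classes, which the description above does not state, and the class names and counts are invented.

```r
# Mocked confusion matrix. Assumption: rows = actual, columns = predicted.
classes <- c("cat", "dog")
confusionMatrix <- matrix(
  c(8, 2,
    1, 9),
  nrow = 2, byrow = TRUE,
  dimnames = list(actual = classes, predicted = classes)
)

truePositives <- diag(confusionMatrix)

# Recall: share of each actual class that was predicted correctly.
recall <- truePositives / rowSums(confusionMatrix)

# Precision: share of each predicted class that was actually that class.
precision <- truePositives / colSums(confusionMatrix)

# F1: harmonic mean of precision and recall.
f1 <- 2 * precision * recall / (precision + recall)
```

If the returned matrix turns out to be oriented the other way, swapping rowSums and colSums gives the corresponding values.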
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
GetConfusionChart(model, source = DataPartition$VALIDATION)
## End(Not run)
Get cross validation scores
GetCrossValidationScores(model, partition = NULL, metric = NULL)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
partition |
numeric. Optional. The ID of the partition to filter results by. |
metric |
character. Optional. The name of the metric to filter results by. |
A list of lists with cross validation score data. Each list contains a series of lists for each model metric. Each model metric list contains the metric data for each fold.
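A nested result of this kind is often summarized by averaging each metric across folds. The sketch below does so on a mocked list; the element name cvScores, the metric names, and the fold keys are assumptions about the shape of the result and may not match what the function actually returns.

```r
# Mocked result. The nesting (metric -> fold -> score) and all names
# are illustrative assumptions.
scores <- list(
  cvScores = list(
    AUC     = list(`0.0` = 0.71, `1.0` = 0.74, `2.0` = 0.68),
    LogLoss = list(`0.0` = 0.52, `1.0` = 0.49, `2.0` = 0.55)
  )
)

# Average each metric over its folds.
meanByMetric <- vapply(
  scores$cvScores,
  function(folds) mean(unlist(folds)),
  numeric(1)
)
```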
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
GetCrossValidationScores(model)
## End(Not run)
Returns information about a particular data source.
GetDataSource(dataSourceId)
dataSourceId |
character. The id of the data source |
A list containing information on the particular data source:
className character. The Java class name of the driver.
baseNames character. A vector of the file name(s) of the jar files.
canonicalName character. The user-friendly name of the driver.
id character. The dataSourceId of the driver.
creator character. The userId of the user who created the driver.
## Not run:
dataSourceId <- "57a7c978c808916f4a630f89"
GetDataSource(dataSourceId)
## End(Not run)
Returns information about a particular data store.
GetDataStore(dataStoreId)
dataStoreId |
character. The id of the data store. |
A list containing information on the particular data store:
id character. The dataStoreId of the data store.
canonicalName character. The user-friendly name of the data store.
type character. The type of data store.
updated datetime. A timestamp for the last time the data store was updated.
creator character. The userId of the user who created the data store.
params list. A list specifying the data store parameters.
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
GetDataStore(dataStoreId)
## End(Not run)
Get the schemas associated with a data store.
GetDataStoreSchemas(dataStoreId, username, password)
dataStoreId |
character. The ID of the data store whose schemas are retrieved. |
username |
character. The username to use for authentication to the database. |
password |
character. The password to use for authentication to the database. The password is encrypted at server side and never saved or stored. |
A list with the name of the catalog and the name of the schemas.
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
GetDataStoreSchemas(dataStoreId, username = "myUser", password = "mySecurePass129")
## End(Not run)
Get all tables associated with a data store.
GetDataStoreTables(dataStoreId, username, password, schema = NULL)
dataStoreId |
character. The ID of the data store whose tables are retrieved. |
username |
character. The username to use for authentication to the database. |
password |
character. The password to use for authentication to the database. The password is encrypted at server side and never saved or stored. |
schema |
character. The name of the schema to reference. Optional. |
A list with the name of the catalog and the name of the tables.
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
GetDataStoreTables(dataStoreId, username = "myUser", password = "mySecurePass129")
## End(Not run)
This function returns a DataRobot S3 object of class dataRobotDatetimeModel for the model defined by project and modelId.
GetDatetimeModel(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
If the project does not use datetime partitioning an error will occur.
An S3 object of class 'dataRobotDatetimeModel', which is a list with the following components:
featurelistId character. Unique alphanumeric identifier for the featurelist on which the model is based.
processes character. Vector with components describing preprocessing; may include 'modelType'.
featurelistName character. The name of the featurelist on which the model is based.
projectId character. The unique alphanumeric identifier for the project.
samplePct numeric. Percentage of the dataset used to form the training dataset for model fitting.
isFrozen logical. Is model created with frozen tuning parameters?
modelType character. A description of the model.
metrics list. List with one element for each valid metric associated with the model. Each element is a list with elements for each possible evaluation type (holdout, validation, and crossValidation).
modelCategory character. The model category (e.g., blend, model).
blueprintId character. The unique DataRobot blueprint identifier on which the model is based.
modelId character. The unique alphanumeric model identifier.
modelNumber. integer. The assigned model number.
projectName character. Optional description of project defined by projectId.
projectTarget character. The target variable predicted by all models in the project.
projectMetric character. The fitting metric optimized by all project models.
trainingRowCount integer. The number of rows of the project dataset used in training the model. In a datetime partitioned project, if specified, defines the number of rows used to train the model and evaluate backtest scores; if unspecified, either trainingDuration or trainingStartDate and trainingEndDate was used to determine that instead.
trainingDuration character. Only present for models in datetime partitioned projects. If specified, a duration string specifying the duration spanned by the data used to train the model and evaluate backtest scores.
trainingStartDate character. Only present for frozen models in datetime partitioned projects. If specified, the start date of the data used to train the model.
trainingEndDate character. Only present for frozen models in datetime partitioned projects. If specified, the end date of the data used to train the model.
backtests list. What data was used to fit each backtest, the score for the project metric, and why the backtest score is unavailable if it is not provided.
dataSelectionMethod character. Which of trainingRowCount, trainingDuration, or trainingStartDate and trainingEndDate were used to determine the data used to fit the model. One of "rowCount", "duration", or "selectedDateRange".
trainingInfo list. Which data was used to train on when scoring the holdout and making predictions. trainingInfo will have the following keys: 'holdoutTrainingStartDate', 'holdoutTrainingDuration', 'holdoutTrainingRowCount', 'holdoutTrainingEndDate', 'predictionTrainingStartDate', 'predictionTrainingDuration', 'predictionTrainingRowCount', 'predictionTrainingEndDate'. Start and end dates will be datetime string, durations will be duration strings, and rows will be integers.
holdoutScore numeric. The score against the holdout, if available and the holdout is unlocked, according to the project metric.
holdoutStatus character. The status of the holdout score, e.g. "COMPLETED", "HOLDOUT_BOUNDARIES_EXCEEDED".
effectiveFeatureDerivationWindowStart integer. Only available for time series projects. How many timeUnits into the past relative to the forecast point the user needs to provide history for at prediction time. This can differ from the 'featureDerivationWindowStart' set on the project due to the differencing method and period selected, or if the model is a time series native model such as ARIMA. Will be a negative integer in time series projects and 'NULL' otherwise.
effectiveFeatureDerivationWindowEnd integer. Only available for time series projects. How many timeUnits into the past relative to the forecast point the feature derivation window should end. Will be a non-positive integer in time series projects and 'NULL' otherwise.
forecastWindowStart integer. Only available for time series projects. How many timeUnits into the future relative to the forecast point the forecast window should start. Note that this field will be the same as what is shown in the project settings. Will be a non-negative integer in time series projects and 'NULL' otherwise.
forecastWindowEnd integer. Only available for time series projects. How many timeUnits into the future relative to the forecast point the forecast window should end. Note that this field will be the same as what is shown in the project settings. Will be a non-negative integer in time series projects and 'NULL' otherwise.
windowsBasisUnit character. Only available for time series projects. Indicates which unit is the basis for the feature derivation window and the forecast window. Note that this field will be the same as what is shown in the project settings. In time series projects, will be either the detected time unit or "ROW", and 'NULL' otherwise.
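Because dataSelectionMethod indicates which of the training fields is meaningful, a small dispatch on it can make the model's training data easier to report. The sketch below uses mocked model lists; the helper describeTrainingSelection and the example values (including the duration string "P1Y") are illustrative assumptions.

```r
# Hypothetical helper: summarize the training data based on
# dataSelectionMethod, using the three values documented above.
describeTrainingSelection <- function(model) {
  switch(
    model$dataSelectionMethod,
    rowCount = sprintf("%d rows", model$trainingRowCount),
    duration = sprintf("duration %s", model$trainingDuration),
    selectedDateRange = sprintf("%s to %s",
                                model$trainingStartDate,
                                model$trainingEndDate),
    stop("Unknown dataSelectionMethod: ", model$dataSelectionMethod)
  )
}

# Mocked models; field values are invented for illustration.
byDuration <- list(dataSelectionMethod = "duration", trainingDuration = "P1Y")
byRows     <- list(dataSelectionMethod = "rowCount", trainingRowCount = 500L)

durationSummary <- describeTrainingSelection(byDuration)
rowSummary      <- describeTrainingSelection(byRows)
```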
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetDatetimeModel(projectId, modelId)
## End(Not run)
The functions RequestNewDatetimeModel and RequestFrozenDatetimeModel initiate the creation of new models in a DataRobot project. Both functions submit requests to the DataRobot modeling engine and return an integer-valued modelJobId. The GetDatetimeModelFromJobId function polls the modeling engine until the model has been built or a specified time limit is exceeded, returning an S3 object of class 'dataRobotDatetimeModel' when the model is available.
GetDatetimeModelFromJobId(project, modelJobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelJobId |
integer. The integer returned by either RequestNewDatetimeModel or RequestFrozenDatetimeModel. |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
Motivation for this function is the fact that some models - e.g., very complex machine learning models fit to large datasets - may take a long time to complete. Splitting the model creation request from model retrieval in these cases allows the user to perform other interactive R session tasks between the time the model creation/update request is made and the time the final model is available.
An S3 object of class 'dataRobotDatetimeModel' summarizing all available information about the model. See GetDatetimeModel for details.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- ListModelJobs(projectId)
job <- initialJobs[[1]]
modelJobId <- job$modelJobId
GetDatetimeModelFromJobId(projectId, modelJobId)
## End(Not run)
Only available if the project has already set the target as a datetime project.
GetDatetimePartition(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
list describing datetime partition with following components
cvMethod. The type of validation scheme used for the project.
projectId character. The id of the project this partitioning applies to.
datetimePartitionColumn character. The name of the column whose values as dates are used to assign a row to a particular partition.
dateFormat character. The format by which the partition column was interpreted (compatible with strftime [https://docs.python.org/2/library/time.html#time.strftime]).
autopilotDataSelectionMethod character. Whether models created by the autopilot use "rowCount" or "duration" as their dataSelectionMethod.
validationDuration character. The validation duration specified when initializing the partitioning - not directly significant if the backtests have been modified, but used as the default validationDuration for the backtests.
availableTrainingStartDate character. The start date of the available training data for scoring the holdout.
availableTrainingDuration character. The duration of the available training data for scoring the holdout.
availableTrainingRowCount integer. The number of rows in the available training data for scoring the holdout. Only available when retrieving the partitioning after setting the target.
availableTrainingEndDate character. The end date of the available training data for scoring the holdout.
primaryTrainingStartDate character. The start date of primary training data for scoring the holdout.
primaryTrainingDuration character. The duration of the primary training data for scoring the holdout.
primaryTrainingRowCount integer. The number of rows in the primary training data for scoring the holdout. Only available when retrieving the partitioning after setting the target.
primaryTrainingEndDate character. The end date of the primary training data for scoring the holdout.
gapStartDate character. The start date of the gap between training and holdout scoring data.
gapDuration character. The duration of the gap between training and holdout scoring data.
gapRowCount integer. The number of rows in the gap between training and holdout scoring data. Only available when retrieving the partitioning after setting the target.
gapEndDate character. The end date of the gap between training and holdout scoring data.
holdoutStartDate character. The start date of holdout scoring data.
holdoutDuration character. The duration of the holdout scoring data.
holdoutRowCount integer. The number of rows in the holdout scoring data. Only available when retrieving the partitioning after setting the target.
holdoutEndDate character. The end date of the holdout scoring data.
numberOfBacktests integer. The number of backtests used.
backtests data.frame. A data frame of partition backtests. Each row represents one backtest and has the following components: index, availableTrainingStartDate, availableTrainingDuration, availableTrainingRowCount, availableTrainingEndDate, primaryTrainingStartDate, primaryTrainingDuration, primaryTrainingRowCount, primaryTrainingEndDate, gapStartDate, gapDuration, gapRowCount, gapEndDate, validationStartDate, validationDuration, validationRowCount, validationEndDate, totalRowCount.
useTimeSeries logical. Whether the project is a time series project (if TRUE) or an OTV project which uses datetime partitioning (if FALSE).
defaultToKnownInAdvance logical. Whether the project defaults to treating features as known in advance. Known in advance features are time series features that are expected to be known for dates in the future when making predictions (e.g., "is this a holiday").
featureDerivationWindowStart integer. Offset into the past to define how far back relative to the forecast point the feature derivation window should start. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
featureDerivationWindowEnd integer. Offset into the past to define how far back relative to the forecast point the feature derivation window should end. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
forecastWindowStart integer. Offset into the future to define how far forward relative to the forecast point the forecast window should start. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
forecastWindowEnd integer. Offset into the future to define how far forward relative to the forecast point the forecast window should end. Only used for time series projects. Expressed in terms of the timeUnit of the datetimePartitionColumn.
featureSettings list. A list of lists specifying settings for each feature. For each feature you would like to set feature settings for, pass the following in a list:
featureName character. The name of the feature to set feature settings.
knownInAdvance logical. Optional. Whether or not the feature is known in advance. Used for time series only. Defaults to FALSE.
doNotDerive logical. Optional. If TRUE, no time series derived features (e.g., lags) will be automatically engineered from this feature. Used for time series only. Defaults to FALSE.
treatAsExponential character. Specifies whether to treat the data as having an exponential trend and apply transformations such as a log transform. Uses values from TreatAsExponential.
differencingMethod character. Used to specify differencing method to apply if data is stationary. Uses values from DifferencingMethod.
windowsBasisUnit character. Indicates which unit is the basis for the feature derivation window and forecast window. Uses values from TimeUnit and the value "ROW".
periodicities list. A list of periodicities for different times, specified as a list of lists, where each list item specifies the 'timeSteps' for a particular 'timeUnit'. Will be "ROW" if windowsBasisUnit is "ROW".
totalRowCount integer. The number of rows in the project dataset. Only available when retrieving the partitioning after setting the target. Thus it will be NULL for GenerateDatetimePartition and populated for GetDatetimePartition.
validationRowCount integer. The number of rows in the validation set.
multiseriesIdColumns list. A list of the names of multiseries id columns to define series.
numberOfKnownInAdvanceFeatures integer. The number of known in advance features.
useCrossSeriesFeatures logical. Whether or not cross series features are included.
aggregationType character. The aggregation type to apply when creating cross series features. See SeriesAggregationType.
calendarId character. The ID of the calendar used for this project, if any.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
GetDatetimePartition(projectId)
## End(Not run)
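To make the window offsets concrete, here is a small base R sketch that turns them into calendar dates. It rests on assumptions that are not stated above: the time unit is taken to be days, offsets into the past are written as negative integers, and the `partition` list is a hand-built stand-in rather than a real return value.

```r
# Hypothetical partition settings; the values and the sign convention
# (negative = into the past) are assumptions for illustration only.
partition <- list(
  featureDerivationWindowStart = -14,
  featureDerivationWindowEnd = 0,
  forecastWindowStart = 1,
  forecastWindowEnd = 7
)

forecastPoint <- as.Date("2021-01-10")

# Offsets are applied relative to the forecast point, assuming a DAY time unit.
derivationWindow <- forecastPoint + c(
  partition$featureDerivationWindowStart,
  partition$featureDerivationWindowEnd
)
forecastWindow <- forecastPoint + c(
  partition$forecastWindowStart,
  partition$forecastWindowEnd
)
```

Under these assumptions, features are derived from the two weeks up to the forecast point, and predictions cover the following week.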
Get information on a particular deployment.
GetDeployment(deploymentId)
deploymentId |
character. The ID of the deployment. |
A DataRobotDeployment object containing:
id character. The ID of the deployment.
label character. The label of the deployment.
description character. The description of the deployment.
defaultPredictionServer list. Information on the default prediction server connected with the deployment. See ListPredictionServers for details.
model dataRobotModel. The model associated with the deployment. See GetModel for details.
capabilities list. Information on the capabilities of the deployment.
predictionUsage list. Information on the prediction usage of the deployment.
permissions list. User's permissions on the deployment.
serviceHealth list. Information on the service health of the deployment.
modelHealth list. Information on the model health of the deployment.
accuracyHealth list. Information on the accuracy health of the deployment.
## Not run:
deploymentId <- "5e319d2e422fbd6b58a5edad"
GetDeployment(deploymentId)
## End(Not run)
Retrieve accuracy statistics for a deployment.
GetDeploymentAccuracy( deploymentId, modelId = NULL, start = NULL, end = NULL, segmentAttribute = NULL, segmentValue = NULL, targetClasses = NULL )
deploymentId |
character. The ID of the deployment. |
modelId |
character. Optional. The ID of the model to query. If provided, only data for this specific model will be retrieved; otherwise, data for the deployment's default model will be retrieved. |
start |
POSIXct. Optional. The start time of the reporting period for monitoring data. Defaults to seven days prior to the end of the period. Sub-hour resolution is not permitted, and the timezone must be UTC. |
end |
POSIXct. Optional. The end time of the reporting period for monitoring data. Defaults to the next top of the hour. Sub-hour resolution is not permitted, and the timezone must be UTC. |
segmentAttribute |
character. Optional. The name of an attribute used for segment analysis.
See |
segmentValue |
character. Optional. The value of |
targetClasses |
character. Optional. List of target classes to filter out of the response. Added in DataRobot 2.23. |
An object representing accuracy metrics for the deployment, containing:
modelId character. The ID of the deployment model for which monitoring data was retrieved.
period list. The duration of the reporting period, containing:
start POSIXct. Start of the reporting period.
end POSIXct. End of the reporting period.
metrics data.frame. Accuracy metrics for the deployment, where each row is a separate metric and contains the columns:
metric. character. Name of the metric. See DeploymentAccuracyMetric for valid values.
baselineValue. numeric. May be NA if accuracy data is not available.
value. numeric. May be NA if accuracy data is not available.
percentChange. numeric. The percent change of value over baseline. May be NA if accuracy data is not available.
segmentAttribute character. Optional. The name of the segment on which segment analysis was performed. Added in DataRobot 2.21.
segmentValue character. Optional. The value of the segmentAttribute. Added in DataRobot 2.21.
Other deployment accuracy functions:
GetDeploymentAccuracyOverTime(), GetDeploymentAssociationId(), SubmitActuals()
## Not run:
library(dplyr)
deploymentId <- "59a5af20c80891534e3c2bde"
acc <- GetDeploymentAccuracy(deploymentId, end = ISOdate(2021, 01, 06, 1, 0, 0, tz = "UTC"))
df <- mutate(
  acc$metrics,
  "modelId" = acc$modelId,
  "startTime" = acc$period$start,
  "endTime" = acc$period$end,
  .before = everything()
)
## End(Not run)
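The constraints on the reporting period (whole hours, an end that defaults to the next top of the hour, a start seven days earlier) can be reproduced locally when building explicit `start` and `end` values. The following base R sketch is an illustration under assumptions, not the package's own logic; `now`, `endTime`, and `startTime` are hypothetical names, and a fixed timestamp is used so the result is reproducible.

```r
# A fixed "current time" in UTC, used here instead of Sys.time() for clarity.
now <- as.POSIXct("2021-01-06 00:25:13", tz = "UTC")

# Round up to the next whole hour. A time already on the hour is left as is.
endTime <- as.POSIXct(
  ceiling(as.numeric(now) / 3600) * 3600,
  origin = "1970-01-01",
  tz = "UTC"
)

# The reporting period starts seven days before it ends.
startTime <- endTime - as.difftime(7, units = "days")
```

Both values have no sub-hour component and carry the UTC timezone, so they are suitable candidates for the `start` and `end` arguments.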
By default, this returns statistics for the seven days prior to the next top of the hour; set the start and end parameters to adjust the reporting period.
GetDeploymentAccuracyOverTime( deploymentId, metrics, modelId = NULL, start = NULL, end = NULL, bucketSize = NULL, segmentAttribute = NULL, segmentValue = NULL )
deploymentId |
character. The ID of the deployment in question. |
metrics |
character. Metrics to query. See |
modelId |
character. Optional. The ID of the model to query. If provided, only data for this specific model will be retrieved; otherwise, data for the deployment's default model will be retrieved. |
start |
POSIXct. Optional. The start time of the reporting period for monitoring data. Defaults to seven days prior to the end of the period. Sub-hour resolution is not permitted, and the timezone must be UTC. |
end |
POSIXct. Optional. The end time of the reporting period for monitoring data. Defaults to the next top of the hour. Sub-hour resolution is not permitted, and the timezone must be UTC. |
bucketSize |
character. Optional. The time duration of a bucket. This should be a multiple
of one hour and cannot be longer than the total length of the period. If not set, a default
value will be calculated based on the |
segmentAttribute |
character. Optional. The name of an attribute used for segment analysis. See SegmentAnalysisAttribute for permitted values. Added in DataRobot 2.21. |
segmentValue |
character. Optional. The value of |
An object representing how accuracy has changed over time for the deployment, containing:
modelId character. The ID of the deployment model for which monitoring data was retrieved.
summary data.frame. A summary bucket across the entire reporting period.
buckets data.frame. A list of buckets representing each interval (constrained by the bucketSize parameter) in the reporting period.
baseline data.frame. A baseline bucket.
Each bucket contains:
sampleSize. integer. The number of predictions made against this deployment.
start. POSIXct. The start time of the bucket. May be NA.
end. POSIXct. The end time of the bucket. May be NA.
metricName. numeric. Given N metrics queried, there will be N value columns, each one named for the metric. See DeploymentAccuracyMetric for supported values. May be NA if sampleSize is 0.
Other deployment accuracy functions:
GetDeploymentAccuracy(), GetDeploymentAssociationId(), SubmitActuals()
## Not run:
deploymentId <- "59a5af20c80891534e3c2bde"
aot <- GetDeploymentAccuracyOverTime(deploymentId,
                                     metrics = c(DeploymentAccuracyMetric$Gamma.Deviance,
                                                 DeploymentAccuracyMetric$LogLoss,
                                                 DeploymentAccuracyMetric$RMSE))
## End(Not run)
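The bucket structure described above (equal-length intervals, each with a sample size and one column per metric) can be mimicked on local data to see what the columns mean. This is a base R sketch under assumptions: the prediction log is made up, the bucket length is fixed at 24 hours, and RMSE is computed by hand rather than by the service.

```r
startTime <- as.POSIXct("2021-01-01 00:00:00", tz = "UTC")
endTime <- as.POSIXct("2021-01-03 00:00:00", tz = "UTC")
bucketHours <- 24

# Bucket boundaries: a sequence of times spaced one bucket length apart.
edges <- seq(startTime, endTime, by = bucketHours * 3600)
nBuckets <- length(edges) - 1

# Hypothetical log of predictions that have been matched with actuals.
predictionLog <- data.frame(
  time = startTime + c(1, 5, 30, 40) * 3600,
  predicted = c(1, 2, 3, 5),
  actual = c(1, 4, 3, 1)
)

# Assign each row to a bucket (start inclusive, end exclusive).
bucket <- factor(
  findInterval(as.numeric(predictionLog$time), as.numeric(edges)),
  levels = seq_len(nBuckets)
)
squaredError <- (predictionLog$predicted - predictionLog$actual)^2

buckets <- data.frame(
  start = edges[-length(edges)],
  end = edges[-1],
  sampleSize = as.integer(table(bucket)),
  # Buckets with no rows yield NA, mirroring "May be NA if sampleSize is 0".
  RMSE = sqrt(as.numeric(tapply(squaredError, bucket, mean)))
)
```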
The association ID of a deployment is a foreign key for your prediction dataset that will be used to match up actual values with those predictions. The ID should correspond to an event for which you want to track the outcome.
GetDeploymentAssociationId(deployment) UpdateDeploymentAssociationId( deployment, columnNames = c(), requiredInPredictionRequests = NULL, maxWait = 600 )
deployment |
An S3 object representing a model deployment, or the unique ID of such a deployment. |
columnNames |
character. Optional. Name(s) of the column(s) in your dataset that will be used to map actuals to predictions and determine accuracy. Note: This cannot be changed after the model has served predictions and the API will return an error. |
requiredInPredictionRequests |
logical. Optional. Whether the association ID is required in a prediction request. |
maxWait |
integer. How long to wait (in seconds) for the computation to complete before returning a timeout error? (Default 600 seconds) |
These functions are convenience methods to get and set the association ID settings for a deployment.
An object classed dataRobotDeploymentAssociationIdSettings that contains:
columnNames character. The columns that can be used as association IDs.
requiredInPredictionRequests logical. Whether the association ID is required in a prediction request.
UpdateDeploymentAssociationId(): Updates the association ID settings of a deployment. It will only update those settings that correspond to set arguments. This function will throw an error if the update fails and return the updated settings on success.
Other deployment accuracy functions:
GetDeploymentAccuracyOverTime(), GetDeploymentAccuracy(), SubmitActuals()
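The role of the association ID as a foreign key can be shown with a plain join. In this base R sketch the column name `orderId` and both data frames are hypothetical; the point is only that actuals are matched to predictions through the shared ID column.

```r
# Hypothetical prediction data, keyed by an association ID column.
predictions <- data.frame(
  orderId = c("a1", "a2", "a3"),
  predicted = c(0.9, 0.2, 0.6),
  stringsAsFactors = FALSE
)

# Hypothetical actual outcomes, which arrive later and only for some events.
actuals <- data.frame(
  orderId = c("a1", "a3"),
  actual = c(1, 0),
  stringsAsFactors = FALSE
)

# Inner join on the association ID: only events with a known outcome remain.
matched <- merge(predictions, actuals, by = "orderId")
```

Predictions without a matching actual (here `a2`) drop out of the joined result, which is why accuracy can only be tracked for events whose ID appears on both sides.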
Get drift tracking settings for a deployment.
GetDeploymentDriftTrackingSettings(deploymentId)
deploymentId |
character. The ID of the deployment. |
A list with the following information on drift tracking:
associationId
predictionIntervals list. A list with two keys:
enabled. 'TRUE' if prediction intervals are enabled and 'FALSE' otherwise.
percentiles list. A list of percentiles, if prediction intervals are enabled.
targetDrift list. A list with one key, 'enabled', which is 'TRUE' if target drift is enabled, and 'FALSE' otherwise.
featureDrift list. A list with one key, 'enabled', which is 'TRUE' if feature drift is enabled, and 'FALSE' otherwise.
## Not run:
deploymentId <- "5e319d2e422fbd6b58a5edad"
GetDeploymentDriftTrackingSettings(deploymentId)
## End(Not run)
Retrieve service health statistics for a deployment.
GetDeploymentServiceStats( deploymentId, modelId = NULL, start = NULL, end = NULL, executionTimeQuantile = NULL, responseTimeQuantile = NULL, slowRequestsThreshold = NULL, segmentAttribute = NULL, segmentValue = NULL )
deploymentId |
character. The ID of the deployment. |
modelId |
character. Optional. The ID of the model to query. If provided, only data for this specific model will be retrieved; otherwise, data for the deployment's default model will be retrieved. |
start |
POSIXct. Optional. The start time of the reporting period for monitoring data. Defaults to seven days prior to the end of the period. Sub-hour resolution is not permitted, and the timezone must be UTC. |
end |
POSIXct. Optional. The end time of the reporting period for monitoring data. Defaults to the next top of the hour. Sub-hour resolution is not permitted, and the timezone must be UTC. |
executionTimeQuantile |
numeric. Optional. Quantile for the |
responseTimeQuantile |
numeric. Optional. Quantile for the |
slowRequestsThreshold |
integer. Optional. Threshold for the |
segmentAttribute |
character. Optional. The name of an attribute used for segment analysis.
See |
segmentValue |
character. Optional. The value of |
An object representing service health metrics for the deployment, containing:
modelId character. The ID of the deployment model for which monitoring data was retrieved.
period list. The duration of the reporting period, containing:
start POSIXct. Start of the reporting period.
end POSIXct. End of the reporting period.
metrics list. Service health metrics for the deployment, containing:
totalPredictions integer. Total number of prediction rows.
totalRequests integer. Total number of prediction requests performed.
slowRequests integer. Number of requests with response time greater than slowRequestsThreshold.
responseTime numeric. Request response time at responseTimeQuantile in milliseconds. May be NA.
executionTime numeric. Request execution time at executionTimeQuantile in milliseconds. May be NA.
medianLoad integer. Median request rate, in requests per minute.
peakLoad integer. Greatest request rate, in requests per minute.
userErrorRate numeric. Ratio of user errors to the total number of requests.
serverErrorRate numeric. Ratio of server errors to the total number of requests.
numConsumers integer. Number of unique users performing requests.
cacheHitRatio numeric. The ratio of cache hits to requests.
segmentAttribute character. Added in DataRobot 2.20. The name of the segment on which segment analysis was performed.
segmentValue character. Added in DataRobot 2.20. The value of the segmentAttribute.
## Not run:
deploymentId <- "59a5af20c80891534e3c2bde"
startTime <- ISOdate(2020, 12, 25, 1, 0, 0, tz = "UTC")
endTime <- ISOdate(2021, 01, 06, 1, 0, 0, tz = "UTC")
# Pass start and end by name; positionally they would be matched to modelId and start.
GetDeploymentServiceStats(deploymentId, start = startTime, end = endTime)
## End(Not run)
## Not run:
deploymentId <- "59a5af20c80891534e3c2bde"
GetDeploymentServiceStats(deploymentId,
                          segmentAttribute = SegmentAnalysisAttribute$DataRobotRemoteIP,
                          segmentValue = "192.168.0.1")
## End(Not run)
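To show how a quantile and a threshold shape the returned metrics, the sketch below derives a few of them from a made-up request log in base R. Several things here are assumptions for illustration: the log itself, treating HTTP 4xx as user errors and 5xx as server errors, and expressing the threshold in milliseconds. The service's own definitions may differ.

```r
# Hypothetical request log: response time in milliseconds and HTTP status.
requests <- data.frame(
  responseMs = c(120, 80, 950, 200, 1500),
  status = c(200, 200, 400, 200, 500)
)

slowRequestsThreshold <- 1000  # assumed to be in milliseconds
responseTimeQuantile <- 0.5

metrics <- list(
  totalRequests = nrow(requests),
  # Requests slower than the threshold.
  slowRequests = sum(requests$responseMs > slowRequestsThreshold),
  # Response time at the requested quantile (R's default quantile type).
  responseTime = unname(quantile(requests$responseMs, responseTimeQuantile)),
  # Error rates as a share of all requests (4xx = user, 5xx = server; assumed).
  userErrorRate = mean(requests$status >= 400 & requests$status < 500),
  serverErrorRate = mean(requests$status >= 500)
)
```

Raising the quantile toward 1 moves `responseTime` toward the slowest request, while lowering the threshold increases `slowRequests`.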
By default, this returns statistics for the seven days prior to the next top of the hour; set the start and end parameters to adjust the reporting period.
GetDeploymentServiceStatsOverTime( deploymentId, metrics = DeploymentServiceHealthMetric$TotalPredictions, modelId = NULL, start = NULL, end = NULL, bucketSize = NULL, quantile = NULL, threshold = NULL, segmentAttribute = NULL, segmentValue = NULL )
deploymentId |
character. The ID of the deployment. |
metrics |
character. Optional. Metrics to query. See |
modelId |
character. Optional. The ID of the model to query. If provided, only data for this specific model will be retrieved; otherwise, data for the deployment's default model will be retrieved. |
start |
POSIXct. Optional. The start time of the reporting period for monitoring data. Defaults to seven days prior to the end of the period. Sub-hour resolution is not permitted, and the timezone must be UTC. |
end |
POSIXct. Optional. The end time of the reporting period for monitoring data. Defaults to the next top of the hour. Sub-hour resolution is not permitted, and the timezone must be UTC. |
bucketSize |
character. Optional. The time duration of a bucket. This should be a multiple
of one hour and cannot be longer than the total length of the period. If not set, a default
value will be calculated based on the |
quantile |
numeric. Optional. Quantile for the |
threshold |
integer. Optional. Threshold for the |
segmentAttribute |
character. Optional. The name of an attribute used for segment analysis.
See |
segmentValue |
character. Optional. The value of |
modelId character. The ID of the deployment model for which monitoring data was retrieved.
summary data.frame. Summarizes statistics for each metric over the entire reporting period.
buckets data.frame. Statistics for each metric, split into intervals of equal duration. There is one column representing stats for each metric queried, as well as:
start POSIXct. Start of the interval.
end POSIXct. End of the interval.
segmentAttribute character. Added in DataRobot 2.20. The name of the segment on which segment analysis was performed.
segmentValue character. Added in DataRobot 2.20. The value of segmentAttribute.
## Not run:
deploymentId <- "59a5af20c80891534e3c2bde"
metrics <- c(DeploymentServiceHealthMetric)
GetDeploymentServiceStatsOverTime(deploymentId, metrics = metrics)
## End(Not run)
Returns information about a particular driver.
GetDriver(driverId)
driverId |
character. The id of the driver. |
A list containing information on the particular driver:
className character. The Java class name of the driver.
baseNames character. A vector of the file name(s) of the jar files.
canonicalName character. The user-friendly name of the driver.
id character. The driverId of the driver.
creator character. The userId of the user who created the driver.
## Not run:
driverId <- "57a7c978c808916f4a630f89"
GetDriver(driverId)
## End(Not run)
Get pairwise feature association statistics for a project's informative features
GetFeatureAssociationMatrix(project, associationType, metric)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
associationType |
character. The type of association, must be either "association" or "correlation". |
metric |
character. The specified association metric, must be one of "mutualInfo", "cramersV", "spearman", "pearson", or "tau". |
A list with two items:
features data.frame. A data.frame containing the following info for each feature:
alphabeticSortIndex integer. A number representing the alphabetical order of this feature compared to the other features in this dataset.
feature character. The name of the feature.
importanceSortIndex integer. A number ranking the importance of this feature compared to the other features in this dataset.
strengthSortIndex integer. A number ranking the strength of this feature compared to the other features in this dataset.
strengths data.frame. A data.frame of pairwise strength data, with the following info:
feature1 character. The name of the first feature.
feature2 character. The name of the second feature.
statistic numeric. Feature association statistics for 'feature1' and 'feature2'.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
GetFeatureAssociationMatrix(projectId, associationType = "association", metric = "mutualInfo")
## End(Not run)
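The `strengths` component is a long table with one row per feature pair, which is often easier to read as a square matrix. The following base R sketch performs that reshaping on a hand-built stand-in; the feature names and statistics are invented, and the diagonal is left as NA because the self-association value is not specified above.

```r
# Hypothetical stand-in for the `strengths` data.frame.
strengths <- data.frame(
  feature1 = c("A", "A", "B"),
  feature2 = c("B", "C", "C"),
  statistic = c(0.8, 0.1, 0.3),
  stringsAsFactors = FALSE
)

features <- sort(unique(c(strengths$feature1, strengths$feature2)))
associationMatrix <- matrix(
  NA_real_,
  nrow = length(features),
  ncol = length(features),
  dimnames = list(features, features)
)

# Fill both triangles so that the matrix is symmetric in the feature pair.
associationMatrix[cbind(strengths$feature1, strengths$feature2)] <- strengths$statistic
associationMatrix[cbind(strengths$feature2, strengths$feature1)] <- strengths$statistic
```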
Get a sample of the actual values used to measure the association between a pair of features.
GetFeatureAssociationMatrixDetails(project, feature1, feature2)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
feature1 |
character. The name of the first feature of interest. |
feature2 |
character. The name of the second feature of interest. |
A list with the following info:
features list. The names of 'feature1' and 'feature2'.
types list. The type of 'feature1' and 'feature2'. Will be "C" for categorical and "N" for numeric.
values data.frame. The values of the feature associations and the relative frequency of the data points in the sample.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
GetFeatureAssociationMatrixDetails(projectId, "SepalWidth", "SepalLength")
## End(Not run)
A histogram represents the distribution of a feature's values as a series of bins. For categorical features, each bin represents exactly one feature value and holds the number of occurrences of that value. For numeric features, each bin represents a range of values (low end inclusive, high end exclusive) and holds the total number of occurrences of all values in that range. Each bin also includes the average of the target feature for the values in that bin (though it can be missing if the feature is deemed uninformative, if the project target has not yet been selected using SetTarget, or if the project is a multiclass project).
GetFeatureHistogram(project, featureName, binLimit = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
featureName |
Name of the feature to retrieve. Note: DataRobot renames some features, so the feature name may not be the one from your original data. You can use ListFeatureInfo to list the features and check the name. |
binLimit |
integer. Optional. Desired max number of histogram bins. The default is 60. |
list containing:
count numeric. The number of values in this bin's range. If a project is using weights, the value is equal to the sum of weights of all feature values in the bin's range.
target numeric. Average of the target feature for values in this bin. It may be NULL if the feature is deemed uninformative, if the target has not yet been set (see SetTarget), or if the project is multiclass.
label character. The value of the feature if categorical, otherwise the low end of the bin range such that the difference between two consecutive bin labels is the length of the bin.
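The three columns can be reproduced for a numeric feature with base R, which may help when interpreting the output. This is a sketch on invented data with hand-picked bin edges, not the binning the service performs: bins are closed on the low end and open on the high end, `label` is the low end of each bin, and `target` is the mean of the target within the bin.

```r
# Hypothetical numeric feature and a 0/1 target.
values <- c(1, 2, 2.5, 4, 5.5, 7)
target <- c(0, 1, 1, 0, 1, 1)

# Equal-width bin edges; right = FALSE makes each bin [low, high).
breaks <- c(0, 3, 6, 9)
bin <- cut(values, breaks = breaks, right = FALSE)

histogram <- data.frame(
  label = breaks[-length(breaks)],           # low end of each bin
  count = as.vector(table(bin)),             # number of values in the bin
  target = as.vector(tapply(target, bin, mean))  # target average in the bin
)
```

The difference between consecutive labels (3 here) equals the bin length, matching the description of `label` above.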
Feature Impact is computed for each column by creating new data with that column randomly permuted (but the others left unchanged), and seeing how the error metric score for the predictions is affected. The 'impactUnnormalized' is how much worse the error metric score is when making predictions on this modified data. The 'impactNormalized' is normalized so that the largest value is 1. In both cases, larger values indicate more important features. Elsewhere this technique is sometimes called 'Permutation Importance'.
GetFeatureImpact(model)
model |
character. The model for which you want to compute Feature Impact, e.g. from the list of models returned by ListModels. |
Note that GetFeatureImpact will block for the duration of feature impact calculation. If you would prefer not to block the call, use RequestFeatureImpact to generate an async request for feature impact and then use GetFeatureImpactForModel or GetFeatureImpactForJobId to get the feature impact when it has been calculated. GetFeatureImpactForJobId will also block until the request is complete, whereas GetFeatureImpactForModel will error if the job is not complete yet.
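The permutation technique behind Feature Impact can be sketched locally in base R. This is an illustration of the general idea on simulated data with a linear model and RMSE as the error metric; it is not DataRobot's implementation, and the data, model, and metric are all assumptions chosen to keep the example self-contained.

```r
set.seed(42)
n <- 200

# Simulated data: y depends strongly on x1, weakly on x2, and not on noise.
data <- data.frame(x1 = rnorm(n), x2 = rnorm(n), noise = rnorm(n))
data$y <- 3 * data$x1 + 0.5 * data$x2 + rnorm(n, sd = 0.1)

fit <- lm(y ~ x1 + x2 + noise, data = data)

rmse <- function(actual, predicted) sqrt(mean((actual - predicted)^2))
baseline <- rmse(data$y, predict(fit, data))

# For each feature: permute that column only, re-score, and record how much
# worse the error metric becomes compared with the unpermuted data.
impactUnnormalized <- sapply(c("x1", "x2", "noise"), function(feature) {
  permuted <- data
  permuted[[feature]] <- sample(permuted[[feature]])
  rmse(data$y, predict(fit, permuted)) - baseline
})

# Scale so that the largest impact is 1.
impactNormalized <- impactUnnormalized / max(impactUnnormalized)
```

As expected from how the data were simulated, `x1` receives the largest impact and `noise` a value near zero.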
This will wait for the Feature Impact job to be completed (giving an error if the job is not a Feature Impact job and an error if the job errors).
GetFeatureImpactForJobId(project, jobId, maxWait = 600)
project |
character. The project the Feature Impact is part of. |
jobId |
character. The ID of the job (e.g. as returned from RequestFeatureImpact) |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete |
A data frame with the following columns:
featureName character. The name of the feature.
impactNormalized numeric. The normalized impact score (largest value is 1).
impactUnnormalized numeric. The unnormalized impact score.
redundantWith character. A feature that makes this feature redundant, or NA if the feature is not redundant.
## Not run:
model <- ListModels(project)[[1]]
featureImpactJobId <- RequestFeatureImpact(model)
featureImpact <- GetFeatureImpactForJobId(project, featureImpactJobId)
## End(Not run)
This will only succeed if the Feature Impact computation has completed.
GetFeatureImpactForModel(model)
model |
character. The model for which you want to retrieve Feature Impact. |
Feature Impact is computed for each column by creating new data with that column randomly permuted (but the others left unchanged), and seeing how the error metric score for the predictions is affected. The 'impactUnnormalized' is how much worse the error metric score is when making predictions on this modified data. The 'impactNormalized' is normalized so that the largest value is 1. In both cases, larger values indicate more important features. Elsewhere this technique is sometimes called 'Permutation Importance'.
Feature impact also runs redundancy detection, which detects if some features are redundant with higher importance features. Note that some types of projects, like multiclass, do not run redundancy detection. This function will generate a warning if redundancy detection was not run.
A data frame with the following columns:
featureName character. The name of the feature.
impactNormalized numeric. The normalized impact score (largest value is 1).
impactUnnormalized numeric. The unnormalized impact score.
redundantWith character. A feature that makes this feature redundant, or NA if the feature is not redundant.
## Not run:
model <- ListModels(project)[[1]]
featureImpactJobId <- RequestFeatureImpact(model)
# Note: This will only work after the feature impact job has completed. Use
# GetFeatureImpactForJobId to automatically wait for the job.
featureImpact <- GetFeatureImpactForModel(model)
## End(Not run)
Details about a feature
GetFeatureInfo(project, featureName)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
featureName |
Name of the feature to retrieve. Note: DataRobot renames some features, so the feature name may not be the one from your original data. You can use ListFeatureInfo to list the features and check the name. |
A named list which contains:
id numeric. feature id. Note that throughout the API, features are specified using their names, not this ID.
name character. The name of the feature.
featureType character. Feature type: 'Numeric', 'Categorical', etc.
importance numeric. numeric measure of the strength of relationship between the feature and target (independent of any model or other features).
lowInformation logical. Whether the feature has too few values to be informative.
uniqueCount numeric. The number of unique values in the feature.
naCount numeric. The number of missing values in the feature.
dateFormat character. The format of the feature if it is a date-time feature.
projectId character. Character id of the project the feature belongs to.
max. The maximum value in the dataset, formatted in the same format as the data.
min. The minimum value in the dataset, formatted in the same format as the data.
mean. The arithmetic mean of the dataset, formatted in the same format as the data.
median. The median of the dataset, formatted in the same format as the data.
stdDev. The standard deviation of the dataset, formatted in the same format as the data.
timeSeriesEligible logical. Whether this feature can be used as the datetime partition column in a time series project.
timeSeriesEligibilityReason character. Why the feature is ineligible for use as the datetime partition column in a time series project, or "suitable" if it is eligible.
crossSeriesEligible logical. Whether the cross series group by column is eligible for cross-series modeling. Will be NULL if no cross series group by column is used.
crossSeriesEligibilityReason character. The type of cross series eligibility (or ineligibility).
timeStep numeric. For time-series eligible features, a positive integer determining the interval at which windows can be specified. If used as the datetime partition column on a time series project, the feature derivation and forecast windows must start and end at an integer multiple of this value. NULL for features that are not time series eligible.
timeUnit character. For time series eligible features, the time unit covered by a single time step, e.g. "HOUR", or NULL for features that are not time series eligible.
targetLeakage character. Whether a feature is considered to have target leakage or not. A value of "SKIPPED_DETECTION" indicates that target leakage detection was not run on the feature.
keySummary data.frame. Optional. Descriptive statistics for this feature, iff it is a summarized categorical feature. This data.frame contains:
key. The name of the key.
summary. Descriptive statistics for this key, including:
max. The maximum value in the dataset.
min. The minimum value in the dataset.
mean. The arithmetic mean of the dataset.
median. The median of the dataset.
stdDev. The standard deviation of the dataset.
pctRows. The percentage of rows (from the EDA sample) in which this key occurs.
Other feature functions: ListFeatureInfo(), ListModelFeatures(), as.dataRobotFeatureInfo()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
GetFeatureInfo(projectId, "myFeature")
## End(Not run)
This function returns information about and the contents of a specified featurelist from a specified project.
GetFeaturelist(project, featurelistId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
featurelistId |
Unique alphanumeric identifier for the featurelist to be retrieved. |
DataRobot featurelists define the variables from the modeling dataset used in fitting each project model. In most cases, the same featurelist is used in fitting all project models, but models can be fit using alternative featurelists using the RequestNewModel function. To do this, featurelistId is required, and this is one of the elements returned by the GetFeaturelist function.
A list with the following elements describing the requested featurelist:
featurelistId character. The unique alphanumeric identifier for the featurelist.
projectId character. The project to which the featurelist belongs.
features character. The names of the variables included in the featurelist.
name character. The name of the featurelist.
created character. A timestamp of when the featurelist was created.
isUserCreated logical. Whether or not the featurelist was created by a user (as opposed to DataRobot automation).
numModels numeric. The number of models that currently use this featurelist.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
featureList <- CreateFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2"))
featurelistId <- featureList$featurelistId
GetFeaturelist(projectId, featurelistId)
## End(Not run)
This function returns a DataRobot S3 object of class dataRobotFrozenModel for the model defined by project and modelId. GetModel can also be used to retrieve some information about a frozen model, but it will not return the information specific to frozen models (parentModelId).
GetFrozenModel(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
Unique alphanumeric identifier for the model of interest. |
The S3 object returned by this function is required by the functions DeleteModel, ListModelFeatures, and RequestSampleSizeUpdate.
An S3 object of class ‘dataRobotFrozenModel’, which is a list with the following components:
featurelistId. Character string: unique alphanumeric identifier for the featurelist on which the model is based.
processes. Character vector with components describing preprocessing; may include modelType.
featurelistName. Character string giving the name of the featurelist on which the model is based.
projectId. Character string giving the unique alphanumeric identifier for the project.
samplePct. Numeric or NULL. The percentage of the project dataset used in training the model. If the project uses datetime partitioning, the samplePct will be NULL. See trainingRowCount, trainingDuration, and trainingStartDate and trainingEndDate instead.
trainingRowCount. Integer. The number of rows of the project dataset used in training the model. In a datetime partitioned project, if specified, defines the number of rows used to train the model and evaluate backtest scores; if unspecified, either trainingDuration or trainingStartDate and trainingEndDate was used to determine that instead.
isFrozen. Logical: whether the model was created with frozen tuning parameters.
modelType. Character string describing the model type.
metrics. List with one element for each valid metric associated with the model. Each element is a list with elements for each possible evaluation type (holdout, validation, and crossValidation).
modelCategory. Character string giving model category (e.g., blend, model).
blueprintId. Character string giving the unique DataRobot blueprint identifier on which the model is based.
modelId. Character string giving the unique alphanumeric model identifier.
modelNumber. Integer. The assigned model number.
projectName. Character string: optional description of project defined by projectId.
projectTarget. Character string defining the target variable predicted by all models in the project.
projectMetric. Character string defining the fitting metric optimized by all project models.
supportsMonotonicConstraints logical. Whether or not the model supports monotonic constraints.
monotonicIncreasingFeaturelistId character. The ID of the featurelist specifying the features that are constrained to be monotonically increasing. Will be NULL if no increasing constraints are used.
monotonicDecreasingFeaturelistId character. The ID of the featurelist specifying the features that are constrained to be monotonically decreasing. Will be NULL if no decreasing constraints are used.
isStarred logical. Whether or not the model is starred.
predictionThreshold numeric. For binary classification projects, the threshold used for predictions.
predictionThresholdReadOnly logical. Whether or not the prediction threshold can be modified. Typically, the prediction threshold can no longer be modified once a model has a deployment created or predictions have been made with the dedicated prediction API.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetFrozenModel(projectId, modelId)
## End(Not run)
The function RequestFrozenModel initiates the creation of frozen models in a DataRobot project. It submits a request to the DataRobot modeling engine and returns an integer-valued modelJobId. The GetFrozenModelFromJobId function polls the modeling engine until the model has been built or a specified time limit is exceeded, returning an S3 object of class 'dataRobotFrozenModel' when the model is available.
GetFrozenModelFromJobId(project, modelJobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelJobId |
integer. The integer returned by either |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
This function exists because some models (for example, very complex machine learning models fit to large datasets) may take a long time to complete. Splitting the model creation request from model retrieval allows the user to perform other tasks in the interactive R session between the time the model creation/update request is made and the time the final model is available.
GetModelFromJobId can also be used to retrieve some information about a frozen model, but it will not return the information specific to frozen models (parentModelId).
An S3 object of class 'dataRobotFrozenModel' summarizing all available information about the model.
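The request-then-poll pattern described above can be sketched in base R. This is only an illustration of polling with a maxWait time limit: pollUntilDone, makeMockJob, the interval argument, and the status strings are all hypothetical names introduced here, and the mock job stands in for the modeling engine rather than reproducing how the package actually polls.

```r
# Hypothetical helper: call checkStatus() repeatedly until it reports
# completion or until maxWait seconds have elapsed.
pollUntilDone <- function(checkStatus, maxWait = 600, interval = 1) {
  deadline <- Sys.time() + maxWait
  repeat {
    status <- checkStatus()
    if (identical(status, "COMPLETED")) {
      return(status)
    }
    if (Sys.time() >= deadline) {
      stop("Job did not complete within ", maxWait, " seconds")
    }
    Sys.sleep(interval)
  }
}

# Mock job (assumption): reports completion on the readyAfter-th status check.
makeMockJob <- function(readyAfter) {
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls >= readyAfter) "COMPLETED" else "inprogress"
  }
}

status <- pollUntilDone(makeMockJob(readyAfter = 3), maxWait = 5, interval = 0.01)
print(status)
```

A job that never completes raises an error once maxWait has passed, which is the behavior a caller should be prepared to handle when choosing a small maxWait.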
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- ListModelJobs(projectId)
job <- initialJobs[[1]]
modelJobId <- job$modelJobId
GetFrozenModelFromJobId(projectId, modelJobId)
## End(Not run)
See GetLiftChart, GetRocCurve, GetResidualsChart for details.
GetGeneralizedInsight( method, model, source = DataPartition$VALIDATION, fallbackToParentInsights = FALSE )
method |
character. The API URL to use to get insight information. |
model |
dataRobotModel. A DataRobot model object like that returned by |
source |
character. The data partition for which data would be returned. Default is
|
fallbackToParentInsights |
logical. If TRUE, this will return the lift chart data for the model's parent if the lift chart is not available for the model and the model has a parent model. |
Request information about a job
GetJob(project, jobId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
jobId |
Character string specifying the job id |
A list with the following elements:
status character. Model job status; an element of JobStatus, e.g. JobStatus$Queue.
url character. URL to request more detail about the job.
id character. The job id.
jobType character. See JobType for valid values.
projectId character. The project that contains the model.
isBlocked logical. If TRUE, the job is blocked (cannot be executed) until its dependencies are resolved.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- ListModelJobs(projectId)
job <- initialJobs[[1]]
jobId <- job$modelJobId
GetJob(projectId, jobId)
## End(Not run)
Retrieve lift chart data for a model for a data partition (see DataPartition)
GetLiftChart( model, source = DataPartition$VALIDATION, fallbackToParentInsights = FALSE )
model |
dataRobotModel. A DataRobot model object like that returned by |
source |
character. The data partition for which data would be returned. Default is
|
fallbackToParentInsights |
logical. If TRUE, this will return the lift chart data for the model's parent if the lift chart is not available for the model and the model has a parent model. |
data.frame with the following components:
binWeight. Numeric: weight of the bin. For weighted projects, the sum of the weights of all rows in the bin; otherwise, the number of rows in the bin.
actual. Numeric: sum of actual target values in bin.
predicted. Numeric: sum of predicted target values in bin.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
GetLiftChart(model, source = DataPartition$VALIDATION)
## End(Not run)
The missing values report is a list of items, one per feature, sorted by missing count in descending order. Each item in the report contains details on the number of missing values for that feature and how they were handled by the model.
GetMissingValuesReport(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
A list containing:
feature character. The name of the feature.
type character. Feature type (numeric or categorical).
missingCount numeric. The number of missing values in the training data for that feature.
missingPercentage numeric. The percentage of missing values in the training data for the feature.
tasks list. A list of information on each task that was applied to that feature to handle missing values. This information contains:
id character. The id of the node in the model blueprint chart for this task. (See GetBlueprintChart for more information on blueprint charts.)
name character. The name of the task.
descriptions character. Aggregated information about how the task handles missing values.
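To make the per-feature counts concrete, the sketch below builds a similar summary for a small local data frame in base R. It only mirrors the feature, type, missingCount, and missingPercentage fields and the descending sort; the toy data are invented, the 0 to 100 percentage scale is an assumption, and the tasks element has no local equivalent because it describes blueprint steps.

```r
# Toy data (assumption): three features with different amounts of missingness.
df <- data.frame(
  age = c(21, NA, 35, NA, NA),
  city = c("A", NA, "B", "C", "D"),
  income = c(10, 20, 30, 40, 50),
  stringsAsFactors = FALSE
)

# Count missing values per column.
missingCount <- vapply(df, function(col) sum(is.na(col)), numeric(1))

report <- data.frame(
  feature = names(df),
  type = unname(ifelse(vapply(df, is.numeric, logical(1)), "numeric", "categorical")),
  missingCount = unname(missingCount),
  # Percentage scale (0 to 100) is an assumption for this sketch.
  missingPercentage = unname(100 * missingCount / nrow(df)),
  row.names = NULL,
  stringsAsFactors = FALSE
)

# One item per feature, sorted by missing count in descending order.
report <- report[order(-report$missingCount), ]
print(report)
```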
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
modelId <- "5984b4d7100d2b31c1166529"
GetMissingValuesReport(projectId, modelId)
## End(Not run)
This function returns a DataRobot S3 object of class dataRobotModel for the model defined by project and modelId.
GetModel(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
The S3 object returned by this function is required by the functions DeleteModel, ListModelFeatures, and RequestSampleSizeUpdate.
An S3 object of class ‘dataRobotModel’, which is a list with the following components:
featurelistId. Character string: unique alphanumeric identifier for the featurelist on which the model is based.
processes. Character vector with components describing preprocessing; may include modelType.
featurelistName. Character string giving the name of the featurelist on which the model is based.
projectId. Character string giving the unique alphanumeric identifier for the project.
samplePct. Numeric or NULL. The percentage of the project dataset used in training the model. If the project uses datetime partitioning, the samplePct will be NULL. See trainingRowCount, trainingDuration, and trainingStartDate and trainingEndDate instead.
trainingRowCount. Integer. The number of rows of the project dataset used in training the model. In a datetime partitioned project, if specified, defines the number of rows used to train the model and evaluate backtest scores; if unspecified, either trainingDuration or trainingStartDate and trainingEndDate was used to determine that instead.
isFrozen. Logical: whether the model was created with frozen tuning parameters.
modelType. Character string describing the model type.
metrics. List with one element for each valid metric associated with the model. Each element is a list with elements for each possible evaluation type (holdout, validation, and crossValidation).
modelCategory. Character string giving model category (e.g., blend, model).
blueprintId. Character string giving the unique DataRobot blueprint identifier on which the model is based.
modelId. Character string giving the unique alphanumeric model identifier.
modelNumber. Integer. The assigned model number.
projectName. Character string: optional description of project defined by projectId.
projectTarget. Character string defining the target variable predicted by all models in the project.
projectMetric. Character string defining the fitting metric optimized by all project models.
supportsMonotonicConstraints logical. Whether or not the model supports monotonic constraints.
monotonicIncreasingFeaturelistId character. The ID of the featurelist specifying the features that are constrained to be monotonically increasing. Will be NULL if no increasing constraints are used.
monotonicDecreasingFeaturelistId character. The ID of the featurelist specifying the features that are constrained to be monotonically decreasing. Will be NULL if no decreasing constraints are used.
isStarred logical. Whether or not the model is starred.
predictionThreshold numeric. For binary classification projects, the threshold used for predictions.
predictionThresholdReadOnly logical. Whether or not the prediction threshold can be modified. Typically, the prediction threshold can no longer be modified once a model has a deployment created or predictions have been made with the dedicated prediction API.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetModel(projectId, modelId)
## End(Not run)
A model blueprint is a "pruned down" blueprint representing what was actually run for the model. This is solely the branches of the blueprint that were executed based on the featurelist.
GetModelBlueprintChart(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
List with the following two components:
nodes. List in which each element contains information about one node of the blueprint: id and label.
edges. Two-column matrix identifying the connections between blueprint nodes.
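As an illustration of how the two components fit together, the sketch below resolves each row of an edges matrix to the labels of the nodes it connects. The chart object is a hand-built mock that follows the structure described above; its ids and labels are invented, and treating the first column as the source node is an assumption.

```r
# Mock blueprint chart (assumption): three nodes connected in a chain.
chart <- list(
  nodes = list(
    list(id = "0", label = "Data"),
    list(id = "1", label = "Missing Values Imputed"),
    list(id = "2", label = "Model")
  ),
  edges = matrix(c("0", "1",
                   "1", "2"), ncol = 2, byrow = TRUE)
)

# Build a lookup from node id to node label.
labels <- vapply(chart$nodes, function(node) node$label, character(1))
names(labels) <- vapply(chart$nodes, function(node) node$id, character(1))

# Describe each edge by the labels of its two endpoints.
edgeLabels <- unname(paste(labels[chart$edges[, 1]], "->", labels[chart$edges[, 2]]))
print(edgeLabels)
```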
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetModelBlueprintChart(projectId, modelId)
## End(Not run)
A model blueprint is a "pruned down" blueprint representing what was actually run for the model. This is solely the branches of the blueprint that were executed based on the featurelist.
GetModelBlueprintDocumentation(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
A list with the following components:
Character string. The name of the task described in the document.
Character string. The task description.
Character string. The title of the document.
List of the parameters the task can receive, in human-readable format, with the following components: name, type, description.
List of external links used in the document, with the following components: name, url.
List of references used in the document, with the following components: name, url.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetModelBlueprintDocumentation(projectId, modelId)
## End(Not run)
Get supported capabilities for a model, e.g., whether it has a word cloud.
GetModelCapabilities(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
Returns a list of logicals, representing different capabilities. Some of them are defined below:
supportsBlending logical. Whether the model supports blending. See RequestBlender.
supportsMonotonicConstraints logical. Whether the model supports monotonic constraints. See RequestModel.
supportsModelPackageExport. logical. Whether the model can be exported as a model package (a .mloc file).
supportsCodeGeneration logical. Added in DataRobot API 2.18. Whether the model supports code generation.
supportsShap logical. Added in DataRobot API 2.18. Whether the model supports the Shapley package, i.e. Shapley-based feature importance.
supportsEarlyStopping. logical. Added in DataRobot API 2.22. Whether this is an early-stopping tree-based model, which denotes that the number of trained iterations can be retrieved.
hasWordCloud logical. Whether the model has a word cloud. See GetWordCloud.
eligibleForPrime logical. Whether the model is eligible for Prime. See CreatePrimeCode.
hasParameters logical. Whether the model has parameters. See GetModelParameters.
The list also includes the following:
reasons. character. Explanations for why this model does not support certain capabilities. Not all capabilities are listed here. Names correspond to capabilities listed in ModelCapability.
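A returned capabilities list can be split into supported and unsupported capabilities with a few lines of base R. The sketch below works on a hand-built mock that follows the structure described above; the flag values and the reason strings are placeholders, and representing reasons as a named character vector is an assumption.

```r
# Mock capabilities list (assumption): logical flags plus a `reasons` element.
capabilities <- list(
  supportsBlending = TRUE,
  supportsMonotonicConstraints = FALSE,
  hasWordCloud = FALSE,
  eligibleForPrime = TRUE,
  hasParameters = TRUE,
  reasons = c(
    supportsMonotonicConstraints = "placeholder reason",
    hasWordCloud = "placeholder reason"
  )
)

# Separate the logical flags from the explanatory `reasons` element.
flags <- capabilities[setdiff(names(capabilities), "reasons")]
supported <- names(Filter(isTRUE, flags))
unsupported <- names(Filter(isFALSE, flags))

# Look up an explanation for each unsupported capability, where one exists.
explained <- capabilities$reasons[intersect(unsupported, names(capabilities$reasons))]
print(supported)
print(explained)
```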
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
GetModelCapabilities(model)
## End(Not run)
The functions RequestNewModel and RequestSampleSizeUpdate initiate the creation of new models in a DataRobot project. Both functions submit requests to the DataRobot modeling engine and return an integer-valued modelJobId. The GetModelFromJobId function polls the modeling engine until the model has been built or a specified time limit is exceeded, returning an S3 object of class 'dataRobotModel' when the model is available.
GetModelFromJobId(project, modelJobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelJobId |
The integer returned by either RequestNewModel or RequestSampleSizeUpdate. |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
This function exists because some models (for example, very complex machine learning models fit to large datasets) may take a long time to complete. Splitting the model creation request from model retrieval allows the user to perform other tasks in the interactive R session between the time the model creation/update request is made and the time the final model is available.
An S3 object of class 'dataRobotModel' summarizing all available information about the model.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- ListModelJobs(projectId)
job <- initialJobs[[1]]
modelJobId <- job$modelJobId
GetModelFromJobId(projectId, modelJobId)
## End(Not run)
In time series projects, a new set of modeling features is created after setting the partitioning options. These features are automatically derived from those in the project's dataset and are the features used for modeling. Modeling features are only accessible once the target and partitioning options have been set. In projects that don't use time series modeling, once the target has been set, ModelingFeaturelists and Featurelists will behave the same.
GetModelingFeaturelist(project, featurelistId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
featurelistId |
Unique alphanumeric identifier for the featurelist to be retrieved. |
A list with the following elements describing the requested featurelist:
featurelistId character. The unique alphanumeric identifier for the featurelist.
projectId character. The project to which the featurelist belongs.
features character. The names of the variables included in the featurelist.
name character. The name of the featurelist.
created character. A timestamp of when the featurelist was created.
isUserCreated logical. Whether or not the featurelist was created by a user (as opposed to DataRobot automation).
numModels numeric. The number of models that currently use this featurelist.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
featureList <- CreateModelingFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2"))
featurelistId <- featureList$featurelistId
GetModelingFeaturelist(projectId, featurelistId)
## End(Not run)
Request information about a single model job
GetModelJob(project, modelJobId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelJobId |
Character string specifying the job id |
A list with the following elements:
status character. Model job status; an element of JobStatus, e.g. JobStatus$Queue.
processes list. List of character vectors describing any preprocessing applied.
projectId character. The unique identifier for the project.
modelId character. The unique identifier for the related model.
samplePct numeric. The percentage of the dataset used for model building.
trainingRowCount. Integer. The number of rows of the project dataset used in training the model.
modelType character. string specifying the model this job builds.
modelCategory character. What kind of model this is: prime for DataRobot Prime models, blend for blender models, and model for other models.
featurelistId character. Id of the featurelist used in fitting the model.
blueprintId character. Id of the DataRobot blueprint on which the model is based.
modelJobId character. Id of the job.
isBlocked logical. If TRUE, the job is blocked (cannot be executed) until its dependencies are resolved.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- ListModelJobs(projectId)
job <- initialJobs[[1]]
modelJobId <- job$modelJobId
GetModelJob(projectId, modelJobId)
## End(Not run)
Retrieve model parameters
GetModelParameters(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
List with the following components:
parameters. List of model parameters that are related to the whole model with following components: name, value.
derivedFeatures. List containing preprocessing information about derived features with following components: originalFeature, derivedFeature, type, coefficient, transformations and stageCoefficients. 'transformations' is a list itself with components: name and value. 'stageCoefficients' is also a list with components: stage and coefficient. It contains coefficients for each stage of multistage models and is empty list for single stage models.
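The nested derivedFeatures list is often easier to read as a table. The sketch below flattens a hand-built mock with the components named above into a data frame of coefficients; the feature names, types, and values are invented for illustration, and real output may contain more fields or different value types.

```r
# Mock model parameters (assumption): one whole-model parameter and two
# derived features from a single-stage model (empty stageCoefficients).
params <- list(
  parameters = list(list(name = "Intercept", value = -1.25)),
  derivedFeatures = list(
    list(
      originalFeature = "age", derivedFeature = "STANDARDIZED_age",
      type = "NUM", coefficient = 0.42,
      transformations = list(list(name = "Standardize", value = "placeholder")),
      stageCoefficients = list()
    ),
    list(
      originalFeature = "income", derivedFeature = "STANDARDIZED_income",
      type = "NUM", coefficient = -1.3,
      transformations = list(list(name = "Standardize", value = "placeholder")),
      stageCoefficients = list()
    )
  )
)

# One row per derived feature, keeping only the scalar fields.
coefTable <- do.call(rbind, lapply(params$derivedFeatures, function(f) {
  data.frame(
    originalFeature = f$originalFeature,
    derivedFeature = f$derivedFeature,
    coefficient = f$coefficient,
    stringsAsFactors = FALSE
  )
}))

# Largest absolute coefficient first.
coefTable <- coefTable[order(-abs(coefTable$coefficient)), ]
print(coefTable)
```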
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetModelParameters(projectId, modelId)
## End(Not run)
Model recommendations are only generated when you run full Autopilot. One of them (the most accurate individual, non-blender model) will be prepared for deployment. In the preparation process, DataRobot will: (1) calculate feature impact for the selected model and use it to generate a reduced featurelist, (2) retrain the selected model on the reduced featurelist, (3) replace the recommended model with the new model if performance improves on the reduced featurelist, (4) retrain the model on a larger sample size, and (5) replace the recommended model with the larger-sample model if it is more accurate.
GetModelRecommendation(project, type = RecommendedModelType$FastAccurate)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
type |
character. The type of recommendation to retrieve. See
|
A list containing information about the recommended model:
modelId character. The model ID of the recommended model.
projectId character. The project ID of the project the recommendations were made for.
recommendationType character. The type of recommendation being made.
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
GetModelRecommendation(projectId)
## End(Not run)
Multiseries time series projects use multiseries id columns to model multiple distinct series within a single project. This function returns the time series properties (time step and time unit) of this column if it were used as a datetime partition column with the specified multiseries id columns, running multiseries detection automatically if it has not previously been run successfully.
GetMultiSeriesProperties( project, dateColumn, multiseriesIdColumns, crossSeriesGroupByColumns = NULL, maxWait = 600 )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
dateColumn |
character. The name of the column containing the date that defines the time series. |
multiseriesIdColumns |
character. Optional. The Series ID to demarcate the series. If not specified, DataRobot will attempt to automatically infer the series ID. |
crossSeriesGroupByColumns |
character. Optional. Column to split a cross series into
further groups. For example, if every series is sales of an individual product, the cross
series group could be the product category with values like "men's clothing", "sports
equipment", etc. Requires multiseries with |
maxWait |
integer. If a multiseries detection task is run, the maximum amount of time to wait for it to complete before giving up. |
A named list which contains:
timeSeriesEligible logical. Whether or not the series is eligible to be used for time series.
crossSeriesEligible logical. Whether or not the cross series group by column is eligible for cross-series modeling. Will be NULL if no cross series group by column is used.
crossSeriesEligibilityReason character. The type of cross series eligibility (or ineligibility).
timeUnit character. For time series eligible features, the time unit covered by a single time step, e.g. "HOUR", or NULL for features that are not time series eligible.
timeStep integer. Expected difference in time units between rows in the data. Will be NULL for features that are not time series eligible.
Other MultiSeriesProject functions: RequestCrossSeriesDetection(), RequestMultiSeriesDetection(), as.dataRobotMultiSeriesProperties()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
GetMultiSeriesProperties(projectId,
                         dateColumn = "myFeature",
                         multiseriesIdColumns = "Store")
## End(Not run)
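The named list returned by GetMultiSeriesProperties can be checked before setting up a partition. The sketch below uses a hand-built mock with the documented element names; describeSeries is a hypothetical helper, not part of the package, and the values are invented.

```r
# Mock of the documented return value; the values are invented.
props <- list(
  timeSeriesEligible = TRUE,
  crossSeriesEligible = NULL,          # no cross series group by column used
  crossSeriesEligibilityReason = NULL,
  timeUnit = "HOUR",
  timeStep = 6L
)

# Hypothetical helper: summarise the detected time step, or report
# that the column cannot be used for time series modeling.
describeSeries <- function(p) {
  if (!isTRUE(p$timeSeriesEligible)) {
    return("not eligible for time series modeling")
  }
  sprintf("time step of %d %s", p$timeStep, p$timeUnit)
}

describeSeries(props)
describeSeries(list(timeSeriesEligible = FALSE))
```

Using isTRUE guards against a missing or NULL timeSeriesEligible element, which would otherwise make the if condition fail.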
The Eureqa algorithm generates millions and millions of equations. Eureqa takes the best bits from the best initial models and splices them randomly into the next generation. After enough mixing, the models can achieve good accuracy. There are usually many equations at every complexity level, but they aren't exposed. The models that are displayed are the "Pareto-optimal" models. That means that for any given complexity score, it shows the model with the best error metric on the training data out of all the models. After that, for each remaining model, if there is a strictly better model, the strictly worse model is thrown out. The Pareto front consists of those "Pareto-optimal" models that are generated at various complexity scores.
GetParetoFront(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
data.frame with the following components:
projectId character. the id of the project the model belongs to
errorMetric character. Eureqa error-metric identifier used to compute error metrics for this search. Note that Eureqa error metrics do NOT correspond 1:1 with DataRobot error metrics – the available metrics are not the same, and even equivalent metrics may be computed slightly differently.
hyperparameters list. A list of the various hyperparameters that could be used. By default there are none.
targetType character. Indicates what kind of modeling is being done in this project. Options are: "Regression", "Binary" (Binary classification), "Multiclass" (Multiclass classification).
solutions list. List of Pareto points. Every Pareto point contains a dictionary with keys:
eureqaSolutionId character. ID of this solution
complexity numeric. Complexity score for this solution. Complexity score is a function of the mathematical operators used in the current solution. The Complexity calculation can be tuned via model hyperparameters.
error numeric. Error for the current solution, as computed by Eureqa using the "error_metric" error metric.
expression character. String specifying the Eureqa model equation.
expression_annotated character. Eureqa model equation string with variable names tagged for easy identification.
## Not run:
projectId <- "5b2827556523cd05bd1507a5"
modelId <- "5b29406c6523cd0665685a8d"
model <- GetModel(projectId, modelId)
GetParetoFront(model)
## End(Not run)
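The two-step selection described for GetParetoFront (best error per complexity score, then removal of dominated models) can be illustrated on invented data. This is only a sketch of one reading of that description, not Eureqa's actual implementation: the candidate table and the paretoFront helper are hypothetical, and a model is treated as dominated when a simpler model has an error at least as low.

```r
# Hypothetical candidate equations: complexity score and training error.
candidates <- data.frame(
  complexity = c(1, 1, 2, 3, 3, 4),
  error      = c(0.90, 0.80, 0.50, 0.60, 0.55, 0.30)
)

paretoFront <- function(df) {
  # Step 1: keep the lowest error found at each complexity score.
  best <- aggregate(error ~ complexity, data = df, FUN = min)
  best <- best[order(best$complexity), ]
  # Step 2: drop a candidate when some simpler candidate has an
  # error that is at least as low (the simpler one dominates it).
  keep <- vapply(seq_len(nrow(best)), function(i) {
    simplerErrors <- best$error[best$complexity < best$complexity[i]]
    length(simplerErrors) == 0 || all(simplerErrors > best$error[i])
  }, logical(1))
  best[keep, ]
}

front <- paretoFront(candidates)
print(front)
```

In this invented data the complexity 3 candidates are dropped, because the complexity 2 candidate is both simpler and more accurate.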
Retrieve data on a prediction dataset
GetPredictionDataset(project, datasetId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
datasetId |
character. The ID of the prediction dataset. |
Data for a particular prediction dataset:
id character. The unique alphanumeric identifier for the dataset.
numColumns numeric. Number of columns in dataset.
name character. Name of dataset file.
created character. Time of upload.
projectId character. String giving the unique alphanumeric identifier for the project.
numRows numeric. Number of rows in dataset.
forecastPoint. The point relative to which predictions will be generated, based on the forecast window of the project. Only specified in time series projects, otherwise will be NULL.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
datasetId <- "5cd36e6e77a90f79a28ba414"
GetPredictionDataset(projectId, datasetId)
## End(Not run)
A streamlined workflow to both generate and retrieve prediction explanations for a model.
GetPredictionExplanations( model, dataset, maxExplanations = NULL, thresholdLow = NULL, thresholdHigh = NULL, batchSize = NULL, maxWait = 600, excludeAdjustedPredictions = TRUE )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
dataset |
object. Either (1) the prediction dataset object of class
|
maxExplanations |
integer. Optional. The maximum number of prediction explanations to supply per row of the dataset, default: 3. |
thresholdLow |
numeric. Optional. The lower threshold, below which a prediction must
score in order for prediction explanations to be computed for a row in the dataset. If
neither |
thresholdHigh |
numeric. Optional. The high threshold, above which a prediction must score
in order for prediction explanations to be computed. If neither |
batchSize |
integer. Optional. Maximum number of prediction explanations rows to retrieve per request |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
excludeAdjustedPredictions |
logical. Optional. Set to FALSE to include adjusted predictions, which are predictions adjusted by an exposure column. This is only relevant for projects that use an exposure column. |
data frame with following columns:
rowId integer. Row id from prediction dataset.
prediction numeric. The output of the model for this row (numeric prediction for regression problem, predicted class for classification problem).
class1Label character. Label of class 0. Available only for classification problem.
class1Probability numeric. Predicted probability of class 0. Available only for classification problem.
class2Label character. Label of class 1. Available only for classification problem.
class2Probability numeric. Predicted probability of class 1. Available only for classification problem.
explanation1FeatureName character. The name of the feature contributing to the prediction.
explanation1FeatureValue character. The value the feature took on for this row.
explanation1QualitativeStrength character. A human-readable description of how strongly the feature affected the prediction (e.g. '+++', '--', '+').
explanation1Strength numeric. How strongly the feature affected the prediction.
explanation1Label character. Describes what output was driven by this prediction explanation. For regression projects, it is the name of the target feature. For classification projects, it is the class whose probability increasing would correspond to a positive strength of this prediction explanation.
explanationNFeatureName character. The name of the feature contributing to the prediction.
explanationNFeatureValue character. The value the feature took on for this row.
explanationNQualitativeStrength character. A human-readable description of how strongly the feature affected the prediction (e.g. '+++', '--', '+').
explanationNStrength numeric. How strongly the feature affected the prediction.
explanationNLabel character. Describes what output was driven by this prediction explanation. For regression projects, it is the name of the target feature. For classification projects, it is the class whose probability increasing would correspond to a positive strength of this prediction explanation.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
model <- GetModel(projectId, modelId)
GetPredictionExplanations(model, dataset)
## End(Not run)
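The effect of thresholdLow and thresholdHigh in GetPredictionExplanations can be sketched on a plain vector of scores. The helper needsExplanation is hypothetical and runs locally; the real filtering happens on the DataRobot side. Two points are assumptions rather than documented behaviour: a row qualifies when it falls outside either threshold, and every row qualifies when both thresholds are NULL.

```r
# Invented prediction scores for four rows.
predictions <- c(0.05, 0.40, 0.75, 0.95)

# Hypothetical helper: TRUE for rows whose score is below thresholdLow
# or above thresholdHigh.
needsExplanation <- function(pred, thresholdLow = NULL, thresholdHigh = NULL) {
  if (is.null(thresholdLow) && is.null(thresholdHigh)) {
    # Assumption: with no thresholds, every row qualifies.
    return(rep(TRUE, length(pred)))
  }
  belowLow <- if (is.null(thresholdLow)) FALSE else pred < thresholdLow
  aboveHigh <- if (is.null(thresholdHigh)) FALSE else pred > thresholdHigh
  belowLow | aboveHigh
}

needsExplanation(predictions, thresholdLow = 0.1, thresholdHigh = 0.9)
needsExplanation(predictions, thresholdHigh = 0.5)
```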
Prediction explanations initializations are a prerequisite for computing prediction explanations, and include a sample of what the computed prediction explanations for a prediction dataset would look like.
GetPredictionExplanationsInitialization(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
A named list which contains:
projectId character. ID of the project the model belongs to.
modelId character. The unique alphanumeric model identifier.
predictionExplanationsSample list. List with sample of prediction explanations.
Each element of the list is information about prediction explanations for one data row.
For more information see GetPredictionExplanationsRows.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
GetPredictionExplanationsInitialization(model)
## End(Not run)
Prediction explanations initializations are a prerequisite for computing prediction explanations, and include a sample of what the computed prediction explanations for a prediction dataset would look like.
GetPredictionExplanationsInitializationFromJobId(project, jobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
jobId |
integer. Unique integer identifier pointing to the prediction explanations job
(returned for example by |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete |
A named list which contains:
projectId character. ID of the project the model belongs to.
modelId character. The unique alphanumeric model identifier.
predictionExplanationsSample list. List with sample of prediction explanations.
Each element of the list is information about prediction explanations for one data row.
For more information see GetPredictionExplanationsRows.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
jobId <- RequestPredictionExplanationsInitialization(model)
GetPredictionExplanationsInitializationFromJobId(projectId, jobId)
## End(Not run)
Retrieve metadata for specified prediction explanations
GetPredictionExplanationsMetadata(project, predictionExplanationId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictionExplanationId |
character. Id of the prediction explanations. |
A named list which contains prediction explanation metadata:
id character. ID of the record and prediction explanations computation result.
projectId character. ID of the project the model belongs to.
modelId character. ID of the model prediction explanations initialization is for.
datasetId character. ID of the prediction dataset prediction explanations were computed for.
maxExplanations integer. Maximum number of prediction explanations to supply per row of the dataset.
thresholdLow numeric. The low threshold, below which a prediction must score in order for prediction explanations to be computed for a row in the dataset.
thresholdHigh numeric. The high threshold, above which a prediction must score in order for prediction explanations to be computed for a row in the dataset.
numColumns integer. The number of columns prediction explanations were computed for.
finishTime. Numeric timestamp referencing when computation for these prediction explanations finished.
predictionExplanationsLocation character. Where to retrieve the prediction explanations.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
datasetId <- dataset$id
model <- GetModel(projectId, modelId)
jobId <- RequestPredictionExplanations(model, datasetId)
predictionExplanationId <- GetPredictionExplanationsMetadataFromJobId(projectId, jobId)$id
GetPredictionExplanationsMetadata(projectId, predictionExplanationId)
## End(Not run)
Retrieve the prediction explanations metadata for a model using jobId
GetPredictionExplanationsMetadataFromJobId(project, jobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
jobId |
integer. Unique integer identifier (returned for example by
|
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
A named list which contains prediction explanation metadata. For more information see GetPredictionExplanationsMetadata.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
datasetId <- dataset$id
model <- GetModel(projectId, modelId)
jobId <- RequestPredictionExplanations(model, datasetId)
GetPredictionExplanationsMetadataFromJobId(projectId, jobId)
## End(Not run)
Retrieve all prediction explanations rows
GetPredictionExplanationsRows( project, predictionExplanationId, batchSize = NULL, excludeAdjustedPredictions = TRUE )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictionExplanationId |
character. Id of the prediction explanations. |
batchSize |
integer. Optional. Maximum number of prediction explanations rows to retrieve per request |
excludeAdjustedPredictions |
logical. Optional. Set to FALSE to include adjusted predictions, which are predictions adjusted by an exposure column. This is only relevant for projects that use an exposure column. |
List of raw prediction explanations. Each element corresponds to a row of the prediction dataset and has the following components:
rowId. Character string row Id.
prediction. Prediction for the row.
predictionValues. List containing:
label. Describes what this model output corresponds to. For regression projects, it is the name of the target feature. For classification projects, it is a level from the target feature.
value. The output of the prediction. For regression projects, it is the predicted value of the target. For classification projects, it is the predicted probability the row belongs to the class identified by the label.
adjustedPrediction. Adjusted predictions, if they are not excluded.
adjustedPredictionValues. Similar to predictionValues, but for adjusted predictions, if they are not excluded.
predictionExplanations. List containing:
label. Describes what output was driven by this prediction explanation. For regression projects, it is the name of the target feature. For classification projects, it is the class whose probability increasing would correspond to a positive strength of this prediction explanation.
feature. The name of the feature contributing to the prediction.
featureValue. The value the feature took on for this row.
strength. The amount this feature's value affected the prediction.
qualitativeStrength. A human-readable description of how strongly the feature affected the prediction (e.g. '+++', '--', '+').
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
datasetId <- dataset$id
model <- GetModel(projectId, modelId)
jobId <- RequestPredictionExplanations(model, datasetId)
predictionExplanationId <- GetPredictionExplanationsMetadataFromJobId(projectId, jobId)$id
GetPredictionExplanationsRows(projectId, predictionExplanationId)
## End(Not run)
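To show how the nested row structure relates to numbered columns such as explanationNFeatureName, the sketch below flattens one mock row. The row is hand-built from the documented component names with invented values, flattenRow is a hypothetical helper, and the mapping of strength and qualitativeStrength onto the numbered columns is an assumption rather than the package's actual conversion code.

```r
# Mock of one element of the documented return list; values are invented.
rawRow <- list(
  rowId = 0,
  prediction = 0.82,
  predictionExplanations = list(
    list(label = "target", feature = "age", featureValue = 41,
         strength = 0.31, qualitativeStrength = "+++"),
    list(label = "target", feature = "income", featureValue = 52000,
         strength = -0.12, qualitativeStrength = "--")
  )
)

# Hypothetical helper: one data frame row with columns numbered from 1.
flattenRow <- function(row) {
  out <- list(rowId = row$rowId, prediction = row$prediction)
  for (n in seq_along(row$predictionExplanations)) {
    e <- row$predictionExplanations[[n]]
    prefix <- paste0("explanation", n)
    out[[paste0(prefix, "FeatureName")]] <- e$feature
    out[[paste0(prefix, "FeatureValue")]] <- as.character(e$featureValue)
    out[[paste0(prefix, "Strength")]] <- e$strength
    out[[paste0(prefix, "QualitativeStrength")]] <- e$qualitativeStrength
    out[[paste0(prefix, "Label")]] <- e$label
  }
  as.data.frame(out, stringsAsFactors = FALSE)
}

flat <- flattenRow(rawRow)
names(flat)
```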
There are some groups of columns whose appearance depends on the exact contents of the project dataset. For classification projects, columns "classNLabel" and "classNProbability" will appear corresponding to each class within the target; these columns will not appear for regression projects. Columns like "explanationNLabel" will appear corresponding to each included prediction explanation in the row. In both cases, the value of N will start at 1 and count up.
GetPredictionExplanationsRowsAsDataFrame( project, predictionExplanationId, excludeAdjustedPredictions = TRUE, batchSize = NULL )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictionExplanationId |
character. Id of the prediction explanations. |
excludeAdjustedPredictions |
logical. Optional. Set to FALSE to include adjusted predictions, which are predictions adjusted by an exposure column. This is only relevant for projects that use an exposure column. |
batchSize |
integer. Optional. Maximum number of prediction explanations rows to retrieve per request |
data frame with following columns:
rowId integer. Row id from prediction dataset.
prediction numeric. The output of the model for this row (numeric prediction for regression problem, predicted class for classification problem).
class1Label character. Label of class 0. Available only for classification problem.
class1Probability numeric. Predicted probability of class 0. Available only for classification problem.
class2Label character. Label of class 1. Available only for classification problem.
class2Probability numeric. Predicted probability of class 1. Available only for classification problem.
explanation1FeatureName character. The name of the feature contributing to the prediction.
explanation1FeatureValue character. The value the feature took on for this row.
explanation1QualitativeStrength character. A human-readable description of how strongly the feature affected the prediction (e.g. '+++', '--', '+').
explanation1Strength numeric. How strongly the feature affected the prediction.
explanation1Label character. Describes what output was driven by this prediction explanation. For regression projects, it is the name of the target feature. For classification projects, it is the class whose probability increasing would correspond to a positive strength of this prediction explanation.
explanationNFeatureName character. The name of the feature contributing to the prediction.
explanationNFeatureValue character. The value the feature took on for this row.
explanationNQualitativeStrength character. A human-readable description of how strongly the feature affected the prediction (e.g. '+++', '--', '+').
explanationNStrength numeric. How strongly the feature affected the prediction.
explanationNLabel character. Describes what output was driven by this prediction explanation. For regression projects, it is the name of the target feature. For classification projects, it is the class whose probability increasing would correspond to a positive strength of this prediction explanation.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
datasetId <- dataset$id
model <- GetModel(projectId, modelId)
jobId <- RequestPredictionExplanations(model, datasetId)
predictionExplanationId <- GetPredictionExplanationsMetadataFromJobId(projectId, jobId)$id
GetPredictionExplanationsRowsAsDataFrame(projectId, predictionExplanationId)
## End(Not run)
This function can be used to retrieve predictions from a project and either (1) a predictionId specifying the ID for the predictions desired (use ListPredictions to see available predictionIds for individual prediction sets) or (2) a predictionJobId that comes from a call to RequestPredictions. This function will then return the predictions generated for the model and data.
GetPredictions( project, predictId, type = "response", classPrefix = "class_", maxWait = 600 )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictId |
character or integer. Either can be the character id of the
|
type |
character. String specifying the type of response for binary classifiers; see Details. |
classPrefix |
character. For multiclass projects returning prediction probabilities, this prefix is prepended to each class in the header of the dataframe. Defaults to "class_". |
maxWait |
integer. The maximum time (in seconds) to wait for the prediction job to complete. |
The contents of the return vector depend on the modeling task (binary classification, multiclass classification, or regression); on whether the underlying data is time series, multiseries, cross-series, or not time series; and on the value of the 'type' parameter. For non-time-series regression tasks, the type parameter is ignored and a vector of numerical predictions of the response variable is returned.
For binary classification tasks, either a vector of predicted responses is returned if type has the value response (the default), or a vector of probabilities for the positive class is returned, if type is probability. You can also fetch the raw dataframe of prediction values using raw.
For multiclass classification tasks, response will return the predicted class and probability will return the probability of each class.
For time series tasks, 'type = "raw"' will return more detailed information on the time series prediction. This will also include any prediction intervals if requested.
This function will error if the requested job has errored or if it has not completed within maxWait seconds.
Vector of predictions, depending on the modeling task ("Binary", "Multiclass", or "Regression") and the value of the type parameter; see Details.
## Not run:
# Retrieve by predictJobID
dataset <- UploadPredictionDataset(project, diamonds_small)
model <- ListModels(project)[[1]]
modelId <- model$modelId
predictJobId <- RequestPredictions(project, modelId, dataset$id)
predictions <- GetPredictions(project, predictJobId)
# Retrieve by predictionID
predictions <- ListPredictions(project)
predictions <- GetPredictions(project, predictions$predictionId[[1]])
## End(Not run)
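For multiclass projects, the relationship between the probability output (one column per class, named with classPrefix) and the response output can be sketched on an invented table. The data frame below is a mock, not real GetPredictions output, and treating the response as the class with the highest probability is an assumption made for illustration.

```r
classPrefix <- "class_"

# Mock probability table with one prefixed column per class (invented values).
probs <- data.frame(
  class_a = c(0.7, 0.2),
  class_b = c(0.2, 0.5),
  class_c = c(0.1, 0.3)
)

# Locate the probability columns by their prefix.
probCols <- grep(paste0("^", classPrefix), names(probs), value = TRUE)

# Assumption: the predicted class is the one with the highest probability.
winner <- max.col(as.matrix(probs[probCols]), ties.method = "first")
response <- sub(paste0("^", classPrefix), "", probCols[winner])
response
```

Stripping the prefix with sub recovers the original class labels from the column headers.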
Request information about a predict job
GetPredictJob(project, predictJobId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictJobId |
Character string specifying the job id |
list with following elements:
Prediction job status; an element of JobStatus, e.g. JobStatus$Queue
Character string specifying the job id
Character string specifying the model from which predictions have been requested
Character string specifying the project that contains the model
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
initialJobs <- GetPredictJobs(projectId)
job <- initialJobs[[1]]
predictJobId <- job$predictJobId
GetPredictJob(projectId, predictJobId)
## End(Not run)
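A caller that checks a job status repeatedly usually wants a timeout. The sketch below shows a generic poll-with-timeout loop of that kind. It does not call DataRobot: waitForJob and makeFakeJob are hypothetical helpers, and the status strings "inprogress", "completed" and "error" are placeholders rather than the actual JobStatus values.

```r
# Hypothetical helper: call getStatus() until it reports completion,
# stopping with an error on failure or once maxWait seconds have passed.
waitForJob <- function(getStatus, maxWait = 600, pollInterval = 1) {
  deadline <- Sys.time() + maxWait
  repeat {
    status <- getStatus()
    if (identical(status, "error")) stop("job errored")
    if (identical(status, "completed")) return(invisible(status))
    if (Sys.time() >= deadline) {
      stop("job did not complete within maxWait seconds")
    }
    Sys.sleep(pollInterval)
  }
}

# Stand-in status function: "inprogress" twice, then "completed".
makeFakeJob <- function() {
  calls <- 0
  function() {
    calls <<- calls + 1
    if (calls < 3) "inprogress" else "completed"
  }
}

result <- waitForJob(makeFakeJob(), maxWait = 5, pollInterval = 0.01)
```

In real use, getStatus could wrap a call such as GetPredictJob and return its status element.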
Function to list all prediction jobs in a project
GetPredictJobs(project, status = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
status |
character. The status of the desired jobs: one of JobStatus$Queue, JobStatus$InProgress, or JobStatus$Error. If NULL (default), queued and in-progress jobs are returned. |
Dataframe with one row for each prediction job in the queue, with the following columns:
Prediction job status; one of JobStatus$Queue, JobStatus$InProgress, or JobStatus$Error
Character string specifying the job id
Character string specifying the model from which predictions have been requested
Character string specifying the project that contains the model
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
GetPredictJobs(projectId)
## End(Not run)
Check if model can be approximated with DataRobot Prime
GetPrimeEligibility(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
list with two members:
canMakePrime logical. TRUE if the model can be approximated using DataRobot Prime, FALSE if it cannot.
message character. Explains why the model may not be approximated with DataRobot Prime.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
GetPrimeEligibility(projectId, modelId)
## End(Not run)
This function returns information about a specified Prime file from a specified project.
GetPrimeFile(project, primeFileId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
primeFileId |
numeric. Unique alphanumeric identifier for the primeFile to be retrieved. |
List with the following elements:
language. Character string. Code programming language.
isValid. Logical flag indicating if code passed validation.
rulesetId. Integer identifier for the ruleset.
parentModelId. Unique alphanumeric identifier for the parent model.
projectId. Unique alphanumeric identifier for the project.
id. Unique alphanumeric identifier for the Prime file.
modelId. Unique alphanumeric identifier for the model.
## Not run: projectId <- "59a5af20c80891534e3c2bde" primeFiles <- ListPrimeFiles(projectId) primeFile <- primeFiles[[1]] primeFileId <- primeFile$id GetPrimeFile(projectId, primeFileId) ## End(Not run)
Retrieve a specific Prime file from a DataRobot project for the corresponding jobId.
GetPrimeFileFromJobId(project, jobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
jobId |
numeric. Unique integer identifier (return for example by |
maxWait |
numeric. Maximum time to wait (in seconds) for the job to complete. |
List with the following elements:
language. Character string. Code programming language.
isValid. Logical flag indicating if code passed validation.
rulesetId. Integer identifier for the ruleset.
parentModelId. Unique alphanumeric identifier for the parent model.
projectId. Unique alphanumeric identifier for the project.
id. Unique alphanumeric identifier for the Prime file.
modelId. Unique alphanumeric identifier for the model.
## Not run: projectId <- "59a5af20c80891534e3c2bde" initialJobs <- ListModelJobs(projectId) job <- initialJobs[[1]] modelJobId <- job$modelJobId GetPrimeFileFromJobId(projectId, modelJobId) ## End(Not run)
This function requests the DataRobot Prime model information for the DataRobot project specified by the project argument, and modelId.
GetPrimeModel(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
The function returns a list containing information about the specified DataRobot Prime model.
list (classed as dataRobotPrimeModel) containing information about the specified DataRobot Prime model.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" GetPrimeModel(projectId, modelId) ## End(Not run)
Retrieve information about a specified DataRobot Prime model using the corresponding jobId.
GetPrimeModelFromJobId(project, jobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
jobId |
Unique integer identifier (returned, for example, by RequestPrimeModel). |
maxWait |
Maximum time to wait (in seconds) for the job to complete. |
list (classed as dataRobotPrimeModel) containing information about the specified DataRobot Prime model.
## Not run: projectId <- "59a5af20c80891534e3c2bde" initialJobs <- ListModelJobs(projectId) job <- initialJobs[[1]] modelJobId <- job$modelJobId GetPrimeModelFromJobId(projectId, modelJobId) ## End(Not run)
Returns a list of details about the DataRobot modeling project specified by project.
GetProject(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
An S3 object of class 'dataRobotProject', consisting of the following elements:
projectId. Character string giving the unique project identifier.
projectName. Character string giving the name assigned to the project.
fileName. Character string giving the name of the modeling dataset for the project.
stage. Character string describing the stage of the DataRobot Autopilot.
autopilotMode. Numeric: 0 for fully automatic mode; 1 for semi-automatic mode; 2 for manual mode.
created. Character string representation of the project creation time and date.
target. Name of the target variable from fileName.
metric. Character string specifying the metric optimized by all project models.
partition. A 7-element list describing the data partitioning for model fitting and cross validation.
advancedOptions. A 4-element list with advanced option specifications.
positiveClass. Character string: name of positive class for binary response models.
maxTrainPct. The maximum percentage of the project dataset that can be used without going into the validation data or being too large to submit any blueprint for training a project.
maxTrainRows. The maximum number of rows that can be trained on without going into the validation data or being too large to submit any blueprint for training.
holdoutUnlocked. A logical flag indicating whether the holdout dataset has been used for model evaluation.
targetType. Character string specifying the type of modeling problem (e.g., regression or binary classification).
## Not run: projectId <- "59a5af20c80891534e3c2bde" GetProject(projectId) ## End(Not run)
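Because autopilotMode is returned as a number, a small lookup can make reports easier to read. This sketch encodes only the three values documented above; the helper name DescribeAutopilotMode and the mock project list are hypothetical.

```r
# Map the numeric autopilotMode documented above to a readable label.
DescribeAutopilotMode <- function(autopilotMode) {
  labels <- c("fully automatic", "semi-automatic", "manual")
  idx <- as.integer(autopilotMode) + 1L
  if (length(idx) != 1L || is.na(idx) || idx < 1L || idx > length(labels)) {
    return("unknown")
  }
  labels[idx]
}

# Hypothetical stand-in for a project object returned by GetProject().
mockProject <- list(projectId = "59a5af20c80891534e3c2bde", autopilotMode = 2)
modeLabel <- DescribeAutopilotMode(mockProject$autopilotMode)
```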
This function polls the DataRobot Autopilot for the status of the project specified by the project parameter.
GetProjectStatus(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
List with the following three components:
autopilotDone. Logical flag indicating whether the Autopilot has completed.
stage. Character string specifying the Autopilot stage.
stageDescription. Character string interpreting the Autopilot stage value.
## Not run: projectId <- "59a5af20c80891534e3c2bde" GetProjectStatus(projectId) ## End(Not run)
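Since this function polls for status, a caller may want to repeat the call until the Autopilot reports completion. The loop below is a sketch under stated assumptions: the element name autopilotDone, the helper name WaitForAutopilotSketch, and the stub status function are all illustrative, and the stub stands in for a closure around GetProjectStatus(projectId).

```r
# Poll a status function until it reports completion or maxWait elapses.
# getStatus is any zero-argument function returning a list with a logical
# element autopilotDone (element name assumed for this sketch).
WaitForAutopilotSketch <- function(getStatus, maxWait = 600, interval = 5) {
  deadline <- Sys.time() + maxWait
  repeat {
    status <- getStatus()
    if (isTRUE(status$autopilotDone)) {
      return(status)
    }
    if (Sys.time() >= deadline) {
      stop("Autopilot did not finish within ", maxWait, " seconds")
    }
    Sys.sleep(interval)
  }
}

# Stub that reports completion on the third call; the stage value is a
# placeholder, not a confirmed server response.
calls <- 0
stubStatus <- function() {
  calls <<- calls + 1
  list(autopilotDone = calls >= 3, stage = "modeling")
}
finalStatus <- WaitForAutopilotSketch(stubStatus, maxWait = 10, interval = 0)
```

Passing the status function as an argument keeps the loop testable without a server connection.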
Retrieve a single rating table.
GetRatingTable(project, ratingTableId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
ratingTableId |
character. The ID of the rating table. |
An S3 object of class 'dataRobotRatingTable' summarizing all available information about the rating table.
## Not run: projectId <- "5984b4d7100d2b31c1166529" ratingTableId <- "5984b4d7100d2b31c1166529" GetRatingTable(projectId, ratingTableId) ## End(Not run)
Get a rating table from the rating table job metadata.
GetRatingTableFromJobId(project, ratingTableJobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
ratingTableJobId |
integer. The job ID returned by |
maxWait |
integer. The maximum time (in seconds) to wait for the retrieval to complete. |
An S3 object of class 'dataRobotRatingTable' summarizing all available information about the rating table.
## Not run: projectId <- "5984b4d7100d2b31c1166529" modelId <- "5984b4d7100d2b31c1166529" ratingTableJobId <- CreateRatingTable(projectId, modelId, dataSource = "myRatingTable.csv") GetRatingTableFromJobId(projectId, ratingTableJobId) ## End(Not run)
Retrieve information about a specified model with a rating table.
GetRatingTableModel(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
list containing information about the specified model with a rating table.
## Not run: projectId <- "5984b4d7100d2b31c1166529" modelId <- "5984b4d7100d2b31c1166529" GetRatingTableModel(projectId, modelId) ## End(Not run)
Retrieve a new or updated rating table model defined by a job ID.
GetRatingTableModelFromJobId(project, ratingTableModelJobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
ratingTableModelJobId |
integer. The ID returned by |
maxWait |
integer. The maximum time (in seconds) to wait for the retrieval to complete. |
An S3 object of class 'dataRobotRatingTableModel' summarizing all available information about the model.
## Not run: projectId <- "59a5af20c80891534e3c2bde" ratingTableId <- "5984b4d7100d2b31c1166529" ratingTableModelJobId <- RequestNewModel(projectId, ratingTableId) GetRatingTableModelFromJobId(projectId, ratingTableModelJobId) ## End(Not run)
See GetModelRecommendation for details.
GetRecommendedModel(project, type = RecommendedModelType$FastAccurate)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
type |
character. The type of recommendation to retrieve. See
|
The model object corresponding to that recommendation.
## Not run: projectId <- "5984b4d7100d2b31c1166529" GetRecommendedModel(projectId) ## End(Not run)
Retrieve residuals chart data for a model for a data partition (see DataPartition).
GetResidualsChart( model, source = DataPartition$VALIDATION, fallbackToParentInsights = FALSE )
model |
dataRobotModel. A DataRobot model object like that returned by |
source |
character. The data partition for which data would be returned. Default is
|
fallbackToParentInsights |
logical. If TRUE, this will return the residuals chart data for the model's parent if the residuals chart is not available for the model and the model has a parent model. |
list with a single object containing residuals chart data whose name matches the source requested. See DataPartition for details. This object has the following components:
residualMean. Numeric: the arithmetic mean of the predicted value minus the actual value over the downsampled dataset.
coefficientOfDetermination. Numeric: aka the r-squared value. This value is calculated over the downsampled output, not the full input.
data. data.frame: The rows of chart data in [actual, predicted, residual, rowNumber] form. If the row number was not available at the time of model creation, or if working with DataRobot 5.2, which does not provide rowNumber in the API response, the rowNumber will be NA.
histogram. list: Data to plot a histogram of residual values. Each object contains:
intervalStart. Numeric: Start value for an interval, inclusive.
intervalEnd. Numeric: End value for an interval, exclusive for all but the last interval.
occurrences. Integer: the number of times the predicted value fits within the interval.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) GetResidualsChart(model, source = DataPartition$VALIDATION) ## End(Not run)
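The components above can be related to one another with a few lines of arithmetic. The sketch below recomputes a residual mean, an r-squared value, and histogram counts from hand-built chart rows. The r-squared formula is the textbook definition and is assumed here; the platform's exact computation is not confirmed by this documentation, and the data and interval boundaries are invented for illustration.

```r
# Hypothetical chart rows in the [actual, predicted] form described above.
chartData <- data.frame(actual = c(1, 2, 3, 4),
                        predicted = c(1.5, 2, 2.5, 4.5))
chartData$residual <- chartData$predicted - chartData$actual

# Mean residual: predicted minus actual, averaged over the rows.
residualMean <- mean(chartData$residual)

# R-squared as 1 - SS_res / SS_tot (textbook definition; assumed here).
ssRes <- sum(chartData$residual^2)
ssTot <- sum((chartData$actual - mean(chartData$actual))^2)
rSquared <- 1 - ssRes / ssTot

# Histogram counts: intervals are [start, end), except the last, which
# also includes its end value.
breaks <- c(-0.5, 0, 0.5)
bin <- findInterval(chartData$residual, breaks, rightmost.closed = TRUE)
occurrences <- tabulate(bin, nbins = length(breaks) - 1)
```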
Retrieve ROC curve data for a model for a particular data partition (see DataPartition)
GetRocCurve( model, source = DataPartition$VALIDATION, fallbackToParentInsights = FALSE )
model |
dataRobotModel. A DataRobot model object like that returned by |
source |
character. The data partition for which data would be returned. Default is
|
fallbackToParentInsights |
logical. If TRUE, this will return the ROC curve data for the model's parent if the ROC curve is not available for the model and the model has a parent model. |
list with the following components:
source. Character: data partition for which ROC curve data is returned (see DataPartition).
negativeClassPredictions. Numeric: example predictions for the negative class.
rocPoints. data.frame: each row represents pre-calculated metrics (accuracy, f1_score, false_negative_score, true_negative_score, true_positive_score, false_positive_score, true_negative_rate, false_positive_rate, true_positive_rate, matthews_correlation_coefficient, positive_predictive_value, negative_predictive_value, threshold) associated with different thresholds for the ROC curve.
positiveClassPredictions. Numeric: example predictions for the positive class.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) GetRocCurve(model) ## End(Not run)
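The rocPoints data frame can be used to choose an operating threshold or to approximate the area under the curve. The following sketch uses a hand-built data frame with only four of the documented columns; the values and the helper name TrapezoidAuc are hypothetical, and the trapezoidal rule is a generic approximation, not necessarily how DataRobot reports AUC.

```r
# Hypothetical rocPoints rows; only the columns used below are mocked.
rocPoints <- data.frame(
  threshold = c(1.0, 0.5, 0.0),
  false_positive_rate = c(0.0, 0.5, 1.0),
  true_positive_rate = c(0.0, 1.0, 1.0),
  f1_score = c(0.0, 0.8, 0.67)
)

# Threshold with the highest F1 score.
bestThreshold <- rocPoints$threshold[which.max(rocPoints$f1_score)]

# Area under the curve by the trapezoidal rule over points sorted by FPR.
TrapezoidAuc <- function(fpr, tpr) {
  ord <- order(fpr, tpr)
  fpr <- fpr[ord]
  tpr <- tpr[ord]
  sum(diff(fpr) * (head(tpr, -1) + tail(tpr, -1)) / 2)
}
auc <- TrapezoidAuc(rocPoints$false_positive_rate,
                    rocPoints$true_positive_rate)
```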
This function returns the list of rulesets that can be used to approximate the specified model. Rulesets are created using the RequestApproximation function. If the model has not been approximated yet, an empty list is returned.
GetRulesets(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
Unique alphanumeric identifier for the model of interest. |
A list of lists with one element for each ruleset. If no rulesets have been created for the model, an empty list is returned. Otherwise, each element is a list with the following elements:
projectId. Character string giving the unique identifier for the project.
rulesetId. Integer number giving the identifier for the ruleset.
score. Score of ruleset (using project leaderboard metric).
parentModelId. Character string giving the unique identifier for the parent model.
ruleCount. integer: number of rules in ruleset.
modelId. Character string giving the unique identifier for a model using the ruleset. May be NULL if no model using the ruleset has been created yet.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" GetRulesets(projectId, modelId) ## End(Not run)
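A typical next step is to choose one ruleset from the returned list, trading score against rule count. The sketch below works on a hand-built list with the element names documented above; the helper name PickRuleset, the maxRules and lowerIsBetter arguments, and the mock scores are assumptions. Whether a lower score is better depends on the project metric, so that choice is left to the caller.

```r
# Hypothetical stand-in for the list of lists returned by GetRulesets().
mockRulesets <- list(
  list(rulesetId = 1L, score = 0.42, ruleCount = 5L, modelId = NULL),
  list(rulesetId = 2L, score = 0.35, ruleCount = 40L, modelId = NULL),
  list(rulesetId = 3L, score = 0.38, ruleCount = 12L, modelId = NULL)
)

# Pick the best-scoring ruleset with at most maxRules rules. Returns NULL
# when no ruleset satisfies the rule-count limit.
PickRuleset <- function(rulesets, maxRules, lowerIsBetter = TRUE) {
  keep <- Filter(function(r) r$ruleCount <= maxRules, rulesets)
  if (length(keep) == 0) {
    return(NULL)
  }
  scores <- vapply(keep, function(r) r$score, numeric(1))
  keep[[if (lowerIsBetter) which.min(scores) else which.max(scores)]]
}

chosen <- PickRuleset(mockRulesets, maxRules = 20)
```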
Get the computed series accuracy for a model, computing it if not already computed.
GetSeriesAccuracy(model, maxWait = 600)
model |
character. The model for which you want to compute series accuracy, e.g.
from the list of models returned by |
maxWait |
integer. How long (in seconds) to wait for series accuracy computation before raising a timeout error? Default 600. |
data.frame with items:
multiseriesId character. The ID of the series.
rowCount integer. The number of rows in the series.
multiseriesValues character. The name of the series.
duration character. The duration of the series.
validationScore numeric. The validation score for the series.
backtestingScore numeric. The score on backtests for the series. See ScoreBacktests.
holdoutScore numeric. The score for the series on the holdout set.
## Not run: projectId <- "5984b4d7100d2b31c1166529" modelId <- "5984b4d7100d2b31c1166529" model <- GetModel(projectId, modelId) seriesAccuracy <- GetSeriesAccuracy(model) ## End(Not run)
This will not work if you have not separately computed series accuracy via RequestSeriesAccuracy. See GetSeriesAccuracy for a function that will get series accuracy and also compute it automatically if it has not already been computed.
GetSeriesAccuracyForModel(model)
model |
character. The model for which you want to retrieve series accuracy, e.g.
from the list of models returned by |
data.frame with items:
multiseriesId character. The ID of the series.
rowCount integer. The number of rows in the series.
multiseriesValues character. The name of the series.
duration character. The duration of the series.
validationScore numeric. The validation score for the series.
backtestingScore numeric. The score on backtests for the series. See ScoreBacktests.
holdoutScore numeric. The score for the series on the holdout set.
## Not run: projectId <- "5984b4d7100d2b31c1166529" modelId <- "5984b4d7100d2b31c1166529" model <- GetModel(projectId, modelId) jobId <- RequestSeriesAccuracy(projectId, modelId) WaitForJobToComplete(projectId, jobId) seriesAccuracy <- GetSeriesAccuracyForModel(model) ## End(Not run)
Handle server side pagination.
GetServerDataInRows(serverData, batchSize = 50)
serverData |
list. Raw JSON parsed list returned from the server. |
batchSize |
integer. The number of requests per page to expect. |
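The general idea of server-side pagination can be sketched as a loop that follows a link to the next page until none remains. This is not the package's implementation: the response shape (a data element plus a next element), the helper name CollectAllRows, and the fetchPage stub are all assumptions made for illustration.

```r
# Hypothetical two-page server response keyed by URL; fetchPage stands in
# for an HTTP GET that returns the parsed JSON of one page.
pages <- list(
  "page1" = list(data = list(list(id = 1), list(id = 2)), `next` = "page2"),
  "page2" = list(data = list(list(id = 3)), `next` = NULL)
)
fetchPage <- function(url) pages[[url]]

# Follow `next` links, accumulating the rows from every page.
CollectAllRows <- function(firstPage, fetchPage) {
  rows <- firstPage$data
  page <- firstPage
  while (!is.null(page[["next"]])) {
    page <- fetchPage(page[["next"]])
    rows <- c(rows, page$data)
  }
  rows
}

allRows <- CollectAllRows(pages[["page1"]], fetchPage)
```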
The time series feature derivation log provides details about the feature generation process for a time series project. It includes information about which features are generated and their priority, as well as the detected properties of the time series data such as whether the series is stationary, and periodicities detected.
GetTimeSeriesFeatureDerivationLog(project, offset = NULL, limit = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
offset |
integer. Optional. Default is 0. This many results will be skipped. |
limit |
integer. Optional. Defaults to 100. At most this many results are returned. To specify no limit, use 0. The default may change without notice. |
This route is only supported for time series projects that have finished partitioning. The time series feature log will include information about:
Detected stationarity of the series (e.g. "Series detected as non-stationary")
Detected presence of multiplicative trend in the series (e.g., "Multiplicative trend detected")
Any periodicities (e.g., "Detected periodicities: 7 day")
Maximum number of features to be generated (e.g., "Maximum number of feature to be generated is 1440")
Window sizes used in rolling statistics / lag extractors (e.g., "The window sizes chosen to be: 2 months", because the time step is 1 month and the Feature Derivation Window is 2 months)
Features that are specified as known-in-advance (e.g., "Variables treated as known in advance: holiday")
Details about why certain variables are transformed in the input data (e.g., "Generating variable "y (log)" from "y" because multiplicative trend is detected")
Details about features generated as time series features, and their priority (e.g., "Generating feature "date (actual)" from "date" (priority: 1)")
Returns the feature log output.
## Not run: projectId <- "5984b4d7100d2b31c1166529" GetTimeSeriesFeatureDerivationLog(projectId) ## End(Not run)
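The offset and limit semantics described above (skip this many results, return at most this many, with 0 meaning no limit) can be illustrated locally. The helper name PageLogLines is hypothetical and operates on an in-memory character vector, not on the server; the sample lines are taken from the message examples quoted above.

```r
# Apply offset/limit paging to a vector of log lines. A limit of 0 means
# "no limit", mirroring the parameter description above.
PageLogLines <- function(lines, offset = 0, limit = 100) {
  if (offset >= length(lines)) {
    return(character(0))
  }
  remaining <- lines[(offset + 1):length(lines)]
  if (limit == 0) {
    return(remaining)
  }
  head(remaining, limit)
}

# Hypothetical log lines modelled on the examples quoted above.
logLines <- c("Series detected as non-stationary",
              "Multiplicative trend detected",
              "Detected periodicities: 7 day")
secondLine <- PageLogLines(logLines, offset = 1, limit = 1)
```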
Simplify the training prediction rows into a tidy-format data frame.
GetTrainingPredictionDataFrame(rows)
rows |
data.frame. The dataframe to tidy. |
Training predictions are the internal out-of-fold predictions for data that was used to train the model. These predictions are especially useful for creating stacked models or blenders.
GetTrainingPredictions(project, predictionId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
predictionId |
character. ID of the prediction to retrieve training predictions for. |
## Not run: projectId <- "59a5af20c80891534e3c2bde" predictions <- ListTrainingPredictions(projectId) predictionId <- predictions[[1]]$id trainingPredictions <- GetTrainingPredictions(projectId, predictionId) ## End(Not run)
Training predictions are the internal out-of-fold predictions for data that was used to train the model. These predictions are especially useful for creating stacked models or blenders.
GetTrainingPredictionsForModel(model, dataSubset = "all", maxWait = 600)
model |
dataRobotModel. The model to get training predictions for. |
dataSubset |
character. What data subset would you like to predict on?
Possible options are included in
|
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) trainingPredictions <- GetTrainingPredictionsForModel(model) ## End(Not run)
Retrieve the training predictions for a model using a job id.
GetTrainingPredictionsFromJobId(project, jobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
jobId |
integer. Unique integer identifier (return for example by
|
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
A dataframe with out-of-fold predictions for the training data.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) jobId <- RequestTrainingPredictions(model, dataSubset = "all") trainingPredictions <- GetTrainingPredictionsFromJobId(projectId, jobId) ## End(Not run)
Retrieve imported model info using the import id.
GetTransferableModel(importId)
importId |
character. Id of the import. |
A list describing the uploaded transferable model with the following components:
note. Character string. Manually added note about this imported model.
datasetName. Character string. Filename of the dataset used to create the project the model belonged to.
modelName. Character string. Model type describing the model generated by DataRobot.
displayName. Character string. Manually specified human-readable name of the imported model.
target. Character string. The target of the project the model belonged to prior to export.
projectName. Character string. Name of the project the model belonged to prior to export.
importedByUsername. Character string. Username of the user who imported the model.
importedAt. Character string. The time the model was imported.
version. Numeric. Project version of the project the model belonged to.
projectId. Character string. ID of the project the model belonged to prior to export.
featurelistName. Character string. Name of the featurelist used to train the model.
createdByUsername. Character string. Username of the user who created the model prior to export.
importedById. Character string. ID of the user who imported the model.
id. Character string. ID of the import.
createdById. Character string. ID of the user who created the model prior to export.
modelId. Character string. Original ID of the model prior to export.
originUrl. Character string. URL.
Other Transferable Model functions: DeleteTransferableModel(), DownloadTransferableModel(), ListTransferableModels(), RequestTransferableModel(), UpdateTransferableModel(), UploadTransferableModel()
## Not run: id <- UploadTransferableModel("model.drmodel") GetTransferableModel(id) ## End(Not run)
Retrieve data on tuning parameters for a particular model.
GetTuningParameters(model)
model |
dataRobotModel. A DataRobot model object to get tuning parameters for. |
A list detailing the following about each tuning parameter:
currentValue character. The current searched values of that parameter.
defaultValue character. The default value of that parameter.
parameterId character. A unique ID for that particular parameter.
parameterName character. The name of the tuning parameter.
taskName character. The name of the task the parameter is for.
constraints list. A list describing constraints on the possible values for the parameter. Will be one of int or float specifying a min and max value, or will be select and will specify possible values from a list of choices. int and float correspond with integer and floating-point parameter spaces respectively. It is possible for a parameter to be multiple types. Lastly, some parameters will also have a supportsGridSearch logical indicating whether that parameter can be grid searched.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" model <- GetModel(projectId, modelId) GetTuningParameters(model) ## End(Not run)
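The constraints element can be used to check a candidate value before submitting a tuning request. The sketch below assumes a particular nesting (int and float entries holding min and max, and a select entry holding values); that shape, the helper name IsValidTuningValue, and the mock constraints are illustrative and should be checked against a real response.

```r
# Check a candidate value against a constraints list. The nesting used here
# (int/float with min and max, select with values) is an assumed shape.
IsValidTuningValue <- function(value, constraints) {
  for (type in c("int", "float")) {
    range <- constraints[[type]]
    if (!is.null(range) && is.numeric(value) &&
        value >= range$min && value <= range$max &&
        (type == "float" || value == round(value))) {
      return(TRUE)
    }
  }
  choices <- constraints$select$values
  !is.null(choices) && value %in% choices
}

# Hypothetical constraints: an integer range plus a categorical choice,
# reflecting that a parameter may accept multiple types.
mockConstraints <- list(int = list(min = 1, max = 10),
                        select = list(values = c("auto")))
```

A value is accepted if it satisfies any one of the listed constraint types.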
For the response variable defined by the character string target and the project defined by the parameter project, return the vector of metric names that can be specified for fitting models in this project. This function is intended for use after SetupProject has been run but before SetTarget, allowing the user to specify valid non-default values for the metric parameter.
GetValidMetrics(project, target)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
target |
character. String giving the name of the response variable to be predicted by all project models. |
Character vector containing the names of the metric values that are valid for a subsequent call to the SetTarget function.
## Not run: projectId <- "59a5af20c80891534e3c2bde" GetValidMetrics(projectId, "targetFeature") ## End(Not run)
Retrieve word cloud data for a model.
GetWordCloud(project, modelId, excludeStopWords = FALSE)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
excludeStopWords |
logical. Optional. Set to TRUE if you want stopwords filtered out of the response. |
data.frame with the following components:
ngram. character. Word or ngram value.
coefficient. numeric. Value from the [-1.0, 1.0] range that describes the effect of this ngram on the target. A large negative value means a strong effect toward the negative class in classification projects and a smaller predicted target value in regression projects. A large positive value means a strong effect toward the positive class and a larger predicted target value respectively.
frequency. numeric. Value from the (0.0, 1.0] range giving the frequency of this ngram relative to the most frequent ngram.
count. integer. Number of rows in the training sample where this ngram appears.
isStopword. logical. TRUE for ngrams that DataRobot evaluates as stopwords.
variable. character. Optional. Added in DataRobot API 2.19. String representation of the ngram source. Contains the column name and, for some models, preprocessing details. For example, 'NGRAM_OCCUR_L2_cname' represents the ngram occurrences count using L2 normalization from the cname column.
class. character. Optional. Added in DataRobot API 2.19. Values of the target class for the corresponding word or ngram. For regression, NA.
## Not run: projectId <- "59a5af20c80891534e3c2bde" modelId <- "5996f820af07fc605e81ead4" GetWordCloud(projectId, modelId) ## End(Not run)
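A common use of the word cloud data is to list the ngrams with the strongest effect on the target. This sketch works on a hand-built data frame; the column names ngram, coefficient, and isStopword are assumed for illustration, as are the values and the helper name TopNgrams.

```r
# Hypothetical word cloud rows; column names are assumed for this sketch.
mockCloud <- data.frame(
  ngram = c("the", "refund", "great", "late"),
  coefficient = c(0.01, -0.80, 0.65, -0.40),
  isStopword = c(TRUE, FALSE, FALSE, FALSE),
  stringsAsFactors = FALSE
)

# Drop stopwords, then keep the n ngrams with the largest absolute effect.
TopNgrams <- function(cloud, n = 2) {
  kept <- cloud[!cloud$isStopword, ]
  kept <- kept[order(-abs(kept$coefficient)), ]
  head(kept$ngram, n)
}

topNgrams <- TopNgrams(mockCloud)
```

Sorting on the absolute coefficient keeps strongly negative ngrams alongside strongly positive ones.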
Request anomaly assessment insight computation on the specified subset.
InitializeAnomalyAssessment( projectId, modelId, backtest, source, seriesId = NULL )
projectId |
character. The ID of the project to compute insight for. |
modelId |
character. The ID of the model to compute insight for. |
backtest |
integer or "holdout". The backtest to compute insight for. |
source |
"training" or "validation". The source to compute insight for. |
seriesId |
character. Optional. The series id to compute insight for. Required for multiseries projects. |
An object with anomaly assessment metadata:
recordId. character. The ID of the record.
projectId. character. The project ID of the record.
modelId. character. The model ID of the record.
backtest. character. The backtest of the record.
source. character. The source of the record.
seriesId. character. The series ID of the record.
status. character. The status of the insight.
statusDetails. character. The explanation of the status.
startDate. POSIXct. Timestamp of the first prediction in the subset. Will be NULL if status is not completed.
endDate. POSIXct. Timestamp of the last prediction in the subset. Will be NULL if status is not completed.
predictionThreshold. numeric. The threshold; all rows with anomaly scores greater than or equal to it have SHAP explanations computed. Will be NULL if status is not completed.
previewLocation. character. The URL to retrieve the predictions preview for the subset. Will be NULL if status is not completed.
latestExplanationsLocation. character. The URL to retrieve the latest predictions with the SHAP explanations. Will be NULL if status is not completed.
deleteLocation. character. The URL to delete the anomaly assessment record and relevant insight data.
Other Anomaly Assessment functions: DeleteAnomalyAssessmentRecord(), GetAnomalyAssessmentExplanations(), GetAnomalyAssessmentPredictionsPreview(), ListAnomalyAssessmentRecords()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "59a5af20c80891534e3c2bdd"
record <- InitializeAnomalyAssessment(projectId, modelId, backtest = 0, source = "validation", seriesId = "Baltimore")
## End(Not run)
Check whether individual models can be blended together
IsBlenderEligible(project, modelIds, blendMethod)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelIds |
list. A list of model ids corresponding to the models to check. |
blendMethod |
character. The blender method to check. See |
List with:
blendable logical. Whether or not the models can be blended.
reason character. An explanation for why the models cannot be blended, if not blendable. Otherwise "".
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelsToBlend <- c("5996f820af07fc605e81ead4", "59a5ce3301e9f0296721c64c")
IsBlenderEligible(projectId, modelsToBlend, "GLM")
## End(Not run)
Checks if an id is a valid DataRobot ID (24 character string)
IsId(id)
id |
character. An ID to test whether it is a valid DataRobot ID. |
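The check described above can be sketched in a few lines. The helper below is hypothetical and is not the package's implementation; it only assumes, following the description, that a valid ID is a single character string of exactly 24 characters.

```r
# Hypothetical helper (not part of the package): TRUE only for a single,
# non-missing character string that is exactly 24 characters long.
LooksLikeDataRobotId <- function(id) {
  is.character(id) && length(id) == 1 && !is.na(id) && nchar(id) == 24
}

LooksLikeDataRobotId("59a5af20c80891534e3c2bde")  # TRUE
LooksLikeDataRobotId("not-an-id")                 # FALSE
LooksLikeDataRobotId(12345)                       # FALSE
```

A length-only check like this cannot tell whether the ID refers to an existing object; it only screens out values that are obviously malformed.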
Check if a parameter is in a list of possibilities.
IsParameterIn( paramValue, paramPossibilities, allowNULL = TRUE, paramName = NULL )
paramValue |
object. The parameter value to check. |
paramPossibilities |
vector. A vector of possible values for the parameter. |
allowNULL |
logical. Whether or not to allow NULL as a possibility. |
paramName |
character. The name of the parameter to check. |
TRUE if paramValue is valid, otherwise returns an error message.
## Not run:
IsParameterIn("all", DataSubset)
## End(Not run)
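To make the return convention concrete (TRUE on success, a message otherwise), here is a minimal sketch of a membership check with the same argument names. CheckParameterIn is a hypothetical stand-in written for this illustration, the wording of its messages is invented, and the allowed values are made up rather than taken from the package.

```r
# Hypothetical stand-in for illustration only; not the package's code.
CheckParameterIn <- function(paramValue, paramPossibilities,
                             allowNULL = TRUE, paramName = NULL) {
  label <- if (is.null(paramName)) "Parameter" else paramName
  possibilities <- unlist(paramPossibilities, use.names = FALSE)
  if (is.null(paramValue)) {
    if (allowNULL) {
      return(TRUE)
    }
    return(paste(label, "must not be NULL."))
  }
  if (all(paramValue %in% possibilities)) {
    return(TRUE)
  }
  # Invalid value: return a message instead of TRUE.
  paste0(label, " must be one of: ", paste(possibilities, collapse = ", "), ".")
}

allowed <- c("all", "holdout")  # invented values for the sketch
CheckParameterIn("all", allowed)
CheckParameterIn("bogus", allowed, paramName = "dataSubset")
```

Because the result is either TRUE or a character string, callers would typically test it with isTRUE() rather than using it directly in a condition.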
This is a list that contains the valid values for job status when querying the list of jobs. If you wish, you can specify job statuses using the list values, e.g. JobStatus$InProgress instead of typing the string "inprogress". This way you can benefit from autocomplete and not have to remember the valid options.
JobStatus
An object of class list of length 5.
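The pattern of looking up a status by name can be tried without a DataRobot connection. The list below is a stand-in built for this sketch: it holds only the three values named elsewhere in this manual ("queue", "inprogress", "error"), whereas the real JobStatus object has five entries.

```r
# Stand-in list for illustration; the real JobStatus object has five entries.
jobStatusSketch <- list(
  Queue = "queue",
  InProgress = "inprogress",
  Error = "error"
)

# Referring to the value by name avoids typos in the raw string.
status <- jobStatusSketch$InProgress
identical(status, "inprogress")
```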
This is a list that contains the valid values for job type when querying the list of jobs.
JobType
An object of class list of length 10.
Retrieve anomaly assessment records.
ListAnomalyAssessmentRecords( projectId, modelId, backtest = NULL, source = NULL, seriesId = NULL, limit = 100, offset = 0 )
projectId |
character. The ID of the project. |
modelId |
character. The ID of the model. |
backtest |
integer or "holdout". Optional. The backtest to filter records by. |
source |
"training" or "validation". Optional. The source of the data to filter records by. |
seriesId |
character. Optional. Can be specified for multiseries projects. The series id to filter records by. |
limit |
integer, greater than zero. Optional. Defaults to 100. At most this many results are returned. The default may change without notice. |
offset |
integer. Optional. Default is 0. This many results will be skipped. |
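The limit and offset arguments can be combined to page through a long list of records. The sketch below shows one way this might be done; FetchAllRecords and mockFetch are hypothetical names, and the mock function stands in for the API so the example runs without a connection.

```r
# Hypothetical helper: call fetchPage(limit, offset) repeatedly until a page
# comes back with fewer than `limit` results.
FetchAllRecords <- function(fetchPage, limit = 100) {
  allRecords <- list()
  offset <- 0
  repeat {
    page <- fetchPage(limit = limit, offset = offset)
    allRecords <- c(allRecords, page)
    if (length(page) < limit) {
      break
    }
    offset <- offset + limit
  }
  allRecords
}

# Mock data source standing in for the API: 250 fake records.
fakeRecords <- as.list(seq_len(250))
mockFetch <- function(limit, offset) {
  if (offset >= length(fakeRecords)) {
    return(list())
  }
  last <- min(offset + limit, length(fakeRecords))
  fakeRecords[(offset + 1):last]
}

records <- FetchAllRecords(mockFetch, limit = 100)
length(records)
```

Against a real project, the fetchPage argument could be a small wrapper such as function(limit, offset) ListAnomalyAssessmentRecords(projectId, modelId, limit = limit, offset = offset).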
A list of objects with anomaly assessment metadata:
recordId. character. The ID of the record.
projectId. character. The project ID of the record.
modelId. character. The model ID of the record.
backtest. character. The backtest of the record.
source. character. The source of the record.
seriesId. character. The series ID of the record.
status. character. The status of the insight.
statusDetails. character. The explanation of the status.
startDate. POSIXct. Timestamp of the first prediction in the subset. Will be NULL if status is not completed.
endDate. POSIXct. Timestamp of the last prediction in the subset. Will be NULL if status is not completed.
predictionThreshold. numeric. The threshold; all rows with anomaly scores greater than or equal to it have SHAP explanations computed. Will be NULL if status is not completed.
previewLocation. character. The URL to retrieve the predictions preview for the subset. Will be NULL if status is not completed.
latestExplanationsLocation. character. The URL to retrieve the latest predictions with the SHAP explanations. Will be NULL if status is not completed.
deleteLocation. character. The URL to delete the anomaly assessment record and relevant insight data.
Other Anomaly Assessment functions: DeleteAnomalyAssessmentRecord(), GetAnomalyAssessmentExplanations(), GetAnomalyAssessmentPredictionsPreview(), InitializeAnomalyAssessment()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "59a5af20c80891534e3c2bdd"
records <- ListAnomalyAssessmentRecords(projectId, modelId, backtest = 0, seriesId = "Baltimore")
## End(Not run)
This function returns the list of available blueprints for a specified modeling project, as an S3 object of class listOfBlueprints; see Value.
ListBlueprints(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
An S3 object of class 'listOfBlueprints', a list with one element for each recommended blueprint in the associated project. For more information see GetBlueprint()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListBlueprints(projectId)
## End(Not run)
List all available calendars.
ListCalendars()
A list of S3 objects of class "dataRobotCalendar"
## Not run:
ListCalendars()
## End(Not run)
Retrieve information about all compliance doc templates.
ListComplianceDocTemplates(namePart = NULL, limit = NULL, offset = NULL)
namePart |
character. Return only compliance doc templates that have a name that contains this string. |
limit |
integer. Return only this many compliance doc templates. |
offset |
integer. Skip this many compliance doc templates before returning. |
list of available compliance doc templates. Contains:
name character. The name of the compliance doc template.
creatorUsername character. The name of the user who created the compliance doc template.
orgId character. The ID of the organization of the creator user.
creatorId character. The ID of the creator user.
sections list. The list of sections that define the template.
id character. The ID of the template.
## Not run:
# Get all compliance doc templates
ListComplianceDocTemplates()
# Get the first three compliance doc templates with names that contain "foo".
ListComplianceDocTemplates(namePart = "foo", limit = 3)
## End(Not run)
Note that the confusion chart for source = "crossValidation" will not be available unless cross validation has been run for that model. Also, the confusion chart for source = "holdout" will not be available unless the holdout has been unlocked for the project.
ListConfusionCharts(model, fallbackToParentInsights = FALSE)
model |
dataRobotModel. A DataRobot model object like that returned by |
fallbackToParentInsights |
logical. If TRUE, this will return the confusion chart data for the model's parent if the confusion chart is not available for the model and the model has a parent model. |
A list of all confusion charts for the model, one for each partition type found in DataPartition.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ListConfusionCharts(model)
## End(Not run)
Returns a dataframe with information on available data sources.
ListDataSources()
data.frame containing information on possible data sources.
## Not run:
ListDataSources()
## End(Not run)
Returns a dataframe with information on available data stores.
ListDataStores()
data.frame containing information on possible data stores.
## Not run:
ListDataStores()
## End(Not run)
List all current model deployments.
ListDeployments(orderBy = NULL, search = NULL)
orderBy |
string. Optional. the order to sort the deployment list by, defaults to
If the sort attribute is preceded by a hyphen, deployments will be sorted in descending order, otherwise in ascending order. For health related sorting, ascending means failing, warning, passing, unknown. |
search |
string. Optional. Case insensitive search against deployment labels and descriptions. |
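The hyphen convention for orderBy can be illustrated locally. In the sketch below, ParseOrderBy and SortBy are hypothetical helpers, the deployment labels are invented, and using "label" as the sort attribute is an assumption made for the example. The sorting here happens on a local data.frame purely to show the two directions; it does not describe how the service itself orders results.

```r
# Hypothetical helper: split an orderBy value into attribute and direction.
ParseOrderBy <- function(orderBy) {
  descending <- startsWith(orderBy, "-")
  list(
    attribute = if (descending) substring(orderBy, 2) else orderBy,
    descending = descending
  )
}

# Invented deployment labels, used only to show the two sort directions.
deployments <- data.frame(label = c("churn", "fraud", "pricing"),
                          stringsAsFactors = FALSE)

# Hypothetical helper: sort a data.frame according to an orderBy string.
SortBy <- function(df, orderBy) {
  spec <- ParseOrderBy(orderBy)
  df[order(df[[spec$attribute]], decreasing = spec$descending), , drop = FALSE]
}

SortBy(deployments, "label")$label   # ascending
SortBy(deployments, "-label")$label  # descending
```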
A list of DataRobotDeployment objects containing:
id character. The ID of the deployment.
label character. The label of the deployment.
description character. The description of the deployment.
defaultPredictionServer list. Information on the default prediction server connected with the deployment. See ListPredictionServers for details.
model dataRobotModel. The model associated with the deployment. See GetModel for details.
capabilities list. Information on the capabilities of the deployment.
predictionUsage list. Information on the prediction usage of the deployment.
permissions list. User's permissions on the deployment.
serviceHealth list. Information on the service health of the deployment.
modelHealth list. Information on the model health of the deployment.
accuracyHealth list. Information on the accuracy health of the deployment.
## Not run:
ListDeployments()
## End(Not run)
Returns a dataframe with information on available drivers.
ListDrivers()
data.frame containing information on possible drivers.
## Not run:
ListDrivers()
## End(Not run)
Details about all features for this project
ListFeatureInfo(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
A named list which contains:
id numeric. feature id. Note that throughout the API, features are specified using their names, not this ID.
name character. The name of the feature.
featureType character. Feature type: 'Numeric', 'Categorical', etc.
importance numeric. numeric measure of the strength of relationship between the feature and target (independent of any model or other features).
lowInformation logical. Whether the feature has too few values to be informative.
uniqueCount numeric. The number of unique values in the feature.
naCount numeric. The number of missing values in the feature.
dateFormat character. The format of the feature if it is date-time feature.
projectId character. Character id of the project the feature belongs to.
max. The maximum value in the dataset, formatted in the same format as the data.
min. The minimum value in the dataset, formatted in the same format as the data.
mean. The arithmetic mean of the dataset, formatted in the same format as the data.
median. The median of the dataset, formatted in the same format as the data.
stdDev. The standard deviation of the dataset, formatted in the same format as the data.
timeSeriesEligible logical. Whether this feature can be used as the datetime partition column in a time series project.
timeSeriesEligibilityReason character. Why the feature is ineligible for the datetime partition column in a time series project, "suitable" when it is eligible.
crossSeriesEligible logical. Whether the cross series group by column is eligible for cross-series modeling. Will be NULL if no cross series group by column is used.
crossSeriesEligibilityReason character. The type of cross series eligibility (or ineligibility).
timeStep numeric. For time-series eligible features, a positive integer determining the interval at which windows can be specified. If used as the datetime partition column on a time series project, the feature derivation and forecast windows must start and end at an integer multiple of this value. NULL for features that are not time series eligible.
timeUnit character. For time series eligible features, the time unit covered by a single time step, e.g. "HOUR", or NULL for features that are not time series eligible.
targetLeakage character. Whether a feature is considered to have target leakage or not. A value of "SKIPPED_DETECTION" indicates that target leakage detection was not run on the feature.
keySummary data.frame. Optional. Descriptive statistics for this feature, iff it is a summarized categorical feature. This data.frame contains:
key. The name of the key.
summary. Descriptive statistics for this key, including:
max. The maximum value in the dataset.
min. The minimum value in the dataset.
mean. The arithmetic mean of the dataset.
median. The median of the dataset.
stdDev. The standard deviation of the dataset.
pctRows. The percentage of rows (from the EDA sample) in which this key occurs.
Other feature functions: GetFeatureInfo(), ListModelFeatures(), as.dataRobotFeatureInfo()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListFeatureInfo(projectId)
## End(Not run)
This function returns an S3 object of class listOfFeaturelists that describes all featurelists (i.e., lists of modeling variables) available for the project specified by the project parameter. This list may be converted to a dataframe with the as.data.frame method for objects of class listOfFeaturelists.
ListFeaturelists(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
An S3 object of class 'listOfFeaturelists', which is a list of dataframes: each element of the list corresponds to one featurelist associated with the project, and each dataframe has one row and the following four columns:
featurelistId. Unique alphanumeric identifier for the featurelist.
projectId. Unique alphanumeric project identifier.
features. Comma-separated character string listing the variables included in the featurelist.
name. Character string giving the name of the featurelist.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListFeaturelists(projectId)
## End(Not run)
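Because the features column is described as a comma-separated string, it usually needs splitting before the individual feature names can be used. The sketch below does this on a hand-made one-row data.frame with the four columns listed above; the feature names and the featurelistId are invented placeholders.

```r
# Hand-made stand-in for one element of the returned list; values are invented.
featurelist <- data.frame(
  featurelistId = "000000000000000000000000",
  projectId = "59a5af20c80891534e3c2bde",
  features = "age, income, region",
  name = "Example list",
  stringsAsFactors = FALSE
)

# Split on commas and trim any surrounding whitespace.
featureNames <- trimws(strsplit(featurelist$features, ",", fixed = TRUE)[[1]])
featureNames
```

Trimming after the split means the code works whether or not the separator is followed by a space.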
This function requests information about the jobs that go through the DataRobot queue.
ListJobs(project, status = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
status |
character. The status of the desired jobs: one of JobStatus$Queue, JobStatus$InProgress, or JobStatus$Error. If NULL (default), queued and in-progress jobs are returned. |
A list of lists with one element for each job. The named list for each job contains:
status character. Model job status; an element of JobStatus, e.g. JobStatus$Queue.
url character. URL to request more detail about the job.
id character. The job id.
jobType character. See JobType for valid values.
projectId character. The project that contains the model.
isBlocked logical. If TRUE, the job is blocked (cannot be executed) until its dependencies are resolved.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListJobs(projectId)
## End(Not run)
Retrieve lift chart data for a model for all available data partitions (see DataPartition)
ListLiftCharts(model, fallbackToParentInsights = FALSE)
model |
dataRobotModel. A DataRobot model object like that returned by |
fallbackToParentInsights |
logical. If TRUE, this will return the lift chart data for the model's parent if the lift chart is not available for the model and the model has a parent model. |
data.frame with the following components:
binWeight. Numeric: weight of the bin. For weighted projects, the sum of the weights of all rows in the bin; otherwise, the number of rows in the bin.
actual. Numeric: sum of actual target values in bin.
predicted. Numeric: sum of predicted target values in bin.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ListLiftCharts(model)
## End(Not run)
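Since actual and predicted are sums over each bin, dividing them by binWeight gives per-bin averages that are easier to compare across bins of different sizes. The sketch below works through that arithmetic on an invented three-bin data.frame; the added column names meanActual and meanPredicted are chosen for this example only.

```r
# Invented lift chart data with the three documented columns.
liftChart <- data.frame(
  binWeight = c(10, 10, 20),
  actual = c(2, 5, 16),
  predicted = c(3, 4, 15)
)

# Convert per-bin sums into per-bin averages.
liftChart$meanActual <- liftChart$actual / liftChart$binWeight
liftChart$meanPredicted <- liftChart$predicted / liftChart$binWeight
liftChart
```

For unweighted projects binWeight is the row count, so these are plain per-row averages; for weighted projects the interpretation depends on how the sums were weighted.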
This function returns the list of features (typically, response variable and raw covariates) used in building the model specified by model, an S3 object of class 'dataRobotModel'.
ListModelFeatures(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
A character vector of feature names, with one component for each model feature.
Other feature functions: GetFeatureInfo(), ListFeatureInfo(), as.dataRobotFeatureInfo()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ListModelFeatures(model)
## End(Not run)
In time series projects, a new set of modeling features is created after setting the partitioning options. These features are automatically derived from those in the project's dataset and are the features used for modeling. Modeling features are only accessible once the target and partitioning options have been set. In projects that don't use time series modeling, once the target has been set, ModelingFeaturelists and Featurelists will behave the same.
ListModelingFeaturelists(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
An S3 object of class 'listOfFeaturelists', which is a list of dataframes: each element of the list corresponds to one featurelist associated with the project, and each dataframe has one row and the following four columns:
featurelistId. Unique alphanumeric identifier for the featurelist.
projectId. Unique alphanumeric project identifier.
features. Comma-separated character string listing the variables included in the featurelist.
name. Character string giving the name of the featurelist.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListModelingFeaturelists(projectId)
## End(Not run)
This function requests information on DataRobot Autopilot modeling tasks that are not complete, for one of three reasons: the task is running and has not yet completed; the task is queued and has not yet been started; or, the task has terminated due to an error.
ListModelJobs(project, status = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
status |
character. The status of the desired jobs: one of JobStatus$Queue, JobStatus$InProgress, or JobStatus$Error. If NULL (default), queued and in-progress jobs are returned. |
The status argument specifies which of the three groups of modeling tasks is of interest. Specifically, if status has the value 'inprogress', the request returns information about modeling tasks that are running but not yet complete; if status has the value 'queue', the request returns information about modeling tasks that are scheduled to run but have not yet started; if status has the value 'error', the request returns information about modeling tasks that have terminated due to an error. By default, status is NULL, which means jobs with status "inprogress" or "queue" are returned, but not those with status "error".
A list of lists with one element for each modeling task in the group being queried; if there are no tasks in the group being queried, an empty list is returned. Otherwise, each element is a list with the following elements:
status. Prediction job status; an element of JobStatus, e.g. JobStatus$Queue.
processes. List of character vectors describing any preprocessing applied.
projectId. Character string giving the unique identifier for the project.
modelId character. The unique identifier for the related model.
samplePct. Numeric: the percentage of the dataset used for model building.
modelType. Character string specifying the model type.
modelCategory. Character string: what kind of model this is - 'prime' for DataRobot Prime models, 'blend' for blender models, and 'model' for other models.
featurelistId. Character string: id of the featurelist used in fitting the model.
blueprintId. Character string: id of the DataRobot blueprint on which the model is based.
modelJobId. Character: id of the job.
isBlocked logical. If TRUE, the job is blocked (cannot be executed) until its dependencies are resolved.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListModelJobs(projectId)
## End(Not run)
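The default behaviour of the status argument (NULL returns queued and in-progress jobs but not errored ones) can be mimicked on a local list. FilterJobsByStatus is a hypothetical helper written for this sketch, and the job entries are invented; the real filtering is presumably done by the service.

```r
# Hypothetical filter mimicking the documented default for `status`.
FilterJobsByStatus <- function(jobs, status = NULL) {
  wanted <- if (is.null(status)) c("queue", "inprogress") else status
  Filter(function(job) job$status %in% wanted, jobs)
}

# Invented job entries, one per status value.
jobs <- list(
  list(id = "1", status = "queue"),
  list(id = "2", status = "inprogress"),
  list(id = "3", status = "error")
)

length(FilterJobsByStatus(jobs))           # default: queued and in-progress jobs
length(FilterJobsByStatus(jobs, "error"))  # only jobs that ended in an error
```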
DataRobot will help pick out a few models from your project that meet certain criteria, such as being the most accurate model or being a model that captures a good blend of both prediction speed and model accuracy.
ListModelRecommendations(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
A list containing information about each recommendation made by DataRobot, containing:
modelId character. The model ID of the recommended model.
projectId character. The project ID of the project the recommendations were made for.
recommendationType character. The type of recommendation being made.
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
ListModelRecommendations(projectId)
## End(Not run)
This function requests the model information for the DataRobot project specified by the project argument, described under Arguments. This parameter may be obtained in several ways, including: (1), from the projectId element of the list returned by ListProjects; (2), as the object returned by the GetProject function; or (3), as the list returned by the SetupProject function. The function returns an S3 object of class 'listOfModels'.
ListModels(project, orderBy = NULL, filter = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
orderBy |
character. Optional. A vector of keys to order the list by. You can
order by |
filter |
list. Optional. A named list of parameters to search a model
by, such as |
An S3 object of class listOfModels, which may be characterized using R's generic summary function or converted to a dataframe with the as.data.frame method.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListModels(projectId)
ListModels(projectId, orderBy = c("samplePct", "-metric"))
ListModels(projectId, filter = list("sample_pct__gt" = 64, "name" = "Ridge"))
ListModels(projectId, filter = list("isStarred" = TRUE))
## End(Not run)
This function returns an S3 object of class listDataRobotPredictionDataset that describes all prediction datasets available for the project specified by the project parameter. This list may be converted to a dataframe with the as.data.frame method for objects of class listDataRobotPredictionDataset.
ListPredictionDatasets(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
An S3 object of class 'listDataRobotPredictionDataset', which is a list of dataframes: each element of the list corresponds to one prediction dataset associated with the project, and each dataframe has one row and the following columns:
id character. The unique alphanumeric identifier for the dataset.
numColumns numeric. Number of columns in dataset.
name character. Name of dataset file.
created character. time of upload.
projectId character. String giving the unique alphanumeric identifier for the project.
numRows numeric. Number of rows in dataset.
forecastPoint. The point relative to which predictions will be generated, based on the forecast window of the project. Only specified in time series projects, otherwise will be NULL.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListPredictionDatasets(projectId)
## End(Not run)
Retrieve metadata for prediction explanations in specified project
ListPredictionExplanationsMetadata( project, modelId = NULL, limit = NULL, offset = NULL )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Optional. If specified, only prediction explanations computed for this model will be returned. |
limit |
integer. Optional. At most this many results are returned, default: no limit |
offset |
integer. This many results will be skipped, default: 0 |
List of metadata for all prediction explanations in the project. Each element of the list is the metadata for one set of prediction explanations (for the format, see GetPredictionExplanationsMetadata).
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListPredictionExplanationsMetadata(projectId)
## End(Not run)
This function itemizes the predictions available for a given project, model, and/or dataset. Note that this function does not actually return the predictions. Use GetPredictions(projectId, predictionId) to get the predictions for a particular set of predictions.
ListPredictions(project, modelId = NULL, datasetId = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Optional. Filter returned predictions to only be predictions made against the model specified by this model ID. |
datasetId |
character. Optional. Filter returned predictions to only be predictions made against the prediction dataset specified by this dataset ID. |
A data.frame specifying:
projectId character. The ID of the project the predictions were made in.
datasetId character. The dataset ID of the dataset used to make predictions
modelId character. The model ID of the model used to make predictions.
predictionId character. The unique ID corresponding to those predictions. Use GetPredictions(projectId, predictionId) to fetch the individual predictions.
includesPredictionIntervals logical. Whether or not the predictions include prediction intervals. See Predict for details.
predictionIntervalsSize integer. Optional. The size, in percent, of prediction intervals or NULL if there are no intervals. See Predict for details.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
predictions <- ListPredictions(projectId)
## End(Not run)
List all available prediction servers.
ListPredictionServers()
A list of DataRobotPredictionServer objects containing:
id character. The ID of the prediction server.
url character. The URL of the prediction server.
dataRobotKey character. The key used to access the prediction server.
## Not run:
ListPredictionServers()
## End(Not run)
Training a model using a ruleset is a necessary prerequisite for being able to download the code for a ruleset.
ListPrimeFiles(project, parentModelId = NULL, modelId = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
parentModelId |
character. Optional. Filter for only those prime files approximating this parent model. |
modelId |
character. Optional. Filter for only those prime files with code for this prime model. |
List of lists. Each element of the list corresponds to one Prime file available to download. The elements of this list have the same format as the return value of GetPrimeFile.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListPrimeFiles(projectId)
## End(Not run)
This function requests the DataRobot Prime models information for the DataRobot project specified by the project argument, described under Arguments.
ListPrimeModels(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
A data.frame (classed as dataRobotPrimeModels) containing information about each DataRobot Prime model in a project (one row per Prime model).
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListPrimeModels(projectId)
## End(Not run)
This function returns an S3 object of class projectSummaryList that describes all (optionally filtered) DataRobot modeling projects available to the user. This list may be converted into a dataframe with the as.data.frame method for this class of S3 objects.
ListProjects(filter = NULL, limit = NULL, offset = NULL)
filter |
list. Optional. A named list that can be used to specify various filters. Currently 'projectName' is supported which will filter returned projects for projects with names containing the specified string. |
limit |
integer. Optional. At most this many results are returned. Invoking 'ListProjects' with this value against DataRobot 8.x (API 2.28) or older will throw an error. |
offset |
integer. Optional. This many results will be skipped. Invoking 'ListProjects' with this value against DataRobot 8.x (API 2.28) or older will throw an error. |
An S3 object of class 'projectSummaryList', consisting of the following elements:
projectId. List of character strings giving the unique DataRobot identifier for each project.
projectName. List of character strings giving the user-supplied project names.
fileName. List of character strings giving the name of the modeling dataset for each project.
stage. List of character strings specifying each project's Autopilot stage (e.g., 'aim' is necessary to set target). Use ProjectStage to get a list of options.
autopilotMode. List of integers specifying the Autopilot mode (0 = fully automatic, 1 = semi-automatic, 2 = manual).
created. List of character strings giving the project creation time and date.
target. List of character strings giving the name of the target variable for each project.
metric. List of character strings identifying the fitting metric optimized for each project.
partition. Dataframe with one row for each project and 12 columns specifying partitioning details.
advancedOptions. Dataframe with one row for each project and 4 columns specifying values for advanced option parameters.
positiveClass. Character string identifying the positive target class for binary classification projects.
maxTrainPct. The maximum percentage of the project dataset that can be used without going into the validation data or being too large to submit any blueprint for training a project.
maxTrainRows. The maximum number of rows that can be trained on without going into the validation data or being too large to submit any blueprint for training.
holdoutUnlocked. Logical flag indicating whether holdout subset results have been computed.
targetType. Character string giving the type of modeling project (e.g., regression or binary classification).
## Not run:
ListProjects()
ListProjects(filter = list("projectName" = "TimeSeries"))
## End(Not run)
Retrieve information about all DataRobot models with a rating table.
ListRatingTableModels(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
data.frame containing information about each model with a rating table in a project (one row per model with a rating table).
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
ListRatingTableModels(projectId)
## End(Not run)
Retrieve information about all rating tables.
ListRatingTables(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
data.frame containing information about each rating table in a project (one row per model with a rating table).
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
ListRatingTables(projectId)
## End(Not run)
Retrieve residuals chart data for a model for all available data partitions (see DataPartition). This chart is only available for regression models that are not time-aware.
ListResidualsCharts(model, fallbackToParentInsights = FALSE)
model |
dataRobotModel. A DataRobot model object like that returned by |
fallbackToParentInsights |
logical. If TRUE, this will return the residuals chart data for the model's parent if the residuals chart is not available for the model and the model has a parent model. |
list of objects containing residuals chart data for all available data partitions. See DataPartition for details. Each object has the following components:
residualMean. Numeric: the arithmetic mean of the predicted value minus the actual value over the downsampled dataset.
coefficientOfDetermination. Numeric: aka the r-squared value. This value is calculated over the downsampled output, not the full input.
data. data.frame: The rows of chart data in [actual, predicted, residual, row number] form. If the row number was not available at the time of model creation, the row number will be null.
histogram. list: Data to plot a histogram of residual values. Each object contains:
intervalStart. Numeric: Start value for an interval, inclusive.
intervalEnd. Numeric: End value for an interval, exclusive for all but the last interval.
occurrences. Integer: the number of times the predicted value fits within the interval.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ListResidualsCharts(model)
## End(Not run)
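To make the residualMean, coefficientOfDetermination, and histogram components more concrete, here is a minimal base-R sketch that recomputes them from a small made-up data.frame laid out like the data component. The column names, the sample values, the bin edges, and the conventional R-squared formula (1 - SS_res / SS_tot) are assumptions for illustration; DataRobot computes the real values server-side over the downsampled data.

```r
# Hypothetical chart rows in [actual, predicted, residual, row number] form.
chartData <- data.frame(
  actual = c(10, 12, 15, 20),
  predicted = c(11, 11, 16, 18)
)
chartData$residual <- chartData$predicted - chartData$actual
chartData$rowNumber <- seq_len(nrow(chartData))

# Arithmetic mean of (predicted - actual), as described for residualMean.
residualMean <- mean(chartData$residual)

# Conventional R-squared: 1 - SS_res / SS_tot (assumed definition).
ssRes <- sum(chartData$residual^2)
ssTot <- sum((chartData$actual - mean(chartData$actual))^2)
coefficientOfDetermination <- 1 - ssRes / ssTot

# Histogram bins: start inclusive, end exclusive for all but the last bin.
breaks <- seq(-2, 2, by = 1)
bins <- cut(chartData$residual, breaks = breaks, right = FALSE, include.lowest = TRUE)
histogram <- data.frame(
  intervalStart = head(breaks, -1),
  intervalEnd = tail(breaks, -1),
  occurrences = as.integer(table(bins))
)
print(histogram)
```

The `right = FALSE, include.lowest = TRUE` combination in `cut()` is what gives the "exclusive for all but the last interval" behavior described for intervalEnd.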
Retrieve ROC curve data for a model for all available data partitions (see DataPartition)
ListRocCurves(model, fallbackToParentInsights = FALSE)
model |
dataRobotModel. A DataRobot model object like that returned by |
fallbackToParentInsights |
logical. If TRUE, this will return the ROC curve data for the model's parent if the ROC curve is not available for the model and the model has a parent model. |
list of lists, where each inner list is named after its data partition source and contains the following components:
source. Character: data partitions for which ROC curve data is returned (see DataPartition).
negativeClassPredictions. Numeric: example predictions for the negative class for each data partition source.
rocPoints. data.frame: each row represents pre-calculated metrics (accuracy, f1_score, false_negative_score, true_negative_score, true_positive_score, false_positive_score, true_negative_rate, false_positive_rate, true_positive_rate, matthews_correlation_coefficient, positive_predictive_value, negative_predictive_value, threshold) associated with different thresholds for the ROC curve.
positiveClassPredictions. Numeric: example predictions for the positive class for each data partition source.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ListRocCurves(model)
## End(Not run)
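Once a rocPoints data.frame is in hand, a common next step is choosing a classification threshold. The sketch below works on a small made-up data.frame that reuses a few of the column names listed above; the values are invented, and in practice the rows would come from one partition's rocPoints element. It is an illustration under those assumptions rather than part of the package.

```r
# Hypothetical rocPoints rows; real ones would come from ListRocCurves(model).
rocPoints <- data.frame(
  threshold = c(0.2, 0.4, 0.6, 0.8),
  true_positive_rate = c(0.95, 0.85, 0.60, 0.30),
  false_positive_rate = c(0.50, 0.20, 0.08, 0.02),
  f1_score = c(0.62, 0.78, 0.70, 0.45)
)

# Row with the highest F1 score.
bestF1 <- rocPoints[which.max(rocPoints$f1_score), ]

# Youden's J = TPR - FPR, another common threshold criterion.
rocPoints$youdenJ <- rocPoints$true_positive_rate - rocPoints$false_positive_rate
bestJ <- rocPoints[which.max(rocPoints$youdenJ), ]

print(bestF1$threshold)
print(bestJ$threshold)
```

The two criteria happen to agree on this toy data; on real curves they can differ, so the choice depends on how false positives and false negatives are weighed.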
Note that currently only data sources and data stores can be shared with this API.
ListSharingAccess(object, batchSize = NULL)
object |
object. The shared object to inspect access for. |
batchSize |
integer. The number of requests per page to expect. |
A list specifying information on access:
username character. The name of the user with access.
userId character. The ID of the user with access.
role character. The type of access granted. See SharingRole for options.
canShare logical. Whether the user can further share access.
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
dataStore <- GetDataStore(dataStoreId)
ListSharingAccess(dataStore)
## End(Not run)
Star models and add them to this list using StarModel or ToggleStarForModel. Unstar models and remove them from this list using UnstarModel or ToggleStarForModel.
ListStarredModels(project, orderBy = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
orderBy |
character. Optional. A vector of keys to order the list by. You can
order by |
An S3 object of class listOfModels, which may be characterized using R's generic summary function or converted to a dataframe with the as.data.frame method.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ListStarredModels(projectId)
## End(Not run)
Retrieve information about all training prediction datasets in a project.
ListTrainingPredictions(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
data.frame containing information about each training prediction.
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
ListTrainingPredictions(projectId)
## End(Not run)
Retrieve information about all imported models. This function returns a data.frame that describes all imported models.
ListTransferableModels(limit = NULL, offset = NULL)
limit |
integer. The number of records to return. The server will use a (possibly finite) default if not specified. |
offset |
integer. The number of records to skip. |
A data.frame describing the uploaded transferable models, with the following components:
note. Character string. Manually added note about this imported model.
datasetName. Character string. Filename of the dataset used to create the project the model belonged to.
modelName. Character string. Model type describing the model generated by DataRobot.
displayName. Character string. Manually specified human-readable name of the imported model.
target. Character string. The target of the project the model belonged to prior to export.
projectName. Character string. Name of the project the model belonged to prior to export.
importedByUsername. Character string. Username of the user who imported the model.
importedAt. Character string. The time the model was imported.
version. Numeric. Project version of the project the model belonged to.
projectId. Character string. ID of the project the model belonged to prior to export.
featurelistName. Character string. Name of the featurelist used to train the model.
createdByUsername. Character string. Username of the user who created the model prior to export.
importedById. Character string. ID of the user who imported the model.
id. Character string. ID of the import.
createdById. Character string. ID of the user who created the model prior to export.
modelId. Character string. Original ID of the model prior to export.
originUrl. Character string. URL.
Other Transferable Model functions: DeleteTransferableModel(), DownloadTransferableModel(), GetTransferableModel(), RequestTransferableModel(), UpdateTransferableModel(), UploadTransferableModel()
## Not run:
ListTransferableModels()
## End(Not run)
Make an HTTP request.
MakeDataRobotRequest(
  requestMethod,
  routeString,
  addUrl = TRUE,
  returnRawResponse = TRUE,
  as = "json",
  simplifyDataFrame = TRUE,
  body = NULL,
  query = NULL,
  timeout = DefaultHTTPTimeout,
  encode = NULL,
  followLocation = TRUE,
  filename = NULL,
  stopOnError = TRUE
)
requestMethod |
function. A function from httr (e.g., 'httr::GET', 'httr::POST') to use. |
routeString |
character. The path to make the request on. |
addUrl |
logical. Should the endpoint be prepended to the routeString? (Default TRUE). |
returnRawResponse |
logical. Whether to return the raw httr response object (as opposed to post processing and returning the content of that object, which is the default.) |
as |
character. What should the resulting data be interpreted as? (default "json").
Use "file" to download as a file (see |
simplifyDataFrame |
logical. Whether to invoke |
body |
list. The body of the request for POST. |
query |
list. The query parameters for GET. |
timeout |
numeric. How many seconds before the request times out? |
encode |
character. What should the body be encoded as for the JSON request? |
followLocation |
logical. Should HTTR follow the location if provided? (Default TRUE). |
filename |
character. The path of the file to download to, if it is a download request. |
stopOnError |
logical. If there is an error, should it be raised as a fatal R error? (Default TRUE). |
For usage, see 'GetModelCapabilities'.
ModelCapability
An object of class list of length 12.
Model replacement reason
ModelReplacementReason
An object of class list of length 6.
Added in DataRobot API 2.23.
MulticlassDeploymentAccuracyMetric
An object of class list of length 3.
The DataRobot API returns dates in RFC 3339 format.
parseRFC3339Timestamp(timestampstring)
timestampstring |
character. Timestamp in RFC 3339 format. |
The input timestamp as a POSIXt
Other API datetime functions: RFC3339DateTimeFormat, formatRFC3339Timestamp(), transformRFC3339Period(), validateReportingPeriodTime()
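For readers unfamiliar with RFC 3339, the following base-R sketch shows what parsing such a timestamp into a POSIXt value can look like. It is not the package's implementation: the sample string and the format string are assumptions, and this version only handles UTC timestamps that end in a literal "Z", not numeric offsets such as "+02:00".

```r
# Hypothetical RFC 3339 timestamp in UTC ("Z" suffix).
timestampString <- "2019-03-01T14:30:00Z"

# %OS reads the seconds field (it also accepts fractional seconds on input).
parsed <- as.POSIXct(timestampString, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Round-trip back to the same textual form.
roundTrip <- format(parsed, "%Y-%m-%dT%H:%M:%SZ", tz = "UTC")

print(class(parsed))
print(roundTrip)
```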
This function pauses the DataRobot modeling queue for a specified project
PauseQueue(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
PauseQueue(projectId)
## End(Not run)
Periodicity max time step
PeriodicityMaxTimeStep
An object of class numeric of length 1.
Same as time units, but kept for backwards compatibility.
PeriodicityTimeUnits
An object of class list of length 8.
Method for R's generic plot function for DataRobot S3 objects of class listOfModels. This function generates a horizontal barplot as described under Details.
## S3 method for class 'listOfModels'
plot(
  x,
  y,
  metric = NULL,
  pct = NULL,
  selectRecords = NULL,
  orderDecreasing = NULL,
  textSize = 0.8,
  textColor = "black",
  borderColor = "blue",
  xpos = NULL,
  ...
)
x |
S3 object of class listOfModels to be plotted. |
y |
Not used; included for conformance with plot() generic function parameter requirements. |
metric |
character. Optional. Defines the metric to be used in constructing the barplot. If NULL (the default), the validation set value for the project fitting metric is used; otherwise, this value must name one of the elements of the metrics list associated with each model in x. |
pct |
integer. Optional. Specifies a samplePct value used in selecting models to include in the barplot summary. If NULL (the default), all project models are included. Note, however, that this list of models is intersected with the list of models defined by the selectRecords parameter, so that only those models identified by both selectRecords and pct appear in the plot. |
selectRecords |
integer. Optional. A vector that specifies the individual elements of the list x to be included in the barplot summary. If NULL (the default), all models are included. Note, however, that this list of models is intersected with the list of models defined by the pct parameter, so that only those models identified by both selectRecords and pct appear in the plot. |
orderDecreasing |
logical. Optional. If TRUE, the barplot is built from the bottom up in decreasing order of the metric values; if FALSE, the barplot is built in increasing order of metric values. The default is NULL, which causes the plot to be generated in the order in which the models appear in the list x. |
textSize |
numeric. Optional. Multiplicative scaling factor for the model name labels on the barplot. |
textColor |
character. Optional. If a single character string, this parameter specifies the text color used in labelling all models in the barplot; if a character vector, it specifies one color for each model in the plot. |
borderColor |
character. Optional. Specifies the border color for all bars in the barplot, surrounding a transparent background. |
xpos |
numeric. Optional. Defines the horizontal position of the center of all text labels on the plot. The default is NULL, which causes all text to be centered in the plot; if xpos is a single number, all text labels are centered at this position; if xpos is a vector, it specifies one center position for each model in the plot. |
... |
list. Optional. Additional named parameters to be passed to R's barplot function used in generating the plot |
This function generates a horizontal barplot with one bar for each model characterized in the 'listOfModels' object x. The length of each bar is specified by the value of metric; if this parameter is specified as NULL (the default), the project fitting metric is used, as determined by the projectMetric value from the first element of x. Text is added to each bar in the plot, centered at the position specified by the xpos parameter, based on the value of the modelType element of each model in the list x. The size and color of these text labels may be controlled with the textSize and textColor parameters. The order in which these models appear on the plot is controlled by the choice of metric and the value of the orderDecreasing parameter, and subsets of the models appearing in the list x may be selected via the pct and selectRecords parameters.
None. This function is called for its side-effect of generating a plot.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
plot(ListModels(projectId))
## End(Not run)
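The interaction between pct, selectRecords, and orderDecreasing described under Details can be sketched in base R. The data.frame below is a made-up stand-in for a listOfModels object (its column names and values are assumptions), and the code is only an illustration of the selection-by-intersection logic, not the method's actual source.

```r
# Hypothetical stand-in for a listOfModels: one row per model.
models <- data.frame(
  modelType = c("Elastic Net", "Random Forest", "Gradient Boosting", "Ridge"),
  samplePct = c(64, 64, 32, 64),
  metricValue = c(0.42, 0.35, 0.38, 0.47),
  stringsAsFactors = FALSE
)

pct <- 64
selectRecords <- c(1, 2, 3)
orderDecreasing <- TRUE

# Keep only models identified by BOTH pct and selectRecords.
byPct <- which(models$samplePct == pct)
keep <- intersect(selectRecords, byPct)
selected <- models[keep, ]

# Order the bars by metric value.
selected <- selected[order(selected$metricValue, decreasing = orderDecreasing), ]
print(selected$modelType)

# Draw a horizontal barplot with labels only in an interactive session.
if (interactive()) {
  mids <- barplot(selected$metricValue, horiz = TRUE,
                  border = "blue", col = "transparent")
  text(x = max(selected$metricValue) / 2, y = mids,
       labels = selected$modelType, cex = 0.8)
}
```

Here model 3 is dropped because its samplePct is not 64, and model 4 is dropped because it is not listed in selectRecords, leaving only the models named by both filters.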
This is a list that contains the valid values for PostgreSQL drivers.
PostgreSQLdrivers
An object of class list of length 2.
This function can be used to predict with a particular model.
Predict(
  model,
  predictionDataset,
  classPrefix = "class_",
  maxWait = 600,
  forecastPoint = NULL,
  predictionsStartDate = NULL,
  predictionsEndDate = NULL,
  type = "response",
  includePredictionIntervals = FALSE,
  predictionIntervalsSize = NULL
)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
predictionDataset |
object. Either a dataframe of data to predict on or a DataRobot
prediction dataset object of class |
classPrefix |
character. For multiclass projects returning prediction probabilities, this prefix is prepended to each class in the header of the dataframe. Defaults to "class_". |
maxWait |
integer. The maximum time (in seconds) to wait for the prediction job to complete. |
forecastPoint |
character. Optional. The point relative to which predictions will be generated, based on the forecast window of the project. Only specified in time series projects. |
predictionsStartDate |
datetime. Optional. Only specified in time series projects.
The start date for bulk predictions. Note that this parameter is for generating
historical predictions using the training data. This parameter should be provided in
conjunction |
predictionsEndDate |
datetime. Optional. Only specified in time series projects.
The end date for bulk predictions. Note that this parameter is for generating
historical predictions using the training data. This parameter should be provided
in conjunction |
type |
character. String specifying the type of response for binary classifiers; see Details. |
includePredictionIntervals |
logical. Optional. Should prediction interval bounds be part of predictions? Only available for time series projects. See "Details" for more info. |
predictionIntervalsSize |
numeric. Optional. Size of the prediction intervals, in percent. Only available for time series projects. See "Details" for more info. |
The contents of the return vector depend on the modeling task (binary classification, multiclass classification, or regression); whether the underlying data is time series, multiseries, cross-series, or not time series; and the value of the type parameter. For non-time-series regression tasks, the type parameter is ignored and a vector of numerical predictions of the response variable is returned.
This function will error if the requested job has errored or if it has not completed within maxWait seconds.
See RequestPredictions and GetPredictions for more details.
Vector of predictions, depending on the modeling task ("Binary", "Multiclass", or "Regression") and the value of the type parameter; see Details.
## Not run:
# Randomly assign 70% of the rows to training and hold out the rest
trainIndex <- sample(nrow(iris), round(nrow(iris) * 0.7))
trainIris <- iris[trainIndex, ]
testIris <- iris[-trainIndex, ]
project <- StartProject(trainIris, "iris", target = "Petal_Width", wait = TRUE)
model <- GetRecommendedModel(project)
predictions <- Predict(model, testIris)

# Or, if prediction intervals are desired (datetime only)
model <- GetRecommendedModel(datetimeProject)
predictions <- Predict(model, dataset, includePredictionIntervals = TRUE, predictionIntervalsSize = 100, type = "raw")
## End(Not run)
Retrieve model predictions using R's default S3 predict method.
## S3 method for class 'dataRobotModel' predict(object, ...)
object |
dataRobotModel. The object of class |
... |
list. Additional arguments to pass to |
Predict
## Not run:
# Randomly assign 70% of the rows to training and hold out the rest
trainIndex <- sample(nrow(iris), round(nrow(iris) * 0.7))
trainIris <- iris[trainIndex, ]
testIris <- iris[-trainIndex, ]
project <- StartProject(trainIris, "iris", target = "Petal_Width", wait = TRUE)
model <- GetRecommendedModel(project)
predictions <- predict(model, testIris)
## End(Not run)
If dataset creation times out, the error message includes a URL corresponding to the creation task. That URL can be passed to this function (which will return the completed dataset info details when finished) to resume waiting for creation.
PredictionDatasetFromAsyncUrl(asyncUrl, maxWait = 600)
asyncUrl |
The temporary status URL |
maxWait |
The maximum time to wait (in seconds) for creation before aborting. |
This is a list that contains the valid values for downloadable code programming languages.
PrimeLanguage
An object of class list of length 2.
If project creation times out, the error message includes a URL corresponding to the project creation task. That URL can be passed to this function (which will return the completed project details when finished) to resume waiting for project creation.
ProjectFromJobResponse(jobResponse, maxWait = 600)
jobResponse |
An HTTP POST response that includes a redirect to the temporary status URL. |
maxWait |
The maximum time to wait (in seconds) for project creation before aborting. |
Project stage
ProjectStage
An object of class list of length 4.
MostAccurate retrieves the most accurate model based on validation or cross-validation results. In most cases, this will be a blender model.
RecommendedModelType
An object of class list of length 3.
FastAccurate retrieves the most accurate individual model (not blender) that passes set guidelines for prediction speed. If no models meet the prediction speed guideline, this will not retrieve anything.
RecommendedForDeployment retrieves the most accurate individual model. This model will have undergone specific pre-preparations to be deployment ready. See GetModelRecommendation for details.
Replace NULL in $metrics list elements with NA.
ReformatMetrics(metricsList)
metricsList |
list. List of metrics to reformat. |
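The idea of swapping NULL for NA can be shown with a small base-R sketch. The helper name replaceNullWithNA and the nested metric/partition layout below are hypothetical and chosen only for illustration; this is not the source of ReformatMetrics. The motivation is that NULL entries vanish when a list is flattened, whereas NA keeps its slot.

```r
# Hypothetical helper: replace NULL leaves in a two-level metrics list with NA.
replaceNullWithNA <- function(metricsList) {
  lapply(metricsList, function(metric) {
    lapply(metric, function(value) if (is.null(value)) NA else value)
  })
}

# Made-up metrics list; unscored partitions are NULL.
metrics <- list(
  AUC = list(validation = 0.91, crossValidation = NULL, holdout = NULL),
  LogLoss = list(validation = 0.32, crossValidation = 0.34, holdout = NULL)
)

cleaned <- replaceNullWithNA(metrics)

# With NA in place of NULL, each metric flattens to a fixed-length vector.
print(unlist(cleaned$AUC))
```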
Added in DataRobot API 2.18.
RegressionDeploymentAccuracyMetric
An object of class list of length 12.
Renames a rating table to a different name.
RenameRatingTable(project, ratingTableId, ratingTableName)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
ratingTableId |
character. The ID of the rating table. |
ratingTableName |
character. The new name for the rating table. |
An S3 object of class 'dataRobotRatingTable' summarizing all available information about the renamed rating table.
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
ratingTableId <- "5984b4d7100d2b31c1166529"
RenameRatingTable(projectId, ratingTableId, "Renamed Table")
## End(Not run)
This function reorders columns in a data.frame without relying on dplyr or data.table. You only need to specify the columns that should be moved; all others will be slotted in the gaps. Invalid columns are ignored.
reorderColumns(df, vars)
df |
data.frame with named columns. |
vars |
integer. named vector where the names represent column names in df that should be moved. The value of each item is the new location of that column. |
A copy of the input data.frame, with columns rearranged per vars
{
  df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))
  datarobot:::reorderColumns(df, c("In" = 3, "Time" = 4))
}
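The "pin the named columns, let the rest fill the gaps" behavior described above can be sketched in a few lines of base R. The function reorderColumnsSketch is a hypothetical stand-in written for this illustration, not the package's internal code, and it assumes the requested positions are distinct and within range.

```r
# Hypothetical re-implementation of the described semantics (base R only).
reorderColumnsSketch <- function(df, vars) {
  vars <- vars[names(vars) %in% names(df)]   # ignore invalid columns
  result <- character(ncol(df))              # empty slots, one per column
  result[vars] <- names(vars)                # pin moved columns at their new positions
  others <- setdiff(names(df), names(vars))  # remaining columns, original order
  result[result == ""] <- others             # slot them into the gaps
  df[, result, drop = FALSE]
}

df <- data.frame(Time = c(1, 2), In = c(2, 3), Out = c(3, 4), Files = c(4, 5))
out <- reorderColumnsSketch(df, c("In" = 3, "Time" = 4, "Missing" = 1))
print(names(out))  # "Out" "Files" "In" "Time"
```

The unknown column "Missing" is dropped before any positions are assigned, so it does not reserve slot 1.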
Replace a model in a deployment with another model.
ReplaceDeployedModel(
  deploymentId,
  newModelId,
  replacementReason,
  maxWait = 600
)
deploymentId |
character. The ID of the deployment. |
newModelId |
character. The ID of the model to use in the deployment. This model will replace the old model. You can also pass a dataRobotModel object. |
replacementReason |
character. Optional. The reason for replacing the deployment.
See |
maxWait |
integer. How long to wait (in seconds) for the computation to complete before returning a timeout error? (Default 600 seconds) |
A DataRobotDeployment object containing:
id character. The ID of the deployment.
label character. The label of the deployment.
description character. The description of the deployment.
defaultPredictionServer list. Information on the default prediction server connected with the deployment. See ListPredictionServers for details.
model dataRobotModel. The model associated with the deployment. See GetModel for details.
capabilities list. Information on the capabilities of the deployment.
predictionUsage list. Information on the prediction usage of the deployment.
permissions list. User's permissions on the deployment.
serviceHealth list. Information on the service health of the deployment.
modelHealth list. Information on the model health of the deployment.
accuracyHealth list. Information on the accuracy health of the deployment.
## Not run:
deploymentId <- "5e319d2e422fbd6b58a5edad"
newModelId <- "5996f820af07fc605e81ead4"
ReplaceDeployedModel(deploymentId, newModelId, ModelReplacementReason$Other)
## End(Not run)
This function will create several rulesets that approximate the specified model. The code used in the approximation can be downloaded to be run locally. Currently, only Python and Java downloadable code is available.
RequestApproximation(project, modelId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
The general workflow for creating and downloading Prime code may look like the following:
RequestApproximation - create several rulesets that approximate the specified model
GetRulesets - list all rulesets created for the parent model
RequestPrimeModel - create a Prime model for the specified ruleset (use one of the rulesets returned by GetRulesets)
GetPrimeModelFromJobId - get the PrimeModelId using the JobId returned by RequestPrimeModel
CreatePrimeCode - create code for one of the available Prime models
GetPrimeFileFromJobId - get the PrimeFileId using the JobId returned by CreatePrimeCode
DownloadPrimeCode - download the specified Prime code file
job Id
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
RequestApproximation(projectId, modelId)
## End(Not run)
This function requests the creation of a blend of several models in the specified DataRobot project. It also allows the user to specify the method used for blending. The function returns an integer modelJobId value, which can be passed to GetBlenderModelFromJobId to retrieve the full blender model object.
RequestBlender(project, modelsToBlend, blendMethod)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelsToBlend |
character. Vector listing the model Ids to be blended. |
blendMethod |
character. Parameter specifying blending method. See acceptable values within BlendMethods. |
An integer value that can be used as the modelJobId parameter in subsequent calls to the GetBlenderModelFromJobId function.
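A minimal sketch of the request-then-retrieve pattern described here. BlendAndFetch is a hypothetical helper name; it assumes GetBlenderModelFromJobId takes the project and the modelJobId as its first two arguments, as the description above suggests.

```r
# Hypothetical helper (not part of the datarobot package).
# Requests a blender and then retrieves the finished blender model object.
BlendAndFetch <- function(projectId, modelsToBlend, blendMethod) {
  # Returns an integer modelJobId for the queued blender job.
  modelJobId <- datarobot::RequestBlender(projectId, modelsToBlend, blendMethod)

  # Retrieves the full blender model object for that job.
  datarobot::GetBlenderModelFromJobId(projectId, modelJobId)
}
```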
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelsToBlend <- c("5996f820af07fc605e81ead4", "59a5ce3301e9f0296721c64c")
RequestBlender(projectId, modelsToBlend, "GLM")
## End(Not run)
Call this function to request the project be formatted as a cross series project with a group by column.
RequestCrossSeriesDetection( project, dateColumn, multiseriesIdColumns = NULL, crossSeriesGroupByColumns = NULL, maxWait = 600 )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
dateColumn |
character. The name of the column containing the date that defines the time series. |
multiseriesIdColumns |
character. Optional. The Series ID to demarcate the series. If not specified, DataRobot will attempt to automatically infer the series ID. |
crossSeriesGroupByColumns |
character. Optional. Column to split a cross series into
further groups. For example, if every series is sales of an individual product, the cross
series group could be the product category, with values like "men's clothing", "sports
equipment", etc. Requires multiseries with |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
Note that this function no longer needs to be called directly, but is called indirectly as a part of SetTarget (which itself is called indirectly as part of StartProject) when you pass a crossSeriesGroupByColumn using CreateDatetimePartitionSpecification.
A named list which contains:
timeSeriesEligible logical. Whether or not the series is eligible to be used for time series.
crossSeriesEligible logical. Whether or not the cross series group by column is eligible for cross-series modeling. Will be NULL if no cross series group by column is used.
crossSeriesEligibilityReason character. The type of cross series eligibility (or ineligibility).
timeUnit character. For time series eligible features, the time unit covered by a single time step, e.g. "HOUR", or NULL for features that are not time series eligible.
timeStep integer. Expected difference in time units between rows in the data. Will be NULL for features that are not time series eligible.
Other MultiSeriesProject functions: GetMultiSeriesProperties(), RequestMultiSeriesDetection(), as.dataRobotMultiSeriesProperties()
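As an illustration of reading the named list returned by RequestCrossSeriesDetection, the sketch below turns the eligibility fields into a short message. DescribeCrossSeriesResult is a hypothetical helper, and mockResult is made-up data standing in for a real API response.

```r
# Hypothetical helper (not part of the datarobot package).
# Summarises the eligibility fields of a detection result.
DescribeCrossSeriesResult <- function(properties) {
  if (!isTRUE(properties$timeSeriesEligible)) {
    return("not eligible for time series")
  }
  # crossSeriesEligible is NULL when no cross series group by column is used.
  if (is.null(properties$crossSeriesEligible)) {
    return("time series eligible; no cross series group by column used")
  }
  if (isTRUE(properties$crossSeriesEligible)) {
    return("eligible for cross-series modeling")
  }
  paste("not eligible for cross-series modeling:",
        properties$crossSeriesEligibilityReason)
}

# Mock result with invented values, for illustration only.
mockResult <- list(timeSeriesEligible = TRUE,
                   crossSeriesEligible = TRUE,
                   crossSeriesEligibilityReason = "mock reason",
                   timeUnit = "DAY",
                   timeStep = 1L)
print(DescribeCrossSeriesResult(mockResult))
```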
This adds a Feature Impact job to the project queue.
RequestFeatureImpact(model, rowCount = NULL)
model |
character. The model for which you want to compute Feature Impact, e.g.
from the list of models returned by |
rowCount |
numeric. The sample size to use for Feature Impact computation. It is possible to re-compute Feature Impact with a different row count. |
A job ID (character)
## Not run:
model <- ListModels(project)[[1]]
featureImpactJobId <- RequestFeatureImpact(model)
featureImpact <- GetFeatureImpactForJobId(project, featureImpactJobId)
## End(Not run)
Requires that this model belongs to a datetime partitioned project. If it does not, an error will occur when submitting the job.
RequestFrozenDatetimeModel( model, trainingRowCount = NULL, trainingDuration = NULL, trainingStartDate = NULL, trainingEndDate = NULL, timeWindowSamplePct = NULL )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
trainingRowCount |
integer. (optional) the number of rows of data that should be used to train the model. |
trainingDuration |
character. (optional) A duration string specifying what time range the data used to train the model should span. |
trainingStartDate |
character. (optional) The start date of the data to train the model on (" be used. |
trainingEndDate |
character. (optional) The end date of the data to train the model on (" will be used. |
timeWindowSamplePct |
integer. (optional) May only be specified when the requested model is a time window (e.g. duration or start and end dates). An integer between 1 and 99 indicating the percentage to sample by within the window. The points kept are determined by a random uniform sample. |
Frozen models use the same tuning parameters as their parent model instead of independently optimizing them to allow efficiently retraining models on larger amounts of the training data.
In addition to trainingRowCount and trainingDuration, frozen datetime models may be trained on an exact date range. Only one of trainingRowCount, trainingDuration, or trainingStartDate and trainingEndDate should be specified. Models specified using trainingStartDate and trainingEndDate are the only ones that can be trained into the holdout data (once the holdout is unlocked).
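The "only one of" rule can be checked before submitting a job. The sketch below is a hypothetical pre-check, not something the package provides. It assumes that trainingStartDate and trainingEndDate must be supplied as a pair, and that leaving everything NULL is allowed because all four arguments default to NULL.

```r
# Hypothetical pre-check (not part of the datarobot package).
CheckFrozenDatetimeArgs <- function(trainingRowCount = NULL,
                                    trainingDuration = NULL,
                                    trainingStartDate = NULL,
                                    trainingEndDate = NULL) {
  hasStart <- !is.null(trainingStartDate)
  hasEnd <- !is.null(trainingEndDate)
  # Assumption: the two dates only make sense together.
  if (hasStart != hasEnd) {
    stop("trainingStartDate and trainingEndDate must be given together.")
  }
  # Count how many of the three alternatives were supplied.
  nSpecified <- sum(!is.null(trainingRowCount),
                    !is.null(trainingDuration),
                    hasStart && hasEnd)
  if (nSpecified > 1) {
    stop("Specify only one of trainingRowCount, trainingDuration, ",
         "or trainingStartDate with trainingEndDate.")
  }
  invisible(TRUE)
}

CheckFrozenDatetimeArgs(trainingRowCount = 1000L)
```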
An integer value that can be used as the modelJobId parameter in subsequent calls to the GetDatetimeModelFromJobId function.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetDatetimeModel(projectId, modelId)
RequestFrozenDatetimeModel(model)
## End(Not run)
Frozen models use the same tuning parameters as their parent model instead of independently optimizing them to allow efficiently retraining models on larger amounts of the training data.
RequestFrozenModel(model, samplePct = NULL, trainingRowCount = NULL)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
samplePct |
Numeric, specifying the percentage of the training dataset to be used in building the new model |
trainingRowCount |
integer. The number of rows to use to train the requested model. |
Either 'samplePct' or 'trainingRowCount' can be used to specify the amount of data to use, but not both. If neither is specified, the default is the maximum amount of data that can safely be used to train any blueprint without going into the validation data. In smart-sampled projects, 'samplePct' and 'trainingRowCount' are assumed to be in terms of rows of the minority class.
Note: For datetime partitioned projects, use 'RequestFrozenDatetimeModel' instead.
An integer value that can be used as the modelJobId parameter in subsequent calls to the GetModelFromJobId function.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
RequestFrozenModel(model, samplePct = 10)
## End(Not run)
Call this function to request the project be formatted as a multiseries project, with the dateColumn specifying the time series.
RequestMultiSeriesDetection( project, dateColumn, multiseriesIdColumns = NULL, maxWait = 600 )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
dateColumn |
character. The name of the column containing the date that defines the time series. |
multiseriesIdColumns |
character. Optional. The Series ID to demarcate the series. If not specified, DataRobot will attempt to automatically infer the series ID. |
maxWait |
integer. The maximum time (in seconds) to wait for the model job to complete. |
Note that as of v2.13 this function no longer needs to be called directly, but is called indirectly as a part of SetTarget (which itself is called indirectly as part of StartProject) when you pass a multiseries partition using CreateDatetimePartitionSpecification.
A named list which contains:
timeSeriesEligible logical. Whether or not the series is eligible to be used for time series.
crossSeriesEligible logical. Whether or not the cross series group by column is eligible for cross-series modeling. Will be NULL if no cross series group by column is used.
crossSeriesEligibilityReason character. The type of cross series eligibility (or ineligibility).
timeUnit character. For time series eligible features, the time unit covered by a single time step, e.g. "HOUR", or NULL for features that are not time series eligible.
timeStep integer. Expected difference in time units between rows in the data. Will be NULL for features that are not time series eligible.
Other MultiSeriesProject functions: GetMultiSeriesProperties(), RequestCrossSeriesDetection(), as.dataRobotMultiSeriesProperties()
This function requests the creation of a new datetime model in the DataRobot modeling project defined by the project parameter. The function also allows the user to override the project default featurelist and to control the amount of training data through trainingRowCount, trainingDuration, and timeWindowSamplePct. This function returns an integer modelJobId value, which can be used by the GetDatetimeModelFromJobId function to return the full model object.
RequestNewDatetimeModel( project, blueprint, featurelist = NULL, trainingRowCount = NULL, trainingDuration = NULL, timeWindowSamplePct = NULL, monotonicIncreasingFeaturelistId = NULL, monotonicDecreasingFeaturelistId = NULL )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
blueprint |
list. A list with at least the following two elements: blueprintId and projectId. Note that the individual elements of the list returned by ListBlueprints are admissible values for this parameter. |
featurelist |
list. A list that contains the element featurelistId that specifies the featurelist to be used in building the model; if not specified (i.e., for the default value NULL), the project default (Informative Features) is used. |
trainingRowCount |
integer. Optional, the number of rows of data that should be used to train the model. If specified, trainingDuration may not be specified. |
trainingDuration |
character. Optional. A duration string specifying what time range the data used to train the model should span. If specified, trainingRowCount may not be specified. |
timeWindowSamplePct |
integer. Optional. May only be specified when the requested model is a time window (e.g. duration or start and end dates). An integer between 1 and 99 indicating the percentage to sample by within the window. The points kept are determined by a random uniform sample. |
monotonicIncreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically increasing relationship to the
target. If |
monotonicDecreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically decreasing relationship to the
target. If |
Motivation for this function is the fact that some models - e.g., very complex machine learning models fit to large datasets - may take a long time to complete. Splitting the model creation request from model retrieval in these cases allows the user to perform other interactive R session tasks between the time the model creation/update request is made and the time the final model is available.
An integer value that can be used as the modelJobId parameter in subsequent calls to the GetDatetimeModelFromJobId function.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
blueprints <- ListBlueprints(projectId)
blueprint <- blueprints[[1]]
RequestNewDatetimeModel(projectId, blueprint)
## End(Not run)
This function requests the creation of a new model in the DataRobot modeling project defined by the project parameter. The function also allows the user to specify alternatives to the project default for featurelist, samplePct, and scoringType. This function returns an integer modelJobId value, which can be used by the GetModelFromJobId function to return the full model object.
RequestNewModel( project, blueprint, featurelist = NULL, samplePct = NULL, trainingRowCount = NULL, scoringType = NULL, monotonicIncreasingFeaturelistId = NULL, monotonicDecreasingFeaturelistId = NULL )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
blueprint |
list. A list with at least the following two elements: blueprintId and projectId. Note that the individual elements of the list returned by ListBlueprints are admissible values for this parameter. |
featurelist |
list. A list that contains the element featurelistId that specifies the featurelist to be used in building the model; if not specified (i.e., for the default value NULL), the project default (Informative Features) is used. |
samplePct |
numeric. The percentage of the training dataset to be used in building the new model; if not specified (i.e., for the default value NULL), the maxTrainPct value for the project is used. Value should be between 0 and 100. |
trainingRowCount |
integer. The number of rows to use to train the requested model. |
scoringType |
character. String specifying the scoring type; default is validation set scoring, but cross-validation averaging is also possible. |
monotonicIncreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically increasing relationship to the
target. If |
monotonicDecreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically decreasing relationship to the
target. If |
Motivation for this function is the fact that some models - e.g., very complex machine learning models fit to large datasets - may take a long time to complete. Splitting the model creation request from model retrieval in these cases allows the user to perform other interactive R session tasks between the time the model creation/update request is made and the time the final model is available.
Either 'samplePct' or 'trainingRowCount' can be used to specify the amount of data to use, but not both. If neither is specified, the default is the maximum amount of data that can safely be used to train any blueprint without going into the validation data. In smart-sampled projects, 'samplePct' and 'trainingRowCount' are assumed to be in terms of rows of the minority class.
Note: For datetime partitioned projects, use RequestNewDatetimeModel instead.
An integer value that can be used as the modelJobId parameter in subsequent calls to the GetModelFromJobId function.
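The split between requesting a model and retrieving it can be sketched as follows. RequestAndFetchModel is a hypothetical wrapper; it checks the "not both" rule for samplePct and trainingRowCount locally before calling the API, then passes the returned modelJobId to GetModelFromJobId.

```r
# Hypothetical wrapper (not part of the datarobot package).
RequestAndFetchModel <- function(projectId, blueprint,
                                 samplePct = NULL, trainingRowCount = NULL) {
  # Local check of the documented rule: one or the other, never both.
  if (!is.null(samplePct) && !is.null(trainingRowCount)) {
    stop("Specify samplePct or trainingRowCount, not both.")
  }

  # Queue the model; this returns quickly with an integer modelJobId.
  modelJobId <- datarobot::RequestNewModel(projectId, blueprint,
                                           samplePct = samplePct,
                                           trainingRowCount = trainingRowCount)

  # Other interactive work could happen here while the model trains.

  # Retrieve the finished model object for the job.
  datarobot::GetModelFromJobId(projectId, modelJobId)
}
```

Keeping the job ID instead of fetching the model immediately is what lets a session continue with other tasks while a long-running model trains.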
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
blueprints <- ListBlueprints(projectId)
blueprint <- blueprints[[1]]
RequestNewModel(projectId, blueprint)
## End(Not run)
Create a new model from a rating table.
RequestNewRatingTableModel(project, ratingTableId)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
ratingTableId |
character. The ID of the rating table. |
An integer value that can be used as the modelJobId parameter in subsequent calls to the GetModelFromJobId function.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ratingTableId <- "5984b4d7100d2b31c1166529"
RequestNewRatingTableModel(projectId, ratingTableId)
## End(Not run)
In order to create PredictionExplanations for a particular model and dataset, you must first:
Compute feature impact for the model via RequestFeatureImpact()
Compute a PredictionExplanationsInitialization for the model via RequestPredictionExplanationsInitialization()
Compute predictions for the model and dataset via RequestPredictions()
After prediction explanations are requested, information about them can be accessed using the functions GetPredictionExplanationsMetadataFromJobId and GetPredictionExplanationsMetadata. Prediction explanations themselves can be accessed using the functions GetPredictionExplanationsRows, GetPredictionExplanationsRowsAsDataFrame, and DownloadPredictionExplanations.
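The three prerequisites and the final request can be chained in one helper. This is a minimal sketch under stated assumptions: RequestExplanationsWithPrerequisites is a hypothetical name, none of the prerequisites has been computed yet for this model and dataset, and the model object carries a modelId element.

```r
# Hypothetical helper (not part of the datarobot package).
RequestExplanationsWithPrerequisites <- function(projectId, model, datasetId) {
  # 1. Feature impact for the model.
  featureImpactJobId <- datarobot::RequestFeatureImpact(model)
  datarobot::WaitForJobToComplete(projectId, featureImpactJobId)

  # 2. Prediction explanations initialization for the model.
  initJobId <- datarobot::RequestPredictionExplanationsInitialization(model)
  datarobot::WaitForJobToComplete(projectId, initJobId)

  # 3. Predictions for the model and dataset; GetPredictions is used here
  #    only to make sure the predictions have finished computing.
  predictJobId <- datarobot::RequestPredictions(projectId, model$modelId, datasetId)
  datarobot::GetPredictions(projectId, predictJobId)

  # 4. The prediction explanations request itself; returns a job ID.
  datarobot::RequestPredictionExplanations(model, datasetId)
}
```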
RequestPredictionExplanations( model, datasetId, maxExplanations = NULL, thresholdLow = NULL, thresholdHigh = NULL )
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
datasetId |
character. ID of the prediction dataset for which prediction explanations are requested. |
maxExplanations |
integer. Optional. The maximum number of prediction explanations to supply per row of the dataset, default: 3. |
thresholdLow |
numeric. Optional. The lower threshold, below which a prediction must
score in order for prediction explanations to be computed for a row in the dataset. If
neither |
thresholdHigh |
numeric. Optional. The high threshold, above which a prediction must score
in order for prediction explanations to be computed. If neither |
thresholdHigh and thresholdLow are optional filters applied to speed up computation. When at least one is specified, only the selected outlier rows will have prediction explanations computed. Rows are considered to be outliers if their predicted value (in case of regression projects) or probability of being the positive class (in case of classification projects) is less than thresholdLow or greater than thresholdHigh. If neither is specified, prediction explanations will be computed for all rows.
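The filtering rule for thresholdLow and thresholdHigh can be illustrated locally. The sketch below only mirrors the rule as described; SelectRowsForExplanations is a hypothetical function, and the real selection happens on the DataRobot side.

```r
# Hypothetical illustration of the outlier rule (not part of the datarobot package).
# Returns TRUE for rows that would have prediction explanations computed.
SelectRowsForExplanations <- function(predictions,
                                      thresholdLow = NULL,
                                      thresholdHigh = NULL) {
  # With no thresholds, every row is selected.
  if (is.null(thresholdLow) && is.null(thresholdHigh)) {
    return(rep(TRUE, length(predictions)))
  }
  belowLow <- if (is.null(thresholdLow)) FALSE else predictions < thresholdLow
  aboveHigh <- if (is.null(thresholdHigh)) FALSE else predictions > thresholdHigh
  belowLow | aboveHigh
}

predicted <- c(0.05, 0.40, 0.75, 0.95)
SelectRowsForExplanations(predicted, thresholdLow = 0.1, thresholdHigh = 0.9)
# TRUE FALSE FALSE TRUE
```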
job Id
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
datasets <- ListPredictionDatasets(projectId)
dataset <- datasets[[1]]
datasetId <- dataset$id
model <- GetModel(projectId, modelId)
RequestPredictionExplanations(model, datasetId)
## End(Not run)
Prediction explanations initializations are a prerequisite for computing prediction explanations, and include a sample of what the computed prediction explanations for a prediction dataset would look like.
RequestPredictionExplanationsInitialization(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
job Id
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
RequestPredictionExplanationsInitialization(model)
## End(Not run)
Prediction intervals can now be returned for predictions with datetime models. Use 'includePredictionIntervals = TRUE' in calls to Predict or RequestPredictions.
For each model, prediction intervals estimate the range of values DataRobot expects actual values of the target to fall within. They are similar to a confidence interval of a prediction, but are based on the residual errors measured during the backtesting for the selected model.
RequestPredictions( project, modelId, datasetId, includePredictionIntervals = NULL, predictionIntervalsSize = NULL )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. The ID of the model to use to make predictions. |
datasetId |
numeric. The ID of the dataset to make predictions against (as uploaded from
|
includePredictionIntervals |
logical. Optional. Should prediction interval bounds be part of predictions? Only available for time series projects. See "Details" for more info. |
predictionIntervalsSize |
numeric. Optional. Size of the prediction intervals, in percent. Only available for time series projects. See "Details" for more info. |
predictJobId to be used by the GetPredictions function to retrieve the model predictions.
## Not run:
dataset <- UploadPredictionDataset(project, diamonds_small)
model <- ListModels(project)[[1]]
modelId <- model$modelId
predictJobId <- RequestPredictions(project, modelId, dataset$id)
predictions <- GetPredictions(project, predictJobId)
# Or, if prediction intervals are desired (datetime only)
# datetimeDataset is a placeholder for a prediction dataset uploaded to datetimeProject
predictJobId <- RequestPredictions(datetimeProject,
                                   DatetimeModelId,
                                   datetimeDataset$id,
                                   includePredictionIntervals = TRUE,
                                   predictionIntervalsSize = 100)
predictions <- GetPredictions(datetimeProject, predictJobId, type = "raw")
## End(Not run)
Training a model using a ruleset is a necessary prerequisite for being able to download the code for a ruleset.
RequestPrimeModel(project, ruleset)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
ruleset |
list. A list specifying ruleset parameters (see GetRulesets) |
job Id
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
rulesets <- GetRulesets(projectId, modelId)
ruleset <- rulesets[[1]]
RequestPrimeModel(projectId, ruleset)
## End(Not run)
This function requests a refit of the model defined by the model parameter to the same training dataset used in building it originally, but with a different fraction of the data, specified by the samplePct parameter. The function returns an integer value that may be used with the function GetModelFromJobId to retrieve the model after fitting is complete.
RequestSampleSizeUpdate(model, samplePct = NULL, trainingRowCount = NULL)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
samplePct |
Numeric, specifying the percentage of the training dataset to be used in building the new model. |
trainingRowCount |
integer. The number of rows to use to train the requested model. |
Motivation for this function is the fact that some models - e.g., very complex machine learning models fit to large datasets - may take a long time to complete. Splitting the model creation request from model retrieval in these cases allows the user to perform other interactive R session tasks between the time the model creation/update request is made and the time the final model is available.
Either 'samplePct' or 'trainingRowCount' can be used to specify the amount of data to use, but not both. If neither is specified, the default is the maximum amount of data that can safely be used to train any blueprint without going into the validation data. In smart-sampled projects, 'samplePct' and 'trainingRowCount' are assumed to be in terms of rows of the minority class.
Integer, value to be used as the modelJobId parameter in calling the function GetModelFromJobId to retrieve the updated model.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
RequestSampleSizeUpdate(model, samplePct = 100)
## End(Not run)
Note that you can call GetSeriesAccuracy without calling this function, and the series accuracy will be requested automatically.
RequestSeriesAccuracy(model)
model |
character. The model for which you want to compute series accuracy, e.g.
from the list of models returned by |
Job ID for the async job associated with the computation.
## Not run:
projectId <- "5984b4d7100d2b31c1166529"
modelId <- "5984b4d7100d2b31c1166529"
model <- GetModel(projectId, modelId)
jobId <- RequestSeriesAccuracy(model)
WaitForJobToComplete(projectId, jobId)
## End(Not run)
Request training predictions for a specific model.
RequestTrainingPredictions(model, dataSubset)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
dataSubset |
character. What data subset would you like to predict on?
Possible options are included in
|
job Id
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
RequestTrainingPredictions(model, dataSubset = DataSubset$All)
## End(Not run)
Requests generation of a transferable model file for use in an on-premise DataRobot standalone prediction environment. This function can only be used if model export is enabled, and will only be useful if you have an on-premise environment in which to import it.
RequestTransferableModel(project, modelId, predictionIntervalSize = NULL)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
modelId |
character. Unique alphanumeric identifier for the model of interest. |
predictionIntervalSize |
integer. Optional. Added in 2.19. For supervised time series projects, this is the desired prediction interval size for the exported model. A prediction interval is the range of values DataRobot expects actual values of the target to fall within. The size must be between 0 and 100 (inclusive). |
This function does not download the exported file. Use DownloadTransferableModel for that.
jobId
Other Transferable Model functions: DeleteTransferableModel(), DownloadTransferableModel(), GetTransferableModel(), ListTransferableModels(), UpdateTransferableModel(), UploadTransferableModel()
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
jobId <- RequestTransferableModel(projectId, modelId, 50)
WaitForJobToComplete(projectId, jobId)
file <- file.path(tempdir(), "model.drmodel")
DownloadTransferableModel(projectId, modelId, file)
## End(Not run)
The DataRobot API returns dates in RFC 3339 format. Since this comes from a Python datetime object, we assume that the period returned is in the format "
RFC3339DateTimeFormat
An object of class character of length 1.
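As a rough illustration of handling RFC 3339 timestamps in base R, the sketch below parses a UTC timestamp with fractional seconds. ParseTimestampSketch and its format string are assumptions made for this example; they are not the value of RFC3339DateTimeFormat, and the package's own parseRFC3339Timestamp() remains the supported route.

```r
# Hypothetical base-R helper (not part of the datarobot package).
# Assumes a UTC timestamp ending in a literal "Z"; %OS reads seconds
# including any fractional part.
ParseTimestampSketch <- function(timestamp) {
  as.POSIXct(timestamp, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")
}

parsed <- ParseTimestampSketch("2020-01-31T12:34:56.123456Z")
format(parsed, "%Y-%m-%d %H:%M:%S")
```

Input that does not match the assumed format yields NA rather than an error, so callers may want to check the result with is.na().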
Other API datetime functions: formatRFC3339Timestamp(), parseRFC3339Timestamp(), transformRFC3339Period(), validateReportingPeriodTime()
The advanced tuning feature allows you to manually set model parameters and override the DataRobot default selections. It is generally available for Eureqa models. To use this feature with other model types, contact your CFDS for more information.
RunInteractiveTuning(model)
model |
dataRobotModel. A DataRobot model object to get tuning parameters for. |
This function runs an interactive session that walks you through the individual arguments for each tunable hyperparameter, presenting the defaults and other available information. You can set each parameter one at a time, skipping the ones you don't intend to set. At the end, it returns a job ID that can be used to get the tuned model.
Note that sometimes you may see the exact same parameter more than once. These are for different parts of the blueprint that use the same parameter (e.g., one hot encoding for text and then one hot encoding for numeric). They are listed in the order they are found in the blueprint but unfortunately more user-facing information cannot be provided.
A job ID that can be used to get the tuned model.
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
myXGBModel <- GetModel(projectId, modelId)
tuningJob <- RunInteractiveTuning(myXGBModel)
tunedModel <- GetModelFromJobId(projectId, tuningJob)
## End(Not run)
Some backtests may be unavailable if the model is trained into their validation data.
ScoreBacktests(model, wait = FALSE)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
wait |
logical. If TRUE, wait until job completion. |
Job ID of the pending job if wait is FALSE. Use WaitForJobToComplete to await job completion. If wait is TRUE, the function waits until completion and returns NULL. Upon completion, all available backtests will have scores computed.
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ScoreBacktests(model)
## End(Not run)
Added in DataRobot API 2.20.
SegmentAnalysisAttribute
An object of class list of length 3.
For usage, see GetDeploymentServiceStats.
For details, see "Calculating features across series" in the time series section of the DataRobot user guide.
SeriesAggregationType
An object of class list of length 2.
The prediction threshold is used by a binary classification model when deciding between the positive and negative class.
SetPredictionThreshold(model, threshold)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
threshold |
numeric. The threshold to use when deciding between the positive and negative class. Should be between 0 and 1 inclusive. |
Note: This feature can only be used when PredictionThresholdReadOnly is FALSE. Models typically cannot have their prediction threshold modified if they have been used to set a deployment or if predictions have been made with the dedicated prediction API.
Returns NULL but updates the model in place.
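To make the role of the threshold concrete, here is a minimal local sketch of how a threshold turns predicted probabilities into class labels. The probabilities are made up, and the rule that a probability at or above the threshold counts as positive is an assumption for illustration; it does not call DataRobot.

```r
# Hypothetical predicted probabilities for the positive class.
positiveProbability <- c(0.15, 0.59, 0.60, 0.92)
threshold <- 0.6

# Assumption: a probability at or above the threshold counts as positive.
predictedClass <- ifelse(positiveProbability >= threshold, "positive", "negative")
print(predictedClass)
```

Raising the threshold makes the positive label rarer; lowering it makes it more common.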
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
SetPredictionThreshold(model, threshold = 0.6)
## End(Not run)
This function sets the target variable for the project defined by project, starting the process of building models to predict the response variable target. Both of these parameters - project and target - are required and they are sufficient to start a modeling project with DataRobot default specifications for the other optional parameters.
SetTarget(
  project,
  target,
  metric = NULL,
  weights = NULL,
  partition = NULL,
  mode = AutopilotMode$Quick,
  seed = NULL,
  targetType = NULL,
  positiveClass = NULL,
  blueprintThreshold = NULL,
  responseCap = NULL,
  featurelistId = NULL,
  smartDownsampled = NULL,
  majorityDownsamplingRate = NULL,
  accuracyOptimizedBlueprints = NULL,
  offset = NULL,
  exposure = NULL,
  eventsCount = NULL,
  monotonicIncreasingFeaturelistId = NULL,
  monotonicDecreasingFeaturelistId = NULL,
  onlyIncludeMonotonicBlueprints = FALSE,
  maxWait = 600
)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
target |
character. String giving the name of the response variable to be predicted by all project models. |
metric |
character. Optional. String specifying the model fitting metric to be optimized; a list of valid options for this parameter, which depends on both project and target, may be obtained with the function GetValidMetrics. |
weights |
character. Optional. String specifying the name of the column from the modeling dataset to be used as weights in model fitting. |
partition |
partition. Optional. S3 object of class 'partition' whose elements specify a valid partitioning scheme. See help for functions CreateGroupPartition, CreateRandomPartition, CreateStratifiedPartition, CreateUserPartition and CreateDatetimePartitionSpecification. |
mode |
character. Optional. Specifies the autopilot mode used to start the
modeling project; See |
seed |
integer. Optional. Seed for the random number generator used in creating random partitions for model fitting. |
targetType |
character. Optional. Used to specify the targetType to use for a project.
Valid options are "Binary", "Multiclass", "Regression". Set to "Multiclass" to enable
multiclass modeling. Otherwise, it can help to disambiguate, i.e. telling DataRobot how to
handle a numeric target with a few unique values that could be used for either multiclass
or regression. See |
positiveClass |
character. Optional. Target variable value corresponding to a positive response in binary classification problems. |
blueprintThreshold |
integer. Optional. The maximum time (in hours) that any modeling blueprint is allowed to run before being excluded from subsequent autopilot stages. |
responseCap |
numeric. Optional. Floating point value, between 0.5 and 1.0, specifying a capping limit for the response variable. The default value NULL corresponds to an uncapped response, equivalent to responseCap = 1.0. |
featurelistId |
numeric. Specifies which feature list to use. If NULL (default), a default featurelist is used. |
smartDownsampled |
logical. Optional. Whether to use smart downsampling to throw away excess rows of the majority class. Only applicable to classification and zero-boosted regression projects. |
majorityDownsamplingRate |
numeric. Optional. Floating point value, between 0.0 and 100.0. The percentage of the majority rows that should be kept. Specify only if using smart downsampling. May not cause the majority class to become smaller than the minority class. |
accuracyOptimizedBlueprints |
logical. Optional. When enabled, accuracy optimized blueprints will run in autopilot for the project. These are longer-running model blueprints that provide increased accuracy over normal blueprints that run during autopilot. |
offset |
character. Optional. Vector of the names of the columns containing the offset of each row. |
exposure |
character. Optional. The name of a column containing the exposure of each row. |
eventsCount |
character. Optional. The name of a column specifying the events count. |
monotonicIncreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically increasing relationship to the
target. If |
monotonicDecreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically decreasing relationship to the
target. If |
onlyIncludeMonotonicBlueprints |
logical. Optional. When TRUE, only blueprints that support enforcing monotonic constraints will be available in the project or selected for the autopilot. |
maxWait |
integer. Specifies how many seconds to wait for the server to finish analyzing the target and begin the modeling process. If the process takes longer than this parameter specifies, execution will stop (but the server will continue to process the request). |
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
SetTarget(projectId, "targetFeature")
SetTarget(projectId, "targetFeature", metric = "LogLoss")
SetTarget(projectId, "targetFeature", mode = AutopilotMode$Manual)
SetTarget(projectId, "targetFeature", targetType = TargetType$Multiclass)
## End(Not run)
This function uploads a modeling dataset defined by the dataSource parameter and allows specification of the optional project name projectName. The dataSource parameter can be either the name of a CSV file or a dataframe; in the latter case, it is saved as a CSV file whose name is described in the Details section. This function returns the projectName specified in the calling sequence, the unique alphanumeric identifier projectId for the new project, the name of the modeling dataset uploaded to create this project, and the project creation time and date.
SetupProject(dataSource, projectName = NULL, maxWait = 60 * 60)
dataSource |
object. Either (a) the name of a CSV file, (b) a dataframe, or (c) a URL to a publicly available file; in each case, this parameter identifies the source of the data from which all project models will be built. See Details. |
projectName |
character. Optional. String specifying a project name. |
maxWait |
integer. The maximum time to wait for each of two steps: (1) The initial project creation request, and (2) data processing that occurs after receiving the response to this initial request. |
The DataRobot modeling engine requires a CSV file containing the data to be used in fitting models, and this has been implemented here in two ways. The first and simpler is to specify dataSource as the name of this CSV file, but for the convenience of those who wish to work with dataframes, this function also provides the option of specifying a dataframe, which is then written to a CSV file and uploaded to the DataRobot server. In this case, the file name is either specified directly by the user through the saveFile parameter, or indirectly from the name of the dataSource dataframe if saveFile = NULL (the default). In this second case, the file name consists of the name of the dataSource dataframe with the string csvExtension appended.
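The file-naming rule described above can be sketched in plain R. The helper SaveFrameAsCsv is hypothetical and only mimics the described behaviour locally; in particular, the ".csv" default standing in for csvExtension is an assumption, and nothing is uploaded.

```r
# Hypothetical helper: write a dataframe to a CSV file named after the
# variable that holds it. ".csv" stands in for csvExtension (an assumption).
SaveFrameAsCsv <- function(dataSource, saveFile = NULL, csvExtension = ".csv") {
  if (is.null(saveFile)) {
    # Derive the file name from the expression passed as dataSource.
    saveFile <- paste0(deparse(substitute(dataSource)), csvExtension)
  }
  path <- file.path(tempdir(), saveFile)
  write.csv(dataSource, path, row.names = FALSE)
  path
}

csvPath <- SaveFrameAsCsv(iris)
print(basename(csvPath))
```

Passing saveFile explicitly bypasses the name derivation, which is useful when the dataframe comes from an expression rather than a named variable.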
A named list that contains:
projectName character. The name assigned to the DataRobot project
projectId character. The unique alphanumeric project identifier for this DataRobot project
fileName character. The name of the CSV modeling file uploaded for this project
created character. The time and date of project creation
## Not run: 
SetupProject(iris, "dr-iris")
## End(Not run)
Create a project from a data source.
SetupProjectFromDataSource( dataSourceId, username, password, projectName = NULL, maxWait = 60 * 60 )
dataSourceId |
character. The ID of the data source to create a project from. |
username |
character. The username to use for authentication to the database. |
password |
character. The password to use for authentication to the database. The password is encrypted at server side and never saved or stored. |
projectName |
character. Optional. String specifying a project name. |
maxWait |
integer. The maximum time to wait for each of two steps: (1) The initial project creation request, and (2) data processing that occurs after receiving the response to this initial request. |
A named list that contains:
projectName character. The name assigned to the DataRobot project
projectId character. The unique alphanumeric project identifier for this DataRobot project
fileName character. The name of the CSV modeling file uploaded for this project
created character. The time and date of project creation
## Not run: 
dataSourceId <- "5c1303269300d900016b41a7"
SetupProjectFromDataSource(dataSourceId,
                           username = "username",
                           password = "hunter1",
                           projectName = "My Project")
## End(Not run)
This function is deprecated. Use SetupProjectFromDataSource instead.
SetupProjectFromHDFS(url, port = NULL, projectName = NULL, maxWait = 60 * 60)
url |
character. The location of the WebHDFS file, both server and full path. Per the DataRobot specification, must begin with hdfs:// |
port |
integer. Optional. The port to use. If not specified, will default to the server default (50070). |
projectName |
character. Optional. String specifying a project name. |
maxWait |
integer. The maximum time to wait for each of two steps: (1) The initial project creation request, and (2) data processing that occurs after receiving the response to this initial request. |
This function returns the projectName specified in the calling sequence, the unique alphanumeric identifier projectId for the new project, the name of the modeling dataset uploaded to create this project, and the project creation time and date.
A named list that contains:
projectName character. The name assigned to the DataRobot project
projectId character. The unique alphanumeric project identifier for this DataRobot project
fileName character. The name of the CSV modeling file uploaded for this project
created character. The time and date of project creation
## Not run: 
SetupProjectFromHDFS(url = 'hdfs://path/to/data',
                     port = 12345,
                     projectName = 'dataProject')
## End(Not run)
This is a list that contains the valid values for granting access to other users (see Share). If you wish, you can specify access roles using the list values, e.g., SharingRole$ReadWrite instead of typing the string "READ_WRITE". This way you can benefit from autocomplete and not have to remember the valid options.
SharingRole
An object of class list of length 6.
Owner allows any action including deletion.
ReadWrite or Editor allows modifications to the state, e.g., renaming and creating data sources from a data store, but *not* deleting the entity.
ReadOnly or Consumer: for data sources, enables creating projects and predictions; for data stores, allows viewing them only.
This is a list that contains the valid values for source type
SourceType
An object of class list of length 2.
Star a model.
StarModel(model)
model |
character. The model for which you want to compute Feature Impact, e.g.
from the list of models returned by |
The model object, now starred.
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
StarModel(model)
## End(Not run)
An error occurs if Autopilot is currently running on, or has already finished running on, the provided featurelist, or if the project's target has not been selected (via SetTarget).
StartNewAutoPilot(project, featurelistId, mode = AutopilotMode$FullAuto)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
featurelistId |
numeric. Specifies which feature list to use. |
mode |
character. The desired autopilot mode. Currently only AutopilotMode$FullAuto is supported. |
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
featureList <- CreateFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2"))
featurelistId <- featureList$featurelistId
StartNewAutoPilot(projectId, featurelistId)
## End(Not run)
This function is a convenient shorthand to start a project and set the target. See SetupProject and SetTarget.
StartProject(
  dataSource,
  projectName = NULL,
  target,
  metric = NULL,
  weights = NULL,
  partition = NULL,
  mode = NULL,
  seed = NULL,
  targetType = NULL,
  positiveClass = NULL,
  blueprintThreshold = NULL,
  responseCap = NULL,
  featurelistId = NULL,
  smartDownsampled = NULL,
  majorityDownsamplingRate = NULL,
  accuracyOptimizedBlueprints = NULL,
  offset = NULL,
  exposure = NULL,
  eventsCount = NULL,
  monotonicIncreasingFeaturelistId = NULL,
  monotonicDecreasingFeaturelistId = NULL,
  onlyIncludeMonotonicBlueprints = FALSE,
  workerCount = NULL,
  wait = FALSE,
  checkInterval = 20,
  timeout = NULL,
  username = NULL,
  password = NULL,
  verbosity = 1,
  maxWait = 600
)
dataSource |
object. Either (a) the name of a CSV file, (b) a dataframe, or (c) a URL to a publicly available file; in each case, this parameter identifies the source of the data from which all project models will be built. See Details. |
projectName |
character. Optional. String specifying a project name. |
target |
character. String giving the name of the response variable to be predicted by all project models. |
metric |
character. Optional. String specifying the model fitting metric to be optimized; a list of valid options for this parameter, which depends on both project and target, may be obtained with the function GetValidMetrics. |
weights |
character. Optional. String specifying the name of the column from the modeling dataset to be used as weights in model fitting. |
partition |
partition. Optional. S3 object of class 'partition' whose elements specify a valid partitioning scheme. See help for functions CreateGroupPartition, CreateRandomPartition, CreateStratifiedPartition, CreateUserPartition and CreateDatetimePartitionSpecification. |
mode |
character. Optional. Specifies the autopilot mode used to start the
modeling project; See |
seed |
integer. Optional. Seed for the random number generator used in creating random partitions for model fitting. |
targetType |
character. Optional. Used to specify the targetType to use for a project.
Valid options are "Binary", "Multiclass", "Regression". Set to "Multiclass" to enable
multiclass modeling. Otherwise, it can help to disambiguate, i.e. telling DataRobot how to
handle a numeric target with a few unique values that could be used for either multiclass
or regression. See |
positiveClass |
character. Optional. Target variable value corresponding to a positive response in binary classification problems. |
blueprintThreshold |
integer. Optional. The maximum time (in hours) that any modeling blueprint is allowed to run before being excluded from subsequent autopilot stages. |
responseCap |
numeric. Optional. Floating point value, between 0.5 and 1.0, specifying a capping limit for the response variable. The default value NULL corresponds to an uncapped response, equivalent to responseCap = 1.0. |
featurelistId |
numeric. Specifies which feature list to use. If NULL (default), a default featurelist is used. |
smartDownsampled |
logical. Optional. Whether to use smart downsampling to throw away excess rows of the majority class. Only applicable to classification and zero-boosted regression projects. |
majorityDownsamplingRate |
numeric. Optional. Floating point value, between 0.0 and 100.0. The percentage of the majority rows that should be kept. Specify only if using smart downsampling. May not cause the majority class to become smaller than the minority class. |
accuracyOptimizedBlueprints |
logical. Optional. When enabled, accuracy optimized blueprints will run in autopilot for the project. These are longer-running model blueprints that provide increased accuracy over normal blueprints that run during autopilot. |
offset |
character. Optional. Vector of the names of the columns containing the offset of each row. |
exposure |
character. Optional. The name of a column containing the exposure of each row. |
eventsCount |
character. Optional. The name of a column specifying the events count. |
monotonicIncreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically increasing relationship to the
target. If |
monotonicDecreasingFeaturelistId |
character. Optional. The id of the featurelist
that defines the set of features with a monotonically decreasing relationship to the
target. If |
onlyIncludeMonotonicBlueprints |
logical. Optional. When TRUE, only blueprints that support enforcing monotonic constraints will be available in the project or selected for the autopilot. |
workerCount |
integer. The number of workers to run (default 2). Use |
wait |
logical. If |
checkInterval |
numeric. Optional. Maximum wait (in seconds) between checks that Autopilot is finished. Defaults to 20. |
timeout |
numeric. Optional. Time (in seconds) after which to give up (Default is no timeout). There is an error if Autopilot is not finished before timing out. |
username |
character. The username to use for authentication to the database. |
password |
character. The password to use for authentication to the database. |
verbosity |
numeric. Optional. 0 is silent, 1 or more displays information about progress. Default is 1. |
maxWait |
integer. Specifies how many seconds to wait for the server to finish analyzing the target and begin the modeling process. If the process takes longer than this parameter specifies, execution will stop (but the server will continue to process the request). |
## Not run: 
StartProject(iris,
             projectName = "iris",
             target = "Species",
             targetType = TargetType$Multiclass)
## End(Not run)
Creates a waiter function that can be used in a loop while trying some task many times. The waiter sleeps while waiting to try again, with sleep times determined by exponential back-off.
StartRetryWaiter(timeout = NULL, delay = 0.1, maxdelay = 1)
timeout |
integer. How long (in seconds) to keep trying before timing out (NULL means no timeout) |
delay |
numeric. Initial delay between tries (in seconds). |
maxdelay |
integer. Maximum delay (in seconds) between tries. |
A function that returns the waiter status as a list with these items:
index numeric. How many times we have waited.
secondsWaited numeric. How long (in seconds) since we started the timer.
stillTrying logical. Whether we should keep trying or give up.
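The waiter pattern can be sketched with a closure. MakeRetryWaiter below is a hypothetical re-implementation written for illustration only; the doubling schedule and the exact bookkeeping are assumptions, not the package's code.

```r
# Hypothetical re-implementation for illustration only.
MakeRetryWaiter <- function(timeout = NULL, delay = 0.1, maxdelay = 1) {
  index <- 0
  startTime <- Sys.time()
  function() {
    secondsWaited <- as.numeric(difftime(Sys.time(), startTime, units = "secs"))
    stillTrying <- is.null(timeout) || secondsWaited < timeout
    if (stillTrying) {
      # Exponential back-off: double the delay each time, capped at maxdelay.
      Sys.sleep(min(delay * 2^index, maxdelay))
      index <<- index + 1
    }
    list(index = index, secondsWaited = secondsWaited, stillTrying = stillTrying)
  }
}

# Retry a task until it succeeds or the waiter gives up.
waiter <- MakeRetryWaiter(timeout = 5, delay = 0.01, maxdelay = 0.05)
attempts <- 0
status <- waiter()
while (status$stillTrying) {
  attempts <- attempts + 1
  if (attempts >= 3) break  # pretend the task succeeded on the third try
  status <- waiter()
}
print(status$index)
```

Capping the delay keeps a long-running retry loop responsive while still easing the load on the server during the first few attempts.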
The advanced tuning feature allows you to manually set model parameters and override the DataRobot default selections.
StartTuningSession(model)
model |
dataRobotModel. A DataRobot model object to get tuning parameters for. |
A function that can be used to tune the model. The function will take model, the model object to tune, and will have individual arguments for each tunable hyperparameter that are each set to the default value for that hyperparameter. Furthermore, the function takes tuningDescription which can be used to describe the hyperparameter tuning taking place for future reference. The function itself will return a job ID that can be used to get the tuned model.
RunInteractiveTuning
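The idea of returning a function with one argument per tunable hyperparameter can be sketched with a function factory. MakeTuningFunction is hypothetical and simplified: it omits the model and tuningDescription arguments, submits nothing to DataRobot, and simply returns the chosen settings.

```r
# Hypothetical factory: builds a function with one argument per parameter,
# each defaulting to the supplied value.
MakeTuningFunction <- function(defaults) {
  tuner <- function() {
    # Collect the (possibly overridden) argument values by name.
    mget(names(defaults), envir = environment())
  }
  formals(tuner) <- defaults
  tuner
}

runTune <- MakeTuningFunction(list(colsample_bytree = 1, colsample_bylevel = 1))
settings <- runTune(colsample_bytree = 0.4)
print(settings)
```

Because the arguments are real formals, editors can autocomplete the parameter names and any parameter left out keeps its default.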
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
myXGBModel <- GetModel(projectId, modelId)
RunTune <- StartTuningSession(myXGBModel)
tuningJob <- RunTune(myXGBModel, colsample_bytree = 0.4, colsample_bylevel = 0.8)
tunedModel <- GetModelFromJobId(projectId, tuningJob)
## End(Not run)
Convert a function into a single string for DataRobot
Stringify(functionToConvert, dputFile = tempfile())
functionToConvert |
function. The function to convert to a string. |
dputFile |
character. Optional. A filepath to sink dput into. |
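For a sense of what converting a function to a single string involves, here is a base R sketch that uses deparse rather than dput. FunctionToString is a hypothetical stand-in, not the package's implementation.

```r
# Hypothetical alternative for illustration; not the package's implementation.
FunctionToString <- function(functionToConvert) {
  # deparse() returns one element per line of code; join them into one string.
  paste(deparse(functionToConvert), collapse = "\n")
}

addOne <- function(x) x + 1
asText <- FunctionToString(addOne)

# The string can be parsed back into a working function.
restored <- eval(parse(text = asText))
print(restored(2))
```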
The actuals submitted will be used to calculate accuracy metrics. Values are not processed immediately and may take some time to propagate through deployment systems. Submission of actuals is limited to 10,000,000 actuals per hour. For time series deployments, total actuals = number of actuals * number of forecast distances. For example, submitting 10 actuals for a deployment with 50 forecast distances = 500 total actuals. For multiclass deployments, a similar calculation is made where total actuals = number of actuals * number of classes. For example, submitting 10 actuals for a deployment with 20 classes = 200 actuals.
SubmitActuals(actuals, deploymentId, batchSize = 10000)
actuals |
dataframe. Data that describes actual values. Any strings stored as factors will
be coerced to characters with
|
deploymentId |
character. The ID of the deployment. |
batchSize |
integer. Optional. The max number of actuals in each batch request. Cannot exceed 10000. |
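The counting rules and the batch limit described above can be checked with a little arithmetic. Both helpers below are hypothetical and run locally; they only illustrate how totals and batches work out and do not reproduce the package's internals.

```r
# Total actuals counted against the hourly limit, per the rules above.
TotalActuals <- function(nActuals, multiplier = 1) nActuals * multiplier

timeSeriesTotal <- TotalActuals(10, multiplier = 50)  # 50 forecast distances
multiclassTotal <- TotalActuals(10, multiplier = 20)  # 20 classes

# Hypothetical batching helper: split row indices into chunks of batchSize.
SplitIntoBatches <- function(nRows, batchSize = 10000) {
  split(seq_len(nRows), ceiling(seq_len(nRows) / batchSize))
}

batches <- SplitIntoBatches(25000)
print(lengths(batches))
```

With 25,000 rows and the default batch size, this yields two full batches and one partial batch.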
Other deployment accuracy functions: GetDeploymentAccuracyOverTime(), GetDeploymentAccuracy(), GetDeploymentAssociationId()
## Not run: 
deploymentId <- "5e319d2e422fbd6b58a5edad"
myActuals <- data.frame(associationId = c("439917"),
                        actualValue = c("True"),
                        wasActedOn = c(TRUE))
SubmitActuals(actuals = myActuals, deploymentId)
## End(Not run)
These functions extend R's generic summary function to the DataRobot S3 object classes dataRobotModel, dataRobotProject, listOfBlueprints, listOfFeaturelists, listOfModels, and projectSummaryList.
## S3 method for class 'dataRobotModel'
summary(object, ...)

## S3 method for class 'dataRobotProject'
summary(object, ...)

## S3 method for class 'listOfBlueprints'
summary(object, nList = 6, ...)

## S3 method for class 'listOfFeaturelists'
summary(object, nList = 6, ...)

## S3 method for class 'listOfModels'
summary(object, nList = 6, ...)

## S3 method for class 'projectSummaryList'
summary(object, nList = 6, ...)
object |
The S3 object to be summarized. |
... |
list. Not currently used. |
nList |
integer. For the 'listOf' class objects, the first nList elements of the list are summarized in the dataframe in the second element of the list returned by the function. |
An object-specific summary: for objects of class dataRobotModel and dataRobotProject, this summary is a character vector giving key characteristics of the model or project, respectively; for the other object classes, the value is a two-element list where the first element is a brief summary character string and the second element is a more detailed dataframe with nList elements. The summary of object has the following components: modelType, expandedModel (constructed from modelType and processes), modelId, blueprintId, and projectId.
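The two-element return shape used for the 'listOf' classes can be sketched with a toy S3 method. The class listOfThings and the element names generalSummary and detailedSummary are hypothetical, chosen only to show the pattern of extending summary with an nList argument.

```r
# Hypothetical class "listOfThings", used only to show the pattern.
summary.listOfThings <- function(object, nList = 6, ...) {
  nShow <- min(nList, length(object))
  # First element: a brief summary string.
  headline <- sprintf("First %d of %d elements", nShow, length(object))
  # Second element: a dataframe describing the first nList elements.
  details <- data.frame(
    name = vapply(object[seq_len(nShow)], function(x) x$name, character(1)),
    stringsAsFactors = FALSE
  )
  list(generalSummary = headline, detailedSummary = details)
}

things <- structure(
  lapply(letters[1:10], function(nm) list(name = nm)),
  class = "listOfThings"
)
result <- summary(things, nList = 3)
print(result$generalSummary)
```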
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
summary(model)
## End(Not run)
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
project <- GetProject(projectId)
summary(project)
## End(Not run)
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
blueprints <- ListBlueprints(projectId)
summary(blueprints)
## End(Not run)
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
featureList <- CreateFeaturelist(projectId, "myFeaturelist", c("feature1", "feature2"))
summary(featureList)
## End(Not run)
## Not run: 
projectId <- "59a5af20c80891534e3c2bde"
models <- ListModels(projectId)
summary(models)
## End(Not run)
## Not run: 
projectSummary <- ListProjects()
summary(projectSummary)
## End(Not run)
Summarize the list of tuning parameters available for a model.
## S3 method for class 'listOfDataRobotTuningParameters' summary(object, ...)
object |
list. The list of tuning parameters to summarize. |
... |
list. Extra parameters that are ignored. Used to allow S3 inheritance to work. |
A data.frame detailing the following about each tuning parameter:
name character. The name of the tuning parameter.
current character. The current searched values of that parameter.
default character. The default value of that parameter.
constraint character. A short description of the possible values that parameter can take.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
summary(GetTuningParameters(model))
## End(Not run)
Target leakage report values
TargetLeakageType
An object of class list of length 4.
A list containing the valid values for the target types.
TargetType
An object of class list of length 3.
Test the database connection to the data store.
TestDataStore(dataStoreId, username, password)
dataStoreId |
character. The ID of the data store to test. |
username |
character. The username to use for authentication to the database. |
password |
character. The password to use for authentication to the database. The password is encrypted on the server side and never saved or stored. |
TRUE if successful; otherwise an error is raised.
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
TestDataStore(dataStoreId, username = "myUser", password = "mySecurePass129")
## End(Not run)
Tidies a ServiceOverTime response object for use in a data frame.
tidyServiceOverTimeObject(df, valueColName)
df |
A data frame that contains the following:
|
valueColName |
character. The column in df currently named 'value' will be renamed to this. |
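The renaming step described for valueColName can be sketched in base R. This is a minimal illustration, not the package's implementation: the helper name renameValueColumn, the toy columns period and value, and the new name totalPredictions are all hypothetical.

```r
# Hypothetical helper: rename the column called "value" to valueColName.
renameValueColumn <- function(df, valueColName) {
  stopifnot(is.data.frame(df), "value" %in% names(df))
  names(df)[names(df) == "value"] <- valueColName
  df
}

# Toy input; the structure of the real response object is not shown here.
toy <- data.frame(period = c("2020-01-01", "2020-01-02"), value = c(10, 12))
tidy <- renameValueColumn(toy, "totalPredictions")
names(tidy)
```

Renaming by matching on the column name rather than by position keeps the sketch independent of column order.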
Time units
TimeUnits
An object of class list of length 8.
Star a model if it is unstarred, otherwise unstar the model.
ToggleStarForModel(model)
model |
character. The model whose star you want to toggle, e.g.
from the list of models returned by |
The model object, now starred if it was unstarred, or unstarred if it was starred.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ToggleStarForModel(model)
## End(Not run)
The DataRobot Monitoring APIs return dates formatted as RFC 3339 strings. RFC 3339 is a profile of ISO 8601. Specifically, 'T' is the date/time separator and 'Z' denotes UTC. Fractional seconds are returned, e.g. 2020-01-01T05:00:00.000000Z.
transformRFC3339Period(periodContainer)
periodContainer |
an object containing the following:
|
Other API datetime functions: RFC3339DateTimeFormat, formatRFC3339Timestamp(), parseRFC3339Timestamp(), validateReportingPeriodTime()
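The timestamp format described above can be handled in base R alone. The sketch below parses the example timestamp and formats it back to the same string; it does not use the package's own helpers, and the format strings are an assumption about how to express this layout with strptime-style codes.

```r
# Parse an RFC 3339 UTC timestamp with fractional seconds (base R only).
ts <- "2020-01-01T05:00:00.000000Z"
parsed <- as.POSIXct(ts, format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")

# Format it back, keeping six fractional digits and the literal 'Z' suffix.
formatted <- format(parsed, "%Y-%m-%dT%H:%M:%OS6Z", tz = "UTC")
formatted
```

Setting tz = "UTC" in both calls matters: the 'Z' in the format string is matched as a literal character, so the time zone has to be stated explicitly.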
Treat as exponential
TreatAsExponential
An object of class list of length 3.
Checks to see if we are trying to submit 'NULL' as a value.
TryingToSubmitNull(body)
body |
list. The body to check for NULL. |
This function unpauses the modeling queue for a specified DataRobot project.
UnpauseQueue(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
UnpauseQueue(projectId)
## End(Not run)
Unstar a model.
UnstarModel(model)
model |
character. The model you want to unstar, e.g.
from the list of models returned by |
The model object, now unstarred.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
UnstarModel(model)
## End(Not run)
Update access to a particular object.
UpdateAccess(object, access)
object |
object. The shared object to update access for. |
access |
dataRobotAccessList. A list specifying access given to all users. See
|
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
dataStore <- GetDataStore(dataStoreId)
access <- ListSharingAccess(dataStore)
# Remove access from the first user and grant it to [email protected] instead.
access[[1]]$username <- "[email protected]"
UpdateAccess(dataStore, access)
# Change access to a Read Only role.
access[[1]]$role <- SharingRole$ReadOnly
UpdateAccess(dataStore, access)
## End(Not run)
Currently supports changing the name of a calendar.
UpdateCalendar(calendarId, name = NULL)
calendarId |
character. The ID of the calendar to update. |
name |
character. The new name for the calendar. |
An S3 object of class "dataRobotCalendar"
## Not run:
calendarId <- "5da75da31fb4a45b8a815a53"
UpdateCalendar(calendarId, name = "New name for calendar")
## End(Not run)
Note that default templates cannot be updated.
UpdateComplianceDocTemplate(templateId, name = NULL, sections = NULL)
templateId |
character. The ID of the template to update. |
name |
character. Optional. A new name to identify the compliance doc template by. |
sections |
list. Optional. Section definitions for the compliance template. |
The updated compliance doc template object.
## Not run:
sections <- list(list("title" = "Missing Values Report",
                      "highlightedText" = "NOTICE",
                      "regularText" = paste("This dataset had a lot of Missing Values.",
                                            "See the chart below: {{missingValues}}"),
                      "type" = "user"),
                 list("title" = "Blueprints",
                      "regularText" = "{{blueprintDiagram}} /n Blueprint for this model",
                      "type" = "user"))
templateId <- "5cf85080d9436e5c310c796d"
UpdateComplianceDocTemplate(templateId, name = "newName", sections = sections)
## End(Not run)
Update a data source.
UpdateDataSource( dataSourceId, canonicalName = NULL, dataStoreId = NULL, query = NULL, table = NULL, schema = NULL, partitionColumn = NULL, fetchSize = NULL )
dataSourceId |
character. The ID of the data source to update. |
canonicalName |
character. The user-friendly name of the data source. |
dataStoreId |
character. The ID of the data store to connect to. |
query |
character. A query to execute on the data store to get the data. Optional. |
table |
character. The specified database table. Optional. |
schema |
character. The specified database schema. Optional. |
partitionColumn |
character. The name of the partition column. Optional. |
fetchSize |
integer. A user-specified fetch size in the range [1, 20000]. Optional. By default, a fetchSize is assigned to balance throughput and memory usage. |
## Not run:
dataSourceId <- "5c1303269300d900016b41a7"
UpdateDataSource(dataSourceId, canonicalName = "Different Name")
## End(Not run)
Update a data store.
UpdateDataStore( dataStoreId, canonicalName = NULL, driverId = NULL, jdbcUrl = NULL )
dataStoreId |
character. The ID of the data store to update. |
canonicalName |
character. The user-friendly name of the data store. |
driverId |
character. The ID of the driver to use. |
jdbcUrl |
character. The full JDBC URL. |
## Not run:
dataStoreId <- "5c1303269300d900016b41a7"
UpdateDataStore(dataStoreId, canonicalName = "Different Name")
## End(Not run)
Update drift tracking settings for a deployment.
UpdateDeploymentDriftTrackingSettings( deploymentId, targetDriftEnabled = NULL, featureDriftEnabled = NULL, maxWait = 600 )
deploymentId |
character. The ID of the deployment. |
targetDriftEnabled |
logical. Optional. Set to TRUE to enable target drift. Set to FALSE to disable. |
featureDriftEnabled |
logical. Optional. Set to TRUE to enable feature drift. Set to FALSE to disable. |
maxWait |
integer. How long to wait (in seconds) for the computation to complete before returning a timeout error. Defaults to 600 seconds. |
A list with the following information on drift tracking:
associationId
predictionIntervals list. A list with two keys:
enabled. 'TRUE' if prediction intervals are enabled and 'FALSE' otherwise.
percentiles list. A list of percentiles, if prediction intervals are enabled.
targetDrift list. A list with one key, 'enabled', which is 'TRUE' if target drift is enabled, and 'FALSE' otherwise.
featureDrift list. A list with one key, 'enabled', which is 'TRUE' if feature drift is enabled, and 'FALSE' otherwise.
## Not run:
deploymentId <- "5e319d2e422fbd6b58a5edad"
UpdateDeploymentDriftTrackingSettings(deploymentId, targetDriftEnabled = TRUE)
## End(Not run)
Updates a featurelist to change the name or description.
UpdateFeaturelist(featurelist, listName = NULL, description = NULL)
featurelist |
list. The featurelist to update. |
listName |
character. The new name for the featurelist. |
description |
character. A user-friendly description to give a featurelist. |
A list with the following four elements describing the updated featurelist:
Character string giving the unique alphanumeric identifier for the new featurelist.
Character string giving the projectId identifying the project to which the featurelist was added.
Character vector with the names of the variables included in the new featurelist.
Character string giving the name of the new featurelist.
Updates a modeling featurelist to change the name or description.
UpdateModelingFeaturelist(featurelist, listName = NULL, description = NULL)
featurelist |
list. The modeling featurelist to update. |
listName |
character. The new name for the featurelist. |
description |
character. A user-friendly description to give a featurelist. |
This function updates parameters for the project defined by project.
UpdateProject( project, newProjectName = NULL, workerCount = NULL, holdoutUnlocked = NULL )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
newProjectName |
character. Updated value for the projectName parameter associated with the project. |
workerCount |
integer. The number of workers to run (default 2). Use |
holdoutUnlocked |
logical. Either NULL (default) or TRUE. If TRUE, this function requests the DataRobot Autopilot to unlock the holdout data subset. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
UpdateProject(projectId, newProjectName = "cooler Project")
UpdateProject(projectId, workerCount = 20)
UpdateProject(projectId, holdoutUnlocked = TRUE)
## End(Not run)
Update the display name or note for an imported model.
UpdateTransferableModel(importId, displayName = NULL, note = NULL)
importId |
character. Id of the import. |
displayName |
character. The new display name. |
note |
character. The new note. |
A list describing the transferable model, with the following components:
note. Character string. Manually added note about this imported model.
datasetName. Character string. Filename of the dataset used to create the project the model belonged to.
modelName. Character string. Model type describing the model generated by DataRobot.
displayName. Character string. Manually specified human-readable name of the imported model.
target. Character string. The target of the project the model belonged to prior to export.
projectName. Character string. Name of the project the model belonged to prior to export.
importedByUsername. Character string. Username of the user who imported the model.
importedAt. Character string. The time the model was imported.
version. Numeric. Project version of the project the model belonged to.
projectId. Character string. ID of the project the model belonged to prior to export.
featurelistName. Character string. Name of the featurelist used to train the model.
createdByUsername. Character string. Username of the user who created the model prior to export.
importedById. Character string. ID of the user who imported the model.
id. Character string. ID of the import.
createdById. Character string. ID of the user who created the model prior to export.
modelId. Character string. Original ID of the model prior to export.
originUrl. Character string. URL.
Other Transferable Model functions: DeleteTransferableModel(), DownloadTransferableModel(), GetTransferableModel(), ListTransferableModels(), RequestTransferableModel(), UploadTransferableModel()
## Not run:
id <- UploadTransferableModel("model.drmodel")
UpdateTransferableModel(id, displayName = "NewName", note = "This is my note.")
## End(Not run)
The structure of the compliance doc template can be specified either by a file named by filename or by a list passed via sections.
UploadComplianceDocTemplate(name, filename = NULL, sections = NULL)
name |
character. A name to identify the compliance doc template by. |
filename |
character. Optional. Filename of the file to upload the compliance doc template from. |
sections |
list. Optional. Section definitions for the compliance template. |
Nothing returned, but uploads the compliance doc template.
## Not run:
## Create a compliance documentation template from uploading a file
DownloadComplianceDocTemplate("template.json")
# Edit template.json in your favorite editor
UploadComplianceDocTemplate("myTemplate", "template.json")

## Create a compliance documentation template from a list.
sections <- list(list("title" = "Missing Values Report",
                      "highlightedText" = "NOTICE",
                      "regularText" = paste("This dataset had a lot of Missing Values.",
                                            "See the chart below: {{missingValues}}"),
                      "type" = "user"),
                 list("title" = "Blueprints",
                      "regularText" = "{{blueprintDiagram}} /n Blueprint for this model",
                      "type" = "user"))
UploadComplianceDocTemplate("myTemplate", sections = sections)
## End(Not run)
Takes either a file path or a dataframe and returns output for POST that specifies the file object via form upload. This function is meant to facilitate uploading CSV data sources into DataRobot, such as through SetupProject.
UploadData(dataSource, fileName = NULL)
dataSource |
character. The file to upload. |
fileName |
character. The name of the file after it is uploaded. If not set, defaults to the name of the uploaded file. |
An httr object specifying the form upload content of the file path.
SetupProject
The DataRobot prediction engine requires a CSV file containing the data to be used in prediction, and this has been implemented here in two ways. The first and simpler is to specify dataSource as the name of this CSV file, but for the convenience of those who wish to work with dataframes, this function also provides the option of specifying a dataframe, which is then written to a CSV file and uploaded to the DataRobot server.
UploadPredictionDataset( project, dataSource, forecastPoint = NULL, predictionsStartDate = NULL, predictionsEndDate = NULL, relaxKIAFeaturesCheck = NULL, maxWait = 600 )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
dataSource |
object. Either (a) the name of a CSV file, (b) a dataframe, or (c) the URL of a publicly available file; in each case, this parameter identifies the source of the data for which predictions will be calculated. |
forecastPoint |
character. Optional. The point relative to which predictions will be generated, based on the forecast window of the project. Only specified in time series projects. |
predictionsStartDate |
datetime. Optional. Only specified in time series projects.
The start date for bulk predictions. Note that this parameter is for generating
historical predictions using the training data. This parameter should be provided in
conjunction |
predictionsEndDate |
datetime. Optional. Only specified in time series projects.
The end date for bulk predictions. Note that this parameter is for generating
historical predictions using the training data. This parameter should be provided
in conjunction |
relaxKIAFeaturesCheck |
logical. For time series projects only. If TRUE, missing values in the known in advance features are allowed in the forecast window at the prediction time. If omitted or FALSE, missing values are not allowed. |
maxWait |
integer. The maximum time (in seconds) to wait for each of two steps: (1) The initial dataset upload request, and (2) data processing that occurs after receiving the response to this initial request. |
A list with the following components:
id character. The unique alphanumeric identifier for the dataset.
numColumns numeric. Number of columns in dataset.
name character. Name of dataset file.
created character. Time of upload.
projectId character. String giving the unique alphanumeric identifier for the project.
numRows numeric. Number of rows in dataset.
forecastPoint character. The point relative to which predictions will be generated, based on the forecast window of the project. Only specified in time series projects, otherwise will be NULL.
dataQualityWarnings list. A list of available warnings about potential problems in the uploaded prediction dataset. Will be empty if there are no warnings.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
UploadPredictionDataset(projectId, iris)
## End(Not run)
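The dataframe route described for UploadPredictionDataset (write the dataframe to a CSV file, then upload that file) can be sketched with base R. Only the local CSV step is shown, since the upload itself needs a DataRobot connection. The temporary file path and the write.csv options are assumptions for illustration; the package may write the file differently.

```r
# Write a dataframe to a temporary CSV file, as a stand-in for the
# intermediate file that would then be uploaded.
csvPath <- tempfile(fileext = ".csv")
write.csv(iris, csvPath, row.names = FALSE)

# Read it back to confirm that the shape and column names survive.
roundTrip <- read.csv(csvPath, stringsAsFactors = TRUE)
dim(roundTrip)
```

Passing row.names = FALSE keeps R's row names out of the file, so the CSV has the same columns as the dataframe.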
Upload a prediction dataset from a data source.
UploadPredictionDatasetFromDataSource( project, dataSourceId, username, password, forecastPoint = NULL, maxWait = 600, relaxKIAFeaturesCheck = NULL )
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
dataSourceId |
character. The ID of the data source. |
username |
character. The username to use for authentication to the database. |
password |
character. The password to use for authentication to the database. The password is encrypted on the server side and never saved or stored. |
forecastPoint |
character. Optional. The point relative to which predictions will be generated, based on the forecast window of the project. Only specified in time series projects. |
maxWait |
integer. The maximum time (in seconds) to wait for each of two steps: (1) The initial dataset upload request, and (2) data processing that occurs after receiving the response to this initial request. |
relaxKIAFeaturesCheck |
logical. For time series projects only. If TRUE, missing values in the known in advance features are allowed in the forecast window at the prediction time. If omitted or FALSE, missing values are not allowed. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
dataSourceId <- "5c1303269300d900016b41a7"
UploadPredictionDatasetFromDataSource(projectId, dataSourceId,
                                      username = "myUser",
                                      password = "mySecurePass129")
## End(Not run)
Import a previously exported model for predictions.
UploadTransferableModel(modelFile, maxWait = 600)
modelFile |
character. Path to binary transferable model file. |
maxWait |
integer. Specifies how many seconds to wait for upload to finish. |
A list describing the uploaded transferable model, with the following components:
note. Character string. Manually added note about this imported model.
datasetName. Character string. Filename of the dataset used to create the project the model belonged to.
modelName. Character string. Model type describing the model generated by DataRobot.
displayName. Character string. Manually specified human-readable name of the imported model.
target. Character string. The target of the project the model belonged to prior to export.
projectName. Character string. Name of the project the model belonged to prior to export.
importedByUsername. Character string. Username of the user who imported the model.
importedAt. Character string. The time the model was imported.
version. Numeric. Project version of the project the model belonged to.
projectId. Character string. ID of the project the model belonged to prior to export.
featurelistName. Character string. Name of the featurelist used to train the model.
createdByUsername. Character string. Username of the user who created the model prior to export.
importedById. Character string. ID of the user who imported the model.
id. Character string. ID of the import.
createdById. Character string. ID of the user who created the model prior to export.
modelId. Character string. Original ID of the model prior to export.
originUrl. Character string. URL.
Other Transferable Model functions: DeleteTransferableModel(), DownloadTransferableModel(), GetTransferableModel(), ListTransferableModels(), RequestTransferableModel(), UpdateTransferableModel()
## Not run:
UploadTransferableModel("model.drmodel")
## End(Not run)
Validate that the actuals are a dataframe and contain required columns.
ValidateActuals(actuals, error = TRUE)
actuals |
dataframe. Contains all actuals to be submitted. |
error |
logical. Should an error be raised if there is an issue? |
TRUE if the actuals dataframe has the required properties; otherwise FALSE, or an error is raised if error is TRUE.
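The check described above (a dataframe with certain required columns, and an error switch) can be sketched in base R. This is a hypothetical helper, not the package's code: the name checkActuals is invented, and the required column names are passed in by the caller because the text above does not list them. The names associationId and actualValue below are illustrative assumptions.

```r
# Hypothetical helper; requiredCols is supplied by the caller.
checkActuals <- function(actuals, requiredCols, error = TRUE) {
  problem <- NULL
  if (!is.data.frame(actuals)) {
    problem <- "actuals must be a dataframe"
  } else {
    missingCols <- setdiff(requiredCols, names(actuals))
    if (length(missingCols) > 0) {
      problem <- paste("missing columns:", paste(missingCols, collapse = ", "))
    }
  }
  if (is.null(problem)) return(TRUE)
  if (error) stop(problem)  # raise when error = TRUE
  FALSE                     # otherwise report failure quietly
}

# Illustrative column names (assumed, not taken from the text above).
required <- c("associationId", "actualValue")
actuals <- data.frame(associationId = c("a1", "a2"), actualValue = c(1, 0))
checkActuals(actuals, required)
checkActuals(actuals["actualValue"], required, error = FALSE)
```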
Get a calendar id from a calendar object.
ValidateCalendar(calendar)
calendar |
object. Either a list with a calendarId element or a calendarId value. |
Validate that model belongs to class 'dataRobotModel' and includes projectId and modelId.
ValidateModel(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
Validate that the multiseries properties indicate a successful multiseries setup.
ValidateMultiSeriesProperties(properties, error = TRUE)
properties |
list. List of multiseries properties. |
error |
logical. Should an error be raised if there is an issue? |
TRUE if all properties verify; otherwise FALSE, or an error is raised if error is TRUE.
A valid parameter paramValue is either NULL or in the space of paramPossibilities.
ValidateParameterIn(paramValue, paramPossibilities, allowNULL = TRUE)
paramValue |
object. The parameter value to check. |
paramPossibilities |
vector. A vector of possible values for the parameter. |
allowNULL |
logical. Whether or not to allow NULL as a possibility. |
TRUE if paramValue is valid, otherwise it raises an error.
## Not run:
ValidateParameterIn("all", DataSubset)
## End(Not run)
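The rule stated for ValidateParameterIn (a value is valid if it is NULL, when allowed, or is one of the listed possibilities) can be sketched in base R. The helper checkParameterIn and the toy list subsets are hypothetical stand-ins; the package's own function and its DataSubset list may differ in detail.

```r
# Hypothetical re-statement of the rule described above.
checkParameterIn <- function(paramValue, paramPossibilities, allowNULL = TRUE) {
  if (is.null(paramValue)) {
    if (allowNULL) return(TRUE)
    stop("NULL is not an allowed value")
  }
  allowed <- unlist(paramPossibilities)  # accept a vector or a named list
  if (!all(paramValue %in% allowed)) {
    stop("Invalid value. Allowed values: ", paste(allowed, collapse = ", "))
  }
  TRUE
}

subsets <- list(All = "all", Holdout = "holdout")  # toy stand-in for an enum list
checkParameterIn("all", subsets)
checkParameterIn(NULL, subsets)
```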
Checks if a partition is valid.
ValidatePartition(validationType, partition, reps = NULL, validationPct = NULL)
validationType |
character. The type of partition to validate. |
partition |
partition. The partition object. |
reps |
numeric. The number of repetitions for a CV validation. |
validationPct |
numeric. The size of the validation set for TVH validation. |
Get a projectId from a project object.
ValidateProject(project)
project |
object. Either a list with a projectId element or a projectId value. |
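The two accepted input shapes (a projectId string, or a list carrying a projectId element) can be handled as in the base R sketch below. The helper extractProjectId is hypothetical and only illustrates the idea; it is not the package's implementation, and the projectName element is an invented extra field.

```r
# Hypothetical helper mirroring the two accepted input shapes.
extractProjectId <- function(project) {
  id <- if (is.list(project)) project$projectId else project
  if (!is.character(id) || length(id) != 1) {
    stop("project must be a projectId string or a list with a projectId element")
  }
  id
}

extractProjectId("59a5af20c80891534e3c2bde")
extractProjectId(list(projectId = "59a5af20c80891534e3c2bde", projectName = "demo"))
```

Accepting both shapes lets callers pass either a raw identifier or a whole project object to the same function.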
Validate a potential deployment model replacement.
ValidateReplaceDeployedModel(deploymentId, newModelId)
deploymentId |
character. The ID of the deployment. |
newModelId |
character. The ID of the model to use in the deployment. This model will replace the old model. You can also pass a dataRobotModel object. |
A validation report with:
status character. Either PASSED or FAILED depending on whether all checks passed or not.
message character. A message explaining the status failure, if any.
checks list. A list of each check and the individual status.
## Not run:
deploymentId <- "5e319d2e422fbd6b58a5edad"
newModelId <- "5996f820af07fc605e81ead4"
ValidateReplaceDeployedModel(deploymentId, newModelId)
## End(Not run)
Helper function for validating reporting period objects used by the deployment monitoring functions. See GetDeploymentServiceStats, GetDeploymentAccuracy, GetDeploymentServiceStatsOverTime, and GetDeploymentAccuracyOverTime.
validateReportingPeriodTime(timestamp, tsName = "timestamp")
timestamp |
character. A timestamp in RFC 3339 format. |
tsName |
character. Optional. Explanation of the timestamp for error messages. |
Other API datetime functions: RFC3339DateTimeFormat, formatRFC3339Timestamp(), parseRFC3339Timestamp(), transformRFC3339Period()
Types of variable transformations
VariableTransformTypes
An object of class list of length 4.
This function brings up a web page that displays detailed model information like that available from the standard DataRobot user interface (e.g., graphical representations of model structures).
ViewWebModel(model)
model |
An S3 object of class dataRobotModel like that returned by the function GetModel, or each element of the list returned by the function ListModels. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
modelId <- "5996f820af07fc605e81ead4"
model <- GetModel(projectId, modelId)
ViewWebModel(model)
## End(Not run)
This function brings up a web page that displays detailed project information like that available from the standard DataRobot user interface.
ViewWebProject(project)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
ViewWebProject(projectId)
## End(Not run)
This function periodically checks whether Autopilot is finished and returns only after it is.
WaitForAutopilot(project, checkInterval = 20, timeout = NULL, verbosity = 1)
project |
character. The project for which you want to wait until autopilot is finished. |
checkInterval |
numeric. Optional. Maximum wait (in seconds) between checks that Autopilot is finished. Defaults to 20. |
timeout |
numeric. Optional. Time (in seconds) after which to give up (Default is no timeout). There is an error if Autopilot is not finished before timing out. |
verbosity |
numeric. Optional. 0 is silent, 1 or more displays information about progress. Default is 1. |
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
WaitForAutopilot(projectId)
## End(Not run)
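The check-sleep-repeat behaviour that checkInterval, timeout, and verbosity control can be sketched as a generic polling loop in base R. This is an illustration under stated assumptions, not the package's code: waitUntil and isDone are hypothetical names, and the sketch sleeps for a fixed checkInterval, whereas the documentation above describes checkInterval as a maximum.

```r
# Hypothetical polling helper; isDone is any zero-argument function
# that returns TRUE once the work has finished.
waitUntil <- function(isDone, checkInterval = 20, timeout = NULL, verbosity = 1) {
  start <- Sys.time()
  repeat {
    if (isDone()) return(invisible(TRUE))
    elapsed <- as.numeric(difftime(Sys.time(), start, units = "secs"))
    if (!is.null(timeout) && elapsed >= timeout) {
      stop("Timed out after ", timeout, " seconds")
    }
    if (verbosity > 0) message(sprintf("Not finished after %.1f seconds", elapsed))
    Sys.sleep(checkInterval)
  }
}

# Toy check that reports completion on the third call.
calls <- 0
finishedOnThirdCall <- function() {
  calls <<- calls + 1
  calls >= 3
}
result <- waitUntil(finishedOnThirdCall, checkInterval = 0, verbosity = 0)
```

With timeout = NULL the loop never gives up, matching the "no timeout" default described above; a finite timeout turns an unfinished wait into an error.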
Wait for the specified job to complete.
WaitForJobToComplete(project, jobId, maxWait = 600)
project |
character. Either (1) a character string giving the unique alphanumeric identifier for the project, or (2) a list containing the element projectId with this identifier. |
jobId |
integer. The job identifier (returned, for example, by RequestPrimeModel).
maxWait |
integer. The maximum time to wait (in seconds) for the job to complete.
## Not run:
projectId <- "59a5af20c80891534e3c2bde"
blueprints <- ListBlueprints(projectId)
blueprint <- blueprints[[1]]
jobId <- RequestNewModel(projectId, blueprint)
WaitForJobToComplete(projectId, jobId)
## End(Not run)