Running R scripts
Analyze an Analytics table in an external R script and then return data from R to create a new table in the Analytics project. Source data is passed to R as a data frame that you can reference using a provided function.
Working with Analytics data in R
If you are preparing an R script to run from Analytics, familiarize yourself with how data is passed back and forth between Analytics and R. For RCOMMAND to run successfully, your R script must use the R functions that Analytics provides.
Referencing Analytics data in the R script
The Analytics table is passed to the script as an R data frame. Data frames are tabular data objects that may contain columns of different modes, or types, of data.
To work with the data frame created by Analytics in an R script, invoke the acl.readData() function and store the returned data frame in a variable:
# stores the Analytics table in a data frame called myTable that can be referenced throughout the script
myTable<-acl.readData()
To retrieve data from a cell in the data frame, you can use one of the following approaches:
- Using row and column coordinates:
# Retrieves the value in the first row and second column of the data frame
myTable[1,2]
Note
Coordinates are based on the order of fields specified in the command, not the table layout or view that is currently open.
- Using row and column names:
# Retrieves the value in the first row and "myColumnTitle" column of the data frame
myTable["1","myColumnTitle"]
You must specify the KEEPTITLE option of the command to use column names.
Rows are named "1", "2", "3", and increment accordingly. You may also use a combination of names and coordinates.
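The two lookup styles can be tried outside Analytics with a small sketch. The data frame below is a hypothetical stand-in for what acl.readData() would return, and the column names myId and myColumnTitle are illustrative assumptions. Inside Analytics you would call acl.readData() instead of building the data frame by hand.

```r
# Hypothetical stand-in for the data frame returned by acl.readData().
# Column names are only available in Analytics when KEEPTITLE is set.
myTable <- data.frame(
  myId = c(101, 102, 103),
  myColumnTitle = c("alpha", "beta", "gamma"),
  stringsAsFactors = FALSE
)

# Row and column coordinates: first row, second column
byPosition <- myTable[1, 2]

# Row and column names: row named "1", column "myColumnTitle"
byName <- myTable["1", "myColumnTitle"]

# Combination: row by coordinate, column by name
mixed <- myTable[2, "myColumnTitle"]

print(c(byPosition, byName, mixed))
```

Both byPosition and byName refer to the same cell, so either style can be used once column names are available.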
Passing data back to Analytics
To return a data frame or matrix back to Analytics and create a new table, use the following syntax:
# Passes myNewTable data frame back to Analytics to create a new table
acl.output<-myNewTable
Note
You must return a data frame or a matrix to Analytics when the R script terminates. Ensure the columns in the data frame or matrix contain only atomic values and not lists, matrices, arrays, or non-atomic objects. If the values cannot be translated into Analytics data types, the command fails.
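One way to guard against the failure described in the note is to check the columns before assigning acl.output. The following is a minimal sketch, not part of the Analytics API: hasOnlyAtomicColumns is a hypothetical helper, and the example tables are made up for illustration.

```r
# Hypothetical helper (not provided by Analytics): returns TRUE when every
# column is an atomic vector with no dimensions (so not a list or a matrix).
hasOnlyAtomicColumns <- function(df) {
  all(vapply(df, function(col) is.atomic(col) && is.null(dim(col)), logical(1)))
}

# A table whose columns are all plain atomic vectors
myNewTable <- data.frame(
  id = 1:3,
  amount = c(10.5, 20, 7.25),
  flagged = c(TRUE, FALSE, TRUE)
)

# A table with a list column, which the note above says to avoid
badTable <- data.frame(id = 1:2)
badTable$items <- list(1:3, "a")

goodCheck <- hasOnlyAtomicColumns(myNewTable)
badCheck <- hasOnlyAtomicColumns(badTable)

# Only hand the table back when the check passes
if (goodCheck) {
  acl.output <- myNewTable
}
```

Stopping the script early with a clear message when the check fails may be easier to diagnose than a failed command.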
R data mapping
Analytics data types are translated into R data types when data passes between the Analytics project and the R script:
Analytics data type | R data type(s) |
---|---|
Logical | Logical |
Numeric | Numeric |
Character | Character |
Datetime | Date, POSIXct, POSIXlt |
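To see which R type each column has, you can inspect the class of every column in the data frame. The sketch below builds a hypothetical stand-in for the data frame returned by acl.readData(), with made-up column names. It does not show which of the three datetime classes Analytics chooses for a given field, because the table above does not say.

```r
# Hypothetical stand-in for the data frame returned by acl.readData()
myTable <- data.frame(
  approved = c(TRUE, FALSE),
  amount = c(125.50, 80),
  vendor = c("Acme", "Globex"),
  stringsAsFactors = FALSE
)
myTable$invoiceDate <- as.Date(c("2024-01-15", "2024-02-01"))
myTable$loggedAt <- as.POSIXct(
  c("2024-01-15 09:30:00", "2024-02-01 17:45:00"),
  tz = "UTC"
)

# Report the first class of each column, for example "Date" or "POSIXct"
columnTypes <- vapply(myTable, function(col) class(col)[1], character(1))
print(columnTypes)
```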
Performance and file size limits
Running your R script and processing the returned data takes longer when the input data exceeds 1 GB. R does not support input files of 2 GB or larger.
The number of records sent to R also affects performance. For two tables with the same file size but a differing record count, processing the table with fewer records is faster.
Handling multi-byte character data
If you are sending data to R in a multi-byte character set, such as Chinese, you must set the system locale appropriately in your R script. To successfully send a table of multi-byte data to R, the first line of the R script must contain the following function:
# Example that sets locale to Chinese
Sys.setlocale("LC_ALL","Chinese")
For more information about Sys.setlocale(), see the R documentation.
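Locale names are platform-specific, and Sys.setlocale() returns an empty string when the operating system cannot honor the request. The sketch below keeps the call on the first line and then checks the result. It assumes that assigning the return value on that line is acceptable to Analytics, which the text above does not confirm.

```r
localeResult <- Sys.setlocale("LC_ALL", "Chinese")

# An empty string means the requested locale was not applied on this system
if (identical(localeResult, "")) {
  message("The requested locale could not be set on this system.")
}

# Show the locale that is actually in effect for character handling
currentLocale <- Sys.getlocale("LC_CTYPE")
print(currentLocale)
```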
Hello world example
Analytics command
RCOMMAND FIELDS "Hello", ", world!" TO "r_result" RSCRIPT "C:\scripts\r_scripts\analysis.r"
R script (analysis.r)
srcTable<-acl.readData()
# create table to send back to ACL
output<-data.frame(
c(srcTable[1,1]),
c(srcTable[1,2])
)
# add column names and send table back to ACL
colnames(output) <- c("Greeting","Subject")
acl.output<-output
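To try the script logic outside Analytics, you can replace the acl.readData() call with a hand-built data frame. This is a sketch under assumptions: the column names V1 and V2 are hypothetical, because the names Analytics assigns without KEEPTITLE are not stated here, and stringsAsFactors = FALSE is added so the result holds character values on older versions of R.

```r
# Hypothetical stand-in for acl.readData(): one row holding the two
# expressions passed in the FIELDS parameter of the command above
srcTable <- data.frame(V1 = "Hello", V2 = ", world!", stringsAsFactors = FALSE)

# Same logic as analysis.r: build the output table from the first row
output <- data.frame(
  c(srcTable[1, 1]),
  c(srcTable[1, 2]),
  stringsAsFactors = FALSE
)
colnames(output) <- c("Greeting", "Subject")

# In Analytics, this assignment is what creates the new table
acl.output <- output
print(acl.output)
```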
Run an R script
- From the menu, select Analyze > R.
The RCOMMAND dialog box opens.
- Next to the R Script field, click Browse and navigate to the R script on your computer that you want to run.
- Click Select Fields and add one or more fields to include in the data frame that Analytics makes available in the R script.
Tip
You can also include expressions as fields in the data frame. To create an expression, click Expr and use the functions, fields, and operators available to you in the dialog box. For more information, see Expression Builder overview.
- Optional. In the RCommand Options section, define how you want to send the Analytics data to the R script.
For more information, see RCommand options.
- Optional. To filter the records that are sent to the R script, click If and use the Expression Builder dialog box to create a conditional expression to use as the filter.
For more information about creating expressions using the Expression Builder, see Creating expressions using the Expression Builder.
- To specify the output table, click To and in the File name field, enter a name for the table and associated .FIL file.
Use the folder explorer to navigate to the folder where you want to store the source data file.
Note
Analytics table names are limited to 64 alphanumeric characters, not including the .FIL extension. The name can include the underscore character ( _ ), but no other special characters, or any spaces. The name cannot start with a number.
- Optional. On the More tab of the dialog box, specify any scope options for the command.
For more information, see More tab.
- To run the command, click OK.
RCOMMAND dialog box options
RCommand options
Option | Description |
---|---|
Export with field names | Use the column titles of the source Analytics table as header values for the R data frame. This option sets the KEEPTITLE option on the command and is required if you want to retrieve data using column names in the R script. |
Column Separator | The character to use as the separator between fields when sending data to R. |
Text Qualifier | The character to use as the text qualifier to identify field values when sending data to R. |
More tab
Option | Description |
---|---|
All | Processes all records in the view (default selection). |
First | Processes from the first record in the table and includes only the specified number of records. |
Next | Processes from the currently selected record in the table and includes only the specified number of records. Note: The number of records specified in the First or Next options references either the physical or the indexed order of records in a table, and disregards any filtering or quick sorting applied to the view. However, results of analytical operations respect any filtering. If a view is quick sorted, Next behaves like First. Caution: There is a known issue in the current version with Next when running the RCOMMAND. Avoid using this option as the record reference may reset to the first record regardless of which record is selected. |
While | Uses a WHILE statement to limit the processing of records in the primary table based on criteria. Records in the view are processed only while the specified condition evaluates to true. As soon as the condition evaluates to false, the processing terminates, and no further records are considered. For more information, see Creating expressions using the Expression Builder. |