Working with field definitions

A field definition is information that delineates a single field in a print image or PDF file. Because a print image or PDF file is an image, without any metadata identifying fields and records, you need to specify one or more field definitions to identify the fields in the file, and differentiate them from surrounding data or white space.

Analytics may automatically create one or more field definitions during the file definition process, or you may have to create field definitions manually.

Using the initial data value to uniquely identify a set of records

To start manually defining a print image or PDF file, you select an initial data value, and then capture an associated set of records. If you decide to use part or all of the initial data value to uniquely identify the set of records, follow the guidelines below when choosing the field containing the initial data value.

  • The field can be positioned anywhere in the record. It does not have to be the first field in the record.
  • Look for a field in which the data has a consistent structure. For example:
    • a date field with a consistent format such as MM/DD/YYYY
    • an SSN field
    • a credit card number field
    • any ID or numeric field with a consistent structure

    You will have greater success using a consistently structured field than you will using a field with varying contents.

  • One or more consistently positioned characters in the field must be unique, or uniquely positioned, when compared to data above or below the field.
  • Avoid a field with missing values. It is possible to use a field with missing values, but it complicates the process of defining the file.
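
One way to apply these guidelines before choosing a field is to measure how consistently a candidate field is structured. The sketch below is illustrative Python, not an Analytics feature: the pattern, the function name share_matching, and the sample values are all assumptions made for this example. It reports the share of sampled values that match an MM/DD/YYYY layout, so a missing value lowers the score.

```python
import re

# Hypothetical pattern for a date field laid out as MM/DD/YYYY.
DATE_MMDDYYYY = re.compile(r"\d{2}/\d{2}/\d{4}")


def share_matching(values, pattern):
    """Return the fraction of values whose whole content matches pattern."""
    if not values:
        return 0.0
    hits = sum(1 for value in values if pattern.fullmatch(value.strip()))
    return hits / len(values)


# Sample values copied by hand from a candidate field; one value is missing.
sample = ["01/15/2015", "02/03/2015", "", "12/31/2015"]
consistency = share_matching(sample, DATE_MMDDYYYY)
print(consistency)  # 0.75
```

A score close to 1.0 suggests a consistently structured field with few missing values, which is the kind of field the guidelines above favor.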

Note

The value you use to uniquely identify a set of records does not have to be contained in the initial data value or the initial data field. It can occur anywhere in the row containing the initial data value. For more information, see Defining and importing print image (report) files and PDF files.

The Field Definition dialog box

The Field Definition dialog box is where you specify the information that delineates a field in a print image or PDF file.

The figure below shows the Field Definition dialog box with the Advanced Options expanded.

The table below explains the purpose of each item in the Field Definition dialog box:

Item name

Purpose

Name

Specifies a field name other than the generic field name assigned by Analytics.

The name you specify becomes the physical field name in the resulting Analytics table – that is, the field name in the table layout.

Type

Specifies the data type of the field.

The options are Character, Numeric, and Datetime. If the values in a numeric or datetime field are inconsistently formatted, you can try defining and importing the field as character data.

Starts on Line

Specifies which line in a record contains the start of the field.

For example:

  • If each record containing the field appears on a single line, then the value must be ‘1’.
  • If each record containing the field spans two lines, and the field starts on the second line, then the value must be ‘2’.

Starts in Column

Specifies the starting byte position of the field.

For example, if three blank spaces at the beginning of a line precede the first character in a field, then the Starts in Column value must be ‘4’ (non-Unicode Analytics) or ‘7’ (Unicode data in Unicode Analytics).

Note

The starting position of a field is critical to the success of the defining and importing process. Once a field is defined, scroll through the source file to ensure the starting position accommodates all values in the field. Adjust the starting position if necessary.

For Unicode data, typically you should specify an odd-numbered starting byte position. Specifying an even-numbered starting position can cause characters to display incorrectly.
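
The arithmetic behind the Starts in Column example can be sketched in a few lines of Python. The function name is hypothetical, and the two-bytes-per-character figure for Unicode data is an assumption inferred from the ‘7’ in the example above.

```python
def starts_in_column(leading_chars, bytes_per_char=1):
    """Return the 1-based starting byte position of a field that is
    preceded by leading_chars characters on the same line."""
    return leading_chars * bytes_per_char + 1


# Three leading spaces, one byte per character (non-Unicode).
print(starts_in_column(3))  # 4

# Three leading spaces, assuming two bytes per character (Unicode).
print(starts_in_column(3, bytes_per_char=2))  # 7
```

With two bytes per character, the result is always odd, which is consistent with the guidance to use an odd-numbered starting position for Unicode data.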

Field Width

Specifies the length in bytes of the field.

The length you specify becomes the physical field length in the resulting Analytics table – that is, the field length in the table layout.

Note

Field length is critical to the success of the defining and importing process. Once a field is defined, scroll through the source file to ensure that the field is long enough to accommodate all values in the field. Adjust the length if necessary.

For Unicode data, specify an even number of bytes only. Specifying an odd number of bytes can cause characters to display incorrectly.
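
As a rough check on field length, you can compute the width needed for the longest value you find when scrolling through the source file. This is an illustrative Python sketch: the function name and the sample values are hypothetical, and two bytes per character for Unicode data is an assumption.

```python
def field_width_bytes(values, bytes_per_char=1):
    """Return the byte width needed to hold the longest value."""
    longest = max((len(value) for value in values), default=0)
    return longest * bytes_per_char


# Sample values copied by hand from the field being defined.
names = ["Smith", "O'Shaughnessy", "Li"]

print(field_width_bytes(names))                    # 13
print(field_width_bytes(names, bytes_per_char=2))  # 26
```

Under the two-bytes-per-character assumption the Unicode width is always an even number of bytes, matching the guidance above.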

Field Height

Specifies the number of lines that constitute a single value in the field.

For example:

  • If each value appears on a single line, then the field height must be ‘1’.
  • If each value spans two lines, then the field height must be ‘2’.
  • If each value spans a varying number of lines, such as the content of a Note field, set the field height to accommodate the value that spans the greatest number of lines (see Ends on blank line below).

Decimals

(numeric fields only)

Specifies the number of decimal places in numeric values.

Format

(numeric and datetime fields only)

Specifies the format for numeric or datetime data.

The format needs to match the format of the numeric or datetime values in the source file.

For example:

  • If numbers such as -1,234.00 appear in the field, you need to select or specify the format -9,999,999.99.
  • If dates such as 31/12/2015 appear in the field, you need to select or specify the format DD/MM/YYYY. Use MMM in the format to match months that use abbreviations or that are spelled out.

Tip

If numeric or datetime data in the source file is inconsistently formatted, you can import it as character data and try to clean up the inconsistencies using Analytics functions in the resulting Analytics table.
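
To see why the format must match the source values, consider this Python analogy. It does not use Analytics format syntax: the strptime pattern %d/%m/%Y stands in for DD/MM/YYYY, and the helper names are hypothetical. A matching format parses the examples above cleanly, while a mismatched format fails.

```python
from datetime import datetime
from decimal import Decimal


def parse_number(text):
    """Parse a number such as -1,234.00 by dropping thousands separators."""
    return Decimal(text.replace(",", ""))


def parse_date(text, fmt="%d/%m/%Y"):
    """Parse a date using the given strptime format (default DD/MM/YYYY)."""
    return datetime.strptime(text, fmt).date()


amount = parse_number("-1,234.00")
day = parse_date("31/12/2015")
print(amount, day)

# Reading the same date as MM/DD/YYYY fails, because 31 is not a valid month.
try:
    parse_date("31/12/2015", fmt="%m/%d/%Y")
    mismatch_detected = False
except ValueError:
    mismatch_detected = True
```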

Convert to single field

(character fields only)

(multiline fields only)

Specifies that multiline fields defined in the source file are imported to Analytics as a single field containing the data from all lines.

For example, if you define address data that spans several lines, selecting Convert to single field creates a single field with all the address data on one line.

If you leave Convert to single field unselected (the default setting), multiline fields are imported to Analytics as multiple fields each containing the data from a single line.
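
The difference between the two settings can be pictured with a short Python sketch. It is only a model of the behavior described above: the function name, the generated field names (Address1, Address2, and so on), and the use of a single space to join lines are assumptions, not documented Analytics behavior.

```python
def import_multiline(lines, convert_to_single_field=False, name="Address"):
    """Model a multiline field as either one combined field or one field per line."""
    cleaned = [line.strip() for line in lines]
    if convert_to_single_field:
        # All lines end up in a single field (joined with a space here).
        return {name: " ".join(cleaned)}
    # Default: each line becomes its own field.
    return {f"{name}{i}": value for i, value in enumerate(cleaned, start=1)}


address_lines = ["123 Main Street", "Suite 400", "Springfield"]

single = import_multiline(address_lines, convert_to_single_field=True)
multiple = import_multiline(address_lines)
print(single)
print(multiple)
```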

Fill if Blank

Specifies that a field value is copied to subsequent blank values until a new field value occurs.

For example, if the value “01” in the Product Class field appears in only the first record of a block of Product Class 01 records, selecting Fill if Blank causes the value “01” to appear with every record.
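
The effect of Fill if Blank resembles a forward fill. The Python sketch below is a minimal illustration with a hypothetical function name and sample data; it carries the last non-blank value down until a new value occurs.

```python
def fill_if_blank(values):
    """Copy each non-blank value into the blank values that follow it."""
    filled = []
    last_seen = ""
    for value in values:
        if value.strip():
            last_seen = value
        filled.append(last_seen)
    return filled


# Product Class appears only on the first record of each block.
product_class = ["01", "", "", "02", ""]
print(fill_if_blank(product_class))  # ['01', '01', '01', '02', '02']
```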

Ends on blank line

(multiline fields only)

Specifies that each value in a multiline field ends at the first blank line that follows it.

This option addresses a situation that occurs when values in a multiline field span a varying number of lines. You must set the Field Height to accommodate the value that spans the greatest number of lines. However, doing so can cause a mismatch between values with fewer lines and field or record boundaries. Selecting Ends on blank line causes the field height, and the field or record boundaries, to dynamically resize to fit the number of lines occupied by each value.

Note

This feature only works if one or more blank lines separate each value in the multiline field.
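
The following Python sketch models the idea behind Ends on blank line: values of varying height are separated wherever a blank line occurs, and the tallest value gives the Field Height you would need to set. The function name and sample text are hypothetical, and the code is an illustration of the concept rather than the Analytics implementation.

```python
def split_on_blank_lines(lines):
    """Group consecutive non-blank lines into values, ending each value
    at the first blank line that follows it."""
    values = []
    current = []
    for line in lines:
        if line.strip():
            current.append(line.strip())
        elif current:
            values.append(current)
            current = []
    if current:
        values.append(current)
    return values


note_lines = [
    "Called customer.",
    "Left message.",
    "",
    "Invoice disputed.",
    "",
    "Paid in full.",
    "Receipt sent.",
    "Account closed.",
]

notes = split_on_blank_lines(note_lines)
# The tallest value determines the Field Height to specify.
required_height = max(len(note) for note in notes)
print(len(notes), required_height)  # 3 3
```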

Column Defaults for Reporting:

  • Width
  • Alternate Column Title
  • Suppress Totals

    (numeric fields only)

  • Control Total

    (numeric fields only)

Note

The Column Defaults for Reporting settings are optional. They do not affect processing of the field in the Data Definition Wizard. The same properties can be set later, in Analytics.

Specifies properties of the field as it appears in the default view in the resulting Analytics table, and in Analytics reports.

  • Width – Specifies the display width of the field in bytes.

    This value is used as the column size when displaying the contents of the field in Analytics views and reports.

  • Alternate Column Title – Specifies a column heading to use, instead of the field name, when displaying the field in Analytics views and reports.
  • Suppress Totals – Specifies that values in the field are not automatically totaled in Analytics reports.

    By default, Analytics automatically totals numeric fields in reports. You can suppress this behavior if the field contains data, such as unit prices, for which totals are not meaningful.

  • Control Total – Specifies that the field is a control total field.

    A control total is the sum of values in a numeric field, which can be used to check data integrity. When you extract or sort data to a new table, Analytics includes the input and output totals of a control total field in the table history. Input refers to the original table. Output refers to the new table. If the two totals match, no data was lost in the extract or sort operation.

    If you specify control totals for more than one field, the table history reports on only the numeric field with the leftmost starting position.

    Note

    The Control Total setting in the Field Definition dialog box does not create control totals when you import a print image or PDF file to Analytics. For information about creating control totals for this purpose, see Defining and importing print image (report) files and PDF files.
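
The control total comparison described above can be illustrated with a small Python sketch. The record layout, the field name amount, and the function name are hypothetical; the point is only that the total of the field in the original table (input) should equal the total in the new table (output) after an extract or sort.

```python
from decimal import Decimal


def control_total(records, field):
    """Sum the values of a numeric field across all records."""
    return sum((Decimal(record[field]) for record in records), Decimal("0"))


source_table = [
    {"amount": "100.50"},
    {"amount": "-20.25"},
    {"amount": "19.75"},
]

# Sorting reorders the records but should not change the total.
sorted_table = sorted(source_table, key=lambda record: Decimal(record["amount"]))

input_total = control_total(source_table, "amount")
output_total = control_total(sorted_table, "amount")
print(input_total, output_total, input_total == output_total)
```

Decimal is used instead of float so that currency-style values add up exactly.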