Taxonomy Representation
A taxonomy in DataX is represented by the following JSON layout. Using this layout you can establish a single taxonomy for all your syndicated/custom/ad-hoc segments/audiences, raw attributes, advertiser-specific data, region-specific data, etc., organized in a flat or a hierarchical manner. Some examples are provided in the Overview Section and Appendix A.
A node (taxonomy element) in the taxonomy can be one of the following three types:
SEGMENT - a binary inclusion/exclusion qualification against predefined criteria or an event.
ATTRIBUTE - a raw value for a user, as defined in the terminology table above.
Placeholder - an element used only as the parent of other elements.
Data is categorized as a segment or an attribute via the “type” property. A segment or an attribute is made region-specific or advertiser-specific by assigning properties such as “users” to the taxonomy element. Grouping/hierarchy is established by nesting taxonomy elements via the “subTaxonomy” property.
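The nesting described above can be explored with a short traversal. The following is a minimal Python sketch, not part of DataX itself: the helper names `walk` and `kind`, the sample IDs and names, and the choice to label an untyped node `PLACEHOLDER` are all assumptions made for illustration.

```python
from typing import Any, Iterator

Node = dict[str, Any]


def walk(nodes: list[Node], path: tuple[str, ...] = ()) -> Iterator[tuple[tuple[str, ...], Node]]:
    """Yield (path, node) for every node, depth first, parents before children."""
    for node in nodes:
        here = path + (node["name"],)
        yield here, node
        # "subTaxonomy" is optional, so treat a missing or null value as no children.
        yield from walk(node.get("subTaxonomy") or [], here)


def kind(node: Node) -> str:
    # Assumption: a node that declares no "type" is reported as a placeholder parent.
    return node.get("type") or "PLACEHOLDER"


# Hypothetical sample taxonomy: one placeholder parent with two typed children.
taxonomy = [
    {
        "name": "Retail",
        "subTaxonomy": [
            {"id": "seg-baby", "name": "Baby Product Buyers", "type": "SEGMENT"},
            {"id": "attr-age", "name": "Age", "type": "ATTRIBUTE"},
        ],
    }
]

summary = [("/".join(path), kind(node)) for path, node in walk(taxonomy)]
for entry in summary:
    print(entry)
```

The sketch reports only the type a node declares itself; it does not resolve a type inherited from a parent.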
The DataX V1 schema structure is listed below:
V1 JSON Structure
[
  {
    "id" : {string},
    "name" : {string},
    "description" : {string},
    "users" : {
      "include" : [ {string} ],
      "exclude" : [ {string} ]
    },
    "type" : {string},
    "subTaxonomy" : [
      {
        "id" : {string},
        "name" : {string},
        "description" : {string},
        "users" : {
          "include" : [ {string} ],
          "exclude" : [ {string} ]
        },
        "type" : {string},
        "subTaxonomy" : [
          {
            "id" : {string},
            "name" : {string},
            "description" : {string},
            "users" : {
              "include" : [ {string} ],
              "exclude" : [ {string} ]
            },
            "type" : {string},
            "attributeDef" : {
              "type" : {string},
              "values" : [ {string} ]
            }
          }
        ]
      }
    ]
  }
]
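To make the skeleton concrete, here is a hedged Python sketch that fills it with sample values and serializes it to JSON. Every ID, name, and the `MDM_ID_PLACEHOLDER` string is hypothetical, and writing a NUMBER range as a single pipe-delimited string inside `values` is one reading of the property descriptions rather than a confirmed format.

```python
import json

# Hypothetical sample data; only the property names come from the V1 structure.
taxonomy = [
    {
        # Placeholder parent: no "id" and no "type".
        "name": "Demographics",
        "description": "Example grouping node",
        "subTaxonomy": [
            {
                "id": "demo-sex",
                "name": "Sex",
                "type": "ATTRIBUTE",
                "attributeDef": {"type": "ENUM", "values": ["Male", "Female"]},
            },
            {
                "id": "demo-score",
                "name": "Score",
                "type": "ATTRIBUTE",
                # Assumed encoding of a range: low and high joined by a pipe.
                "attributeDef": {"type": "NUMBER", "values": ["-3.00|5.25"]},
            },
            {
                "id": "seg-flight-searchers",
                "name": "Flight Ticket Searchers",
                "type": "SEGMENT",
                # Replace with a valid MDM ID obtained from your Yahoo contact.
                "users": {"include": ["MDM_ID_PLACEHOLDER"]},
            },
        ],
    }
]

payload = json.dumps(taxonomy, indent=2)
print(payload)
```

Leaf nodes omit `subTaxonomy` entirely, and the SEGMENT node omits `attributeDef`, as the property descriptions allow.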
JSON Property Descriptions
Property Name | Type | Description |
---|---|---|
id | String | A case-insensitive unique ID for this taxonomy node. A node may represent a segment, an attribute, or a placeholder element used as the parent of another element. Taxonomy nodes that collect data (such as leaf nodes) must have an ID, and every ID must be unique across the entire taxonomy, not just within the immediate parent. A placeholder element may omit the ID; however, an ID is required for every node that needs to be referenced through the incremental taxonomy APIs. Where provided, it must be a string of 1 to 64 UTF-8 characters. You will export User/Audience data to DataX against these IDs. |
name | String | A human-friendly equivalent of ‘id’. This field is shown in the UI as a tree node for your taxonomy. It is validated for uniqueness among its siblings (under the immediate parent); two nodes may have the same name if they do not share the same immediate parent. It must be a string of up to 255 UTF-8 characters. Account managers/authorized users will create Rules/Audiences using the name as a reference. |
description | String | [optional] Free text for human reference and context-sensitive help. If provided, it must be a string of up to 1024 characters. |
subTaxonomy | Array | [optional] Enables hierarchical organization of your data set. Required only for parent nodes. Leaf nodes (lowest level segments or attributes) should not specify this property. |
users.include | Array | [optional] Restricts a node’s data to an exclusive set of users (such as first-party data for advertisers - Walmart, Target, etc.). A child node inherits its parent’s setting unless the child overrides it. Use include to restrict access to specific advertisers. Please check with your Yahoo contact for valid MDM ID values. |
users.exclude | Array | [optional] Keeps a node’s data globally available while blocking certain advertisers (such as Walmart, Target, etc.). A child node inherits its parent’s setting unless the child overrides it. Use exclude to deny specific advertisers access to certain segments. Please check with your Yahoo contact for valid MDM ID values. |
type | String | [optional] Supported values, one of the following: 1. SEGMENT – Represents a set of users who qualify against certain predefined criteria (segment) or who perform a certain action (event). 2. ATTRIBUTE – Represents a value associated with a user. This value can be a score, age, zip code, liking, category, etc. – essentially any raw attribute that is more than just a binary qualification or event represented by a predefined segment. When to use SEGMENT vs ATTRIBUTE: Use SEGMENT when the value associated with a taxonomy node is implicit (e.g., baby product buyers, people who are searching for flight tickets, etc.). Use ATTRIBUTE when a user carries an explicit value against a taxonomy node (e.g., Score, Age, etc.). Some taxonomy nodes lend themselves equally to either type, e.g., Sex can be modeled as an ATTRIBUTE (with possible values Male, Female) or as two separate SEGMENTs (Male Segment and Female Segment). Scope: Type may be specified at a parent and/or a leaf node level. When specified for a parent node, all its children inherit that type. Nodes of one type cannot be children of another type; in other words, a SEGMENT cannot be a child of an ATTRIBUTE or vice versa. A parent without a type may hold both SEGMENT and ATTRIBUTE child nodes. |
attributeDef | Object | Attribute Definition. Required only for ATTRIBUTES. Provide null or completely skip for SEGMENTS. |
attributeDef.type | String | Specialized Types: 1. ENUM: Specify this for an attribute that has fewer than 100 unique values. Provide the list in the accompanying “values” property. 2. DATETIME: Specify this for a date, time, or date-and-time attribute. It is okay to leave “values” as null or skip it completely. 3. ZIPCODE: Specify this for the zipcode component of an address. It is okay to leave “values” as null or skip it completely. Generic Types: 4. NUMBER: This could be a score, age, dollar spend, etc. Please specify a range of values in the “values” property. The value can be a negative floating-point number. A range must be delimited with a pipe (|), e.g., “-3.00|5.25” (without any spaces). Though everything can be categorized as a “String”, providing more specialized types where applicable will help account managers use your data more effectively. It will also help automated validation catch data-related issues upfront. |
attributeDef.values | Array | List of values or ranges. Mandatory for ENUM and NUMBER types. String values are case-insensitive. Examples: |
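Several rules in the table above lend themselves to a pre-submission check. The following Python sketch is an illustration only, not DataX's own validation: the function names `validate` and `parse_number_range` and the error wording are hypothetical. It also assumes that lengths are counted in Unicode code points, that sibling names are compared exactly, and that `attributeDef` is checked only on leaf ATTRIBUTE nodes.

```python
from typing import Any, Optional

Node = dict[str, Any]


def parse_number_range(value: str) -> tuple[float, float]:
    """Parse a pipe-delimited NUMBER range such as "-3.00|5.25" into (low, high)."""
    low, high = (float(part) for part in value.split("|"))
    if low > high:
        raise ValueError(f"range is reversed: {value!r}")
    return low, high


def validate(nodes: list[Node], inherited_type: Optional[str] = None,
             seen_ids: Optional[set[str]] = None, path: str = "") -> list[str]:
    """Return human-readable rule violations for a taxonomy (empty list if none)."""
    errors: list[str] = []
    seen_ids = set() if seen_ids is None else seen_ids
    sibling_names: set[str] = set()
    for node in nodes:
        name = node.get("name") or ""
        where = f"{path}/{name}"
        children = node.get("subTaxonomy") or []

        node_id = node.get("id")
        if node_id is None:
            if not children:
                errors.append(f"{where}: leaf node needs an id")
        else:
            if not 1 <= len(node_id) <= 64:
                errors.append(f"{where}: id must be 1 to 64 characters")
            key = node_id.casefold()  # ids are case-insensitive, taxonomy-wide
            if key in seen_ids:
                errors.append(f"{where}: duplicate id {node_id!r}")
            seen_ids.add(key)

        if not 1 <= len(name) <= 255:
            errors.append(f"{where}: name must be 1 to 255 characters")
        if name in sibling_names:
            errors.append(f"{where}: duplicate name among siblings")
        sibling_names.add(name)

        if len(node.get("description") or "") > 1024:
            errors.append(f"{where}: description exceeds 1024 characters")

        own_type = node.get("type")
        if inherited_type and own_type not in (None, inherited_type):
            errors.append(f"{where}: {own_type} cannot be a child of {inherited_type}")
        node_type = own_type or inherited_type

        if node_type == "ATTRIBUTE" and not children:
            attr = node.get("attributeDef") or {}
            if not attr:
                errors.append(f"{where}: ATTRIBUTE needs attributeDef")
            elif attr.get("type") in ("ENUM", "NUMBER") and not attr.get("values"):
                errors.append(f"{where}: values are mandatory for {attr['type']}")

        errors += validate(children, node_type, seen_ids, where)
    return errors


# Hypothetical taxonomy that deliberately breaks four rules.
bad = [{
    "name": "Demographics", "type": "ATTRIBUTE",
    "subTaxonomy": [
        {"id": "score", "name": "Score",
         "attributeDef": {"type": "NUMBER", "values": ["-3.00|5.25"]}},
        {"id": "SCORE", "name": "Score", "attributeDef": {"type": "NUMBER"}},
        {"id": "buyers", "name": "Buyers", "type": "SEGMENT"},
    ],
}]
errors = validate(bad)
for message in errors:
    print(message)
```

Here the second `Score` node is flagged three times (its ID collides case-insensitively, its name repeats a sibling's, and its NUMBER definition lacks `values`), and `Buyers` is flagged as a SEGMENT under an ATTRIBUTE parent.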