Formatting Rules
User data is uploaded by sending a multipart/form-data POST request, much like a Taxonomy upload. The request contains two form parts:
Form “metadata”, with Content-Type “application/json”, is the JSON formatted metadata discussed in the previous section on Metadata.
Form “data”, with Content-Type “application/octet-stream”, is the bz2 compressed stream of a set of rows separated by newline characters. The representation format of the rows varies by the API being called.
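As a rough illustration of the two form parts described above, the following Python sketch assembles a multipart/form-data body by hand using only the standard library. The function name build_multipart_body and the placeholder inputs are assumptions made for this example. The endpoint URL, authentication, and any additional headers DataX may expect are not covered in this section and are deliberately left out.

```python
import bz2
import json
import uuid


def build_multipart_body(metadata, compressed_rows):
    """Return (body, content_type) for a two-part multipart/form-data request.

    metadata        -- dict serialized as the "metadata" part (application/json)
    compressed_rows -- bz2 bytes sent as the "data" part (application/octet-stream)
    """
    boundary = uuid.uuid4().hex  # random token that will not occur in the parts
    parts = [
        ("metadata", "application/json", json.dumps(metadata).encode("utf-8")),
        ("data", "application/octet-stream", compressed_rows),
    ]
    chunks = []
    for name, part_type, payload in parts:
        header = (
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            f"Content-Type: {part_type}\r\n\r\n"
        )
        chunks.append(header.encode("ascii") + payload + b"\r\n")
    chunks.append(f"--{boundary}--\r\n".encode("ascii"))  # closing delimiter
    return b"".join(chunks), f"multipart/form-data; boundary={boundary}"


# Placeholder inputs only: the real metadata fields are defined in the
# Metadata section, and the real rows depend on the API being called.
body, content_type = build_multipart_body({}, bz2.compress(b"row-1\nrow-2"))
```

The returned content_type value would be sent as the request's Content-Type header and body as the POST body. An HTTP client library that supports multipart uploads can do the same work; the manual version is shown only to make the structure of the request visible.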
Each bz2 compressed user/audience file should be limited to 5GB. If a compressed user/audience file is larger than 5GB, split it into multiple files that are each smaller than 5GB.
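One possible way to stay under the size limit is to batch rows before compressing them, as in the Python sketch below. The names split_into_payloads and MAX_COMPRESSED_BYTES are hypothetical, and reading "5GB" as 5 * 1024**3 bytes is an assumption. Rows are grouped by their uncompressed size, which is a conservative proxy because bz2 normally shrinks text; the compressed size is still checked afterwards, since that is the figure the limit applies to.

```python
import bz2

# Assumption: "5GB" is interpreted as 5 GiB; adjust if DataX defines it otherwise.
MAX_COMPRESSED_BYTES = 5 * 1024**3


def _compress_batch(batch, max_bytes):
    """Join rows with newlines, compress, and verify the compressed size."""
    payload = bz2.compress(b"\n".join(batch))
    if len(payload) > max_bytes:
        raise ValueError("compressed batch exceeds the limit; use smaller batches")
    return payload


def split_into_payloads(rows, max_bytes=MAX_COMPRESSED_BYTES):
    """Yield bz2 payloads, each built from at most max_bytes of uncompressed rows."""
    batch, batch_size = [], 0
    for row in rows:
        encoded = row.encode("utf-8")
        # +1 accounts for the newline that separates this row from the next.
        if batch and batch_size + len(encoded) + 1 > max_bytes:
            yield _compress_batch(batch, max_bytes)
            batch, batch_size = [], 0
        batch.append(encoded)
        batch_size += len(encoded) + 1
    if batch:
        yield _compress_batch(batch, max_bytes)
```

This sketch keeps a whole batch in memory, so a production uploader working near the real limit would more likely stream rows to disk through bz2.open and start a new file when the threshold is reached. Each yielded payload would then be uploaded as its own request.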
The sections on the /audience, /audience/segment, /audience/attribute and /audience/opt-out APIs describe how each API represents a unit of data, which may be either a JSON object or simple text. Irrespective of the representation format, DataX requires that each unit of data be collapsed into a single line of input in the payload, prior to compression. This enables parallel distributed processing of data within DataX.
For example, suppose the following JSON object represents the attributes of a user (ignore the exact JSON format here; the representation format for a unit of data is discussed in detail in the next sub-sections):
{
"urn" : "user-1",
"att" : [
{
"id" : "YaScore",
"val" : 28
},
{
"id" : "Age",
"val" : 32
}
]
}
Each unit of data must be represented as a single line of input in the final payload. An attribute upload with YaScore and Age values for three users must look like the following prior to compression:
{"urn":"user-1","att":[{"id":"YaScore","val":56},{"id":"Age","val":32}]}
{"urn":"user-2","att":[{"id":"YaScore","val":34},{"id":"Age","val":44}]}
{"urn":"user-3","att":[{"id":"YaScore","val":78},{"id":"Age","val":21}]}
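A payload with one unit of data per line can be produced along the lines of the Python sketch below, which uses only the standard json and bz2 modules. The function name build_payload is an assumption for this example, and the field names simply mirror the illustrative objects above rather than a confirmed DataX schema.

```python
import bz2
import json


def build_payload(units):
    """Serialize each unit as one compact JSON line, then bz2-compress the result."""
    # With the default indent=None, json.dumps emits no line breaks, and any
    # newline inside a string value is escaped, so each unit stays on one line.
    lines = [json.dumps(unit, separators=(",", ":")) for unit in units]
    return bz2.compress("\n".join(lines).encode("utf-8"))


users = [
    {"urn": "user-1", "att": [{"id": "YaScore", "val": 56}, {"id": "Age", "val": 32}]},
    {"urn": "user-2", "att": [{"id": "YaScore", "val": 34}, {"id": "Age", "val": 44}]},
    {"urn": "user-3", "att": [{"id": "YaScore", "val": 78}, {"id": "Age", "val": 21}]},
]

payload = build_payload(users)

# Round-trip check: decompressing gives back exactly one line per user.
rows = bz2.decompress(payload).decode("utf-8").splitlines()
```

Decompressing and counting lines before an upload is an inexpensive way to confirm that no unit of data was accidentally spread across several lines.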