Best Practices


      Abstract

      Describes best practices for optimal performance and expected results when working with DataX APIs and endpoints.

      Overview

Understanding each DataX API endpoint and its expected results is important for DataX partners and developers. This document describes a series of best practices with accompanying code examples, along with notes and clarifications on Production and Sandbox differences, polling, and FAQs.

The code examples focus on how best to use the APIs described in the sections below.

      Note

      A full list of all status codes and messages is also provided.

      POST /taxonomy API

      Use this API endpoint to create a new version of your Yahoo Ad Tech taxonomy. If successful, the new taxonomy replaces and supersedes the currently active taxonomy.

      The example below describes the changes that occur from the current taxonomy when making a POST call to /taxonomy with these segments:

      1. Segment 222 is soft-deleted. Note that segment membership will not be cleared.

      2. Segment 333 is created.

      3. Segment 111 is updated.

      Example

      Current Taxonomy

      [{
                     "id": "111",
                     "name": "Segment 111"
             },
      
             {
                     "id": "222",
                     "name": "Segment 222"
             }
      ]

POST to /taxonomy with the following:

          [{
                   "id": "111",
                   "name": "Segment 112"
           },
      
           {
                   "id": "333",
                   "name": "Segment 333"
           }
      ]
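The create/update/soft-delete behavior described above can be sketched as a diff between the current and posted taxonomies. This is a minimal illustration of the documented semantics, not the actual server-side logic:

```python
def diff_taxonomy(current, posted):
    """Classify top-level segments by id: created, updated, or soft-deleted."""
    cur = {s["id"]: s for s in current}
    new = {s["id"]: s for s in posted}
    created = sorted(set(new) - set(cur))
    soft_deleted = sorted(set(cur) - set(new))  # note: membership is NOT cleared
    updated = sorted(i for i in new if i in cur and new[i] != cur[i])
    return created, updated, soft_deleted

current = [{"id": "111", "name": "Segment 111"},
           {"id": "222", "name": "Segment 222"}]
posted = [{"id": "111", "name": "Segment 112"},
          {"id": "333", "name": "Segment 333"}]

created, updated, soft_deleted = diff_taxonomy(current, posted)
# created == ["333"], updated == ["111"], soft_deleted == ["222"]
```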

An Exception Is Thrown

Making this POST call to /taxonomy causes an exception to be thrown because the parent segment 111 is assigned to mdm id 100 while its child is assigned to mdm id 123.

As a best practice, the parent segment in this example needs to also include mdm id 123.

      [{
                    "id": "111",
                    "name": "Segment 111",
                    "user": {
                            "include": ["100"]
                    },
                    "subTaxonomy": [{
                            "id": "333",
                            "name": "Segment 333",
                            "user": {
                                    "include": ["123"]
                            }
                    }]
            },
      
            {
                    "id": "222",
                    "name": "Segment 222"
            }
      ]

As a best practice, shown in the example below, omit the user block on the child so that segment 333 inherits the parent's permission, mdm id 100.

      [{
                    "id": "111",
                    "name": "Segment 111",
                    "user": {
                            "include": ["100"]
                    },
                    "subTaxonomy": [{
                            "id": "333",
                            "name": "Segment 333"
                    }]
            },
      
            {
                    "id": "222",
                    "name": "Segment 222"
            }
      ]
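The permission rule above can be checked client-side before posting. The validator below is a hypothetical helper, assuming a child either omits `user` (inheriting the parent's mdm ids) or includes only ids present in its parent's `include` list:

```python
def validate_permissions(node, inherited=None):
    """Raise if a child's mdm ids are not covered by its parent's include list."""
    own = set(node.get("user", {}).get("include", []))
    if own and inherited is not None and not own <= inherited:
        raise ValueError(
            f"segment {node['id']}: mdm ids {own - inherited} not granted by parent")
    effective = own if own else inherited  # children inherit when they omit `user`
    for child in node.get("subTaxonomy", []):
        validate_permissions(child, effective)

good = {"id": "111", "name": "Segment 111", "user": {"include": ["100"]},
        "subTaxonomy": [{"id": "333", "name": "Segment 333"}]}  # child inherits 100
validate_permissions(good)  # passes silently
```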

      PUT /taxonomy/append/<parent id> API

Use this API endpoint to send incremental taxonomy data that is merged with the currently active taxonomy. The API supports options to insert, replace, and move taxonomy nodes and their sub-trees within an existing taxonomy.

      The example below describes the changes that occur from the current taxonomy when making a PUT call to /taxonomy/append/<parent id> with these segments specified:

      Segment 444 is soft-deleted.

      Segment 555 is created.

      Note that the subTaxonomy field is treated as an overwrite.

      Current Taxonomy:

      [{
                    "id": "111",
                    "name": "Segment 111",
                    "subTaxonomy": [{
                            "id": "333",
                            "name": "Segment 333",
                            "subTaxonomy": [{
                                    "id": "444",
                                    "name": "Segment 444"
                            }]
                    }]
            },
            {
                    "id": "222",
                    "name": "Segment 222"
            }
      ]

PUT /taxonomy/append/111 with the following:

      {
            "id": "333",
            "name": "Segment 333",
            "subTaxonomy": [{
                    "id": "555",
                    "name": "Segment 555"
            }]
      }
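A sketch of the merge semantics under stated assumptions: the appended node replaces the matching child by id under the given parent, and `subTaxonomy` is an overwrite, so children absent from the new payload (like segment 444 here) are soft-deleted. This sketch ignores the move case, where the server also detaches a node from its previous location:

```python
import copy

def append_node(taxonomy, parent_id, node):
    """Merge `node` under the parent with id `parent_id`.

    An existing child with the same id is replaced wholesale (subTaxonomy is
    an overwrite); otherwise the node is appended as a new child.
    """
    def find(nodes, nid):
        for n in nodes:
            if n["id"] == nid:
                return n
            hit = find(n.get("subTaxonomy", []), nid)
            if hit is not None:
                return hit
        return None

    result = copy.deepcopy(taxonomy)
    parent = find(result, parent_id)
    children = parent.setdefault("subTaxonomy", [])
    for i, child in enumerate(children):
        if child["id"] == node["id"]:
            children[i] = node
            break
    else:
        children.append(node)
    return result

current = [{"id": "111", "name": "Segment 111",
            "subTaxonomy": [{"id": "333", "name": "Segment 333",
                             "subTaxonomy": [{"id": "444", "name": "Segment 444"}]}]},
           {"id": "222", "name": "Segment 222"}]
merged = append_node(current, "111",
                     {"id": "333", "name": "Segment 333",
                      "subTaxonomy": [{"id": "555", "name": "Segment 555"}]})
# segment 444 is gone (soft-deleted); segment 555 is created
```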

      In the example below the following occurs:

      Segment 444 is updated.

      Segment 222 is moved under segment 333.

      Current Taxonomy:

      [{
                    "id": "111",
                    "name": "Segment 111",
                    "subTaxonomy": [{
                            "id": "333",
                            "name": "Segment 333",
                            "subTaxonomy": [{
                                    "id": "444",
                                    "name": "Segment 444"
                            }]
                    }]
            },
            {
                    "id": "222",
                    "name": "Segment 222"
            }
       ]

      PUT /taxonomy/append/111 with the following:

      {
            "id": "333",
            "name": "Segment 333",
            "subTaxonomy": [
              {
                    "id": "222",
                    "name": "Segment 222"
            },
              {
                    "id": "444",
                    "name": "Segment 445"
            }]
      }

      POST /usermatch API

      Expected status

      Most successful requests to this API endpoint will return a status of ACCEPTED_WITH_ERRORS. The reason is that unmatched records will count as an error, and in most cases there will be unmatched records.
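In practice this means partner code should treat ACCEPTED_WITH_ERRORS as a normal outcome and inspect the error details, rather than failing outright. A hypothetical check (the exact set of success statuses is an assumption):

```python
SUCCESS_STATUSES = {"ACCEPTED", "ACCEPTED_WITH_ERRORS"}  # assumed success set

def is_success(status):
    """Treat accepted-with-unmatched-records as a normal, successful outcome."""
    return status in SUCCESS_STATUSES
```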

      Event Log Data (DSP API) Primary ID Behavior

The primary id reported in the event-level data coming from the DSP API switches to PXID once the first usermatch request is successfully processed. If DXIDs were used before, they are no longer returned. This behavior requires planning on the partner's side, so that you don't accidentally trigger the switch before you are ready to consume PXID-keyed event logs.

      POST /audience API

Use this API to upload any type of user data; for example, one or more segments, scores, a set of attributes, or a disjoint mixture of all of these in a single upload for a set of users.

      Note

      Audience requests sent within 8 hours of each other will be batch processed together. The API does not provide any guarantees that it will process requests in the exact order as they were received.

As a best practice, to delete users at the segment level, resubmit the user with the expiry (exp) set to 0:

      { "urn":{string}, "seg":[{"id":"1111","exp":0}] }
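For example, building such delete records for a batch of users (the `urn` values and the one-record-per-line layout are placeholders based on the example above):

```python
import json

def segment_delete_record(urn, segment_ids):
    """Build one audience record that expires the user from the given segments."""
    return {"urn": urn, "seg": [{"id": sid, "exp": 0} for sid in segment_ids]}

# One JSON record per line, as in the audience upload body:
lines = "\n".join(json.dumps(segment_delete_record(u, ["1111"]))
                  for u in ["urn-user-a", "urn-user-b"])
```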

Zip+4 Example

94519-2710

Common urn types

The most commonly used urn types are:

      • ZIP4

      • IXID

      • Mobile IDs: IDFA (iOS ID for Advertisers) and GPADVID (Google Play Advertising ID)

      • PXID

      • DXID (Deprecated)

      Filename field in the body

As a best practice, always include the filename field; it is required. If it is omitted, the request returns a 500 error code.

      Example:

      POST /v1/audience HTTP/1.1
      Host: datax.yahooapis.com
      Content-Type: multipart/form-data;boundary=xyz
      
      --xyz
Content-Type: application/json;charset=UTF-8
Content-Disposition: form-data; name="metadata"
      {
              "description" : "user qualifications – daily bucket 05/19/2013",
              "extensions" : { "urnType" : "IXID" }
      }
      --xyz
      Content-Type: application/octet-stream;charset=UTF-8
Content-Disposition: form-data; name="data"; filename="somefilename.bz2"
      
      < bz2 compressed data >
      --xyz--
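The request above can be assembled programmatically. The helper below is a hypothetical sketch that mirrors the example body and guarantees the required filename field is present:

```python
import json

def build_audience_body(metadata, data, filename, boundary="xyz"):
    """Assemble the multipart/form-data body; the filename field is required."""
    meta_part = (
        f"--{boundary}\r\n"
        "Content-Type: application/json;charset=UTF-8\r\n"
        'Content-Disposition: form-data; name="metadata"\r\n'
        "\r\n"
        f"{json.dumps(metadata)}\r\n"
    ).encode("utf-8")
    data_part = (
        f"--{boundary}\r\n"
        "Content-Type: application/octet-stream;charset=UTF-8\r\n"
        f'Content-Disposition: form-data; name="data"; filename="{filename}"\r\n'
        "\r\n"
    ).encode("utf-8")
    return meta_part + data_part + data + f"\r\n--{boundary}--\r\n".encode("utf-8")

body = build_audience_body(
    {"description": "user qualifications", "extensions": {"urnType": "IXID"}},
    b"<bz2 compressed data>",
    "somefilename.bz2",
)
```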

      Expected Status for Hashed Email

      Most successful requests to this API endpoint will return a status of ACCEPTED_WITH_ERRORS. The reason is that unmatched records will count as an error, and in most cases there will be unmatched records.

      GET /self and /errors

      GET /self

      You can get the status of a taxonomy or audience request using the /self endpoint.

      The request id is returned as part of the response when the request is submitted.

      https://datax.yahooapis.com/link/self/{requestid}

      The response that is returned is documented in this section: Metadata.

      GET /errors

      If the status indicates an error, you can download the full error using this endpoint:

      https://datax.yahooapis.com/link/errors/{requestid}

      Polling

Partners typically poll for status at a regular interval until the request completes or an error is returned.

      As a best practice, we recommend that you use the following polling intervals:

      • Taxonomy requests - every 15 minutes.

      • Audience/partner match requests - every 2 hours.
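The polling loop can be sketched as follows. The terminal status set and the `fetch_status` callable are assumptions; wire in your own call to the /self endpoint:

```python
import time

TERMINAL_STATUSES = {"ACCEPTED", "ACCEPTED_WITH_ERRORS", "FAILED"}  # assumed set

def poll_until_done(fetch_status, interval_seconds, sleep=time.sleep):
    """Call fetch_status() at the given interval until a terminal status appears."""
    while True:
        status = fetch_status()
        if status in TERMINAL_STATUSES:
            return status
        sleep(interval_seconds)

# Taxonomy requests: poll every 15 minutes; audience/partner match: every 2 hours.
# e.g. poll_until_done(lambda: get_request_status(request_id), 15 * 60)
```

Injecting the sleep function keeps the loop testable without real waiting.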

      Email Formatting

      For purposes of email normalization, note the following:

Before uploading your JSON, you need to hash each email address with SHA-256. That is, you must convert each email in your list to its hash, which you then place in your file.

      For example: If the original email is [email protected], the hashed value in the file would be d48adb3c108a657adf7597921f3bfc591ee3f00d658d2d288e0bb396ac0d5964.

      Important

      Your file name must be properly normalized using lowercase characters and contain no spaces.

DataX only supports pre-hashed files. Once the addresses are hashed with the SHA-256 function, the personal data in the files is protected, and no raw emails are ever stored.

      If an email address is not hashed in the proper format, DataX will not process the audience records.
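A minimal hashing sketch using Python's standard library. The exact normalization rules (lowercasing, trimming whitespace) are an assumption based on the note above; confirm them against the official format requirements:

```python
import hashlib

def hash_email(email):
    """Lowercase and trim the address, then return its SHA-256 hex digest."""
    normalized = email.strip().lower()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()

digest = hash_email("  User@Example.com ")
# digest is a 64-character lowercase hex string
```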

      FAQs

Q. Do all endpoint URLs for Partner Match use https://?

A. Yes, all of the DataX suite APIs use https:// for API calls.


Q. Is the 100 requests/hour quota shared between the Partner Match API and DataX, or does each API have its own 100-request limit? In theory, could you send 100 audience requests, 100 taxonomy requests, and 100 partner match requests in one hour?

A. Yes; each API has its own separate 100 requests/hour limit.


      Q. If you execute multiple partner match calls, will this not overwrite the previous match output?

      A. No, it is an append call, and will just add to the existing partner match IDs.


Q. Does the line ending in the CSV body support the newline character (\n) only?

      A. Yes.


      Q. Is a full list of all the statuses and their descriptions available?

      A. Yes. The statuses and their meaning can be found in the following table.


Status Code | Status Message | Description
202 Accepted | Uploaded request accepted | The upload request is finished.
400 DxInvalidRequest | <urnType> is not supported | The provided urnType is not supported.
400 DxJobNotFound | Cannot Find Job with id <request_id> | The request_id is not found in the DataX db.
400 Bad Request | Bad Application Id | The application id is not correct.
500 DxInternalError | Unable to Create Job | The upload job can't be created.
500 UNABLE_TO_PROCESS_REQUEST | Failed to process. Please try after sometime | The server was unavailable while processing the request.

