Address Cleanse API

Overview

The Address Cleanse API is a web service for validating, deduplicating and enriching the addresses held in imported data files.
It follows RESTful principles, accepting HTTPS requests and returning JSON response documents.
See the Address Cleanse API reference for the full interface definition.

Quick Start

Register or log in to your Address Cleanse account to get started.
Create an Address Cleanse API token as described in the Authentication section below.
To upload a file, you must also supply a job configuration file. How to create one is described in the Job Configuration section below.

Authentication / Configuration

Authentication

All Address Cleanse API methods require a security token for authorisation. Once you are logged in, this token can be generated from the web interface.

Click the vertical ellipsis menu (three vertical dots) in the top right-hand corner of the screen to open a menu. Select Export API Token; a window will appear containing the token text. Store that text for use by your application in all Address Cleanse API service calls.

The token should be sent in an HTTPS header of the format…

Authorization: Bearer $TOKEN_VALUE
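
As a minimal sketch, the header can be attached with Python's standard library. The helper name authorised_request and the placeholder token are illustrative only and are not part of the API.

```python
import urllib.request
from typing import Optional

# Service endpoint taken from the example URLs in this document.
BASE_URL = "https://www.hopewiser.com.au/online-address-cleanse/api/v1"


def authorised_request(url: str, token: str, method: str = "GET",
                       data: Optional[bytes] = None) -> urllib.request.Request:
    """Build (but do not send) a request that carries the bearer token."""
    request = urllib.request.Request(url, data=data, method=method)
    request.add_header("Authorization", f"Bearer {token}")
    return request


# "YOUR_TOKEN" is a placeholder for the text copied from Export API Token.
request = authorised_request(f"{BASE_URL}/files", token="YOUR_TOKEN")
print(request.get_header("Authorization"))
```

Passing the prepared request to urllib.request.urlopen would perform the call.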

Job Configuration

Address Cleanse API job configuration is done via the web interface once you are logged in. This only needs to be done once, as the same settings can be used for any number of input files, provided the file format and structure remain constant.

Once a job configuration has been created through the UI, it can be exported for use with your client application. A step-by-step guide to using the Address Cleanse UI is available within the Portal Help Centre.

Click on the menu icon in the top right corner and select Export settings for API. A window will appear containing the JSON text describing your settings. Save that text for use by your application; it will be needed by the /api/v1/run method.

Upload a file

Uploading a file uses three separate methods. This allows large files to be uploaded in parts and avoids import latency.

Begin

First call the begin method, giving it a size and filePrefix (name).

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/files/upload/begin?size=XXX&filePrefix=NAME

Parameters:

Name	Description	Example Value
size	The file size in bytes	16384
filePrefix	A string that will become part of the filename on the server	CustomerAddresses

Response:

{
  "fileId":"CustomerAddresses1234.dat",
  "chunkUrl":"https://filesvr.hopewiser.com.au/filesvr/upload/chunk"
}
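
The begin call could be prepared along the following lines. This is a sketch using only the standard library; the function names build_begin_url and begin_upload are hypothetical, and the response is assumed to contain the fileId and chunkUrl keys shown above.

```python
import json
import os
import urllib.parse
import urllib.request

BEGIN_URL = ("https://www.hopewiser.com.au/online-address-cleanse"
             "/api/v1/files/upload/begin")


def build_begin_url(size: int, file_prefix: str) -> str:
    """Return the begin URL with the size and filePrefix query parameters."""
    query = urllib.parse.urlencode({"size": size, "filePrefix": file_prefix})
    return f"{BEGIN_URL}?{query}"


def begin_upload(path: str, file_prefix: str, token: str) -> dict:
    """Call begin for a local file and return the decoded JSON response."""
    url = build_begin_url(os.path.getsize(path), file_prefix)
    request = urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"})
    with urllib.request.urlopen(request) as response:
        return json.load(response)  # expected keys: fileId, chunkUrl


print(build_begin_url(16384, "CustomerAddresses"))
```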

Chunk

Then call the chunk method using the chunkUrl (in place of the service endpoint used in all other methods here), the fileId returned by /begin, and an idx indicating which chunk this is, starting from zero. The file data should be sent as raw binary bytes in the body of the HTTPS request (a PUT, as in the examples below).

PUT https://filesvr.hopewiser.com.au/filesvr/upload/chunk?fileId=XXX&idx=0

PUT https://filesvr.hopewiser.com.au/filesvr/upload/chunk?fileId=XXX&idx=1

etc.

Parameters:

Name	Description	Example Value
fileId	ID returned by /begin	CustomerAddresses1234
idx	Index number for the chunk	0

It is recommended that chunks be no larger than 10 MB, especially if your application is browser based. Call this method once for each part of the file you upload, incrementing the idx parameter by 1 each time.
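
The chunking loop might look like the sketch below. It uses PUT as in the example URLs; the Content-Type value and the helper names iter_chunks, build_chunk_request and upload_file are assumptions made for illustration, not documented requirements.

```python
import urllib.parse
import urllib.request
from typing import BinaryIO, Iterator, Tuple

CHUNK_SIZE = 10 * 1024 * 1024  # stay within the recommended 10 MB


def iter_chunks(stream: BinaryIO,
                chunk_size: int = CHUNK_SIZE) -> Iterator[Tuple[int, bytes]]:
    """Yield (idx, data) pairs, with idx starting at zero."""
    idx = 0
    while True:
        data = stream.read(chunk_size)
        if not data:
            break
        yield idx, data
        idx += 1


def build_chunk_request(chunk_url: str, file_id: str, idx: int,
                        data: bytes, token: str) -> urllib.request.Request:
    """Prepare one chunk upload; the raw bytes form the request body."""
    query = urllib.parse.urlencode({"fileId": file_id, "idx": idx})
    request = urllib.request.Request(
        f"{chunk_url}?{query}", data=data, method="PUT")
    request.add_header("Authorization", f"Bearer {token}")
    # Assumed content type for raw binary data.
    request.add_header("Content-Type", "application/octet-stream")
    return request


def upload_file(path: str, chunk_url: str, file_id: str, token: str) -> int:
    """Send every chunk in order and return the number of chunks sent."""
    sent = 0
    with open(path, "rb") as stream:
        for idx, data in iter_chunks(stream):
            request = build_chunk_request(chunk_url, file_id, idx, data, token)
            with urllib.request.urlopen(request) as response:
                response.read()
            sent += 1
    return sent
```

Here chunk_url and file_id would be the chunkUrl and fileId values returned by /begin.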

End

Once the file data has been completely uploaded, call the end method with the fileId returned by /begin and, optionally, a hash field containing the SHA-256 hash of the file. If a hash is provided, it will be compared with the file data stored on the server to verify the integrity of the uploaded file.

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/files/upload/end?fileId=XXX&hash=YYY

Parameters:

Name	Description	Example Value
fileId	ID returned by /begin	CustomerAddresses1234
hash	SHA256 hash of the file (optional)	d785d959746f4180c8c6ef885d9abaccb5a43f004c718a906271c3de014eb69b
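
A sketch of computing the optional hash and building the end URL follows. It assumes the hash is sent as lowercase hexadecimal, as in the example value above; sha256_of_file and build_end_url are hypothetical helper names.

```python
import hashlib
import urllib.parse
from typing import Optional

END_URL = ("https://www.hopewiser.com.au/online-address-cleanse"
           "/api/v1/files/upload/end")


def sha256_of_file(path: str, block_size: int = 1024 * 1024) -> str:
    """Hash the file in blocks so large files need not fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as stream:
        for block in iter(lambda: stream.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()


def build_end_url(file_id: str, file_hash: Optional[str] = None) -> str:
    """Return the end URL; the hash parameter is added only when given."""
    params = {"fileId": file_id}
    if file_hash is not None:
        params["hash"] = file_hash
    return f"{END_URL}?{urllib.parse.urlencode(params)}"
```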

Retrieve a list of uploaded files

A list of uploaded files and their metadata can be retrieved using the files method, which takes no parameters and returns a JSON array containing a description of each file. This request only returns files that have been processed (run). Files that have only been uploaded, via the preceding instructions, are not shown.

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/files

Response:

[
  {
    "id": 147,
    "name": "input_address.txt",
    "inputFileServerId": "bureauInput7122032991119785308.dat",
    "cols": 6,
    "rowCount": 597,
    "delimiter": ",",
    "hasHeader": true,
    "useQuotes": false,
    "creationDate": 1510939445,
    "lastUpdate": 1510939445
  }
]

The dates in the response are given using Unix time, which is the number of seconds that have elapsed since January 1, 1970 (midnight UTC/GMT), not counting leap seconds (in ISO 8601: 1970-01-01T00:00:00Z).
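
For illustration, the creationDate and lastUpdate values can be converted to timezone-aware datetimes as sketched here; the helper name to_utc is not part of the API.

```python
from datetime import datetime, timezone


def to_utc(seconds: int) -> datetime:
    """Convert a Unix timestamp in seconds to an aware UTC datetime."""
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# creationDate from the example response above.
print(to_utc(1510939445).isoformat())
```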

Run an Address Cleanse API job

Start a cleanse job

To start a processing run on an uploaded file, call the run method, providing the inputFileId (the fileId returned by /begin) and a name to identify this job. The request body should contain the JSON settings previously generated in the web interface and exported using Export settings for API.

PUT https://www.hopewiser.com.au/online-address-cleanse/api/v1/run?inputFileId=CustomerAddresses1234&name=CleanseRun1

Parameters:

Name	Description	Example Value
inputFileId	ID returned by /begin	CustomerAddresses1234
name	A name to identify this run	CleanseRun1

Body:

{
  "protocol": 2,
  "mafId": 2,
  "inputFileDesc": {
    "cols": 7,
    "delimiter": "|",
    "hasHeader": true,
    "useQuotes": true
  },
  "inputFields": [
  {
    "name": "Add Line 1",
    "mappingId": 6,
    "mappingOrder": 9,
    "inputOrder":14069
  },
  {
    "name": "Add Line 2",
    "mappingId": 6,
    "mappingOrder": 10,
    "inputOrder":14070
  },
  {
    "name": "Add Line 3",
    "mappingId": 6,
    "mappingOrder": 11,
    "inputOrder":14071
  },
  {
    "name": "Add Line 4",
    "mappingId": 6,
    "mappingOrder": 12,
    "inputOrder":14072
  },
  {
    "name": "Add Line 5",
    "mappingId": 6,
    "mappingOrder": 13,
    "inputOrder":14073
  },
  {
    "name": "City",
    "mappingId": 6,
    "mappingOrder": 14,
    "inputOrder":14074
  },
  {
    "name": "Country",
    "mappingId": 0,
    "mappingOrder": 15,
    "inputOrder":14074
  }],
  "outputFields": [{
    "groupId": 1
  },
  {
    "groupId": 8
  },
  {
    "groupId": 2
  },
  {
    "groupId": 3
  },
  {
    "groupId": 11
  },
  {
    "groupId": 12
  },
  {
    "groupId": 13
  },
  {
    "groupId": 14
  }],
  "dedupe": {
    "duplicationLevel": 2,
    "dedupeSeparateFile": false
  }
}

Response:

{
  "runId": 49
}
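
A run request could be assembled as in this sketch. The Content-Type header and the helper name build_run_request are assumptions; the settings dictionary stands for the JSON exported from the web interface, and loading it with the json module first has the side benefit of rejecting malformed JSON before anything is sent.

```python
import json
import urllib.parse
import urllib.request

RUN_URL = "https://www.hopewiser.com.au/online-address-cleanse/api/v1/run"


def build_run_request(input_file_id: str, name: str, settings: dict,
                      token: str) -> urllib.request.Request:
    """Prepare the PUT request that starts a cleanse job."""
    query = urllib.parse.urlencode(
        {"inputFileId": input_file_id, "name": name})
    body = json.dumps(settings).encode("utf-8")
    request = urllib.request.Request(
        f"{RUN_URL}?{query}", data=body, method="PUT")
    request.add_header("Authorization", f"Bearer {token}")
    # Assumed content type for the JSON settings body.
    request.add_header("Content-Type", "application/json")
    return request


# A cut-down settings document, for illustration only.
settings = json.loads('{"protocol": 2, "mafId": 2}')
request = build_run_request("CustomerAddresses1234", "CleanseRun1",
                            settings, token="YOUR_TOKEN")
print(request.get_method(), request.full_url)
```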

Get the status of a job

Depending on the size of the file and the options selected, the processing run could take only a few seconds or several minutes. To determine the current status of the job, you can poll using the status method, giving it the runId returned from /run. This returns a block of JSON text containing the current status of the job. The returned HTTP status code also changes: while the job is running the HTTP status is 404, and once the job is complete it becomes 200.

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/run/status?runId=49

Parameter:

Name	Description	Example Value
runId	ID returned by /run	49

Response:

{
    "name": "CleanseRun1",
    "statusCode": 6,
    "statusDesc": "Processing",
    "stage": {
        "stageCount": 7,
        "currentStage": 5,
        "stageName": "Post processing",
        "stageStatus": "Postprocess",
        "stageProgress": 0.14285714285714286
    }
}

The stage part of the response will only appear whilst the processing run is in progress. Once the cleanse part of the job is complete the response will only contain the name, statusCode and statusDesc sections.

Possible status codes

Value	Description
0	Uploading
1	Uploaded
2	QueuedForTest
3	ProcessingTest
4	FinishedTest
5	QueuedForResults
6	ProcessingResults
7	AwaitingPayment
8	Paid
9	Error
10	Deleted

The number and values of the stages in the stage object are subject to change. The field names are fixed, but their values may be updated.

For example, there are currently 11 stages, but that number may increase or decrease in future. If it changes, stageCount will be updated to reflect this.
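
Polling could be structured as in the following sketch. It treats an HTTP status of 200 as completion, as described above, and stops early on status codes 9 (Error) and 10 (Deleted). The helper names, the polling interval and the timeout are illustrative assumptions, as is the expectation that a non-200 response still carries a JSON body.

```python
import json
import time
import urllib.error
import urllib.request
from typing import Callable, Tuple

STATUS_URL = ("https://www.hopewiser.com.au/online-address-cleanse"
              "/api/v1/run/status")
FAILED_CODES = {9, 10}  # Error, Deleted


def fetch_status(run_id: int, token: str) -> Tuple[int, dict]:
    """Return (HTTP status, decoded body) for one status call."""
    request = urllib.request.Request(
        f"{STATUS_URL}?runId={run_id}",
        headers={"Authorization": f"Bearer {token}"})
    try:
        with urllib.request.urlopen(request) as response:
            return response.status, json.load(response)
    except urllib.error.HTTPError as error:
        # urllib raises on 4xx/5xx; try to read the body anyway.
        try:
            return error.code, json.loads(error.read())
        except ValueError:
            return error.code, {}


def wait_for_completion(fetch: Callable[[], Tuple[int, dict]],
                        interval: float = 5.0, timeout: float = 600.0,
                        sleep: Callable[[float], None] = time.sleep) -> dict:
    """Poll until the HTTP status is 200, a failure code appears, or time runs out."""
    deadline = time.monotonic() + timeout
    while True:
        http_status, body = fetch()
        if body.get("statusCode") in FAILED_CODES:
            raise RuntimeError(f"Job failed: {body.get('statusDesc')}")
        if http_status == 200:
            return body
        if time.monotonic() >= deadline:
            raise TimeoutError("Job did not complete in time")
        sleep(interval)
```

A caller would typically pass something like lambda: fetch_status(49, token) as the fetch argument.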

Get results

Once a job is complete, various statistics and metadata are available. These can be viewed without payment. The results file, however, must be paid for before it can be downloaded.

All the results methods take the runId returned by the run method as a parameter to identify the job of interest.

Statistics

To obtain details of matches for the various categories of checks made by Address Cleanse (address matches, de-duplications and suppressions), use the statistics method…

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/results/statistics?runId=49

Parameter:

Name	Description	Example Value
runId	ID returned by /run	49

Response format:

{
  "recordCount": integer,
  "codingStats": [
    {
      "name": string,
      "description": string,
      "count": integer
    }
  ],
  "addressUpdateStats": [
    {
      "name": string,
      "description": string,
      "count": integer
    }
  ],
  "dedupeStats": [
    {
      "name": string,
      "description": string,
      "count": integer
    }
  ]
}

Response:

{
    "recordCount": 76,
    "codingStats": [
        {
            "name": "Matched",
            "description": "A correct address",
            "count": 11
        },
        {
            "name": "UnmatchedPremise",
            "description": "The premise, e.g. house number, was not found",
            "count": 1
        },
        {
            "name": "UnmatchedStreet",
            "description": "The thoroughfare, e.g. street, was not found",
            "count": 1
        },
        {
            "name": "UnmatchedTown",
            "description": "The town was not found",
            "count": 62
        },
        {
            "name": "Unmatched",
            "description": "No parts of the address could be verified",
            "count": 1
        }
    ],
    "dedupeStats": [
        {
            "name": "Unique",
            "description": "The record is unique",
            "count": 98
        },
        {
            "name": "Parent",
            "description": "The first record that is a duplicate based on the level selected",
            "count": 1
        },
        {
            "name": "Child",
            "description": "The record is a duplicate of a parent record",
            "count": 1
        }
    ]
}

The name and description for the codingStats may be one of the following:

Name	Description
Matched	A correct address
UnmatchedPremise	The premise, e.g. house number, was not found
UnmatchedStreet	The thoroughfare, e.g. street, was not found
UnmatchedTown	The town was not found
Unmatched	No parts of the address could be verified

The name and description for the addressUpdateStats may be one of the following:

Name	Description
RecordCount	The number of records where the address was updated
PostcodeAdded	The number of records where the postcode was not specified in the input but was added to the output
PostcodeUpdated	The number of records where the postcode was specified in the input but was updated in the output
SubpremiseUpdated	The number of records where the subpremise was either added or updated in the output
HouseUpdated	The number of records where the house was either added or updated in the output
StreetUpdated	The number of records where the street was either added or updated in the output
DistrictUpdated	The number of records where the district was either added or updated in the output
Unknown	Unknown

The name and description for the dedupeStats may be one of the following:

Name	Description
Unique	The record is unique
Parent	The first record that is a duplicate based on the level selected
Child	The record is a duplicate of a parent record
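
As an illustration of consuming the statistics response, the sketch below totals the codingStats entries and derives a match rate. The function name coding_summary and the match_rate figure are not part of the API; they are simply one way to summarise the documented fields.

```python
def coding_summary(statistics: dict) -> dict:
    """Summarise codingStats: per-name counts, their total, and the match rate."""
    counts = {item["name"]: item["count"]
              for item in statistics.get("codingStats", [])}
    total = sum(counts.values())
    matched = counts.get("Matched", 0)
    return {
        "counts": counts,
        "total": total,
        "match_rate": matched / total if total else 0.0,
    }


# codingStats values copied from the example response above.
example = {
    "recordCount": 76,
    "codingStats": [
        {"name": "Matched", "count": 11},
        {"name": "UnmatchedPremise", "count": 1},
        {"name": "UnmatchedStreet", "count": 1},
        {"name": "UnmatchedTown", "count": 62},
        {"name": "Unmatched", "count": 1},
    ],
}
summary = coding_summary(example)
print(summary["total"], round(summary["match_rate"], 3))
```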

Costs

To see a breakdown of the various charges incurred for the run, use the costs method…

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/results/costs?runId=49

Parameter:

Name	Description	Example Value
runId	ID returned by /run	49

Response format:

[
  {
    "name": string,
    "netcost": integer,
    "cost": integer
  }
]

Response:

[
    {
        "name": "Address Validation",
        "netcost": 3072,
        "cost": 3610
    },
    {
        "name": "De-duplication",
        "netcost": 43,
        "cost": 50
    }
]

The name may be one of the following:

Processing Charge
Address Validation
De-duplication

The cost is in Australian Dollars, excluding GST.
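
The charges in a costs response can be totalled as sketched below. The helper name total_costs is hypothetical, and the sketch makes no assumption about the unit of the integer values, which this document does not specify.

```python
from typing import Dict, List


def total_costs(charges: List[dict]) -> Dict[str, int]:
    """Sum the cost and netcost fields across all charge entries."""
    return {
        "cost": sum(item.get("cost", 0) for item in charges),
        "netcost": sum(item.get("netcost", 0) for item in charges),
    }


# Values copied from the example response above.
charges = [
    {"name": "Address Validation", "netcost": 3072, "cost": 3610},
    {"name": "De-duplication", "netcost": 43, "cost": 50},
]
totals = total_costs(charges)
print(totals)
```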

Pay

To allow the file to be downloaded, use the pay method…

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/results/pay?runId=49

Parameter:

Name	Description	Example Value
runId	ID returned by /run	49

If you have not set up an agreement with Hopewiser to allow automatic payments, this will return a URL you must manually visit in a browser to complete your payment information.

If you have set up such an agreement then you must still call the pay method, but no further action is needed.

Response (no automatic payment agreement):

{
  "invoiceUrl": "https://www.hopewiser.com.au/payment/xxxx"
}

Response (automatic payment agreement in place):

{
  "invoiceId": "196"
}
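
A client might branch on the pay response as in this sketch. It assumes that the presence of invoiceUrl means manual payment is still required and that its absence means no further action is needed; the helper name payment_url is illustrative.

```python
from typing import Optional


def payment_url(pay_response: dict) -> Optional[str]:
    """Return the URL to visit for manual payment, or None if none was given."""
    return pay_response.get("invoiceUrl")


url = payment_url({"invoiceUrl": "https://www.hopewiser.com.au/payment/xxxx"})
if url is not None:
    print("Complete payment in a browser:", url)
else:
    print("No manual payment step required")
```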

Download results file

To retrieve the results once the payment step is complete, call the file method. This returns a URL from which a zip file containing the results and statistics can be downloaded. Note that the “Authorization: Bearer” token header must also be sent when requesting the download URL.

GET https://www.hopewiser.com.au/online-address-cleanse/api/v1/results/file?runId=49

Parameter:

Name	Description	Example Value
runId	ID returned by /run	49

Response:

{
  "downloadUrl": "https://filesvr.hopewiser.com.au/filesvr/download?downloadName=AddressCleanseResults.zip&fileId=CleanseRun1234.dat",
  "fileServerId": "job-13952588965434633508814.dat"
}
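
The two-step download could be sketched as follows, sending the bearer token on both the file call and the download URL. The helper names bearer_request and download_results, and the destination path, are assumptions for illustration.

```python
import json
import shutil
import urllib.request

FILE_URL = ("https://www.hopewiser.com.au/online-address-cleanse"
            "/api/v1/results/file")


def bearer_request(url: str, token: str) -> urllib.request.Request:
    """Prepare a GET request carrying the bearer token."""
    return urllib.request.Request(
        url, headers={"Authorization": f"Bearer {token}"})


def download_results(run_id: int, token: str, destination: str) -> str:
    """Look up the download URL for a run, then stream the zip file to disk."""
    lookup = bearer_request(f"{FILE_URL}?runId={run_id}", token)
    with urllib.request.urlopen(lookup) as response:
        download_url = json.load(response)["downloadUrl"]
    # The token is also required on the download URL itself.
    with urllib.request.urlopen(bearer_request(download_url, token)) as response, \
            open(destination, "wb") as target:
        shutil.copyfileobj(response, target)
    return destination
```

For example, download_results(49, token, "AddressCleanseResults.zip") would save the archive for run 49 in the current directory.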