Late Data Handling
Disadvantages of the conventional method
- Relatively hard to maintain.
- Raw data are not aggregated, so you must aggregate and compare them yourself. Because raw files are not combined at branch level, if one of the counters is offline, the computed branch-level traffic will be lower than actual.
- No visibility on whether late data will be re-included the next day.
- Data integrity: data is inaccurate, especially when one offline counter affects the overall accuracy of counting (sales conversion, visitor counts, etc.). There is no visibility into whether the retrieved data is partial or complete, i.e. whether it is clean data with no offline counters and no unverified data.
Handling:
- Ensure the data collected is fully verified.
- Ensure the branch-level data collected has no missing data caused by an offline counter.
- Ensure the data is fully uploaded to the server, and retrieve the complete set before comparing it with sales conversion.
Video Guide
Step 1: Generate AToken
method: POST
url: https://v9.footfallcam.com/account/GenerateAccessToken
Body(payload):
{
"email": "[email protected]",
"password": "123456",
"expiration": "2024-03-15"
}
After clicking Send, you will find the AToken in the response body.
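The step above can be sketched in Python. This is a minimal sketch that only builds the request body; the credentials are the placeholder values from the guide, not real account details, and the commented-out send assumes the third-party `requests` package.

```python
import json

# Endpoint from Step 1 of the guide.
TOKEN_URL = "https://v9.footfallcam.com/account/GenerateAccessToken"

def build_token_payload(email, password, expiration):
    """Build the JSON body for the GenerateAccessToken call."""
    return {"email": email, "password": password, "expiration": expiration}

payload = build_token_payload("[email protected]", "123456", "2024-03-15")
body = json.dumps(payload)

# To actually send the request (needs network access and `requests`):
# import requests
# resp = requests.post(TOKEN_URL, json=payload)
# atoken = resp.text  # the AToken is returned in the response body
```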
Step 2: Get site details
method: GET
url: https://data.footfallcam.com/api/Sites
Header: add AToken as the key, and paste the previously generated AToken as the value.
After clicking Send, you will find the BranchId in the response body.
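Continuing the sketch, Step 2 only needs the AToken header described above. The header key `AToken` and the `BranchId` field name are taken from the guide; the commented-out call again assumes the `requests` package.

```python
# Endpoint from Step 2 of the guide.
SITES_URL = "https://data.footfallcam.com/api/Sites"

def sites_headers(atoken):
    """Headers for the Sites call: the key name 'AToken' comes from the guide."""
    return {"AToken": atoken}

# import requests
# sites = requests.get(SITES_URL, headers=sites_headers(atoken)).json()
# branch_ids = [site["BranchId"] for site in sites]  # field name per the guide
```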
Step 3: Call Cube Example
Method: POST
URL: https://cube.footfallcam.com/API/v1/load
Authorization: Bearer Token {{your access token}}
Header: KEY Content-Type, VALUE application/json
Body:
{
    "query": {
        "measures": [
            "sitearea_footfallcounting_hour.FC01_1_SUM",
            "sitearea_footfallcounting_hour.FC02_1_SUM"
        ],
        "dimensions": [
            "sitearea_footfallcounting_hour.BranchName",
            "sitearea_footfallcounting_hour.Time"
        ],
        "filters": [
            {
                "member": "sitearea_footfallcounting_hour.BranchId",
                "operator": "equals",
                "values": ["6903"]
            },
            {
                "member": "sitearea_footfallcounting_hour.isoperating",
                "operator": "equals",
                "values": ["true"]
            }
        ],
        "timeDimensions": [
            {
                "dimension": "sitearea_footfallcounting_hour.Time",
                "dateRange": [
                    "2023-07-27T00:00:00.000",
                    "2023-07-27T23:59:59.999"
                ]
            }
        ],
        "order": [
            ["sitearea_footfallcounting_hour.BranchId", "asc"],
            ["sitearea_footfallcounting_hour.Time", "asc"]
        ],
        "offset": 0,
        "limit": 100
    }
}
After clicking Send, you will find the data in the response body.
Please refer to the List of Cubes to use other cubes, and to the Metric Documentation for the metrics.
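The query body above can be built programmatically rather than pasted by hand. Below is a minimal sketch: the cube and member names come from the example above, while the helper function name and its parameters are illustrative, not part of the API.

```python
CUBE = "sitearea_footfallcounting_hour"
LOAD_URL = "https://cube.footfallcam.com/API/v1/load"

def cube_query(branch_id, start, end, measures, limit=100):
    """Assemble a Cube load query like the Step 3 example."""
    return {
        "query": {
            "measures": [f"{CUBE}.{m}" for m in measures],
            "dimensions": [f"{CUBE}.BranchName", f"{CUBE}.Time"],
            "filters": [
                {"member": f"{CUBE}.BranchId", "operator": "equals",
                 "values": [str(branch_id)]},
                {"member": f"{CUBE}.isoperating", "operator": "equals",
                 "values": ["true"]},
            ],
            "timeDimensions": [
                {"dimension": f"{CUBE}.Time", "dateRange": [start, end]}
            ],
            "order": [[f"{CUBE}.BranchId", "asc"], [f"{CUBE}.Time", "asc"]],
            "offset": 0,
            "limit": limit,
        }
    }

q = cube_query(6903, "2023-07-27T00:00:00.000", "2023-07-27T23:59:59.999",
               ["FC01_1_SUM", "FC02_1_SUM"])
# POST `q` as JSON to LOAD_URL with headers:
#   {"Authorization": "Bearer <your access token>",
#    "Content-Type": "application/json"}
```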
FAQ: How to Handle Late Data?
Why does late data happen?
Late data can happen when some devices are offline before the data is retrieved via the API. This prevents the data from being sent to our database in real time.
Solution
There is a column called 'AggregationStatus' in the following four cubes:
- site_1d_summary
- site_1h_summary
- area_1d_summary
- area_1h_summary
There are four types of AggregationStatus:
- Complete - This indicates that all the data has been fully aggregated.
- Precheck - This normally indicates that this particular hour/day is still ongoing, so the aggregation status has not yet been confirmed.
- Late Data - This indicates that late data exists and the data is not yet fully aggregated.
- Missing Data - This indicates that there is missing data that is not late. This shouldn't appear under normal circumstances.
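The four statuses above suggest a simple decision rule for consumers: only data marked Complete is safe to store permanently; everything else should be queried again later. The helper below is a hypothetical sketch of that rule, not part of the API.

```python
# Hypothetical helper: decide whether a row needs to be queried again,
# based on the AggregationStatus values documented above.
def needs_refetch(status):
    """Return True if the row should be re-queried later."""
    if status == "Complete":
        return False   # fully aggregated, safe to store
    if status == "Precheck":
        return True    # hour/day still ongoing, status not confirmed
    if status == "Late Data":
        return True    # late data exists, not yet fully aggregated
    if status == "Missing Data":
        return True    # shouldn't normally appear; worth investigating
    raise ValueError(f"unknown AggregationStatus: {status!r}")
```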
Below is an example payload and the corresponding output:
Payload
{
    "query": {
        "measures": [
            "site_1d_summary.A01",
            "site_1d_summary.A02"
        ],
        "dimensions": [
            "site_1d_summary.AggregationStatus",
            "site_1d_summary.AggregationStatusLastUpdateUTCDateTime"
        ],
        "filters": [
            {
                "member": "site_1d_summary.AreaId",
                "operator": "equals",
                "values": ["123"]
            }
        ],
        "timeDimensions": [
            {
                "dimension": "site_1d_summary.Time",
                "dateRange": [
                    "2024-08-17T00:00:00.000",
                    "2024-08-22T23:59:59.999"
                ],
                "granularity": "day"
            }
        ],
        "order": [
            ["site_1d_summary.Time", "asc"]
        ],
        "offset": 0,
        "limit": 100
    }
}
Output Data
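When processing the output, rows that are not yet Complete can be filtered out before storing or comparing with sales conversion. The sketch below assumes the response body contains a `data` array of rows keyed by the requested member names; the `sample` row values are fabricated for illustration, not actual output.

```python
def complete_rows(response):
    """Keep only rows whose AggregationStatus is 'Complete'."""
    return [
        row for row in response.get("data", [])
        if row.get("site_1d_summary.AggregationStatus") == "Complete"
    ]

# Hypothetical response shape (member names from the payload above,
# values invented for illustration):
sample = {"data": [
    {"site_1d_summary.AggregationStatus": "Complete",
     "site_1d_summary.A01": 120},
    {"site_1d_summary.AggregationStatus": "Late Data",
     "site_1d_summary.A01": 80},
]}
```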