AdaptiveCity / SmartCambridge / CDBB Digital Twin

Overview of the AdaptiveCity / SmartCambridge / CDBB Digital Twin Program

Our work involves the collection and analysis of historical and real-time sensor data from within an urban region (working with SmartCambridge) and technically similar work for in-building sensor data and building management (working with the Centre for Digital Built Britain).

We envisage a world where buildings and cities operate largely autonomously, such that the infrastructure is itself the major consumer of the reference and real-time data. In this environment, dense sensor deployments will allow accurate assessment of the current state and also provide historical data from which 'normal' patterns of behaviour can be learned and anomalies detected. Continuous analysis will be required so that issues (such as congestion) can be predicted and timely action taken.

Within Cambridge, this work involves a close collaboration between the Department of Computer Science and Technology and the Institute for Manufacturing.

More on the Centre for Digital Built Britain

This research/output/report forms part of the Centre for Digital Built Britain’s (CDBB) work at the University of Cambridge within the Construction Innovation Hub (CIH) which brings together world-class expertise from the Manufacturing Technology Centre (MTC), BRE and CDBB to transform the UK construction sector. The Construction Innovation Hub is funded by UK Research and Innovation through the Industrial Strategy Fund.

Further information regarding CDBB is available here.

And more on SmartCambridge

Smart Cambridge is exploring how data, emerging technology and digital connectivity can be used to transform the way people live, work and travel in the Greater Cambridge area and beyond.

This rapidly evolving programme is harnessing the latest technologies to improve the economic strength and sustainability of the area.

Local councils, technology businesses, university researchers and partner organisations are working together to find smart ways to tackle city challenges, such as transport and air quality.

The work is supported by the Connecting Cambridgeshire partnership programme, led by Cambridgeshire County Council, which is improving the county’s digital infrastructure with better broadband, free public WiFi and wider mobile coverage.

With investment from the Greater Cambridge Partnership and the Cambridgeshire and Peterborough Combined Authority, the Smart Cambridge programme is being scaled up from 2017 to 2020 to focus on maximising the impact of transport-related work through:

  • Better quantity, quality and use of data
  • Embedding digital solutions and emerging technology
  • Collaboration with business, community and academic sectors

The pioneering research is managed through Connecting Cambridgeshire and is overseen by a Project Board and an Advisory Group to steer the work and give technical guidance.

Further information regarding SmartCambridge is available here.

Data standards, time, timeliness and real-time data processing.

Ticking away, the moments that make up a dull day.[ref]

We hope it is clear from the design of our platform, and from the documentation included with every project, that we care a great deal about time and timeliness. The vast majority of building information management and sensor deployment projects simply collect and store data for subsequent processing (e.g. deploy some NO2 sensors in a city, collect the data over a few months, then analyze the collected information to produce a great paper). That is categorically not the approach taken in our research.

We are planning for cities and buildings of the future that will require significant autonomy of operation, with the sensors acting as the eyes and ears of that intelligent infrastructure. Incoming real-time data must be interpreted and analyzed so that patterns and derived events can be recognized promptly and timely actions taken.

A general question, which we have yet to answer, is whether all the data we work with should be treated as time-series data. For example, the properties of a building asset (like an office) are often treated as 'fixed' and traditionally captured on a working drawing; if the office is changed, that will eventually be reflected in updated versions of the drawing for the whole floor. Our model stores the changes to each of our 'static' data types effectively as a timestamped transaction log.
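
A minimal sketch of what a timestamped transaction log for a 'static' asset might look like is given below. It is an illustration only, not the platform's actual data model: the class name `AssetLog`, its methods, and the example properties are all hypothetical.

```python
from bisect import bisect_right


class AssetLog:
    """Append-only log of timestamped property changes for one asset (hypothetical)."""

    def __init__(self):
        self._times = []    # timestamp of each change, ascending
        self._changes = []  # properties that changed at that timestamp

    def record(self, ts, changes):
        # Assumption: changes are appended in time order, like a transaction log.
        if self._times and ts < self._times[-1]:
            raise ValueError("changes must be appended in time order")
        self._times.append(ts)
        self._changes.append(dict(changes))

    def state_at(self, ts):
        """Replay every change with timestamp <= ts to rebuild the asset's state."""
        state = {}
        for changes in self._changes[:bisect_right(self._times, ts)]:
            state.update(changes)
        return state


office = AssetLog()
office.record(1577836800.0, {"desks": 4, "floor": 1})  # initial description
office.record(1609459200.0, {"desks": 6})              # office refitted later
```

Asking for `office.state_at(ts)` with a time between the two entries returns the original description, so the 'fixed' data can be queried as of any moment rather than only in its latest form.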

The concepts of changing data and timely action are central to the work that we are doing. A large body of our work is concerned with effectively managing the spatial and temporal data associated with urban and in-building sensors. We believe this tackles the important real-time requirement while still supporting 'planning timescale' initiatives, such as producing a particulates heatmap for a city (or a building) from data collected over a period of a month. Prompt recognition of spatio-temporal patterns in heterogeneous sensor data requires careful consideration of the space and time coordinates, particularly when the 'sensors' are moving around.

Time

We attach timestamps to our data at the earliest opportunity and continue to add timestamps at other steps in the processing. One timestamp is more important than all the others, namely the time most sensibly associated with the sensor reading, and we give this the name acp_ts. Most sensors will have multiple other relevant times, e.g. for our CNN camera DeepDish there will be the time the image was taken, the time the object detection produced the count, the time the data was sent from the sensor, the time it arrived at our platform, and so on. In this case we would deem the time the picture was taken most appropriate for acp_ts.
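
As a rough illustration, a DeepDish-style reading carrying several timestamps might be represented as follows. Only the name acp_ts comes from the text above; the other field names (`ts_detected`, `ts_sent`, `ts_arrived`) and both functions are assumptions made for this sketch.

```python
def make_reading(sensor_id, count, image_ts, detect_ts, sent_ts, arrival_ts):
    """Build a reading whose acp_ts is the time the image was taken."""
    return {
        "sensor_id": sensor_id,
        "count": count,
        "acp_ts": image_ts,        # the time most sensibly associated with the reading
        "ts_detected": detect_ts,  # hypothetical: object detection produced the count
        "ts_sent": sent_ts,        # hypothetical: data left the sensor
        "ts_arrived": arrival_ts,  # hypothetical: data reached the platform
    }


def latencies(reading):
    """Delay of each later processing step relative to acp_ts, in seconds."""
    steps = ("ts_detected", "ts_sent", "ts_arrived")
    return {step: reading[step] - reading["acp_ts"] for step in steps}


reading = make_reading("deepdish-1", 3, 1000.0, 1000.5, 1001.0, 1001.25)
```

Keeping the later timestamps alongside acp_ts makes it possible to measure how long each step took without changing what time the reading is 'about'.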

In due course a higher-level program may decide the multiple DeepDish counts of people from spatially neighboring sensors are sufficient to raise a Covid proximity warning, and that warning becomes its own 'reading' (we would use the word 'event') flowing through the system with its own acp_ts. The relationship between these timestamps itself becomes of significant interest.
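
The idea of a derived event with its own acp_ts can be sketched as below. This is a simplified assumption of how such a rule might look: the threshold, the field names other than acp_ts, and the choice to stamp the event with the latest contributing reading's time are all hypothetical.

```python
def proximity_event(readings, threshold):
    """Return a warning event if neighbouring sensors together count too many people.

    Each reading is a dict with 'sensor_id', 'count' and 'acp_ts' keys.
    """
    total = sum(r["count"] for r in readings)
    if not readings or total < threshold:
        return None
    return {
        "event": "proximity_warning",
        # Assumption: the event is 'about' the most recent contributing reading.
        "acp_ts": max(r["acp_ts"] for r in readings),
        "total_count": total,
        # Keep the source timestamps so their relationship can be studied later.
        "source_ts": {r["sensor_id"]: r["acp_ts"] for r in readings},
    }


neighbours = [
    {"sensor_id": "deepdish-a", "count": 4, "acp_ts": 100.0},
    {"sensor_id": "deepdish-b", "count": 5, "acp_ts": 102.0},
]
event = proximity_event(neighbours, threshold=8)
```

Retaining `source_ts` in the event is one way to preserve the relationship between the original readings' times and the time of the event itself.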

Timeliness

This concept combines three factors:

  • The natural period over which sensor readings could be considered relevant for a decision or action to be taken.
  • The urgency with which a decision should be taken, e.g. to warn of a car crash or school shooting.
  • The natural time taken for a reading to be collected or event to be recognized. For example a device measuring gas concentration may take a certain amount of time to generate a reading or a system may take a while to decide that occupants of a building are not behaving 'normally'.

In summary, the timeliness required in analysing and acting upon sensor data is an important property and the work we do targets timescales of seconds or less. We do not assume simple analysis of readings from a single sensor or indeed sensors of a single type, rather an action may be based upon a cumulative recent history of sensor readings from a diverse set of sensors resulting in an action such as "close the highway".
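
One way to picture a decision based on a cumulative recent history from diverse sensors is a sliding time window, sketched below. The class `RecentHistory`, the `type` and `alarm` fields, and the decision rule are hypothetical simplifications, and the sketch assumes readings arrive in acp_ts order.

```python
from collections import deque


class RecentHistory:
    """Keep only the readings whose acp_ts falls within the last window_s seconds."""

    def __init__(self, window_s):
        self.window_s = window_s
        self._readings = deque()

    def add(self, reading):
        self._readings.append(reading)
        # Expire readings that are no longer relevant to a decision.
        cutoff = reading["acp_ts"] - self.window_s
        while self._readings and self._readings[0]["acp_ts"] < cutoff:
            self._readings.popleft()

    def alarm_types(self):
        """Sensor types that have raised an alarm within the window."""
        return {r["type"] for r in self._readings if r["alarm"]}


def decide(history, required_types):
    """Act only if every required sensor type has recently raised an alarm."""
    if required_types <= history.alarm_types():
        return "close the highway"
    return None


history = RecentHistory(window_s=5.0)
history.add({"acp_ts": 100.0, "type": "camera", "alarm": True})
history.add({"acp_ts": 103.0, "type": "traffic_speed", "alarm": True})
action = decide(history, {"camera", "traffic_speed"})
```

The window length plays the role of the 'natural period' over which readings remain relevant: once readings age out, the same rule no longer fires.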

In this context the difference between planning and action needs to be considered. For example, you can collect pollution levels for a city, analyze them ten years later, and still produce useful information that will aid the planning process for that city. In contrast, our experience with bus position data around Cambridge, UK is that once the information is a few seconds old there is little practical use for it. We derive road speed information in real time from the position reports of the buses. This information has been of interest to people analyzing traffic in the Cambridge region for several years and has been distributed to over 100 researchers, but interest in the historic position of any bus is extremely limited (to those working on bus position analysis systems who do not have their own real-time platform).
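
To make the derivation of speed from position reports concrete, the sketch below estimates a bus's speed from two consecutive reports using the great-circle (haversine) distance. It is not the platform's actual method: the field names `lat` and `lng`, the function names, and the use of straight-line distance between reports are assumptions for illustration.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in metres


def haversine_m(lat1, lng1, lat2, lng2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lng2 - lng1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def speed_mps(prev, curr):
    """Speed in metres per second between two position reports of the same bus."""
    dt = curr["acp_ts"] - prev["acp_ts"]
    if dt <= 0:
        return None  # out-of-order or duplicate report: no meaningful speed
    distance = haversine_m(prev["lat"], prev["lng"], curr["lat"], curr["lng"])
    return distance / dt


prev = {"acp_ts": 1000.0, "lat": 52.2053, "lng": 0.1218}
curr = {"acp_ts": 1010.0, "lat": 52.2063, "lng": 0.1218}
speed = speed_mps(prev, curr)  # roughly 11 m/s for about 111 m in 10 s
```

A real road-speed estimate would also need to attribute each pair of reports to a road segment, which this sketch leaves out.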

Real-time data

This is the data flowing 'live' through our platform, through the real-time analytics and pattern recognition, to the visualisation tools and actuators that affect the infrastructure. The data that has been stored ceases to be 'real-time'.

Time-series data

Time-series data associates a time with each data value, as is the case for most sensor data. We consider data stored in a database to be 'historical' data, although it is still potentially useful in analysing the real-time data. A time-series database particularly suited to time/value datasets typically speeds up queries that use temporal parameters, but it should not be confused with a 'real-time' platform, even though reference to this historical data may well help inform pattern recognition on the real-time data flowing through our platform. The fact that real-time data pouring into a time-series database means the database contains the latest sensor values does not make that database a real-time platform.

Asynchronous support

Somewhat late to the party, programming languages (e.g. Python) are adding 'async' features that help handle inherently asynchronous sensor data, improving both the clarity of the program and the efficiency of the code. The core of the Adaptive City Platform is implemented with Java/Vert.x.
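
As an illustration of the 'async' style in Python, the sketch below uses the standard asyncio library to handle readings from two simulated sensors as they arrive. It is not code from the Adaptive City Platform (whose core is Java/Vert.x); the sensor names, values, and delays are invented for the example.

```python
import asyncio


async def sensor(sensor_id, values, delay_s, queue):
    """Simulated sensor: emits each value after waiting delay_s seconds."""
    for value in values:
        await asyncio.sleep(delay_s)  # stands in for waiting on a real device
        await queue.put({"sensor_id": sensor_id, "value": value})


async def collect(expected, queue):
    """Consume readings in arrival order until the expected number is seen."""
    received = []
    while len(received) < expected:
        received.append(await queue.get())
    return received


async def main():
    queue = asyncio.Queue()
    producers = [
        asyncio.create_task(sensor("co2-1", [410, 415], 0.01, queue)),
        asyncio.create_task(sensor("temp-1", [21.5], 0.005, queue)),
    ]
    readings = await collect(3, queue)
    await asyncio.gather(*producers)
    return readings


if __name__ == "__main__":
    print(asyncio.run(main()))
```

Neither sensor blocks the other while it waits, and a single consumer processes readings in whatever order they arrive.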

Stream processing

This is essentially what we are doing with Eclipse Vert.x, which combines asynchronous programming support (in Java) with a data publish/subscribe mechanism called the EventBus. Apache Storm and Apache Kafka have gained popular support in the past couple of years, and both have the 'real-time' characteristic that data is pushed through the system. Beyond that common characteristic, the available 'stream processing' platforms differ significantly in their approach, particularly with regard to guaranteeing delivery of data and managing what is in effect a committed transaction log. For urban and in-building sensor data processing it is helpful to have a lightweight, low-latency system, recognizing that sensor data is inherently unreliable and that after a temporary outage it is often preferable to start back up immediately with new incoming data rather than delay by pumping through older sensor readings that are now less important.
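
The publish/subscribe pattern, and the preference for fresh data over a backlog, can be sketched in a few lines of Python. This is a toy in-process stand-in, not the Vert.x EventBus API: `TinyBus`, `drop_stale`, the address `bus.positions`, and the five-second age limit are all hypothetical.

```python
from collections import defaultdict


class TinyBus:
    """Toy in-process publish/subscribe bus (illustrative only)."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, address, handler):
        self._subscribers[address].append(handler)

    def publish(self, address, message):
        # Push the message to every handler registered on this address.
        for handler in self._subscribers[address]:
            handler(message)


def drop_stale(handler, max_age_s, now):
    """Wrap a handler so that readings older than max_age_s are skipped."""
    def wrapped(message):
        if now() - message["acp_ts"] <= max_age_s:
            handler(message)
    return wrapped


bus = TinyBus()
seen = []
# A fixed clock is injected here so the example is repeatable.
bus.subscribe("bus.positions", drop_stale(seen.append, 5.0, lambda: 1000.0))
bus.publish("bus.positions", {"acp_ts": 900.0, "vehicle": "A"})  # stale backlog
bus.publish("bus.positions", {"acp_ts": 998.0, "vehicle": "B"})  # fresh reading
```

After these two publishes only the fresh reading reaches the subscriber, mirroring the choice to resume with new data after an outage rather than replay older readings.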