Written by: Sondre Ninive Andersen (Development Engineer)
Electronics are permeating more and more of the environment around us, and today it is possible to use embedded devices to monitor almost anything. Low-power sensors, processors, and radios enable sensing devices to run on battery power, possibly extended by energy harvesting, with an almost indefinite lifespan. However, the plethora of available components, not only processors and radios, but also sensors, batteries, and energy harvesting solutions, makes technology choice a non-trivial part of the design process. This article outlines a simulation-based methodology, and presents some advantages it might give. This article is based on an academic paper, published at the IEEE ICIT 2022 conference.
When designing an IoT system, one of the first steps is to select components. A typical IoT system consists of a processor, a sensing device, a means of wireless connectivity, an energy buffer, and possibly some kind of energy harvesting device. The application the device will be used in sets the requirements for these parts, but not necessarily in a straightforward way.
This raises the question of how best to approach the component-selection step of the design. A typical process might start with the design goals, such as “the device needs a lifetime of six months and must capture a minimum of 80% of the interesting events”. Working from this, and for instance determining that a solar panel is the most suitable energy harvesting device, one would next look into how much energy the measurement, on-board processing, and transmission will take for every event, and make an educated guess on the average frequency of events.
This would give an average energy requirement, which would need to be met by selecting an appropriate size for the solar panel. Finally, the energy buffer would be sized to supply enough energy for the night. The approximate and averaging nature of all these calculations means that a large safety margin would have to be included, in order to gain sufficient confidence that the specifications would be met.
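The averaging approach described above can be sketched in a few lines. This is only an illustration: the function name, the parameters, and all numbers below are hypothetical and are not taken from the paper. It also assumes that events are spread evenly over the day, which is exactly the kind of simplification that forces a large safety margin.

```python
def size_by_averages(e_event_j, events_per_day, harvest_j_per_cm2_day,
                     night_hours, margin):
    """Back-of-envelope sizing from average values only (hypothetical sketch)."""
    # Average energy demand per day: energy per event times expected event count.
    daily_need_j = e_event_j * events_per_day
    # Panel area needed to harvest that much on an average day, with margin.
    panel_cm2 = margin * daily_need_j / harvest_j_per_cm2_day
    # Energy needed during the night, assuming events are spread evenly in time.
    night_need_j = daily_need_j * night_hours / 24.0
    buffer_j = margin * night_need_j
    return panel_cm2, buffer_j


# All input values here are made up for illustration.
panel, buffer = size_by_averages(
    e_event_j=0.5, events_per_day=200,
    harvest_j_per_cm2_day=20.0, night_hours=12, margin=2.0,
)
print(f"panel: {panel} cm^2, buffer: {buffer} J")
```

Note that nothing in this calculation says how the device behaves on a cloudy day or during a burst of events; the margin factor has to cover all of that.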
The main problem with the above approach is that it only looks at average values and disregards all minute-to-minute transient behavior, in terms of both energy availability and data availability. This runs the risk of over-dimensioning the components while trying to guarantee enough energy and energy buffer capacity to make the average values applicable.
If one instead creates a virtual model of the system, it becomes possible to run parametric simulations, allowing hundreds of combinations of different components, and even different control schemes, to be evaluated. This can both allow detailed insight into how the system will operate, and provide much more data on which to base the component choice.
The first step in this process is to create a model of the system. This model can be quite simple, but needs to be detailed enough that the relevant selection of components can be captured. The model must encompass both the operation of the IoT-system, and the environment in which it is placed, including a model of the energy harvesting potential (if relevant), and the characteristics of the phenomena to be measured.
Within this model, the relevant performance metrics need to be defined. These can be internal to the system, such as the fraction of time or the maximum time span spent with no energy available, or they can be related to the measurements, such as the average quality of the transmitted measurements. If the phenomenon to be measured occurs in events, a relevant metric might be the fraction of events that are measured and transmitted.
With the model defined, a simulation framework can be implemented, and simulations can be run. These can be single simulations, evaluating the performance of a certain system configuration, or a batch of simulations can be run, with one or more parameters adjusted across several values.
If a batch of simulations is run, with two parameters swept across a range of values, the metrics can be presented in 2D plots. These plots can then give a lot of useful information about the system behavior, such as different operating regimes, trade-offs present between metrics, and optimal parameterizations.
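A two-parameter sweep of this kind might be organized as in the sketch below. The `sweep` helper and the `toy_metric` function are hypothetical; `toy_metric` is only a placeholder standing in for a full simulation run that returns one metric value. The resulting grid has one row per value of the second parameter and one column per value of the first, which is the layout a heatmap plotting routine would typically expect.

```python
from itertools import product


def sweep(simulate, xs, ys):
    """Run simulate(x, y) for every combination.

    Returns grid[i][j] holding the result for ys[i] (row) and xs[j] (column).
    """
    grid = [[None] * len(xs) for _ in ys]
    for (i, y), (j, x) in product(enumerate(ys), enumerate(xs)):
        grid[i][j] = simulate(x, y)
    return grid


def toy_metric(threshold, queue_size):
    # Placeholder for a real simulation; the formula is made up.
    return round((1.0 - threshold) * min(queue_size, 4) / 4, 3)


thresholds = [0.0, 0.5]   # horizontal axis
queue_sizes = [1, 4]      # vertical axis
grid = sweep(toy_metric, thresholds, queue_sizes)
for n, row in zip(queue_sizes, grid):
    print(n, row)
```

Because every grid point is an independent simulation, the loop could also be distributed across processes if single runs are slow.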
More details about this example can be seen in the academic paper published at the IEEE ICIT 2022 conference.
As an example, we will consider a device intended to monitor the vibrations of a bridge in order to evaluate changes in its structural state over time. The sensor will be solar powered, and will use an accelerometer to measure the ring-down vibrations after the structure has been excited by a vehicle, meaning the measurement opportunities occur as events. In order to save on transmission energy, the vibration data will be processed on-board, and we’ll assume that the quality of these events can be estimated both before and, more accurately, after processing (for instance by analyzing the amount of noise present). This gives a data pipeline which looks like this:
Many parameters can be adjusted, including the size of the solar panel, the size of the energy buffer, the energy requirements of the sensor, processor, or communication modem (ES, EP, and ET in the figure respectively), and more. In this case, we will mainly investigate the effects of the power-management policy.
The policy we will examine will always measure events (if there is enough energy), and the result is stored in a prioritized queue capable of storing up to n events. When the system has enough energy to process and transmit an event, it will select the highest quality measurement in the queue to process and transmit. If, at any point, an event is estimated to have a quality lower than a threshold, θ, it is discarded.
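A minimal sketch of such a policy in a discrete-time simulation is shown below. It is not the simulator used in the paper. All default values are made up, and several simplifications are assumptions: events arrive at random with a fixed probability per time step, energy is harvested at a constant rate, each event has a single quality value (rather than separate estimates before and after processing), processing and transmission are merged into one energy cost, and a full queue drops its lowest-quality entry.

```python
import bisect
import random


def simulate(n, theta, steps=2000, p_event=0.2, harvest=1.0,
             e_sense=1.0, e_proc_tx=6.0, capacity=20.0, seed=0):
    """Toy simulation of the queue-and-threshold policy (hypothetical model)."""
    rng = random.Random(seed)
    energy = capacity
    queue = []      # kept sorted by quality, so the best entry is last
    generated = 0
    sent = []       # (quality, delay) for each transmitted event
    for t in range(steps):
        energy = min(capacity, energy + harvest)   # constant harvesting
        if rng.random() < p_event:                 # an event occurs
            generated += 1
            quality = rng.random()
            if energy >= e_sense:                  # measure if energy allows
                energy -= e_sense
                if quality >= theta:               # discard below threshold
                    bisect.insort(queue, (quality, t))
                    if len(queue) > n:             # assumed overflow rule:
                        queue.pop(0)               # drop the lowest quality
        if queue and energy >= e_proc_tx:          # process and transmit best
            energy -= e_proc_tx
            quality, t0 = queue.pop()
            sent.append((quality, t - t0))
    return generated, sent


generated, sent = simulate(n=5, theta=0.3)
print(generated, len(sent))
```

Keeping the queue sorted makes both operations trivial: the best measurement is taken from the end of the list, and the worst is dropped from the front when the queue overflows.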
Next, we define three metrics to evaluate the system. The first is the ratio of the generated events that are transmitted (fT), the second is the average delay between an event occurring and its transmission (τ̄), and the third is the average quality of transmitted events (v̄).
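As a sketch, the three metrics could be computed from a simulation log as follows. The log format is an assumption: `generated` is the number of events that occurred, and `sent` is a list of hypothetical (quality, delay) pairs, one per transmitted event.

```python
def metrics(generated, sent):
    """Return (f_T, mean delay, mean quality) from an assumed log format."""
    if generated == 0 or not sent:
        # Nothing transmitted: the averages are undefined.
        return 0.0, float("nan"), float("nan")
    f_t = len(sent) / generated                          # transmitted ratio
    mean_delay = sum(d for _, d in sent) / len(sent)     # average delay
    mean_quality = sum(q for q, _ in sent) / len(sent)   # average quality
    return f_t, mean_delay, mean_quality


# Made-up log: four events occurred, two of them were transmitted.
f_t, tau, v = metrics(generated=4, sent=[(0.5, 0), (1.0, 4)])
print(f_t, tau, v)
```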
With the three metrics and the two parameters defined, we can implement the system, run the simulations, and plot heatmaps for each metric. These plots show a lot of interesting information about the operation of the system, and can potentially be used as guidance if this system were to be designed and deployed.
In the plots, each point represents a separate simulation. The vertical axis shows the queue size, and the quality threshold value is shown on the horizontal axis. The color scale shows the value of each metric. The plots are divided into three regions, A, B, and C, in which the system operates in qualitatively different modes.
In region A, both the transmitted ratio and the average quality seem to be more or less constant, with the average latency being the most sensitive metric to parameterization. In this region, we can see that both decreasing the queue capacity and increasing the quality threshold result in lower average latency.
However, if the queue capacity is decreased too much, the system might enter region B. In this region, the average quality begins to suffer, as the queue is too small to adequately prioritize the most valuable events. This region is characterized by a queue-size-controlled trade-off between quality and latency in the events, while the transmitted ratio remains more or less constant.
Finally, region C signifies a qualitatively different mode of operation, as the available energy is now sufficient to fully handle all events with value above the threshold. In this energy-abundant mode of operation, every event has zero latency, the performance is independent of the queue size (since all events are handled immediately), and the average quality and transmitted ratio are determined only by how picky the system is about which events it will transmit.
Using these plots, and recognizing that the data used for the simulation is not perfect, we might conclude that in this example, a threshold θ=0.3 and a queue size n=5 might be a good choice of parameters. This location in the plots has decent values for all the metrics, while being somewhat separated from the steepest gradients, thus avoiding massively different performance if the simulations are a bit off.
Obviously, this approach comes with a higher up-front cost, as the system needs to be modelled and a simulator needs to be implemented. Additionally, the environment and system architecture must be known to a certain degree.
However, this approach has several potential advantages. Simulating the system allows greater insight into its operation, which means that any interesting statistics may be extracted. Questions that would be difficult to answer with static analysis, such as “What is the probability that the system will be able to measure an event occurring at 2:00 AM?”, can easily be answered by evaluating the simulation results.
This means that the engineer can reduce the required margins in component choices and tune control algorithms before the system is tested in the field. This not only provides an opportunity to reduce the cost of producing and operating the system, but also increases the confidence that the system will perform as expected.
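A question like the one above reduces to counting over the simulation log. The sketch below assumes a hypothetical log of (hour of day, measured) pairs, one per event that occurred during the simulation; the log entries shown are made up.

```python
from collections import defaultdict


def capture_probability_by_hour(event_log):
    """Estimate, per hour of day, the fraction of events that were measured."""
    total = defaultdict(int)
    hit = defaultdict(int)
    for hour, measured in event_log:
        total[hour] += 1
        hit[hour] += bool(measured)
    return {hour: hit[hour] / total[hour] for hour in total}


# Made-up log: four events at 02:00 (three measured), one at 14:00.
log = [(2, True), (2, False), (2, True), (2, True), (14, True)]
probs = capture_probability_by_hour(log)
print(probs[2])
```

The estimate is only as good as the number of simulated events in each hour, so long simulations or several runs with different random seeds would be needed for a stable answer.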