Machine learning on resource-constrained ubiquitous devices suffers from high energy consumption and slow execution. The number of clock cycles consumed by arithmetic instructions has an immediate impact on both. In computer systems, the number of consumed cycles depends on the particular operations and the types of their operands. We propose a new class of probabilistic graphical models that approximates the full joint probability distribution of discrete multivariate random variables by relying only on integer addition, integer multiplication, and binary bit-shift operations. This allows us to sample from high-dimensional generative models and to use structured discriminative classifiers even on computational devices with slow floating-point units, or in situations where energy must be saved. While theory and experiments on random synthetic data suggest that hard instances (leading to a large approximation error) exist, experiments on benchmark and real-world data show that the integer models achieve qualitatively the same results as their double-precision counterparts. Moreover, we investigate clock-cycle consumption on two hardware platforms; the results show that the resource savings due to the integer approximation are even larger on low-end hardware. The integer models consume half the clock cycles and a small fraction of the memory of ordinary undirected graphical models.
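To illustrate the kind of arithmetic this restriction permits, consider potentials whose values are powers of two: if every active feature contributes a non-negative integer parameter, its potential 2^theta never has to be materialized, and the product of potentials over a configuration collapses to a single left shift by the sum of the parameters. The following C sketch shows this reduction under exactly that assumption; the names (config_weight, theta, active) are illustrative and not part of the proposed models' interface.

```c
/* Sketch under the assumption that each active feature i contributes an
 * integer parameter theta[i] >= 0 with potential 2^theta[i]; the product of
 * potentials then equals 1 << (sum of the theta[i]). Names are hypothetical. */
#include <stdint.h>
#include <stdio.h>

/* Unnormalized weight of one configuration, using only integer additions
 * and a single bit shift -- no floating-point instruction is issued. */
static uint64_t config_weight(const uint32_t *theta, const int *active, int n_active)
{
    uint32_t shift = 0;
    for (int i = 0; i < n_active; ++i)
        shift += theta[active[i]];      /* integer additions only */
    return (uint64_t)1 << shift;        /* product of powers of two */
}

int main(void)
{
    uint32_t theta[4] = {1, 3, 0, 2};   /* hypothetical integer parameters */
    int active[2]     = {1, 3};         /* features switched on by the configuration */
    printf("weight = %llu\n",
           (unsigned long long)config_weight(theta, active, 2)); /* 2^(3+2) = 32 */
    return 0;
}
```

On hardware without a fast floating-point unit, replacing the exponentiation and multiplication of ordinary log-linear potentials by this addition-and-shift pattern is precisely what saves clock cycles.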
Pushing machine learning towards the edge often implies restricting it to ultra-low-power (ULP) devices with rather limited compute capabilities. Such devices are usually distributed in space and time. Nevertheless, the setting in which the training data is distributed among several devices in a network with presumably high communication costs has not yet been investigated. We close this gap by deriving and exploiting a new property of integer models. More precisely, we present a model averaging scheme whose communication complexity is sub-linear in the parameter dimension d, and we provide an upper bound on the global loss. Experimental results on benchmark data show that the aggregated model is often on par with the non-distributed global model.
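As a rough illustration of why aggregating integer models can be communicated cheaply, the C sketch below averages two local models by exchanging only their non-zero (index, value) pairs, so the message size grows with the number of non-zeros rather than with d. This is a simplified illustration under our own assumptions (sparse integer parameter vectors, plain element-wise averaging); it is not the exact averaging scheme or the loss bound derived here, and all names (entry_t, aggregate, dev0, dev1) are hypothetical.

```c
/* Illustrative sketch only: sparse integer messages, element-wise averaging. */
#include <stdint.h>
#include <stdio.h>

#define D 8   /* hypothetical parameter dimension */

typedef struct { uint32_t idx; int32_t val; } entry_t;  /* one sparse message entry */

/* Accumulate one device's sparse message into the global parameter sum. */
static void aggregate(int64_t *sum, const entry_t *msg, int n)
{
    for (int i = 0; i < n; ++i)
        sum[msg[i].idx] += msg[i].val;   /* integer additions only */
}

int main(void)
{
    int64_t sum[D] = {0};
    entry_t dev0[] = {{1, 4}, {5, 2}};   /* non-zero entries of device 0 */
    entry_t dev1[] = {{1, 2}, {6, 6}};   /* non-zero entries of device 1 */
    aggregate(sum, dev0, 2);
    aggregate(sum, dev1, 2);
    for (int j = 0; j < D; ++j)          /* averaged global model (2 devices) */
        printf("theta[%d] = %lld\n", j, (long long)(sum[j] / 2));
    return 0;
}
```

When the local models are sparse, each device transmits far fewer than d entries, which is the sense in which communication stays sub-linear in the parameter dimension.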