I’m a software developer working in the cellular IoT space and new to this group. My name is Matt. I’ve struck a deal with a reputable book publisher to write the first book on cellular connectivity in the IoT space. It’s a practical guide to the many challenges of establishing and maintaining cellular connectivity, especially for mobile systems moving in and between regions of poor connectivity. The book should be released in late 2024.
My Area of Interest
My company produces and sells SpotMyBus (see spotmybus.com), a system for tracking and managing fleets of school buses. Our customers are school districts and we serve districts in many states.
We designed and manufacture our own tracking device. It runs Linux (Ubuntu 20.04), and uses a hybrid keypad/text-to-speech user interface. When placed in “debug” mode, it calls out signal strength, MNO, and distance traveled since previous report every 10 seconds, while also calling out connections, disconnections, and connection progress (while trying to connect). It logs everything it speaks, and a whole lot more.
We also design and manufacture our own cellular modems. After designing a modem around a u-blox module (SARA R410) that proved unstable and unreliable, and getting absolutely no technical support (after having purchased a reel of modules), we threw in the towel on u-blox and switched to Quectel. We developed modems around their BG95/BG96/EG91 family of modules and have found their modules and support to be outstanding.
I have, for some time, wanted to additionally report the distance to the connected cell tower. Theoretically, if I know the signal strength at Point A and at Point B (with both points registered on the same tower), the distance between Point A and Point B, and the relationship between distance to tower and signal strength, then I can calculate the distance from each of the points to the tower, assuming the tower, Point A, and Point B all lie on a single line.
In practice, signal strength measurements frequently vary by 3 dB (often much more) from one reading to the next even when a device is stationary, so as a practical matter that approach does not seem useful (I think).
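For readers who want to experiment with the two-point idea, here is a minimal sketch. It assumes a log-distance path-loss model (signal falls by `10 * n * log10(d)` dB with distance `d`), which is my assumption and not something the modem or OpenCellID provides. It also assumes both points lie on the same side of the tower. The function name and the default exponent are hypothetical.

```python
import math

def distances_to_tower(rssi_a_dbm, rssi_b_dbm, separation_m, path_loss_exponent=3.0):
    """Estimate (distance A to tower, distance B to tower) in meters.

    Assumptions (all hypothetical): tower, A and B are collinear, A and B are
    on the same side of the tower, and signal strength follows a log-distance
    path-loss model with the given exponent.
    """
    delta_db = rssi_a_dbm - rssi_b_dbm
    if delta_db == 0:
        raise ValueError("equal readings give no distance information")
    # Ratio of the farther distance to the nearer one implied by the dB gap.
    ratio = 10 ** (abs(delta_db) / (10 * path_loss_exponent))
    d_near = separation_m / (ratio - 1)
    d_far = d_near + separation_m
    # The stronger reading belongs to the nearer point.
    return (d_near, d_far) if delta_db > 0 else (d_far, d_near)
```

Because the dB difference sits in an exponent, an error of a few dB in either reading moves the estimate a long way, which matches the practical concern about reading-to-reading variation.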
Recently, I started poking around the Internet looking for the kind of data provided by the OpenCellId project.
I only found opencellid.org about a week ago, so my understanding is still developing. If I make any incorrect statements or assumptions, I hope some kind reader will correct me.
I downloaded the data for the US, massaged it a bit to discard fields I don’t need, reduced the records to 17 (binary) bytes per cell, and organized the data for quick disk-based lookups. The file is 85 MB and fits on my tracker, so now it can do look-ups on-the-fly and call out distance-to-tower too (every 10 seconds). Oops, actually not.
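As an illustration of what such a file can look like, here is a sketch of one possible 17-byte record and a binary search over a sorted file. The field layout is hypothetical and is not my actual format: a one-byte MCC offset (assuming US MCCs in the 310 to 316 range), 16-bit MNC, 16-bit area code, 32-bit cell ID, and lat/lon stored as signed microdegrees.

```python
import io
import struct

# Hypothetical layout: MCC offset, MNC, area, cell ID, lat, lon (microdegrees).
# Big-endian with no padding, so the packed key bytes sort like the key tuple.
RECORD = struct.Struct(">BHHIii")   # 1 + 2 + 2 + 4 + 4 + 4 = 17 bytes
KEY = struct.Struct(">BHHI")        # first 9 bytes of every record
MCC_BASE = 310                      # assumption: US-only data

def pack_cell(mcc, mnc, area, cell, lat, lon):
    """Pack one cell into a 17-byte record."""
    return RECORD.pack(mcc - MCC_BASE, mnc, area, cell,
                       round(lat * 1e6), round(lon * 1e6))

def lookup(f, n_records, mcc, mnc, area, cell):
    """Binary-search a file of sorted records; return (lat, lon) or None."""
    key = KEY.pack(mcc - MCC_BASE, mnc, area, cell)
    lo, hi = 0, n_records
    while lo < hi:
        mid = (lo + hi) // 2
        f.seek(mid * RECORD.size)
        rec = f.read(RECORD.size)
        if rec[:KEY.size] < key:
            lo = mid + 1
        elif rec[:KEY.size] > key:
            hi = mid
        else:
            lat_u, lon_u = RECORD.unpack(rec)[4:]
            return lat_u / 1e6, lon_u / 1e6
    return None

# Usage with an in-memory "file"; on a device this would be an open disk file.
records = sorted([
    pack_cell(310, 410, 100, 5000, 40.0, -75.0),
    pack_cell(311, 480, 200, 1234567, 41.5, -90.25),
])
cell_file = io.BytesIO(b"".join(records))
print(lookup(cell_file, len(records), 311, 480, 200, 1234567))
```

With fixed-size sorted records, each lookup needs only a couple dozen small reads even for millions of cells, so nothing has to be held in memory.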
Cell Location vs. Cell Tower Location
The lat/lon coordinates associated with each cell are an estimate of the center of each cell (CC), not the cell tower location (CTL). To be clear, the CTL is the lat/lon point at which the cell tower providing coverage for the cell is located. The CC is an estimate of the lat/lon of the center of the region for which the CTL provides coverage. And, as I understand the OpenCellId project, the CC and CTL will never be the same, because the CTL will always be on an edge of the cell.
This seems counterintuitive if you imagine (incorrectly) that the antenna(s) for a cell gets mounted on a tower and transmits/receives in all directions. In that case, the lat/lon of each CC and lat/lon of its CTL would be the same. But take a look at this diagram,
(from the OpenCellID wiki).
Typically, but not always, an MNO wants coverage from a cell tower to extend in all directions. In that case, the MNO hangs three sets of antennas (each set typically includes one transmitting antenna and two receiving antennas) on a tower at 120-degree intervals around the tower’s z-axis. Each antenna set covers 120 degrees, and together they cover a region approximately centered on the CTL.
Each of those antenna sets creates a separate cell and has its own unique identifier tuple, where all three cells would share some fields of the tuple and each would have a different cell-specific field. Notice in the diagram there are three yellow dots. Each corresponds to the center of the coverage region of one cell. Now it should be obvious that the CC and the CTL can never be the same: the CTL is always on the cell border, and the distance between CC and CTL will be approximately 1/2 the span of a cell.
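To make the geometry concrete, here is a minimal sketch that places three sector centers around a tower. It assumes idealized sectors at azimuths 0, 120, and 240 degrees, a CC sitting half a cell span out along each azimuth, and a flat-earth approximation that is only reasonable over a few kilometers. The function names and the example span are illustrative assumptions, not OpenCellID values.

```python
import math

EARTH_RADIUS_M = 6371000.0

def offset(lat, lon, bearing_deg, distance_m):
    """Move distance_m from (lat, lon) along a bearing (flat-earth approximation)."""
    b = math.radians(bearing_deg)
    dlat = distance_m * math.cos(b) / EARTH_RADIUS_M
    dlon = distance_m * math.sin(b) / (EARTH_RADIUS_M * math.cos(math.radians(lat)))
    return lat + math.degrees(dlat), lon + math.degrees(dlon)

def sector_centers(tower_lat, tower_lon, cell_span_m, azimuths=(0, 120, 240)):
    """Idealized cell centers (CCs): half a cell span from the tower per sector."""
    return [offset(tower_lat, tower_lon, az, cell_span_m / 2) for az in azimuths]

# Example: a tower with three cells that each span roughly 4000 m.
for cc_lat, cc_lon in sector_centers(40.0, -75.0, 4000.0):
    print(f"{cc_lat:.5f}, {cc_lon:.5f}")
```

In this idealized picture none of the three CCs coincides with the tower, and each sits about half a cell span away from it.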
So, if you seek cell tower locations (CTLs), as I do, the lat/lon estimates in the data files are not what you’re looking for.
My interest in cell tower locations (tied to cell identifiers) is to develop a deeper understanding of the relationships between signal strength and quality, distance to tower, and tower handoff behavior. Cell tower location appears to me central to that investigation.
Who Uses Cell Location Data, and for What?
For my purposes CCs are useless, whereas CTLs would be very useful. I understand that OpenCellID does not exist to serve my needs ;), but from discussion threads I’ve read there are others with strong interest in CTLs too.
When working with a CTL, generally, the signal gets stronger as a device gets closer to its lat/lon and weaker as the device moves away. Working with a CC is a different matter. Draw a line with one endpoint at the CTL (call it point A), passing through the CC and extending somewhat farther to another endpoint (point B), and place a cellular modem on the line at point B. As the modem moves toward point A (the tower), the signal gets stronger as it approaches the CC, and continues to get stronger after it passes the CC and continues toward A. Distance calculations from the modem’s position to the CC will show signal strength increasing as distance to CC decreases (from B to CC) and then show signal strength also increasing as distance to CC increases (from CC to A).
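The walk from B to A can be simulated in a few lines. This sketch assumes a log-distance path-loss model with a made-up reference level and exponent, a CC 2000 m from the tower, and a modem starting 4000 m out; every number and name here is hypothetical and only serves to show the shape of the relationship.

```python
import math

def modeled_rssi_dbm(distance_to_tower_m, rssi_at_1km_dbm=-85.0, path_loss_exponent=3.0):
    """Hypothetical signal model: log-distance path loss referenced to 1 km."""
    return rssi_at_1km_dbm - 10 * path_loss_exponent * math.log10(distance_to_tower_m / 1000.0)

def walk_toward_tower(cc_m=2000.0, start_m=4000.0, step_m=500.0):
    """Move along the line from B toward A (the tower at position 0).

    Returns rows of (distance to CTL, distance to CC, modeled RSSI in dBm).
    """
    rows = []
    pos = start_m
    while pos >= step_m:
        rows.append((pos, abs(pos - cc_m), modeled_rssi_dbm(pos)))
        pos -= step_m
    return rows

for to_ctl, to_cc, rssi in walk_toward_tower():
    print(f"to CTL {to_ctl:6.0f} m   to CC {to_cc:6.0f} m   {rssi:7.1f} dBm")
```

Distance to the CTL falls steadily while the modeled signal rises, whereas distance to the CC falls to zero and then grows again while the signal keeps rising.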
I can see many use cases for CTL data, but I’m not seeing the use cases for CC data. Can some readers inform me?
It’s not even clear to me that OpenCellID data can be used effectively to answer the question: is cell coverage available at a particular lat/lon? I’m not saying it can’t, just that it’s not clear to me. If, for example, the Range for a given CC is small, say 1000 m, then it seems likely coverage will be available. If the Range is 10,000 m, it seems less likely. Distance from the CC becomes a wildcard. A greater distance from the CC will increase the likelihood of connectivity if the distance to the CTL coincidentally decreases, but will decrease the likelihood of connectivity if the distance to the CTL coincidentally increases.
Estimating Cell Tower Locations
Data reported to OpenCellID must have <lat/lon> to be accepted, but is not required to include signal strength information. Can anyone comment on the portion of the cell data that does include signal strength information?
It seems obvious that data which includes signal strength info can be used to approximate CTLs since closer proximity to a tower correlates well with signal strength.
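One simple way to turn that correlation into a position estimate is a signal-weighted centroid, sketched below. This is a generic technique and not OpenCellID’s method. It weights each sample by its received power in linear units (milliwatts), so strong readings dominate. The function name and sample format are assumptions, and for a sectorized cell the result will still be pulled toward the places where samples were collected.

```python
def estimate_tower_location(samples):
    """Signal-weighted centroid of measurement positions.

    samples: iterable of (lat, lon, rssi_dbm) for a single cell.
    Returns an estimated (lat, lon) for the tower.
    """
    total_w = lat_sum = lon_sum = 0.0
    for lat, lon, rssi_dbm in samples:
        w = 10 ** (rssi_dbm / 10.0)   # convert dBm to milliwatts
        total_w += w
        lat_sum += w * lat
        lon_sum += w * lon
    if total_w == 0.0:
        raise ValueError("no samples")
    return lat_sum / total_w, lon_sum / total_w

# The strong reading pulls the estimate almost entirely toward its position.
print(estimate_tower_location([(40.00, -75.0, -60.0), (40.01, -75.0, -100.0)]))
```

Weighting in linear power is aggressive: a 40 dB gap means a 10,000-fold difference in weight. A gentler weighting may be more robust to the reading-to-reading noise mentioned earlier.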
Intuitively (dangerous word), it seems no more difficult to estimate CTL than CC and I expect the Range values (where Range would be related to CTL rather than CC) would have less variance for CTL than CC.
If CTLs can be estimated for, say, the three cells hanging together on a tower, then perhaps careful data analysis would allow those three cells to be tied together into a “super cell” (e.g. by averaging their respective locations) with a virtual Super Cell Tower Location (SCTL). The SCTL would have the fundamental characteristic I desire: signal strength generally decreases with distance from the SCTL.
I also came across cellmapper.net. They present maps with icons indicating towers. Clicking on a tower produces a list of cells associated with the tower. It also produces a set of polygons indicating an estimate of geographical coverage for the cells on the tower. My impression is the polygons are very cool to look at but very imprecise, with huge areas (that do have coverage) appearing to be uncovered.
My point is not that cellmapper is better or worse, but that they have estimated CTLs and tied together the cells sharing a tower (according to their estimation). Unfortunately, they do not appear to share the lat/lon for the cell tower icons in their maps.
The Range column in the OpenCellID .csv files is interesting, but varies from pretty useful to not so much. For many cells (~3 million), the Range values are “pretty small,” at 1000 m, but for many the Range values are 10 km or more. If the calculated distance from the tracker to the estimated cell location is 1500 meters and the Range is 5000 meters, then all I really know is that the tower is 0 to 6500 meters away, and I have no application for data that rough.
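Here is one possible heuristic for the grouping step, limited to LTE. In LTE the 28-bit cell identity is composed of a 20-bit eNodeB ID followed by an 8-bit local cell ID, so cells that share an eNodeB ID are candidates for sharing a tower. Whether a given data set stores that full identity in its cell field, and whether one eNodeB always maps to one physical tower, are assumptions to check. The function name and tuple format are hypothetical.

```python
from collections import defaultdict

def super_cell_locations(cells):
    """Average per-cell tower estimates for LTE cells sharing an eNodeB ID.

    cells: iterable of (mcc, mnc, eci, lat, lon), where eci is the 28-bit
    LTE cell identity and lat/lon is that cell's estimated tower location.
    Returns {(mcc, mnc, enb_id): (mean_lat, mean_lon)}.
    """
    groups = defaultdict(list)
    for mcc, mnc, eci, lat, lon in cells:
        groups[(mcc, mnc, eci >> 8)].append((lat, lon))   # drop the 8-bit local cell ID
    return {
        key: (sum(p[0] for p in pts) / len(pts), sum(p[1] for p in pts) / len(pts))
        for key, pts in groups.items()
    }

# Three sectors of hypothetical eNodeB 1000 collapse into one SCTL.
cells = [
    (310, 410, 1000 * 256 + 1, 40.000, -75.0),
    (310, 410, 1000 * 256 + 2, 40.002, -75.0),
    (310, 410, 1000 * 256 + 3, 40.004, -75.0),
]
print(super_cell_locations(cells))
```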
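The bound in that example follows from the triangle inequality, sketched here with a haversine distance for the tracker-to-CC leg. It assumes, as I do above, that Range can be read as the maximum distance between the CC and the tower; that reading is my interpretation and not something OpenCellID documents for this purpose. The function names are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def tower_distance_bounds(distance_to_cc_m, range_m):
    """(min, max) possible tracker-to-tower distance if the tower lies within range_m of the CC."""
    return max(0.0, distance_to_cc_m - range_m), distance_to_cc_m + range_m

print(tower_distance_bounds(1500, 5000))   # the example from the text
print(tower_distance_bounds(6000, 1000))   # a small Range gives a usable interval
```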
Please note: I am not complaining. Rather, I am trying to understand the realities of using OpenCellId data, and thinking through approaches to improve Range values in the database.
My trackers are on school buses. Half my customers are rural. It’s rural areas, I expect, where your data is most sparse and least precise. School buses go everywhere. Currently, my trackers log cell info every 10 seconds, but that is easily varied. My cellular modems use multi-SIMs which get access to AT&T, T-Mobile and Verizon, so a single tracker can, theoretically, gather cell info for all three MNOs.
My trackers send location packets every 30 seconds which include some signal strength information. I am able to squeeze the packets into 18 bytes (payload) and use UDP to reduce packet overhead. The EG91 module has access to:
srxlev (RX level value for Base Station)
I cannot send all of that data with each location packet, as it would greatly increase the packet size, but I could accumulate data on each tracker, analyze it periodically, and transfer a summary to one of our servers. The server could move the data to OpenCellId. For example, during a month, a tracker might collect many hundreds of samples for each cell it passes through. It could then perform the same sort of analysis OpenCellID performs on such data, and share the result for each cell, thereby reducing the volume of data transferred by more than 99%.
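As a rough idea of what on-tracker accumulation could look like, here is a sketch that keeps one small running summary per cell instead of storing raw samples. It is not OpenCellID’s actual analysis, which I have not seen; it only maintains a sample count, a running mean position, and mean and maximum signal strength. The class, field names, and sample format are all hypothetical.

```python
from dataclasses import dataclass

@dataclass
class CellSummary:
    """Running per-cell summary; memory use is constant per cell."""
    count: int = 0
    mean_lat: float = 0.0
    mean_lon: float = 0.0
    mean_rssi_dbm: float = 0.0
    max_rssi_dbm: float = float("-inf")

    def add(self, lat, lon, rssi_dbm):
        # Incremental mean: no need to keep the individual samples.
        self.count += 1
        self.mean_lat += (lat - self.mean_lat) / self.count
        self.mean_lon += (lon - self.mean_lon) / self.count
        self.mean_rssi_dbm += (rssi_dbm - self.mean_rssi_dbm) / self.count
        self.max_rssi_dbm = max(self.max_rssi_dbm, rssi_dbm)

def summarize(samples):
    """samples: iterable of (cell_key, lat, lon, rssi_dbm). Returns {cell_key: CellSummary}."""
    summaries = {}
    for key, lat, lon, rssi_dbm in samples:
        summaries.setdefault(key, CellSummary()).add(lat, lon, rssi_dbm)
    return summaries

month = [("cell-a", 40.0, -75.0, -90.0), ("cell-a", 40.2, -75.2, -80.0),
         ("cell-b", 41.0, -76.0, -100.0)]
for key, summary in summarize(month).items():
    print(key, summary)
```

One summary record per cell per reporting period, in place of hundreds of raw samples, is where the large reduction in transferred volume would come from.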
I would be happy to discuss this matter if you folks are interested.