Cell Tower Location vs. Cell Location

matt · December 22, 2023, 10:26pm

Hi,

I’m a software developer working in the cellular IoT space and new to this group. My name is Matt. I’ve struck a deal with a reputable book publisher to write the first book on cellular connectivity in the IoT space. It’s a practical guide to the many challenges of establishing and maintaining cellular connectivity, especially for mobile systems moving in and between regions of poor connectivity. The book should be released in late 2024.

My Area of Interest

My company produces and sells SpotMyBus (see spotmybus.com), a system for tracking and managing fleets of school buses. Our customers are school districts and we serve districts in many states.

We designed and manufacture our own tracking device. It runs Linux (Ubuntu 20.04), and uses a hybrid keypad/text-to-speech user interface. When placed in “debug” mode, it calls out signal strength, MNO, and distance traveled since previous report every 10 seconds, while also calling out connections, disconnections, and connection progress (while trying to connect). It logs everything it speaks, and a whole lot more.

We also design and manufacture our own cellular modems. After designing a modem around a ublox module (SARA R410) that proved unstable and unreliable, and getting absolutely no technical support (after having purchased a reel of modules), we threw in the towel on ublox and switched to Quectel. We developed modems around their BG95/BG96/EG91 family of modules and have found their modules and support to be outstanding.

I have, for some time, wanted to be able to additionally report distance to connected cell tower. Theoretically, if I know the signal strength at Point A and the signal strength at Point B (and both points were registered on the same tower), and the distance between Point A and Point B, and the relationship between distance to tower and signal strength then I can calculate the distance from each of the points to the tower–assuming the tower and Point A and Point B are all points on a single line.
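A minimal sketch of that two-point calculation follows. It assumes a log-distance path-loss model, RSSI(d) = P0 - 10·n·log10(d), which is not stated in the post; the path-loss exponent `n` and the function name are illustrative assumptions, and both points are assumed to lie on the same side of the tower.

```python
import math

def distances_from_two_readings(rssi_a_dbm, rssi_b_dbm, separation_m,
                                path_loss_exponent=2.0):
    """Estimate distance to the tower from points A and B.

    Assumes (hypothetically) RSSI(d) = P0 - 10*n*log10(d), with the tower,
    A and B on one line and A and B on the same side of the tower.
    """
    delta_db = rssi_a_dbm - rssi_b_dbm
    if delta_db == 0:
        raise ValueError("equal readings imply a distance ratio of 1; no solution")
    # Ratio d_far / d_near implied by the difference in received power.
    ratio = 10 ** (abs(delta_db) / (10 * path_loss_exponent))
    d_near = separation_m / (ratio - 1)
    d_far = d_near + separation_m
    # The stronger reading belongs to the nearer point.
    return (d_near, d_far) if delta_db > 0 else (d_far, d_near)

# Synthetic readings generated from the same model (P0 = -50 dBm, n = 2).
rssi_a = -50 - 20 * math.log10(1000)
rssi_b = -50 - 20 * math.log10(2000)
d_a, d_b = distances_from_two_readings(rssi_a, rssi_b, separation_m=1000.0)
print(round(d_a), round(d_b))
```

Under this model with n = 2, a 3 dB error in the difference between the two readings changes the implied distance ratio by a factor of about 1.41, so the result is very sensitive to measurement noise.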

In practice, signal strength measurements frequently vary by 3 dB (often much more) from one reading to the next even when a device is stationary, so as a practical matter that approach does not seem useful (I think).

Recently, I started poking around the Internet looking for the kind of data provided by the OpenCellId project.

I only found opencellid.org about a week ago, so my understanding is still developing. If I make any incorrect statements or assumptions, I hope some kind reader will correct me.

I downloaded the data for the US, massaged it a bit to discard fields I don’t need, reduced the records to 17 (binary) bytes per cell, and organized the data for quick disk-based lookups. The file is 85 MB and fits on my tracker, so now it can do lookups on the fly and call out distance-to-tower too (every 10 seconds). Oops, actually not.

Cell Location vs. Cell Tower Location

The lat/lon coordinates associated with each cell are an estimate of the center of each cell (CC), not the cell tower location (CTL). To be clear, the CTL is the lat/lon point at which the cell tower providing coverage for the cell is located. The CC is an estimate of the lat/lon of the center of the region for which that tower provides coverage. And, as I understand the OpenCellId project, the CC and CTL will never be the same, because the CTL will always be on an edge of the cell.

This seems counterintuitive if you imagine (incorrectly) that the antenna(s) for a cell gets mounted on a tower and transmits/receives in all directions. In that case, the lat/lon of each CC and lat/lon of its CTL would be the same. But take a look at this diagram,

[Image: cell_tower]

(from the OpenCellID wiki).

Typically, but not always, an MNO wants coverage from a cell tower to extend in all directions. In that case, the MNO hangs three sets of antennas (each set typically includes one transmitting antenna and two receiving antennas) on a tower at 120-degree intervals around the vertical (z) axis of the tower. Each antenna set covers 120 degrees, and together they cover a region approximately centered on the CTL.

Each of those antenna sets creates a separate cell and has its own unique tuple,

where all three cells would share , and each would have a different . Notice in the diagram there are three yellow dots. Each corresponds to the center of the coverage region of one cell. Now it should be obvious that the CC and the CTL can never be the same: the CTL is always on the cell border, and the distance between the CC and the CTL will be approximately half the span of a cell.

So, if you seek cell tower locations (CTLs), as I do, the lat/lon estimates in the data files are not what you’re looking for.

My interest in cell tower locations (tied to ) is to develop a deeper understanding of the relationships between signal-strength-and-quality/distance-to-tower/tower-handoff-behavior. Cell tower location appears to me central to that investigation.

Who Uses Cell Location Data, and for What?

For my purposes CCs are useless, whereas CTLs would be very useful. I understand that OpenCellID does not exist to serve my needs ;), but from discussion threads I’ve read there are others with strong interest in CTLs too.

When working with a CTL, generally, as a device gets closer to its lat/lon the signal will get stronger, and it will get weaker as the device moves away. Working with a CC is a different matter. Suppose we draw a line with one endpoint at the CTL (call it point A), passing through the CC and extending somewhat farther to another endpoint (point B), and place a cellular modem on the line at point B. As the modem moves toward point A (the tower), the signal will get stronger as it approaches the CC, and will continue to get stronger after it passes the CC and continues toward A (the CTL). Distance calculations from the modem’s position to the CC will therefore show signal strength increasing as distance to the CC decreases (from B to CC) and then also increasing as distance to the CC increases (from CC to A).
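The walk from B to A can be illustrated with a small simulation. This is only a sketch: the log-distance signal model, its parameters, and the positions of the CC and point B are made-up values chosen for illustration, not measurements.

```python
import math

def rssi_dbm(distance_m, p0_dbm=-40.0, n=3.0):
    """Hypothetical log-distance model: received power at distance_m from the tower."""
    return p0_dbm - 10 * n * math.log10(distance_m)

CTL_X = 0.0      # point A, the tower (illustrative coordinate along the line)
CC_X = 1500.0    # cell center, on the same line
B_X = 3000.0     # point B, beyond the cell center

# Walk from B toward A in 500 m steps, stopping 500 m short of the tower.
positions = [B_X - 500.0 * i for i in range(6)]
rows = [
    (x, abs(x - CC_X), abs(x - CTL_X), rssi_dbm(abs(x - CTL_X)))
    for x in positions
]

for x, d_cc, d_ctl, rssi in rows:
    print(f"x={x:6.0f} m  dist_to_CC={d_cc:6.0f} m  "
          f"dist_to_CTL={d_ctl:6.0f} m  rssi={rssi:7.1f} dBm")
```

In the printed table the signal rises at every step, while the distance to the CC first falls to zero and then grows again, so distance-to-CC alone cannot predict signal strength.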

I can see many use cases for CTL data, but I’m not seeing the use cases for CC data. Can some readers inform me?

It’s not even clear to me that OpenCellID data can be used effectively to answer the question: is cell coverage available at a particular lat/lon? I’m not saying it can’t, just that it’s not clear to me. If, for example, the Range for a given CC is small, say 1000 meters, then it seems likely coverage will be available. If Range is 10,000 meters, it seems less likely. Distance from the CC becomes a wildcard. A greater distance from the CC will increase the likelihood of connectivity if the distance to the CTL coincidentally decreases, but will decrease the likelihood of connectivity if the distance to the CTL coincidentally increases.

Estimating Cell Tower Locations

Data reported to OpenCellID must have <lat/lon> to be accepted, but is not required to include signal strength information. Can anyone comment on the portion of the cell data that does include signal strength information?

It seems obvious that data which includes signal strength info can be used to approximate CTLs, since closer proximity to a tower correlates well with a stronger signal.

Intuitively (dangerous word), it seems no more difficult to estimate CTL than CC and I expect the Range values (where Range would be related to CTL rather than CC) would have less variance for CTL than CC.

If CTLs can be estimated for, say, the three cells hanging together on a tower, then perhaps careful data analysis would allow those three cells to be tied together into a “super cell” (e.g. by averaging their respective locations) with a virtual Super Cell Tower Location (SCTL). The SCTL would have the fundamental characteristic I desire: signal strength generally decreases with distance from the SCTL.
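The averaging step could look like the sketch below. The function name and the sample coordinates are hypothetical, and plain averaging of degrees is assumed to be acceptable only because the three estimates are expected to lie within a short distance of each other.

```python
from statistics import fmean

def super_cell_location(cell_tower_estimates):
    """Average a list of (lat, lon) pairs given in degrees.

    Plain averaging is only reasonable for points a short distance apart
    and away from the poles and the 180-degree meridian.
    """
    lats = [lat for lat, _ in cell_tower_estimates]
    lons = [lon for _, lon in cell_tower_estimates]
    return fmean(lats), fmean(lons)

# Made-up per-cell tower estimates for three cells assumed to share a tower.
estimates = [(37.0000, -76.0000), (37.0020, -76.0010), (37.0010, -76.0020)]
sctl = super_cell_location(estimates)
print(sctl)
```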

I also came across cellmapper.net. They present maps with icons indicating towers. Clicking on a tower produces a list of cells associated with the tower. It also produces a set of polygons indicating an estimate of geographical coverage for the cells on the tower. My impression is the polygons are very cool to look at but very imprecise, with huge areas (that do have coverage) appearing to be uncovered.

My point is not that cellmapper is better or worse, but that they have estimated CTLs and tied together the cells sharing a tower (according to their estimation). Unfortunately, they do not appear to share the lat/lon for the cell tower icons in their maps.

Range Data

The Range column in the OpenCellID .csv files is interesting, but varies from pretty useful to not so much so. For many cells (~3 million), the Range values are “pretty small,” at 1000 m, but for many the Range values are 10 km or more. If the calculated distance from the tracker to the estimated cell location is 1500 meters and the Range is 5000 meters, then all I really know is the tower is 0 - 6500 meters away, and I have no application for data that rough.
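The arithmetic behind that 0 - 6500 meter interval can be sketched as follows. It assumes, as the paragraph above does, that the tower lies somewhere within Range meters of the estimated cell location; the function names are illustrative, and the haversine formula is a standard great-circle distance used here as a stand-in for whatever the tracker actually computes.

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in meters

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def tower_distance_bounds(distance_to_cc_m, range_m):
    """Bounds on tracker-to-tower distance, assuming the tower is within
    range_m of the estimated cell location (an assumption, not a guarantee)."""
    return max(0.0, distance_to_cc_m - range_m), distance_to_cc_m + range_m

# The example from the text: 1500 m from the cell location, Range of 5000 m.
low, high = tower_distance_bounds(1500, 5000)
print(low, high)
```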

Please note: I am not complaining. Rather, I am trying to understand the realities of using OpenCellId data, and thinking through approaches to improve Range values in the database.

Contributing Data

My trackers are on school buses. Half my customers are rural. It’s rural areas, I expect, where your data is most sparse and least precise. School buses go everywhere. Currently, my trackers log cell info every 10 seconds, but that is easily varied. My cellular modems use multi-SIMs which get access to ATT, T-Mobile and Verizon, so a single tracker can, theoretically, gather cell info for all three MNOs.

My trackers send location packets every 30 seconds which include some signal strength information. I am able to squeeze the packets into 18 bytes (payload) and use UDP to reduce packet overhead. The EG91 module has access to:

MCC
MNC
Radio Technology
TAC
Cell ID
RSSI
RSRP
RSRQ
SINR
srxlev (RX level value for Base Station)
Band
Bandwidth

I cannot send all of that data with each location packet, as it would greatly increase the packet size, but I could accumulate data on each tracker, analyze it periodically, and transfer a summary to one of our servers. The server could move the data to OpenCellId. For example, during a month, a tracker might collect many hundreds of samples for each cell it passes through. It could then perform the same sort of analysis OpenCellID performs on such data, and share the result for each cell, thereby reducing the volume of data transferred by more than 99%.
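A rough sketch of such on-tracker accumulation is shown below. The class, the function, and the choice of summary statistics (sample count, mean position, best RSRP) are assumptions for illustration; the mean position is a simple stand-in and is not claimed to match OpenCellID's own analysis. The sample values are made up.

```python
from collections import defaultdict
from dataclasses import dataclass

@dataclass
class CellSummary:
    """Running summary for one cell; keeps constant memory per cell."""
    samples: int = 0
    lat_sum: float = 0.0
    lon_sum: float = 0.0
    best_rsrp_dbm: float = float("-inf")

    def add(self, lat, lon, rsrp_dbm):
        self.samples += 1
        self.lat_sum += lat
        self.lon_sum += lon
        self.best_rsrp_dbm = max(self.best_rsrp_dbm, rsrp_dbm)

    def mean_position(self):
        return self.lat_sum / self.samples, self.lon_sum / self.samples

# Keyed by (MCC, MNC, TAC, Cell ID), the identifying fields listed above.
summaries = defaultdict(CellSummary)

def record_sample(mcc, mnc, tac, cell_id, lat, lon, rsrp_dbm):
    summaries[(mcc, mnc, tac, cell_id)].add(lat, lon, rsrp_dbm)

# Two made-up samples for one cell and one for another.
record_sample(310, 410, 1234, 5678, 37.000, -76.000, -100.0)
record_sample(310, 410, 1234, 5678, 37.002, -76.002, -90.0)
record_sample(311, 480, 1, 2, 36.000, -75.000, -110.0)

for key, summary in summaries.items():
    print(key, summary.samples, summary.mean_position(), summary.best_rsrp_dbm)
```

Only one small record per cell would need to be uploaded, regardless of how many samples were collected during the period.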

I would be happy to discuss this matter if you folks are interested.

Regards,

Matt
1.757.277.4330

Sagar · December 27, 2023, 9:38am

Hey @matt - Please do notify us about your book here on our forums when it’s out. Thanks for your detailed message and for bringing up some thought-provoking questions. I’m glad to see your interest in OpenCellId data and potential contributions.

Use Cases: CTL data would be great to have, but collecting it is complicated. Telecom firms don’t share data on their cell towers. Our data collection approach uses observations from contributors’ devices to approximate the position of the CC. Using that to then guess the position of the CTL adds another layer of approximation that reduces the overall confidence, unless we have large volumes of high-frequency observation data. Without that, the CC data alone is good enough for asset tracking implementations. Because these involve triangulation, signal strength is given lower importance and knowledge of the exact Cell Tower Location (CTL) is not necessary.

OpenCellId Data and Cell Coverage: While we don’t have firsthand experience, there are use cases where the CTL is used to approximate cell coverage. The first step, working from OpenCellId’s CC data, is to cluster CCs that are close to each other and treat each cluster as a CTL. It’s a workaround to gauge potential cell coverage at a specific location.

Enhancing Range Values: Honestly, I’m not very sure at the moment. I’ll have our team explore this further and see if there are potential improvements.

Estimating CTLs and the Role of Signal Strength: CTLs can be estimated by clustering CCs that are close to each other, as we’ve encountered in some use cases. However, signal strength might not be considered in this estimation due to its inherent unreliability. The first issue here is that we get data from volunteers when they can contribute - which means the frequency isn’t very high. Moreover, community members don’t use the same approach/ app/ hardware when contributing data and the variation in data creates its own set of complexities. That being said, if you have access to data with high frequency, from specific devices, for the regions your devices operate in, your approach with signal strength data will give you better results.

Contributing Data: We appreciate your willingness to contribute data from your trackers, especially in rural areas with sparse data. Your approach of sampling data to reduce packet size is not only practical but also beneficial for optimizing data transfer efficiency. We see this as a positive contribution to the OpenCellId project and would be happy to discuss this matter further.

Best of luck with SpotMyBus and your upcoming book! If you have any questions or if there’s anything specific you’d like to discuss in more detail, feel free to reach out.

matt · December 29, 2023, 7:44am

@Sagar

Thank you very much for your thoughtful reply.

If you are willing, I would like to explore the problem of estimating Cell Tower Locations (CTLs) a bit further.

You wrote:
CTLs can be estimated by clustering CCs that are close to each other, as we’ve encountered in some use cases. However, signal strength might not be considered in this estimation due to its inherent unreliability. The first issue here is that we get data from volunteers when they can contribute - which means the frequency isn’t very high. Moreover, community members don’t use the same approach/ app/ hardware when contributing data and the variation in data creates its own set of complexities. That being said, if you have access to data with high frequency, from specific devices, for the regions your devices operate in, your approach with signal strength data will give you better results.

Your points are well-taken, and I do not suppose it is a simple matter to estimate CTLs. Your suggestion of estimating CTLs by “clustering CCs that are close to each other” seems unlikely to work, but maybe I’m missing something.

To explore this, please consider this map of 7 cell towers: T1, T2, … T7

Assume each tower has three sets of antennas mounted so each set covers 120 degrees around the tower. Tower T1 creates cells 11, 12, 13, and the centers of those cells are shown as:

CC11 CC12 CC13

The three cells created by each tower are outlined in a single color to help clarify which of the 21 cells (7 towers with 3 cells each) is associated with each tower.

Now, suppose we only have the OpenCellId database information. Then we know CCs, but we do not know the positions of any towers. Because the map is showing the “ground truth,” we know that CC41 is produced by T4. Yes, it is adjacent to cells CC42 and CC43, so that might let us infer that CC41, CC42 and CC43 share a tower, and averaging the locations of the CCs for those cells will provide an approximation of the location of T4.

However, CC41 is also adjacent to CC13 and CC22. Averaging the CCs of those cells will estimate a tower at the northernmost point of C41. C41 is also adjacent to CC13 and CC42, and also to CC22 and CC43.

Interestingly, C41 is also adjacent to all four of,

CC13 CC22 CC42 CC43

and averaging those locations will estimate the tower position at CC41.

I don’t see how looking at clusters of adjacent cells helps us estimate CTLs.

Adding Signal Strength Information

We know that each cell is produced by a directional transmitter (3 to a tower), so the tower must always be somewhere on the border of the cell and can never be inside the cell boundary. Therefore, if we look at signal strength information within a cell, stronger signals will be associated with shorter distances to a tower, and weaker signals will be associated with greater distances.

I understand that each device (cell phone) collecting data for OpenCellId is different and devices at the same location may report different signal strengths due to differences in equipment, but for any device moving within a cell the signal strength it measures will generally be higher when closer to the tower and lower when farther away. My point is that within each cell, it seems possible to estimate where on the cell boundary the transmitter (tower) lies if data points include lat/lon and signal strength.
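One naive way to turn that observation into an estimate is a signal-weighted centroid of the samples in a cell, sketched below. This is an assumed heuristic chosen for illustration, not a method described by OpenCellId or in this thread. Because a weighted average can never fall outside the area spanned by the samples, it will tend to land short of the true boundary position; it only shows which side of the cell the tower is on.

```python
def weighted_centroid(samples):
    """Signal-weighted centroid of a list of (lat, lon, rssi_dbm) samples.

    Each sample is weighted by its received power converted from dBm to
    milliwatts, so strong readings dominate the result.
    """
    weights = [10 ** (rssi_dbm / 10) for _, _, rssi_dbm in samples]
    total = sum(weights)
    lat = sum(w * s[0] for w, s in zip(weights, samples)) / total
    lon = sum(w * s[1] for w, s in zip(weights, samples)) / total
    return lat, lon

# Made-up samples: a strong reading at the west edge, weaker ones farther east.
samples = [
    (37.000, -76.010, -70.0),
    (37.000, -76.005, -90.0),
    (37.000, -76.000, -100.0),
]
estimate = weighted_centroid(samples)
print(estimate)
```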

Now, returning to the map, and assuming that for each cell the boundary location has been (even coarsely) estimated, the three boundary estimates of CC41, CC42 and CC43 will likely be much closer to each other than the boundary estimates of CC41, CC13 and CC22. This approach lets us determine which adjacent cells do share a tower and which ones do not. If we then average the locations of the boundary estimates of the cells we infer to share a tower (e.g. CC41, CC42, CC43) we get an estimate of the CTL.
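The grouping-then-averaging step could be sketched as below. The greedy single-pass grouping, the 300 m threshold, and all coordinates are assumptions made for illustration; the cell names are borrowed from the map only as labels.

```python
import math

def approx_distance_m(a, b):
    """Equirectangular approximation; adequate for points a few km apart."""
    mean_lat = math.radians((a[0] + b[0]) / 2)
    dx = math.radians(b[1] - a[1]) * math.cos(mean_lat)
    dy = math.radians(b[0] - a[0])
    return 6_371_000 * math.hypot(dx, dy)

def group_cells_by_tower(estimates, max_gap_m=300.0):
    """Group per-cell tower estimates (cell name -> (lat, lon)).

    Greedy single pass: a cell joins the first group whose running mean
    position is within max_gap_m; otherwise it starts a new group.
    """
    groups = []
    for name, (lat, lon) in estimates.items():
        for g in groups:
            if approx_distance_m((lat, lon), (g["lat"], g["lon"])) <= max_gap_m:
                n = len(g["cells"])
                g["lat"] = (g["lat"] * n + lat) / (n + 1)
                g["lon"] = (g["lon"] * n + lon) / (n + 1)
                g["cells"].append(name)
                break
        else:
            groups.append({"cells": [name], "lat": lat, "lon": lon})
    return groups

# Made-up boundary estimates: three near one point, two elsewhere.
boundary_estimates = {
    "CC41": (37.0000, -76.0000),
    "CC42": (37.0005, -76.0003),
    "CC43": (36.9997, -76.0004),
    "CC13": (37.0200, -76.0100),
    "CC22": (37.0150, -75.9800),
}
groups = group_cells_by_tower(boundary_estimates)
for g in groups:
    print(g["cells"], round(g["lat"], 5), round(g["lon"], 5))
```

Each group's mean position serves as the CTL estimate for the cells in that group. A greedy pass depends on input order, so a real analysis would likely need a more careful clustering method.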

Any thoughts?

Regards,

Matt