Documenting the Columns in the downloadable Cells database CSV

Hi Sagar,

I read this thread and I am confused about the exact definition of the range. Here is my interpretation: it defines a circular area, with a radius in meters given by the ‘range’ property, that probably contains the cell tower.

Could you please confirm my assumption?

Thank you very much for your support.

Prudence

You are right, range is an approximation of where the cell might be.
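
Read this way, `range` is the radius of a circle around the listed coordinates. Below is a minimal sketch of that check, assuming a spherical Earth and using the haversine formula. The function names and the coordinates are illustrative and not part of OpenCelliD:

```python
import math

EARTH_RADIUS_M = 6_371_000  # mean Earth radius; spherical approximation


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def within_range(cell_lat, cell_lon, range_m, lat, lon):
    """True if (lat, lon) lies inside the circle of radius range_m around the cell estimate."""
    return haversine_m(cell_lat, cell_lon, lat, lon) <= range_m


# Illustrative values: a cell estimate with range 1000 m and a nearby test point.
print(within_range(23.54727, 87.28981, 1000, 23.55000, 87.29000))
```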

We have a confidence benchmark of 98%: this means that 98% of the time, the actual location of the device will be within the circle. In a standard API response, there is a 98% chance that the device is at (x,y) or within z meters of it.

  • If we were to reduce our confidence benchmark to 50%, it would mean that 50% of the time, the actual location of the device will be within the circle. Because my confidence benchmark is lower, I can shrink the accuracy radius to, say, 300 meters. So, there is a 50% chance that the device is at (x,y) or within 300 meters of it.
  • If we were to reduce our confidence benchmark to 10%, it would mean that 10% of the time, the actual location of the device will be within the circle. Because my confidence benchmark is lower still, I can shrink the accuracy radius to, say, 50 meters. So, there is a 10% chance that the device is at (x,y) or within 50 meters of it.

If we have a higher confidence benchmark, the radius (accuracy value) returned will be larger.
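
The trade-off can be illustrated with an empirical percentile: given a set of observed location errors, the radius for a confidence level is the distance within which that fraction of the errors fall. This is only a sketch. The error values are made up, and `radius_for_confidence` is a hypothetical helper, not how the API computes its accuracy value.

```python
import math


def radius_for_confidence(errors_m, confidence):
    """Smallest observed error such that at least `confidence` of errors are <= it."""
    if not errors_m or not 0 < confidence <= 1:
        raise ValueError("need at least one error and 0 < confidence <= 1")
    ordered = sorted(errors_m)
    # Nearest-rank percentile: take the ceil(confidence * n)-th smallest value.
    rank = math.ceil(confidence * len(ordered))
    return ordered[rank - 1]


# Made-up location errors in meters, for illustration only.
errors = [20, 35, 50, 80, 120, 200, 300, 450, 700, 1500]

for level in (0.10, 0.50, 0.98):
    print(f"{level:.0%} confidence -> radius {radius_for_confidence(errors, level)} m")
```

With any such error sample, raising the confidence level can only keep the radius the same or make it larger.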

Thank you, Sagar, for your clear answer; I have no more doubts.

Prudence

Hi,

I am a researcher at Wharton, University of Pennsylvania, and I am trying to use the cell phone tower location data in OpenCelliD for academic research purposes. I was hoping to get some clarification on the data and would much appreciate any help.

I see that the data has multiple rows corresponding to the same MNO-cellID. For example, see the data below. Can you please help me understand how I should interpret these multiple rows for the same operator-CellID tuple? Which of these is likely to be the most accurate with regard to the latitude-longitude information of the tower?

radio  Mobilecountrycode  Mobilenetworkcode  area  cellID  unit  lon       lat       range  samples  changeable  created   updated   averageSignal
GSM    405                51                 8086  4112    0     87.28981  23.54727  1000   10       1           1.41E+09  1.49E+09  0
GSM    405                51                 6032  4112    0     89.20967  26.34247  1000   2        1           1.44E+09  1.44E+09  0
UMTS   405                51                 8190  4112    0     88.57386  23.00332  10070  2        1           1.46E+09  1.48E+09  0
UMTS   405                51                 8182  4112    0     87.98195  22.7726   1000   1        1           1.47E+09  1.47E+09  0
UMTS   405                51                 9017  4112    0     88.31337  24.29558  1000   3        1           1.48E+09  1.48E+09  0
UMTS   405                51                 7080  4112    0     88.12752  23.63777  1000   1        1           1.49E+09  1.49E+09  0
UMTS   405                51                 2002  4112    0     87.09068  23.62679  1000   1        1           1.49E+09  1.49E+09  0

It would be a great help if you could kindly shed some light on this.

The OpenCelliD downloadable dataset does not have duplicate cells. A unique cell identifier consists of four parts: Mobile Country Code (MCC), Mobile Network Code (MNC), Location Area Code (LAC/area) and Cell Identification (CI/cellID). According to the example you shared:

  • 405-51-8086-4112 is at 87.28981, 23.54727
  • 405-51-6032-4112 is at 89.20967, 26.34247
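
Put differently, rows should be keyed on all four fields, not on cellID alone. A small sketch using three rows from the example above, keeping only the key fields and coordinates (the variable names are illustrative):

```python
# (mcc, mnc, area, cell_id, lon, lat) taken from the sample rows in this thread.
rows = [
    (405, 51, 8086, 4112, 87.28981, 23.54727),
    (405, 51, 6032, 4112, 89.20967, 26.34247),
    (405, 51, 8190, 4112, 88.57386, 23.00332),
]

# Key each cell on the full (MCC, MNC, LAC, CI) tuple.
cells = {}
for mcc, mnc, area, cell_id, lon, lat in rows:
    key = (mcc, mnc, area, cell_id)
    if key in cells:
        raise ValueError(f"duplicate cell {key}")
    cells[key] = (lon, lat)

for (mcc, mnc, area, cell_id), (lon, lat) in cells.items():
    print(f"{mcc}-{mnc}-{area}-{cell_id} is at {lon}, {lat}")
```

All three rows share cellID 4112 but have different area codes, so they end up as three distinct entries.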

Hi, I downloaded data for one specific country and it doesn’t have any labels for the columns. Can I just assume it is organized the same way you show here and add in those column names?

For me everything appears joined together. How can I make it appear in columns?

Yes, you just need to split it in Excel: go to Data, then Text to Columns, then Delimited, choose comma as the delimiter, and that's it. Then you can organize the columns as shown in the image above.
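
The same split can be done in code. This is a minimal sketch, assuming the file is comma-separated, has no header row, and follows the column order of the table shown earlier in this thread. The column names below are labels chosen for illustration, and the two sample lines stand in for a real download:

```python
import csv
import io

# Assumed column order, matching the table earlier in this thread.
COLUMNS = [
    "radio", "mcc", "net", "area", "cell", "unit", "lon", "lat",
    "range", "samples", "changeable", "created", "updated", "averageSignal",
]

# Stand-in for a headerless download; for a real file, use
# open("your_file.csv", newline="") instead (the file name is hypothetical).
sample = io.StringIO(
    "GSM,405,51,8086,4112,0,87.28981,23.54727,1000,10,1,1.41E+09,1.49E+09,0\n"
    "UMTS,405,51,8190,4112,0,88.57386,23.00332,10070,2,1,1.46E+09,1.48E+09,0\n"
)

# DictReader splits each line on commas and attaches the supplied names.
reader = csv.DictReader(sample, fieldnames=COLUMNS)
cells = list(reader)

for cell in cells:
    print(cell["radio"], cell["area"], cell["lon"], cell["lat"], cell["range"])
```

All values are read as strings; convert `lon`, `lat`, and `range` with `float()` or `int()` as needed.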

Hello,
I’ve downloaded the full database recently but I can’t find any documentation on the meaning of the “unit” column. It seems to only have 0 or -1 as its value.
Can anyone explain what it means?
Thanks a lot

Hi, Sagar

I have a question about this. Let’s say that I have 10 measurements for a new cell (not present in the db). If most or all of my measurements don’t really have a signal strength value (as I have read in multiple places here in the forum) and, additionally, no TX power value is used, how can you do any triangulation with my values? In my basic understanding, triangulation needs, for each measurement, the distance between where the measurement was taken and the cell tower. But if no signal strength is provided (and no TX power), then no such distance can be obtained.
I can imagine that if these 10 measurements only have GPS values for where the measurement was taken, probably only some averaging can happen, but not triangulation?
If this is the case, then you can calculate the position of the cell tower by averaging the values. Afterwards, for a new batch of 10 measurements, you may not need the signal strength indication anymore: you could calculate the distance between each measurement and the cell tower based on the reported GPS positions only, and only then be able to do some kind of triangulation.

Is it possible to give me some insight into this, or is my understanding far off from what is happening?

Hi,

If you’re submitting 10 measurements observing a single cell without signal strength, it won’t be possible for us to triangulate the position of that cell. We’ll use your measurements in combination with measurements from other users who have submitted signal strength. Your measurements can also be used as a verification of the approximate position of the cell.

Averaging only GPS coordinates of the device making these observations does not give us a good understanding of the position of the cell but it can roughly place the cell in a given city/ town.
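
As a rough illustration of the averaging described above, the sketch below takes the mean of the GPS positions where a cell was observed. The coordinates are made up. A plain arithmetic mean of latitude and longitude is an assumption that only holds reasonably for points close together and away from the poles and the antimeridian. It is not presented as OpenCelliD's actual method.

```python
from statistics import fmean


def centroid(points):
    """Arithmetic mean of (lat, lon) pairs; a coarse estimate only."""
    if not points:
        raise ValueError("need at least one measurement")
    lats, lons = zip(*points)
    return fmean(lats), fmean(lons)


# Made-up device positions (lat, lon) where the same cell was observed.
measurements = [
    (23.5470, 87.2890),
    (23.5480, 87.2905),
    (23.5465, 87.2912),
    (23.5477, 87.2899),
]

lat, lon = centroid(measurements)
print(f"rough cell position: {lat:.5f}, {lon:.5f}")
```

The result sits somewhere among the observers, which is why it can place a cell in the right town without pinning down the tower itself.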

I think I was surprised by this. Here, it says that the average signal strength for most of the cells is 0. On the other hand, you mentioned that you also do triangulation and you need signal strength for it in one way or another. So, if I understand this correctly, in order for a new cell to be registered correctly (and in order to do any triangulation for the measurements reported for that new cell), you have to have signal strength (at least for the initial measurements)? And, the reason why the average is 0 for some cells is because maybe there were too many measurements reported, where most of them don’t have any signal strength, but not all, because at least some have to be non-zero for triangulation to happen (for example, the first ones)? Basically, it shouldn’t be possible for a cell to exist in the db where the signal strength for all the measurements was 0. I guess, there can be an exception, for example, a record directly from the carrier.

Sorry for taking your time and thanks for the quick answer to my previous question.

Hey, have you managed to find out about this field?
For Brazil I can see some other values, but I cannot make sense of them either.
Cheers

Dear Sagar,

Thanks for providing the methodology for cell position estimation, which confirmed my deduction as well. But I am wondering how you could know whether the received signal strength is x% of that of Cell A without knowing the actual emission power of Cell A?

Hi, where can I find the latest version of the global database? I have a Mexico one and it has more antennas than the global one.

The database exports only comprise cells seen in the last 18 months. Our team is exploring ways to make historical data available, but we don’t have a timeline for it just yet.
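
For anyone comparing exports, the 18-month window can be approximated locally. This is a sketch, assuming the `updated` column holds a Unix timestamp of the last observation and treating 18 months as 18 × 30 days. Both are assumptions, and the rows are made up:

```python
from datetime import datetime, timedelta, timezone

WINDOW = timedelta(days=18 * 30)  # rough stand-in for "18 months" (assumption)


def seen_recently(updated_ts, now):
    """True if a Unix timestamp falls within the window ending at `now`."""
    updated = datetime.fromtimestamp(updated_ts, tz=timezone.utc)
    return now - updated <= WINDOW


# Made-up rows: (label, updated timestamp).
now = datetime(2024, 1, 1, tzinfo=timezone.utc)
rows = [
    ("cell-a", datetime(2023, 6, 1, tzinfo=timezone.utc).timestamp()),
    ("cell-b", datetime(2021, 1, 1, tzinfo=timezone.utc).timestamp()),
]

recent = [label for label, ts in rows if seen_recently(ts, now)]
print(recent)
```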
