File talk:Rir-rate.svg


Source specification


Mro, or others: could you please

  • provide a precise source (full URLs)?
  • clarify whether this is about IPv4, IPv6, or both?

--Itu (talk) 08:46, 30 May 2021 (UTC)

I am currently working on updating this graph, and to me the only reasonable way to arrive at this statistic seems to be the following:
  • ftp://ftp.ripe.net/pub/stats has standardized CSV statistics files for every RIR (the exact same files should be available on the other RIRs' FTP servers too, but I have not checked). There are README files describing, among other things, the format, how the data is compiled, and that most RIRs should be mirroring each other's data. There are at least three different formats in use (the old one, the new "basic" one, and the new "extended" one). In essence, these files contain a full list of all IPv4 and IPv6 address ranges and AS numbers, together with their state (mostly "allocated" (to the RIR) and "assigned" (to a local registry or an organization below the RIR)).
  • Account for all the minute differences in how the RIRs compiled their data. Having pored over this with scripts for days, I can say there are tons of edge cases, but each of them has one obvious and correct solution. Examples are a nonstandard comment at the start of a file, a slightly different directory structure, or incorrectly computed or formatted checksums.
  • For each day where data is available (over 95% of days after 2003), take all the IPv4 address ranges that the RIR considers "assigned" and sum up their address counts (the files give the count directly for each IPv4 range, whereas IPv6 ranges are given as a CIDR prefix). This gives you the total number of IPv4 addresses currently assigned by that RIR on that day.
  • Calculate differences in these assignment counts to get the assignment rate. Since the totals rise and fall noticeably almost every day, I think the differences have to be averaged over at least a month.
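As a rough illustration of the summing and differencing steps above (not the script used for the graph), here is a minimal Python sketch using only the standard library. It assumes the pipe-separated record layout described in the README files (registry, country code, type, start, value, date, status), where the value field of an IPv4 record is its address count. The function names `assigned_ipv4_total` and `monthly_mean_rate`, and the choice of calendar months as the averaging window, are hypothetical and only for illustration.

```python
from collections import defaultdict
from datetime import date
from statistics import mean


def assigned_ipv4_total(lines, status="assigned"):
    """Sum the address counts of all IPv4 records with the given status.

    Assumes records look like:
        registry|cc|type|start|value|date|status[|extensions...]
    """
    total = 0
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comment lines
        fields = line.split("|")
        # The version header has no "ipv4" in the type column, and summary
        # lines have only six fields, so both are skipped here.
        if len(fields) < 7 or fields[2] != "ipv4":
            continue
        if fields[6] == status:
            total += int(fields[4])  # for IPv4, the value is an address count
    return total


def monthly_mean_rate(daily_totals):
    """Average the per-day change in the totals over each calendar month.

    daily_totals maps datetime.date -> total assigned addresses on that day.
    Gaps between available days are spread evenly over the missing days.
    """
    days = sorted(daily_totals)
    buckets = defaultdict(list)
    for prev, cur in zip(days, days[1:]):
        gap = (cur - prev).days
        rate = (daily_totals[cur] - daily_totals[prev]) / gap
        buckets[(cur.year, cur.month)].append(rate)
    return {month: mean(rates) for month, rates in buckets.items()}
```

In practice, `assigned_ipv4_total` would be called once per daily file and per RIR, and the resulting per-day totals fed into `monthly_mean_rate`. The edge cases mentioned above (differing directory layouts, broken checksums, the older file format) are not handled in this sketch.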
Some details of how the original graph was created are not known, such as the aforementioned averaging interval and the gnuplot formatting settings. My new graph will most definitely diverge from the original, but given that @Mro has not provided source code, this is an acceptable shortcoming for updating the graph data to span until the present day. My source code (Python using pandas and matplotlib) will be published and linked on this page.
To answer your questions specifically: (1) see the FTP URL above and the files below it; there are literally tens of thousands of files (and an estimated 64 GB of total data by my count) making up these statistics. (2) This is only IPv4, since even a single IPv6 assignment range from an RIR (at least a /64) is larger than the entire IPv4 address space of roughly 4 billion addresses.
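The size comparison in (2) can be checked with Python's standard `ipaddress` module. This is only an arithmetic illustration; the prefix `2001:db8::/64` is the documentation prefix, used here as a stand-in for any /64.

```python
import ipaddress

# The whole IPv4 address space: 2**32 addresses.
ipv4_space = ipaddress.ip_network("0.0.0.0/0").num_addresses

# A single IPv6 /64 (documentation prefix as a stand-in): 2**64 addresses.
ipv6_slash64 = ipaddress.ip_network("2001:db8::/64").num_addresses

# How many complete IPv4 address spaces fit into one /64.
ratio = ipv6_slash64 // ipv4_space

print(ipv4_space, ipv6_slash64, ratio)
```

Adding IPv6 address counts to the same total would therefore drown out any IPv4 signal completely.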
Kleinesfilmröllchen (talk) 22:57, 4 September 2024 (UTC)