[Rpm-metadata] Proposal: using a DBMS for package metadata

seth vidal skvidal at phy.duke.edu
Sun Oct 3 01:28:38 EDT 2004

> Currently, the Fedora Development repository has a mean RPM size of
> about 1 MB, a mean .hdr size of about 10 KB and 100-200 mirrors
> (counting them precisely requires avoiding double-counting those that
> offer both http and ftp).

Well, the .hdr files aren't used anymore - that's what this list is
about. :)

> This can be sidestepped by simply opening the DBMS port to the world if
> a sufficiently secure DBMS exists; otherwise, a thin wrapper sanitizing
> and limiting SQL queries could do (it may have already been written).

1. Mirror admins will not run an SQL server on their mirror servers
2. no one in their right mind would open up a sql connection to the

> For instance, assuming that package updates are uniform in time, it
> seems that by using several files for power-of-2-sized time intervals, a
> "changed-time > x" query can be done with at most twice the optimal
> bandwidth and between 2 and lg(t) times the optimal server disk space
> where lg is the base-2 logarithm and t is the repository life time
> expressed in units of the smallest update delta time.
> Queries of "which packages include the given file" can also be baked
> trivially by creating, for each file shipped in an RPM, a file listing
> the names of the packages that contain it. However, this will waste a
> lot of disk space, pollute disk caches, possibly require hard drive
> seeks, etc.
> Alternatively more than one RPM-file could be packaged in a filesystem
> file, but this will probably require more roundtrips.

You're talking about putting WAY too much intelligence on the server.
Way, way, way too much.
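
For reference, here is roughly how I read the power-of-2 interval idea
quoted above. It is only a sketch under my own assumptions: time is
counted in whole "smallest update delta" units, the server pre-generates
one static list per window size 1, 2, 4, ..., and the names window_for,
build_windows and changed_since are made up for illustration (they are
not part of the proposal, yum, or any existing tool).

```python
def window_for(age):
    """Smallest power-of-two window (in time units) that covers `age`."""
    if age < 1:
        raise ValueError("age must be at least one time unit")
    # For age in (2**(k-1), 2**k] this returns 2**k, so the window is
    # always less than twice the span that was actually asked for.
    return 1 << (age - 1).bit_length()


def build_windows(updates, now, lifetime):
    """Pre-bake one package list per power-of-two window (server side).

    updates  -- list of (package_name, changed_time) pairs (hypothetical)
    now      -- current time, in the same integer units
    lifetime -- age of the repository, in the same units
    """
    windows = {}
    size = 1
    while True:
        # Everything changed in the last `size` units: now-size < t <= now.
        windows[size] = sorted(n for n, t in updates if now - size < t <= now)
        if size >= lifetime:
            break
        size *= 2
    return windows


def changed_since(x, now, windows):
    """Client side: pick the one static list that covers changed-time > x."""
    return windows[window_for(now - x)]


updates = [("a", 1), ("b", 5), ("c", 7), ("d", 8)]
windows = build_windows(updates, now=8, lifetime=8)
# Asking for changes after time 5 means an age of 3, so the client
# fetches the 4-unit window and filters out what it does not need.
print(changed_since(5, 8, windows))
```

The client gets a superset of what it asked for and has to discard the
extra entries itself. The server does no query processing at request
time, but something still has to regenerate every window file on each
update.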

