[Rpm-metadata] Proposal: using a DBMS for package metadata

Luca Barbieri luca.barbieri at gmail.com
Fri Oct 1 18:49:09 EDT 2004

After reading about the new repomd format, I thought of another
possible solution to the metadata storage and retrieval problem, one
that may be worth considering and that apparently hasn't been discussed.

The solution consists of putting the repository metadata in a DBMS (such
as MySQL, PostgreSQL or Firebird), with the main repository server, the
mirrors and the clients all using the same database format.

Package metadata would be stored in a database table, where the main
fields would be a record id, the package name, the date of the last
change to the record and the package version.
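
A minimal sketch of such a table, assuming SQLite through Python's
sqlite3 module purely for illustration (any of the DBMSs named above
would do). The table name, column names and sample row are hypothetical:

```python
import sqlite3

# Hypothetical schema: table and column names are illustrative only.
SCHEMA = """
CREATE TABLE IF NOT EXISTS packages (
    id           INTEGER PRIMARY KEY,   -- record id
    name         TEXT NOT NULL UNIQUE,  -- package name
    version      TEXT NOT NULL,         -- package version
    last_changed INTEGER NOT NULL       -- time of last change (Unix timestamp)
);
-- An index on last_changed keeps "changed since T" queries cheap.
CREATE INDEX IF NOT EXISTS packages_last_changed
    ON packages (last_changed);
"""

def create_metadata_db(path=":memory:"):
    """Open (or create) a metadata database with the sketch schema."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn

conn = create_metadata_db()
conn.execute(
    "INSERT INTO packages (name, version, last_changed) VALUES (?, ?, ?)",
    ("bash", "3.0-17", 1096670949),  # made-up sample record
)
conn.commit()
```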

Repository servers would just run a public DBMS server or a thin wrapper
allowing arbitrary SQL queries.

With such an arrangement, a yum update, an "apt-get update" or a mirror
sync can be performed with an SQL query asking for all records with a
last-changed date later than the time of the last update; the returned
records are then inserted into the local database, replacing records
with equal package names.
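
The update step could be sketched roughly as follows. This again
assumes SQLite via Python's sqlite3 module, with two in-memory
databases standing in for the server and the client; the schema, the
sync() helper and the sample data are all hypothetical:

```python
import sqlite3

# Hypothetical schema; "name" is UNIQUE so that a newer record can
# replace an older one with the same package name.
SCHEMA = """
CREATE TABLE packages (
    id           INTEGER PRIMARY KEY,
    name         TEXT NOT NULL UNIQUE,
    version      TEXT NOT NULL,
    last_changed INTEGER NOT NULL
)
"""

def make_db(rows):
    """Create an in-memory database pre-filled with (name, version, time) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(SCHEMA)
    conn.executemany(
        "INSERT INTO packages (name, version, last_changed) VALUES (?, ?, ?)",
        rows,
    )
    conn.commit()
    return conn

def sync(server, client, last_update):
    """Copy records changed after last_update from server to client."""
    rows = server.execute(
        "SELECT name, version, last_changed FROM packages"
        " WHERE last_changed > ?",
        (last_update,),
    ).fetchall()
    # INSERT OR REPLACE is SQLite syntax; it replaces the existing row
    # that has the same (UNIQUE) package name.
    client.executemany(
        "INSERT OR REPLACE INTO packages (name, version, last_changed)"
        " VALUES (?, ?, ?)",
        rows,
    )
    client.commit()
    return len(rows)

server = make_db([("bash", "3.0", 100), ("gcc", "3.4.2", 200), ("vim", "6.3", 180)])
client = make_db([("bash", "3.0", 100), ("gcc", "3.4.1", 90)])
transferred = sync(server, client, last_update=150)
```

Only the two records changed after time 150 (gcc and vim) are
transferred; the unchanged bash record is never sent.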

This process clearly uses bandwidth proportional to the number of
updated packages, which is optimal and asymptotically better than
formats like repomd and yum headers, which require a transfer
proportional to the total number of packages in the repository.

Furthermore, being able to run arbitrary SQL queries provides extreme
flexibility. For instance, one could mirror or update only some fields
(skipping long descriptions, say), or use only a part of the repository
without having to get metadata for everything (such as only a few
packages from an unstable distribution).
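
As a rough illustration of such selective queries, here is a sketch
that fetches only chosen fields for a handful of packages. It assumes
SQLite via Python's sqlite3 module; the "summary" and "description"
columns, the fetch_fields() helper and the sample rows are hypothetical:

```python
import sqlite3

# Hypothetical server-side table with a long "description" column
# that a client may not want to download.
server = sqlite3.connect(":memory:")
server.execute(
    "CREATE TABLE packages ("
    " name TEXT PRIMARY KEY, version TEXT, summary TEXT,"
    " description TEXT, last_changed INTEGER)"
)
server.executemany(
    "INSERT INTO packages VALUES (?, ?, ?, ?, ?)",
    [
        ("bash", "3.0", "Shell", "A long description ...", 100),
        ("gcc", "3.4.2", "Compiler", "Another long description ...", 200),
        ("vim", "6.3", "Editor", "Yet another long description ...", 180),
    ],
)

ALLOWED_FIELDS = {"name", "version", "summary", "description", "last_changed"}

def fetch_fields(conn, names, fields=("name", "version")):
    """Return only the requested fields for the requested package names."""
    # Column names cannot be bound as SQL parameters, so they are
    # checked against a fixed whitelist before being interpolated.
    if not set(fields) <= ALLOWED_FIELDS:
        raise ValueError("unknown field requested")
    placeholders = ", ".join("?" for _ in names)
    sql = (
        f"SELECT {', '.join(fields)} FROM packages"
        f" WHERE name IN ({placeholders}) ORDER BY name"
    )
    return conn.execute(sql, list(names)).fetchall()

# Two packages, without their long descriptions.
subset = fetch_fields(server, ["gcc", "vim"])
```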

The main issues with this scheme are server CPU usage and security, but
I believe that its flexibility and bandwidth efficiency compensate for
them.

Thus, I'd like to hear opinions on whether this method could be a good
candidate for adoption by package tools and distributions.

--
Luca Barbieri