#!/usr/bin/perl

=pod

=head1 NAME

em_autogrip - create and maintain an Emdebian Grip repository

=cut

use strict;
use warnings;
use IO::File;
use File::Copy;
use File::Basename;
use Parse::Debian::Packages;
use Emdebian::Grip; # internal module
use Debian::Packages::Compare;
use IO::Uncompress::Gunzip qw(gunzip $GunzipError) ;

=head1 Synopsis

 Syntax: em_autogrip -b PATH [OPTIONS] [COMMAND [PACKAGES ...]]
         em_autogrip -?|-h|--help|--version

 Commands:
 -b|--base-path PATH:           path to the top level grip directory [required]

 -p|--package PACKAGES ... :    add binary package(s) to the repository
 -s|--source  PACKAGES ... :    add source package(s) to the repository
 -t|--testing:                  only work on testing instead of unstable
   --noskipold:                 reprepro option for newly added packages
   --missing:                   print a list of missing source packages
   --build-depends:             print a list of missing build dependencies
   --britney:                   print the status of testing migrations

 -?|-h|--help|--version:        print this help message and exit

 Options:
 -n|--dry-run:                  check which packages would be processed
 -v|--verbose:                  increase the verbosity of the output
 -q|--quiet:                    decrease the verbosity of the output
 -m|--mirror MIRROR:            use a different Debian mirror for setup
                                [default: http://ftp.uk.debian.org/debian]
   --filter-name STRING:        alternative name for the filter repository
   --grip-name STRING:          alternative name for the grip repository
   --add-new:                   if a source package is found to be missing,
                                 or outdated in unstable, add it to the list.

=cut

=head1 Description

The default action is to update every package already present in the
filter repository, for all architectures.

After adding binary packages, ensure that em_autogrip is run
without any options so that any missing source packages and any other
Emdebian TDebs can be updated.

Although em_autogrip will set up the initial configuration files for
the repository, it will not modify any existing files *except*
the pkglist filter that prevents the mirror adding unwanted packages.

The mirror option only has an effect if there is no repository already
found at the specified directory.

In particular, em_autogrip will only handle unstable by default.
Migrations to testing and stable, even the creation of testing and
stable, are not handled by em_autogrip. em_autogrip defaults to
including packages into unstable and will only include packages directly
into testing in C<--testing> mode when updating to versions of packages
uploaded into Debian via testing-proposed-updates or when catching up
with a new (or stalled) repository. C<--testing> mode requires a
pre-configured testing configuration in C<reprepro>.

Public repositories should also use Secure-Apt by adding a value for
SignWith: to each distribution in the Grip repository (there is no
point signing the filter repository as it should not be public and is
merely a filtered copy of existing, officially signed, repositories).

em_autogrip also updates the locale repository, shared by Emdebian
Grip and Emdebian Crush.

Note that em_autogrip will only update the *binary* package(s)
specified, even when it includes the full source package. This is down
to how reprepro runs the filtered update - all binary packages expected
to be listed in 'dpkg --get-selections' are included and even if a
source package includes another binary, it will not be downloaded
in the reprepro update. As em_autogrip does not actually build any
packages from source, unless reprepro downloads the pre-built binary
into the filter repository, that binary package will not be available
to em_autogrip. This means that the same source package in Debian
may be listed as generating a *smaller* number of binary packages in
Emdebian Grip.

=head1 Secure Apt and reprepro

The secret key for the GnuPG key specified with SignWith: needs to be in
the secret keyring of each user performing repository updates.

To verify the release in update rules, copy F</etc/apt/trusted.gpg> to
F<~/.gnupg/trustedkeys.gpg> for all users who need to run updates. To
add keys to the list available for C<gpgv> use:

 gpg --no-default-keyring --keyring ~/.gnupg/trustedkeys.gpg --import keys.gpg

=cut

=head1 Bugs

Problems with the automatic gripping of packages:

 1. Source packages need to complement binary packages.
 2. Binary packages with the same name as the source package cause both
    to be included.
 3. Some such binary packages cause unwanted dependencies to be added.
 4. Some Architecture:all packages are dependencies of packages that only
    exist on some architectures, which breaks edos-debcheck.

An example of 3. is lsb. An example of 4. is acpi-support-base.

The source package lsb is needed to complement lsb-desktop but the
lsb binary package is a meta-package for the entire lsb suite which
brings in all of Qt.

acpi-support-base is Architecture: all but depends on acpid which is
Architecture: any [i386 amd64] - i.e. acpi-support-base should only
exist on i386 and amd64 but as it is Architecture: all, it gets added
to arm, armel, mips, mipsel and powerpc as well - at which point it
has to be removed. There are ongoing discussions about such packages.

 http://lists.debian.org/debian-devel/2009/01/msg00246.html

=head1 Signal:Noise ratio in output

One important point here: reprepro outputs B<a lot> of messages, which
may include many statements about errors and checksum mismatches, plus
'skipping foo' and 'downgrading bar' from and to the same version. The
useful information is hidden within all that noise, so reprepro STDERR
(or STDOUT) output cannot simply be ignored. For now, go by effects: if
something is broken, look for errors that relate specifically to that
package and ignore "errors" reported where everything is fine. More
work in the Emdebian::Grip module should isolate duplicate operations
and unnecessary work, which, in turn, should cut out most of the noise.
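
One way to go by effects is to capture the reprepro output and keep
only the lines that mention the package under investigation. The
following is a minimal sketch and is not part of em_autogrip: the
C<lines_for_package> helper is a hypothetical name and the sample lines
are invented for illustration, they do not reproduce real reprepro
messages.

```perl
use strict;
use warnings;

# Hypothetical helper: return only the lines of captured output that
# mention the given package name as a complete name.
sub lines_for_package {
	my ($pkg, @lines) = @_;
	# \Q..\E quotes characters such as '+' and '.' in package names.
	# The lookarounds stop 'libbar1' from matching 'libbar1-dev'.
	return grep { /(?<![a-z0-9+.-])\Q$pkg\E(?![a-z0-9+.-])/ } @lines;
}

# Invented sample lines, standing in for output captured from a real run.
my @captured = (
	"skipping foo\n",
	"checksum mismatch for libbar1\n",
	"skipping libbar1-dev\n",
);
my @hits = lines_for_package ('libbar1', @captured);
print @hits;
```

The lookarounds use the characters allowed in Debian package names, so
a match on one package does not drag in lines about its siblings.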

=head1 Using add-new

In C<--testing> mode, em_autogrip checks for packages that have missing
or outdated source packages in unstable and outputs a sample command
that can be run to fill the gap. If C<--add-new> is used, that sample
command is run automatically. This means that C<--add-new> requires
C<--testing> and that a second run of C<--testing> without C<--add-new>
is needed afterwards. This support is part of grip_cron.sh.
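
The dependency between the two options could be checked before any
work starts. The sketch below is an assumption about how a wrapper
script might do that; em_autogrip itself performs no such check, and
C<check_add_new> is a hypothetical name. It uses C<GetOptionsFromArray>
from the standard Getopt::Long module and only knows about the two
options discussed here.

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Hypothetical check, not performed by em_autogrip itself: report a
# problem when --add-new is given without --testing.
sub check_add_new {
	my (@args) = @_;
	my ($testing, $addnew);
	GetOptionsFromArray (\@args,
		't|testing' => \$testing,
		'add-new'   => \$addnew,
	) or return "unrecognised option";
	return "--add-new requires --testing" if ($addnew and not $testing);
	return;
}

foreach my $try (['--testing', '--add-new'], ['--add-new']) {
	my $problem = check_add_new (@$try);
	print "@$try: ", (defined $problem ? $problem : 'ok'), "\n";
}
```

The helper returns nothing when the combination is acceptable and a
short message otherwise, so the caller decides whether to warn or stop.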

=head1 Build dependencies

In the absence of a quicker way to identify which real package Provides:
a virtual dependency, C<apt-cache showpkg> is used against the main
system cache. If this machine is not running Debian unstable, the list
may be inaccurate or skip dependencies that are provided by packages
that are only available in unstable (or if running stable, packages
which are only in unstable or testing).
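
As an illustration of the lookup, the sketch below collects the package
names listed under the C<Reverse Provides:> heading of
C<apt-cache showpkg> style text. It is not the code used by
em_autogrip: C<reverse_provides> is a hypothetical helper, and the
sample text is trimmed with invented version numbers. In real use the
text would come from running C<apt-cache showpkg> on the virtual
package name, and the result depends on the local apt cache.

```perl
use strict;
use warnings;

# Hypothetical helper: list the real packages named under the
# "Reverse Provides:" heading of 'apt-cache showpkg' style text.
sub reverse_provides {
	my ($text) = @_;
	my (%seen, $in_section);
	foreach my $line (split /\n/, $text) {
		if ($line =~ /^Reverse Provides:/) {
			$in_section = 1;
			next;
		}
		next unless $in_section;
		# The first field of each entry is the providing package.
		my ($name) = split ' ', $line;
		$seen{$name}++ if defined $name;
	}
	return sort keys %seen;
}

# Illustrative, trimmed sample with invented version numbers.
my $sample = join ("\n",
	'Package: mail-transport-agent',
	'Versions: ',
	'Reverse Provides: ',
	'postfix 3.4.7-1 (= )',
	'exim4-daemon-light 4.92-8 (= )',
	'');
my @providers = reverse_provides ($sample);
print join (' ', @providers), "\n";
```

Duplicates are folded because the same package can be listed once per
available version.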

=head1 Repetition

If a package fails to build from source in Debian, C<em_autogrip>
will keep on trying to update it until the same version exists in the
filter repository for all supported architectures.

Equally, manual tinkering with packages in the Grip repository,
e.g. adding modified versions for testing, will cause the original
Debian version to keep appearing in the C<em_autogrip> updates and
reprepro will ignore the built package as long as the modified version
is higher.

=head1 Old packages

C<em_autogrip> does not handle removals from the archive - these
are manual within Debian too. Packages that only exist in stable
or oldstable will confuse C<em_autogrip>, especially if the old
package name is 'Provided' by another package which already exists
in Grip, e.g. postgresql.

=head1 Adding lots of packages in one run

Sometimes, perhaps when setting up a new mirror, a full list of
packages already exists on another site. Copying that pkglist into
the new site will clear that list as the filter repository on the
new site is empty. To avoid this problem, create the pkglist you
need, then run the filter update directly:

 reprepro -b /PATH/filter -v update

Now run C<em_autogrip> without specifying any packages.

 em_autogrip -b /PATH/

Note that C<reprepro> needs the path to the filter directory, whereas
C<em_autogrip> needs the path to the directory above, where it can find
F<./filter/>, F<./grip/> and F<./locale/>.

C<em_autogrip> will then update the pkglist file with the final
contents of the filter repository.
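
The relationship between the base path and the three repository
directories can be sketched as follows. C<repo_dirs> is a hypothetical
helper written for this example. The defaults of C<filter> and C<grip>
match the defaults of C<--filter-name> and C<--grip-name>, and the
locale directory name is fixed.

```perl
use strict;
use warnings;

# Hypothetical helper mirroring the layout described above: given the
# top level directory, return the three repository directories.
sub repo_dirs {
	my ($base, $filter_name, $grip_name) = @_;
	$filter_name = 'filter' unless defined $filter_name;
	$grip_name   = 'grip'   unless defined $grip_name;
	# Ensure exactly one trailing slash on the base path.
	$base =~ s{/*$}{/};
	return (
		filter => "${base}${filter_name}",
		grip   => "${base}${grip_name}",
		locale => "${base}locale",
	);
}

my %dirs = repo_dirs ('/PATH');
# reprepro is given $dirs{filter}; em_autogrip is given the base itself.
print "$_: $dirs{$_}\n" foreach sort keys %dirs;
```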

=head1 Ubuntu / non-Debian sources/suites

Emdebian Grip is still Debian, so although non-Debian repositories can
be supported, the resulting Grip repository still requires a Debian-like
layout. In particular, an 'unstable' suite must exist, even if the
codename of that suite is not called 'sid'. Equally, if the repository
is to support britney migrations, a suite called 'testing' must exist.

Remember, suites will change when a Debian stable release is made (i.e.
testing points to something else after the release compared to what it
contained before the release). Codenames do not change - squeeze always
contains squeeze, even once squeeze is released as stable.

=cut

use vars qw/ %debianunstable %debiantesting %gripunstable %griptesting
 %tdebunstable %tdebtesting $base @archlist $packages $suite @locroots
 $prog $our_version $mirror $noskip @bin @cmd $update $dryrun
 $mode $filter_name $grip_name $c $plural $verbose $addnew %builddeps /;

my $prog = basename($0);
$our_version = &scripts_version();
my $mirror='http://ftp.uk.debian.org/debian'; # default only for initial setup
$noskip = '';
$filter_name = 'filter';
$grip_name = 'grip';
$suite = "unstable"; # at first
$verbose = 0;
$mode='';

while( @ARGV ) {
	$_= shift( @ARGV );
	last if m/^--$/;
	if (!/^-/) {
		unshift(@ARGV,$_);
		last;
	} elsif (/^(-\?|-h|--help|--version)$/) {
		&usageversion();
		exit (0);
	} elsif (/^(-m|--mirror)$/) {
		$mirror = shift;
	} elsif (/^(-v|--verbose)$/) {
		$verbose++;
	} elsif (/^(-q|--quiet)$/) {
		$verbose--;
	} elsif (/^(-b|--base-path)$/) {
		$base = shift;
	} elsif (/^(--noskipold)$/) {
		$noskip="--noskipold";
	} elsif (/^(-p|--package)$/) {
		push @bin, shift;
		$mode = 'binary';
	} elsif (/^(-s|--source)$/) {
		push @cmd, shift;
		$mode = 'source';
	} elsif (/^(--missing)$/) {
		$mode = 'missing';
	} elsif (/^(--build-depends)$/) {
		$mode = 'builddeps';
	} elsif (/^(-t|--testing)$/) {
		$mode = 'testing';
	} elsif (/^(--britney)$/) {
		$mode = 'britney';
	} elsif (/^(-n|--dry-run)$/) {
		$dryrun++;
	} elsif (/^(--add-new)$/) {
		$addnew++;
	} elsif (/^(--edos)$/) {
		$mode = 'edos';
	} elsif (/^(--filter-name)$/) {
		$filter_name = shift;
	} elsif (/^(--grip-name)$/) {
		$grip_name = shift;
	} else {
		die "$prog: Unknown option $_.\n";
	}
}

if (@ARGV) {
	push @cmd, @ARGV if ($mode eq 'source');
	push @bin, @ARGV if ($mode eq 'binary');
}

$verbose = ($verbose > 0) ? $verbose : 0;
&set_noskip($noskip);
&set_dry_run if (defined $dryrun);

die "ERR: Please specify an existing directory for the base-path.\n"
	if (not defined $base);

$base .= '/' if ("$base" !~ m:/$:);
if (not -d $base) {
	print "ERR: Please specify an existing directory for the base-path: $base\n";
	exit 1;
}
&stamp;
&set_base($base);
&set_repo_names ($filter_name, $grip_name);

=head1 Architecture list

The list of architectures supported by a particular Grip setup cannot
be easily changed - a lot of repository updates are needed before new
architectures can be added to the array. Existing architectures
can be dropped relatively easily. Sequence is unimportant.

 @archlist = qw/i386 amd64 arm armel powerpc mips mipsel/;
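
The list above is only a fallback: a list read from the repository
configuration takes precedence when one is available. A minimal sketch
of that choice, using the hypothetical helper name C<choose_archlist>
and, as a simplification, dropping the C<source> pseudo-architecture
from the result:

```perl
use strict;
use warnings;

# Sketch of the fallback: prefer the list read from the repository
# configuration, otherwise use the built-in default.
sub choose_archlist {
	my ($configured) = @_;
	my @default = qw/i386 amd64 arm armel powerpc mips mipsel/;
	my @list = (defined $configured and @$configured)
		? @$configured : @default;
	# 'source' is not a binary architecture, so leave it out here.
	my @archs = grep { $_ ne 'source' } @list;
	return sort @archs;
}

my @fallback = choose_archlist (undef);
my @chosen   = choose_archlist ([qw/source armel i386/]);
print join (' ', @fallback), "\n";
print join (' ', @chosen), "\n";
```

Sorting the result keeps the processing order stable between runs,
although the sequence itself has no meaning.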

=cut

# base must be set first.
# avoid 'my $a' here: $a is the special variable used by sort blocks.
my $archref = &get_archlist ($suite, $filter_name);
@archlist = (not defined $archref or not @$archref) ?
	sort qw/i386 amd64 arm armel powerpc mips mipsel/ : sort @$archref;
foreach my $test (@archlist) {
	next if ($test eq "source");
	my $retval = system ("LC_ALL=C dpkg-architecture -a$test > /dev/null 2>&1");
	$retval >>= 8; # the exit status is in the high byte of the return value
	if ($retval == 9) {
		die ("\nERR: `dpkg-architecture` does not support '$test'!\n\n");
	}
}
my $l = &get_locale_roots ($suite, 'locale');
@locroots = (not defined $l or not @$l) ? qw/ af am ang ar as ast az be bg
bn br bs ca cs cy da de dz el en eo es et eu fa fi fr ga gl gu he hi hr
hu hy ia id io is it ja ka kn km ko ku ky lg li lt lv mai mg mi mk ml mn mr
ms nb ne nl nn no ns nso oc or pa pl ps pt rm ro ru rw si sk sl sq sr sv
ta te th tk tl tr tt ug uk ur uz vi wa wo xh yi zh zu / : @$l;

&setup_repos ($mirror, $suite) if ( not -f "${base}${filter_name}/conf/pkglist" );
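if (not -d "${base}${grip_name}/incoming") {
	# $base already ends with '/', so no extra separator is needed.
	mkdir "${base}${grip_name}/incoming"
		or warn "ERR: Unable to create ${base}${grip_name}/incoming: $!\n";
}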
print "INF: Checking that a testing repository exists... "
	if ($mode eq 'testing' or $mode eq 'britney');

if (($mode eq 'testing' or $mode eq 'britney')
	and not -d "${base}/${filter_name}/dists/testing") {
	print "\n\nERR: No testing distribution has been set up for filter.\n";
	print "ERR: Aborting . . . \n";
	exit 5;
}

if (($mode eq 'testing' or $mode eq 'britney')
	and not -d "${base}/$grip_name/dists/testing") {
	print "\n\nERR: No testing distribution has been set up for grip.\n";
	print "ERR: Aborting . . . \n";
	exit 10;
}

if (($mode eq 'testing' or $mode eq 'britney')
	and not -d "${base}/locale/dists/testing") {
	print "\n\nERR: No testing distribution has been set up for locale.\n";
	print "ERR: Aborting . . . \n";
	exit 11;
}

print "ok\n" if (($mode eq 'testing' or $mode eq 'britney'));

if ($mode eq 'testing') {
	$suite = 'testing';
	print "INF: add-new is set.\n" if (defined $addnew);
	print "INF: Calculating britney data, please wait . . . \n";
	my $complain = &grip_britney;
	if (defined $complain and defined $addnew) {
		my $list = join (' ', @$complain);
		print "INF: Adding '$list' sources to unstable.\n";
		&extend_filter ($list);
		&set_noskip('--noskipold');
		&update_repo ($verbose);
		&set_noskip('');
		&load_data;
		foreach my $pkg (@$complain) {
			my $src = $debianunstable{$pkg}{'Src'};
			if (not defined $debianunstable{$src}{'source'}) {
				# really shouldn't happen in this phase.
				warn ("ERR: $pkg is not a source package name. Try '$src'\n");
				next;
			}
			# note that sources are added to unstable
			# then sorted out later for testing.
			grip_source ($src, $debianunstable{$src}{'source'}, 'unstable', 'source');
			&clean_incoming ($grip_name);
		}
	}
	print "INF: Calculation done - reloading package data . . .\n";
	&load_data;
	print "INF: Checking for any missing migrations . . .\n";
	&migrate_missing;
	print "INF: Done.\n";
	exit 0;
}

if ($mode eq 'edos') {
	&edos($mode, $suite);
	exit 0;
}

&load_data;

if ($mode eq 'britney') {
	print "INF: Calculating britney data, please wait . . . \n";
	&print_testing_status;
	print "Debian unstable: ".scalar (keys %debianunstable)."\n";
	print "Debian testing : ".scalar (keys %debiantesting)."\n";
	print "Grip unstable  : ".scalar (keys %gripunstable)."\n";
	print "Grip testing   : ".scalar (keys %griptesting)."\n";
	print "INF: Done.\n";
	exit 0;
}

if ($mode eq 'missing') {
	&print_missing ($filter_name, $grip_name);
	exit 0;
}

if ($mode eq 'builddeps') {
	my $host = `dpkg-architecture -qDEB_BUILD_ARCH_CPU`;
	chomp ($host);
	&print_build_deps ($filter_name, $grip_name);
	print "INF: Note: Virtual packages are likely to exist in this list.\n";
	print "INF: Not all of the calculated packages are necessarily available".
	" for the '$host' architecture.\n";
	my $hostname = `hostname -f`;
	chomp ($hostname);
	print "INF: If '$hostname' is not running Debian unstable, some ".
	"Provides: packages might have been missed.\n";
	exit 0;
}

if ($mode eq 'source') {
	my $list = join (' ', @cmd);
	&extend_filter ($list);
	&set_noskip('--noskipold');
	&update_repo ($verbose);
	&set_noskip('');
	&load_data;
	foreach my $pkg (@cmd) {
		my $src = $debianunstable{$pkg}{'Src'};
		if (not defined $debianunstable{$src}{'source'}) {
			warn ("ERR: $pkg is not a source package name. Try '$src'\n");
			next;
		}
		grip_source ($src, $debianunstable{$src}{'source'}, $suite, 'source');
		&clean_incoming ($grip_name);
	}
	my $date = `date`;
	chomp ($date);
	print "End: {source}: $date\n";
	exit 0;
}

if ($mode eq 'binary') {
	# $flag is set when a binary comes from a differently named source;
	# %hash maps each binary package to its source package.
	my ($flag, %hash);
	my $list = join(' ', sort (@bin));
	&extend_filter ($list);
	&set_noskip('--noskipold');
	&update_repo ($verbose);
	&set_noskip('');
	foreach my $pkg (@bin) {
		# work out the arch.
		foreach my $arch (sort @archlist) {
			next if ($arch eq 'source');
			my $detail = &get_single_package ('unstable', $filter_name, $pkg, $arch);
			my $src = (defined $detail->{'Source'}) ? $detail->{'Source'} : $pkg;
			print "INF: $pkg is from the $src source package.\n";
			$hash{$pkg}=$src;
			$flag++ if ($src ne $pkg);
			&grip_binary($pkg, $detail->{'Version'}, 'unstable', $arch);
			&clean_incoming ($grip_name);
			if (defined $detail->{'Architecture'}) {
				last if ($detail->{'Architecture'} eq 'all');
			}
		}
		&load_data;
	}
	# possibly new source package identified, refresh data.
	if (defined $flag) {
		print "INF: Updating source package(s) . . \n";
		my $list = join(' ', sort (values %hash));
		&extend_filter ($list);
		&set_noskip('--noskipold');
		&update_repo ($verbose);
		&set_noskip('');
		&load_data;
		foreach my $srcpkg (sort values %hash) {
			my $srcdetail = &get_single_package ('unstable', $filter_name, $srcpkg,'source');
			print "INF: Source version in unstable: ".$srcdetail->{'Version'}."\n";
			&grip_source ($srcpkg, $srcdetail->{'Version'}, 'unstable', 'source');
			&clean_incoming ($grip_name);
		}
	}
	my $date = `date`;
	chomp ($date);
	print "End: {binary}: $date\n";
	exit 0;
}

&update_filter;
&update_repo($verbose);

my $missing = &get_missing_sources('unstable', "$filter_name", "$grip_name");
$c = scalar (keys %$missing);
if ($c > 0) {
	$plural = ($c > 1) ? "source packages need" : "source package needs";
	print "INF: $c $plural to be updated or installed.\n";
	print "INF: ".join (' ', sort keys %$missing)."\n";
	foreach my $pkg (sort keys %$missing) {
		my $src = $debianunstable{$pkg}{'Src'};
		grip_source ($src, $debianunstable{$pkg}{'source'}, $suite, 'source');
		&clean_incoming ($grip_name);
	}
}

my $binaries = &get_missing_binaries ('unstable', "$filter_name", "$grip_name");
$c = scalar (keys %$binaries);
if ($c <= 0) {
	print scalar (keys %debianunstable)." packages in the filter repository.\n";
	print scalar (keys %gripunstable)." packages in the Grip repository.\n";
	print "INF: Nothing to do.\n";
	my $date = `date`;
	chomp ($date);
	print "End: {update}: $date\n\n";
	exit 0;
}
$plural = ($c > 1) ? "packages need" : "package needs";
print "$c $plural to be updated or installed.\n";
print join (' ', sort keys %$binaries)."\n";
foreach my $pkg (sort keys %$binaries) {
	&extend_filter ($pkg);
	&load_data;
	foreach my $arch (@archlist) {
		next if ($arch eq 'source');
		next unless (defined $debianunstable{$pkg}{$arch});
		grip_binary ($pkg, $debianunstable{$pkg}{$arch}, $suite, $arch);
		&clean_incoming ($grip_name);
	}
}

=head1 Recursive edos considered risky

Recursion is still risky so edos is left as a manual step.
The problem appears to be that once the repository gets out
of step with Debian, an update must happen before edos can
be resolved. Once the repository is up to date, edos can be
run, apparently, without problems. Testing continues to see
if simply moving the function lower in the flow resolves the
problems.

=cut

#&edos;

my $date = `date`;
chomp ($date);
print "End: $date\n\n";
exit 0;

=head1 Copyright and Licence

 Copyright (C) 2007-2010 Neil Williams <codehelp@debian.org>

 This package is free software; you can redistribute it and/or modify
 it under the terms of the GNU General Public License as published by
 the Free Software Foundation; either version 3 of the License, or
 (at your option) any later version.

 This program is distributed in the hope that it will be useful,
 but WITHOUT ANY WARRANTY; without even the implied warranty of
 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
 GNU General Public License for more details.

 You should have received a copy of the GNU General Public License
 along with this program.  If not, see <http://www.gnu.org/licenses/>.

=cut

## subroutines

sub usageversion {
	printf STDERR (_g("
%s - create an Emdebian Grip repository
version $our_version

Syntax: %s -b PATH [OPTIONS] [COMMAND [PACKAGES ...]]
        %s -?|-h|--help|--version

Commands:
-b|--base-path PATH:           path to the top level grip directory [required]

-p|--package PACKAGES ... :    add binary package(s) to the repository
-s|--source  PACKAGES ... :    add source package(s) to the repository
-t|--testing:                  only work on testing instead of unstable
   --missing:                  print a list of missing source packages
   --build-depends:            print a list of missing build dependencies
   --britney:                  print the status of testing migrations

-?|-h|--help|--version:        print this help message and exit

Options:
-n|--dry-run:                  check which packages would be processed
-m|--mirror MIRROR:            use a different Debian mirror for setup
                                [default: http://ftp.uk.debian.org/debian]
   --noskipold:                reprepro option for newly added packages
   --filter-name STRING:       alternative name for the filter repository
   --grip-name STRING:         alternative name for the grip repository
   --add-new:                  if a source package is found to be missing,
                                or outdated in unstable, add it to the list.

The default is to update all the packages so far existing in the
filter repository, in all architectures.

After adding binary packages, ensure that $prog is run
without any options so that any missing source packages and any other
Emdebian TDebs can be updated.

Although %s will set up the initial configuration files for
the repository, it will not modify any existing files *except*
the pkglist filter that prevents the mirror adding unwanted packages.

In particular, $prog will only handle unstable by default.
Migrations to testing and stable, even the creation of testing and
stable, are not handled by $prog. $prog includes
packages into unstable unless run in --testing mode.

Public repositories should also use Secure-Apt by adding a value for
SignWith: to each distribution in the Grip repository (there is no
point signing the filter repository as it should not be public).

$prog also updates the locale repository, shared by Emdebian
Grip and Emdebian Crush.

Note that %s will only update the *binary* package(s)
specified, even when it includes the full source package. This is down
to how reprepro runs the filtered update - all binary packages expected
to be listed in 'dpkg --get-selections' are included and even if a
source package includes another binary, it will not be downloaded
in the reprepro update. As $prog does not actually build any
packages from source, unless reprepro downloads the pre-built binary
into the filter repository, that binary package will not be available
to $prog. This means that the same source package in Debian
may be listed as generating a *smaller* number of binary packages in
Emdebian Grip.

"), $prog, $prog, $prog, $prog, $prog)
	or die "$0: failed to write usage: $!\n";
}

sub load_data {
	my $debu  = &read_packages ('unstable', $filter_name);
	my $gripu = &read_packages ('unstable', $grip_name);
	my $tdebu = &read_locale   ('unstable', 'locale');
	%debianunstable = %$debu   if (defined $debu);
	%gripunstable   = %$gripu  if (defined $gripu);
	%tdebunstable   = %$tdebu  if (defined $tdebu);
	if (($mode eq 'testing') or ($mode eq 'britney')) {
		my $debt  = &read_packages ('testing',  $filter_name);
		my $gript = &read_packages ('testing',  $grip_name);
		my $tdebt = &read_locale   ('testing',  'locale');
		%griptesting    = %$gript  if (defined $gript);
		%debiantesting  = %$debt   if (defined $debt);
		%tdebtesting    = %$tdebt  if (defined $tdebt);
	}
}

sub stamp {
	print "This is $prog - a tool to create and maintain an ";
	print "Emdebian Grip repository.\n";
	print "Version: $our_version\n";
	# only keep lines for installed packages, which start with 'ii'.
	my $deps = `dpkg -l emdebian-grip-server emdebian-grip emdebian-tdeb | grep '^ii'`;
	$deps =~ s/ +/ /g;
	my @depends=();
	my @dep=split("\n", $deps);
	foreach my $d (@dep) {
		$d =~ s/ii //;
		chomp($d);
		# drop the descriptions.
		$d =~ s/^(.*?) (.*?) .*$/$1 \($2\)/;
		push @depends, $d;
	}
	print "Dependencies: ".join(", ",@depends)."\nBuildd: ";
	system ("hostname -f");
	my $date = `date`;
	chomp ($date);
	print "Start: $date\n";
}
