Hi,
On Tue, Jan 08, 2002 at 02:19:35AM +0000, mike wrote:
> #!/usr/bin/perl -w
> open(FILES,">lista");
> my @list2 = `grep 'HREF' $ARGV[0]`;
> foreach $list2 (@list2){
> $list2=~s/HREF="//g;
> $list2=~s/(\#.*\")//g;
> $list2=~s/(\")//g;
> $list2=~`grep 'html' $list2`;
> print FILES "$list2"};
You shouldn't really run grep from perl; I don't see the point when perl
can do the matching itself. You might also want to look at
HTML::LinkExtor, which seems to do what you want but better. Your version
doesn't distinguish between HTML markup and content, so it treats these
two the same:
<A HREF="foobar.html">
and
<p>And an HREF="something.html" can be added to an <a> tag to ...
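For what it's worth, here is a rough sketch of the HTML::LinkExtor route. It assumes the module is installed (it ships with the HTML-Parser distribution on CPAN, not with perl itself), and the sample HTML and the names $html, @files and %seen are made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;

# Sample input: one real link, and one HREF= that is only page text.
my $html = <<'END';
<A HREF="foobar.html#top">a real link</A>
<p>And an HREF="something.html" can be added to a tag.</p>
END

my %seen;
my @files;

my $parser = HTML::LinkExtor->new;   # no callback: links are collected internally
$parser->parse($html);
$parser->eof;

for my $link ($parser->links) {
    my ($tag, %attr) = @$link;       # e.g. ('a', href => 'foobar.html#top')
    next unless $tag eq 'a' && defined $attr{href};
    (my $file = $attr{href}) =~ s/#.*//;   # drop any #fragment
    next unless $file =~ /html/;           # keep only html targets
    push @files, $file unless $seen{$file}++;
}

print "$_\n" for @files;
```

Only the link inside the real <A> tag should come out; the HREF= that sits in the paragraph text is ignored because the parser never sees it as an attribute.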
> open (FILES2,"<lista");
> @filea=<FILES2>;
Hrm? You've just created a list of the files you want, written the list
to a file, and then read the same list straight back in? Can't you just
keep it in a variable? The substitutions in your foreach change the
elements of @list2 in place, so
@filea = @list2;
should probably have had the same effect.
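If you'd rather stay with regexes, something along these lines keeps everything in memory and needs neither the external grep nor the temporary file. It's only a sketch: @lines is a hypothetical stand-in for the lines of your input file, and the pattern assumes the HREF value is double-quoted.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Stand-in for the lines of the input file (hypothetical sample data).
my @lines = (
    qq{<A HREF="one.html#intro">One</A>\n},
    qq{<A HREF="two.html">Two</A>\n},
    qq{<IMG SRC="logo.png">\n},
);

# Capture the HREF value up to the closing quote or a #fragment,
# then keep only the names that mention html.
my @filea = grep { /html/ } map { /HREF="([^"#]*)/ ? $1 : () } @lines;

print "$_\n" for @filea;
```

The captured names carry no trailing newline, so there is nothing to chomp later on.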
> open (FILE3,">lista1");
> foreach $filea (@filea){
> %seen =();
> @list3 = grep{ ! $seen{$_} ++ } @filea;}
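The loop does nothing useful here AFAICS: the grep already walks all of
@filea, so every pass just rebuilds the same @list3. Run the grep once,
outside any loop.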
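A minimal sketch of the duplicate removal done once, with made-up sample data in @filea:

```perl
#!/usr/bin/perl
use strict;
use warnings;

my @filea = ("a.html\n", "b.html\n", "a.html\n");   # sample data

# %seen counts how often each entry has turned up; grep keeps an entry
# only the first time, when its count is still zero.
my %seen;
my @list3 = grep { !$seen{$_}++ } @filea;

print @list3;
```

The order of first appearance is preserved, which a sort-based approach would not give you.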
> print FILE3 @list3;
> close FILE3;
> open (FILE6,"<lista1") or die "cant open";
> @list4=<FILE6>;
You've done the "I don't like variables, files are nice" thing again.
> foreach $list4(@list4){
> print $list4;
> $file1=$list4;
> open(FILEX,"<<$file1");
You only want one < to open a file for reading.
> @file2=<FILEX>;
> print @file2;
> }
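Here is one way that last loop could look with a single <, written as a three-argument open with a check on the result. It's a sketch, not a drop-in replacement: the temporary file exists only so the example runs on its own, and @contents is a name I've made up.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempfile);

# Create a throwaway file so the sketch runs on its own.
my ($tmp, $name) = tempfile(UNLINK => 1);
print {$tmp} "<html>hello</html>\n";
close $tmp or die "close $name: $!";

my @list4 = ("$name\n");    # names as read from a list, newline still attached
my @contents;

for my $file1 (@list4) {
    chomp $file1;           # three-argument open takes the name literally
    open(my $fh, '<', $file1) or die "can't open $file1: $!";
    push @contents, <$fh>;  # slurp every line of this file
    close $fh;
}

print @contents;
```

Including $! in the die message tells you why the open failed, which "cant open" on its own does not.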
On Wed, Jan 09, 2002 at 01:05:44AM +0000, mike wrote:
> Me again - solved most of it in perl except for one annoyance - for
> the life of me I can't match > in the HTML - tags on the other side
> are fine, it just won't let me get rid of it
Just match with >. You don't need to escape it.
[huggie@bounce ~]$ perl -le '$a = "foo>bar"; if ($a =~ m/>(.*)$/) {
print $1; }'
bar
Simon.
-- 
Just another wannabie | I only play with my computer | Just another fool
----------------------+ on days that end in "y".     +-------------------
This message was brought to you the letter Q and the number 22.
htag.pl 0.0.19 -- http://www.earth.li/projectpurple/progs/htag.html
--------------------------------------------------------------------
http://www.lug.org.uk http://www.linuxportal.co.uk
http://www.linuxjob.co.uk http://www.linuxshop.co.uk
--------------------------------------------------------------------
This archive was generated by hypermail 2.1.3 : Wed 09 Jan 2002 - 09:35:43 GMT