Zurück

Akadia Information Technology


Figures:  

Code Listing or Textfile.

Contents:   

Difference of two Two Dates in Perl
Implementing a UNIQUE Index in Perl
Numerical and String Comparison
Putting Commas in Numbers
How to get size of a file ?
Reading char by char with Perl
Undefined und defined
Print Hash Elements
Passing by Reference
Passing by Value
How to start index.html as a CGI-Script for Apache Server ?
How to get Perl Documentation ?
How to install Perl Module on NT with Package Manager PPM from ActivState
Here documents in Perl
Typeglobs and Filehandles
Read output from OS-program as input in perl-script
Read all files *.txt using the glob() function
Reading Lines with Continuation Characters
Easy editing with regular expressions (Cookbook)
How to retrieve email with POP3 and Perl ?
How do I send email from a perl script ?
How Perl finds modules and libraries ?
How to see what modules Perl is loading ?
Code Listings and Examples 
  

Links: 

Online Regular Expression References for Perl
Perl modules for Win95, Win98, and NT
Perl tutorials for beginners
Using Perl and NT


Difference of two Two Dates in Perl

Overview

You need to find the number of days between two dates or times. If your dates are in Epoch seconds, and fall in the range Fri Dec 13 20:45:52 1901 to Tue Jan 19 03:14:07 2038 (inclusive), simply subtract one from the other and convert the seconds to days.

If you have distinct DMYMHS values, or are worried about the range limitations of Epoch seconds, use the Date::Calc module from CPAN. It can calculate the difference between dates:

Solution in Perl

#!/usr/bin/perl -w
#--------------------------------------------------
# File: date_diff.perl
#
# Autor: Martin Zahn, 31.10.2004
#
# Purpose: Calculate Difference between two Dates.
# Date must be in Format: MM/DD/YYYY-HH:MI:SS
# Example 11/02/2004-17:11:27
# ---------------------------------------------------

use Date::Calc qw(Delta_DHMS);
@ARGV == 2 or die "usage:  date_diff.perl start_date end_date\n";
my $start_date = $ARGV[0];
my $end_date = $ARGV[1];
my $start_month;
my $start_day;
my $start_year;
my $start_hour;
my $start_minute;
my $start_second;
my $end_month;
my $end_day;
my $end_year;
my $end_hour;
my $end_minute;
my $end_second;
$start_day = substr($start_date,3,2);
$start_month = substr($start_date,0,2);
$start_year = substr($start_date,6,4);
$start_hour = substr($start_date,11,2);
$start_minute = substr($start_date,14,2);
$start_second = substr($start_date,17,2);
$end_day = substr($end_date,3,2);
$end_month = substr($end_date,0,2);
$end_year = substr($end_date,6,4);
$end_hour = substr($end_date,11,2);
$end_minute = substr($end_date,14,2);
$end_second = substr($end_date,17,2);
@from  = ($start_year, $start_month, $start_day,
          $start_hour, $start_minute, $start_second);
@to =    ($end_year, $end_month, $end_day, $end_hour,
          $end_minute, $end_second);
@diff = Delta_DHMS(@from, @to);

$d = $diff[0] * 24 * 60 * 60;
$h = $diff[1] * 60 * 60;
$m = $diff[2] * 60;
$s = $diff[3];
$t = $d + $h + $m + $s;
print "$start_date - $end_date is $diff[0] Days $diff[1]
        Hours $diff[2] Minutes $diff[3] Seconds,
        Total = $t [Sec]\n";
$ ./date_diff.perl 11/05/2004-07:19:05 11/05/2004-09:30:23
  11/05/2004-07:19:05 - 11/05/2004-09:30:23 is
  0 Days 2 Hours 11 Minutes 18 Seconds, Total = 7878 [Sec]

One problem with Epoch seconds is how to convert the large integers back to forms that people can read. The above example shows one way of converting an Epoch seconds value back to its component numbers of days, hours, minutes, and seconds.

Implementing a UNIQUE Index in Perl

Overview

You want to eliminate duplicate values from a list, such as when you build the list from a file or from the output of another command. If you have Oracle or any other RDBMS it's simple - however how to accomplish this task without a UNIQUE INDEX offered by any database system. Perl can be the solution!

This Tip is equally applicable to removing duplicates as they occur in input and to removing duplicates from a file.

Solution

As an example look at the following file:

1941793993473 228013316500427 8940000000000000000 1234 63
1941796973039 228014010727379 8940000000000000001 1234 63
1941792475618 228012111776405 8940000000000000002 1234 63
1941792783433 228014112291994 8940000000000000003 1234 63
1941793993473 228013514731297 8940000000000000004 1234 63
1941787512799 228014417581198 8940000000000000005 1234 63
1941795720438 228014015018031 8940000000000000006 1234 63
1941795166746 228012116512702 8940000000000000007 1234 63
1941794293060 228014015058761 8940000000000000008 1234 63
1941793993473 228012114838798 8940000000000000009 1234 63
1941793993473 228014112624584 8940000000000000010 1234 63

You want to eliminate all duplicate rows -- in the example the rows after the first row with the number 1941793993473.

Use a hash to record which items have been seen, then keys to extract them. You can use perl's idea of truth to shorten and speed up your code.

Here is the perl code:

#!/usr/bin/perl -w

@ARGV == 1 or die "usage: unique.perl <file>\n";

# Define Hash and Array
%seen = ();
@uniq = ();

# Open File to process
($infile) = $ARGV[0];
open(INFILE, "< $infile")
  or die "Can't open $infile for reading: $!\n";
@list = <INFILE>;

# Loop through the
File
foreach $line (@list) {
  ($col1,$col2,$col3,$col4,$col5) = split(" ",$line);

  unless ($seen{$col1}) {

    #
If we get here, we have not seen it before
    print ("col1=$col1 ");
    $rest = $col2 . " " . $col3;
    print ("$rest\n");

    $seen{$col1} = $col2 . $col3 . $col4 . $col5;
    push(@uniq, $col1);
  }
}

Start the Perl Program

$ ./unique.perl unique.txt

1941793993473 228013316500427 8940000000000000000
1941796973039 228014010727379 8940000000000000001
1941792475618 228012111776405 8940000000000000002
1941792783433 228014112291994 8940000000000000003
1941796418213 228013514731297 8940000000000000004
1941787512799 228014417581198 8940000000000000005
1941795720438 228014015018031 8940000000000000006
1941795166746 228012116512702 8940000000000000007
1941794293060 228014015058761 8940000000000000008

The question at the heart of the matter is "Have I seen this element before?" Hashes are ideally suited to such lookups. The solution shown builds up the array of unique values as we go along, using a hash to record whether something is already in the array.

Numerical and String Comparison

Comparison                    Numeric             String
--------------------------------------------------------------------
Equal                         ==                  eq
Not Equal                     !=                  ne
Less than                     <                   lt
Grather than                  >                   gt
Less than or equal to         <=                  le
Greather than or equal to     >=                  ge
--------------------------------------------------------------------

Putting Commas in Numbers

#!/usr/bin/perl

# Putting Commas in Numbers

sub commify {
    my $text = reverse $_[0];
    $text =~ s/(\d\d\d)(?=\d)(?!\d*\.)/$1,/g;
    return scalar reverse $text;
}

$hits = 39586765342;
$output = "Your web page received $hits accesses last month.\n";
print commify($output);

==> Output: Your web page received 39,586,765,342 accesses
            last month.

How to get size of a file ?

Lookup in File Array, Position 7 for the File Size. The following CGI example reads an prints an image file.

$no_bytes = (stat ($gif_image))[7];    # Size of image
$piece_size = $no_bytes / 10;
print "Content-type: image/gif", "\n";
print "Content-length: $no_bytes", "\n\n";
for ($loop=0; $loop <= $no_bytes; $loop += $piece_size) {
  read (IMAGE, $data, $piece_size);
}
print $data;

Reading char by char with Perl

Getc() reads char by char from STDIN

for ($i = 0; $i < $ENV{'CONTENT_LENGTH'}; $i++) {
   $MyBuffer .= getc;
}

Undefined und defined

Variables are undefined "undef" as long as they are not associated with some data. In the following example, $name becomes undef if we reach end-of-file.

while (defined ($name = <WORDLIST>)) {
  ...
}

Print Hash Elements

This construct is very often used in Perl.

foreach $var (sort keys(%ENV)) {
  print $ENV{$var}\n";
}

Passing by Reference

All the parameters passed to your sibroutine are in the special Perl variable @_. This arry actually references the locations of the passed-in variables.

sub cgi_encode {
    ($str) = @_;  # Passed by Reference
    .....
    return ($str);
}

Passing by Value

Declare the variable as local within the subroutine.

sub cgi_encode {
    local ($str) = @_;   # Passed by Value
    .....
    return ($str);
}

How to start index.html as a CGI-Script for Apache Server ?

Enter in Apache's httpd.conf:

AddType application/x-httpd-cgi index.html

The file index.html is a CGI Perl-Programm with the ending .html which will be executed instead of loaded.

How to get Perl Documentation ?

Use perldoc DBI::DBD to read documentation on Perl-Module DBI::DBD.

How to install Perl Module on NT with Package Manager PPM from ActivState

With each package you need two files from activestate: The .ppd file and the .tar.gz file. The .ppd file is (I presume) a perl package description. It is a text file. The tar.gz is a compressed archive. So, to install the DBI package you would need to download:

DBI.ppd
DBI.tar.gz

The .ppd files are available from the ActiveState.

What you should do is save this in a directory structure as follows:

c:\packages (.ppd files)
c:\packages\x86 (.tar.gz files)

Once you have the .ppd and .tar.gz file in the c:\packages directory structure, you will need to run ppm.pl To do this, change to the directory which holds ppm.pl.

Then run perl package manager. You will see the PPM> prompt. Enter the following command:

set repository local c:\packages

Then enter the search command. You should see a list of the available packages. Use the install command to install a specific package. The installation for each package will potentially bring up a unique set of prompts specific to the package.

Here documents in Perl

If you want to printout HTML-Code, it's better to use a HERE (Inline) document, than to surround each HTML line with a print ("HTML-CODE\n"); If you use print, you have to quote each " to \", which is very boring.

  print<<"HTMLBLOCK"
Plain HTML Code
HTMLBLOCK

Typeglobs and Filehandles

If you want to pass or store filehandles in Perl, then you have to use a typeglob: *

#!/usr/bin/perl -w

my $fh = newopen('/etc/hosts');
my $line;

while (defined ($line = <$fh>)) {
  $length += length($line);
  print ($line);
}
close $fh;

sub newopen {
my $path = shift;
local *FH;   # Not my !
open (FH,$path) || return undef;
return *FH;
}

Read output from OS-program as input in perl-script

Simple use backquotas:

$info = `finger $user`;

Read all files *.txt using the glob() function

Simple use the glob() function for filename globbing. This example also show how to format a report.

#!/usr/bin/perl

# Read all files *.sec using the glob() function
while (defined ($filename = glob("*.txt"))) {

  # Open creates filehandle WORDLIST by associating it with
  # a file named wordlist in the current directory

  open (WORDLIST, $filename) || die "Can't open $filename: $!";

  # Check that wordlist has changed within the last seven days
  if (-M WORDLIST >= 7.0) {
    die "Sorry, the $filename is older than seven days.";
  }

  # Read lines from wordlist files, one line at a time
  # Store each line into the $name variable. At the end of
  # the file we get an empty string.

  while (defined ($name = <WORDLIST>)) { # Read the name in $name

    chomp ($name);

    # Now read the secret word

    $word = <WORDLIST>;
    chomp ($word);

    # Write to STDOUT, formated list (see below)

    write;
  }
  close (WORDLIST);
  print ("\n");
}

#
# Define STDOUT format for the report body
#
format STDOUT =
@<<<<<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<< @<<<<<<<<<<<<<<<<<<<<<<<
$filename, $name, $word
.
#
# Define STDOUT format for the report header
# Page @<< (Page with a three character field definition)
# $%       Number of printed pages
#
format STDOUT_TOP =
Page @<<
$%

Filename                Name                Password
======================= =================== ========================
.

Reading Lines with Continuation Characters

You have a file with long lines split over two or more lines, with backslashes "\" to indicate that a continuation line follows. You want to rejoin those split lines.

#!/usr/bin/perl

while (defined($line = <STDIN>) ) {
    chomp $line;
    if ($line =~ s/\\\s*$//) {
        $line .= <STDIN>;
        redo unless eof(STDIN);
    }
    print ("$line\n");
}

Easy editing with regular expressions (Cookbook)

Auch in der schönen (Computer) Welt der GUI's (Graphical User Interfaces ....) muss man immer noch öfters Files editieren, umstellen, Teile herausschneiden, einsetzen, umdrehen usw. Nach wie vor ist Perl wohl der "Weltmeister" wenn es um diese Aufgabe geht. Im folgenen kleinen Artikel eine Auswahl nützlicher Editier-Möglichkeiten (Regular Expressions).

1). Single Character Patterns
-----------------------------
/./   Matches any single character except newline (\n)
/*/   Zero or more of the immediately previous character
/.*/  Matches any string
/a./  Matches aa, ab, ax, but not a\n

2). Predefined Character Class Abbreviations
--------------------------------------------
digit = 0-9
word  = Any single letter, digit or underscore
special = Matches any spacae, tab, carriage return, newline, formfeed

\d  (a digit)     [0-9]
\w  (a word)      [a-zA-Z0-9_]
\s  (special)     [ \r\t\n\f] 
\D  (not digit)   [^0-9]
\W  (not word)    [^a-zA-Z0-9_]
\S  (not special) [^ \r\t\n\f]

# Cut everything after first word
$aSring = "Martin Zahn";
$aSring =~ s/\W.*//;
==> Martin

# Match < exit>, <\t\texit>, <\n\nexit>
$aSring = "  exit";
if ($aSring =~ /^\s*exit/) {    
  print ("$aSring>\n");
}
==> exit

# Cut <;\n\n\t> (; and following special chars until end of string)
$aSring = "select;   ";
$aSring =~ s/;\s*$//;
==> select

# i = ignore case
$aSring = "    SELECT   ";
if ($aSring =~ /^\s*select/i) {
  print ("$aSring>\n"); 
}
==> SELECT

# Cut all blanks,tabs in front
$aSring =~ s/^\s*//;
==> < SELECT > => <SELECT >

# Cut all leading blanks,tabs
$aSring =~ s/\s*$//;
==> <SELECT > => <SELECT>

3). Grouping Patterns
---------------------
We want to say: "One or more", "Up to five of those" etc

/*/ or /{0,}/  Zero or more of the immediately previous characters
/+/ or /{1,}/  One  or more of the immediately previous characters
/?/ or /{0,1}/ Zero or one of the immediately previous characters

# Cut all "x" and replace them with "Zahn"
$aSring = "Martin xxxxxxxx Seftigen";
$aSring =~ s/x+/Zahn/;
$aSring =~ s/x{1,}/Zahn/;
==> Martin Zahn Seftigen

# Cut all 7"x" and replace them with "Zahn"
$aSring = "Martin xxxxxxxx Seftigen";
$aSring =~ s/x{1,7}/Zahn/;     
==> Martin Zahnx Seftigen

4). Greedy (rightmost as possible: DEFAULT)
-------------------------------------------
Proceed until the last occurrence in string.

$aSring = "a xxx c xxxxxxxxxxxx c xxx dkkkkk";
$aSring =~ s/a.*c//;           
==> < xxx dkkkkk>

5). Non greedy (leftmost as possible)
-------------------------------------
Proceed until first occurrence in string.
Put a "?" after multiplier: "*?")

$aSring = "a xxx c xxxxxxxxxxxx c xxx dkkkkk";
$aSring =~ s/a.*?c//;          
==> < xxxxxxxxxxxx c xxx dkkkkk>

6). Parenthesis as memories
---------------------------
The parenthesis () acts as storages which can be
referenced with \1 ... \n

# The same char on position 2 as on position 1 ?
$aSring = "Martin&Zahn&";
if ($aSring =~ /Martin(.)Zahn\1/) { # This will match
  print ("$aSring\n");  
}
$aSring = "Martin&Zahn*";
if ($aSring =~ /Martin(.)Zahn\1/) { # This will NOT match
  print ("$aSring\n");  
}

# The referenced part can be more than one character
$aSring = "aMARTINbMARTINc";
if ($aSring =~ /a(.*)b\1c/) { # This will match
  print ("$aSring\n");  
}

# Match to chars
$aSring = "Hello World";
if ($aSring =~ /(.)\1/) {    # This matches the double ll !
  print ("$aSring\n");
}

# Swapping two words

# After a successful pattern match, the read-only variables
# $1, $2, $3 etc are set according to \1, \2, \3 and can be
# used further in the program

$aSring = "This is a Test";
$aSring =~ s/(\w+)\s+(\w+)/\2 \1/; #
==> is This a Test

# Put each word in a text in < >

$aSring = "This is a Test";
$aSring =~ s/(\w+)/<$1>/g; # g means global
==> <This> <is> <a> <Test>

7). Alternation
---------------
/a|b|c/ matches a or b or c
/^x|y/  matches x at the beginning of a line or y anywhere

# Parenthesis can be used to group words !
# Note, that this doesn't remenber anything for \1, it's just
# for grouping words.

# Matches songbird or bluebird
$aSring = "bluebird";
$aSring =~ /(song|blue)bird/

8). Grouping
------------
/(a|b)(c|d)/ matches ac ad bc bd
/abc*/       matches abc abcc abccc and so on
/(abc)*/     matches abcabc abcabcabc and so on
/^(x|y)/     matches x or y at the beginning of the line

$aSring = "abcabcabc";
if ($aSring =~ /(abc)*/) {
  print ("$aSring\n"); # This matches !
}

9). Using a different delimiter
-------------------------------
To avoid /\/users\/zahn\/work/ we can use another delimiter
with the modifier m#users/zahn/work#

$aSring = "/users/zahn/work";
if ($aSring =~ m#/users/zahn/work#) {
  print ($aSring\n"); # This matches !
}

10). Word boundary
------------------
If you want "Erik" but NOT "Eriko" then use the \b word boundary

$aSring = "Erik";
if ($aSring =~ m/erik\b/i) {
  print ("$aSring\n"); # This matches !
}
$aSring = "Eriko";
if ($aSring =~ m/erik\b/i) {
  print ("$aSring\n"); # This matches not!
}

Examples

# Get infos from Unix command uptime:
$uptime = `uptime`;
($load_average) = ($uptime =~ /average: ([^,]*)/);
print ("$load_average\n");

# ($load_average) heisst, nimm die gefundenen zeichen auf, wichtig
#                 ist, dass man die Klammern setzt.
#   average:      Suche nach "average:" in $uptime
#   ([^,]*):      Finde alles (* heisst 0 und mehr), bis zum
#                 nächsten Komma ","
#                 Ganz genau gesagt, finde alles, das nicht (^)
#                 einem Komma entspricht, 0 oder mehrmals.

# Cut the hours part from the following String
$aString = "www.akadia.com - - [10/Jan/1999:03:51:11 +0100]";
($hour) = ($aString =~ m|\[\d+/\w+/\d+:([^:]+)|);
print ("$hour\n");

# Search from left to right until the first [
# Search then for digits followed by "/"
# Search then for a String followed by "/"
# Search then for digits followed by ":"
# Save all which is not a ":" in $1

# Remove C comments from source /* ..... */
LINE: while (defined($line = <STDIN>)) {
  next LINE if ($line =~ /^$/); # Skip blank lines
  next LINE if ($line =~ /^#/); # Skip comments
  $file .= $line;
}
$file =~ s {
  /\* # Match the opening delimiter
  .*? # Match a minimal number of characters
  \*/ # Match the closing delimiter
} []gsx;
print ("$file");

# Trim blanks from a string

$aString = " Hello World ";
$aString =~ s/^\s*(.*?)\s*$/$1/;
print ("$aString\n");
==> Hello World

# Reverse 1st two fields

$aString = "one two three four";
$aString =~ s/([^ ]*) *([^ ]*)/$2 $1/;
print ("$aString\n");
==> two one three four

How to retrieve email with POP3 and Perl ?

We used this Tip of the week, to debug one of our POP3 connections to the E-Mail server.

Introduction

As with SMTP, there are modules at CPAN that you can use to get your email from any platform, such as Net::POP3 and Mail::POP3Client.  We recommend that you look at all of the available POP3 modules, and pick the one that meets your needs.  This article will show how to use Net::POP3, which is a module that talks directly to a POP3 server.

What is POP3?

The Post Office Protocol (version 3),  is the most widely used protocol to retrieve email on the Internet. As with SMTP, a client program (your script) will establish a two-way communication with a POP3 server so that you can retrieve your email. Unlike SMTP, POP3 doesn't return any numeric reply codes. The only reply codes are +OK and -ERR.

Is POP3 the only protocol for retrieving email?

There is another way to retrieve and manage email, called the Internet Message Access Protocol, or IMAP. IMAP was designed as a superset of POP3, and overcomes some of its limitations. Because POP3 is so widely used we won't be covering IMAP.

Let's see how to use Net::POP3 to retrieve email from your server.

What is Net::POP3?

The module is available at CPAN as part of the libnet package, written by Graham Barr. Libnet is a package that has a collection of modules pertaining to networking. Here's what it includes:
  

Net::FTP

File Transfer Protocol

Net::NNTP

Network News Transfer Protocol

Net::SMTP

Simple Mail Transfer Protocol

Net::PH

CCSO Nameserver Client

Net::POP3

Post Office Protocol 3

Net::SNPP

Simple Network pager Protocol

Net::Time

Network Time protocols

Net::Domain

Determine your network internet domain (if possible)

Net::POP3 allows you to write Perl scripts that automate the interaction with a POP3 server to retrieve email.  If you use it (or another POP3 module) then you don't really need to know much about the POP3 protocol. We encourage to familiarize yourself with it so you know how it works, but you won't need to memorize anything in order to use Net::POP3.

Create a new POP3 object

As with Net::SMTP, the first step is to create a new object that connects to your server.  If you're not sure what your server is called, check the program that you currently use to read your email, It should look something like: pop.provider.com

Here's an example of how to create the new object:

use Net::POP3;
my $ServerName = "mail.your_isp.com";
my $pop3 = Net::POP3->new($ServerName);
    die "Couldn't log on to server" unless $pop3;

If the connection fails, then there's no point in continuing the mail transaction and that's why I call die(). Although I specify $ServerName in the call to new(), it is optional. If you don't specify it then new() will use the POP3_Host that is specified in Net::Config.  Since we're not covering that here, I'll be specifying the server name in all the example code.

There are two other parameters that are optional.  The first is Debug and the second is Timeout.  The default value for Timeout is 120 seconds, and is the amount of time to wait for a response from the server.  If you want to change that value, you'd set Timeout for the number of seconds you want to wait before you call it quits.

Initiate the mail transaction

Now its time to tell the POP3 server to get ready to retrieve your email.  In order to do that, you'll need to give the server your user name and password for your email account.  Anonymous logins are not allowed.  There are two ways to login to your account.  The first is to call login() with your name and password passed as parameters.  If you don't specify them, then your current username is used.  Net::Netrc will look up the password based on the host and your current user name.  But since that moves into areas we aren't covering here, I'll show you how to use login() with the parameters specified.

my $UserName = "user";
my $Password = "password";
my $Num_Messages = $pop3->login($UserName, $Password);

If the call to login() is successful, then the number of messages in your mailbox is returned.  If it fails, then undef is returned.

Once your have logged in to your email account, you can send any of the POP3 commands, using calls such as list(), popstat(), or delete().  Not sure what commands to use?  That's when it helps to know a little about the protocol.  If you know what it does, you'll know what calls to use.  We'll be working with a few of them when I show you how to get a list of your messages.  I'll show you how to get the list, parse the message header, and print a list of sender name and subject for each message.  That way you can do a quick check of your mailbox to see if there's any new messages, and decide whether or not you'd like to use your email program to download your messages

How to get a list of messages

POP3 specifies a command called LIST that returns the message numbers and the size of each message, like this:

Client:    LIST
Server:    +OK 3 messages (640 octets)
Server:    1 220
Server:    2 200
Server:    3 220
Server:    . ...

You can also call it for a specific message, like this:

Client:    LIST 2
Server:    +OK 2 200

Net::POP3 automates the LIST command with list().  If you call it without specifying a parameter, then  it will return a hash reference.  The keys are the numbers of all undeleted messages, and the value is the size of the message.  This is similar to the list shown in the first example above.

my $Messages;
$Messages = $pop3->list();

See?  Its pretty painless.  Now let's loop through the list to see what we have.  We'll be using a method called top(), which retrieves the header and a specified number of lines in the body of the text (the default is 10).  top() automates the POP3 command TOP, which does the same thing.  The header consists of the fields described in RFC 822, and can be quite long.  We're going to write a subroutine that will extract the information in the "From" and "Subject" fields, but first, let's look at how the loop works.

my $msg_id;

foreach $msg_id (keys(%$Messages))
{
    my $MsgContent = $pop3->top($msg_id);
    PrintList(@$MsgContent);
}

Remember that list() used without parameters returns a hash reference with the message number and size of the message in the keys and values, respectively.   If we send the message number ($msg_id) to top(), we'll get back a reference to an array containing the header and 10 lines of the message body.  In this example we won't be looking at the message body, just the header.  The syntax may look a little different because we're using references, but we're not going to be talking about that here.  Just remember that its a little different than you may have seen, and don't worry about it for now.  We'll tackle that subject in another article.

How to parse the message header

The steps for the PrintList subroutine are simple.  We'll look for "From:" and "Subject:" at the beginning of a line, and extract a substring from that line.  When we have values for both fields, we'll quit the loop, print the values, and repeat these steps for all the messages.  That way you'll have a quick and dirty way to see what's in your mailbox.  Here's the PrintList subroutine.

sub PrintList
{
    my (@lines) = @_;
    my ($from, $line, $subject);
   ...
}

The first step is to assign the parameter to a local variable.  Following that, we declare the local variables we'll need to get the job done.  The next step is to cycle through each line in the header, looking for "From" and "Subject".  When we find "From: ", we'll take whatever characters follow it and assign it to $from.  I set a limit of 40 characters, since that should be enough to know who sent the message.

foreach $line (@lines)
{
  if($line =~ m/^From: (.*)/) # if we find "From: "
    {
      $from = $1;  # Let's save what follows "From:"
      $from =~ s/"|<.*>//g; # Clean up the string
      $from = substr($from, 0, 39); # Let's keep it to 40 chars
    }
    ...
}

We do a little cleanup work on the string, because the "From: " field can contain the sender's name as well as the email address.

If there's a name, we'll use that.  If not, then we'll take off the angle brackets '<>' from the email address.  In either case, the result will be easy to read.  We'll do pretty much the same thing for the "Subject: " field.

foreach $line (@lines)
{
  if($line =~ m/^From: (.*)/)
  {
      ...
  }
  elsif( $line =~ m/^Subject: (.*)/)
  {
     $subject = $1;
     $subject = substr($subject, 0, 29);
  }
  last if( defined($subject) && defined($from) );
}

However, this time we won't try to clean up the subject string since, obviously, we don't want to lose any valuable information. 

Finally, we quit the loop if we have both the "From: " and "Subject: " fields.  Once we have those there's no point in continuing to loop through the header.

Closing a Net::POP3 connection

Once we have returned from PrintList, all we have left to do is close the connection to the server.  Its as simple as:

$pop3->quit();

Summary

This was an overview of how to use Net::POP3. I encourage you to learn more about the POP3 protocol and to use the knowledge when you use this module.  For instance, if you know that POP3 defines a command to download messages to your computer, you'll know that the method get() is the one to use in order to do this.

The example I presented was quite simple, but can be expanded to suit your needs.  For instance, you could also look for the "Date: " field and add that to the PrintList subroutine. Or, you could list all your messages, delete the ones you know you won't read, and use your email program to download the rest.

Here is the complete script showing how to use Net::POP3.

#!/usr/local/bin/perl

#=================================
# How to read email with Net::POP3
#=================================
# This script is designed to show
# how to use Net::POP3 to read your
# email from a Perl script.
#=================================

use strict;
print "Content-type: text/plain", "\n\n";
use Net::POP3;

# This debug flag will print debugging code to your 
# browser, depending on its value
# Set this to 1 to send debug code to your browser.
# Set it to 0 to turn it off.


my $DEBUG = 1;

if($DEBUG)
{
   $| = 1;
   open(STDERR, ">&STDOUT");
}

# Set this variable to your POP3 server name
my $ServerName = "mail.your_isp.com";

# Create a new POP3 object
my $pop3 = Net::POP3->new($ServerName, Debug => 1);

# If you can't connect, don't proceed with the rest of the script
die "Couldn't log on to server" unless $pop3;

# Initiate the mail transaction
my $UserName = "user";
my $Password = "password";

my $Num_Messages = $pop3->login($UserName, $Password);

# Get the list of messages
my $Messages;
my $msg_id;

$Messages = $pop3->list();

# Parse each message header for "From" and "Subject" fields
foreach $msg_id (keys(%$Messages))
{
    my $MsgContent = $pop3->top($msg_id, 20);
    PrintList(@$MsgContent);
}

# Close the connection
$pop3->quit();

# This subroutine parses the "From" and "Subject"
# fields of a message header

sub PrintList
{
 # Assign parameter to a local variable
 my (@lines) = @_;

 # Declare local variables
 my ($from, $line, $subject);

 # Check each line in the header
 foreach $line (@lines)
 {
   if($line =~ m/^From: (.*)/)
   {
      # We found the "From" field, so let's get what we need
      $from = $1;
      $from =~ s/"|<.*>//g;
      $from = substr($from, 0, 39);
   }
   elsif( $line =~ m/^Subject: (.*)/)
   {
      # We found the "Subject" field, so let's get what we need
      $subject = $1;
      $subject = substr($subject, 0, 29);
   }
   # If we have parsed the "From" and "Subject"
   # field, then we don't need to keep on going.
   # Let's quit the loop here.

   last if( defined($subject) && defined($from) );
  }

  # Print the result
  printf "From: %-40s Subject: %s\n", $from, $subject;
}# end: PrintList()

How do I send email from a perl script ?

One of the most common question that is asked "How do I send email from my script?"  You'll frequently find that this question is answered by suggesting you use mail programs like sendmail.  However, if your system doesn't have sendmail, then what do you do?

Fortunately, there are modules at CPAN that you can use to send email from any platform, such as Net::SMTP, Mail::Mailer, Mail::Sender, and Mail::Sendmail. This article will show how to use Net::SMTP, which is a module that talks directly to an SMTP server.

What is SMTP?

The Simple Mail Transfer Protocol is the most widely used protocol to send messages on the Internet. Basically, it establishes a two-way communication between client and server Mail Transfer Agents (MTAs).  If you send an email message to someone, it will probably have to travel through one or more mail servers before it reaches its destination. The servers use SMTP as a standard way to communicate with each other in order to deliver the message.

Why use SMTP for sending email?

If you are using a platform that has a mail program such as sendmail, you may be wondering why you are encouraged to learn more about SMTP.  After all, if your scripts are correctly configured and they work just fine,  why not stick with what works?  Well, suppose you decide to release a script on the internet, or you have a client that would like you to write an email script on a platform you haven't used before.  Would you know how to do it?  You would if you knew about SMTP and the Perl modules that use it.

In our opinion, its always best to strive to write code that is platform independent. That way your scripts will run on any machine without having to change the code. Yes, there will always be configuration issues, such as specifying the name of the SMTP server, but that's not the same as having to rewrite part of your script to accommodate another operating system. Doing that invites trouble because every code change introduces the possibility of adding a bug to your program. If there are options available that will work on many platforms you'll be better off using them.

Let's see how to use Net::SMTP to send email from your script.

What is Net::SMTP?

The module is available at CPAN as part of the libnet package, written by Graham Barr.  The nice thing about  libnet is that it is a collection of modules pertaining to networking.  Graham has done a great job of implementing the client side of some common protocols, as shown in the list below.  Its a good solution not only for email, but also for doing additional projects like file uploading or downloading with FTP.  
 
Here's what it includes:
 

Net::FTP

File Transfer Protocol

Net::NNTP

Network News Transfer Protocol

Net::SMTP

Simple Mail Transfer Protocol

Net::PH

CCSO Nameserver Client

Net::POP3

Post Office Protocol 3

Net::SNPP

Simple Network pager Protocol

Net::Time

Network Time protocols

Net::Domain

Determine your network internet domain (if possible)

Net::SMTP allows you to write Perl scripts that can interact with an SMTP server to send email.  Note that Net::POP3 is the module to use if you want to receive email from a POP3 server.

How to create a new SMTP object with Net::SMTP

This is a simple step that creates an SMTP object that connects to your server.  If you don't remember the name of your SMTP server, check your email program to see what it is. It should look something like: smtp.your_isp.com

Here's how to create a new SMTP object:

my $ServerName = "smtp.your_isp.com";

# Connect to the server
$smtp = Net::SMTP->new($ServerName);
   die "Couldn't connect to server" unless $smtp;

Be sure to verify that the connection to your server is successful.  If it isn't then there's no point in trying to continue sending the email, and that's why we included the call to die().

There are a few optional parameters you can use when you call new(). The first is Debug, which will give you information about your connection if you set it to 1. The second parameter is Timeout, which is the number of seconds your script will wait before it stops trying to establish a connection.  The default is 120 seconds.  There is a third, Hello, which will send a HELO command to the server, if you ever  need to specify your domain.

Initiate the mail transaction

Now its time to tell the SMTP server to get ready for the message.

my $MailFrom = "someone\@sender.com";
my $MailTo = "someone_else\@recipient.com";

$smtp->mail( $MailFrom );
$smtp->to( $MailTo );

What these two lines of code do is handle all the background SMTP communication so you don't have to. In a nutshell, your script is sending the MAIL command to the server, which readies it for receiving your message.  The value for $MailFrom is the "real" address of the sender, not the one that appears in the message that the recipient receives. SMTP builds a full reverse-path for a message, starting with the client host (you).  The reverse-path is updated each time your message passes through another SMTP server, so $MailFrom needs to specify your actual email address. You can specify the name that appears in the "From:" and "To:" fields in the body of the message.

Your script then sends the RCPT TO command, which gives the server the address where your message should be sent.  Again, this is the "real" address, not the address that shows up in the email message.  This is handy if you are sending email to a list of recipients, but you don't want all of their names sent to all the recipients.  In the message body, you can specify their actual address, or use a value such as "mailing-list@mydomain.com".  The server doesn't care what shows up in the message body, but it will have to have the actual email address(es) in order to send the message.  Therefore, the value of $MailTo will have to contain the correct address.

Once the server has these pieces of information, it is ready to receive your message.  Let's take a look at how to send it.

How to send a message with Net::SMTP?

Net::SMTP has a method called data() that starts to send a message to the server.  The next step is to call it, and you can do it in one of two ways.

There is an optional parameter to data() that can be either a list or a reference to a list. If you call data with this parameter specified, then that's all you have to do to send the message.  The call to data() takes care of the complete SMTP transaction, including sending your data and also sending the termination string to the server. This termination string is .\r\n (a dot on a line by itself), and let's the server know that there isn't any more data to be sent.  Here's an example of using data() this way:

$smtp->data("Hello World!");

If you don't specify a list with data(), then you'll have to use two other methods: datasend() and dataend().  When used in this way, data() simply tells the server that your script is ready to start sending data. Dataend() will send the termination string when you're done.  The message is actually sent to the server with datasend(), which you can use as many times as you need to before calling dataend().  Here's how to use the second method:

# Start the mail
$smtp->data();

# Send the message
$smtp->datasend("Hello World!\n\n");

# Send the termination string
$smtp->dataend();

Personally, we like the second method better, because its easier to split out important lines in the message, like the "From:", "To:", and "Subject:" fields. You can cram all of those into a list, but its harder to read. If code is easy to read and understand, its easier to debug and maintain.

How to specify header fields in your email message?

# Start the mail
$smtp->data();

# Send the header.
$smtp->datasend("To: mailing-list\@mydomain.com");
$smtp->datasend("From: MyMailList\@mydomain.comt\n");
$smtp->datasend("Subject: Updates To My Home Page\n");
$smtp->datasend("\n");

# Send the message
$smtp->datasend("Here are all the cool new links...\n\n");

# Send the termination string
$smtp->dataend();

When you're finished sending your message, call quit() to close the connection to your server.

$smtp->quit();

How to debug with Net::SMTP

There is an optional parameter to new() that can help you if you need to debug a problem sending email. If you set Debug to 1, then the output of the connection to your SMTP server will be printed.  The output consists of information about the Net::SMTP modules followed by the entire mail transaction. You'll be able to see what information is sent to and from your server, including the reply codes. Those can help you determine what the problem is and how to fix it.

$smtp = Net::SMTP->new($ServerName, Debug => 1);

If you're working with a CGI script, you'll need to have the following lines near the top of your script:

$| = 1;
open(STDERR, ">&STDOUT");

This will send the output to your browser so you can see exactly what is happening during your mail transaction. If you're running your script from the command line, the output will print to the screen.

Summary

As you can see, using Net::SMTP is an easy way to write a script that sends email without worrying about which platform or external mail program to use.  Although it helps to know a little about SMTP, you don't have to be an expert in order to send email.

Example Script

Here is the complete script showing how to use Net::SMTP.

#!/usr/local/bin/perl

# ================================
# How to send email with Net::SMTP
# ================================
# This script is designed to show
# how to use Net::SMTP to send
# email from a Perl script.
# ================================

use Net::SMTP;
print "Content-type: text/plain", "\n\n";

# This debug flag will print debugging code to your
# browser, depending on its value. Set this to 1 to send
# debug code to your browser.  Set it to 0 to turn it off.

my $DEBUG = 1;

if($DEBUG)
{
   $| = 1;
   open(STDERR, ">&STDOUT");
}

# Set this variable to your smtp server name
my $ServerName = "smtp.your_isp.com";

# Create a new SMTP object
$smtp = Net::SMTP->new($ServerName, Debug => 1);

# If you can't connect, don't proceed with the rest of the script
die "Couldn't connect to server" unless $smtp;

# Initiate the mail transaction
# Your "real" email address

my $MailFrom = "someone\@sender.com";

# Recipient's "real" email address
my $MailTo = "someone_else\@recipient.com";

$smtp->mail( $MailFrom );
$smtp->to( $MailTo );

# Start the mail
$smtp->data();

# Send the header
# This address will appear in the message

$smtp->datasend("To:  mailing-list\@mydomain.com\n");

# So will this one
$smtp->datasend("From:  MyMailList\@mydomain.com\n");
$smtp->datasend("Subject: Test Message\n");
$smtp->datasend("\n");

# Send the body.
$smtp->datasend("Hello World!\n\n");

# Send the termination string
$smtp->dataend();

# Close the connection
$smtp->quit();

How Perl finds modules and libraries ?

Introduction

What do I do when my script can't find a module I know is installed?". Good question.  Let's look at how to solve that problem. There are two ways that you can find out what Perl is doing when it looks for modules.  One is to print @INC, which is an array that Perl uses to determine the directories to search for libraries and modules.  The other is print %INC, which is a list of the actual modules Perl has loaded from the environment.  If your script can't find a particular module or library, you can save yourself hours of work by simply looking at the values Perl is using.  Let's see how its done.

The code to print @INC is simple:

print "@INC";

Or, you can do this from the command line with:

perl -e "print @INC";

Either way, you'll get a list of directories that Perl will use to search for modules.  The directories in the list are in the order that Perl will use when it searches.  On my computer, the list is as follows:

C:/Perl/lib C:/Perl/site/lib . C:\WEBDEV\CGI-BIN

This means that Perl will first look for a module in "C:/Perl/lib".  If it doesn't find it, then it will move to "C:/Perl/site/lib".  The "." means that it will search in the current directory, which is the one I'm using to run the script.  (It will change when I run scripts in different directories).  Finally, if all else fails, Perl will look for modules in "C:\WEBDEV\CGI-BIN", which is the directory I use for developing my CGI scripts.  If the module isn't in one of those directories, Perl will give up after printing the message:

Can't locate MyModule.pm in @INC at line

The directories may be different on your computer, but whatever they are, the message will be the same for you.  Or worse, you'll get the dreaded "Internal Server Error" message in a browser window, and you'll have no idea why.

To solve this problem, first comment out the line that loads the module, and then print @INC to get a list of the directories Perl is searching.  That may be enough information to figure out what's wrong and fix it.  If it isn't, then print %INC to see what modules Perl is loading from the environment.  We'll see how in the next section.

Debug @INC

  • Comment out the line that loads the module

#use MyModule;

  • Print the values @INC

print "\@INC = @INC\n";

If the module is in a directory that isn't listed you have two choices.  Either move it to one that is, or modify @INC.

How to see what modules Perl is loading ?

Here's the code you'll need to see the list of modules Perl loads when it runs your script:

my $key;
foreach $key (sort keys(%INC)) {
  print "$key => $INC{$key}\n";
}

If you have the following code,

use MyModule;
my $key;
foreach $key (sort keys(%INC)) {
  print "$key => $INC{$key}\n";
}

you'll see these results:

MyModule.pm => C:/Perl/lib/MyModule.pm

The advantage here is that you'll see exactly what Perl sees without having to look through directories yourself.  Using %INC is extremely helpful in cases where you have written a module and its giving you weird errors.  It may be that you have inadvertently given your module the same name as one included in the central library. Remember that Perl will use the first module that it finds in the search path, which may not be the module you wrote.  The only way you'd know for sure is to print %INC.

Now that you know how to determine what information Perl is using when it looks for and loads a module, let's learn how to modify @INC.  Knowing this will allow you to gain control over the directories you can use when you want to load libraries or modules in your scripts.

How to modify @INC

The code is simple: use lib "my/path/to/modules";

The "use lib" part always stays the same, but you'll have to modify "my/path/to/modules" to match your path.  Of course, the line of code will have to come before you try to load a module, like this:

use lib "C:\\MyModuleDirectory";
use MyModule;

If you then print both @INC and %INC, you'll see something like:

@INC = C:\MyModuleDirectory C:/Perl/lib C:/Perl/site/lib . C:\WEBDEV\CGI-BIN

MyModule.pm => C:\MyModuleDirectory/MyModule.pm

As you can see, "use lib" will add the directory that you specify as the first location to be searched.