[wplug] pnc history

bpmedley at 4321.tv bpmedley at 4321.tv
Thu Aug 30 15:05:08 EDT 2001


A while back I mentioned that I had written a Perl program that logs in to
PNC and gets your account history.  Since then I have made a few
improvements.  The improved version is attached.  Below are some of the
changes:

- added ability to print account summary.
- added ability to output to a text file.
- output filenames are user controllable.  The user supplies a filename
  format that is converted into the actual filename, much like printf(3).
  The format is used when the output is set to a .qif or a text file.
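
To give a feel for the filename format, here is a minimal standalone sketch
of the expansion.  It is not the code from the script itself; the helper name
expand_fformat is made up for illustration, and it only knows the two
specifiers the script documents (%n for the account name, %d for the date).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use POSIX qw(strftime);

# Expand %n and %d in a filename format.
# expand_fformat is a hypothetical helper, not a routine from the script.
sub expand_fformat {
    my ($format, $account, $date) = @_;

    # account name: lowercase, spaces turned into underscores
    (my $name = lc $account) =~ s/ /_/g;
    my %spec = (n => $name, d => $date);

    # replace every %X, dying on a specifier we don't know about
    (my $file = $format) =~ s{%(.)}{
        exists $spec{$1} ? $spec{$1} : die "Unsupported format specifier ($1)\n"
    }ge;

    return $file;
}

my $today = strftime("%Y-%m-%d", localtime);
print expand_fformat('%n_%d.txt', 'Interest Checking', $today), "\n";
print expand_fformat('%n_%d.qif', 'Statement Savings', '2001-08-30'), "\n";
# the second line prints: statement_savings_2001-08-30.qif
```

An unknown specifier such as %x makes the sketch die rather than silently
produce an odd filename.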

For help on getting started please run the program with the -man option.

As before, this program requires some perl modules that are usually not
installed by default.  Even more unfortunately, I can't remember the
exact modules that I needed to install to write it.  Sorry.  Here are a few
that I remember:

LWP       - Library for WWW access in Perl
AppConfig - Module for reading configuration files and parsing command
            line arguments
Maybe Net::SSL for https urls.

If this is a problem for you, and you want to use this, please let me know
and I'll put some more time into figuring out what modules were required.
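
In the meantime, a rough way to find out what is missing on your machine is
to try loading each module.  The sketch below is only a suggestion: the list
is taken from the "use" lines at the top of the script, Net::SSL is included
as a guess for https support, and missing_modules is a made-up helper name.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Modules named by the script's "use" lines, plus Net::SSL as a guess
# for https urls.
my @modules = qw(
    AppConfig HTTP::Cookies HTTP::Request::Common HTML::Form
    LWP::UserAgent Pod::Usage POSIX Net::SSL
);

# Return the names from @_ that cannot be loaded.
# missing_modules is a hypothetical helper for this sketch.
sub missing_modules {
    my @missing;
    foreach my $module (@_) {
        # turn Foo::Bar into Foo/Bar.pm so require can take it at run time
        (my $file = "$module.pm") =~ s{::}{/}g;
        push @missing, $module unless eval { require $file; 1 };
    }
    return @missing;
}

my @missing = missing_modules(@modules);
if (@missing) {
    print "Missing: $_\n" foreach @missing;
} else {
    print "All modules found.\n";
}
```

Anything reported as missing would then need to be installed from CPAN
before the script will run.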

#! /usr/bin/perl -w

# $Header: /home/bmedley/docs/RCS/grab_pnc_history.pl,v 1.14 2001/08/30 19:01:45 bmedley Exp $ 
# This program will go to PNC's web page and retrieve the user's account
# history.
# Version:
#   $Revision: 1.14 $
#   $Date: 2001/08/30 19:01:45 $
#   Fix error messages when using AppConfig.
#     - don't print out error messages when rc file not found
#     - print out better error messages when supply wrong options

use strict;

use AppConfig;
use HTTP::Cookies;
use HTTP::Request::Common;
use HTML::Form;
use LWP::UserAgent;
use Pod::Usage;
use POSIX qw(strftime);

# use Data::Dump qw(dump); 
# use LWP::Debug qw(+);

my %cmdargs;
my $ua;

# An array of hashes:
# transactions[x]{date}
# transactions[x]{check_num}
# transactions[x]{withdrawal}
# transactions[x]{deposit}
# transactions[x]{desc} <- array of lines
my @transactions;

# A hash of hashes describing your accounts
# accounts{name}{url_snippet}
# accounts{name}{number}
# accounts{name}{balance}
my %accounts;

# used when outputting a .qif file or when writing the history to a file.
my $output_file;

sub init;
sub get_history;
sub print_summary;

sub print_history_text;
sub print_history_qif;
sub print_history_none;



{
    init;
    get_history;
    print_summary;

    eval "&print_history_$cmdargs{output}"; die 
        "Can't print history using: print_history_$cmdargs{output} ($@)\n" if $@;

    exit 0;
}

sub error;
sub verify;

# Setup our environment
sub init
{
    my ($config, $filepth);
    my $cookie;

    $filepth = "$ENV{HOME}/.grab_pnc_historyrc";

    # don't print out error messages (internal to AppConfig):
    # (like config file not found)
    $config = AppConfig->new({ERROR => sub {;}, GLOBAL => {DEFAULT  => 0}});
    # print out error messages:
    # $config = AppConfig->new({GLOBAL => {DEFAULT  => 0}});

    $config->define ("debug",   {ARGS => "!"});
    $config->define ("date",    {ARGS => "=s"});
    $config->define ("userid",  {ARGS => "=s", DEFAULT => undef});
    $config->define ("passwd",  {ARGS => "=s", DEFAULT => undef});
    $config->define ("summary", {ARGS => "=s", DEFAULT => undef});
    $config->define ("output",  {ARGS => "=s", DEFAULT => "text"});
    $config->define ("account", {ARGS => "=s", DEFAULT => "Interest Checking"});
    $config->define ("fformat", {ARGS => "=s", DEFAULT => '%n_%d.txt'});
    $config->define ("help");
    $config->define ("man");

    # give command line options precedence over config file
    # the nobundling was required because -account wasn't working.  don't
    # know why...
    $config->getopt(qw(nobundling), \@ARGV) or pod2usage(1);

    pod2usage(1) if $config->help;
    pod2usage(-verbose => 2) if $config->man;

    error "You must specify a userid." if not defined $config->userid;
    error "You must specify a passwd." if not defined $config->passwd;

    # i like the ease-of-use of AppConfig, but I would rather access the options like
    # variables, not subroutine calls.
    %cmdargs = ();
    %cmdargs = $config->varlist(".");

    # check to make sure we have work to do
    if ("none" eq $cmdargs{output} && 
        (!defined $cmdargs{summary} || "none" eq $cmdargs{summary})) {
        die "You told me not to output any history.\n" . 
            "You also don't want a summary.\n" . 
            "Please give me something to do next time.\n";
    }

    # the javascript would have stripped out the dashes and spaces
    $cmdargs{userid} =~ s/-//g;
    $cmdargs{userid} =~ s/\s//g;

    verify "summary", $cmdargs{summary}, ("same", "all", "none");
    verify "output",  $cmdargs{output},  ("text", "qif", "file", "none");

    # generate the user-requested file
    if ("file" eq $cmdargs{output} || "qif" eq $cmdargs{output}) {
        my %fformat;
        my @fformat;

        # setup our text to insert inside of format specifiers
        $fformat{n} = $cmdargs{account};
        $fformat{n} =~ s/ /_/g;
        $fformat{n} =~ y/A-Z/a-z/;
        $fformat{d} = strftime ("%Y-%m-%d", localtime);

        # see what format specifiers the user gave us
        while ($cmdargs{fformat} =~ /%(.)/g) {
            push @fformat, $1;
        }

        # process the format specifiers to make the filename
        $output_file = $cmdargs{fformat};
        foreach (@fformat) {
            die "Unsupported format specifier ($_)\n" if not defined $fformat{$_};

            $output_file =~ s/%$_/$fformat{$_}/;
        }

        if ($cmdargs{debug}) {
            print STDERR "Filename to write to is: $output_file\n";
        }
    }

    # To accomplish writing to a file we set the default filehandle to what the user
    # requested (from the fformat command-line option).
    if ("file" eq $cmdargs{output}) {
        # open the generated filename
        open FILE, "> $output_file" or die "open ($output_file): $!\n";
        select FILE;
        $cmdargs{output} = "text";
    }

    # setup our "browser"
    $cookie = './cookie_jar.txt';
    unlink $cookie;

    $ua = LWP::UserAgent->new;

    $ua->cookie_jar( HTTP::Cookies->new(
                file     => "$cookie",
                autosave => 1 )
    );

} # end init()

# This routine prints an error message to STDERR and exits with the supplied error code.
# This routine will pad the message with "error: " and output a newline.
sub error
{
    my $msg = shift;
    my $code = shift;

    # default if not given an error code
    $code = 1 if not defined $code;

    print STDERR "error: $msg\n";

    exit $code;
} # end error()

# This routine makes sure that the value we are given is acceptable.
sub verify
{
    my $option = shift;
    my $value = shift;
    my $is_ok;

    $is_ok = "no";

    # see if what the user gave us is allowed
    foreach (@_) {
        if ($_ eq $value) {
            $is_ok = "yes";
        }
    }

    if ("yes" eq $is_ok) {
        return;
    }

    # else the user specified an invalid argument
    print "Value '$value' for the '$option' option is invalid.\n";
    print "We currently support:\n";
    foreach (@_) {
        print "\t$_\n" if "" ne $_;
        print "\t<empty string>\n" if "" eq $_;
    }

    exit 2;
} # end verify()
sub parse_accounts;
sub parse_history;

# This subroutine goes to pnc's web page and dl's the user's history.
# It's highly sensitive to pnc's setup.  One reason for this is that pnc uses javascript
# in their forms, which is not easily used in perl.
sub get_history
{
    my ($request, $response);
    my ($which_request);
    my $form;
    my $url;
    my $i;

    # login
    if ($cmdargs{debug}) {
        print STDERR "Trying to login to pnc with userid: $cmdargs{userid}.\n";
    }
    $request = HTTP::Request->new(GET => 
    $response  = $ua->request($request);

    $form = HTML::Form->parse( $response->content, $response->base());

    # give them the username/password
    $form->value( 'UserID',   "$cmdargs{userid}" );
    $form->value( 'Password', "$cmdargs{passwd}" );

    # the form does not have a submit button.  it uses javascript.  we have to add a submit
    # button.
    $form->push_input ( 'submit', {value=>"submit",name=>"submit"} );
    $response = $ua->request( $form->click('submit') );

    # get the accounts
    if ($cmdargs{debug}) {
        print STDERR "Retrieving information for your $cmdargs{account} account.\n";
    }

    $request = HTTP::Request->new(GET => 
    $response = $ua->request ($request);

    parse_accounts $response->content_ref;
    if (not defined $accounts{$cmdargs{account}}) {
        die "Can't find account info for $cmdargs{account}.\n";
    }

    # no reason to get the history if they don't want it..:)
    if ("none" eq $cmdargs{output}) {
        return;
    }

    # get the history
    if ($cmdargs{debug}) {
        print STDERR "Obtaining history from pnc.\n";
    }

    $url = 'https://www.accountlink.pncbank.com/Accounts/Deposit';
    $url .= '/depositDetail.jsp?selectedPage=0&accountID=';
    $url .= $accounts{$cmdargs{account}}{url_snippet};
    $url .= '&More=INIT&Sort=DEFAULT&Page=0';
    $request = HTTP::Request->new(GET => $url);

    $response = $ua->request ($request);

    parse_history $response->content_ref;
} # end get_history()

# This routine takes the raw account information data from pnc (i.e. the HTML) and turns
# each entry into an element in the account hash
sub parse_accounts
{
    my $account_data = shift;
    my $entry;
    my $type;
    my @elements;

    while ($$account_data =~ /document.writeln\(buildRow\((.*)\)\);/mg) {
        $entry = $1;
        $entry =~ s/'//g;

        @elements = split /, /, $entry;

        $type = shift @elements;
        $accounts{$type} = {};

        ($accounts{$type}{url_snippet}, $accounts{$type}{number}, $accounts{$type}{balance}) = @elements;
    } # end parsing HTML from pnc

    if ($cmdargs{debug}) {
        print STDERR "We found the following accounts:\n";

        # sort by account name (comparing the hash refs with eq was not a
        # valid sort comparison)
        foreach my $key (sort keys %accounts) {
            print STDERR "$key\n";
        }
        print STDERR "\n";
    }

} #end parse_accounts ()

# This routine takes the raw history data from pnc (i.e. the HTML) and turns each entry
# into an element in our transactions array.
sub parse_history
{
    my $history_data = shift;
    my $entry;
    my $date;
    my $i;

    if ($cmdargs{debug} && $cmdargs{date}) {
        print STDERR "Transactions before $cmdargs{date} will be ignored.\n";
    }

    while ($$history_data =~ /document.writeln\(buildRow\((.*)\)\);/mg) {
        $entry = $1;
        $entry =~ s/'//g;

        # convert dates into iso 8601 date format
        $entry =~ s/(.*?),\s*(.*)/$2/;
        $date  = $1;
        $date  =~ m#(\d\d)/(\d\d)/(\d\d\d\d)#;
        $date  = "$3-$1-$2";

        # only give the user the date range they want
        if ($cmdargs{date}) {
            return if $date lt $cmdargs{date};
        }

        # initialize our data structure
        $i = scalar @transactions;
        $transactions[$i] = {};
        $transactions[$i]{desc} = [];

        # store the values in our data structure
        $transactions[$i]{date} = $date;
        ($transactions[$i]{check_num}, $transactions[$i]{withdrawal}, $transactions[$i]{deposit},
        $transactions[$i]{desc}[0], $transactions[$i]{desc}[1]) = split /, /, $entry;

        # if there was only one line in the description
        $#{$transactions[$i]{desc}}-- if $transactions[$i]{desc}[1]=~ /\&\#160/;

        # tidy up check output
        $transactions[$i]{desc}[0] =~ s/^CHECK\s+(\d+)\s+(\S+)/CHECK $1 $2/;
    } # end parsing the html from pnc

} # end parse_history()

# This routine will print the user requested summaries.
sub print_summary
{
    my @account_keys;
    # leave as fast as possible if the user doesn't want anything
    return if not defined $cmdargs{summary};
    return if "none" eq $cmdargs{summary};
    # determine what account summaries the user requested
    if ("all" eq $cmdargs{summary}) {
        # sort by account name
        @account_keys = sort keys %accounts;
    } else {
        @account_keys = ($cmdargs{account});
    }

    print "Account Summary:\n\n";
    foreach (@account_keys) {
        printf "%-30s%-29s%-s\n", $_, $accounts{$_}{number}, $accounts{$_}{balance};
    }

    # provide a visual separator if we're going to print the history out
    if ("text" eq $cmdargs{output}) {
        print "\n", "=" x 80, "\n\n";
    }
} # end print_summary()

# This routine prints out the transactions array as text.
sub print_history_text
{
    my $line;

    if (0 == @transactions) {
        print "No transactions recorded.\n";
    }
    foreach (@transactions) {
        # we don't print check_num b/c it's in the description

        printf "%s   %-46s%-15s%-s\n", $_->{date}, $_->{desc}[0], $_->{withdrawal}, $_->{deposit};
        # if the description is multi-line this takes care of it
        foreach $line (1 .. $#{$_->{desc}}) {
            print "             $_->{desc}[$line]\n";
        }
        # this is so the user can add their own description.  the spaces are so they can
        # hit 'A' in vi and be lined up with the others.
        print     "             \n";
    }

} # end print_history_text()

# This routine generates a qif file from our transactions array.  It's not tested a
# great deal, because I can't find a good program in Linux that will import qif
# files.
# example for qif files:
# http://www.intuit.com/quicken/technical-support/quicken/old-faqs/dosfaqs/60006.html
sub print_history_qif
{
    my $QIF = $output_file;
    my $orig_fh;
    my $amt;
    open QIF, "> $QIF" or die "open ($QIF): $!";
    $orig_fh = select;
    select QIF;

    # the account-type header belongs once at the top of the file, not
    # before every transaction
    print "!Type:Bank\n";

    foreach (@transactions) {
        print "D$_->{date}\n";
        $_->{check_num} =~ s/\s*//g;
        if ($_->{check_num} =~ /^\d+$/) {
            print "N$_->{check_num}\n";
        }
        $amt = $_->{withdrawal};
        if ($amt =~ /^\$/) {
            # we have a withdrawal
            $amt =~ s/\$/-/;
        } else {
            # we have a deposit
            $amt = $_->{deposit};
            $amt =~ s/\$//;
        }

        print "T$amt\n";
        print "P$_->{desc}[0]\n";
        print "M$_->{desc}[1]\n" if defined $_->{desc}[1];

        print "^\n";
    }

    close QIF;
    select $orig_fh;
} # end print_history_qif()

# this is here so that the user can print only the summaries
sub print_history_none
{
} # end print_history_none()


=head1 NAME

grab_pnc_history.pl - Go to PNC's web page and retrieve the user's account history

=head1 SYNOPSIS

B<grab_pnc_history.pl> [options]

=head1 OPTIONS

=over 4

=item B<-help>

Prints some help and exits.

=item B<-man>

Prints the manual page and exits.

=item B<-date>

Used to specify a starting date when producing output.  No transaction before
this date will be output.  

date should be specified in the ISO 8601 format (e.g. 2001-08-04).

=item B<-debug>

Prints some (hopefully) helpful messages during execution to figure out
what's going on.

=item B<-userid>

The user id you login to pnc with.

=item B<-passwd>

The password for that user id.

=item B<-account>

The account you want to get information for.  The default is "Interest Checking".
Known values are:

    Interest Checking
    Regular Checking
    Statement Savings

=item B<-summary>

Use this to print out a summary for accounts.

This option takes the following arguments:

    same - the summary for the account listed in the -account option is used
    all  - prints out summaries for all accounts
    none - no summary printed

=item B<-output>

What format you want the history output in.  

This option takes the following arguments:

    text - print data to screen
    qif  - save output to a qif file
    file - save output to a text file
    none - no history is printed.  useful to only get a summary

The default is text.

=item B<-fformat>

This option controls how filenames are generated.  It is used when the "file" or "qif"
output option is present.  It works similarly to "printf(3)", that is, format specifiers
are used to generate the file name.  Below is a list of the supported specifiers:

    %n - account name.  it will be in lowercase and spaces are turned into underscores
    %d - the date that the file is made (i.e. the time the script is run).
         this is in ISO 8601 format.

The default is "%n_%d.txt".

=back

=head1 DESCRIPTION

This program will go to PNC's web page and retrieve the user's account history.


This program accepts information from the command line and from a
configuration file.  The command line overrides values placed inside the
configuration file.  At present the configuration file is:

    $HOME/.grab_pnc_historyrc


An example is: 

    # my account
    userid  = 478661344
    passwd  = 1234
    account = Interest Checking   

More generally, this is:
    option [=] [value]

Where "option" is a command line option.  This allows the user to specify
their own defaults and makes sure that the password does not show up in ps(1)
and the like.  

NOTE: Passwords are stored in plaintext.

=head1 EXAMPLES

Use the values in the configuration file and get "Statement Savings" history.

    grab_pnc_history.pl -account="Statement Savings"

Use the values in the configuration file and get "Statement Savings" history.
Send the output to the screen.  However, only transactions that happened 
AFTER and INCLUDING Wed, August 15th, 2001 will be printed.

    grab_pnc_history.pl -account="Statement Savings" -output=text -date=2001-08-15

Use the values in the configuration file and get "Statement Savings" history.
Send the output to a qif file.

    grab_pnc_history.pl -account="Statement Savings" -output=qif

=head1 NOTES

This program needs the following perl modules (which, to the best of my 
knowledge are not installed by default):

    LWP       - Library for WWW access in Perl
    AppConfig - Module for reading configuration files and parsing command line arguments
    (there are probably others, but I've forgotten what I had to install)

I don't know the minimum perl version that is usable.  I use 5.6.0.

Don't forget that the qif file will be overwritten.

In addition to everything said above, this program also uses a file called
./cookie_jar.txt.  As you might expect, this is used to store cookies that pnc
gives us.  However, as you might not expect, this file is deleted at startup and 
then re-created.

Therefore, make sure you don't have a file called cookie_jar.txt in the same
directory as this program.  In the future, this should probably be a temporary
file.

This program prints out the message "Input 'UserID' is readonly at...".  This has
something to do with the form options for UserID.  I'm not really sure how to make
this go away.  Suggestions are welcome.

This program was adapted from code written by Ronald Hill.  His program downloads
account history from Wells Fargo.


