Wednesday, January 23, 2008

The Subroutine Stack

Whenever Perl calls a subroutine, it pushes the details of the subroutine call onto an internal stack. This holds the context of each subroutine, including the parameters that were passed to it in the form of the @_ array, ready to be restored when the call to the next subroutine returns. The number of subroutine calls that the program is currently in is known as the 'depth' of the stack. Calling subroutines are higher in the stack, and called subroutines are lower.

This might seem academic, and to a large extent it is, but Perl allows us to access the calling stack ourselves with the caller function. At any given point we are at the 'bottom' of the stack, and can look 'up' to see the contexts stored on the stack by our caller, its caller, and so on, all the way back to the top of the program. This can be handy for all kinds of reasons, but most especially for debugging.

In a purely scalar context, caller returns the name of the package from which the subroutine was called, and undef if there was no caller. Note that this does not require that the call came from inside another subroutine – it could just as easily be from the main program. In a list context, caller with no arguments returns the package name, the source file, and the line number from which we were called. This allows us to write error traps in subroutines like:

sub mysub {
    my ($pkg, $file, $line) = caller;
    die "Called with no parameters at $file line $line\n" unless @_;
}

If we pass a numeric argument to caller, it looks back up the stack the requested number of levels, and returns a longer list of information. This level can of course be '0', so to get everything that Perl knows about the circumstances surrounding the call to our subroutine we can write:

@caller_info = caller 0;   # or caller(0), if we prefer

This returns a whole slew of items into the list, which may or may not be defined depending on the circumstances. They are, in order:

  • package: the package of the caller.
  • filename: the source file of the caller.
  • line: the line number in the source file.
  • subroutine: the subroutine that was called (that is, us). If the frame is an eval rather than a subroutine call then this is set to (eval).
  • hasargs: this is true if the subroutine was given its own @_ (that is, it was not called with the &name; form, which reuses the caller's @_).
  • wantarray: the context in which the subroutine was called, as wantarray would report it, see 'Returning Values' later in the chapter.
  • evaltext: the text inside the eval that caused the subroutine to be called, if the subroutine was called by eval.
  • is_require: true if a require or use caused the eval.
  • hints: compilation details, internal use only.
  • bitmask: compilation details, internal use only.
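
As a minimal sketch of how the first four items in this list might be picked out by name, the following uses a hash slice. The helper name describe_call is hypothetical and is used here only for illustration.

```perl
use strict;
use warnings;

# Return the first four caller(0) fields as a hash reference, keyed by the
# names used in the list above. describe_call is a hypothetical helper name.
sub describe_call {
    my %info;
    @info{qw(package filename line subroutine)} = (caller 0)[0 .. 3];
    return \%info;
}

my $info = describe_call();    # the line reported is the line of this call
print "$info->{subroutine} called from package $info->{package} ",
      "at $info->{filename} line $info->{line}\n";
```

Note that the subroutine field is fully qualified with its package name, so here it holds main::describe_call rather than just describe_call.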

In practice, only the first four items (package, filename, line, and subroutine) are of much use to us. When we use caller with no arguments we get just the first three. The subroutine item names the subroutine that was called, not the one that called it, so to get the name of the calling subroutine we have to extract it from further up the stack:

# get the name of the calling subroutine, if there was one
$callingsub = (caller 1)[3];

Or, more legibly:

($pkg, $file, $line, $callingsub) = caller 1;

Armed with this information, we can create more informative error messages that report errors with respect to the caller. For example:

# die with a better error message

sub mysub {
    my ($pkg, $file, $line) = caller;
    die "Called from ", (caller(1))[3], " with no parameters at $file line $line\n" unless @_;
    ...
}

If debugging is our primary interest, a better solution than all the above is to use the Carp module. The Carp module and other debugging aids are covered in Chapter 17.
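
As a rough illustration of the Carp approach, the sketch below uses croak, which reports an error from the point of view of the code that called us. The package name My::Module is an assumption made for the example; the subroutine is placed in its own package because croak looks for a caller outside the current package.

```perl
use strict;
use warnings;

package My::Module;              # hypothetical package name for this sketch
use Carp qw(croak);

# croak dies like die, but the message names the file and line of the
# code that called mysub rather than the line of the check itself.
sub mysub {
    croak "mysub called with no parameters" unless @_;
    return scalar @_;
}

package main;

my $ok = eval { My::Module::mysub(); 1 };   # trap the error so we can print it
print "Trapped: $@" unless $ok;
```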

One final point about the calling stack: if we try to access the stack above the immediate caller we may not always get the right information back. This is because Perl can optimize the stack under some circumstances, removing intermediate levels. The result of this is that caller is not always as consistent as we might expect, so a little caution should be applied to its use.
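
To see what caller reports at each level, a simple stack walk can be sketched as follows. The names stack_trace, inner and outer are hypothetical; the loop stops when caller returns an empty list, which happens once we run off the top of the stack.

```perl
use strict;
use warnings;

# Collect one line per stack frame, starting with the call to stack_trace
# itself (level 0) and walking up until caller returns an empty list.
sub stack_trace {
    my @frames;
    my $level = 0;
    while (my ($pkg, $file, $line, $sub) = caller($level++)) {
        push @frames, "$sub called at $file line $line";
    }
    return @frames;
}

sub inner { return stack_trace() }
sub outer { return inner() }

print "$_\n" for outer();
```

For debugging output in real code, the Carp module mentioned above already provides this kind of backtrace.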

Monday, January 14, 2008

How to Use the Windiff.exe Utility

To compare two files by using Windiff.exe, follow these steps:
1. Start Windiff.exe.
2. On the File menu, click Compare Files.
3. In the Select First File dialog box, locate and then click a file name for the first file in the comparison, and then click Open.
4. In the Select Second File dialog box, locate and then click a file name for the second file in the comparison, and then click Open.

   The information in the right pane indicates whether there is a file difference.
5. To view the actual file differences, click the first line in the Windiff.exe output results, and then on the Expand menu, click Left File Only, Right File Only, or Both Files.

   The color-coded results indicate what the file differences are.

To compare two folders by using Windiff.exe, follow these steps:
1. Start Windiff.exe.
2. On the File menu, click Compare Directories.
3. In the Select Directories dialog box, type the two folder names that you want to compare in the Dir1 and Dir2 boxes. If you want to include subfolders, click to select the Include subdirectories check box.

   The information in the right pane indicates the differences between the two folders.
4. To view the actual file differences, click the line that you want in the Windiff.exe output results, and then on the Expand menu, click Left File Only, Right File Only, or Both Files.

   The color-coded results indicate what the file differences are.

Friday, January 11, 2008

Row and Array Comparisons in PostgreSQL

IN

expression IN (value[, ...])

The right-hand side is a parenthesized list of scalar expressions. The result is "true" if the left-hand expression's result is equal to any of the right-hand expressions. This is a shorthand notation for

expression = value1
OR
expression = value2
OR
...

Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand expression yields null, the result of the IN construct will be null, not false. This is in accordance with SQL's normal rules for Boolean combinations of null values.

NOT IN

expression NOT IN (value[, ...])

The right-hand side is a parenthesized list of scalar expressions. The result is "true" if the left-hand expression's result is unequal to all of the right-hand expressions. This is a shorthand notation for

expression <> value1
AND
expression <> value2
AND
...

Note that if the left-hand expression yields null, or if there are no equal right-hand values and at least one right-hand expression yields null, the result of the NOT IN construct will be null, not true as one might naively expect. This is in accordance with SQL's normal rules for Boolean combinations of null values.

Tip: x NOT IN y is equivalent to NOT (x IN y) in all cases. However, null values are much more likely to trip up the novice when working with NOT IN than when working with IN. It's best to express your condition positively if possible.
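
The null handling of IN and NOT IN can be easier to follow when spelled out step by step. The following is only an emulation in Perl, not PostgreSQL code: undef stands in for null, the function names sql_in, sql_not_in and show are hypothetical, and values are compared as strings for simplicity.

```perl
use strict;
use warnings;

# Emulate SQL's three-valued logic: 1 is true, 0 is false, undef is null.
sub sql_in {
    my ($x, @values) = @_;
    return undef unless defined $x;            # null left-hand side gives null
    my $saw_null = 0;
    for my $v (@values) {
        if (!defined $v) { $saw_null = 1; next }
        return 1 if $x eq $v;                  # any match makes IN true
    }
    return $saw_null ? undef : 0;              # no match: null if a null was seen
}

sub sql_not_in {
    my $r = sql_in(@_);
    return defined $r ? ($r ? 0 : 1) : undef;  # NOT null is still null
}

# Turn a three-valued result into a printable word.
sub show { defined $_[0] ? ($_[0] ? 'true' : 'false') : 'null' }

print show(sql_in(2, 1, 2, undef)), "\n";      # true: a match wins despite the null
print show(sql_in(3, 1, 2, undef)), "\n";      # null
print show(sql_not_in(3, 1, 2, undef)), "\n";  # null, not true
print show(sql_not_in(3, 1, 2)), "\n";         # true
```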

ANY/SOME (array)

expression operator ANY (array expression)
expression operator SOME (array expression)

The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ANY is "true" if any true result is obtained. The result is "false" if no true result is found (including the special case where the array has zero elements).

SOME is a synonym for ANY.

ALL (array)

expression operator ALL (array expression)

The right-hand side is a parenthesized expression, which must yield an array value. The left-hand expression is evaluated and compared to each element of the array using the given operator, which must yield a Boolean result. The result of ALL is "true" if all comparisons yield true (including the special case where the array has zero elements). The result is "false" if any false result is found.

Row-wise Comparison

row_constructor operator row_constructor

The two row values must have the same number of fields. Each side is evaluated and they are compared row-wise. Presently, only = and <> operators are allowed in row-wise comparisons. The result is "true" if the two rows are equal or unequal, respectively.

As usual, null values in the rows are combined per the normal rules of SQL Boolean expressions. Two rows are considered equal if all their corresponding members are non-null and equal; the rows are unequal if any corresponding members are non-null and unequal; otherwise the result of the row comparison is unknown (null).

row_constructor IS DISTINCT FROM row_constructor

This construct is similar to a <> row comparison, but it does not yield null for null inputs. Instead, any null value is considered unequal to (distinct from) any non-null value, and any two nulls are considered equal (not distinct). Thus the result will always be either true or false, never null.
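
The difference between a row-wise = comparison and IS DISTINCT FROM can be sketched in the same spirit. Again this is a Perl emulation under stated assumptions, not PostgreSQL code: rows are array references, undef stands in for null, and row_equal and row_is_distinct are hypothetical names.

```perl
use strict;
use warnings;

# Row-wise "=": 1 for true, 0 for false, undef for unknown (null).
sub row_equal {
    my ($left, $right) = @_;
    die "rows must have the same number of fields\n" unless @$left == @$right;
    my $saw_null = 0;
    for my $i (0 .. $#$left) {
        my ($x, $y) = ($left->[$i], $right->[$i]);
        if (!defined $x || !defined $y) { $saw_null = 1; next }
        return 0 if $x ne $y;          # one definite mismatch makes the rows unequal
    }
    return $saw_null ? undef : 1;      # otherwise unknown if any null was involved
}

# IS DISTINCT FROM: always 1 or 0, never undef.
sub row_is_distinct {
    my ($left, $right) = @_;
    die "rows must have the same number of fields\n" unless @$left == @$right;
    for my $i (0 .. $#$left) {
        my ($x, $y) = ($left->[$i], $right->[$i]);
        next if !defined $x && !defined $y;     # two nulls are not distinct
        return 1 if !defined $x || !defined $y || $x ne $y;
    }
    return 0;
}

my $eq = row_equal([1, undef], [1, undef]);
print "row_equal: ", (defined $eq ? $eq : 'null'), "\n";                     # null
print "row_is_distinct: ", row_is_distinct([1, undef], [1, undef]), "\n";    # 0
```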

row_constructor IS NULL
row_constructor IS NOT NULL

These constructs test a row value for null or not null. A row value is considered not null if it has at least one field that is not null.

Wednesday, January 9, 2008

Array Functions

Here are the functions you can use with arrays:

* defined(VARIABLE) -- Returns true if VARIABLE has a real value and false if the variable has not yet been assigned a value. This is not limited to arrays; any data type can be checked. Also see the exists function for information about associative array keys.
* delete(KEY) -- Removes the key-value pair from the given associative array. If you delete a value from the %ENV array, the environment of the current process is changed, not that of the parent.
* each(ASSOC_ARRAY) -- Returns a two-element list that contains a key and value pair from the given associative array. The function is mainly used so you can iterate over the associative array elements. A null list is returned when the last element has been read.
* exists(KEY) -- Returns true if the KEY is part of the specified associative array. For instance, exists($array{"Orange"}) returns true if the %array associative array has a key with the value of "Orange".
* join(STRING, ARRAY) -- Returns a string that consists of all of the elements of ARRAY joined together by STRING. For instance, join(">>", ("AA", "BB", "cc")) returns "AA>>BB>>cc".
* keys(ASSOC_ARRAY) -- Returns a list that holds all of the keys in a given associative array. The list is not in any particular order.
* map(EXPRESSION, ARRAY) -- Evaluates EXPRESSION for every element of ARRAY and returns a list of the results. The special variable $_ is assigned each element of ARRAY immediately before EXPRESSION is evaluated.
* pack(STRING, ARRAY) -- Creates a binary structure, using STRING as a guide, of the elements of ARRAY. You can look in the next Chapter on References for more information.
* pop(ARRAY) -- Returns the last value of an array. It also reduces the size of the array by one.
* push(ARRAY1, ARRAY2) -- Appends the contents of ARRAY2 to ARRAY1. This increases the size of ARRAY1 as needed.
* reverse(ARRAY) -- Reverses the elements of a given array when used in an array context. When used in a scalar context, the array is converted to a string, and the string is reversed.
* scalar(ARRAY) -- Evaluates the array in a scalar context and returns the number of elements in the array.
* shift(ARRAY) -- Returns the first value of an array. It also reduces the size of the array by one.
* sort(ARRAY) -- Returns a list containing the elements of ARRAY in sorted order. See the next chapter, Chapter 8 on References, for more information.
* splice(ARRAY1, OFFSET, LENGTH, ARRAY2) -- Replaces elements of ARRAY1 with elements in ARRAY2. It returns a list holding any elements that were removed. Remember that the $[ variable may change the base array subscript when determining the OFFSET value.
* split(PATTERN, STRING, LIMIT) -- Breaks up a string based on some delimiter. In an array context, it returns a list of the things that were found. In a scalar context, it returns the number of things found.
* undef(VARIABLE) -- Always returns the undefined value. In addition, it undefines VARIABLE, which must be a scalar, an entire array, or a subroutine name.
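* unpack(STRING, DATA) -- Does the opposite of pack(): using STRING as a guide, it takes the binary structure in DATA apart and returns a list of the values found.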
* unshift(ARRAY1, ARRAY2) -- Adds the elements of ARRAY2 to the front of ARRAY1. Note that the added elements retain their original order. The size of the new ARRAY1 is returned.
* values(ASSOC_ARRAY) -- Returns a list that holds all of the values in a given associative array. The list is not in any particular order.
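
A short sketch of several of these functions working together may help. The variable names and data are made up for the example.

```perl
use strict;
use warnings;

my @colors = ("red", "green", "blue");

push @colors, "black", "white";          # append to the end
my $last  = pop @colors;                 # removes and returns "white"
my $first = shift @colors;               # removes and returns "red"
unshift @colors, "cyan";                 # add to the front

# @colors now holds: cyan green blue black
my @removed = splice(@colors, 1, 2, "magenta");   # replace green and blue

print join(", ", @colors), "\n";           # cyan, magenta, black
print join(", ", @removed), "\n";          # green, blue
print join(", ", reverse @colors), "\n";   # black, magenta, cyan
print scalar(@colors), "\n";               # 3
print join("-", map { uc $_ } split(/,/, "a,b,c")), "\n";   # A-B-C
```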

Printing an Associative Array

The each() function returns key, value pairs of an associative array one-by-one in a list. This is called iterating over the elements of the array. Iteration is a synonym for looping. So, you also could say that the each() function starts at the beginning of an array and loops through each element until the end of the array is reached. This ability lets you work with key, value pairs in a quick easy manner.

The each() function does not loop by itself. It needs a little help from some Perl control statements. For this example, we'll use the while loop to print an associative array. The while (CONDITION) {} control statement continues to execute any program code surrounded by the curly braces until the CONDITION turns false. The listing that follows is assoc.pl.

%array = ("100", "Green", "200", "Orange");

while (($key, $value) = each(%array)) {
    print("$key = $value\n");
}

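This program prints the following (the pairs may appear in either order, because an associative array has no defined order):

100 = Green
200 = Orange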

The each() function returns an empty list when the end of the array is reached, which makes the list assignment evaluate to false. Therefore, you can use it as the basis of the while's condition. When the end of the array is reached, the program continues execution after the closing curly brace. In this example, the program simply ends.
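
Because each() hands back pairs in no particular order, a common alternative when the output order matters is to loop over the sorted keys instead. A minimal sketch, with an extra made-up entry added to the data:

```perl
use strict;
use warnings;

my %array = ("100", "Green", "200", "Orange", "300", "Blue");

# Sort the keys numerically, then look up each value in turn.
my @lines;
for my $key (sort { $a <=> $b } keys %array) {
    push @lines, "$key = $array{$key}";
}

print "$_\n" for @lines;
```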

Checking the Existence of an Element

You can use the defined() function to check if an array element exists before you assign a value to it. This ability is very handy if you are reading values from a disk file and don't want to overlay values already in memory. For instance, suppose you have a disk file of Jazz Artist addresses and you would like to know if any of them are duplicates. You check for duplicates by reading the information one address at a time and assigning the address to an associative array using the Jazz Artist name as the key value. If the Jazz Artist name already exists as a key value, then that address should be flagged for follow up.

Because we haven't talked about disk files yet, we'll need to emulate a disk file with an associative array. And, instead of using Jazz Artist addresses, we'll use Jazz Artist number and Jazz Artist name pairs. First, we see what happens when an associative array is created and two values have the same key. The listing that follows is element1.pl.

createPair("100", "Miles Davis");
createPair("200", "Keith Jarrett");
createPair("100", "John Coltrane");

while (($key, $value) = each %array) {
    print("$key, $value\n");
}

sub createPair {
    my($key, $value) = @_;

    $array{$key} = $value;
}

This program prints:

100, John Coltrane

200, Keith Jarrett

This example takes advantage of the global nature of variables. Even though %array is set in the createPair() function, the array is still accessible by the main program. Notice that the first key, value pair (100 and Miles Davis) is overwritten when the third key, value pair is encountered. You can see that it is a good idea to be able to determine when an associative array element is already defined so that duplicate entries can be handled. The next program, element2.pl, does this.

createPair("100", "Miles Davis");
createPair("200", "Keith Jarrett");
createPair("100", "John Coltrane");

while (($key, $value) = each %array) {
    print("$key, $value\n");
}

sub createPair {
    my($key, $value) = @_;

    # Step past any key that is already in use.
    while (defined($array{$key})) {
        $key++;
    }

    $array{$key} = $value;
}

This program prints:

100, Miles Davis

101, John Coltrane

200, Keith Jarrett

You can see that the number for John Coltrane has been changed to 101.
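
One caveat with this technique: defined() tests the value, while exists() tests the key. The two differ when a key is present but its value is undefined, as this small sketch with made-up data shows.

```perl
use strict;
use warnings;

# Key 300 is present, but its value is undefined; key 200 is absent.
my %array = ("100" => "Miles Davis", "300" => undef);

for my $key ("100", "200", "300") {
    printf "%s: exists=%s defined=%s\n", $key,
        (exists  $array{$key} ? "yes" : "no"),
        (defined $array{$key} ? "yes" : "no");
}
```

This prints yes/yes for 100, no/no for 200, and yes/no for 300, so exists() is the safer test when undefined values may be stored.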

Monday, January 7, 2008

How to Create a Bullet List in Microsoft Excel

To Add a Bullet to an Existing Text Entry

1. Position the insertion point at the beginning of your text entry.
2. Type a symbol that you want to use as a bullet. To create the bullet character, press ALT+0149 (type 0149 on the numeric keypad).

   You may want to include a space after the character so that the bullet will not be next to the text. Note that the bullet is an extended character and may not be available with all fonts.

   Some other examples of characters you can use include: >, /, ~, !, and others.


To Create a Custom Text Format That Includes Bullets

1. Select the cell or range of cells that you want to apply bullets to.
2. On the Format menu, click Cells.
3. On the Number tab, click the Text category, and then click the Custom category.

   Microsoft Excel places an at sign (@) in the Type box.
4. In the Type box, place the insertion point before the @, and type the symbol that you want to use as a bullet. To create the bullet character, press ALT+0149 (type 0149 on the numeric keypad).

   You may want to include a space after the symbol so that the bullet will not be next to the text. Note that the bullet is an extended character and may not be available with all fonts.

   Some other examples of characters you can use include: >, /, ~, !, and others.
5. Click OK.

Friday, January 4, 2008

installing module

> perl Makefile.PL
> make
> make test
> make install
perl -MCPAN -e 'install '

isArray() in Perl

foreach $item (@array) {
    if (ref($item) eq 'ARRAY') {
        # It's an array reference...
        # you can read it with $item->[1]
        # or dereference it using @newarray = @{$item}
    } else {
        # not an array in any way...
    }
}
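
A runnable variant of the same check, wrapped in a small function, might look like this. The name is_array and the sample data are hypothetical. Note that ref() returns the class name for a blessed reference, so the test only recognizes plain array references.

```perl
use strict;
use warnings;

# True (1) if the argument is a plain, unblessed array reference.
sub is_array {
    my ($item) = @_;
    return ref($item) eq 'ARRAY' ? 1 : 0;
}

my @array = (1, [2, 3], "four", { five => 5 });

for my $item (@array) {
    if (is_array($item)) {
        print "array reference with ", scalar(@{$item}), " elements\n";
    } else {
        print "not an array reference\n";
    }
}
```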

ordering hash

use Tie::IxHash;
tie %HASH, "Tie::IxHash";
# manipulate %HASH
@keys = keys %HASH; # @keys is in insertion order

#Tie::InsertOrderHash does not allow you to change values in your hash and keep the original ordering. Thus, as mentioned above, updating the days in February will cause February to now appear after December when listing the keys.
#If you want to be able to change values without changing the ordering then Tie::IxHash may be a better option.
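
If installing a module is not an option, the same effect can be approximated by hand with a plain hash plus an array that records the order of first insertion. This is only a sketch; set_days, %days and @order are hypothetical names.

```perl
use strict;
use warnings;

my (%days, @order);

# Record a key the first time it is seen; later updates keep its position.
sub set_days {
    my ($month, $count) = @_;
    push @order, $month unless exists $days{$month};
    $days{$month} = $count;
}

set_days(January  => 31);
set_days(February => 28);
set_days(March    => 31);
set_days(February => 29);      # update: February stays in second place

print "$_ => $days{$_}\n" for @order;
```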

delete duplicate hash values

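my %seen;
for my $key (keys %students) {
    # Build a signature from the record's values, taken in sorted key order
    # so that identical records always produce the same string.
    my $record = $students{$key};
    my $value_key = join ' ', map { $record->{$_} } sort keys %$record;
    if (exists $seen{$value_key}) {
        delete $students{$key};
    }
    else {
        $seen{$value_key}++;
    }
}
undef %seen;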
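
To try the idea out, here is a self-contained sketch with made-up sample data. It assumes %students maps an ID to a hash of fields, and it walks the IDs in sorted order so that the entry kept is predictable.

```perl
use strict;
use warnings;

# Hypothetical sample data: s3 repeats the record stored under s1.
my %students = (
    s1 => { name => "Ann", grade => "A" },
    s2 => { name => "Bob", grade => "B" },
    s3 => { name => "Ann", grade => "A" },
);

my %seen;
for my $key (sort keys %students) {
    my $record = $students{$key};
    # Signature: field names and values in sorted field order.
    my $signature = join "\0", map { ($_, $record->{$_}) } sort keys %$record;
    delete $students{$key} if $seen{$signature}++;
}

print join(", ", sort keys %students), "\n";   # s1, s2
```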

Finding oldest file in directory

# Sort numerically (<=>) by the ctime field, ascending, so the oldest file comes first.
my ($xml_file) = sort { (stat $a)[10] <=> (stat $b)[10] } glob "$dir_name/*.*";
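
The following sketch exercises the same idea in a temporary directory. It is an illustration under stated assumptions: it sorts on the modification time, (stat)[9], because that is the timestamp utime can set for the test files, and oldest_file and the file names are hypothetical.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Return the file in $dir with the smallest (oldest) modification time.
sub oldest_file {
    my ($dir) = @_;
    my ($oldest) = sort { (stat $a)[9] <=> (stat $b)[9] } glob "$dir/*.*";
    return $oldest;
}

# Create three empty files with known modification times.
my $dir_name = tempdir(CLEANUP => 1);
my $now = time;
for my $spec (["new.xml", $now], ["old.xml", $now - 3600], ["mid.xml", $now - 60]) {
    my ($name, $mtime) = @$spec;
    open my $fh, '>', "$dir_name/$name" or die "Cannot create $name: $!";
    close $fh;
    utime $mtime, $mtime, "$dir_name/$name" or die "Cannot set time on $name: $!";
}

print oldest_file($dir_name), "\n";
```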