Perl (Scripting) - Split fields at delimiter

by Jeremy Canfield | Updated: January 16 2024 | Perl (Scripting) articles

Let's say the $greeting variable has the following text.

my $greeting = "Hello World How are you today";

split can be used to break the text into a list of fields. In this example, split is followed by / /, which means that a single white space will be used as the delimiter, and the results are stored in a list called @items.

my @items = split / /, $greeting;

Dumper can be used to view the results of the split.

use Data::Dumper;
print Dumper @items;

In this example, Dumper prints 6 variables (VAR1 through VAR6), one for each element of @items, because the list was passed to Dumper directly rather than as a reference.

$VAR1 = 'Hello';
$VAR2 = 'World';
$VAR3 = 'How';
$VAR4 = 'are';
$VAR5 = 'you';
$VAR6 = 'today';
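
The difference between handing Dumper a list and handing it a reference can be sketched as follows. This is a minimal example, and the names $flat and $nested are only used here for illustration.

```perl
use strict;
use warnings;
use Data::Dumper;

my $greeting = "Hello World How are you today";
my @items    = split / /, $greeting;

# Passing the array itself flattens it, so each element gets its own $VARn.
my $flat = Dumper(@items);

# Passing a reference keeps the elements together as one array under $VAR1.
my $nested = Dumper(\@items);

print $flat;
print $nested;
```

The reference form is usually easier to read when a list has many elements.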

Now, you can do something with the individual pieces of data that were split. Notice that the first element has an index of 0 and the last element has an index of 5 when there are 6 elements in the list.

print $items[0]; # Hello
print $items[1]; # World
print $items[2]; # How
print $items[3]; # are
print $items[4]; # you
print $items[5]; # today

I almost always use \s+ to split at unpredictable whitespace, since it treats any run of spaces, tabs or new lines as a single delimiter.

my @items = split /\s+/, $greeting;
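
To see why the pattern matters, here is a small sketch that splits the same string three ways. The $text value is a made-up input with a leading space, a double space and a tab, and split ' ' is the special whitespace form of split that also skips leading whitespace.

```perl
use strict;
use warnings;

# Hypothetical input: leading space, double space, and a tab.
my $text = " Hello  World\tHow";

my @single = split / /,   $text;   # exactly one space per delimiter
my @multi  = split /\s+/, $text;   # any run of whitespace is one delimiter
my @awk    = split ' ',   $text;   # like /\s+/, but leading whitespace is skipped

print scalar(@single), "\n";   # 4: "", "Hello", "", "World\tHow"
print scalar(@multi),  "\n";   # 4: "", "Hello", "World", "How"
print scalar(@awk),    "\n";   # 3: "Hello", "World", "How"
```

Note that /\s+/ still returns an empty first element when the string starts with whitespace.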

Empty Lines

To account for the scenario where the variable being split contains empty lines or leading whitespace, grep can be used to exclude the resulting empty elements from the list.

my @items = grep {/\S/} split /\s+/, $greeting;
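
Here is a minimal sketch of the filter in action. The $text and $lines values are made-up inputs, one with leading blank lines and one with an empty line in the middle.

```perl
use strict;
use warnings;

# Hypothetical input that starts with two empty lines.
my $text = "\n\nHello World\n";

my @raw   = split /\s+/, $text;                 # first element is an empty string
my @items = grep { /\S/ } split /\s+/, $text;   # only elements with a non-space character

# Hypothetical input with an empty line in the middle, split line by line.
my $lines    = "one\n\ntwo\n";
my @nonblank = grep { /\S/ } split /\n/, $lines;

print scalar(@raw), "\n";        # 3
print scalar(@items), "\n";      # 2
print scalar(@nonblank), "\n";   # 2
```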

Periods

If the string of data is separated with periods, escape the period. An unescaped period is a regular expression wildcard that matches any character.

my @items = split /\./, $greeting;
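
As a quick sketch of the difference, the following splits a made-up dotted string (the $address value is just an illustration) with and without the escape. The \Q...\E form is another way to escape a delimiter that is stored in a variable.

```perl
use strict;
use warnings;

# Hypothetical dotted value used only for this example.
my $address = "192.168.0.1";

# Escaped period: split on literal dots.
my @octets = split /\./, $address;

# Unescaped period: every character is a delimiter, so nothing is left.
my @nothing = split /./, $address;

# \Q...\E quotes the delimiter held in a variable.
my $delimiter = '.';
my @quoted    = split /\Q$delimiter\E/, $address;

print scalar(@octets), "\n";    # 4
print scalar(@nothing), "\n";   # 0
print scalar(@quoted), "\n";    # 4
```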

Retain delimiter

By default, the delimiter is removed by split. The following lookbehind will retain the delimiter. In this example, the white space will be kept at the end of each element in @items except the last one.

my @items = split /(?<=\ )/, $greeting;
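
Here is a small sketch of two ways to keep the delimiter. The lookbehind leaves the space attached to the preceding word, while a capturing group (a standard split feature) returns the delimiter as its own element. The variable names are only for illustration.

```perl
use strict;
use warnings;

my $greeting = "Hello World How";

# Lookbehind: split at the position just after each space.
my @attached = split /(?<= )/, $greeting;   # "Hello ", "World ", "How"

# Capturing group: the matched delimiter is returned as a separate element.
my @separate = split /( )/, $greeting;      # "Hello", " ", "World", " ", "How"

print join('|', @attached), "\n";
print join('|', @separate), "\n";
```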

Split variable into new variables

Typically, a variable is split into an array. However, there are situations where it makes more sense to split into new variables. For example, let's say example.txt contains the following text.

Jeremy Engineer

This script will store "Jeremy" into a variable called $name and "Engineer" into a variable called $occupation.

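open(my $fh, '<', "example.txt") or die "Cannot open example.txt: $!";
while (<$fh>) {
  chomp;  # remove the trailing new line
  # The sample line is separated by white space, so split on ' ' (not a comma)
  my ($name, $occupation) = split ' ';
  print "$name\n";
  print "$occupation\n";
}
close($fh);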
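
If you want to try this without creating example.txt, here is a minimal sketch that reads from an in-memory filehandle instead. It assumes the fields are separated by white space, and the second line (Sam Analyst) and the @summary list are made up for illustration.

```perl
use strict;
use warnings;

# Stand-in for example.txt so the sketch runs without a file on disk.
my $contents = "Jeremy Engineer\nSam Analyst\n";
open(my $fh, '<', \$contents) or die "Cannot open in-memory file: $!";

my @summary;
while (my $line = <$fh>) {
  chomp $line;
  # Each whitespace separated field goes into its own named variable.
  my ($name, $occupation) = split ' ', $line;
  push @summary, "$name ($occupation)";
}
close($fh);

print "$_\n" for @summary;
```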

New lines

If you have a variable that contains new lines, the split operator can be used so that each line of data is its own chunk of data. Let's say the $greeting variable contains new lines.

my $greeting = "Line one\nLine two\nLine three\nLine four\n";

In this example, /\n/ is used to split the $greeting variable at new lines, and to then write the results to the @items list.

my @items = split /\n/, $greeting;

Dumper can again be used to view the results, this time by passing a reference (print Dumper \@items;), which shows that each line is now its own element.

$VAR1 = [
          'Line one',
          'Line two',
          'Line three',
          'Line four'
        ];

Often, a foreach loop is then used to iterate over each line.

foreach my $line (@items) {
  print "$line \n";
}

This will print each line.

Line one
Line two
Line three
Line four

If needed, you could then split again inside of the foreach loop, to create a list of the pieces of data on each line.

foreach my $line (@items) {
  my @fields = split / /, $line;
  print Dumper \@fields;
}

Using Dumper, we can see that each line is now its own array, and each whitespace separated piece of data is its own element.

$VAR1 = [
          'Line',
          'one'
        ];
$VAR1 = [
          'Line',
          'two'
        ];
$VAR1 = [
          'Line',
          'three'
        ];
$VAR1 = [
          'Line',
          'four'
        ];

This will allow us to do something with certain fields of data inside of the foreach loop.

foreach my $line (@items) {
  my @fields = split / /, $line;
  print "$fields[1] \n";
}

In this example, the element at index 1 of each line is printed, which would print the following.

one
two
three
four
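
Putting the pieces of this section together, here is a minimal, self-contained sketch. It assumes the lines have no leading white space and uses split ' ' for the inner split. The @second_words list is just a name used for this example.

```perl
use strict;
use warnings;

my $greeting = "Line one\nLine two\nLine three\nLine four\n";

my @second_words;
foreach my $line (split /\n/, $greeting) {
  # Split the current line into whitespace separated fields.
  my @fields = split ' ', $line;
  # Keep the field at index 1 (the second word on the line).
  push @second_words, $fields[1];
}

print "$_\n" for @second_words;
```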

Split a file

Let's say example.txt has the following text.

Hello World How are you today

The following Perl script will associate each field of data with a unique variable. In this example, $field0 would have the value "Hello", $field1 would have the value "World", and so on.

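# Declare the variables outside of the loop so they can be printed afterwards
my ($field0, $field1, $field2, $field3, $field4, $field5);

# The file name must be quoted
open(my $fh, '<', 'example.txt') or die "Cannot open example.txt: $!";

while (<$fh>) {
  $field0 = (split ' ')[0];
  $field1 = (split ' ')[1];
  $field2 = (split ' ')[2];
  $field3 = (split ' ')[3];
  $field4 = (split ' ')[4];
  $field5 = (split ' ')[5];
}

close($fh);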

Each variable can then be printed.

print "$field0 \n";
print "$field1 \n";
print "$field2 \n";
print "$field3 \n";
print "$field4 \n";
print "$field5 \n";

Which would produce the following output.

Hello
World
How
are
you
today
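
As an alternative sketch, the line can be split once into an array and the fields picked out by index, which avoids calling split six times. This example reads from an in-memory filehandle as a stand-in for example.txt, and the names @fields, $first and $last are only for illustration.

```perl
use strict;
use warnings;

# Stand-in for example.txt so the sketch runs without a file on disk.
my $contents = "Hello World How are you today\n";
open(my $fh, '<', \$contents) or die "Cannot open in-memory file: $!";

my @fields;
while (my $line = <$fh>) {
  # Split the line a single time and keep every field.
  @fields = split ' ', $line;
}
close($fh);

# An array slice pulls out several fields at once (-1 is the last element).
my ($first, $last) = @fields[0, -1];

print "$first\n";
print "$last\n";
```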
