Extract fields of data in Perl (split, awk)

Let's say the $greeting variable has the following text.

$greeting = "Hello World How are you today";

 

The split operator can be used to break the text into a list of fields. In this example, split is followed by / /, which means that a single space is used as the delimiter, and the results are stored in an array called @strings.

@strings = split / /, $greeting;
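
Because / / matches exactly one space, two spaces in a row produce an empty field. As a small sketch (the $spaced variable and its contents are made up for illustration), the following compares / / with the special pattern ' ', which splits on runs of whitespace and skips leading whitespace, much like awk does.

```perl
use strict;
use warnings;

# Hypothetical input with two spaces between "Hello" and "World".
my $spaced = "Hello  World How";

# / / splits at every single space, so the double space yields an empty field.
my @single = split / /, $spaced;

# ' ' is a special case: it splits on runs of whitespace (awk-style).
my @awk = split ' ', $spaced;

print scalar(@single), "\n";   # 4 ("Hello", "", "World", "How")
print scalar(@awk), "\n";      # 3 ("Hello", "World", "How")
```

If the input may contain tabs or repeated spaces, ' ' is usually the safer choice.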

 

Dumper can be used to view the results of the split.

use Data::Dumper;
print Dumper \@strings;

 

In this example, there is one variable ($VAR1), a reference to an array that contains 6 elements.

$VAR1 = [
          'Hello',
          'World',
          'How',
          'are',
          'you',
          'today'
        ];

 

Now, you can do something with the individual pieces of data that were split. A single element is accessed with the $ sigil.

print "$strings[0]\n"; # Hello
print "$strings[1]\n"; # World
print "$strings[2]\n"; # How
print "$strings[3]\n"; # are
print "$strings[4]\n"; # you
print "$strings[5]\n"; # today
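
Beyond fixed positions, it is often handy to count the fields, take the last one, or pull out several at once. The sketch below reuses the same $greeting text; the variable names $count, $last, $first and $second are only for illustration.

```perl
use strict;
use warnings;

my $greeting = "Hello World How are you today";
my @strings  = split / /, $greeting;

my $count = scalar @strings;             # number of fields
my $last  = $strings[-1];                # a negative index counts from the end
my ($first, $second) = @strings[0, 1];   # an array slice uses the @ sigil

print "$count\n";            # 6
print "$last\n";             # today
print "$first $second\n";    # Hello World
```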

 


New lines

If you have a variable that contains new lines, the split operator can be used so that each line of data is its own chunk of data. Let's say the $greeting variable contains new lines.

$greeting = "Line one\nLine two\nLine three\nLine four\n";

 

In this example, /\n/ is used to split the $greeting variable at new lines, and the results are written to the @strings array.

@strings = split /\n/, $greeting;
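
Note that a string ending in a new line does not produce an extra empty element, because split drops trailing empty fields by default. As a sketch (the $text variable is an assumption for illustration), passing a negative limit as the third argument keeps them.

```perl
use strict;
use warnings;

# Hypothetical two-line input that ends with a new line.
my $text = "Line one\nLine two\n";

# Default behaviour: the trailing empty field is removed.
my @default = split /\n/, $text;

# A negative limit keeps trailing empty fields.
my @keep = split /\n/, $text, -1;

print scalar(@default), "\n";   # 2
print scalar(@keep), "\n";      # 3 (the last element is an empty string)
```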

 

Dumper can again be used to view the results, which shows that each line is now its own element of the array.

$VAR1 = [
          'Line one',
          'Line two',
          'Line three',
          'Line four'
        ];

 

Often, a foreach loop is then used to iterate over the lines.

foreach $line (@strings) {
  print "$line \n";
}

 

This will print each line.

Line one
Line two
Line three
Line four

 

If needed, you could then split again inside of the foreach loop, to break each line into its own fields.

foreach $line (@strings) {
  @fields = split / /, $line;
  print Dumper \@fields;
}

 

Using Dumper, we can see that each line is now its own array, and each whitespace-separated piece of data is its own element.

$VAR1 = [
          'Line',
          'one'
        ];
$VAR1 = [
          'Line',
          'two'
        ];
$VAR1 = [
          'Line',
          'three'
        ];
$VAR1 = [
          'Line',
          'four'
        ];

 

This will allow us to do something with certain fields of data inside of the foreach loop.

foreach $line (@strings) {
  @fields = split / /, $line;
  print "$fields[1] \n";
}

 

In this example, index 1 (the second field) of each line is printed, which would print the following.

one
two
three
four
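
If the fields are needed after the loop has finished, one option is to keep every line's fields in an array of arrays. This is only a sketch under that assumption; the @rows name is hypothetical.

```perl
use strict;
use warnings;

my $greeting = "Line one\nLine two\nLine three\nLine four\n";

# Each element of @rows is a reference to the array of fields for one line.
my @rows = map { [ split ' ', $_ ] } split /\n/, $greeting;

print scalar(@rows), "\n";   # 4
print $rows[2][1], "\n";     # three
```

Here $rows[2][1] reads as "line index 2, field index 1".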

 


Split a file

Let's say example.txt has the following text.

Hello World How are you today

 

The following Perl script will assign each field of data to its own variable. In this example, $field0 would have the value "Hello", $field1 would have the value "World", and so on. If the file had more than one line, the variables would hold the fields of the last line read.

# The file name must be quoted; die reports why the open failed.
open(FH, '<', 'example.txt') or die "Cannot open example.txt: $!";

while (<FH>) {
  # split ' ' splits the current line ($_) on runs of whitespace
  ($field0, $field1, $field2, $field3, $field4, $field5) = split ' ';
}

close(FH);

 

Each variable can then be printed.

print "$field0 \n";
print "$field1 \n";
print "$field2 \n";
print "$field3 \n";
print "$field4 \n";
print "$field5 \n";

 

Which would produce the following output.

Hello
World
How
are
you
today
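
For a file with several lines, the fields are usually handled inside the loop rather than after it. The sketch below makes two assumptions: an in-memory string ($contents, with made-up text) stands in for example.txt so that it runs without the file, and a lexical filehandle ($fh) is used in place of FH.

```perl
use strict;
use warnings;

# Assumption: this string stands in for the contents of example.txt.
my $contents = "Hello World How are you today\nGood bye for now\n";

# Opening a reference to a scalar reads from the string as if it were a file.
open(my $fh, '<', \$contents) or die "Cannot open in-memory file: $!";

my @second_fields;
while (my $line = <$fh>) {
    my @fields = split ' ', $line;      # whitespace split, like awk
    push @second_fields, $fields[1];    # keep the second field of each line
}
close($fh);

print "$_\n" for @second_fields;   # prints World, then bye
```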

 


